Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value.
So the median is the mean of the two middle value. For example:
[2,3,4], the median is 3
[2,3], the median is (2 + 3) / 2 = 2.5
Design a data structure that supports the following two operations:
====== Examples ======
addNum(1)
addNum(2)
findMedian() -> 1.5
addNum(3)
findMedian() -> 2
To approach this problem need to keep two things in mind:
Most important thing though to keep in mind is that we only need a consistent way to access the median elements and keeping entire input sorted is not required.
Any Data Structure available to satisfy our need ?
As it turns out there are two data structures for the job:
Heaps are a natural ingredient for this dish, adding elements to them take logarithmic order of time and also give direct access to the maximal/minimal elements in a group.
If we could maintain two heaps in the following way:
This gives access to median values in the input: they comprise the top of the heaps!
If the following conditions are met:
Then we can say that:
Then xxx is smaller than (or equal to) almost half of the elements and yyy is larger than (or equal to) the other half, thats what median is.
But then how to Balance two Heaps ?
Code:
from heapq import heappop, heappush
class MedianFinder:
def __init__(self):
self.low_max_heap = []
self.high_min_heap = []
def addNum(self, num: int) -> None:
# Add num low_max_heap, since low_max_heap received a new element
# Balance high_min_heap by removing the largest element from low_max_heap and offer to high_min_heap.
heappush(self.low_max_heap, -num)
low_max_heap_largest_element = -heappop(self.low_max_heap)
heappush(self.high_min_heap, low_max_heap_largest_element)
# high_min_heap might end having more elements than the low_max_heap, after the previous operation.
# Fix that by removing the smallest element from high_min_heap and offering it to low_max_heap.
if (len(self.high_min_heap) > len(self.low_max_heap)):
high_min_heap_smallest_element = heappop(self.high_min_heap)
heappush(self.low_max_heap, -high_min_heap_smallest_element)
def findMedian(self) -> float:
if ((len(self.high_min_heap) + len(self.low_max_heap)) % 2 == 1):
return float(-self.low_max_heap[0])
else:
return float((-self.low_max_heap[0] + self.high_min_heap[0])/2)
mf = MedianFinder()
mf.addNum(1)
mf.addNum(2)
print(mf.findMedian())
mf.addNum(3)
print(mf.findMedian())
Output:
1.5
2.0
Complexity: