# 295. Find Median from Data Stream

Reference: LeetCode

Difficulty: Hard

## Problem

Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle value.

For example,

`[2,3,4]`

, the median is`3`

`[2,3]`

, the median is`(2 + 3) / 2 = 2.5`

Design a data structure that supports the following two operations:

`void addNum(int num)`

- Add a integer number from the data stream to the data structure.`double findMedian()`

- Return the median of all elements so far.

**Example:**

1 | addNum(1) |

**Follow up:** (Reference: Solutions to follow-ups)

- If all integer numbers from the stream are between 0 and 100, how would you optimize it?

- We can maintain an integer array of length 100 to store the count of each number along with a total count. Then, we can iterate over the array to find the middle value to get our median. Time and space complexity would be
`O(100) = O(1)`

.

- If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it?

- In this case, we can keep a counter for
`lessThanHundred`

and`greaterThanHundred`

. As we know the solution will be definitely in 0-100 we don’t need to know those number which are >100 or <0, only count of them will be enough.

## Analysis

Reference: Solution Post

### Simple Sorting

Every time we add in a new value, we need to re-sort the array.

1 | private List<Integer> list = new ArrayList<>(); |

**Time:** `addNum()`

takes $O(N\log{N})$ `462 ms`

**Space:** $O(N)$

### Search + Insert (Binary Search)

**Linear Search:**

When a new number comes, we have to add it to the list while maintaining the sorted nature of the list. This is achieved easily by finding the correct place to insert the incoming number.

1 | private List<Integer> list = new ArrayList<>(); |

**Time:** $O(N)$ `265 ms`

**Space:** $O(N)$

Pop quiz Can we use `binary search`

to improve efficiency? **Sure!**

Since the list is always sorted, we can use binary search.

**Binary Search:** (improvement)

Once the position is found, we need to shift all higher elements by one space to make room for the number, which takes $O(N)$.

**Note:** Here we use `lower bound`

binary search since we need to find the position where the value is `>= num`

.

1 | private List<Integer> list = new ArrayList<>(); |

**Time:** $O(N)$ `76 ms`

- Binary search takes $O(\log{N})$ time.
- Insertion can take up to $O(N)$ time.

**Space:** $O(N)$

### Two Heaps

**Idea:** We only need a consistent way to access the median elements. Keeping the entire input sorted is `not a requirement`

.

As it turns out there are two data structures for the job: `heaps`

and `self-balancing binary search trees`

.

Our goal is to maintain two heaps (balancing) in the following way:

- A
`max-heap`

to store the smaller half of the input numbers. - A
`min-heap`

to store the larger half of the input numbers. **Constraint:**The max-heap is allowed to store, at most,`one more element`

than the min-heap does.

**Steps:** Add the new number to `max-heap`

. Since the `max-heap`

received a new element, we must do a balancing step for the `min-heap`

.

**Size Check (maxPQ > minPQ + 1):**If the constraint is not satisfied, then remove the largest element from the`max-heap`

and offer it to the`min-heap`

.**Value Check (maxPQ.peek() > minPQ.peek()):**Check the max value and min value from two heaps where`maxVal <= minVal`

; otherwise, remove the largest element from the`max-heap`

and offer it to the`min-heap`

.**Size Check (maxPQ < minPQ):**If the`min-heap`

then holds one more elements than the`max-heap`

after the previous operation. We fix that by removing the smallest element from the`min-heap`

and offering it to the`max-heap`

.

1 | private PriorityQueue<Integer> maxPQ; |

**Time:** $O(5\log{N}) = O(\log{N})$**Space:** $O(N)$

### Multiset and Two Pointers

Reference: LeetCode Solution

It is based on self-balancing trees.