Moving maximum variant

Yesterday, I got asked the following question during a technical interview.

Imagine that you are working for a news agency. At every discrete point of time t, a story breaks. Some stories are more interesting than others. This "hotness" is expressed as a natural number h, with greater numbers representing hotter news stories.

Given a stream S of n stories, your job is to find the hottest story out of the most recent k stories for every t >= k.

So far, so good: this is the moving maximum problem (also known as the sliding window maximum problem), and there is a linear-time algorithm that solves it.

Now the question gets harder. Of course, older stories are usually less hot compared to newer stories. Let the age a of the most recent story be zero, and let the age of any other story be one greater than the age of its succeeding story. The "improved hotness" of a story is then defined as max(0, min(h, k - a)).

Here's an example:

n = 11, k = 4

S indices:   0   1   2   3   4   5   6   7   8   9  10
S values:    1   3   1   7   1   3   9   3   1   3   1

mov max hot indices:     3   3   3   6   6   6   6   9
mov max hot values:      7   7   7   9   9   9   9   3

mov max imp-hot indices: 3   3   5   6   7   7   9   9
mov max imp-hot values:  4   3   3   4   3   3   3   3
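The improved-hotness rows above can be reproduced with a brute-force O(nk) reference, which is handy for checking any faster algorithm. This is only a sketch: the function names are made up for illustration, times are 0-based (so the first full window ends at t = k - 1), and ties are broken in favour of the newest story, an assumption that happens to match the indices in the table.

```python
def improved_hotness(h, a, k):
    """Improved hotness of a story with hotness h and age a, for window size k."""
    return max(0, min(h, k - a))


def brute_force_improved_max(hots, k):
    """For each full window, return (index, value) of the story with the
    greatest improved hotness among the k most recent stories."""
    results = []
    for t in range(k - 1, len(hots)):
        # Start with the newest story (age 0), then scan older ones.
        best_index, best_value = t, improved_hotness(hots[t], 0, k)
        for i in range(t - 1, t - k, -1):
            value = improved_hotness(hots[i], t - i, k)
            if value > best_value:  # strict: ties keep the newer story
                best_index, best_value = i, value
        results.append((best_index, best_value))
    return results


S = [1, 3, 1, 7, 1, 3, 9, 3, 1, 3, 1]
print(brute_force_improved_max(S, 4))
```

Stories older than k - 1 steps always have improved hotness 0, so restricting the scan to the window loses nothing.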

I was at a complete loss with this question. I thought about adding the index to every element before computing the maximum, but that gives you the answer for when the hotness of a story decreases by one at every step, regardless of whether it reached the hotness bound or not.

Can you find an algorithm for this problem with sub-quadratic (ideally: linear) running time?

There are 2 answers

David Eisenstat On BEST ANSWER

I'll sketch a linear-time solution to the original problem involving a double-ended queue (deque) and then extend it to improved hotness with no loss of asymptotic efficiency.

Original problem: keep a deque that contains the stories that are (1) in the window and (2) hotter than every newer story. Hotness therefore decreases from front to back, so the hottest story in the window is always at the front. Before a new story is pushed onto the back of the deque, stories are popped from the back until a strictly hotter story is found or the deque is empty. Stories are popped from the front as they age out of the window. Each story is pushed once and popped at most once, so the total running time is O(n).

For example:

S indices:   0   1   2   3   4   5   6   7   8   9  10
S values:    1   3   1   7   1   3   9   3   1   3   1

deque: (front) [] (back)
push (0, 1)
deque: [(0, 1)]
pop (0, 1) because it's not hotter than (1, 3)
push (1, 3)
deque: [(1, 3)]
push (2, 1)
deque: [(1, 3), (2, 1)]
pop (2, 1) and then (1, 3) because they're not hotter than (3, 7)
push (3, 7)
deque: [(3, 7)]
push (4, 1)
deque: [(3, 7), (4, 1)]
pop (4, 1) because it's not hotter than (5, 3)
push (5, 3)
deque: [(3, 7), (5, 3)]
pop (5, 3) and then (3, 7) because they're not hotter than (6, 9)
push (6, 9)
deque: [(6, 9)]
push (7, 3)
deque: [(6, 9), (7, 3)]
push (8, 1)
deque: [(6, 9), (7, 3), (8, 1)]
pop (8, 1) and (7, 3) because they're not hotter than (9, 3)
push (9, 3)
deque: [(6, 9), (9, 3)]
push (10, 1)
pop (6, 9) because it exited the window
deque: [(9, 3), (10, 1)]
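The trace above corresponds roughly to the following sketch of the plain sliding window maximum. The function name is hypothetical, and the deque stores indices rather than (index, hotness) pairs, which is a small simplification rather than anything the description requires.

```python
from collections import deque


def sliding_window_max(hots, k):
    """Return (index, hotness) of the hottest story in each full window of k."""
    q = deque()  # indices; hotness strictly decreases from front to back
    results = []
    for t, hot in enumerate(hots):
        # Pop stories from the back that are not hotter than the new one.
        while q and hots[q[-1]] <= hot:
            q.pop()
        q.append(t)
        # Pop the front story once it has aged out of the window.
        if q[0] <= t - k:
            q.popleft()
        if t >= k - 1:
            results.append((q[0], hots[q[0]]))
    return results


S = [1, 3, 1, 7, 1, 3, 9, 3, 1, 3, 1]
print(sliding_window_max(S, 4))
```

At most one story can age out per step, so a single `if` is enough for the front check.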

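To handle the new problem, we modify how we handle aging stories. Instead of popping stories as they slide out of the window, we pop the front story as soon as its age bound k - a drops to its hotness h or below, that is, once its improved hotness is determined by its age rather than by h. From then on its improved hotness falls by one at every step, and because stories leave the front in order of age, every previously popped story is older and therefore no hotter. When determining the top story, only the most recently popped story needs to be considered, alongside the front of the deque.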

In Python:

import collections

Elem = collections.namedtuple('Elem', ('hot', 't'))


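def winmaximphot(hots, k):
    """Yield the maximum improved hotness for every full window of size k."""
    q = collections.deque()  # hotness strictly decreases from front to back
    oldtop = 0  # improved hotness of the most recently popped front story
    for t, hot in enumerate(hots):
        # Drop stories from the back that are not hotter than the new one.
        while q and q[-1].hot <= hot:
            del q[-1]
        q.append(Elem(hot, t))
        # Pop front stories whose age bound k - a no longer exceeds their hotness.
        while q and q[0].hot >= k - (t - q[0].t) > 0:
            oldtop = k - (t - q[0].t)
            del q[0]
        if t + 1 >= k:
            yield max(oldtop, q[0].hot) if q else oldtop
        # The popped story ages by one step before the next iteration.
        oldtop = max(0, oldtop - 1)


# Prints [4, 3, 3, 4, 3, 3, 3, 3], matching the example in the question.
print(list(winmaximphot([1, 3, 1, 7, 1, 3, 9, 3, 1, 3, 1], 4)))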
SergeyS On

The idea is the following: each news story will beat all previous stories k-h steps after it breaks. For example, for k == 30 and a story with hotness h == 28, the story will be hotter than all previous stories after 2 steps. Let's keep track of the moments at which each story becomes the hottest. At step i, the moment at which the current story beats all previous ones is i+k-h. This gives a sequence of objects {news_date | news_beats_all_previous_ones_date}, kept in increasing order of news_beats_all_previous_ones_date:

{i1 | i1+k-h} {i3 | i3+k-h} {i4 | i4+k-h} {i7 | i7+k-h} {i8 | i8+k-h}

At the current step we compute i9+k-h and add it to the end of this list, after removing all values that are bigger (since the sequence is increasing, this is easy). Once the first element's news_beats_all_previous_ones_date becomes equal to the current date (i), that story becomes the answer to the sliding window query and we remove it from the sequence. So you need a data structure that supports adding at the end and removing from both the beginning and the end. This is a deque. The time complexity of the solution is O(n).
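The central claim here, that a story breaking at time i with hotness h is at least as hot as every earlier story from time i+k-h onward, can be checked by brute force. The sketch below does only that; it is not an implementation of the full deque procedure described above. The function names are hypothetical, and clamping the takeover time at i when h > k is an assumption, since the answer only writes i+k-h.

```python
def improved_hotness(h, a, k):
    """Improved hotness of a story with hotness h and age a, for window size k."""
    return max(0, min(h, k - a))


def takeover_time(i, h, k):
    """Time at which the story (i, h) is at least as hot as every earlier story.
    Clamped at i for h > k (an assumption, see the lead-in)."""
    return i + max(0, k - h)


def check_takeover(hots, k):
    """Return True if, at its takeover time, no earlier story beats each story."""
    for i, h in enumerate(hots):
        t = takeover_time(i, h, k)
        own = improved_hotness(h, t - i, k)
        if any(improved_hotness(hots[j], t - j, k) > own for j in range(i)):
            return False
    return True


S = [1, 3, 1, 7, 1, 3, 9, 3, 1, 3, 1]
print(check_takeover(S, 4))
print(takeover_time(0, 28, 30))  # the k == 30, h == 28 example from the text
```

The check passes because any earlier story is older, so its age bound k - a is already strictly below the value the newer story holds at its takeover time.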