Find longest increasing sequence


You are given a sequence of numbers and need to find a longest increasing subsequence of the given input (not necessarily contiguous).

I found the link for this (Longest increasing subsequence on Wikipedia) but need more explanation.

If anyone could help me understand the O(n log n) implementation, that would be really helpful. An explanation of the algorithm with an example would be much appreciated.

I saw the other posts as well, and this is the part I did not understand:

L = 0
for i = 1, 2, ... n:
    binary search for the largest positive j ≤ L such that X[M[j]] < X[i] (or set j = 0 if no such value exists)

In the statement above, where does the binary search start? And how should M[] and X[] be initialized?


There are 8 answers

0
bvaughn On

Late to the party, but here's a JavaScript implementation to go along with the others. Note that this is the straightforward O(n^2) dynamic-programming version, not the O(n log n) one. :)

var findLongestSubsequence = function(array) {
  var longestPartialSubsequences = [];
  var longestSubsequenceOverAll = [];

  for (var i = 0; i < array.length; i++) {
    var valueAtI = array[i];
    var subsequenceEndingAtI = [];

    for (var j = 0; j < i; j++) {
      var subsequenceEndingAtJ = longestPartialSubsequences[j];
      var valueAtJ = array[j];

      if (valueAtJ < valueAtI && subsequenceEndingAtJ.length > subsequenceEndingAtI.length) {
        subsequenceEndingAtI = subsequenceEndingAtJ;
      }
    }

    longestPartialSubsequences[i] = subsequenceEndingAtI.concat();
    longestPartialSubsequences[i].push(valueAtI);

    if (longestPartialSubsequences[i].length > longestSubsequenceOverAll.length) {
      longestSubsequenceOverAll = longestPartialSubsequences[i];
    }
  }

  return longestSubsequenceOverAll;
};
16
fgb On

A simpler problem is to find the length of the longest increasing subsequence. You can focus on understanding that problem first. The only difference in the algorithm is that it doesn't use the P array.

x is the input sequence, so it can be initialized as: x = [0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]

m keeps track of the best subsequence of each length found so far. The best is the one with the smallest ending value (allowing a wider range of values to be added after it). The length and ending value are the only data that need to be stored for each subsequence.

Each element of m represents a subsequence. For m[j],

  • j is the length of the subsequence.
  • m[j] is the index (in x) of the last element of the subsequence.
  • so, x[m[j]] is the value of the last element of the subsequence.

L is the length of the longest subsequence found so far. The first L values of m are valid, the rest are uninitialized. m can start with the first element being 0, the rest uninitialized. L increases as the algorithm runs, and so does the number of initialized values of m.

Here's an example run. For each iteration, x[i] and the state of m at the end of that iteration are shown (with the values of the sequence in place of the indexes).

The search in each iteration is looking for where to place x[i]. It should be as far to the right as possible (to get the longest sequence), and be greater than the value to its left (so it's an increasing sequence).

 0:  m = [0, 0]        - ([0] is a subsequence of length 1.)
 8:  m = [0, 0, 8]     - (8 can be added after [0] to get a sequence of length 2.)
 4:  m = [0, 0, 4]     - (4 is better than 8. This can be added after [0] instead.)
 12: m = [0, 0, 4, 12] - (12 can be added after [...4])
 2:  m = [0, 0, 2, 12] - (2 can be added after [0] instead of 4.)
 10: m = [0, 0, 2, 10]
 6:  m = [0, 0, 2, 6]
 14: m = [0, 0, 2, 6, 14]
 1:  m = [0, 0, 1, 6, 14]
 9:  m = [0, 0, 1, 6, 9]
 5:  m = [0, 0, 1, 5, 9]
 13: m = [0, 0, 1, 5, 9, 13]
 3:  m = [0, 0, 1, 3, 9, 13]
 11: m = [0, 0, 1, 3, 9, 11]
 7:  m = [0, 0, 1, 3, 7, 11]
 15: m = [0, 0, 1, 3, 7, 11, 15]

Now we know there is a subsequence of length 6, ending in 15. The actual values in the subsequence can be found by storing them in the P array during the loop.
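
As a rough illustration of the length-only version described above, here is a minimal Python sketch. It keeps the 1-based m and L from the explanation, with m[0] left unused; the function name lis_length and the hand-written binary search are assumptions made for this example, not part of the original pseudocode. The search always runs over the valid range m[1..L], which is empty on the first iteration.

```python
def lis_length(x):
    """Length of the longest strictly increasing subsequence of x."""
    # m[j] is the index in x of the smallest ending value among all
    # increasing subsequences of length j found so far; m[0] is unused.
    m = [0] * (len(x) + 1)
    L = 0  # length of the longest subsequence found so far

    for i in range(len(x)):
        # Binary search over m[1..L] for the largest j with x[m[j]] < x[i].
        lo, hi = 1, L
        while lo <= hi:
            mid = (lo + hi) // 2
            if x[m[mid]] < x[i]:
                lo = mid + 1
            else:
                hi = mid - 1
        j = hi  # j == 0 when no ending value is smaller than x[i]

        # x[i] extends the subsequence of length j to one of length j + 1.
        m[j + 1] = i
        if j + 1 > L:
            L = j + 1

    return L
```

Calling lis_length on the x from this answer gives 6, matching the example run.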

Retrieving the best sub-sequence:

P stores the previous element in the longest subsequence (as an index of x), for each number, and is updated as the algorithm advances. For example, when we process 8, we know it comes after 0, so store the fact that 8 is after 0 in P. You can work backwards from the last number like a linked-list to get the whole sequence.

So for each number we know the number that came before it. To find the subsequence ending in 7, we look at P and see that:

7 is after 3
3 is after 1
1 is after 0

So we have the subsequence [0, 1, 3, 7].

The subsequences ending in 11 or 15 share some numbers:

15 is after 11
11 is after 9
9 is after 6
6 is after 2
2 is after 0

So we have the subsequences [0, 2, 6, 9, 11], and [0, 2, 6, 9, 11, 15] (the longest increasing subsequence)
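
The backward walk through P can be sketched in Python as follows. This is only an illustration under a few assumptions: indexes are 0-based, the helper name longest_increasing_subsequence is made up for this example, and a parallel list of ending values is kept so that the standard bisect_left can do the binary search.

```python
from bisect import bisect_left


def longest_increasing_subsequence(x):
    """Return one longest strictly increasing subsequence of x."""
    m = []            # m[j]: index in x of the last element of the best subsequence of length j + 1
    tail_values = []  # tail_values[j] == x[m[j]], kept in parallel so bisect can search values
    p = [-1] * len(x)  # p[i]: index of the element before x[i], or -1 if x[i] comes first

    for i, value in enumerate(x):
        j = bisect_left(tail_values, value)
        # x[i] comes after the current end of the best subsequence one shorter.
        p[i] = m[j - 1] if j > 0 else -1
        if j == len(m):
            m.append(i)
            tail_values.append(value)
        else:
            m[j] = i
            tail_values[j] = value

    # Walk the links backwards from the end of the longest subsequence.
    result = []
    i = m[-1] if m else -1
    while i != -1:
        result.append(x[i])
        i = p[i]
    result.reverse()
    return result
```

For the x used in this answer, the function returns [0, 2, 6, 9, 11, 15].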

0
mridul On

One of the best explanations of this problem is given on the MIT site: http://people.csail.mit.edu/bdean/6.046/dp/

I hope it will clear all your doubts.

0
kiwirafe On

I strongly recommend clicking here for a more detailed explanation. I have also used that link for the following explanation.

The general idea of the solution is to mimic the process of Patience Sort to find the length of Longest Increasing Subsequence (LIS):

  1. Construct a list, tails. This list will store the top cards in each pile.

  2. Loop through each element in the nums list.

  3. For each element x, update the tails list based on the rules below:

    (1) If x is larger than all the elements in tails, append it.
    (2) If tails[i-1] < x <= tails[i], update tails[i] with x.

    This step finds the optimal Longest Increasing Subsequence at each stage.

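    Note that this step is done with Binary Search: because tails is always sorted, we can binary-search it to find the appropriate position for element x.

  4. The length of the tails list is the length of the Longest Increasing Subsequence. The tails list itself is not always a valid subsequence of nums (see the last step of the example below), so recovering the actual elements requires extra bookkeeping, such as a predecessor link stored for each element.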

For example, if we have nums = [4,5,6,3], then the tails list increases like this:

[4]
[4, 5]
[4, 5, 6]
[3, 5, 6]

Overall, we iterate through nums once and perform one Binary Search per element, so the time complexity is O(n log n).
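
The update rules above can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard bisect_left for the binary search; the function name tails_history is hypothetical and exists only to record the tails list after each step so it can be compared with the example.

```python
from bisect import bisect_left


def tails_history(nums):
    """Apply the tails update rules and record tails after every element."""
    tails = []
    history = []
    for x in nums:
        # First position i with tails[i] >= x (tails is always sorted).
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)   # rule (1): x is larger than every element
        else:
            tails[i] = x      # rule (2): tails[i-1] < x <= tails[i]
        history.append(list(tails))
    return history
```

For nums = [4, 5, 6, 3] this returns [[4], [4, 5], [4, 5, 6], [3, 5, 6]], and the length of the last entry is the LIS length.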

1
Trying On

Below is the O(NLogN) longest increasing subsequence implementation:

// Returns the smallest index in result[lo..hi] whose value is >= x (a
// lower-bound binary search). findLIS() only calls this when
// result[lo] <= x <= result[hi], so such an index always exists. Values equal
// to x are handled too, which keeps the subsequence strictly increasing.
public static int search(int[] result, int lo, int hi, int x)
{
    while(lo < hi)
    {
        int mid = lo + (hi - lo) / 2;
        if(result[mid] < x)
        {
            lo = mid + 1;
        }
        else
        {
            hi = mid;
        }
    }
    return lo;
}

// Returns the length of the longest strictly increasing subsequence of a.
public static int findLIS(int[] a)
{
    if(a.length == 0) return 0;
    // result[k] is the smallest value that ends an increasing subsequence of length k+1
    int[] result = new int[a.length];
    result[0] = a[0];
    int index = 0;
    for(int i=1; i<a.length; i++)
    {
        int no = a[i];
        if(no < result[0]) // replacing the min number
        {
            result[0] = no;
        }
        else if(no > result[index]) // if the number is bigger than the current max then append
        {
            result[++index] = no;
        }
        else
        {
            int c = search(result, 0, index, no);
            result[c] = no;
        }
    }
    return index+1;
}
0
iamcrypticcoder On

The O(N lg N) solution comes from patience sorting of playing cards. I found this in my old code comments and am sharing it here; I believe it makes it much easier to understand how the algorithm works. Once you understand it well, you can also recover actual longest increasing subsequences, not just the length (the second function below returns one for each card that ends up in the last pile).

https://www.cs.princeton.edu/courses/archive/spring13/cos423/lectures/LongestIncreasingSubsequence.pdf

Code:

#include <algorithm>
#include <climits>
#include <vector>
using namespace std;

// Returns the top card of each pile after patience sorting v.
// The number of piles equals the LIS length; the tops are increasing,
// but they are not guaranteed to form a subsequence of v.
vector<int> lisNlgN(vector<int> v) {
    int n = v.size();
    vector<int> piles = vector<int>(n, INT_MAX);
    int maxLen = 0;

    for(int i = 0; i < n; i++) {
        int pos = lower_bound(piles.begin(), piles.end(), v[i]) - piles.begin();
        piles[pos] = v[i];
        maxLen = max(maxLen, pos+1); // Plus 1 because of 0-based index.
    }

//    // Print piles for debug purpose
//    for (auto x : piles) cout << x << " ";
//    cout << endl;

    vector<int> ret = vector<int>(piles.begin(), piles.begin() + maxLen);
    return ret;
}

Code:

vector<vector<int>> allPossibleLIS(vector<int> v) {
    struct Card {
        int val;
        Card* parent = NULL;
        Card(int val) {
            this->val = val;
        }
    };
    auto comp = [](Card* a, Card* b) {
        return a->val < b->val;
    };

    int n = v.size();
    // Convert integers into card node
    vector<Card*> cards = vector<Card*>(n);
    for (int i = 0; i < n; i++) cards[i] = new Card(v[i]);
    vector<Card*> piles = vector<Card*>(n, new Card(INT_MAX));
    vector<Card*> lastPileCards;
    int maxLen = 0;

    for(int i = 0; i < n; i++) {
        int pos = lower_bound(piles.begin(), piles.end(), cards[i], comp) - piles.begin();
        piles[pos] = cards[i];

        // Link to top card of left pile
        if (pos == 0) cards[i]->parent = NULL;
        else cards[i]->parent = piles[pos-1];

        // Plus 1 because of 0-based index.
        if (pos+1 == maxLen) {
            lastPileCards.push_back(cards[i]);
        } else if (pos+1 > maxLen) {
            lastPileCards.clear();
            lastPileCards.push_back(cards[i]);
            maxLen = pos + 1;
        }
    }

//    Print for debug purpose
//    printf("maxLen = %d\n", maxLen);
//    printf("Total unique lis list = %d\n", lastPileCards.size());

    vector<vector<int>> ret;
    for (auto card : lastPileCards) {
        vector<int> lis;
        Card* c = card;
        while (c != NULL) {
            lis.push_back(c->val);
            c = c->parent;
        }
        reverse(lis.begin(), lis.end());
        ret.push_back(lis);
    }

    return ret;
}
0
Yaguang On

Based on @fgb's answer, I implemented the algorithm in C++ to find the longest strictly increasing subsequence. Hope this will be somewhat helpful.

M[i] is the index of the last element of the sequence whose length is i + 1, and P[i] is the index of the element that precedes element i in the sequence, which is used to print the whole sequence.

main() is used to run the simple test case: {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15}.

#include <cstdio>
#include <vector>
using std::vector;
// Prints one longest strictly increasing subsequence of v and returns its length.
int LIS(const vector<int> &v) {
  int size = v.size();
  if (size == 0) return 0;
  int max_len = 1;
  // M[k] is the index of the last element of the best sequence of length k + 1
  vector<int> M(size);
  // P[i] is the index of the element before v[i] in the sequence, or -1 if none
  vector<int> P(size);
  M[0] = 0; P[0] = -1;
  for (int i = 1; i < size; ++i) {
    if (v[i] > v[M[max_len - 1]]) {
      M[max_len] = i;
      P[i] = M[max_len - 1];
      ++max_len;
      continue;
    }
    // Find the first position whose ending value is >= v[i] using binary search
    int lo = 0, hi = max_len - 1;
    while (lo < hi) {
      int mid = lo + ((hi - lo) >> 1);
      if (v[M[mid]] < v[i]) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    // The predecessor is the current end of the sequence one shorter, not the
    // predecessor of the element being replaced (whose value may be too large).
    P[i] = lo > 0 ? M[lo - 1] : -1;
    M[lo] = i;
  }
  // Follow the P links back from the end, then print in forward order
  vector<int> seq;
  for (int i = M[max_len - 1]; i >= 0; i = P[i]) {
    seq.push_back(v[i]);
  }
  for (int k = (int)seq.size() - 1; k >= 0; --k) {
    printf("%d ", seq[k]);
  }
  printf("\n");
  return max_len;
}
int main() {
  int data[] = {0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15};
  vector<int> v(data, data + sizeof(data) / sizeof(int));
  LIS(v);
  return 0;
}
2
James Yu On

Based on fgb's answer, a Java implementation:

public class Lis {

private static int[] findLis(int[] arr) {
    if (arr.length == 0) {
        return new int[0];
    }
    // is[k] is the index in arr of the smallest value that ends an
    // increasing subsequence of length k + 1.
    int[] is = new int[arr.length];
    // prev[i] is the index of the element before arr[i] in the best
    // subsequence ending at arr[i], or -1 if arr[i] starts it.
    int[] prev = new int[arr.length];
    int index = 0;
    is[0] = 0;
    prev[0] = -1;

    for (int i = 1; i < arr.length; i++) {
        // Binary search for the first slot whose ending value is >= arr[i].
        int lo = 0, hi = index + 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (arr[is[mid]] < arr[i]) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        prev[i] = lo > 0 ? is[lo - 1] : -1;
        is[lo] = i;
        if (lo > index) {
            index = lo;
        }
    }

    // Walk the prev links back from the end of the longest subsequence.
    int[] lis = new int[index + 1];
    for (int k = index, i = is[index]; k >= 0; k--, i = prev[i]) {
        lis[k] = arr[i];
    }
    return lis;
}

public static void main(String[] args) {
    int[] arr = new int[] { 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11,
            7, 15 };
    for (int i : findLis(arr)) {
        System.out.print(i + "-");
    }
    System.out.println();

    arr = new int[] { 1, 9, 3, 8, 11, 4, 5, 6, 4, 19, 7, 1, 7 };
    for (int i : findLis(arr)) {
        System.out.print(i + "-");
    }
    System.out.println();
}

}