Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
265 views
in Technique[技术] by (71.8m points)

python - Longest increasing subsequence

Given an input sequence, what is the best way to find the longest (not necessarily continuous) increasing subsequence

[0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]  # input

[1, 9, 13, 15]  # an example of an increasing subsequence (not the longest)

[0, 2, 6, 9, 13, 15]  # longest increasing subsequence (not a unique answer)
[0, 2, 6, 9, 11, 15]  # another possible solution

I'm looking for the best algorithm. If there is code, Python would be nice, but anything is alright.

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I just stumbled in this problem, and came up with this Python 3 implementation:

def subsequence(seq):
    if not seq:
        return seq

    M = [None] * len(seq)    # offset by 1 (j -> j-1)
    P = [None] * len(seq)

    # Since we have at least one element in our list, we can start by 
    # knowing that the there's at least an increasing subsequence of length one:
    # the first element.
    L = 1
    M[0] = 0

    # Looping over the sequence starting from the second element
    for i in range(1, len(seq)):
        # Binary search: we want the largest j <= L
        #  such that seq[M[j]] < seq[i] (default j = 0),
        #  hence we want the lower bound at the end of the search process.
        lower = 0
        upper = L

        # Since the binary search will not look at the upper bound value,
        # we'll have to check that manually
        if seq[M[upper-1]] < seq[i]:
            j = upper

        else:
            # actual binary search loop
            while upper - lower > 1:
                mid = (upper + lower) // 2
                if seq[M[mid-1]] < seq[i]:
                    lower = mid
                else:
                    upper = mid

            j = lower    # this will also set the default value to 0

        P[i] = M[j-1]

        if j == L or seq[i] < seq[M[j]]:
            M[j] = i
            L = max(L, j+1)

    # Building the result: [seq[M[L-1]], seq[P[M[L-1]]], seq[P[P[M[L-1]]]], ...]
    result = []
    pos = M[L-1]
    for _ in range(L):
        result.append(seq[pos])
        pos = P[pos]

    return result[::-1]    # reversing

Since it took me some time to understand how the algorithm works I was a little verbose with comments, and I'll also add a quick explanation:

  • seq is the input sequence.
  • L is a number: it gets updated while looping over the sequence and it marks the length of longest incresing subsequence found up to that moment.
  • M is a list. M[j-1] will point to an index of seq that holds the smallest value that could be used (at the end) to build an increasing subsequence of length j.
  • P is a list. P[i] will point to M[j], where i is the index of seq. In a few words, it tells which is the previous element of the subsequence. P is used to build the result at the end.

How the algorithm works:

  1. Handle the special case of an empty sequence.
  2. Start with a subsequence of 1 element.
  3. Loop over the input sequence with index i.
  4. With a binary search find the j that let seq[M[j] be < than seq[i].
  5. Update P, M and L.
  6. Traceback the result and return it reversed.

Note: The only differences with the wikipedia algorithm are the offset of 1 in the M list, and that X is here called seq. I also test it with a slightly improved unit test version of the one showed in Eric Gustavson answer and it passed all tests.


Example:

seq = [30, 10, 20, 50, 40, 80, 60]

       0    1   2   3   4   5   6   <-- indexes

At the end we'll have:

M = [1, 2, 4, 6, None, None, None]
P = [None, None, 1, 2, 2, 4, 4]
result = [10, 20, 40, 60]

As you'll see P is pretty straightforward. We have to look at it from the end, so it tells that before 60 there's 40,before 80 there's 40, before 40 there's 20, before 50 there's 20 and before 20 there's 10, stop.

The complicated part is on M. At the beginning M was [0, None, None, ...] since the last element of the subsequence of length 1 (hence position 0 in M) was at the index 0: 30.

At this point we'll start looping on seq and look at 10, since 10 is < than 30, M will be updated:

if j == L or seq[i] < seq[M[j]]:
    M[j] = i

So now M looks like: [1, None, None, ...]. This is a good thing, because 10 have more chanches to create a longer increasing subsequence. (The new 1 is the index of 10)

Now it's the turn of 20. With 10 and 20 we have subsequence of length 2 (index 1 in M), so M will be: [1, 2, None, ...]. (The new 2 is the index of 20)

Now it's the turn of 50. 50 will not be part of any subsequence so nothing changes.

Now it's the turn of 40. With 10, 20 and 40 we have a sub of length 3 (index 2 in M, so M will be: [1, 2, 4, None, ...] . (The new 4 is the index of 40)

And so on...

For a complete walk through the code you can copy and paste it here :)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...