you're reading...

Longest Increasing Subsequence in O(n lg n) in Erlang.

In this attempt I have tried to code for more of a typical problem, finding longest increasing subsequence, which I will call LIS throughout.

LIS is sister of a more general Longest Common Subsequence (LCS) problem used to find longest common subsequences in two strings. Both of them are solved using dynamic programming paradigm in O(n2) time. In LIS the longest increasing subsequence is found out. LIS is actually LCS of S and S(sorted). So sort the sequence, find LCS and the resulting substring will be the LIS. This being overall O(n2) operation.

But the tricky part here is giving the solution for LIS in O(n lg n) time. The solution derives from something called as patience sort which deals with sorting card solitare game and performs in O(n lg n) time. The data structure used in Erlang is modified version of gb_trees.erl, where an element is searched in gb_tree and if it’s found its value is returned : {value, Value} and if its not found then its neighbors are returned : {neighbors, {K1, V1}, {K2, V2}} when both neighbors exist and  {neighbors, min, {K2, V2}} and {neighbours, {K1, V1}, max} when less than least and greater than largest values are queried for the tree. When there is single element in tree and element is not found then tuple {single, {K, V}} is returned. Save the source as mytrees.erl, http://pastebin.com/EzY3Au5s.

The algorithm for finding LIS runs like this. Initially there’s this whole range from 0-length(list) , after processing each element of the list sequentially, the range is broken down and corresponding value for length of largest subsequence till that element is given by the value for the key as the current element of the list. We go along with two gb_trees, thats basically two binary trees one saving range, this one initially being (suppose length of list is 10); {K,V}=> {“0-9″=>1} and {K1,V1}=>{“0-4”,1}, {K2,V2}=>{“5-9”,2} the other tree stores the range as the value {K, V}=>{0,9} and {K1,V1}=>{0,4}, {K2,V2}=>{5,9} and so on. Save this one as lis_dc.erl, http://pastebin.com/vS5WfKZP

1> c(mytrees.erl).
2> c(lis_dc.erl).

3> lis_dc:test([34,76,99,31,90,111,455,213,231,534,11,33]).
{0, 0}
{3, 0}
{4, 0}
{0, 3}
{4, 6}
{5, 0}
{7, 0}
{7, 10}
{8, 0}
{9, 0}
{0, 4}

Length of longest Increasing subsequence is 7 and it ends at 534.

Each time we search on the element of the list, we make a search in O (log n) time on trees to get the value and adjusting the keys in O(1)  time, making it in overall in O(n log n) time.

There’s one obvious drawback with this code/solution that this doesn’t handle the case with duplicate occurring value in the list. Since erlang gb_trees take only single values this is the case.

As far as that is ignored, we are good to go.



No comments yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: