Longest Common Subsequence¶

We can think of subsequences like binary strings.

rules are not all that great

1111111110000000000000011111

Ursinus college students are great students

01000100001000000000100000000010010111111111000000000

We know there are $2^N$ unique binary strings of length $N$, so a brute force algorithm would be exponential at best. Instead, we're going to make a recursive operation to work backwards. Your job in the next exercise will be to use memoization to make this faster, since there are many overlapping subproblems that you can remember to cut off branches in the recursion

In [1]:
from functools import lru_cache

@lru_cache(maxsize=None) # Python's built-in memoization
def LCS(s1, s2):
    """
    Parameters
    ----------
    s1: string
        First string
    s2: string
        Second string
    """
    ## Problem instance
    key = (len(s1), len(s2))
    
    if len(s1) == 0 or len(s2) == 0:
        return 0
    if s1[-1] == s2[-1]:
        return 1 + LCS(s1[0:-1], s2[0:-1])
    else:
        return max(LCS(s1[0:-1], s2), LCS(s1, s2[0:-1]))

s1 = "rules are not all that great"
s2 = "Ursinus college students are great students"
LCS(s1, s2)
Out[1]:
16

Memoization review with Fibonacci¶

In [ ]:
def fib(N, mem={}):
    res = 1
    # Check once at the beginning if 
    # my problem has already been computed
    if N in mem: 
        res = mem[N]
    else:
        # Otherwise, compute the solution
        if N > 2:
            res = fib(N-1) + fib(N-2)
        # Store the solution in memory so we don't have
        # to repeat it again
        mem[N] = res
    return res

for i in range(1, 100):
    print(fib(i), end=",")
In [ ]: