Longest Common Subsequence Pattern

Problems Following Longest Common Subsequence Pattern

1. Longest Common Subsequence

Problem Statement:

Given two strings str1 and str2 find the length of the longest common subsequence in both the strings.

A subsequence is a sequence that can be derived from another sequence by deleting some or no elements without changing order.

===== Examples =====
Input: str1 = "abdca", str2 = "cbda"
Output: 2
Explanation: The longest common substring is "bd"

Input: str1 = "passport", str2 = "ppsspt"
Output: 5
Explanation: The longest common substring is "psspt"


Brute-Force : Recursive Solution

Code:

def find_longest_common_subsequence_recursion(str1, str2):
    i, j = 0, 0
    return find_longest_common_subsequence_recursion_util(str1, str2, i, j)


def find_longest_common_subsequence_recursion_util(str1, str2, i, j):
    if (i == len(str1) or j == len(str2)):
        return 0

    if (str1[i] == str2[j]):
        return 1 + find_longest_common_subsequence_recursion_util(str1, str2, i + 1, j + 1)

    skip_1st_count = find_longest_common_subsequence_recursion_util(str1, str2, i + 1, j)
    skip_2nd_count = find_longest_common_subsequence_recursion_util(str1, str2, i, j + 1)

    return max(skip_1st_count, skip_2nd_count)


print("Recursive Method :")
print(find_longest_common_subsequence_recursion("abdbca", "cbda"))
print(find_longest_common_subsequence_recursion("passport", "ppsspt"))

Output:

Recursive Method :
3
5

Complexity:


DP : Recursion + Memoization (Top-Down) Solution

Code:

def find_longest_common_subsequence_dp_memoization(str1, str2):
    i, j = 0, 0
    n1, n2 = len(str1), len(str2)
    memory = [[None]*(n2) for _ in range(n1)]
    return find_longest_common_subsequence_dp_memoization_util(str1, str2, i, j, memory)


def find_longest_common_subsequence_dp_memoization_util(str1, str2, i, j, memory):
    if (i == len(str1) or j == len(str2)):
        return 0

    if (memory[i][j]):
        return memory[i][j]

    if (str1[i] == str2[j]):
        memory[i][j] = 1 + find_longest_common_subsequence_dp_memoization_util(str1, str2, i + 1, j + 1, memory)
        return memory[i][j]

    skip_1st_count = find_longest_common_subsequence_dp_memoization_util(str1, str2, i + 1, j, memory)
    skip_2nd_count = find_longest_common_subsequence_dp_memoization_util(str1, str2, i, j + 1, memory)
    memory[i][j] = max(skip_1st_count, skip_2nd_count)

    return memory[i][j]


print("\nDP -> Recursion + Memoization Method :")
print(find_longest_common_subsequence_dp_memoization("abdbca", "cbda"))
print(find_longest_common_subsequence_dp_memoization("passport", "ppsspt"))

Output:

DP -> Recursion + Memoization Method :
3
5

Complexity:


DP : Iteration + Tabulation (Bottom-Up) Solution

Table Filling Formula:

if(str1[i] == str2[j]):

​ table[i][j] = 1 + table[i-1][j-1]

else:

​ table[i][j] = max(table[i-1][j], table[i][j-1])

Tabulation:

Final Table:

Code:

def find_longest_common_subsequence_dp_tabulation(str1, str2):
    n1, n2 = len(str1), len(str2)
    table = [[0] * (n2 + 1) for _ in range(n1 + 1)]

    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            if (str1[i - 1] == str2[j - 1]):
                table[i][j] = 1 + table[i-1][j-1]
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    return table[i][j]


print("\nDP -> Iteration + Tabulation Method :")
print(find_longest_common_subsequence_dp_tabulation("abdbca", "cbda"))
print(find_longest_common_subsequence_dp_tabulation("passport", "ppsspt"))

Output:

DP -> Iteration + Tabulation Method :
3
5

Complexity:


DP : Iteration + Tabulation (Bottom-Up) Space Optimized Solution

Code:

def find_longest_common_subsequence_dp_tabulation_space_optimized(str1, str2):
    n1, n2 = len(str1), len(str2)
    table = [[0] * (n2 + 1) for _ in range(2)]

    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            if (str1[i - 1] == str2[j - 1]):
                table[i % 2][j] = 1 + table[(i - 1) % 2][j - 1]
            else:
                table[i % 2][j] = max(table[(i - 1) % 2][j], table[i % 2][j - 1])

    return table[i % 2][j]


print("\nDP -> Iteration + Tabulation Space Optimized Method :")
print(find_longest_common_subsequence_dp_tabulation_space_optimized("abdbca", "cbda"))
print(find_longest_common_subsequence_dp_tabulation_space_optimized("passport", "ppsspt"))

Output:

DP -> Iteration + Tabulation Space Optimized Method :
3
5

Complexity:



2. Longest Common Substring (LCS)

Problem Statement:

Given two string str1 and str2, find the length of longest substring which is common in both the strings.

===== Examples =====
Input: str1 = "abdca", str2 = "cbda"
Output: 2
Explanation: The longest common substring is "bd"

Input: str1 = "passport", str2 = "psspt"
Output: 3
Explanation: The longest common substring is "ssp"


Brute-Force : Recursive Solution

Code:

def find_lcs_length_recursion(str1, str2):
    i, j, lcs_length = 0, 0, 0
    return find_lcs_length_recursion_util(str1, str2, i, j, lcs_length)


def find_lcs_length_recursion_util(str1, str2, i, j, lcs_length):
    if (i == len(str1) or j == len(str2)):
        return lcs_length

    lcs_length_1 = lcs_length
    if (str1[i] == str2[j]):
        lcs_length_1 = find_lcs_length_recursion_util(str1, str2, i + 1, j + 1, lcs_length + 1)

    lcs_length_2 = find_lcs_length_recursion_util(str1, str2, i, j + 1, 0)
    lcs_length_3 = find_lcs_length_recursion_util(str1, str2, i + 1, j, 0)

    return max(lcs_length_1, lcs_length_2, lcs_length_3)


print("Recursive Method :")
print(find_lcs_length_recursion("abdbca", "cbda"))
print(find_lcs_length_recursion("passport", "psspt"))

Output:

Recursive Method :
2
3

Complexity:


DP : Recursion + Memoization (Top-Down) Solution

Code:

def find_lcs_length_dp_memoization(str1, str2):
    n1, n2 = len(str1), len(str2)
    max_lcs_length = min(n1, n2)
    memory = [[[None] * max_lcs_length for _ in range(n2)] for _ in range(n1)]
    i, j, lcs_length = 0, 0, 0

    return find_lcs_length_dp_memoization_util(str1, str2, i, j, lcs_length, memory)


def find_lcs_length_dp_memoization_util(str1, str2, i, j, lcs_length, memory):
    if (i == len(str1) or j == len(str2)):
        return lcs_length

    if (memory[i][j][lcs_length]):
        return memory[i][j][lcs_length]

    lcs_length_1 = lcs_length
    if (str1[i] == str2[j]):
        lcs_length_1 = find_lcs_length_dp_memoization_util(str1, str2, i + 1, j + 1, lcs_length + 1, memory)

    lcs_length_2 = find_lcs_length_dp_memoization_util(str1, str2, i, j + 1, 0, memory)
    lcs_length_3 = find_lcs_length_dp_memoization_util(str1, str2, i + 1, j, 0, memory)

    memory[i][j][lcs_length] = max(lcs_length_1, lcs_length_2, lcs_length_3)

    return memory[i][j][lcs_length]


print("\nDP -> Recursion + Memoization Method :")
print(find_lcs_length_dp_memoization("abdbca", "cbda"))
print(find_lcs_length_dp_memoization("passport", "psspt"))

Output:

DP -> Recursion + Memoization Method :
2
3

Complexity:


DP : Iteration + Tabulation (Bottom-Up) Solution

Table Filling Formula:

if(str1[i] == str2[j]):

​ table[i][j] = 1 + table[i-1][j-1]

else:

​ table[i][j] = 0

Tabulation:

Final Table:

Here we can see that longest common substring is of length 2, represented by table[3][3].

Code:

def find_lcs_length_dp_tabulation(str1, str2):
    n1, n2 = len(str1), len(str2)
    table = [[0] * (n2 + 1) for _ in range(n1 + 1)]

    lcs_length = 0
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            if (str1[i - 1] == str2[j - 1]):
                table[i][j] = 1 + table[i - 1][j - 1]
                lcs_length = max(lcs_length, table[i][j])
            else:
                table[i][j] = 0

    return lcs_length


print("\nDP -> Iteration + Tabulation Method :")
print(find_lcs_length_dp_tabulation("abdbca", "cbda"))
print(find_lcs_length_dp_tabulation("passport", "psspt"))

Output:

DP -> Iteration + Tabulation Method :
2
3

Complexity:


DP : Iteration + Tabulation (Bottom-Up) Space Optimized Solution

Code:

def find_lcs_length_dp_tabulation_space_optimized(str1, str2):
    n1, n2 = len(str1), len(str2)
    table = [[0] * (n2 + 1) for _ in range(2)]

    lcs_length = 0
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            if (str1[i - 1] == str2[j - 1]):
                table[i % 2][j] = 1 + table[(i - 1) % 2][j - 1]
                lcs_length = max(lcs_length, table[i % 2][j])
            else:
                table[i % 2][j] = 0

    return lcs_length


print("\nDP -> Iteration + Tabulation Space Optimized Method :")
print(find_lcs_length_dp_tabulation_space_optimized("abdbca", "cbda"))
print(find_lcs_length_dp_tabulation_space_optimized("passport", "psspt"))

Output:

DP -> Iteration + Tabulation Space Optimized Method :
2
3

Complexity:



3. Minimum Deletions & Insertions to Transform a String into another

Problem Statement:

Given string str1 and str2, we need to transform str1 into str2 by deleting and inserting characters.

Calculate the count of the minimum number of deletion and insertion operations.

====== Examples ======
Input: str1 = "abc", str2 = "fbc"
Output: 1 deletion and 1 insertion
Explanation: We need to delete {'a'} from str1 and insert {'f'} into it to transform to str2

Input: str1 = "abdca", str2 = "cbda"
Output: 2 deletions and 1 insertion
Explanation: We need to delete {'a', 'c'} from str1 and insert {'c'} into it to transform to str2

Input: str1 = "passport", str2 = "ppsspt"
Output: 3 deletions and 1 insertion
Explanation: We need to delete {'a', 'o', 'r'} from str1 and insert {'p'} into it to transform to str2


Solution Approach:


DP : Iteration + Tabulation (Bottom-Up) Solution
def min_delete_insert_to_transform(str1, str2):
    n1, n2 = len(str1), len(str2)
    table = [[0] * (n2 + 1) for _ in range(2)]

    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            if (str1[i - 1] == str2[j - 1]):
                table[i % 2][j] = 1 + table[(i - 1) % 2][j - 1]
            else:
                table[i % 2][j] = max(table[(i - 1) % 2][j], table[i % 2][j - 1])

    lcs_count = table[i % 2][j]
    print(f"Minimum Deletions Needed : {len(str1)-lcs_count}")
    print(f"Minimum Insertions Needed : {len(str2)-lcs_count}\n")


print("DP -> Iteration + Tabulation Space Optimized Method :")
min_delete_insert_to_transform("abc", "fbc")
min_delete_insert_to_transform("abdca", "cbda")
min_delete_insert_to_transform("passport", "ppsspt")

Output:

DP -> Iteration + Tabulation Space Optimized Method :
Minimum Deletions Needed : 1
Minimum Insertions Needed : 1

Minimum Deletions Needed : 2
Minimum Insertions Needed : 1

Minimum Deletions Needed : 3
Minimum Insertions Needed : 1

Complexity:



4. Shortest Common Super-sequence

5. Longest Repeating Subsequence

6. Subsequence Pattern Matching

7.Edit Distance

8. Strings Interleaving




← Previous: Palindromic Subsequence Pattern

Next: Longest Increasing Subsequence Pattern →