SPOJ 6219. Edit Distance (EDIST) with F#

This problem can be solved using dynamic programming with memoization technique. In essence, it is about computing the Edit Distance, also known as, Levenshtein Distance between two given strings.

Definition

Edit Distance—a.k.a “Lavenshtein Distance”–is the minimum number of edit operations required to transform one word into another. The allowable edit operations are letter insertion, letter deletion and letter substitution.

Implementation

Using Dynamic Programming, we can compute the edit distance between two string sequences. But for that, we need to derive a recursive definition of Edit Distance. We denote the distance between two strings as D, which can be defined using  a recurrence as follows.

Case 1 : Both and are empty strings, denoted as :

Case 2 : Either or is :

Case 3 : Both and are not :

daum_equation_1359460361956

Here , where is the last character of and contains rest of the characters. Same goes for .  We define edit distance between and using a recurrence and in term of and .

can be defined as the minimum, or the least expensive one of the following three alternatives stated in the above equation.

  • Substitution: If , then the overall distance is simply . Otherwise, we need a substitution operation that replaces with , and thus, the overall distance will be .
  • Insertion: Second possibility is to convert to by inserting in . In this case, the distance will be . Here, +1 is the cost of the insert operation.
  • Deletion: Last alternative is to convert to by deleting from that costs +1. Then the distance become .

As this is a ternary recurrence, it would result in an exponential run-time, which is quite  impractical. However, using the dynamic programming with memoization, this recurrence can be solved using a 2D array. The code to solve this problem is outline below.

let computeEditDistance (source:string,target:string) =
let height,width = (source.Length, target.Length)
let grid: int [,] = Array2D.zeroCreate<int> (height+1) (width+1) // 2D Array for memoization
for h = 0 to height do
for w = 0 to width do
grid.[h,w] <-
match h,w with
| h,0 -> h // case 1 and 2
| 0, w -> w
| h, w ->
let s,t = source.[h-1],target.[w-1]
let substitution = grid.[h-1,w-1]+(if s = t then 0 else 1)
let insertion = grid.[h,w-1] + 1
let deletion = grid.[h-1,w] + 1
min (insertion, deletion, substitution) // case 3
grid.[height,width]
view raw gistfile1.fs hosted with ❤ by GitHub

As shown in line 14, the distance grid.[h,w] can be computed locally by taking the min of the three alternatives stated in the recurrence (computed in line 11,12, 13). By obtaining the locally optimum solutions, we eventually get the edit distance from  grid.[s.length, t.length].

Complexity: Run-time complexity: . Lets denote the lengths of both strings as . Then, the complexity become . Space complexity is also same.

Complete source code is outlined in the next page.

Advertisement

2 thoughts on “SPOJ 6219. Edit Distance (EDIST) with F#”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: