Constructing a balanced Binary Search Tree from a sorted List in O(N) time

This post discusses a O(n) algorithm that construct a balanced binary search tree (BST) from a sorted list. For instance, consider that we are given a sorted list: [1,2,3,4,5,6,7]. We have to construct a balanced BST as follows.

        4
        |
    2        6 
    |        |
  1   3   5     7

To do so, we use the following definition of Tree, described in Scala By Example book.

abstract class IntSet
case object Empty extends IntSet
case class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet

One straight-forward approach would be to repeatedly perform binary search on the given list to find the median of the list, and then, to construct a balanced BST recursively. Complexity of such approach is O(nlogn), where n is the number of elements in the given list.

A better algorithm constructs balanced BST while iterating the list only once. It begins with the leaf nodes and construct the tree in a bottom-up manner. As such, it avoids repeated binary searches and achieves better runtime complexity (i.e., O(n), where n is the total number of elements in the given list). Following Scala code outlines this algorithm, which effectively converts a list ls to an IntSet, a balanced BST:

def toTree(ls: List[Int]): IntSet = {
def toTreeAux(ls: List[Int], n: Int): (List[Int], IntSet) = {
if (n <= 0)
(ls, Empty)
else {
val (lls, lt) = toTreeAux(ls, n / 2) // construct left sub-tree
val x :: xs = lls // extract root node
val (xr, rt) = toTreeAux(xs, n - n / 2 - 1) // construct right sub-tree
(xr, IntSet(x, lt, rt)) // construct tree
}
}
val (ls_1, tree) = toTreeAux(ls, List.length(ls))
tree
}
view raw toTree.scala hosted with ❤ by GitHub

Any comment or query regarding this post is highly appreciated. Thanks.

Advertisement

UVa 136. Ugly Numbers

This blog-post is about UVa 136: Ugly Number, a trivial, but interesting UVa problem. The crux involves computing 1500th Ugly number, where a Ugly number is defined as a number whose prime factors are only 2, 3 or 5. Following illustrates a sequence of Ugly numbers:

1,2,3,4,5,6,8,9,10,12,15...

Using F#, we can derive 1500th Ugly Number in F#’s REPL as follows.

seq{1L..System.Int64.MaxValue}
|> Seq.filter (isUglyNumber)
|> Seq.take 1500
|> Seq.last
view raw gistfile1.fs hosted with ❤ by GitHub

In this context, the primary function the determines whether a number is a Ugly number or not–isUglyNumber–is outlined as follows. As we can see, it is a naive algorithm that can be further optimized using memoization (as listed here).

(*
* :a:int64 -> b:bool
*)
let isUglyNumber (x:int64) :bool =
let rec checkFactors (n_org:int64) (n:int64) lfactors =
match n with
| 1L -> true
| _ ->
match lfactors with
| [] -> false
| d::xs ->
if isFactor n d then
checkFactors n_org (n/d) lfactors
elif n > d then
checkFactors n_org n xs
else
false
checkFactors x x [2L;3L;5L]
view raw gistfile1.fs hosted with ❤ by GitHub

After computing the 1500th Ugly number in this manner, we submit the Java source code listed below. For complete source code. please visit this gist. Alternatively, this script is also available at tryfsharp.org for further introspection.

class Main {
public static void main(String[] args) {
System.out.println("The 1500'th ugly number is 859963392.");
}
}
view raw gistfile1.java hosted with ❤ by GitHub

Please leave a comment in case of any question or improvement of this implementation. Thanks.

SPOJ 6219. Edit Distance (EDIST) with F#

This problem can be solved using dynamic programming with memoization technique. In essence, it is about computing the Edit Distance, also known as, Levenshtein Distance between two given strings.

Definition

Edit Distance—a.k.a “Lavenshtein Distance”–is the minimum number of edit operations required to transform one word into another. The allowable edit operations are letter insertion, letter deletion and letter substitution.

Implementation

Using Dynamic Programming, we can compute the edit distance between two string sequences. But for that, we need to derive a recursive definition of Edit Distance. We denote the distance between two strings as D, which can be defined using  a recurrence as follows.

Case 1 : Both and are empty strings, denoted as :

Case 2 : Either or is :

Case 3 : Both and are not :

daum_equation_1359460361956

Here , where is the last character of and contains rest of the characters. Same goes for .  We define edit distance between and using a recurrence and in term of and .

can be defined as the minimum, or the least expensive one of the following three alternatives stated in the above equation.

  • Substitution: If , then the overall distance is simply . Otherwise, we need a substitution operation that replaces with , and thus, the overall distance will be .
  • Insertion: Second possibility is to convert to by inserting in . In this case, the distance will be . Here, +1 is the cost of the insert operation.
  • Deletion: Last alternative is to convert to by deleting from that costs +1. Then the distance become .

As this is a ternary recurrence, it would result in an exponential run-time, which is quite  impractical. However, using the dynamic programming with memoization, this recurrence can be solved using a 2D array. The code to solve this problem is outline below.

let computeEditDistance (source:string,target:string) =
let height,width = (source.Length, target.Length)
let grid: int [,] = Array2D.zeroCreate<int> (height+1) (width+1) // 2D Array for memoization
for h = 0 to height do
for w = 0 to width do
grid.[h,w] <-
match h,w with
| h,0 -> h // case 1 and 2
| 0, w -> w
| h, w ->
let s,t = source.[h-1],target.[w-1]
let substitution = grid.[h-1,w-1]+(if s = t then 0 else 1)
let insertion = grid.[h,w-1] + 1
let deletion = grid.[h-1,w] + 1
min (insertion, deletion, substitution) // case 3
grid.[height,width]
view raw gistfile1.fs hosted with ❤ by GitHub

As shown in line 14, the distance grid.[h,w] can be computed locally by taking the min of the three alternatives stated in the recurrence (computed in line 11,12, 13). By obtaining the locally optimum solutions, we eventually get the edit distance from  grid.[s.length, t.length].

Complexity: Run-time complexity: . Lets denote the lengths of both strings as . Then, the complexity become . Space complexity is also same.

Complete source code is outlined in the next page.