## SPOJ 8545. Subset Sum (Main72) with Dynamic Programming and F#

The Subset Sum (Main72) problem, officially published in SPOJ, is about computing the sum of all integers that can be obtained by summing the elements of any subset of a given set of integers. A naïve solution would be to enumerate all the subsets of the given set, which unfortunately would result in $O(2^N)$ time complexity, where $N$ is the number of elements in the set.

This post outlines a more efficient (pseudo-polynomial) solution to this problem using Dynamic Programming and F#. Additionally, we post C# code of the solution.

Note that we have solved a similar problem in the post Party Schedule (PARTY) with F#.

# Interpretation

This problem provides a set of integers $S$ and specifies the following constraints:

• The size of the given set is $N = |S|$, where the value of $N$ is bounded as specified in the problem statement.
• For every item $a_i \in S$, the value of $a_i$ is likewise bounded as specified in the problem statement.

Given this input, we would like to find all the integers $x$ such that $x$ is the sum of the items of some subset of $S$. Afterward, we sum all these integers and return the total as the result of the problem instance.

In essence, we reduce this problem as follows: given an integer $x$, can we express it as the sum of some subset of $S$? If yes, we include it in the solution set for summation. Interestingly, the stated problem is a special case of a more general problem called Subset Sum, where the target sum is $x$.

# Algorithm

What would be the maximum possible value of $x$? An unbounded search is not practical at all; fortunately, $x$ is bounded by the following upper limit: $x \le \sum_{a_i \in S} a_i$, i.e., the sum of all the items in $S$. This observation effectively reduces the search space to $0 \le x \le \sum_{a_i \in S} a_i$ for a given $S$.

It implies that a naïve algorithm would have to iterate over all the subsets of $S$ and record which sums within this range they produce. Recall that, due to its exponential time complexity, this is quite impractical.

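Using dynamic programming, a pseudo-polynomial algorithm can be derived, as the problem has an inherent optimal substructure property. That is, a solution of an instance of the problem can be expressed in terms of the solutions of its subproblems, as described next.

We define $\mathrm{subsum}(i, j)$ as the function that determines whether the items of some subset of the first $i$ items $\{a_1, \dots, a_i\}$ sum to the integer $j$. So, it yields true if the sum $j$ can be derived from some subset, and false otherwise. Also, note that $0 \le i \le N$ and $0 \le j \le \sum_{a_k \in S} a_k$.

To define the recurrence, we describe $\mathrm{subsum}(i, j)$ in terms of its smaller subproblems as follows.

$$
\mathrm{subsum}(i, j) =
\begin{cases}
\mathrm{subsum}(i-1, j) & \text{if } a_i > j \\
\mathrm{subsum}(i-1, j) \lor \mathrm{subsum}(i-1, j - a_i) & \text{otherwise}
\end{cases}
\tag{1}
$$

In Eq. (1), the first case refers to the fact that $a_i$ is larger than $j$. Consequently, $a_i$ cannot be included in the subset to derive $j$. Case 2 of Eq. (1) splits the problem into two subproblems: we can either ignore $a_i$ even though $a_i \le j$, or we can include it. If either case of Eq. (1) derives $j$, i.e., $\mathrm{subsum}(N, j)$ = true, we include $j$ in the solution set.

As the subproblems overlap, we can solve them effectively using a bottom-up dynamic programming approach. What about the base cases? The sum $0$ is always derivable from the empty subset, so $\mathrm{subsum}(i, 0)$ = true for all $i$; and with no items available, no nonzero sum can be derived, so $\mathrm{subsum}(0, j)$ = false for $j > 0$.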

Using a table, `dp`, we can implement the stated algorithm as follows.

```fsharp
let computeSubsetSum (set:int array, n:int, sum:int):int =
    // dp.[i,j] = true, if some subset of the first i items sums to j.
    // It is derived from:
    //   dp.[i-1, j-set.[i-1]] => ith item is included
    //   or
    //   dp.[i-1, j]           => ith item is not included
    let dp:bool[,] = Array2D.zeroCreate (n+1) (sum+1)

    // This dp tries to answer the following question: given a sum
    // j, can we derive it by summing any subset of the first i items?
    // It computes this answer in a bottom-up manner; hence, starting
    // from (i,j), if it can reach (i,0), the answer is yes. Otherwise,
    // if sum<>0 but i=0, the answer is no, because the sum cannot be
    // computed from any subset.

    // base case 1 : given sum = 0, answer is true
    // (the empty subset sums to zero for any set of items)
    for i = 0 to n do
        dp.[i,0] <- true

    // base case 2 : sum <> 0 but OPT = empty --> answer is false
    for j = 1 to sum do
        dp.[0,j] <- false

    for i = 1 to n do
        let v_i = set.[i-1]
        for j = 1 to sum do
            dp.[i,j] <-
                if j - v_i < 0 then
                    // we can't include the ith item in OPT
                    dp.[i-1,j]
                else
                    dp.[i-1,j] || dp.[i-1,j-v_i]

    let mutable result = 0
    for j = 1 to sum do
        result <- result + (if dp.[n,j] then j else 0)
    result
```

In essence, the $N$th row of the table provides the set of integers that can be derived by summing over some subset of $S$. Thereby, we compute the sum of all the integers $j$ that satisfy subsum(N,j) = true, and return it as the result.
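Since each row of the table depends only on the row before it, the same idea can also be sketched with a single row. The following is a minimal sketch under that assumption, not the original solution; the name `sumOfReachableSums` is hypothetical, and the upper bound is taken to be the sum of all items.

```fsharp
// Hypothetical helper (not from the original post): a one-row variant of the dp table.
let sumOfReachableSums (items: int[]) : int =
    let total = Array.sum items
    // reachable.[j] = true if some subset of the items seen so far sums to j
    let reachable = Array.create (total + 1) false
    reachable.[0] <- true // the empty subset sums to 0
    for v in items do
        // iterate downwards so that each item is used at most once
        for j = total downto v do
            if reachable.[j - v] then reachable.[j] <- true
    // add up every reachable sum j >= 1
    let mutable result = 0
    for j = 1 to total do
        if reachable.[j] then result <- result + j
    result
```

For the set {1, 2, 3}, every integer from 1 to 6 is reachable, so `sumOfReachableSums [|1; 2; 3|]` evaluates to 21.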

# Conclusion

Enough said… now, it's time for a new problem. See you soon; till then, happy problem-solving!

## SPOJ 346. Bytelandian Gold Coins (COINS) with Dynamic Programming and F#

The Bytelandian Gold Coins problem, officially published in SPOJ, concerns computing the maximum dollars that can be exchanged for a Bytelandian gold coin. In this post, we outline a solution to this problem with memoization and F#.

# Interpretation

The problem definition enforces the following rules to perform the exchange. Consider a Bytelandian gold coin $n$:

• It can be exchanged for three other coins, i.e., the coins $\lfloor n/2 \rfloor$, $\lfloor n/3 \rfloor$ and $\lfloor n/4 \rfloor$. Thus, coin $n$ yields the value $\lfloor n/2 \rfloor + \lfloor n/3 \rfloor + \lfloor n/4 \rfloor$ in Bytelandian gold coins.
• Alternatively, coin $n$ can be exchanged for $n$ dollars.

Our objective is to derive an algorithm that maximizes the dollars exchanged from the gold coin $n$.

# Algorithm

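From the above interpretation, it is evident that the maximum achievable dollars, $\mathrm{OPT}(n)$ (from the exchange of coin $n$), can be computed as follows.

$$
\mathrm{OPT}(n) = \max\bigl(n,\ \mathrm{OPT}(\lfloor n/2 \rfloor) + \mathrm{OPT}(\lfloor n/3 \rfloor) + \mathrm{OPT}(\lfloor n/4 \rfloor)\bigr)
$$

It effectively demonstrates an optimal substructure and therefore hints at a dynamic programming (DP) technique to solve it. That is, for a coin $n$, the optimal dollar value is given by the following function.

$$
\mathrm{OPT}(n) =
\begin{cases}
n & \text{if } n \le 1 \\
\max\bigl(n,\ \mathrm{OPT}(\lfloor n/2 \rfloor) + \mathrm{OPT}(\lfloor n/3 \rfloor) + \mathrm{OPT}(\lfloor n/4 \rfloor)\bigr) & \text{otherwise}
\end{cases}
$$

We employ a top-down DP approach, as it seems more efficient than a bottom-up approach in this context. This is because a bottom-up approach generally requires an OPT table to persist the results of all smaller subproblems. As the value of $n$ can be very large in this case (i.e., $0 \le n \le 10^9$), a bottom-up DP would require a very large array and would perform more computations. Hence, for the overlapping subproblems, we employ memoization.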

```fsharp
open System
open System.Collections.Generic

// Relies on the Memo module listed below, which must be defined first.
let computeMaxDollars (n:int) (memo:Dictionary<int64,int64>) =
    let rec computeMaxDollars' (ni:int64) =
        if ni = 0L || ni = 1L then // base case
            ni
        else
            match memo |> Memo.tryFind ni with
            | Some (nx) -> nx // found in memo. Returning result.
            | None ->
                let f = computeMaxDollars'
                let nx =
                    (ni/2L, ni/3L, ni/4L)
                    |> (fun (x,y,z) -> (f x) + (f y) + (f z))
                    |> (fun nx -> Math.Max(ni,nx))
                memo |> Memo.add ni nx |> ignore // storing the result in memo
                nx
    computeMaxDollars' (n|>int64)
```

The following code snippet outlines the implementation of `Memo`.

```fsharp
open System.Collections.Generic

module Memo =
    // keys and values are int64, matching computeMaxDollars
    let empty () = new Dictionary<int64,int64>()
    let add k v (memo:Dictionary<int64,int64>) =
        memo.[k] <- v; memo
    let tryFind k (memo:Dictionary<int64,int64>) =
        match memo.TryGetValue(k) with
        | true, v -> Some(v)
        | false,_ -> None
```
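To try the recurrence in isolation, the memoized top-down approach can be condensed into one self-contained function. This is a minimal sketch rather than the post's solution: the name `maxDollars` is hypothetical, and the dictionary is created inside the function instead of being passed in.

```fsharp
open System.Collections.Generic

// Hypothetical condensed variant: memoized top-down evaluation of the recurrence.
let maxDollars (n: int64) : int64 =
    let memo = Dictionary<int64, int64>()
    let rec go (k: int64) : int64 =
        if k <= 1L then k // base case: coins 0 and 1 are worth their face value
        else
            match memo.TryGetValue k with
            | true, v -> v // already computed
            | _ ->
                // either sell the coin directly, or exchange it and recurse
                let v = max k (go (k / 2L) + go (k / 3L) + go (k / 4L))
                memo.[k] <- v
                v
    go n
```

For example, `maxDollars 12L` evaluates to 13, since exchanging coin 12 yields coins 6, 4 and 3, whereas `maxDollars 2L` stays at 2 because exchanging coin 2 would only yield 1 + 0 + 0.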

Happy problem-solving!

## Collatz Problem a.k.a. 3n+1 Problem

This post focuses on the Collatz problem, which is also known as, among others, the $3n+1$ problem and the Syracuse problem.

Outline. We begin by introducing the Collatz conjecture; afterwards, we present an algorithm to solve the problem (UVa 100 or SPOJ 4073) published in both UVa and SPOJ. The primary advantage of having it in SPOJ is that we can use F# to derive a simple and elegant solution; at the same time, we can verify it via SPOJ’s online judge.

# Background: Collatz Conjecture

The Collatz problem, also known as the $3n+1$ problem, concerns the iterates generated by repeated applications of the following function: given a positive integer $n$,

$$
f(n) =
\begin{cases}
n/2 & \text{if } n \text{ is even} \\
3n+1 & \text{if } n \text{ is odd}
\end{cases}
$$

which implies that if $n$ is even, it returns $n/2$, otherwise $3n+1$. This function is called the Collatz function. Consider, for instance, a starting value $n$; the generated iterates are the following: $n, f(n), f(f(n)), \dots$. This sequence is referred to as the Collatz sequence, or hailstone numbers.

The Collatz conjecture, which is credited to Lothar Collatz at the University of Hamburg, asserts that for any positive integer $n$, repeated application of the Collatz function eventually produces the value $1$; that is, $f^{(k)}(n) = 1$ for some $k \ge 0$, where $f^{(k)}$ denotes $k$ applications of $f$. Considering $n = 26$, it follows: $f^{(10)}(26) = 1$.

The generated sequence of iterates $n, f(n), f^{(2)}(n), \dots$ is called the Collatz trajectory of $n$. For instance, beginning from $n = 26$, the resulting sequence converges to $1$ as follows: $26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1$. Therefore, the Collatz trajectory of 26 is $\{26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1\}$. Note that, although the Collatz problem is based on this simple concept, it is intractably hard. So far, it has been verified for $n \le 5.6 \times 10^{13}$ by Leavens and Vermeulen.

# Algorithm Design: 3n+1 problem

In the rest of this post, we intend to solve the $3n+1$ problem, which is essentially a restricted or bounded version of the Collatz problem.

Interpretation. It restricts the iteration by exiting as soon as the Collatz sequence reaches the value 1. For $n$ = 26, the resulting Collatz sequence is therefore $26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1$, and the length of the sequence is 11. This problem asks to compute the length of the longest Collatz sequence that results from any integer between $i$ and $j$, which are provided as inputs. Note that the values of $i$ and $j$ are both positive integers.

Implementation. We first consider a naïve brute-force algorithm to solve this problem, which computes the length of the sequence for each integer from $i$ to $j$, and returns the maximum length found. It is worth noting that:

• A naïve brute-force algorithm redundantly computes the same sequences again and again. Consider $i$ = 13 and $j$ = 26. For $n$ = 26, we also compute the sequence for 13, which has already been computed for $n$ = 13.
• We must apply a tail-recursive implementation to compute the sequence, as a naïve recursive implementation might result in a stack overflow.
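For reference, the brute-force baseline described above can be sketched as follows. It is only an illustration under stated assumptions: the names `collatzLength` and `maxCollatzLength` are hypothetical, the inner loop is tail-recursive through an accumulator, and no memoization is applied, so the redundancy noted above is still present.

```fsharp
// Hypothetical baseline: length of the Collatz sequence of n, counting n and the final 1.
let collatzLength (n: int64) : int =
    let rec loop (k: int64) (acc: int) =
        if k = 1L then acc
        elif k % 2L = 0L then loop (k / 2L) (acc + 1)
        else loop (3L * k + 1L) (acc + 1)
    loop n 1

// Longest sequence length over all integers between i and j (in either order).
let maxCollatzLength (i: int64) (j: int64) : int =
    let lo, hi = min i j, max i j
    seq { lo .. hi } |> Seq.map collatzLength |> Seq.max
```

For instance, `collatzLength 26L` evaluates to 11, matching the sequence listed earlier, and `maxCollatzLength 1L 10L` evaluates to 20.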

As we shall see next, we have optimized the naïve implementation considering the above observations. First, we define the function `nextCollatz`, which returns the next integer of the Collatz sequence, given an integer $n$. In effect, it computes $f(n)$ from $n$ as follows.

```fsharp
let nextCollatz (n:int64) =
    if n%2L = 0L then n/2L else 3L*n+1L
```

Using the algorithm outlined in `collatzSeqLength`, the length of the Collatz sequence is computed for any given integer $x$.

```fsharp
let max = 1000001L
let memo = Array.create (max|>int) 0L

// Computes Collatz sequence length, given a
// positive integer, x.
let rec collatzSeqLength (x:int64):int64 =
    let rec seqLength' (n:int64) contd =
        match n with
        | 1L -> contd 1L // initializing sequence length with 1
        | _ ->
            if n < max && memo.[n|>int] <> 0L then
                contd memo.[n|>int]
            else
                seqLength' (nextCollatz n) (fun x ->
                    let x' = x+1L // incrementing length and storing it in memo.
                    if n < max then
                        memo.[n|>int] <- x'
                    else
                        ()
                    contd x')
    x |> (fun i -> seqLength' i id)
```

It includes the following optimizations over the naïve implementation we stated earlier:

• Memoization has been incorporated to avoid redundant computations of the sequence and its length, which in turn provides a faster algorithm than its naïve counterpart, albeit at the cost of additional space.
• The tail-recursive algorithm enables computation of the Collatz sequence for larger $n$.
• Continuation-passing style has been applied in this algorithm to combine tail-call optimization with memoization.
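To illustrate why continuation-passing style keeps the recursion tail-recursive even when work remains after the recursive call, here is a small sketch on a simpler problem, computing the length of a list. The function names are hypothetical and serve only this illustration.

```fsharp
// Not tail-recursive: the "1 +" is still pending when the recursive call is made.
let rec lengthNaive (xs: int list) : int =
    match xs with
    | [] -> 0
    | _ :: tail -> 1 + lengthNaive tail

// Continuation-passing style: the pending "+ 1" is captured in the continuation,
// so the recursive call to loop is the last operation performed.
let lengthCps (xs: int list) : int =
    let rec loop xs (contd: int -> int) =
        match xs with
        | [] -> contd 0
        | _ :: tail -> loop tail (fun n -> contd (n + 1))
    loop xs id
```

In `collatzSeqLength`, the continuation plays the same role, with the additional step of writing the computed length into the memo before passing it on.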

The following snippet demonstrates how the maximum sequence length is obtained by invoking `collatzSeqLength` for each integer between the given $i$ and $j$.

```fsharp
let maxCollazSeqLength (x:int64,y:int64) =
    let x',y' = System.Math.Min(x,y), System.Math.Max(x,y)
    seq { x' .. y' }
    |> Seq.fold (fun max x ->
        let r = collatzSeqLength x
        if r > max then r else max) 0L
```

The complete source code for this problem can be found in this gist. An IDEONE page (with sample inputs and outputs) has been provided to play further with the code (in case F# is unavailable on the local system). Java source code is also available for this problem.

Try solving Euler Problem 14, which resembles this problem and is based on the concepts stated here. Please leave a comment if you have any question or suggestion regarding this post. Happy coding!

## SPOJ 97. Party Schedule (PARTY) with F#

The Party Schedule problem, published on the SPOJ website, is about deriving an optimal set of parties that maximizes fun value, given a party budget $C$ and $n$ parties, where each party $i$ has an entrance cost $c_i$ and is associated with a fun value $f_i$.

In this post, we discuss how to solve this problem by first outlining an algorithm, and afterwards, by implementing that using F#.

# Background

This problem is a special case of the Knapsack problem. The main objective of this problem is to select a subset of parties that maximizes the fun value, subject to the restriction that the total cost must not exceed the budget $C$.

More formally, we are given a budget $C$ as the bound. All parties $i \in \{1, \dots, n\}$ have costs $c_i$ and values $f_i$. We have to select a subset $S \subseteq \{1, \dots, n\}$ such that $\sum_{i \in S} f_i$ is as large as possible, subject to the following restriction:

$$
\sum_{i \in S} c_i \le C
$$

Furthermore, an additional constraint, “do not spend more money than is absolutely necessary”, is applied, which implies the following. Consider two parties $i$ and $j$; if $f_i = f_j$ and $c_i < c_j$, then we must select $i$ instead of $j$.

# Algorithm

### Attempt 1

The definition of this problem suggests a recursive solution. However, a naïve recursive solution is quite impractical in this context, as it requires exponential time due to the following reason: a naïve recursive implementation applies a top-down approach that solves the subproblems again and again.

### Attempt 2

Next, we select the dynamic programming technique to solve this problem. So, we have to define a recurrence that expresses this problem in terms of its subproblems, which raises the following question:

What are the subproblems? We have two variables: $C$ and $n$, i.e., the available budget and the number of parties. We can derive smaller subproblems by varying these; accordingly, we define

$$
\mathrm{OPT}(i, c), \qquad 0 \le i \le n,\ 0 \le c \le C.
$$

It returns the maximum value $\sum_{k \in S} f_k$ over any subset $S \subseteq \{1, \dots, i\}$ where $\sum_{k \in S} c_k \le c$. Our final objective is to compute $\mathrm{OPT}(n, C)$, which refers to the optimal solution, i.e., the maximal achievable fun value from budget $C$ and $n$ parties.

To define the recurrence, we describe $\mathrm{OPT}(i, c)$ in terms of its smaller subproblems.

How can we express this problem using its subproblems? Let’s denote by $O$ the optimal subset that results in $\mathrm{OPT}(i, c)$. Consider party $i$ and note the following.

1. If $i \in O$, then $\mathrm{OPT}(i, c) = f_i + \mathrm{OPT}(i-1, c - c_i)$. Thus, using the remaining budget $c - c_i$ and the parties $\{1, \dots, i-1\}$, we seek the optimal solution.
2. Otherwise, $\mathrm{OPT}(i, c) = \mathrm{OPT}(i-1, c)$. Since $i$ is not included, we haven’t spent anything from $c$.

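Obviously, when $c_i > c$, we apply the 2nd case. Using the above observations, we define the recurrence as follows:

$$
\mathrm{OPT}(i, c) =
\begin{cases}
\mathrm{OPT}(i-1, c) & \text{if } c_i > c \\
\max\bigl(\mathrm{OPT}(i-1, c),\ f_i + \mathrm{OPT}(i-1, c - c_i)\bigr) & \text{otherwise}
\end{cases}
$$

where the base conditions can be rendered as below.

$$
\mathrm{OPT}(0, c) = 0 \ \text{ for } 0 \le c \le C, \qquad \mathrm{OPT}(i, 0) = 0 \ \text{ for } 0 \le i \le n
$$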

Using the above recurrence, we can compute $\mathrm{OPT}(i, c)$ for all parties $i = 1, \dots, n$ and for costs $c = 0, \dots, C$. In essence, we build the OPT table, which consists of $n+1$ rows and $C+1$ columns, using a bottom-up approach as stated in the following algorithm.

```
Initialize OPT[0,c] = 0, for c = 0..C and OPT[i,0] = 0, for i = 0..n
For i = 1,...,n do
    For c = 1,...,C do
        Use above recurrence to compute OPT(i,c)
Return OPT(n,C)
```

# Implementation

The following code builds the OPT table using the stated algorithm.

```fsharp
open System

// Relies on computeOptCost (listed below), which must be defined first.
let computeOptPartySchedule (budget,N,(costs:int[]),(funValues:int[])) =
    let OPT = Array2D.zeroCreate (N+1) (budget+1)
    for i = 1 to N do
        for j = 0 to budget do
            let c_i = costs.[i-1] // cost for the ith party
            let f_i = funValues.[i-1] // fun value associated with ith party
            OPT.[i,j] <-
                match j,c_i with
                | _ when j < c_i -> OPT.[i-1,j]
                | _ -> Math.Max(OPT.[i-1,j], f_i + OPT.[i-1, j-c_i])
    // returning (1) summation of all entrance fee or costs,
    //           (2) summation of fun values
    ((budget,N,OPT,OPT.[N,budget]) |> computeOptCost, OPT.[N,budget])
```

In effect, it does two things: it builds the OPT table and afterwards returns OPT(n,C) together with the associated optimal cost (due to the constraint “do not spend more money than is absolutely necessary” discussed in the Background section) as a tuple (see the last line of the function). The following function computes the optimal cost.

```fsharp
let computeOptCost (budget,N,(opt:int [,]),optFunValue) =
    let mutable optCost = 0
    for c = budget downto 0 do
        if opt.[N,c] = optFunValue then
            optCost <- c
        else
            ()
    optCost
```
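To experiment with the recurrence and the minimum-cost rule on small inputs, the two steps can be combined in one self-contained function. This is a minimal sketch and not the post's implementation: the name `optimalParty` is hypothetical, and it keeps a single row of the OPT table instead of the full two-dimensional array.

```fsharp
// Hypothetical one-row variant: returns (minimum cost, maximum fun) for the given budget.
let optimalParty (budget: int) (costs: int[]) (funValues: int[]) : int * int =
    // best.[c] = maximum fun achievable with total cost at most c
    let best = Array.create (budget + 1) 0
    for i = 0 to costs.Length - 1 do
        // iterate downwards so that each party is attended at most once
        for c = budget downto costs.[i] do
            best.[c] <- max best.[c] (funValues.[i] + best.[c - costs.[i]])
    let maxFun = best.[budget]
    // the smallest budget that already attains maxFun is the money actually needed
    let minCost = Array.findIndex (fun v -> v = maxFun) best
    (minCost, maxFun)
```

With a budget of 10, costs `[|5; 4; 6; 3|]` and fun values `[|10; 40; 30; 50|]`, the best choice is the parties costing 4 and 3, so the call returns `(7, 90)`.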

Complexity. $O(nC)$, as each $\mathrm{OPT}(i, c)$ requires $O(1)$ time.

For the complete source code, please check out this gist,  or visit its IDEONE page to play with the code. Please leave a comment if you have any question or suggestion regarding the algorithm/implementation.

Happy coding!

## SPOJ 6219. Edit Distance (EDIST) with F#

This problem can be solved using dynamic programming with memoization. In essence, it is about computing the Edit Distance, also known as the Levenshtein Distance, between two given strings.

# Definition

Edit Distance, a.k.a. “Levenshtein Distance”, is the minimum number of edit operations required to transform one word into another. The allowable edit operations are letter insertion, letter deletion and letter substitution.

# Implementation

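Using dynamic programming, we can compute the edit distance between two strings. But for that, we need to derive a recursive definition of Edit Distance. We denote the distance between two strings $s$ and $t$ as $D(s, t)$, which can be defined using a recurrence as follows.

Case 1: Both $s$ and $t$ are empty strings, denoted as $\epsilon$:

$$
D(\epsilon, \epsilon) = 0
$$

Case 2: Either $s$ or $t$ is $\epsilon$:

$$
D(s, \epsilon) = |s|, \qquad D(\epsilon, t) = |t|
$$

Case 3: Both $s$ and $t$ are not $\epsilon$:

$$
D(s'a, t'b) = \min\bigl(D(s', t') + [a \ne b],\ D(s'a, t') + 1,\ D(s', t'b) + 1\bigr)
$$

Here $s = s'a$, where $a$ is the last character of $s$ and $s'$ contains the rest of the characters. The same goes for $t = t'b$. We define the edit distance between $s$ and $t$ using a recurrence in terms of $s'$ and $t'$; the term $[a \ne b]$ is $1$ if $a \ne b$ and $0$ otherwise.

$D(s, t)$ can be defined as the minimum, or the least expensive one, of the three alternatives stated in the above equation.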

• Substitution: If the last characters match, i.e., $a = b$, then the overall distance is simply $D(s', t')$. Otherwise, we need a substitution operation that replaces $a$ with $b$, and thus the overall distance will be $D(s', t') + 1$.
• Insertion: The second possibility is to convert $s$ to $t'$ and then insert $b$. In this case, the distance will be $D(s, t') + 1$. Here, +1 is the cost of the insert operation.
• Deletion: The last alternative is to delete $a$ from $s$, which costs +1, and convert $s'$ to $t$. Then the distance becomes $D(s', t) + 1$.

As this is a ternary recurrence, a direct recursive evaluation would result in an exponential run-time, which is quite impractical. However, using dynamic programming with memoization, this recurrence can be solved using a 2D array. The code to solve this problem is outlined below.

```fsharp
let computeEditDistance (source:string,target:string) =
    let height,width = (source.Length, target.Length)
    let grid: int [,] = Array2D.zeroCreate (height+1) (width+1) // 2D Array for memoization
    for h = 0 to height do
        for w = 0 to width do
            grid.[h,w] <-
                match h,w with
                | h,0 -> h // case 1 and 2
                | 0, w -> w
                | h, w ->
                    let s,t = source.[h-1],target.[w-1]
                    let substitution = grid.[h-1,w-1]+(if s = t then 0 else 1)
                    let insertion = grid.[h,w-1] + 1
                    let deletion = grid.[h-1,w] + 1
                    // the built-in min takes two arguments, so it is applied twice
                    min insertion (min deletion substitution) // case 3
    grid.[height,width]
```

As shown in the last case of the match expression, the distance `grid.[h,w]` can be computed locally by taking the min of the three alternatives stated in the recurrence (`substitution`, `insertion` and `deletion`). By obtaining the locally optimum solutions, we eventually get the edit distance from `grid.[source.Length, target.Length]`.

Complexity: the run-time complexity is $O(|s| \cdot |t|)$. Let’s denote the lengths of both strings as $n$. Then, the complexity becomes $O(n^2)$. The space complexity is the same.
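Because each row of the grid depends only on the row directly above it, the space requirement can be lowered to a pair of rows when only the distance itself is needed. The following is a minimal sketch of that variant, not the post's implementation; the name `editDistance` is hypothetical.

```fsharp
// Hypothetical two-row variant of the grid-based computation.
let editDistance (s: string) (t: string) : int =
    // prev.[w] = distance between the first (h-1) characters of s and the first w of t
    let mutable prev = Array.init (t.Length + 1) id
    for h = 1 to s.Length do
        let curr = Array.create (t.Length + 1) 0
        curr.[0] <- h // converting h characters to the empty string takes h deletions
        for w = 1 to t.Length do
            let substitution = prev.[w - 1] + (if s.[h - 1] = t.[w - 1] then 0 else 1)
            let insertion = curr.[w - 1] + 1
            let deletion = prev.[w] + 1
            curr.[w] <- min substitution (min insertion deletion)
        prev <- curr
    prev.[t.Length]
```

For example, `editDistance "kitten" "sitting"` evaluates to 3 (two substitutions and one insertion). The run-time stays proportional to the product of the two lengths, while the space drops to the length of the second string.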

Complete source code is outlined in the next page.