“F# 3.0 introduces an exciting and innovative new programming language feature – Type Providers. This functionality allows types that are dynamically generated during development to be used in statically typed code, bringing many of the benefits of dynamic languages to statically typed languages, without sacrificing the safety of static typing.

At BlueMountain, Howard has built and open-sourced a type provider that allows the functionality of the open-source R statistical package to be used from F# in a very fluid manner. This brings the broadest suite of statistical functionality available on any platform to F#. Howard introduces type providers and their (hitherto unexplored) uses for cross-language meta-programming. He also discusses some of the difficulties of bridging the strong static typing of F# with the loose dynamic typing of R” [1].

The Subset Sum (Main72) problem, officially published in SPOJ, is about computing the sum of all integers that can be obtained as the sum of some subset of a given set of integers. A naïve solution would be to derive all the subsets of the given set, which unfortunately results in $O(2^N)$ time complexity, where $N$ is the number of elements in the set.

This post outlines a more efficient (pseudo-polynomial) solution to this problem using Dynamic Programming and F#. Additionally, we post C# code of the solution.

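This problem provides a set of integers $A$, and specifies the following constraints:

The size of the given set is $|A| = N$, where the value of $N$ is bounded as given in the problem statement.

For every $a_i \in A$, the value of $a_i$ is likewise bounded as given in the problem statement.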

Given this input, we would like to find all the integers $m$ such that $m$ is the sum of the items of some subset of $A$. Afterward, we sum all these integers and return the total as the result for the problem instance.

In essence, we reduce this problem as follows: given $m$, can we express it as the sum of some subset of $A$? If yes, we include it in the solution set for summation. Interestingly, the stated problem is a special case of a more general problem called Subset Sum, where the target sum is $m$.

What would be the maximum possible value for $m$? Indeed, searching an unbounded range is not practical at all, and $m$ can be bounded by the following upper limit: $\sum_{a_i \in A} a_i$, i.e., the summation of all the items in $A$. This observation effectively reduces the search space to $0 \le m \le \sum_{a_i \in A} a_i$, for a given $A$.

It implies that a naïve algorithm would have to iterate over all the subsets of $A$ and record which values in this range they sum to. Recall that, due to its exponential time complexity, this is quite impractical.
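For comparison, the naïve enumeration can be sketched in a few lines of F#. This is only an illustration for tiny inputs: the function name `naiveSubsetSumTotal` and the bitmask encoding of subsets are assumptions of this sketch, not part of the original solution.

```fsharp
// Naive approach: enumerate every subset of `a` as a bitmask and
// collect the distinct sums. Exponential in the number of items.
let naiveSubsetSumTotal (a: int[]) : int =
    let n = a.Length
    seq { 0 .. (1 <<< n) - 1 }
    |> Seq.map (fun mask ->
        // Sum the items whose bit is set in this mask.
        seq { 0 .. n - 1 }
        |> Seq.sumBy (fun i -> if (mask &&& (1 <<< i)) <> 0 then a.[i] else 0))
    |> Seq.distinct
    |> Seq.sum
```

For the set {1, 2, 3} the distinct subset sums are 0 through 6, so `naiveSubsetSumTotal [| 1; 2; 3 |]` evaluates to 21.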

Using the dynamic programming technique, a pseudo-polynomial algorithm can be derived, as the problem has an inherent optimal substructure property. That is, a solution of an instance of the problem can be expressed in terms of the solutions of its subproblems, as described next.

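We define $subsum(i, j)$ as the function that determines whether the summation over any subset of the first $i$ items $\{a_1, \ldots, a_i\}$ can result in the integer $j$. So, it yields true if the sum $j$ can be derived over any such subset, and false otherwise. Also, note that $0 \le i \le N$ and $0 \le j \le \sum_{a_k \in A} a_k$.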

To define the recurrence, we describe $subsum(i, j)$ in terms of its smaller subproblems as follows.

$$subsum(i, j) = \begin{cases} subsum(i-1, j) & \text{if } a_i > j \\ subsum(i-1, j) \lor subsum(i-1, j - a_i) & \text{otherwise} \end{cases} \tag{1}$$

In Eq. (1), the first case refers to the fact that $a_i$ is larger than $j$. Consequently, $a_i$ cannot be included in the subset to derive $j$. Case 2 of Eq. (1) then splits the problem into two subproblems: we can either ignore $a_i$ even though $a_i \le j$, or we can include it. If either case of Eq. (1) lets us derive $j$, i.e., $subsum(i, j)$ = true, we include $j$ in the solution set.

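As the subproblems overlap, we can solve them effectively using a bottom-up dynamic programming approach. What about the base cases? The empty subset always sums to zero, so $subsum(i, 0)$ = true for every $i$, while no positive sum can be formed from zero items, so $subsum(0, j)$ = false for $j > 0$.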

Using a table, dp, we can implement the stated algorithm bottom-up, filling row $i$ from row $i-1$.
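The original listing was embedded as a gist and is not reproduced here. A minimal F# sketch of the table-filling idea follows, assuming the input contains only non-negative integers; the name `subsetSumTotal` and the exact table layout are illustrative choices rather than the author's code.

```fsharp
// Bottom-up subset-sum table. dp.[i, j] plays the role of subsum(i, j):
// true when some subset of the first i items sums to j.
// Assumes every item is a non-negative integer.
let subsetSumTotal (a: int[]) : int =
    let n = a.Length
    let total = Array.sum a
    let dp = Array2D.create (n + 1) (total + 1) false
    // Base case: the empty subset yields 0 for every prefix.
    for i in 0 .. n do
        dp.[i, 0] <- true
    for i in 1 .. n do
        let ai = a.[i - 1]
        for j in 1 .. total do
            dp.[i, j] <-
                if ai > j then dp.[i - 1, j]
                else dp.[i - 1, j] || dp.[i - 1, j - ai]
    // Add up every j that is reachable in the last row.
    seq { 0 .. total } |> Seq.filter (fun j -> dp.[n, j]) |> Seq.sum
```

The table has $(N + 1) \times (\sum a_i + 1)$ entries, which is where the pseudo-polynomial bound comes from. For example, `subsetSumTotal [| 1; 2; 3 |]` evaluates to 21.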


In essence, the $N^{th}$ row in the table provides the set of integers that can be derived by summing over any subset of $A$. Thereby, we compute the summation of all the integers $j$ that satisfy subsum(N,j) = true, and return it as the result.

The full source code of the solution can be downloaded from this gist. For the C# source code, please visit the following gist. Please leave a comment if you have any questions or suggestions regarding this post.

Enough said… now, it’s time for a new problem. See you soon; till then, happy problem-solving!

Recently, Tomas Petricek gave a talk on the topic of internal DSL development with F#. In particular, he discussed the motivation behind DSL development, and demonstrated a number of example DSLs using F#. If you are interested, please check it out.

The Party Schedule problem, published on the SPOJ website, is about deriving an optimal set of parties that maximizes fun value, given a party budget $C$ and $n$ parties, where each party $i$ has an entrance cost $c_i$ and is associated with a fun value $f_i$.

In this post, we discuss how to solve this problem by first outlining an algorithm, and afterwards, by implementing that using F#.

This problem is a special case of the Knapsack problem. The main objective is to select a subset of parties that maximizes fun value, subject to the restriction that the total entrance cost must not exceed the budget $C$.

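More formally, we are given a budget $C$ as the bound. The $n$ parties have costs $c_1, \ldots, c_n$ and values $f_1, \ldots, f_n$. We have to select a subset $S \subseteq \{1, \ldots, n\}$ such that $\sum_{i \in S} f_i$ is as large as possible, subject to the following restriction:

$$\sum_{i \in S} c_i \le C$$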

Furthermore, an additional constraint, “do not spend more money than is absolutely necessary”, is applied, which implies the following. Consider two parties $i$ and $j$; if $f_i = f_j$ and $c_i < c_j$, then we must select $i$ instead of $j$.

The definition of this problem suggests a recursive solution. However, a naïve recursive solution is quite impractical in this context: it applies a top-down approach that solves the same subproblems again and again, and therefore requires exponential time.
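To make the repeated work concrete, here is a small F# sketch of such a naïve recursion. It is not taken from the original post; the name `naiveBestFun` and the argument order are assumptions, and it returns only the best fun value without tracking cost.

```fsharp
// Naive top-down recursion: party i is either skipped or, if it is
// affordable, attended. The same (i, budget) pairs are recomputed many
// times, which is what makes this exponential.
let rec naiveBestFun (costs: int[]) (funs: int[]) (i: int) (budget: int) : int =
    if i = 0 then 0
    else
        let skip = naiveBestFun costs funs (i - 1) budget
        let c = costs.[i - 1]
        if c > budget then skip
        else max skip (funs.[i - 1] + naiveBestFun costs funs (i - 1) (budget - c))
```

With hypothetical costs 3, 4, 5 and fun values 4, 5, 6, a budget of 7 gives `naiveBestFun [| 3; 4; 5 |] [| 4; 5; 6 |] 3 7 = 9` (the first two parties).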

Attempt 2

Next, we select the dynamic programming technique to solve this problem. So, we have to define a recurrence that expresses this problem in terms of its subproblems, which raises the following question:

What are the subproblems? We have two variants: $C$ and $n$, i.e., the available budget and the number of parties. We can derive smaller subproblems by modifying these variants; accordingly, we define $OPT(i, j)$ for $0 \le i \le n$ and $0 \le j \le C$.

It returns the maximum fun value $\sum_{k \in S} f_k$ over any subset $S \subseteq \{1, \ldots, i\}$ where $\sum_{k \in S} c_k \le j$. Our final objective is to compute $OPT(n, C)$, which refers to the optimal solution, i.e., the maximal achievable fun value from budget $C$ and $n$ parties.

To define the recurrence, we describe $OPT(i, j)$ in terms of its smaller subproblems.

How can we express this problem using its subproblems? Let’s denote by $S^*$ the optimal subset that results in $OPT(i, j)$. Consider party $i$ and note the following.

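If $i \in S^*$, then $OPT(i, j) = f_i + OPT(i-1, j - c_i)$. Thus, using the remaining budget $j - c_i$ and the parties $\{1, \ldots, i-1\}$, we seek the optimal solution.

Otherwise, $i \notin S^*$, and then $OPT(i, j) = OPT(i-1, j)$. Since $i$ is not included, we haven’t spent anything from $j$.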

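Obviously, when $c_i > j$, we apply the 2nd case. Using the above observations, we define the recurrence as follows:

$$OPT(i, j) = \begin{cases} OPT(i-1, j) & \text{if } c_i > j \\ \max\big(OPT(i-1, j),\; f_i + OPT(i-1, j - c_i)\big) & \text{otherwise} \end{cases}$$

where the base condition is that no fun can be had from zero parties:

$$OPT(0, j) = 0 \quad \text{for } 0 \le j \le C$$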

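Using the above recurrence, we can compute $OPT(i, j)$ for all parties $1 \le i \le n$ and for costs $0 \le j \le C$. In essence, we build the OPT table, which consists of $n + 1$ rows and $C + 1$ columns (counting the base row and the zero-budget column), using a bottom-up approach: rows are filled in increasing order of $i$, and each entry of a row depends only on the row above it.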


The following code builds the OPT table using the stated algorithm.
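The embedded listing did not survive, so what follows is a minimal F# sketch of the table construction rather than the author's code. The name `buildOpt` is hypothetical, and this sketch returns the whole table instead of a result tuple so that individual entries can be inspected.

```fsharp
// Builds the OPT table bottom-up. opt.[i, j] is the best fun value
// achievable with the first i parties and a budget of j.
let buildOpt (costs: int[]) (funs: int[]) (budget: int) : int[,] =
    let n = costs.Length
    // Row 0 stays at 0: no parties, no fun (base condition).
    let opt = Array2D.create (n + 1) (budget + 1) 0
    for i in 1 .. n do
        let c = costs.[i - 1]
        let f = funs.[i - 1]
        for j in 0 .. budget do
            opt.[i, j] <-
                if c > j then opt.[i - 1, j]
                else max opt.[i - 1, j] (f + opt.[i - 1, j - c])
    opt
```

With hypothetical costs 3, 4, 5, fun values 4, 5, 6 and a budget of 7, the bottom-right entry `(buildOpt [| 3; 4; 5 |] [| 4; 5; 6 |] 7).[3, 7]` is 9. Filling the table takes time proportional to $n \cdot C$.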


In effect, it does two things: it builds the OPT table and afterwards returns OPT(n,C) together with the associated optimal cost as a tuple. The cost is needed because of the constraint “do not spend more money than is absolutely necessary” discussed in the background section. The following function computes the optimal cost.
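One way to compute that cost, sketched here under the assumption that the table has $n + 1$ rows and $C + 1$ columns as described earlier, is to scan the last row for the smallest budget that already reaches OPT(n,C). The name `optimalCost` is illustrative and this is not necessarily how the original function was written.

```fsharp
// Returns the smallest budget j whose entry in the last row already
// equals OPT(n, C). Entries in a row never decrease as j grows, so the
// first match is the cheapest way to reach the best fun value.
let optimalCost (opt: int[,]) : int =
    let n = Array2D.length1 opt - 1
    let c = Array2D.length2 opt - 1
    let best = opt.[n, c]
    seq { 0 .. c } |> Seq.find (fun j -> opt.[n, j] = best)
```

For a hand-built table with a single party of cost 2 and fun value 5 under a budget of 3, `optimalCost (array2D [ [ 0; 0; 0; 0 ]; [ 0; 0; 5; 5 ] ])` evaluates to 2: the full budget is not needed.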


For the complete source code, please check out this gist, or visit its IDEONE page to play with the code. Please leave a comment if you have any question or suggestion regarding the algorithm/implementation.

The sum of the squares of the first ten natural numbers is,

1^{2} + 2^{2} + ... + 10^{2} = 385

The square of the sum of the first ten natural numbers is,

(1 + 2 + ... + 10)^{2} = 55^{2} = 3025

Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 - 385 = 2640.

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

Implementation
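The original listing was embedded and is not available here; a direct F# sketch of the computation could look like the following. The function name `sumSquareDifference` is an assumption of this sketch.

```fsharp
// Difference between the square of the sum and the sum of the squares
// of the first n natural numbers.
let sumSquareDifference (n: int) : int =
    let sumOfSquares = [ 1 .. n ] |> List.sumBy (fun x -> x * x)
    let sum = List.sum [ 1 .. n ]
    sum * sum - sumOfSquares
```

As a check against the worked example above, `sumSquareDifference 10` evaluates to 2640; calling it with 100 answers the question posed.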
