Coalesce data functionally

In a recent project I had to coalesce quite significant amount of data in the following way. To simplify it for this post, consider that we have the following two lists.

val x = List(“a”, “b”, “c”, “a”)

val y = List(1, 2, 6, 9)

We are about to write a function which would return the following list as the result.

val result = List((a,10), (b,2), (c,6))

Basically it would coalesce value with the same category. See for instance “b” in the above example.

Language that came up with repl inherently provides very nice way to try out different expression and to get to the expected outcome. In this context, as we are using scala, we can use repl-driven development quite conveniently as illustrated below.

  • Define the Lists:
scala> val x = List("a", "b" , "c", "a")
x: List[String] = List(a, b, c, a)

scala> val y = List(1,2,6,9)
y: List[Int] = List(1, 2, 6, 9)
  • Zip them.
scala> val z = x zip y
z: List[(String, Int)] = List((a,1), (b,2), (c,6), (a,9))
  • Group them based on the values of x.
scala> val grps = z groupBy (_._1)
grps: scala.collection.immutable.Map[String,List[(String, Int)]] = Map(b -> List((b,2)), a -> List((a,1), (a,9)), c -> List((c,6)))

  • Map the values of res8 and reduce them to compute the sum.
scala> val res = {_.reduce((i,j) => (i._1, (i._2+j._2)))}

res: Iterable[(String, Int)] = List((b,2), (a,10), (c,6))
  • Sort res based on the 1st value of the tuple.
scala> res.toList.sorted
res23: List[(String, Int)] = List((a,10), (b,2), (c,6))


Thus, the function can be simply written as follows:

Thus we get the expected result.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s