In a recent project I had to coalesce a fairly large amount of data in the following way. To simplify it for this post, consider the following two lists.
val x = List("a", "b", "c", "a")
val y = List(1, 2, 6, 9)
We want to write a function that returns the following list as the result.
val result = List(("a", 10), ("b", 2), ("c", 6))
Basically, it sums the values that belong to the same category. See, for instance, "a" in the above example: it appears twice in x, so its values 1 and 9 are combined into 10.
A language that comes with a REPL provides a very nice way to try out different expressions and work toward the expected outcome. Since we are using Scala here, we can use REPL-driven development quite conveniently, as illustrated below.
- Define the Lists:
scala> val x = List("a", "b", "c", "a")
x: List[String] = List(a, b, c, a)
scala> val y = List(1, 2, 6, 9)
y: List[Int] = List(1, 2, 6, 9)
- Zip them.
scala> val z = x zip y
z: List[(String, Int)] = List((a,1), (b,2), (c,6), (a,9))
- Group them based on the values of x.
scala> val grps = z groupBy (_._1)
grps: scala.collection.immutable.Map[String,List[(String, Int)]] = Map(b -> List((b,2)), a -> List((a,1), (a,9)), c -> List((c,6)))
- Map the values of grps and reduce each group to compute its sum.
scala> val res = grps.values.map { _.reduce((i, j) => (i._1, i._2 + j._2)) }
res: Iterable[(String, Int)] = List((b,2), (a,10), (c,6))
- Sort res. Tuples are ordered by their first element first, and the keys are unique after grouping, so this orders the result by category.
scala> res.toList.sorted
res23: List[(String, Int)] = List((a,10), (b,2), (c,6))
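The grouping and summing steps above can also be collapsed into a single call. The following is only a sketch, and it assumes Scala 2.13 or later, where groupMapReduce is available on the standard collections; the name sums is introduced here purely for illustration. Sorting with sortBy(_._1) is a small variation that makes the "sort by category" intent explicit.

```scala
// Sketch assuming Scala 2.13+ (groupMapReduce does not exist in earlier versions).
val x = List("a", "b", "c", "a")
val y = List(1, 2, 6, 9)

// Key each pair by its category, keep only the number,
// and add together the numbers that share a key.
val sums: Map[String, Int] = (x zip y).groupMapReduce(_._1)(_._2)(_ + _)

// Sort explicitly by the category (the first element of each tuple).
val res: List[(String, Int)] = sums.toList.sortBy(_._1)
// res == List(("a", 10), ("b", 2), ("c", 6))
```

This avoids building the intermediate lists of tuples for each group, which may matter when the input is large.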
Thus, the function can be simply written as follows:
def coalesce(x: List[String], y: List[Int]): List[(String, Int)] =
  (x zip y).groupBy(_._1)
    .values
    .map { _.reduce((i, j) => (i._1, i._2 + j._2)) }
    .toList
    .sorted
Thus we get the expected result.
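As a quick check, the function can be exercised on the example lists together with a couple of edge cases. The snippet below repeats the definition so that it can be run on its own; the edge-case inputs and the names empty and truncated are illustrative assumptions, not taken from the original project.

```scala
def coalesce(x: List[String], y: List[Int]): List[(String, Int)] =
  (x zip y).groupBy(_._1)
    .values
    .map { _.reduce((i, j) => (i._1, i._2 + j._2)) }
    .toList
    .sorted

// The example from this post.
val result = coalesce(List("a", "b", "c", "a"), List(1, 2, 6, 9))
// result == List(("a", 10), ("b", 2), ("c", 6))

// Empty input: groupBy yields no groups, so reduce is never
// called on an empty list and the result is simply empty.
val empty = coalesce(Nil, Nil)

// Mismatched lengths: zip silently drops the unpaired trailing
// element, so the third "a" here contributes nothing.
val truncated = coalesce(List("a", "b", "a"), List(1, 2))
// truncated == List(("a", 1), ("b", 2))
```

Note that reduce is safe here only because groupBy never produces an empty group. If the two lists can differ in length in real data, it may be worth checking the lengths before zipping rather than relying on the silent truncation.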