Coalesce data functionally

In a recent project, I had to coalesce a fairly significant amount of data in the following way. To simplify it for this post, consider the following two lists.

val x = List("a", "b", "c", "a")

val y = List(1, 2, 6, 9)

We want to write a function that returns the following list as the result.

val result = List(("a",10), ("b",2), ("c",6))

Basically, it coalesces the values that share the same category. See, for instance, “a” in the above example, whose values 1 and 9 are summed to 10.

Languages that come with a REPL inherently provide a very nice way to try out different expressions and work toward the expected outcome. Since we are using Scala here, we can use REPL-driven development quite conveniently, as illustrated below.

  • Define the Lists:
scala> val x = List("a", "b" , "c", "a")
x: List[String] = List(a, b, c, a)

scala> val y = List(1,2,6,9)
y: List[Int] = List(1, 2, 6, 9)
  • Zip them.
scala> val z = x zip y
z: List[(String, Int)] = List((a,1), (b,2), (c,6), (a,9))
  • Group them based on the values of x.
scala> val grps = z groupBy (_._1)
grps: scala.collection.immutable.Map[String,List[(String, Int)]] = Map(b -> List((b,2)), a -> List((a,1), (a,9)), c -> List((c,6)))

  • Map the values of grps and reduce each group to compute the sum.
scala> val res = grps.values.map {_.reduce((i,j) => (i._1, (i._2+j._2)))}
res: Iterable[(String, Int)] = List((b,2), (a,10), (c,6))
  • Sort res based on the 1st value of the tuple.
scala> res.toList.sorted
res23: List[(String, Int)] = List((a,10), (b,2), (c,6))

Thus, the function can be simply written as follows:

def coalesce(x: List[String], y: List[Int]): List[(String, Int)] =
  (x zip y).groupBy(_._1)
    .values
    .map { _.reduce((i, j) => (i._1, i._2 + j._2)) }
    .toList
    .sorted

Thus we get the expected result.
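
As a complementary sketch, the same result can also be computed in a single pass with foldLeft, accumulating the running sum per category in a Map. The names CoalesceSketch and coalesceFold are hypothetical and not part of the original code. Like the zip-based version, this assumes both lists have the same length, since zip silently drops the extra elements of the longer one.

```scala
object CoalesceSketch {
  // Hypothetical single-pass variant of coalesce (illustrative name only).
  def coalesceFold(x: List[String], y: List[Int]): List[(String, Int)] =
    (x zip y)
      .foldLeft(Map.empty[String, Int]) { case (acc, (k, v)) =>
        // Add v to the running total for category k, starting from 0.
        acc.updated(k, acc.getOrElse(k, 0) + v)
      }
      .toList
      .sorted // tuples sort by category first, then by sum
}
```

Calling CoalesceSketch.coalesceFold(List("a", "b", "c", "a"), List(1, 2, 6, 9)) gives List((a,10), (b,2), (c,6)), matching the REPL session above.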


Interactive Scala Development with SBT and JRebel

Problem Statement

The Scala REPL is a great tool for developing software incrementally. However, it is quite slow, as noted here. An alternative is to use the Simple Build Tool (SBT). SBT, via tasks such as compile, run, and console, facilitates a quick feedback cycle. The command sbt ~compile further provides a continuous, incremental compilation model: whenever a .scala file changes, SBT recompiles it and generates the .class files.

The problem is that, in order to reload these changed class files, one has to restart sbt console. In my humble opinion, that is a bit inconvenient.

Solution

JRebel provides dynamic reloading of modified *.class files. In combination with sbt ~compile, this setup leads to an interactive development experience as follows.

  • We run sbt ~compile in one console window.
  • In the other one, we run sbt console.

Thus, when we modify the Scala source code, the first SBT instance compiles it and generates class files. If we then invoke any command in the second SBT instance, JRebel reloads the modified class files before the command gets executed.

Note that JRebel is free for personal use, and it is definitely worth taking a look. For more information about the development flow using JRebel, SBT, and Eclipse/IDEA, please have a look at this article, which describes the setup process in detail, or simply leave a comment.

Certificate of #ProgFun of @Coursera

Yay! Finally, I have received the certificate of Functional Programming Principles in Scala course. Thanks @coursera! It has been a great pleasure, and I look forward to the future courses (e.g., Discrete Optimization).

progfun certificate

In a previous post, I remember mentioning how awesome this course is, how much I enjoyed it, and that I am looking forward to the next advanced course on this topic. It’s great to know that around 70% of students ‘Absolutely’ share the same feeling. So, prof. @oderskey, please hurry up. We are waiting!

Once again, thanks prof. @oderskey and his team for this excellent course.

Scala: 99 Problems / Problem 01: Find the last element of a list in Scala

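object Scala99Problem01 {
  import scala.annotation.tailrec

  // Returns the last element of ls; throws NoSuchElementException if ls is empty.
  def lastElement[A](ls: List[A]): A = {
    // The helper reuses the outer type parameter A instead of shadowing it.
    @tailrec
    def lastElementAux(ls: List[A]): Option[A] = ls match {
      case Nil      => None
      case x :: Nil => Some(x)
      case _ :: xs  => lastElementAux(xs)
    }

    lastElementAux(ls).getOrElse(throw new NoSuchElementException)
  }
}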
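
A possible variation, sketched here under the assumption that callers would rather handle the empty list explicitly than catch an exception, is to return the Option directly. The names LastElementSketch and lastElementOption are hypothetical.

```scala
import scala.annotation.tailrec

object LastElementSketch {
  // Hypothetical Option-returning variant: None for an empty list,
  // Some(last) otherwise. No exception is thrown.
  @tailrec
  def lastElementOption[A](ls: List[A]): Option[A] = ls match {
    case Nil      => None
    case x :: Nil => Some(x)
    case _ :: xs  => lastElementOption(xs)
  }
}
```

For example, LastElementSketch.lastElementOption(List(1, 2, 3)) evaluates to Some(3), while an empty list yields None. The standard library's List#lastOption behaves the same way.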

Scala Hacking: Computing Powerset

Given a set represented as a String, we can compute its powerset using foldLeft, as shown below.

def powerset(s: String) =
  s.foldLeft(Set("")) {
    case (acc, x) => acc ++ acc.map(_ + x)
  }

Isn’t this approach quite concise and elegant? The following snippet shows pretty-printed output from powerset for the set "abc".

scala> powerset("abc").toList sortWith ( _ < _) mkString "\n"

res3: String = "
| a
| ab
| abc
| ac
| b
| bc
| c"
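
The same fold is not specific to strings. As a sketch, assuming we want the powerset of an arbitrary Set[A] rather than a String (the names PowersetSketch and powersetOf are hypothetical), the accumulator becomes a set of sets, seeded with the empty set:

```scala
object PowersetSketch {
  // Hypothetical generic variant: for each element x, keep every subset
  // found so far and also add a copy of each that includes x.
  def powersetOf[A](s: Set[A]): Set[Set[A]] =
    s.foldLeft(Set(Set.empty[A])) { (acc, x) =>
      acc ++ acc.map(_ + x)
    }
}
```

For a set of n distinct elements this yields 2^n subsets, so PowersetSketch.powersetOf(Set(1, 2, 3)) has 8 elements, including the empty set and Set(1, 2, 3) itself.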

The following is an F# implementation of the same powerset function.

let powerset (s: string) : Set<string> =
    s.ToCharArray()
    |> Array.fold
        (fun (acc: Set<string>) x -> acc + (Set.map (fun y -> x.ToString() + y) acc))
        (Set.empty.Add(""))