Building an OData URI Parser with Scala Parser Combinators

Objective

The Open Data Protocol (OData) lets end users access a data model through REST-based data services addressed by Uniform Resource Identifiers (URIs). In this post, we present the result of a recent experiment: building an abstraction over OData URIs that generates an Abstract Syntax Tree (AST).

Note that this experiment is at an early stage; the implementation does not yet support the complete feature set of the OData URI specification outlined at odata.org. That specification states a set of recommendations for constructing URIs that effectively identify the data and metadata exposed by OData services.

To give an example of an OData URI, consider the following:

http://odata.org/service.svc/Products?$filter=Price ge 10

In essence, it is a service request to return all the Product entities that satisfy the following predicate: Price greater than or equal to 10.

Motivation

The primary motivation for building such an abstraction is to promote separation of concerns and, consequently, to allow the underlying layers of an OData service implementation to process the query expression tree and yield the result set more efficiently.

Approach

To implement this parser, we use parser combinators. A parser combinator is in essence a higher-order function that accepts parsers as input, composes them, applies transformations, and produces a more complex parser. Because it rests on function composition, it allows a complex parser to be constructed incrementally.
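To make the idea concrete, here is a minimal hand-rolled sketch of parser combinators in plain Scala. It is not the scala.util.parsing API: the names Parser, literal, digits, seq, or, map and topOrSkip are all made up for this illustration, and the representation (a function from input to an optional result plus remaining input) is just one simple assumption.

```scala
// Hand-rolled illustration only; none of these names come from scala.util.parsing.
object MiniCombinators {
  // A parser consumes a prefix of the input and returns (value, remaining input),
  // or None when it does not match.
  type Parser[A] = String => Option[(A, String)]

  // Primitive parser: match a fixed string.
  def literal(s: String): Parser[String] =
    in => if (in.startsWith(s)) Some((s, in.drop(s.length))) else None

  // Primitive parser: match one or more digits.
  val digits: Parser[String] = in => {
    val d = in.takeWhile(_.isDigit)
    if (d.nonEmpty) Some((d, in.drop(d.length))) else None
  }

  // Combinator: run p, then run q on whatever p left over.
  def seq[A, B](p: Parser[A], q: Parser[B]): Parser[(A, B)] =
    in => p(in).flatMap { case (a, rest) =>
      q(rest).map { case (b, rest2) => ((a, b), rest2) }
    }

  // Combinator: try p, and fall back to q if p fails.
  def or[A](p: Parser[A], q: Parser[A]): Parser[A] =
    in => p(in).orElse(q(in))

  // Combinator: transform the value produced by p.
  def map[A, B](p: Parser[A])(f: A => B): Parser[B] =
    in => p(in).map { case (a, rest) => (f(a), rest) }

  // A more complex parser composed from the pieces above:
  // "$top=" or "$skip=" followed by a number.
  val topOrSkip: Parser[(String, Int)] =
    map(seq(or(literal("$top="), literal("$skip=")), digits)) {
      case (name, n) => (name.dropRight(1), n.toInt)
    }
}
```

For instance, MiniCombinators.topOrSkip("$top=2&$skip=5") yields the pair ("$top", 2) together with the unconsumed rest "&$skip=5", while an input such as "$filter=x" yields None. The library used in this post offers the same kind of building blocks (sequencing, alternation, transformation) with much richer error reporting.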

Scala provides such a library in scala.util.parsing (since Scala 2.11 it is packaged as the separate scala-parser-combinators module). In this implementation we use, in particular, JavaTokenParsers along with PackratParsers.

import scala.util.parsing.combinator.{JavaTokenParsers, PackratParsers}

class ODataUriParser extends JavaTokenParsers with PackratParsers {
  //...
}

Implementation

An OData URI contains three fundamental parts, namely the service root URI, the resource path and the query options, as shown below and as described in the documentation at [1].

http://host:port/path/SampleService.svc/Categories(1)/Products?$top=2&$orderby=Name
\______________________________________/\____________________/ \__________________/
                  |                               |                       |
          service root URL                  resource path           query options

Applying this to the OData URI mentioned previously gives the following three parts:

service root URL: http://odata.org/service.svc
resource path: /Products
query options: $filter=Price ge 10

Hence, we can construct this parser by building combinators for the three sub-parts in a bottom-up manner and then composing them into the complete parser, as listed below.

lazy val oDataQuery: PackratParser[SourceNode] = {
  serviceURL ~ resourcePath ~ opt(queryOperationDef) ^^ {
    case uri ~ path ~ None      => ODataQuery(uri, path, QueryOperations(Seq.empty))
    case uri ~ path ~ Some(exp) => ODataQuery(uri, path, QueryOperations(exp))
  }
}

To get started with an example, let's consider the following URI:

http://odata.io/odata.svc/Schema(231)/Customer?$top=2&$filter=concat(City, Country) eq 'Berlin, Germany'

We expect an expression tree, based on a pre-defined model, as follows:

ODataQuery(
  URL("http://odata.io/odata.svc"),
  ResourcePath("Schema", Number("231"), ResourcePath("Customer", EmptyExp(), EmptyExp())),
  QueryOperations(
    List(
      Top(Number("2")),
      Filter(
        EqualToExp(
          CallExp(
            Property("concat"),
            List(Property("City"), Property("Country"))
          ),
          StringLiteral("'Berlin, Germany'"))))))
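The post does not list the pre-defined model itself. As a rough sketch, the tree above could be backed by case classes along the following lines. The constructor names are taken from the expression tree; the field names, the single Expression supertype, the wrapper object ODataModelSketch and the helper resourceNames are assumptions made only for this illustration.

```scala
// Hypothetical model: constructor names follow the expression tree in the post,
// everything else (field names, hierarchy, helper) is assumed for illustration.
object ODataModelSketch {
  sealed trait Expression
  case class URL(value: String) extends Expression
  case class Number(value: String) extends Expression
  case class StringLiteral(value: String) extends Expression
  case class Property(name: String) extends Expression
  case class EmptyExp() extends Expression
  case class CallExp(function: Expression, arguments: Seq[Expression]) extends Expression
  case class EqualToExp(left: Expression, right: Expression) extends Expression
  case class ResourcePath(name: String, keyPredicate: Expression, next: Expression) extends Expression
  case class Top(count: Expression) extends Expression
  case class Filter(predicate: Expression) extends Expression
  case class QueryOperations(operations: Seq[Expression]) extends Expression
  case class ODataQuery(serviceRoot: Expression, path: Expression, options: QueryOperations) extends Expression

  // The expected tree for the example URI, built by hand.
  val example: ODataQuery =
    ODataQuery(
      URL("http://odata.io/odata.svc"),
      ResourcePath("Schema", Number("231"), ResourcePath("Customer", EmptyExp(), EmptyExp())),
      QueryOperations(List(
        Top(Number("2")),
        Filter(EqualToExp(
          CallExp(Property("concat"), List(Property("City"), Property("Country"))),
          StringLiteral("'Berlin, Germany'"))))))

  // Walk the nested ResourcePath nodes and collect the resource names in order.
  def resourceNames(path: Expression): List[String] = path match {
    case ResourcePath(name, _, next) => name :: resourceNames(next)
    case _                           => Nil
  }
}
```

With this model, ODataModelSketch.resourceNames(ODataModelSketch.example.path) gives List("Schema", "Customer"), which shows how the nested ResourcePath nodes encode a compound path.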

Building parser combinators for the service root and the resource path is considerably simpler than building one for the query options (the third part). Let's build them first.

We follow the convention (see the OData specification) that an OData service root always ends with .svc. The following snippet parses, for instance, http://odata.io/odata.svc to URL("http://odata.io/odata.svc").

lazy val serviceURL: PackratParser[Expression] =
  // The dot before "svc" is escaped so that it matches only a literal "."
  """^.*\.svc""".r ^^ {
    case s => URL(s)
  }
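A detail worth checking in a pattern like this is the dot in front of svc: unescaped, it matches any character, not just a literal dot. The sketch below uses only scala.util.matching.Regex to compare the two spellings on a prefix match; the object name ServiceRootRegexSketch and the sample URIs are made up for this illustration.

```scala
// Illustration with the standard library only; names and sample URIs are hypothetical.
object ServiceRootRegexSketch {
  // "." before "svc" matches ANY character, so "xsvc" is accepted too.
  val unescaped = """^.*.svc""".r
  // "\." matches only a literal dot.
  val escaped = """^.*\.svc""".r

  // Returns the service root if the URI starts with something ending in ".svc".
  def serviceRoot(uri: String): Option[String] = escaped.findPrefixOf(uri)
}
```

Here ServiceRootRegexSketch.serviceRoot("http://odata.io/odata.svc/Schema(231)/Customer") returns Some("http://odata.io/odata.svc"). For an input such as "http://host/xsvc/Products", the unescaped pattern still finds the prefix "http://host/xsvc", while the escaped one finds nothing. Note also that .* is greedy, so if ".svc" occurred more than once, the match would extend to the last occurrence.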

Next we define a resource path parser, which parses, for instance, /Schema(231)/Customer to the expression ResourcePath("Schema",Number("231"),ResourcePath("Customer",EmptyExp(),EmptyExp())). A compound resource path is formed by chaining multiple resources.

lazy val resourcePath: PackratParser[Expression] = (
  "/" ~> idn ~ ("(" ~> predicate <~ ")") ~ opt(resourcePath) ^^ {
    case Property(name) ~ keyPredicate ~ None       => ResourcePath(name, keyPredicate, EmptyExp())
    case Property(name) ~ keyPredicate ~ Some(expr) => ResourcePath(name, keyPredicate, expr)
  }
  | "/" ~> idn ~ opt(resourcePath) ^^ {
    case Property(e) ~ None       => ResourcePath(e, EmptyExp(), EmptyExp())
    case Property(e) ~ Some(expr) => ResourcePath(e, EmptyExp(), expr)
  }
)

We have now reached the crux of the problem: building a parser that can handle the query operators defined in the OData specification. To solve it, we apply a bottom-up approach in conjunction with top-down realization.

First we define a basic parser that can parse arithmetic expressions as follows.

lazy val expression: PackratParser[Expression] =
  expression ~ ("add" ~> termExpression) ^^ { case l ~ r => PlusExp(l, r) } |
  expression ~ ("sub" ~> termExpression) ^^ { case l ~ r => MinusExp(l, r) } |
  termExpression

lazy val termExpression: PackratParser[Expression] =
  termExpression ~ ("mul" ~> factor) ^^ { case a ~ b => MultiplyExp(a, b) } |
  termExpression ~ ("div" ~> factor) ^^ { case a ~ b => DivideExp(a, b) } |
  termExpression ~ ("mod" ~> factor) ^^ { case a ~ b => ModExp(a, b) } |
  factor

lazy val factor: PackratParser[Expression] =
  ("not" ~> factor) ^^ NotExp |
  factor ~ ("(" ~> expressionList <~ ")") ^^ { case id ~ param => CallExp(id, param) } |
  number |
  boolean |
  string |
  idn |
  "(" ~> predicate <~ ")"

lazy val expressionList: PackratParser[Seq[Expression]] = repsep(predicate, ",")
lazy val propertyList: PackratParser[Seq[Property]] = repsep(idn, ",")
lazy val idn: PackratParser[Property] = ident ^^ Property
lazy val number: PackratParser[Number] = floatingPointNumber ^^ Number
lazy val string: PackratParser[StringLiteral] =
  ("\'" + """([^"\p{Cntrl}\\]|\\[\\/bfnrt]|\\u[a-fA-F0-9]{4})*""" + "\'").r ^^ StringLiteral |
  stringLiteral ^^ StringLiteral
lazy val boolean: PackratParser[Expression] = "true" ^^^ TrueExpr() | "false" ^^^ FalseExpr()
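Rules such as expression and termExpression are left-recursive: the first thing each rule tries is itself. Plain recursive-descent combinators would loop forever on such rules, which is one reason to mix in PackratParsers. The following cut-down sketch shows the effect on a single operator. It assumes the scala-parser-combinators module is on the classpath; the names Exp, Num, MinusExp (redefined here with a simplified shape), LeftRecursionSketch and parseExp are hypothetical and not part of the project.

```scala
import scala.util.parsing.combinator.{JavaTokenParsers, PackratParsers}

// Hypothetical, cut-down AST used only for this sketch.
sealed trait Exp
case class Num(value: String) extends Exp
case class MinusExp(left: Exp, right: Exp) extends Exp

object LeftRecursionSketch extends JavaTokenParsers with PackratParsers {
  lazy val num: PackratParser[Exp] = wholeNumber ^^ (s => Num(s))

  // Left-recursive rule: `expression` refers to itself in the leftmost position.
  // Declaring it as a lazy val of type PackratParser enables memoization,
  // which is what lets the library handle this recursion.
  lazy val expression: PackratParser[Exp] =
    expression ~ ("sub" ~> num) ^^ { case l ~ r => MinusExp(l, r) } |
    num

  // parseAll requires the whole input to be consumed.
  def parseExp(input: String): ParseResult[Exp] = parseAll(expression, input)
}
```

Parsing "10 sub 3 sub 2" with LeftRecursionSketch.parseExp produces MinusExp(MinusExp(Num("10"), Num("3")), Num("2")), that is, the operator associates to the left, which is the grouping one would normally want for subtraction.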

Then we incrementally add support for relational operators and, on top of those, for logical operations such as and and or.

lazy val predicate: PackratParser[Expression] =
  predicate ~ ("and" ~> relExpression) ^^ { case l ~ r => AndExp(l, r) } |
  predicate ~ ("or" ~> relExpression) ^^ { case l ~ r => OrExp(l, r) } |
  relExpression

lazy val relExpression: PackratParser[Expression] =
  relExpression ~ ("gt" ~> expression) ^^ { case l ~ r => GreaterThanExp(l, r) } |
  relExpression ~ ("lt" ~> expression) ^^ { case l ~ r => LessThanExp(l, r) } |
  relExpression ~ ("eq" ~> expression) ^^ { case l ~ r => EqualToExp(l, r) } |
  relExpression ~ ("ne" ~> expression) ^^ { case l ~ r => NotEqualToExp(l, r) } |
  relExpression ~ ("ge" ~> expression) ^^ { case l ~ r => GreaterOrEqualToExp(l, r) } |
  relExpression ~ ("le" ~> expression) ^^ { case l ~ r => LessOrEqualToExp(l, r) } |
  expression

The two listings above form the basis for supporting query operations such as $filter and $select, as listed below.

lazy val queryOperationDef: PackratParser[Seq[Expression]] =
  "?" ~> repsep(filter | select | top | skip | orderBy, "&")

lazy val filter: PackratParser[Expression] =
  "$filter" ~> "=" ~> predicate ^^ Filter

lazy val top: PackratParser[Expression] =
  "$top" ~> "=" ~> number ^^ Top

lazy val skip: PackratParser[Expression] =
  "$skip" ~> "=" ~> number ^^ Skip

lazy val orderBy: PackratParser[Expression] = (
  "$orderby" ~> "=" ~> propertyList <~ "asc" ^^ OrderByAsc
  | "$orderby" ~> "=" ~> propertyList <~ "desc" ^^ OrderByDesc
  | "$orderby" ~> "=" ~> propertyList ^^ OrderByAsc
)

lazy val select: PackratParser[Expression] =
  "$select" ~> "=" ~> propertyList ^^ Select

Thus, the parser turns a URI into an expression tree, as the following test shows.

test("Parse /Schema(231)/Customer?$top=2&$filter=concat(City, Country) eq 'Berlin, Germany'") {
  val uri = "http://odata.io/odata.svc/Schema(231)/Customer?$top=2&$filter=concat(City, Country) eq 'Berlin, Germany'"
  val actual = p.parseThis(mainParser, uri).get
  println(uri + "=>" + actual)
  val expectedAst =
    ODataQuery(
      URL("http://odata.io/odata.svc"),
      ResourcePath("Schema", Number("231"), ResourcePath("Customer", EmptyExp(), EmptyExp())),
      QueryOperations(
        List(
          Top(Number("2")),
          Filter(
            EqualToExp(
              CallExp(
                Property("concat"),
                List(Property("City"), Property("Country"))
              ),
              StringLiteral("'Berlin, Germany'"))))))
  assert(actual == expectedAst)
}

Or, as follows:

test("Parse /Products?$select=Name,Price") {
  val uri = "http://services.odata.org/OData.svc/Products?$select=Name,Price"
  val actual = p.parseThis(mainParser, uri).get
  val expectedAst =
    ODataQuery(
      URL("http://services.odata.org/OData.svc"),
      ResourcePath("Products", EmptyExp(), EmptyExp()),
      QueryOperations(List(Select(List(Property("Name"), Property("Price"))))))
  assert(actual == expectedAst)
}
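Once a URI has been turned into such a tree, a lower layer can work on the tree instead of on raw strings. As a rough sketch of that idea, the code below evaluates a small filter predicate against in-memory records. It is not part of the project: the object FilterEvalSketch, the Record type, the helpers value, holds and selectMatching, and the simplified case classes are all assumptions for this illustration, and only a few operators over numeric properties are covered.

```scala
// Hypothetical consumer of a filter tree; all names here are illustrative.
object FilterEvalSketch {
  sealed trait Expression
  case class Property(name: String) extends Expression
  case class Number(value: String) extends Expression
  case class GreaterOrEqualToExp(left: Expression, right: Expression) extends Expression
  case class LessThanExp(left: Expression, right: Expression) extends Expression
  case class AndExp(left: Expression, right: Expression) extends Expression

  // A record is modelled as a map from property name to numeric value.
  type Record = Map[String, BigDecimal]

  // Numeric value of a leaf node, if it has one for this record.
  // Assumes Number holds a well-formed numeric string.
  def value(e: Expression, r: Record): Option[BigDecimal] = e match {
    case Number(v)   => Some(BigDecimal(v))
    case Property(n) => r.get(n)
    case _           => None
  }

  // Compare two operands; a missing value makes the comparison false.
  private def compare(l: Expression, rt: Expression, r: Record)(
      op: (BigDecimal, BigDecimal) => Boolean): Boolean =
    (for (a <- value(l, r); b <- value(rt, r)) yield op(a, b)).getOrElse(false)

  // Does the predicate hold for the given record?
  def holds(e: Expression, r: Record): Boolean = e match {
    case AndExp(l, rt)              => holds(l, r) && holds(rt, r)
    case GreaterOrEqualToExp(l, rt) => compare(l, rt, r)(_ >= _)
    case LessThanExp(l, rt)         => compare(l, rt, r)(_ < _)
    case _                          => false
  }

  // Keep only the records that satisfy the filter.
  def selectMatching(filter: Expression, records: List[Record]): List[Record] =
    records.filter(holds(filter, _))
}
```

For the predicate from the introduction, Price ge 10, the tree GreaterOrEqualToExp(Property("Price"), Number("10")) applied to two records with Price 8 and Price 12 keeps only the second one. A real service would more likely translate the tree into a query for its data store, but the pattern-matching structure would look similar.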

Conclusion

The complete source of this project is available in the GitHub repository. Please feel free to browse it, and post any questions you have.

See More:

  1. OData URI Specification
  2. External DSLs made easy with Scala Parser Combinators
  3. DSLs in Action

x += x++

Hi, I’m back. I’ve finally sorted out the guidelines for blogging in Credit Suisse.

Here is something I have been playing around with in the spare time between one meeting and the next one.  It is a Scheme interpreter that includes a REPL window. The full code is here.

All the smarts for it come from this Wiki Book. I just ported the code to F# (and modified it a bit). I thought the comparison might be interesting, so here we go. Thanks to Tobias and Jose for reviewing the code, finding one bug and suggesting improvements.

Before we start looking at the real code, here is what we are trying to accomplish in the form of test cases. If you are a bit rusty on LISP syntax, you might want to try and see if you understand what it does.

Our goal is to make all this XUnit…

Semantics with Applications by Hanne Riis Nielson & Flemming Nielson

“Semantics with Applications: A Formal Introduction” by Hanne Riis Nielson and Flemming Nielson presents the fundamental ideas behind the three major approaches to semantics: operational, denotational and axiomatic. The book also addresses the relationships between these approaches by formulating and proving the relevant theorems. In addition, it provides several illustrations of how formal semantics can be applied as a tool in various fields of computer science.

Course Notes are available at: http://www.daimi.au.dk/~bra8130/Wiley_book/wiley.html

TUDelft | Lectures of Model-Driven Software Development Course

The following lectures on Model-Driven Software Development (MDSD) were given by Dr. Eelco Visser at TU Delft in the IN4308 course. To get the basic idea of MDSD and the motivation behind it, please go through these lectures.

Lecture 1 : MDSD : Introduction and Overview

Lecture 2 : Domain Analysis & Data Modeling

Lecture 3 & 4 : Web Abstraction

In these lectures, WebDSL, a DSL for web applications, is introduced. Continuing from the previous discussion on domain analysis, these lectures address the design issues and the motivation behind abstracting the domain of web applications using this DSL.

Lecture 5 : Web Abstraction to actual Implementation