Getting Hands Dirty with Argonaut

I'm trying to learn as much as I can about tools that adhere to pure functional programming. That is why I am currently working on some personal projects in Scala, iterating on them in an attempt to write them the functional way. Most of the code still ends up impure, but I believe that I'm getting somewhere at least. That's progress.

Recently, I've been trying to build a wrapper API that integrates with Elasticsearch via HTTP. To cut the story short, I have this JSON from a response I got from Elasticsearch. Basically, I searched Elasticsearch with this:

localhost:9200/cdrflow/_search

where cdrflow is the index name of the source. This will give me a document with this structure.

{"took":41,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":100,"max_score":1.0,"hits":[{"_index":"crdflow","_type":"example","_id":"AUrd_fgQWzmfiAiNnHYu","_score":1.0,"_source":{ "type":"testing" }},{"_index":"crdflow","_type":"example","_id":"AUreBH5UWzmfiAiNnHYz","_score":1.0,"_source":{ "type":"testing" }},{"_index":"crdflow","_type":"example","_id":"AUreBH5YWzmfiAiNnHY2","_score":1.0,"_source":{ "type":"testing" }},{"_index":"crdflow","_type":"example","_id":"AUrdSklNWzmfiAiNnHXp","_score":1.0,"_source":{ "type":"testing" }},{"_index":"crdflow","_type":"example","_id":"AUrdSklcWzmfiAiNnHXv","_score":1.0,"_source":{ "type":"testing" }},{"_index":"crdflow","_type":"example","_id":"AUrdSklgWzmfiAiNnHXy","_score":1.0,"_source":{ "type":"testing" }},{"_index":"crdflow","_type":"example","_id":"AUrdUdaxWzmfiAiNnHYK","_score":1.0,"_source":{ "type":"testing" }},{"_index":"crdflow","_type":"example","_id":"AUrdUdbAWzmfiAiNnHYO","_score":1.0,"_source":{ "type":"testing" }},{"_index":"crdflow","_type":"example","_id":"AUrUffyQ3aMnPR2OmmEN","_score":1.0,"_source":{ "type":"testing" }},{"_index":"crdflow","_type":"example","_id":"AUrUffyT3aMnPR2OmmEO","_score":1.0,"_source":{ "type":"testing" }}]}}

Problem domain

We are going to parse this document to retrieve the set of _id values, so we expect the solution to yield something like this:

val result: Seq[String] = List(<id1>, ..., <id_n>)

Let's begin

To begin, we have to create an sbt project that loads the Argonaut dependency so that we can test this in the Scala shell.

I'll assume you are using a *nix OS, but this can easily be replicated on a Windows machine. I also assume that you already have sbt installed and usable on your machine.

$> mkdir argonaut
$> cd argonaut
$> echo 'libraryDependencies += "io.argonaut" %% "argonaut" % "6.0.4"' > build.sbt
$> sbt update

Let sbt finish pulling the necessary dependencies. After that, we are ready to open the shell with Argonaut preloaded.

$> sbt console

We will store the document above in a string variable and call it json.

val json: String = """
<copy our json object found above here>
"""

Then we have to convert this into a Json value. My choice is to parse it into a Json wrapped in an Option.

import argonaut._, Argonaut._
val option: Option[Json] = Parse.parseOption(json)
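As a quick check of what Parse.parseOption gives back, here is a small sketch to paste into the same sbt console. The strings good and bad are made-up inputs for illustration only; they are not part of the Elasticsearch response.

```scala
import argonaut._, Argonaut._

// Hypothetical inputs, only for illustration.
val good: String = """{"hits":{"total":0,"hits":[]}}"""
val bad: String = """{"hits":"""   // truncated, so not valid JSON

// Valid input yields Some(json); input that fails to parse yields None.
val parsedGood: Option[Json] = Parse.parseOption(good)
val parsedBad: Option[Json] = Parse.parseOption(bad)

println(parsedGood.isDefined)
println(parsedBad.isDefined)
```

Because a parse failure shows up as None, every later step has to decide what to do when there is no Json to work with.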

The _ids are enclosed inside the hits object. If you peruse our document, you'll observe that there are two hits fields, one wrapping the other. The nested hits is what we want to get hold of. To do this, we will use Argonaut's traversal API, the lens.

We will compose our lens and name it nestedHits2ArrayLens.

val nestedHits2ArrayLens = jObjectPL >=> //(#1)
  jsonObjectPL("hits") >=> //(#2)
  jObjectPL >=> //(#3)
  jsonObjectPL("hits") >=> //(#4)
  jArrayPL //(#5)

Here is how it works: (#1) it starts by viewing the value as an object, then (#2) it looks up hits in that object. If it finds one, (#3) it views that value as an object again, which (#4) is then used to look up the nested hits. If that succeeds, (#5) it views the result as a JsonArray. If any step fails, the whole lookup yields None.
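To make the step-by-step behaviour concrete, here is a toy model of the same idea using only the Scala standard library. This is not Argonaut's implementation: the types J, JObj, JArr, JStr and PGet, and the method andThenP, are all invented for this sketch. It only models the "get" half of a partial lens, where each step may return None and the chain stops at the first failure.

```scala
// A toy JSON model; NOT Argonaut's actual types.
trait J
case class JObj(fields: Map[String, J]) extends J
case class JArr(items: List[J]) extends J
case class JStr(value: String) extends J

// Only the "get" half of a partial lens: the lookup may or may not succeed.
case class PGet[A, B](get: A => Option[B]) {
  // Same idea as >=> : run this step, then feed its result to the next step.
  def andThenP[C](next: PGet[B, C]): PGet[A, C] =
    PGet[A, C](a => get(a).flatMap(next.get))
}

// View a value as an object, or fail.
val asObj: PGet[J, Map[String, J]] = PGet[J, Map[String, J]] {
  case JObj(fields) => Some(fields)
  case _ => None
}

// Look up a field in an object, or fail.
def field(name: String): PGet[Map[String, J], J] =
  PGet[Map[String, J], J](m => m.get(name))

// View a value as an array, or fail.
val asArr: PGet[J, List[J]] = PGet[J, List[J]] {
  case JArr(items) => Some(items)
  case _ => None
}

// Mirrors steps (#1) to (#5).
val nestedHits: PGet[J, List[J]] =
  asObj.andThenP(field("hits")).andThenP(asObj).andThenP(field("hits")).andThenP(asArr)

val doc: J = JObj(Map("hits" -> JObj(Map("hits" -> JArr(List(JStr("a"), JStr("b")))))))
val found: Option[List[J]] = nestedHits.get(doc)
val missing: Option[List[J]] = nestedHits.get(JObj(Map("hits" -> JStr("oops"))))
```

In this model found holds the two-element list, while missing is None because step (#3) cannot view a string as an object.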

Let's do that now.

val nestedHitsArray = nestedHits2ArrayLens.get(option.get) // type: Option[argonaut.Argonaut.JsonArray]

And now we have the array from which we want to harvest those _ids. Let's start harvesting.
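Calling option.get throws if the parse failed, so in anything beyond a console experiment it may be worth chaining the two steps instead. The sketch below repeats the lens composition under the name hitsLens so that it stands alone; nestedHitsOf and the sample strings are hypothetical names and inputs made up for illustration.

```scala
import argonaut._, Argonaut._

// Same composition as nestedHits2ArrayLens, repeated so this block stands alone.
val hitsLens =
  jObjectPL >=> jsonObjectPL("hits") >=> jObjectPL >=> jsonObjectPL("hits") >=> jArrayPL

// Hypothetical helper: parse, then follow the lens; None if either step fails.
def nestedHitsOf(raw: String): Option[JsonArray] =
  Parse.parseOption(raw).flatMap(j => hitsLens.get(j))

val ok = nestedHitsOf("""{"hits":{"hits":[{"_id":"a1"},{"_id":"b2"}]}}""")
val wrongShape = nestedHitsOf("""{"hits":[]}""")   // outer hits is not an object
val notJson = nestedHitsOf("not json")

println(ok.map(_.size))
println(wrongShape)
println(notJson)
```

Both failure cases come back as None, so the caller can fall back to an empty list with getOrElse(Nil) rather than risking an exception.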

Approaches

There is certainly more than one way to skin a cat. These are the ones I came up with.

Using Argonaut's Codec

To do this we must first define our case class and call it Document. The structure is as follows:

case class Document(_id: String)

We also need to define an implicit codec for it using casecodec1.

implicit def DocumentCodecJson: CodecJson[Document] = casecodec1(Document.apply, Document.unapply)("_id")

This approach requires each element of nestedHitsArray to be turned back into a string rather than kept as Json, because Parse.decodeOption takes a String.

Here's how to do it.

val ids: Seq[String] = for {
  i <- nestedHitsArray.get
  m <- Parse.decodeOption[Document](i.toString)
} yield m match {
  case Document(id) => id
  case _ => ""
}
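For trying the codec approach without the full Elasticsearch response, here is a self-contained sketch. The value sampleHits is a hypothetical two-element stand-in for nestedHitsArray.get, and the ids in it are made up. It also yields doc._id directly, since a decoded value is always a Document.

```scala
import argonaut._, Argonaut._

case class Document(_id: String)

implicit def DocumentCodecJson: CodecJson[Document] =
  casecodec1(Document.apply, Document.unapply)("_id")

// Hypothetical stand-in for nestedHitsArray.get; the ids are made up.
val sampleHits: List[Json] =
  Parse.parseOption("""[{"_id":"a1","_score":1.0},{"_id":"b2","_score":1.0}]""")
    .flatMap(_.array)
    .getOrElse(Nil)

// Each element is rendered back to a string, then decoded through the codec.
// Elements that fail to decode are skipped.
val sampleIds: Seq[String] = for {
  hit <- sampleHits
  doc <- Parse.decodeOption[Document](hit.toString)
} yield doc._id

println(sampleIds)
```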

Parsing it directly

We don't have to use a codec; we can just parse each element directly. Here's how:

val ids: Seq[String] = nestedHitsArray.get.map(i => Parse.parseWith(i.toString, _.field("_id").flatMap(_.string).getOrElse(""), msg => msg))
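Since each element of the array is already a Json, the round trip through toString and parseWith may not be needed. A minimal sketch of that variation, assuming the same field and string accessors used above; elements and its contents are hypothetical sample data.

```scala
import argonaut._, Argonaut._

// Hypothetical sample standing in for nestedHitsArray.get.
val elements: List[Json] =
  Parse.parseOption("""[{"_id":"a1"},{"_id":"b2"},{"other":true}]""")
    .flatMap(_.array)
    .getOrElse(Nil)

// Look up "_id" on each element directly.
// Elements without a string "_id" are dropped rather than mapped to "".
val directIds: List[String] =
  elements.flatMap(e => e.field("_id").flatMap(_.string))

println(directIds)
```

Dropping elements that lack an _id is a design choice of this sketch; the one-liner above keeps them as empty strings instead.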

Lens all the way..

This is the last one I could think of: use a lens again for the harvesting.

Initially, we need to define our lens composition.

val idsLens = jObjectPL >=>
  jsonObjectPL("_id") >=>
  jStringPL

With the above defined, the following is straightforward.

val ids: Seq[String] = for {
  i <- nestedHitsArray.get
  m <- idsLens.get(i)
} yield m

Wrapping things up

It's fun to solve problems like this with Argonaut. It is a bit intimidating at first, but once you get the hang of it you'll be addicted.

References

http://mth.io/talks/argonaut/