Monday, February 15, 2016

JSON grep

I haven't really done anything worth showing here for a while, but I've been playing around with Groovy recently, and it's hard not to do something weird and hack-y in Groovy, so here it is.

The problem was pretty basic. I had JSON configuration files for a benchmark and wanted to retrieve particular properties from them for use in bash scripts. Since the benchmark was written in Groovy, I looked for a solution in Groovy as well. Finding a property in JSON is a little problematic, since the format is pretty complex, but Groovy (and most other languages, I suppose) can parse JSON files and dynamically create a data structure consisting of lists and maps, which can then be worked on using standard Groovy collection magic.
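To make the lists-and-maps point concrete, here is a minimal sketch of what the parsed structure looks like. The JSON content is made up for illustration (only the key names coordinator, host, port and hosts come from the examples in this post), so treat the values as assumptions.

```groovy
import groovy.json.JsonSlurper

// Hypothetical configuration, inlined so the example is self-contained.
def text = '''
{
  "coordinator": { "host": "10.0.0.1", "port": 8080 },
  "hosts": ["alpha", "beta", "gamma"]
}
'''

// JSON objects come back as maps and JSON arrays as lists.
def cfg = new JsonSlurper().parseText(text)
assert cfg instanceof Map
assert cfg.hosts instanceof List

// Property notation walks the nested maps.
println cfg.coordinator.host
println cfg.coordinator.port

// Ordinary collection methods work on the parsed lists.
println cfg.hosts.collect { it.toUpperCase() }
println cfg.hosts.grep { it.startsWith('b') }
```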

Once that structure is in place, the only problem is executing a Groovy command on it from bash. I do this by creating a binding object, which represents an execution environment. I set one variable in the binding: it is called cfg and holds the entire data structure I got by parsing the JSON. Then I pass the binding to a Groovy shell object. The shell can execute any Groovy command passed to it as a string, and since it has access to the binding with cfg defined, those commands can refer to cfg. Groovy code passed as an argument to the script can therefore extract a particular item from the JSON file, and all those cool Groovy collection methods like grep, collect and inject can be used too. The result of each execution is simply printed on screen.
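The binding and shell mechanics can be sketched in isolation like this. The JSON is again a made-up stand-in, and the expressions are hard-coded strings rather than command-line arguments, purely to keep the sketch short.

```groovy
import groovy.json.JsonSlurper

// Hypothetical data standing in for a parsed configuration file.
def cfg = new JsonSlurper().parseText('{"coordinator": {"host": "10.0.0.1", "port": 8080}}')

// The binding is the variable environment the evaluated code will see.
def binding = new Binding()
binding.setVariable('cfg', cfg)

def shell = new GroovyShell(binding)

// Each string is compiled and run as a script; cfg resolves through the binding.
def host = shell.evaluate('cfg.coordinator.host')
def nextPort = shell.evaluate('cfg.coordinator.port + 1')

println host
println nextPort
```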

For my own convenience, since I want this script for executing things on the JSON data structure only, and I don't necessarily want to remember down the line that the variable inside the script is called cfg, I make the command I get via the argument always execute as a method of cfg. This is a bit arbitrary, and I can see that for some purposes this should be simplified and the user of the script should be made to remember herself to use cfg as appropriate, but, as I say, this is for my convenience as is.
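Here is a small sketch of that trade-off. The evalOnCfg closure is a hypothetical helper, not part of the script; it just does the same "cfg." prefixing in one place. The last call shows what the convenience costs: only the first name in an expression gets the prefix, so a second bare reference to coordinator is looked up in the binding and fails.

```groovy
import groovy.json.JsonSlurper

// Hypothetical data standing in for a parsed configuration file.
def cfg = new JsonSlurper().parseText('{"coordinator": {"host": "10.0.0.1", "port": 8080}}')
def shell = new GroovyShell(new Binding([cfg: cfg]))

// Hypothetical helper: every expression is evaluated as a member access on cfg.
def evalOnCfg = { String expression -> shell.evaluate("cfg.${expression}") }

println evalOnCfg('coordinator.host')   // evaluated as cfg.coordinator.host
println evalOnCfg('size()')             // evaluated as cfg.size()

// Only the first name is prefixed; the second "coordinator" is not in the binding.
try {
    evalOnCfg('coordinator.host + ":" + coordinator.port')
} catch (MissingPropertyException e) {
    println "not in the binding: ${e.property}"
}
```

Writing cfg explicitly the second time, as in 'coordinator.host + ":" + cfg.coordinator.port', would work, but at that point the user has to remember the variable name after all.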

An example usage is as follows:

groovy jsongrep.groovy -f file.json coordinator.host coordinator.port

This example is fairly straightforward. It assumes that the top level of the data structure is a map. We execute two commands on it. The first one grabs the contents of the data structure under coordinator from the top level structure and, again assuming that that is a map, grabs the contents under host. The second command grabs port from the same coordinator map. Traversing the structure using Groovy notation is fairly elegant, if I say so myself.

groovy jsongrep.groovy -f file.json 'hosts.inject(""){acc, e -> acc + " " + e}'

This example also assumes that the top level is a map. From this map we grab the substructure under the key hosts and execute inject on it. This is a fold-type function which accumulates the result of operations performed on individual elements of the collection. We use it to glom together a space-separated string. Not as elegant as the basic usage, I suppose, but it gives access to a very powerful mechanism, since you can do a lot of things using those Groovy collection methods.
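For reference, this is what that fold does on a plain list, outside the script. The host names are made up. Note that starting from an empty string and prepending a space on every step leaves a leading space in the result; join is shown next to it as an alternative for this particular case, not as what the script does.

```groovy
// Hypothetical list standing in for cfg.hosts.
def hosts = ['alpha', 'beta', 'gamma']

// inject threads an accumulator through the list, starting from ''.
def folded = hosts.inject('') { acc, e -> acc + ' ' + e }
assert folded == ' alpha beta gamma'   // note the leading space

// For plain concatenation, join gives the same words without the leading space.
def joined = hosts.join(' ')
assert joined == 'alpha beta gamma'

// The same fold shape works for non-string results, e.g. a total length.
def totalLength = hosts.inject(0) { acc, e -> acc + e.size() }
assert totalLength == 14

println folded
println joined
println totalLength
```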

The code:

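#!/usr/bin/env groovy

// CliBuilder is used here via Groovy's default groovy.util import; newer
// Groovy releases ship it as groovy.cli.commons.CliBuilder or
// groovy.cli.picocli.CliBuilder, which need an explicit import.
import groovy.json.JsonSlurper

def parser = new CliBuilder(usage: 'jsongrep.groovy [ -f FILE ] EXPRESSION [ EXPRESSION [...] ]')
parser.with {
    f longOpt: 'file', args: 1, 'Path of JSON file.'
    h longOpt: 'help', 'Usage information.'   // a plain flag, so it takes no argument
}

def options = parser.parse(args)
// parse() reports the problem itself and returns null on invalid arguments.
if (!options) {
    return
}
if (options.h) {
    parser.usage()
    return
}

// Without -f the JSON is read from standard input.
def path = options.f ?: null

def arguments = options.arguments()
def cfg = new JsonSlurper().parseText(path == null ? System.in.text : new File(path).text)

// Expose the parsed structure to evaluated code under the name cfg.
Binding binding = new Binding()
binding.setVariable("cfg", cfg)
GroovyShell shell = new GroovyShell(binding)

// Evaluate each expression as a member access on cfg and print the result.
arguments.each { println shell.evaluate("cfg.$it") }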

The code is available on GitHub at groovy/jsongrep.groovy.