2017-12-06
The Advent of Void: Day 6: jq
Day 6! Today we want to get some pseudo modern web technology in the game. jq(1) is a commandline tool that allows to query and manipulate JSON documents and is great to use with shell scripts.
Let’s get started with this simple JSON document:
{
"title" : "The Advent of Void: Day 4: containers",
"pubDate" : "2017-12-04 00:00:00",
"link" : "http://voidlinux.eu/news/2017/12/advent-containers.html",
"guid" : "http://voidlinux.eu/news/2017/12/advent-containers"
}
To query the title of this element you can use the following:
# jq '.title' file.json
"The Advent of Void: Day 4: containers"
Now, this is still formatted as a JSON string. Let’s get the plain string we
can pass -r
on the shell:
# jq -r '.title' file.json
The Advent of Void: Day 4: containers
To get both the title
and the pubDate
, just add it as a comma seperated list:
# jq -r '.title, .pubDate' file.json
The Advent of Void: Day 4: containers
2017-12-04 00:00:00
That’s quite helpful. But the real power of jq comes to light once we work with JSON Arrays. Let’s have a little more complex example:
[
{
"guid" : "http://voidlinux.eu/news/2017/12/advent-containers",
"pubDate" : "2017-12-04 00:00:00",
"link" : "http://voidlinux.eu/news/2017/12/advent-containers.html",
"title" : "The Advent of Void: Day 4: containers"
},
{
"link" : "http://voidlinux.eu/news/2017/12/advent-ministat.html",
"title" : "The Advent of Void: Day 3: ministat",
"guid" : "http://voidlinux.eu/news/2017/12/advent-ministat",
"pubDate" : "2017-12-03 00:00:00"
},
{
"link" : "http://voidlinux.eu/news/2017/12/advent-taskwarrior.html",
"title" : "The Advent of Void: Day 2: taskwarrior and friends",
"pubDate" : "2017-12-02 00:00:00",
"guid" : "http://voidlinux.eu/news/2017/12/advent-taskwarrior"
},
{
"link" : "http://voidlinux.eu/news/2017/12/advent-gcal.html",
"title" : "The Advent of Void: Day 1: gcal",
"guid" : "http://voidlinux.eu/news/2017/12/advent-gcal",
"pubDate" : "2017-12-01 00:00:00"
}
]
Assume we want a list of all links and titles as a CSV in the document: Our first step is to extract them both from the document:
# jq -r '.[] | ( .title, .link )' file.json
The Advent of Void: Day 4: containers
http://voidlinux.eu/news/2017/12/advent-containers.html
The Advent of Void: Day 3: ministat
http://voidlinux.eu/news/2017/12/advent-ministat.html
The Advent of Void: Day 2: taskwarrior and friends
http://voidlinux.eu/news/2017/12/advent-taskwarrior.html
The Advent of Void: Day 1: gcal
http://voidlinux.eu/news/2017/12/advent-gcal.html
That’s all that’s needed to query the information. To format the output we need to generate an array from every article:
# jq -r '.[] | [.title, .link ]' json
[
"The Advent of Void: Day 4: containers",
"http://voidlinux.eu/news/2017/12/advent-containers.html"
]
[
"The Advent of Void: Day 3: ministat",
"http://voidlinux.eu/news/2017/12/advent-ministat.html"
]
[
"The Advent of Void: Day 2: taskwarrior and friends",
"http://voidlinux.eu/news/2017/12/advent-taskwarrior.html"
]
[
"The Advent of Void: Day 1: gcal",
"http://voidlinux.eu/news/2017/12/advent-gcal.html"
]
Now as a last step join the arrays using the tab character:
# jq -r '.[] | [.title, .link ] | join("\t")' file.json
The Advent of Void: Day 4: containers http://voidlinux.eu/news/2017/12/advent-containers.html
The Advent of Void: Day 3: ministat http://voidlinux.eu/news/2017/12/advent-ministat.html
The Advent of Void: Day 2: taskwarrior and friends http://voidlinux.eu/news/2017/12/advent-taskwarrior.html
The Advent of Void: Day 1: gcal http://voidlinux.eu/news/2017/12/advent-gcal.html
All in all jq is a quite powerful tool and we did only scratch the surface of what you can do with jq. For more comprehensive documentation consider the jq Manual and try it online.