JSON DB: a compressed JSON format
JSON is far less verbose than XML, but sometimes even JSON is more verbose than necessary.
One example is sending many database rows to the browser. If we want to send rows of data representing users and books, the basic JSON approach would be something like the following.
{
"users": [
{"first": "Homer",
"last": "Simpson"},
{"first": "Hank",
"last": "Hill"},
{"first": "Peter",
"last": "Griffin"}
],
"books": [
{"title":"JavaScript",
"author": "Flanagan",
"year": 2006},
{"title": "Cascading Style Sheets",
"author": "Meyer",
"year": 2004},
{"title": "The C Programming Lanaguage",
"author": "Kernighan",
"year": 1988},
{"title": "The Scheme Programming Language",
"author": "Dybvig",
"year": 2003},
{"title": "Design Patterns",
"author": "The Gang of Four",
"year": 1995}
]
}
The repetition of the keys in the above JSON is wasteful. Given the uniformity of the data, there is no need to write first, last, title, author, and year more than once per table (i.e., users and books). As the number of rows of data increases, this repetition becomes increasingly wasteful.
For a project I'm working on now, up to one thousand rows of data, each with about seven fields, are sent to the browser. With a structure like the one above, that is seven thousand keys when just seven would do! To save characters, we are using a compressed format we call JSON DB (with extension .jsondb) in which the keys for each table are listed only once. This format is self-contained and easy to decompress in the browser.
{
"users": {
"cols": ["first", "last"],
"rows": [["Homer", "Simpson"],
["Hank", "Hill"],
["Peter", "Griffin"]]
},
"books": {
"cols": ["title", "author", "year"],
"rows": [["JavaScript", "Flanagan", 2006],
["Cascading Style Sheets", "Meyer", 2004],
["The C Programming Language", "Kernighan", 1988],
["The Scheme Programming Language", "Dybvig", 2003],
["Design Patterns", "Gang of Four", 1995]]
}
}
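For illustration, here is a minimal sketch of the browser-side decompression; the function name expandJsonDb and the variable names are just for this example. It rebuilds the usual array-of-objects form from a parsed JSON DB object.

// Sketch: expand a parsed JSON DB object back into arrays of row objects.
function expandJsonDb(db) {
    var expanded = {};
    for (var tableName in db) {
        if (!db.hasOwnProperty(tableName)) { continue; }
        var cols = db[tableName].cols;
        var rows = db[tableName].rows;
        var objects = [];
        for (var i = 0; i < rows.length; i++) {
            var obj = {};
            for (var j = 0; j < cols.length; j++) {
                obj[cols[j]] = rows[i][j]; // pair each column name with its cell
            }
            objects.push(obj);
        }
        expanded[tableName] = objects;
    }
    return expanded;
}

// Usage: evaluate the .jsondb string first, then expand.
// var data = expandJsonDb(eval('(' + jsondbText + ')'));
// data.users[0].last is then "Simpson".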
With browser apps becoming richer all the time, it seems increasingly common that a database of information must be populated in the browser. I anticipate this JSON DB format being useful many times, and I imagine other situations using JSON could benefit from similar approaches to reducing its size.
Comments
Have something to write? Comment on this article.
That sounds like an interesting idea. How do you handle evaluation on the client side, though? Do you just evaluate the JSON data and then run it through a series of loops to reformat it as expected?
Tobie,
Yes, I eval() the JSON DB string (since it is really just JSON). Then I can run it through a reformatter like you suggest, or I create an object in my model layer for each row.
Thanks for this, Peter. I have a task coming up at work where I might be able to use this technique. Good timing!
Why not CSV?
Tim,
This is pretty close to CSV, but it is just so easy to eval() the JSON on the client side and to dump the JSON on the server side (JSON dumpers escape newlines and quotation marks). Also, this format allows multiple tables of data to be returned together.
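As a rough sketch of the dumping direction (the function name compressRows is just illustrative, and it assumes every row object has the same keys), producing a JSON DB table from an array of row objects could look like this:

// Sketch: build a {cols, rows} table from an array of uniform row objects.
function compressRows(rowObjects) {
    var cols = [];
    for (var key in rowObjects[0]) {
        if (rowObjects[0].hasOwnProperty(key)) { cols.push(key); }
    }
    var rows = [];
    for (var i = 0; i < rowObjects.length; i++) {
        var row = [];
        for (var j = 0; j < cols.length; j++) {
            row.push(rowObjects[i][cols[j]]);
        }
        rows.push(row);
    }
    return { cols: cols, rows: rows };
}

// compressRows([{first: "Homer", last: "Simpson"}]) gives
// { cols: ["first", "last"], rows: [["Homer", "Simpson"]] }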
I'm thinking of combining your more compact table def with TrimPath's TrimQuery.
Just at the idea stage. Seems like it would be easy...
Hey, I was going to use this for sending JSON to the client in XHR calls. I typically request 10 or so items, and could return either an array of key/value pairs or an array of rows and columns, like you have here.
I grabbed some sample data, and here's what I found:
size file
4729 article_list.json
4171 article_list.min.json
3748 article_list_compressed.json
2986 article_list_compressed.min.json
1136 article_list.min.json.gz
1125 article_list_compressed.min.json.gz
The .min files are minified with jsmin. The .gz files are gzipped.
So, depending on your situation, using JSON DB will result in:
Unminified: 20% savings
Minified: 30% savings
Minified and gzipped: 1% savings
So, we're going to gzip the files and skip the JSON DB, since that'll save the processing on each end. (The DB component returns an array of objects, so returning that to the view layer is one call to json_encode.)
Of course, this would likely be different if you had 100 items instead of 10, or really long field names and short field values, or whatever.
The moral of the story: Measure the trade-offs before you make them, even if they seem obvious!
Isaac,
Thanks for posting your comparison. Gzip gets its savings from repeated patterns, which is the same way this JSON DB format gets its savings, so it should not be possible to squeeze out the same redundancy twice; your results confirm that. Some browsers may not accept gzip, though, so JSON DB can still be useful for those browsers.
You can compress it more with the same idea: remove "rows" and "cols" and replace them with the column names. For example:
{
"users": {
"first": ["Homer", "Hank", "Peter"],
"last": ["Simpson", "Hill", "Griffin"]
},
"books": {
"title" :["JavaScript","Cascading Style Sheets","The C Programming Language",
"The Scheme Programming Language","Design Patterns"],
"author":["Flanagan", "Meyer", "Kernighan", "Dybvig", "Gang of Four"],
"year" :[2004, 1988, 2003, 1995]
}
}
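A minimal sketch of expanding this column-oriented variant back into row objects (the name expandColumns is mine; it assumes every column array in a table has the same length):

// Sketch: turn {col: [values...]} tables back into arrays of row objects.
function expandColumns(db) {
    var expanded = {};
    for (var tableName in db) {
        if (!db.hasOwnProperty(tableName)) { continue; }
        var table = db[tableName];
        var objects = [];
        for (var col in table) {
            if (!table.hasOwnProperty(col)) { continue; }
            for (var i = 0; i < table[col].length; i++) {
                if (!objects[i]) { objects[i] = {}; }
                objects[i][col] = table[col][i];
            }
        }
        expanded[tableName] = objects;
    }
    return expanded;
}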