The story of Internet Object

By Mohamed Aamir Maniar, 12th Sep 2019

There is only one thing stronger than all the armies of the world: and that is an idea whose time has come. ~Victor Hugo

It was the summer of 2017, and I was working on a JSON-based REST API for one of the projects. The front end was JavaScript and the back end was Python. I had designed and developed hundreds of REST APIs before. This time, however, something felt wrong: I realized that with JSON we were exchanging a huge amount of unnecessary information with the server. JSON's structure requires you to send key/value pairs. Remove the keys and pass just the values, and you save a significant amount of bandwidth! Wow, why had this thought not come earlier? Well, it had; it kept coming but never stuck. This time it stayed and strongly prompted me to look into the problem and find a solution.
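To make the bandwidth point concrete, here is a minimal sketch that compares the size of a keyed JSON payload with a values-only one. The employee records and field names are hypothetical, and the values-only layout is plain JSON arrays used for illustration; it is not the Internet Object format.

```python
import json

# Hypothetical sample records; the field names are illustrative only.
employees = [
    {"name": "Alice", "age": 30, "department": "Engineering"},
    {"name": "Bob", "age": 41, "department": "Finance"},
    {"name": "Carol", "age": 27, "department": "Design"},
]

def keyed_size(records):
    """Bytes needed when every record repeats its keys."""
    return len(json.dumps(records, separators=(",", ":")).encode("utf-8"))

def values_only_size(records):
    """Bytes needed when only the values are sent, in a fixed key order."""
    rows = [list(record.values()) for record in records]
    return len(json.dumps(rows, separators=(",", ":")).encode("utf-8"))

with_keys = keyed_size(employees)
without_keys = values_only_size(employees)
print(f"with keys: {with_keys} bytes, values only: {without_keys} bytes")
```

The saving grows with the number of records, because the keys are repeated in every object while a values-only layout needs the key order to be agreed on just once.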

I think when you have a lot of jumbled up ideas they come together slowly over a period of several years. ~Tim Berners-Lee

I slept on the thought and found that carrying unnecessary bytes in the form of keys is not the only issue with JSON: it does not enforce schemas either! The lack of a built-in schema leads to other issues, such as extra data validation, lack of clarity, longer development time, higher development cost, and so on. For example, every time we serialize and deserialize data, we are required to validate it. Just imagine how much effort one API endpoint needs when it serves two different client types (desktop and mobile)! In many cases, you have to validate the data up to six times: once when sending and once when receiving, at each of the server, the desktop client, and the mobile client. How much effort does that cost? How much time and money could one save if validation were built in? There are libraries and frameworks that simplify this job; still, you have to put in extra effort, and often that extra effort leads to inconsistent validation mechanisms.
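As a rough illustration of what a single shared definition could buy, the sketch below validates records against one schema that every party could reuse. The schema layout, the field names, and the validate helper are all assumptions made for this example; they do not describe any existing library or the Internet Object design.

```python
# Hypothetical schema: field name -> (expected type, required?)
EMPLOYEE_SCHEMA = {
    "name": (str, True),
    "age": (int, True),
    "email": (str, False),
}

def validate(record, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for field, (expected_type, required) in schema.items():
        if field not in record:
            if required:
                errors.append(f"missing required field: {field}")
            continue
        if not isinstance(record[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
    for field in record:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors

print(validate({"name": "Alice", "age": 30}, EMPLOYEE_SCHEMA))
print(validate({"name": "Bob", "age": "41"}, EMPLOYEE_SCHEMA))
```

If the server, the desktop client, and the mobile client all ran the same check from the same definition, the six validation points would at least agree with one another.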

JSON not only mixes keys and values and lacks a schema; it also mixes data and headers (or metadata, for that matter). It does not provide a standard way to keep them separate. For example, if you want to return a large collection of employees with pagination support, chances are your API response will look like this.

{
   "count": 12342,
   "currentPage": 2,
   "pageSize": 100,
   "nextPage": "https://example.com/api/v1/employees/?page=3",
   "previousPage": "https://example.com/api/v1/employees/?page=1",
   "employees": [{...}, {...}, …]
}

If you look closely, this JSON document mixes the data (employees) with non-data keys (headers) such as count, currentPage, and pageSize in the same response.
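Because JSON has no standard place for headers, each consumer has to know by convention which key holds the data. The sketch below shows that convention written out by hand. The split_response helper is hypothetical, and the two employee records are stand-ins, since the example above elides the real ones.

```python
# Pagination response modeled on the example above. The employee records
# are hypothetical stand-ins, because the original elides them.
response = {
    "count": 12342,
    "currentPage": 2,
    "pageSize": 100,
    "nextPage": "https://example.com/api/v1/employees/?page=3",
    "previousPage": "https://example.com/api/v1/employees/?page=1",
    "employees": [{"name": "Alice"}, {"name": "Bob"}],
}

def split_response(payload, data_key):
    """Separate the data collection from everything else (the header)."""
    header = {key: value for key, value in payload.items() if key != data_key}
    data = payload.get(data_key, [])
    return header, data

header, employees = split_response(response, "employees")
print(sorted(header))
print(len(employees))
```

Note that the caller must pass "employees" explicitly: nothing in the document itself says which key is data and which keys are headers.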

The Quest for a Solution

This is when I had an epiphany (maybe epiphany is too heavy a word to describe this 😉; you may choose to ignore it): JSON is not the most suitable format for data exchange over the internet. JSON may be simple, easy to learn, and self-explanatory, but it lacks some essential qualities. That may be because JSON was not originally built from scratch for the web! It was discovered and adopted because it happened to be a simple alternative to the complex XML. JSON simplified many things, and over time, through REST, it drove SOAP into that little corner where only a few people wish to go! In that sense JSON has done remarkably well but, as mentioned, it lacks essentials required for the web.

While I was going through that phase, the following questions popped into my mind.

  1. What are the best ways to remove the keys and other unnecessary bytes from the JSON document?
  2. How can we split the document into data and definition so that only data can be sent over the wire?
  3. Why can't the format's parsers, serializers, and deserializers validate the data themselves?
  4. Should the format support streaming data over the wire, so that instead of sending everything in one go we can send it in parts?
  5. Can we efficiently handle metadata (such as record counts and error details)?
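Questions 2 and 4 can be explored with a small experiment: send the definition once, then stream the values in parts. The sketch below does this with newline-style JSON messages. The message layout, the function names, and the sample records are assumptions for illustration only, not the Internet Object format.

```python
import json
from typing import Iterable, Iterator

def encode_stream(fields, records, chunk_size=2) -> Iterator[str]:
    """Yield the definition first, then chunks of value rows as JSON text."""
    yield json.dumps({"fields": fields})
    for start in range(0, len(records), chunk_size):
        chunk = records[start:start + chunk_size]
        yield json.dumps([[record[f] for f in fields] for record in chunk])

def decode_stream(messages: Iterable[str]) -> Iterator[dict]:
    """Rebuild full records from the definition and the value rows."""
    iterator = iter(messages)
    fields = json.loads(next(iterator))["fields"]
    for message in iterator:
        for row in json.loads(message):
            yield dict(zip(fields, row))

# Hypothetical sample records.
records = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 41},
    {"name": "Carol", "age": 27},
]
messages = list(encode_stream(["name", "age"], records))
decoded = list(decode_stream(messages))
print(len(messages), decoded == records)
```

The receiver can start processing rows as soon as the first chunk arrives, and the keys cross the wire only once, in the definition message.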

The Internet Object

Having identified the problems, I was eager to solve them. I started working on a new design, and in this quest I looked into many existing formats, approaches, and mechanisms. These included JSON, SGML, XML, SOAP, CSV, YAML, HTML, MIME, and many others. Some of them were not data-interchange formats at their core, but they offered something inspiring. I figured the new web serialization format must have the following qualities.

  1. Thin and compact: It must serialize data into the smallest possible size.
  2. Schema oriented: It must have a built-in schema with strong validation support.
  3. Well structured: It must keep data and metadata (or headers) separate.
  4. Simple and easy: It must be a simple, human-readable, Unicode-based format that is easy to get started and work with.

While researching, one name frequently flashed into my mind: "Internet Object". It made sense because it represents a data object that travels through the web and the internet. Very soon, I started calling it "The Internet Object".

It's not an experiment if you know it's going to work ~Jeff Bezos

The Internet Object became my favorite side project. Whenever I had time to spare, I'd work on new designs and formats. I continuously tested those draft designs by running various cases against them. I spoke with many developers and tech architects in my circle. Ultimately, after two years, I am almost ready to tell the world that something interesting is coming up very soon. So, please stay tuned!

Best Regards
Mohamed Aamir Maniar
