The Wonderful and Dangerous to_json from Postgres

The Wonderful and Dangerous to_json from Postgres

PostgreSQL is an amazing RDBMS. Not only is has it been ACID compliant since 2001, it supports many data types and functions, putting it on par with Document Store systems like MongoDB. Postgres supports a wide range of JSON formats and methods to store, search, and transform JSON object data. For instance, the to_json and related function json_agg allow you to collect a row into a JSON object. Handy for when you have nested data and want PG to return an object with nested values.

An Example Usage of to_json

Let’s take a look at how to use to_json in your queries. A really simple example involving two tables. The following SQL should create and populate two tables.

CREATE TABLE company (
    id SERIAL PRIMARY KEY,
    name text,
    website text,
    range int4range
);
CREATE UNIQUE INDEX company_pkey ON company(id int4_ops);

CREATE TABLE person (
    id SERIAL PRIMARY KEY,
    name text,
    phone text,
    company_id integer REFERENCES company(id)
);
CREATE UNIQUE INDEX person_pkey ON person(id int4_ops);

INSERT INTO "public"."company"("id","name","website","range")
VALUES
(1,E'Apple',E'www.apple.com',E'[100,150)'),
(2,E'Google',E'www.google.com',E'[80,120)');

INSERT INTO "public"."person"("id","name","phone","company_id")
VALUES
(1,E'Shane',E'3035551212',1),
(2,E'Larry',E'4045551212',2);

Now we can get the rows in person with nested JSON values for the company

select person.*, to_json(company) as company
from person
left join company on company.id = person.company_id;

This gives us a nice JSON array of results

[
  {
    "id" : 1,
    "phone" : "3035551212",
    "company" : {"id":1,"name":"Apple","website":"www.apple.com","range":"[100,150)"},
    "name" : "Shane",
    "company_id" : 1
  },
  {
    "id" : 2,
    "phone" : "4045551212",
    "company" : {"id":2,"name":"Google","website":"www.google.com","range":"[80,120)"},
    "name" : "Larry",
    "company_id" : 2
  }
]

Do you see how we get a nice JSON array with nested values for the person.company? This makes app development a breeze. We no longer have to make two different queries and stitch tegther the results, PG performs the hard work for us.

NOTE: You can use json_agg, which will give you a JSON array for one-to-many relationships. We’re not going to cover it here, but in all likelihood, you’ll need to utilize group by when using json_agg

What’s so Dangerous?

Glad you asked. By converting the rows you return to JSON, you lose the specific types of the fields. Now the company field is JSON and the items within just use primitive JSON types (string, numbers, boolean, arrays, objects). You lose any data type hints for the postgres client driver to utilize. This is true for dates, dates with timestamps, integers, floats, and any other postgres data type that is not just a JSON primitive. Looks specifically at the company.range field. For all the client knows, it’s just a text field, it doesn’t know it’s actually an int4range field type.

An Example in a current app

I’m working on a app utilizing the PG Range Types. In this instances, it’s the int4range type to signify age ranges. For example, it might have age range values of [0, 10), or [10, 15), or [15, 100). This is range type literal notation for Number range beginning at 0 and ending on 9. I also use a really handy Node.js postgres tool pg-range, it will detect a range field and automagically parse the field into other properties, such as begin, end, and some nice comparison methods.

The problem, when I use to_json, the fields returned for age_range is not a int4range, but rather a text data type. Therefor, it doesn’t get converted by the pg-range driver and I lose the ability to call .begin or .end on that field.

Conclusion

By all means, use to_json and json_agg to your advantage and leverage their power. You just need to be careful you understand you’re getting back JSON and not real workable Postgres values.