json

JSON importer only supports name based loading.

It does not support schema discovery.

By default, the JSON importer accepts JSON data with array at the root level, but it is possible to specify a row expression to if the array is a decendent of the root object.

Path Expression

Jaqy JSON importer does not use JSONPath, which is too complicated for dealing with very large data. Instead, it uses a simple parent.child notation to specify the path. For example, a column expression of a.b is equivalent to JSONPath $.a.b.

Additionally, the row expression implicitly considers the data reached by the path is a JSON array of JSON objects. For a row expression of items, it is equivalent to JSONPath $.items[*].

Options

-a,--array                   treats BSON root document as array.
-b,--binary <base64 | hex>   sets the binary format.
-c,--charset <arg>           sets the file character set
-f,--format <text | bson>    sets the JSON format.
-r,--rowexp <arg>            sets the row expression

Examples

Example 1: Simple JSON Data

For the following simple JSON data, the row expression is an empty string, and thus not needed to be specified. The column expressions a and b can be used to extract the attributes a and b in the root level of JSON objects of each row of data.

[
        {"a":1,"b":2},
        {"a":3,"b":4},
        {"a":5,"b":6},
        {"a":7,"b":8}
]

The following is the import command and SQL.

.import json -b hex myfile.json
INSERT INTO MyTable VALUES ({{a}}, {{b}});

The result is a four rows of data.

Example 2: Shredding Complicated JSON Data

In this example, we import United States state cartographic boundary GeoJSON data ( https://github.com/Teradata/jaqy/blob/master/tests/unittests/json/lib/cb_2017_us_state_500k.json ), which is generated by downloading the 500k resolution Shapefiles zip and uploading to https://mapshaper.org/ to convert to GeoJSON format.

The resulting JSON is roughly in the following format.

{
        "type" : "FeatureCollection",
        "features" : [
                {
                        "type" : "Feature",
                        "geometry" : {
                                "type" : "Polygon",
                                "coordinates" : [ ]
                        },
                        "properties" : {
                                "STATEFP" : "54",
                                "STATENS" : "01779805",
                                "AFFGEOID" : "0400000US54",
                                "GEOID" : "54",
                                "STUSPS" : "WV",
                                "NAME" : "West Virginia",
                                "LSAD" : "00",
                                "ALAND" : 62265662566,
                                "AWATER":489840834
                        }
                },
                {
                }
        ]
}

In the following JSON import, the row expression is features. For column expressions, properties.GEOID, properties.STUSPS, and properties.NAME are used to extract metadata inside properties. geometry is used to retrieve the GeoJSON data for the state boundaries.

CREATE TABLE stateTable(geoId INTEGER, state VARCHAR(4), name VARCHAR(100) CHARACTER SET UTF8, shape MEDIUMTEXT);
.import json -r features lib/cb_2017_us_state_500k.json
INSERT INTO stateTable VALUES ({{properties.GEOID}}, {{properties.STUSPS}}, {{properties.NAME}}, {{geometry}});

A sample output can be seen at https://github.com/Teradata/jaqy/blob/master/tests/unittests/json/control/json_import_5.control , which also imports U.S. city boundaries.