json¶
JSON importer only supports name based loading.
It does not support schema discovery.
By default, the JSON importer accepts JSON data with array at the root level, but it is possible to specify a row expression to if the array is a decendent of the root object.
Path Expression¶
Jaqy JSON importer does not use
JSONPath, which is too
complicated for dealing with very large data. Instead, it uses a simple
parent.child notation to specify the path. For example, a column expression
of a.b
is equivalent to JSONPath $.a.b
.
Additionally, the row expression implicitly considers the data reached by the
path is a JSON array of JSON objects. For a row expression of items
, it
is equivalent to JSONPath $.items[*]
.
Options¶
-a,--array treats BSON root document as array.
-b,--binary <base64 | hex> sets the binary format.
-c,--charset <arg> sets the file character set
-f,--format <text | bson> sets the JSON format.
-r,--rowexp <arg> sets the row expression
Examples¶
Example 1: Simple JSON Data¶
For the following simple JSON data, the row expression is an empty string,
and thus not needed to be specified. The column expressions a
and b
can be used to extract the attributes a
and b
in the root level of
JSON objects of each row of data.
[
{"a":1,"b":2},
{"a":3,"b":4},
{"a":5,"b":6},
{"a":7,"b":8}
]
The following is the import command and SQL.
.import json -b hex myfile.json
INSERT INTO MyTable VALUES ({{a}}, {{b}});
The result is a four rows of data.
Example 2: Shredding Complicated JSON Data¶
In this example, we import United States state cartographic boundary GeoJSON data ( https://github.com/Teradata/jaqy/blob/master/tests/unittests/json/lib/cb_2017_us_state_500k.json ), which is generated by downloading the 500k resolution Shapefiles zip and uploading to https://mapshaper.org/ to convert to GeoJSON format.
The resulting JSON is roughly in the following format.
{
"type" : "FeatureCollection",
"features" : [
{
"type" : "Feature",
"geometry" : {
"type" : "Polygon",
"coordinates" : [ ]
},
"properties" : {
"STATEFP" : "54",
"STATENS" : "01779805",
"AFFGEOID" : "0400000US54",
"GEOID" : "54",
"STUSPS" : "WV",
"NAME" : "West Virginia",
"LSAD" : "00",
"ALAND" : 62265662566,
"AWATER":489840834
}
},
{
}
]
}
In the following JSON import, the row expression is features
.
For column expressions, properties.GEOID
, properties.STUSPS
, and
properties.NAME
are used to extract metadata inside properties
.
geometry
is used to retrieve the GeoJSON data for the state boundaries.
CREATE TABLE stateTable(geoId INTEGER, state VARCHAR(4), name VARCHAR(100) CHARACTER SET UTF8, shape MEDIUMTEXT);
.import json -r features lib/cb_2017_us_state_500k.json
INSERT INTO stateTable VALUES ({{properties.GEOID}}, {{properties.STUSPS}}, {{properties.NAME}}, {{geometry}});
A sample output can be seen at https://github.com/Teradata/jaqy/blob/master/tests/unittests/json/control/json_import_5.control , which also imports U.S. city boundaries.