Loading continually updated ndjson data from flat files : inmydata

The inmydata publisher app lets you create queries that import flat files, including files containing ndjson data, and publish them to your tenant. These flat files are often updated continually by another process.
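For readers unfamiliar with the format, ndjson (newline-delimited JSON) stores one JSON value per line, which is why another process can keep appending records to the same file. The following is a minimal sketch of such a writer and a matching reader. The function names, the file name events.ndjson and the record fields are hypothetical and are not part of the publisher app.

```python
import json


def append_record(path, record):
    """Append one record to the file as a single JSON line (ndjson convention)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_ndjson(path):
    """Parse every non-blank line of the file as a separate JSON value."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Hypothetical file name and fields, for illustration only.
    append_record("events.ndjson", {"id": 1, "status": "new"})
    append_record("events.ndjson", {"id": 2, "status": "shipped"})
    print(read_ndjson("events.ndjson"))
```

Because each record is a complete line, appending never requires rewriting the earlier contents of the file.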

From version 8.00.0129 onwards, the data definition (query) wizard lets you define a query that loads every file containing ndjson data in a directory. You can optionally set the query to load only files whose archive bit is set, and to clear the archive bit once a file has been loaded. The operating system sets the archive bit whenever a file is created or modified.
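To illustrate the archive bit described above, here is a minimal sketch of how it can be inspected from Python on Windows. It is not the publisher app's implementation. The helper names are hypothetical; the value 0x20 is the Windows FILE_ATTRIBUTE_ARCHIVE flag, and st_file_attributes is only available on Windows.

```python
import os

# Windows file attribute flag for "archive" (FILE_ATTRIBUTE_ARCHIVE).
FILE_ATTRIBUTE_ARCHIVE = 0x20


def archive_bit_set(attributes):
    """Return True if the archive flag is present in an attribute bitmask."""
    return bool(attributes & FILE_ATTRIBUTE_ARCHIVE)


def clear_archive_bit(attributes):
    """Return the bitmask with the archive flag removed, other flags untouched."""
    return attributes & ~FILE_ATTRIBUTE_ARCHIVE


def files_needing_load(directory, suffixes=(".ndjson", ".json")):
    """Windows only: list matching files whose archive bit is currently set."""
    pending = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name.lower().endswith(suffixes):
            # st_file_attributes exists only on Windows.
            attrs = entry.stat().st_file_attributes
            if archive_bit_set(attrs):
                pending.append(entry.path)
    return sorted(pending)
```

The sketch only computes the new bitmask; writing it back to the file (for example with the Windows attrib -a command) is left out.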

To prevent duplicate rows when a file is modified after it has been loaded, the publisher app adds the name of the file to a column called imd_key in the data and forces this column to be used in the publish settings key. As a result, any previously published data from a file is dropped from the published dataset before that file is reloaded.
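The effect of keying on the file name can be sketched as follows. This is an illustration of the behaviour described above, assuming the published dataset is a simple list of rows; it is not the publisher app's actual code, and the function name and sample file names are hypothetical.

```python
def reload_file(published_rows, file_name, new_rows):
    """Replace all rows previously published from file_name with new_rows.

    Each row is a dict; the source file name is stored in the imd_key column.
    """
    # Drop everything that was published from this file earlier.
    kept = [row for row in published_rows if row.get("imd_key") != file_name]
    # Tag the freshly loaded rows with the file they came from.
    tagged = [{**row, "imd_key": file_name} for row in new_rows]
    return kept + tagged


if __name__ == "__main__":
    dataset = reload_file([], "a.ndjson", [{"id": 1}])
    dataset = reload_file(dataset, "b.ndjson", [{"id": 10}])
    # a.ndjson was modified and now holds two rows; its old row is replaced.
    dataset = reload_file(dataset, "a.ndjson", [{"id": 1}, {"id": 2}])
    print(dataset)
```

Rows from other files are left untouched, so reloading a modified file never duplicates its earlier contents.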

Together, these options let you create a query that efficiently loads only new or updated ndjson data from flat files in a particular directory. To do so, follow these steps:

Open the inmydata publisher app on a machine that has access to the directory containing the ndjson files.
Select File-->Add New task from the menu.
Enter a name for your new task and press Enter.
Select Data-->Add data definition from the menu.
Press Next, enter a name for the data definition in the box labelled Query Name, then press Next.
Select Import data from a file in the box labelled Query Type and press Next.
Click the browse button (...), select the file type ndjson (*.ndjson, *.json) and select the first data file in your directory.
Check the check box labelled Import all files of the same type in the directory.
Check the check box labelled Honour archive flag and press Next.
Ensure each column has the correct type and label, then press Next.
Press Next, then press Finish.
