JSON vs. CSV - When to use each adapter
Overview
This article explores the JSON and CSV adapters, compares their suitability for various use cases, and recommends approaches to ingesting both simple and complex data through these flat-file adapters.
What Are the JSON and CSV Adapters?
The JSON and CSV adapters provide the ability to ingest custom flat files into Axonius. These adapters provide a flexible means of capturing data which is not already available through your existing adapters and often provide supplemental information for end users.
JSON, or JavaScript Object Notation, is a key-value oriented format in which each attribute is provided as a “Key”: “Value” pair. Its main advantage is that it supports lists and nested objects, so in cases where a specific attribute has more than one value (for example, last used users), JSON may be used to capture that information.
CSV, or Comma-Separated Values, is a standard tabular representation of data, similar to what you would see in a spreadsheet such as Excel or in a SQL table. A CSV file starts with a header row that names the columns, followed by multiple lines of data whose values appear in the same order as the header.
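To make the difference concrete, here is a minimal Python sketch that parses the same device from a CSV snippet and a JSON snippet. The field names (hostname, serial, last_used_users) and the sample values are illustrative assumptions, not names required by Axonius.

```python
import csv
import io
import json

# Hypothetical sample data; the field names are illustrative only.
csv_text = "hostname,serial,last_used_users\nweb01,SN123,alice\n"
json_text = (
    '[{"hostname": "web01", "serial": "SN123", '
    '"last_used_users": ["alice", "bob"]}]'
)

csv_rows = list(csv.DictReader(io.StringIO(csv_text)))
json_rows = json.loads(json_text)

# Every CSV cell is parsed as a plain string, so it holds a single value.
print(type(csv_rows[0]["last_used_users"]).__name__)

# JSON preserves the list, so one record can hold several values.
print(type(json_rows[0]["last_used_users"]).__name__)
```

The CSV cell can only carry one string, while the JSON record keeps both users in a single list on the same device record.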
For the purposes of Axonius and how it interprets data, there are a few key considerations that you should keep in mind when choosing one of the two formats:
- Axonius treats every row of any flat file (JSON/CSV) as a unique device. If your data is structured so that one device spans many rows, you should consider consolidating them.
- Axonius defines a set of required fields, and every file (JSON or CSV) must contain at least one of them.
- As with any data source, the more information you supply that can assist with correlation (such as serial number, hostname, IP address or MAC address), the more likely the file’s contents are to correlate with their corresponding devices in Axonius.
- Axonius’s CSV adapter does not support complex or nested objects.*
- Like all adapters, CSV and JSON adapters may be scheduled on a custom discovery cycle.
*CSVs may be used to produce Installed Software data in one specific circumstance: when the file follows a layout provided by Axonius.
Basic information about the CSV and JSON adapters can be found in Axonius’ documentation.
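Before uploading a file, it can help to check that its header contains at least one identifying column. The sketch below is one way to do that in Python. The column names in ASSUMED_IDENTIFYING_FIELDS are placeholders chosen for illustration; they are not the official list of required fields, which should be taken from the Axonius documentation.

```python
import csv
import io

# ASSUMPTION: placeholder column names for illustration only. Replace them
# with the required fields listed in the Axonius documentation.
ASSUMED_IDENTIFYING_FIELDS = {"hostname", "serial", "mac_address", "ip_address"}


def identifying_columns(csv_text):
    """Return the assumed identifying fields that appear in the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = {name.strip().lower() for name in (reader.fieldnames or [])}
    return sorted(header & ASSUMED_IDENTIFYING_FIELDS)


sample = "Hostname,Serial,Criticality\nweb01,SN123,high\n"
found = identifying_columns(sample)
if found:
    print("Identifying columns found:", found)
else:
    print("No identifying column found; the file is unlikely to correlate.")
```

The more of these columns a file carries, the better its chances of correlating with existing devices.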
When to Use Either Adapter
Ultimately the use of flat files is an exception-based process. These files are used to obtain data that is not currently captured via any known adapter and is not stored on any accessible database.
As a simple rule of thumb:
- For data that can exist as a single attribute of a device, use CSV.
- For data that has a many-to-one relationship to a device (many values of the same field per device), use JSON.
The rationale is that the CSV adapter presently accepts only string-based data, so it does not accept lists or complex objects. Capturing multiple values of the same attribute for each device within a CSV would therefore require duplicating the device across multiple rows. Duplicating the same device over and over to capture the full scope of the data can cause issues with how the data is represented and how it correlates. These issues include:
- Each row of a CSV is seen as a new device, so you end up with a new CSV connection for each row that a device appears on (if a single device has 200 rows, you’ll end up with 200 connections for that device).
- With the way correlation presently works, presenting a large number of separate rows for the same device drastically increases the number of correlation cycles needed before all rows are correlated together. You may be left with a long wait (days or weeks, while the correlation engine passes over the data numerous times) before the rows merge into a single device record, which may not be feasible for your usage pattern.
- The way device connections are presented means that the Device Details page for each such device will be unruly, with over 200 connection labels in this case drowning out any other connections for that device.
- As each row is seen as a separate device, there is no quick way to search the Device Details page for the connection that holds your desired data. Instead, you’d have to check each connection manually, one at a time.
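A quick way to tell whether a CSV will run into these issues is to count how many rows each device occupies before uploading it. The following is a minimal sketch, assuming the device is identified by a column named hostname; that column name and the sample data are hypothetical.

```python
import csv
import io
from collections import Counter


def rows_per_device(csv_text, key="hostname"):
    """Count how many CSV rows share the same value in the key column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[key] for row in reader)


# Hypothetical sample: server1 appears on two rows, server2 on one.
sample = (
    "hostname,website\n"
    "server1,a.example\n"
    "server1,b.example\n"
    "server2,c.example\n"
)
counts = rows_per_device(sample)
repeated = {host: n for host, n in counts.items() if n > 1}
print(repeated)
```

Any device that shows up in repeated would be loaded as several separate connections, which suggests the data is a better fit for JSON.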
The solution for providing many-to-one data (e.g. business applications that exist on a single web server, or historical snapshots of a specific field for the last 6 months) is to use lists of nested objects in JSON. With these list objects, a device can be provided as a single JSON record, which correlates easily and does not cause a glut in the UI.
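If the source data already exists as a CSV with one row per value, it can be collapsed into one JSON record per device. The sketch below shows one possible approach in Python. The column names (hostname, url, owner) and the nested field name websites are assumptions made for illustration and are not an Axonius-defined layout.

```python
import csv
import io
import json


def group_rows(csv_text, key="hostname", list_field="websites"):
    """Collapse many CSV rows per device into one record with a nested list."""
    devices = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        device_id = row.pop(key)
        record = devices.setdefault(device_id, {key: device_id, list_field: []})
        # The remaining columns of the row become one nested object.
        record[list_field].append(row)
    return list(devices.values())


# Hypothetical sample: three CSV rows describing two devices.
sample = (
    "hostname,url,owner\n"
    "server1,a.example,team-a\n"
    "server1,b.example,team-b\n"
    "server2,c.example,team-a\n"
)
records = group_rows(sample)
print(json.dumps(records, indent=2))
```

Three CSV rows become two JSON records here, one per device, with each website held as a nested object in that device's list.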
An Example of how complex objects are handled by CSV and JSON
To illustrate this point, let’s take a CSV file and an equivalent JSON file that contain the same data in a many-to-one relationship to a device. In this case, we are looking at websites hosted by 3 web servers: Server 1 has 6 websites, Server 2 has 9 websites and Server 3 has 273 websites.
JSON Output
Viewing the JSON Output for the file we get 3 Devices as a result of loading the file:
Drilling down on the JSON Output for the complex object we see:
Here all of the website data is now stored under a queryable attribute called “JSON Website Details”. This field can be queried the same way as any other complex object (e.g. Installed software) and exists on 1 connection.
CSV Output
Viewing the CSV output, we get 36 rows as a result of loading the file (and performing several correlation cycles).
Drilling down on Server 3 (which has over 200 rows in the CSV), we get the following view:
Here, every CSV entry is treated as a new connection, so there is a glut of connections, and finding any particular piece of information takes effort because it is not consolidated.
Example Use Cases for CSVs
Watchlists
- Providing a set of Devices which you would like to treat differently (via dashboards, queries or enforcements) such as known rogue devices, stolen devices or highly critical devices
Business Criticality and Location
- Providing supplementary data to enrich the dataset for devices. You may not presently capture a reliable measure of criticality or a reliable location for a device in your application; a manual CSV can be used to capture this data and feed it into the system.
Remediation of CMDB Records
- In cases where a large amount of information that was captured manually outside the CMDB needs to be updated within it, a CSV of all the corresponding data may be supplied, and the CMDB enrichment enforcements may be used to flow it into the CMDB.
Example Use Cases for JSON
Business Application Association
- Providing metadata of applications or processes that may rely upon a certain device (such as websites, databases and backups).
Historical Metric Storage
- In cases where you want to retain a historical record of certain attributes on a rolling period, you can use a list of JSON objects to store each period as a separate object and thus be able to store and query that data as needed.
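As a rough illustration of the rolling-period idea, the Python sketch below appends one snapshot object per period to a device record and keeps only the most recent entries. The field names (history, period, disk_used_gb), the six-period window and the sample values are all hypothetical.

```python
import json


def append_snapshot(record, snapshot, field="history", max_periods=6):
    """Append a snapshot and keep only the most recent max_periods entries.

    Assumes max_periods is at least 1.
    """
    history = record.setdefault(field, [])
    history.append(snapshot)
    # Drop the oldest entries once the list grows beyond the window.
    del history[:-max_periods]
    return record


# Hypothetical device record with eight monthly snapshots added in order.
device = {"hostname": "web01"}
for month in range(1, 9):
    append_snapshot(
        device,
        {"period": f"2024-{month:02d}", "disk_used_gb": 100 + month},
    )

print(json.dumps(device, indent=2))
```

After eight snapshots the record holds only the latest six, each stored as a separate object on the one device record.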