Data Quality & Data Validation
Axonius can give you a more accurate & relevant view of your data than the data source itself. Data quality is fundamental to all aspects of IT. When you trust the data, you’re able to work confidently, effectively & efficiently. My customers ask data-related questions such as: Why do the device counts not match between the source and Axonius? Why is this device or user not found in Axonius? Why isn’t this field populated on every device? The goal of this article is to expand your foundational knowledge of the data in Axonius. With this knowledge, you will be able to tune settings and better understand & trust the data.
Table of Contents
- Data flows into Axonius
- Adapter Settings - Customization
- Data Analysis Tips
Data flows into Axonius
Axonius leverages your existing tools & infrastructure with what we call "Adapters" to “fetch” data and bring data into Axonius. Today, we have hundreds of Adapters that cover every aspect of IT. The data is aggregated and deduplicated resulting in high-quality data.
A full list of Axonius adapters can be found here
When viewing your Axonius dashboard, the device and user counts displayed in parentheses represent the raw device or user counts, while the number to the left represents the deduplicated count. Some environments have thousands of duplicate records. You could argue that viewing the data in Axonius gives a more accurate, relevant view than the data source itself.
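To make the difference between the raw and deduplicated counts concrete, here is a minimal Python sketch. It is not how Axonius correlates records internally; the adapter names, the field names and the choice of a normalized MAC address as the only matching key are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical raw records as several adapters might report them.
# The adapter names and field names are illustrative only.
raw_records = [
    {"adapter": "active_directory", "hostname": "LAPTOP-01", "mac": "AA:BB:CC:00:00:01"},
    {"adapter": "edr_tool", "hostname": "laptop-01", "mac": "aa:bb:cc:00:00:01"},
    {"adapter": "cmdb", "hostname": "laptop-01", "mac": "AA:BB:CC:00:00:01"},
    {"adapter": "edr_tool", "hostname": "server-07", "mac": "AA:BB:CC:00:00:02"},
]


def correlation_key(record):
    """Normalize the MAC address so the same device matches across adapters."""
    return record["mac"].lower()


def deduplicate(records):
    """Group raw records by correlation key; each group stands for one device."""
    devices = defaultdict(list)
    for record in records:
        devices[correlation_key(record)].append(record)
    return devices


devices = deduplicate(raw_records)
raw_count = len(raw_records)
unique_count = len(devices)

# Same layout as the dashboard: deduplicated count, raw count in parentheses.
print(f"{unique_count} ({raw_count})")  # prints "2 (4)"
```

Four raw records collapse into two devices, which is why the number in parentheses can be much larger than the number to its left.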
Adapter Settings - Customization
Each Adapter has common settings as well as Adapter-specific settings to be aware of. In terms of data quality and comparing data between a source and Axonius, these settings are important to understand.
During an initial Adapter configuration, there are basic connectivity settings. Some Adapters have additional options to add or remove certain data from this initial configuration window. For example, ServiceNow includes options to fetch only devices or users updated in ServiceNow in the last X hours. Optionally, you can leave this blank to fetch all devices or users. Review your Adapter configurations to see what options are available to you.
Note: during initial Adapter configuration, make sure to review permissions. It's possible for an Adapter to make a successful connection, but not have access to all the data you want. You can have partial access with missing data fields.
The "Adapter Configuration" tab is present for all Adapters and covers basic configuration options. Two settings have significant impact on entities (Users or Devices) brought into Axonius:
1. “Ignore devices that have not been seen by the source in the last X hours”
- This option allows you to ignore stale data from the source. Depending on the data source, this setting can and should be changed to ensure you have relevant data in Axonius (for example, the equivalent of 90, 180 or 365 days, entered in hours).
2. “Delete devices that have not been returned from the source in the last X hours”
- This setting should be at least 2x longer than the Fetch setting for your adapter. If fetches are running every 24 hours, this setting should be 48 hours or longer. If an Adapter fails to run for longer than this setting, devices will be deleted from Axonius.
In short, these settings allow you to automatically trim irrelevant devices and users from Axonius, giving you quality, relevant data. For more details on these settings, check out this article.
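The two thresholds above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not Axonius code: the function names, the sample timestamps and the 90-day value are assumptions.

```python
from datetime import datetime, timedelta


def is_ignored(last_seen, now, ignore_after_hours):
    """True if the source last saw the device longer ago than the threshold."""
    return now - last_seen > timedelta(hours=ignore_after_hours)


def delete_setting_is_safe(fetch_interval_hours, delete_after_hours):
    """Rule of thumb from the text: delete threshold >= 2x the fetch interval."""
    return delete_after_hours >= 2 * fetch_interval_hours


now = datetime(2024, 1, 31, 12, 0)
ignore_after_hours = 90 * 24  # 90 days expressed in hours (assumed value)

recent = datetime(2024, 1, 30, 9, 0)  # seen yesterday
stale = datetime(2023, 9, 1, 9, 0)    # seen about five months ago

print(is_ignored(recent, now, ignore_after_hours))  # False
print(is_ignored(stale, now, ignore_after_hours))   # True
print(delete_setting_is_safe(24, 48))  # True: 48 hours covers two daily fetches
print(delete_setting_is_safe(24, 24))  # False: one missed fetch could delete devices
```

Expressing day-based thresholds as `days * 24` helps avoid mistakes when a setting expects hours.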
Note: to view how many devices or users are being cleaned up, go to logs and search for "removed":
Most Adapters have Adapter-specific “Advanced Settings”. We will look at ServiceNow in this example, as ServiceNow may have the most configurable options of any Adapter. Every option adds the ability to customize how data flows into Axonius. Maybe you only want to Fetch “Active” devices from ServiceNow: enter the appropriate number in the setting “Install status number to include list”. There is also an exclude list option, which is useful if you want to leave out disposed or decommissioned devices. I recommend reviewing all your Advanced Adapter settings to ensure you bring in the data you want and leave out the data you don’t. If you aren’t sure what an option will do, reach out to your TAM, open a support ticket by emailing support@axonius.com, or simply make a post here in the Axonius Community!
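Conceptually, an include list and an exclude list behave like the filter sketched below. This is a hedged illustration in Python, not the Adapter's implementation: the function name, the sample records and the status numbers are hypothetical, and how the real Adapter combines the two lists is an assumption. Install status numbers differ between ServiceNow instances, so check your own instance for the values that apply.

```python
def should_fetch(install_status, include=None, exclude=None):
    """Decide whether a record passes the include and exclude lists.

    An empty or missing include list is treated as "include everything".
    """
    if include and install_status not in include:
        return False
    if exclude and install_status in exclude:
        return False
    return True


# Hypothetical records; the status numbers are placeholders.
records = [
    {"name": "laptop-01", "install_status": 1},
    {"name": "laptop-02", "install_status": 6},
    {"name": "laptop-03", "install_status": 7},
]

# Include list only: keep records whose status is 1.
only_status_1 = [r["name"] for r in records
                 if should_fetch(r["install_status"], include={1})]

# Exclude list only: keep everything except status 7.
without_status_7 = [r["name"] for r in records
                    if should_fetch(r["install_status"], exclude={7})]

print(only_status_1)     # ['laptop-01']
print(without_status_7)  # ['laptop-01', 'laptop-02']
```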
Finally, the Discovery Configuration tab can add a custom schedule to any Adapter. If a data source is highly ephemeral (i.e., its data changes frequently), it makes sense to Fetch more frequently. Consider the current Fetch time for an Adapter when setting a custom discovery schedule: if a Fetch takes 4 hours to complete, the custom discovery interval should be longer than 4 hours. Always set your custom discovery interval longer than the Fetch time.
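As a quick sanity check, the scheduling rule can be written as a small Python sketch. The adapter names, fetch durations and intervals below are made up for illustration.

```python
# Hypothetical adapters:
# (name, typical fetch duration in hours, custom discovery interval in hours)
schedules = [
    ("cloud_inventory", 0.5, 2),
    ("cmdb", 4, 4),
    ("edr_tool", 1, 6),
]


def too_tight(schedules):
    """Return adapters whose discovery interval does not exceed their fetch duration."""
    return [name for name, fetch_hours, interval_hours in schedules
            if interval_hours <= fetch_hours]


print(too_tight(schedules))  # ['cmdb']
```

Here the hypothetical "cmdb" entry is flagged because a 4-hour interval leaves no headroom for a Fetch that takes 4 hours.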
Data Analysis Tips
1. If your data source has bad data, it's still bad data in Axonius.
While you have options to specify data coming into Axonius, there are limitations, and bad data points can make their way into Axonius. When you identify bad data, you have options to ignore specific fields in the Query Wizard. When you select Aggregated data, you have an option to ignore Adapters you don’t want to include in your aggregated query results. As shown in this screenshot, deselect the checkboxes for Adapters you don’t want to include.
2. Field Segmentation charts help you understand possible field values and gaps.
When you find a valuable field in Axonius, don’t assume the field is populated everywhere. Remember, each field is tied to one or more Adapters. If an Adapter only has coverage for certain segments of your business, that will be a limitation. A quick and easy way to evaluate a field for all possible values and potential gaps is to use a Field Segmentation chart in your dashboards. Go to My Dashboard (private to you), select the blue “+” symbol and create a chart as shown in the example below. You can select an Adapter-specific field, as I have, or an aggregated field. Select the checkbox for “include entities with no value” to represent the gap in this field across your environment.
In this example, the ServiceNow “Install Status” field has potential values of “In use”, “In stock” or “1”. “No Value” is shown because we selected that option in the configuration to understand our gaps.
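What a Field Segmentation chart computes can be approximated in Python with a counter over field values. This is a minimal sketch under assumed data: the device records and the `install_status` key are hypothetical, and the `include_no_value` flag mimics the “include entities with no value” checkbox.

```python
from collections import Counter

# Hypothetical devices; some have no value for the field at all.
devices = [
    {"name": "laptop-01", "install_status": "In use"},
    {"name": "laptop-02", "install_status": "In stock"},
    {"name": "laptop-03", "install_status": "1"},
    {"name": "laptop-04"},
    {"name": "laptop-05", "install_status": "In use"},
    {"name": "laptop-06", "install_status": None},
]


def segment_by_field(entities, field, include_no_value=True):
    """Count entities per field value, optionally bucketing gaps as "No Value"."""
    counts = Counter()
    for entity in entities:
        value = entity.get(field)
        if value is None or value == "":
            if include_no_value:
                counts["No Value"] += 1
        else:
            counts[value] += 1
    return counts


segments = segment_by_field(devices, "install_status")
for value, count in segments.most_common():
    print(f"{value}: {count}")
```

A raw value such as "1" appearing next to readable labels is itself a useful signal: it suggests the source holds inconsistent values for that field.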
3. “NOT” shows you the flip side of the coin.
Adding a “NOT” operator to your query will show you the opposite results. Stale data is typically considered less valuable, so in this example we start by querying all devices seen within the last 30 days. We get 1204 devices.
Then, we add a “NOT” operator to show stale assets not seen in the last 30 days.
A less obvious option is to use a NOT operator along with a Saved Query you developed.
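The idea behind “NOT” can be sketched in Python as negating a predicate. This is only a model of the logic, not the Axonius query language: the device records, dates and helper names are assumptions.

```python
from datetime import datetime, timedelta

now = datetime(2024, 1, 31)

# Hypothetical devices with a last-seen timestamp.
devices = [
    {"name": "laptop-01", "last_seen": datetime(2024, 1, 30)},
    {"name": "laptop-02", "last_seen": datetime(2023, 11, 2)},
    {"name": "server-07", "last_seen": datetime(2024, 1, 5)},
    {"name": "printer-03", "last_seen": datetime(2023, 6, 15)},
]


def seen_within(days):
    """Build a query predicate: was the device seen in the last `days` days?"""
    cutoff = now - timedelta(days=days)
    return lambda device: device["last_seen"] >= cutoff


def negate(predicate):
    """Wrap any predicate (for example, a saved query) in a NOT."""
    return lambda device: not predicate(device)


recent_query = seen_within(30)
recent = [d["name"] for d in devices if recent_query(d)]
stale = [d["name"] for d in devices if negate(recent_query)(d)]

print(recent)  # ['laptop-01', 'server-07']
print(stale)   # ['laptop-02', 'printer-03']
```

Because every device falls into exactly one of the two lists, the two counts always add up to the total, which makes a query plus its NOT a handy way to confirm nothing was missed.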
Conclusion
Understand what options you have and tune your Adapters to specify the data you want to bring into Axonius. With a little work, you can develop the highest-quality data source in your environment.