Utilizing Pandas to Explode Mutiple Fields

Pandas is one of the best open-source standard python libraries to extract, transform, and load data. Pandas is the leader in Data Science and has a bunch of useful under hood tools to work with Comma Separated Variable (CSV) files. In this post we will cover some of the basic functionality in pandas and one of its most used case scenarios with Axonius. The following Topics will be covered.

  • Prerequisites
  • Creating a DataFrame from Axonius Data
  • Exploding multiple fields with Pandas
  • Exporting DataFrame to CSV file

Prerequisites:

  • Python
  • Axonius API Client
  • Pandas
  • Knowing the Field Name(s) you wish to explode
  • If using a saved query, you must know the name of your saved query

Python and Axonius API Client

      Below you can find references for how-to guides in our community.

  1. Axonshell getting started
  2. Axonius API Python library

Pandas

    you can follow the official getting started from the pandas documentation here

    or you can install via Pip

pip install pandas

    after Pandas has been installed (or you fired up a Jupyter NoteBook following the Pandas Getting Started Guide) lets import pandas

import pandas as pd

Creating a DataFrame from Axonius Data

    Now that we have pandas installed, lets first grab some data from Axonius

Axonshell:

axonshell devices get-by-saved-query --name PandasPost

Python:

import axonius_api_client as axonapi

# create an empty dictionary object for our environment Variables
client_args = {}

# update our empty dictionary with values from local .env file
client_args.update(axonapi.get_env_connect())

# pass our environment variables to the Connection Object
client = axonapi.Connect(**client_args)

# initialize the client
client.start()

# add our query name
# my query is called 'PandasPost'
queryName = 'PandasPost'

# initialize a devices object
devices = client.devices

# get our devices from a saved query
savedQueryResults = devices.get_by_saved_query(name=queryName)

    Now that we have the data stored as a python object we can create a Pandas DataFrame.

# create a Dataframe of assets from the results of the saved query
dataFrame = pd.DataFrame.from_records(savedQueryResults)

    Below is a preview of the first 3 rows of our dataframe.

    We can see the fields specific_data.data.hostname, specific_data.data.network_interfaces.ips and specific_data.data.network_interfaces.mac have more then one entry, lets try to explode this out to thier own rows.

|    | internal_axon_id                 | specific_data.data.hostname                                                               | specific_data.data.network_interfaces.ips                                             | specific_data.data.network_interfaces.mac                       |
|---:|:---------------------------------|:------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|:----------------------------------------------------------------|
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | ['22AD.TestDomain.test', 'ec2-18-188-87-167.us-east-2.compute.amazonaws.com', 'HVM domU'] | ['10.0.227.26', '18.188.87.167', 'fe80::c123:a46c:c80f:cda2']                         | ['06:75:D9:93:EE:68']                                           |
|  1 | fe6f16a535c8c2a8df03ff26a8b0010d | ['WIN-D14VSGS3C0G.TestDomain.test', 'ip-10-0-2-147.us-east-2.compute.internal']           | ['10.0.2.147']                                                                        | ['06:B9:C8:89:0D:00', '00:FF:35:E6:CB:94', '02:10:7B:0F:90:01'] |
|  2 | b091dde75f56b506be55db43cf326330 | ['WIN-76F9735PMOJ.TestDomain.test', 'ip-10-0-2-120.us-east-2.compute.internal']           | ['10.0.2.120', 'fe80::e4e3:8ffc:637a:59a7']                                           | ['06:37:53:6E:A2:9C']      

Exploding multiple fields with Pandas

     Lets first explode on specific_data.data.hostname

explodedDF = dataFrame.explode('specific_data.data.hostname')

     We now see a row for each hostname the device had.

(We can compare the column internal_axon_id to the preview above to see this is the same device)

|    | internal_axon_id                 | specific_data.data.hostname                       | specific_data.data.network_interfaces.ips                                             | specific_data.data.network_interfaces.mac                       |
|---:|:---------------------------------|:--------------------------------------------------|:--------------------------------------------------------------------------------------|:----------------------------------------------------------------|
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | 22AD.TestDomain.test                              | ['10.0.227.26', '18.188.87.167', 'fe80::c123:a46c:c80f:cda2']                         | ['06:75:D9:93:EE:68']                                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | ec2-18-188-87-167.us-east-2.compute.amazonaws.com | ['10.0.227.26', '18.188.87.167', 'fe80::c123:a46c:c80f:cda2']                         | ['06:75:D9:93:EE:68']                                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | HVM domU                                          | ['10.0.227.26', '18.188.87.167', 'fe80::c123:a46c:c80f:cda2']                         | ['06:75:D9:93:EE:68']    

     using the same data, lets now explode on specific_data.data.network_interfaces.ips

explodedDF = explodedDF.explode('specific_data.data.network_interfaces.ips')

    we now have a row for each network interface the device had

 |    | internal_axon_id                 | specific_data.data.hostname                       | specific_data.data.network_interfaces.ips   | specific_data.data.network_interfaces.mac                       |
|---:|:---------------------------------|:--------------------------------------------------|:--------------------------------------------|:----------------------------------------------------------------|
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | 22AD.TestDomain.test                              | 10.0.227.26                                 | ['06:75:D9:93:EE:68']                                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | ec2-18-188-87-167.us-east-2.compute.amazonaws.com | 10.0.227.26                                 | ['06:75:D9:93:EE:68']                                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | HVM domU                                          | 10.0.227.26                                 | ['06:75:D9:93:EE:68']                                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | 22AD.TestDomain.test                              | 18.188.87.167                               | ['06:75:D9:93:EE:68']                                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | ec2-18-188-87-167.us-east-2.compute.amazonaws.com | 18.188.87.167                               | ['06:75:D9:93:EE:68']                                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | HVM domU                                          | 18.188.87.167                               | ['06:75:D9:93:EE:68']                                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | 22AD.TestDomain.test                              | fe80::c123:a46c:c80f:cda2                   | ['06:75:D9:93:EE:68']                                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | ec2-18-188-87-167.us-east-2.compute.amazonaws.com | fe80::c123:a46c:c80f:cda2                   | ['06:75:D9:93:EE:68']                                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | HVM domU                                          | fe80::c123:a46c:c80f:cda2                   | ['06:75:D9:93:EE:68']                                           |

    lastly lets explode on specific_data.data.network_interfaces.mac

explodedDF = explodedDF.explode('specific_data.data.network_interfaces.mac')

    we can now see that the mac address field has changed from a list type to a string type (no more brackets)

|    | internal_axon_id                 | specific_data.data.hostname                       | specific_data.data.network_interfaces.ips   | specific_data.data.network_interfaces.mac   |
|---:|:---------------------------------|:--------------------------------------------------|:--------------------------------------------|:--------------------------------------------|
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | 22AD.TestDomain.test                              | 10.0.227.26                                 | 06:75:D9:93:EE:68                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | ec2-18-188-87-167.us-east-2.compute.amazonaws.com | 10.0.227.26                                 | 06:75:D9:93:EE:68                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | HVM domU                                          | 10.0.227.26                                 | 06:75:D9:93:EE:68                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | 22AD.TestDomain.test                              | 18.188.87.167                               | 06:75:D9:93:EE:68                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | ec2-18-188-87-167.us-east-2.compute.amazonaws.com | 18.188.87.167                               | 06:75:D9:93:EE:68                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | HVM domU                                          | 18.188.87.167                               | 06:75:D9:93:EE:68                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | 22AD.TestDomain.test                              | fe80::c123:a46c:c80f:cda2                   | 06:75:D9:93:EE:68                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | ec2-18-188-87-167.us-east-2.compute.amazonaws.com | fe80::c123:a46c:c80f:cda2                   | 06:75:D9:93:EE:68                           |
|  0 | a5f257e7f0902fc99082fc73dbb529c9 | HVM domU                                          | fe80::c123:a46c:c80f:cda2                   | 06:75:D9:93:EE:68                           |

lets export this dataFrame to a csv file for easier viewing

explodedDF.to_csv('PathToCsv.csv', index=False)

 

There you have it, an easy way to explode out mutiple fields using pandas.

Please feel free to reach out to me directly if there are any questions around pandas and utilizing the Axonius API

2

Comments

0 comments

Please sign in to leave a comment.

Didn't find what you were looking for?

New post