General - Splunk Macros Advanced
In my last community post, we reviewed the basic usage and best practices for Splunk macros. In today's post, we'll review how advanced Splunk configurations can be used to optimize the performance of the integration. To obtain this performance gain, we will use the tstats command to query the time-series index (tsidx) files created by data model acceleration. This method of searching is important when working with large indexes and with log sources that undergo a significant amount of search-time processing.
Indexed vs Search-Time
Splunk fields are derived in one of two ways: indexed field extractions or search-time field extractions. By default, Splunk indexes a number of fields, notably _time, index, host, source, and sourcetype, which are commonly used in searches. Indexed field extractions are performed when the event is written to its index, before any search runs against the data. In comparison, search-time field extractions are performed each time a search is executed, which increases search time. In our first search from the last community post, we leveraged a few of these default indexed fields to keep our searches performant.
index="rapid7" sourcetype="rapid7:nexpose:asset"
| eval time=strftime(_time,"%Y-%m-%d %H:%M:%S")
| stats latest(time) as lastseen by host
By using the tstats command, we can restrict our search to indexed fields only and avoid all search-time field extractions, which significantly reduces the time it takes to search. The search below transforms our Rapid7 example from the first community article into a tstats search. As seen in the screenshots below, we observed a significant performance gain, with search time dropping from 37.674 seconds to 0.104 seconds.
| tstats latest(_time) as time where index="rapid7" sourcetype="rapid7:nexpose:asset" by host
| eval last_seen=strftime(time,"%Y-%m-%d %H:%M:%S")
| fields - time
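To make the indexed versus search-time distinction concrete, here is a toy model in Python. It is only a sketch and not how Splunk actually stores or reads its index files. The event format and the helper names (build_index, latest_by_host_indexed, latest_by_host_search_time) are hypothetical. The search-time path parses every raw event when the query runs, while the indexed path only reads values that were extracted once, when the event was written.

```python
import re
from collections import defaultdict

# Hypothetical raw events as (epoch_time, raw_text); the format is an assumption.
RAW_EVENTS = [
    (1700000000, "host=web01 action=scan"),
    (1700000600, "host=db01 action=scan"),
    (1700001200, "host=web01 action=scan"),
]

HOST_RE = re.compile(r"host=(\S+)")


def latest_by_host_search_time(events):
    """Search-time style: parse every raw event when the query runs."""
    latest = {}
    for ts, raw in events:
        match = HOST_RE.search(raw)
        if match is None:
            continue
        host = match.group(1)
        latest[host] = max(ts, latest.get(host, ts))
    return latest


def build_index(events):
    """Index-time style: extract the host once, when the event is written."""
    index = defaultdict(list)
    for ts, raw in events:
        match = HOST_RE.search(raw)
        if match is not None:
            index[match.group(1)].append(ts)
    return index


def latest_by_host_indexed(index):
    """Answer the query from the prebuilt index; no raw text is parsed here."""
    return {host: max(times) for host, times in index.items()}
```

Both functions return the same answer for the sample events. The difference is where the parsing cost is paid: once at write time for the indexed path, or on every query for the search-time path.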
Up to this point the performance gain is exceptional, but only as long as we search and report against the default indexed fields, and that restriction is too limiting for a number of the data sources we want to fetch from Splunk. As such, we will leverage accelerated data models to index additional fields that are not indexed by default.
To do this, we will create a new data model by navigating to Settings > Data Models and selecting New Data Model. Once we have given it a Title, we will select Create and then use Add Dataset > Root Search to define our Base Search, index="rapid7" sourcetype="rapid7:nexpose:asset", from which the data model is built. Once configured, we can add any number of extracted fields, extract new fields with regular expressions, and perform eval commands, such as our time formatting | eval time=strftime(_time,"%Y-%m-%d %H:%M:%S"), which would otherwise be performed at search time. Once the fields are defined, we will accelerate the data model and begin writing tstats searches to query the data.
| tstats latest(time) from datamodel=r7_dm by host
Searches that begin with a generating command, such as tstats, require a leading | character, and when such a search is wrapped in a macro, this leading pipe must exist outside of the macro.
As a result, we will exclude it from our macro definition and instead add it within the Splunk adapter configuration under Advanced Settings > Splunk Configuration > Splunk search macros list.
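The leading-pipe rule can be sketched in a few lines of Python. The macro name r7_last_seen and the helpers build_search and expand are hypothetical, and how the adapter actually assembles its search is an assumption here. The sketch only shows that the macro body starts with tstats while the pipe is supplied by the caller, outside the backticks that invoke the macro.

```python
# Hypothetical macro body: note that it does NOT begin with a pipe.
MACRO_DEFINITION = "tstats latest(time) from datamodel=r7_dm by host"


def build_search(macro_name: str) -> str:
    """Return a search string that calls a macro wrapping a generating command.

    Splunk macros are invoked with backticks; the leading pipe sits outside.
    """
    name = macro_name.strip().strip("`")
    if not name:
        raise ValueError("macro name must not be empty")
    return f"| `{name}`"


def expand(search: str, macros: dict) -> str:
    """Naively substitute macro bodies to show the effective search string."""
    for name, body in macros.items():
        search = search.replace(f"`{name}`", body)
    return search
```

Calling build_search("r7_last_seen") yields the string to enter in the macros list under these assumptions, and expand shows that the effective search begins with the pipe followed by tstats.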
If you enjoyed this community post, please feel free to leave a comment and contact your TAM or Customer Support to assist with configuring this adapter within your environment.