Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.

We're all under Stay-At-Home orders, so what better time for the next installment of the Will it Alteryx? series? In this edition, we will be looking at Splunk. We'll start with what Splunk is, and then see if Splunk will Alteryx. (Spoiler - yes it will).


What is Splunk?


Although Splunk has expanded to many software products, in this blog we will focus on Splunk's core product offering, which is a technology for collecting and analyzing machine-generated data. Data can be ingested into Splunk directly from IoT devices, applications, logs, performance monitoring tools, and more. This typically takes place through "Forwarders", which collect the data and forward it to "Indexers". Indexers receive the data and store it in the back end, where it can be searched and analyzed.


Splunk provides a very powerful way for IT, DevOps, and Administrators to visualize data from their infrastructure and source systems. Once these connections are set up, metrics, logs, and messages are automatically read into Splunk in real time, where reports can be scheduled or run on demand to gain insight into system behavior. In addition to reports, search results can be saved as Dashboards and Alerts.


You can even configure Splunk to store data sets such as CSV files and then append additional rows as new records are read from the source system. In the example below, you can see I imported the FuzzyData2.csv Alteryx sample data into Splunk. 


A list of data sources in Splunk.


Splunk also provides a slick visual interface for monitoring the environment, working with saved reports and dashboards, configuring new searches, and so on.


Splunk's Monitoring Console provides great insight into the system.


Will it Alteryx?


With some background in Splunk, we can now turn to the question on everyone's mind: can Alteryx integrate with this technology, and if so, how? There are a number of ways that Alteryx can be used to analyze Splunk data. Here are some examples:


1. Splunk's REST APIs and the Alteryx Download Tool. Splunk offers REST API access to various Splunk resources, including data inputs, data outputs, searches, and alerts. These APIs can be accessed via the Alteryx Download Tool to pull data into an Alteryx workflow. 


2. Splunk's Python library and the Alteryx Python Tool. Splunk offers a Python SDK with libraries to access Splunk resources using Python code. You can work with data, saved searches, new searches, and more. These libraries can be called from Python code in the Alteryx Python Tool and then integrated with other Alteryx workflow building blocks.


3. Splunk's ODBC Driver and the Alteryx Input Tool. Splunk provides an ODBC driver which allows us to read data into Alteryx using the standard Input Tool with a generic ODBC connection. Let's dive into this scenario a bit more...
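Before getting into the ODBC walkthrough, here is a rough idea of what option 1 amounts to. The sketch below is plain Python rather than a Download Tool configuration, and it assumes Splunk's search export endpoint (/services/search/jobs/export) on the default management port 8089 with basic authentication. The host name, credentials, and search string are made up for illustration, so treat it as a starting point rather than a tested integration.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_export_request(base_url, search, username, password):
    """Build a POST request for Splunk's search export endpoint."""
    # The form body carries the search string and asks for JSON output.
    body = urllib.parse.urlencode(
        {"search": search, "output_mode": "json"}
    ).encode("utf-8")
    # Basic authentication header built from the (hypothetical) credentials.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/search/jobs/export",
        data=body,
        headers={"Authorization": "Basic " + token},
        method="POST",
    )


def parse_export_lines(lines):
    """Collect the "result" objects from line-delimited JSON output."""
    rows = []
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if not line:
            continue  # skip blank lines in the stream
        event = json.loads(line)
        if "result" in event:  # some lines carry only status metadata
            rows.append(event["result"])
    return rows


# Example usage (hypothetical host and credentials; needs a reachable Splunk server):
# request = build_export_request(
#     "https://splunk.example.com:8089",
#     "search index=_internal | head 5",  # REST searches start with the "search" command
#     "analyst",
#     "change-me",
# )
# with urllib.request.urlopen(request) as response:
#     for row in parse_export_lines(response):
#         print(row)
```

The same three pieces (URL, form payload, and basic authentication) are what you would supply to the Download Tool, with a JSON Parse step downstream playing the role of parse_export_lines. With that aside, back to the ODBC route.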


First, configure the Splunk ODBC Driver to connect to your Splunk environment:




Then, in Alteryx, simply choose the ODBC Generic Connection and reference the ODBC DSN you created above.




If the connection succeeds, you'll see a list of "Tables" that can be imported into Alteryx Designer over the ODBC connection. These "Tables" are actually Saved Searches in Splunk. 
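Because the ODBC "Tables" map to Saved Searches, it can help to check which saved searches your login can actually see on the Splunk side. The following is a minimal sketch using the Splunk Python SDK from option 2 (the splunk-sdk package); the host, port, and credentials are placeholders, and whether this list matches exactly what the ODBC driver exposes is an assumption I have not verified.

```python
def connect_to_splunk(host, port, username, password):
    """Open a session with the Splunk management port via the Splunk Python SDK."""
    # Imported lazily so the helper below can be exercised without the SDK installed.
    import splunklib.client as client  # provided by the "splunk-sdk" package

    return client.connect(host=host, port=port, username=username, password=password)


def list_saved_searches(service):
    """Return (name, query) pairs for each saved search visible to this login."""
    return [(saved.name, saved["search"]) for saved in service.saved_searches]


# Example usage (hypothetical host and credentials):
# service = connect_to_splunk("splunk.example.com", 8089, "analyst", "change-me")
# for name, query in list_saved_searches(service):
#     print(name, "->", query)
```

Which searches come back depends on the account's permissions and on the app and owner context of the connection. Inside the Alteryx Python Tool, the resulting pairs could be loaded into a pandas DataFrame and passed to an output anchor with Alteryx.write.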




One thing to be aware of: if you use this method, you will likely see the following error when you run the workflow and try to read in data:


Error: Input Data (1): Error SQLExtendedFetch: [Splunk][SplunkODBC] (60) Unexpected response from server. Verify the server URL. Error parsing JSON: Text only contains white space(s)




This appears to be an issue with the Splunk ODBC Driver. The data is in fact read into Alteryx, but it is inaccessible to any tools downstream in the workflow. The trick here is to enable the "Cache Data" option on the Input tool, and then run the workflow again.




The next time the workflow runs, it will populate the cache, and you'll still see the error displayed. Any subsequent runs will then read from the cached data and succeed.




This works regardless of whether we are accessing file-based content in Splunk, or data from logs or performance metrics:




Final Thoughts


Splunk provides a great way to store, search, and monitor machine-generated data from sensors, logs, network traffic, and more. Being able to analyze that data to gain business insights makes it even more valuable. With Alteryx, data from Splunk can be combined with other sources, cleansed, and enriched to provide deeper insight.


If you have other thoughts on how to connect to Splunk with Alteryx, please leave a comment below!

David Hare
Senior Manager, Solutions Architecture

David has the privilege to lead the Alteryx Solutions Architecture team helping customers understand the Alteryx platform, how it integrates with their existing IT infrastructure and technology stack, and how Alteryx can provide high performance and advanced analytics. He's passionate about learning new technologies and recognizing how they can be leveraged to solve organizations' business problems.

7 - Meteor

I am excited to see this! Thanks for the post. 

12 - Quasar

@DavidHa ,


This is a great post! We had lots of questions related to how Alteryx could be used for cyber security during a session last year at Inspire. One of the platforms that was used in the use case was Splunk. We weren't able to share much of the use case from a workflow and implementation perspective, but this is definitely on track with a portion of the implementation.

7 - Meteor

I'm curious to know if anyone has been able to successfully connect and see saved searches based on their user credentials? I've been trying to pull in any saved searches but so far, no luck at all. It's weird because using the same credentials in Tableau, I pull back all saved searches regardless of creator. Thanks!

5 - Atom

I am having an issue while configuring the Splunk ODBC connection in Alteryx Designer. After all the steps, when I try to connect to any table (a saved search in Splunk) I receive the error "No columns Returned". Does anyone know what I am doing wrong?



5 - Atom

I have a similar issue to pervic26. Anyone have any input?

7 - Meteor

Sorry to just post this, but my team figured out how we can pull Splunk data once the connection is made.  


Once you successfully connect via the ODBC connection, run your queries and in the upper right hand corner, you should see an option that says "Create Table View". When you click that option, you should be guided through two additional screens. The first will allow you to select any fields that you may have missed in your initial query. Once you're satisfied that you've captured those fields, click the "Done" button in the bottom left corner of the screen (if you're confident you have all the fields you need, you don't have to wait for the preview to load, just click "Done").


When you move to the next screen, here is your final chance to preview the data.  You can review all of your fields, check stats of populations, etc.  On this screen, you will have to wait until the sample data has populated before you can proceed.  Once populated, click on the "Save" button in the upper right hand corner and give your table any name you choose (if you missed a field, there is a chance to start over by clicking the "Open in Search" link in the lower left corner, but that will start you from step one).


Once you save the table, you can then view the table or close the dialog box. In Alteryx, connect to Splunk and refresh your table view. What you should see (at least in our team's experience) are both reports (which have been spotty to work with in Alteryx) and tables (which are prefixed with "RootObject."). This should be your dataset to pull into your workflow that you can use in whatever automation you need. I hope this is helpful. I'll attach numbered screenshots in sequential order with the final output.


Step 1, Step 2, Step 3, Step 4, Step 5, Step 6, Step 7

Final output

7 - Meteor

Sorry, forgot to add the output in Alteryx. The count in Splunk shows 262K... the count in Alteryx matches that; I just removed the RootObject. prefix.


Final match