Engine Works

Under the hood of Alteryx: tips, tricks and how-tos.
ned_blog
8 - Asteroid

We got the following customer request:

We work with a ton of different input files in various formats, etc...and sometimes have poor documentation on what is in the fields, etc...so, something that would be very useful would be to have a function that could create a frequency report for every column/field in a file. I am able to do this now manually by using a lot of summarize tools and crosstabs, but if there was an 'out of the box' tool that would inspect each data field, provide some of the same info that Calgary loader does, such as number of distinct values, etc...that would be great.

 

I thought - that will be easy, I'll just build a wizard. I started in on the wizard and realized that it wasn't as easy as I thought. I wanted to go down a different path for different field types. Numeric one way, string the other, etc... Fortunately for me, I have access to the code. I wrote a DynamicSelect tool. This allows fields to be selected either by field type or via a formula.

 

 

With my new tool in hand, I thought I was all set to write the wizard. The next problem I ran into is that Alteryx really doesn't like having a stream with no fields. If I used this against a table with no numeric fields, it crashed. It makes sense that most tools can't deal with no fields, but I needed to have at least 1 tool that could handle no fields. The logical tool is the transpose tool, since that will always give me some fields out (but maybe no records) and then I can use that stream for summarize tools with the proper grouping. After fixing up the transpose tool so it didn't error having no fields on input I was good to go.

 

So I wrote a wizard that collects all kinds of statistics on the individual fields of a table and produces a PDF file. For all fields it gives some statistics for the number of unique and null values. For string fields it gives the top N (user settable) values. For numeric fields it gives Average, Median, etc... All and all it produces a pretty cool report - although it needs someone with more artistic sense than myself to make it look good. This wizard will be available as a sample in Alteryx 5.0.