01-10-2018 09:30 AM - edited 08-03-2021 11:30 AM
This article is part of the Tool Mastery Series, a compilation of Knowledge Base contributions to introduce diverse working examples for Designer Tools. Here we’ll delve into uses of the Field Summary Tool on our way to mastering the Alteryx Designer:
The Field Summary Tool analyzes data and creates a summary report containing descriptive statistics of data in selected columns. It’s a great tool to use when you want to make sure your data is structured correctly before running any further analysis, most notably with the suite of models that can be generated with the Predictive Tools. Think of the Field Summary as a Browse Tool, but on steroids. Not only does it give you a summary of your fields, but it also gives you recommendations, based on each field’s data type, on how to fix your data so it can be used for further analysis.
The configuration for the Tool should be easy enough: just select the fields you want summaries for and choose whether you would like to sample your data:
Now let’s go over the various summaries we get, depending on the field’s data type. Summaries will be provided for Numeric, String, Spatial, and Date/Time fields. Depending on the type, a different set of statistics will be provided, as shown below (borrowed from the Tool's help document):
These statistics are provided via three outlets: the O anchor gives us a data stream, the R anchor gives us a static report that can be viewed via the Browse Tool, and the I anchor gives us an interactive dashboard that can also be viewed via the Browse Tool.
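To make the idea of type-dependent summaries concrete, here is a minimal Python sketch that produces a different set of statistics for numeric and string columns. It is only an illustration and not how the Field Summary Tool is implemented; the function name `summarize_field`, the sample `table`, and the particular statistics chosen are all assumptions made for this example. The Tool's help document remains the reference for what the Tool actually reports.

```python
import statistics


def summarize_field(values):
    """Summarize one column; None is treated as a missing value.

    Hypothetical helper for illustration only, not part of Alteryx.
    """
    present = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "unique": len(set(present)),
    }
    is_numeric = present and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    is_string = present and all(isinstance(v, str) for v in present)
    if is_numeric:
        # Numeric fields get range and central-tendency statistics.
        summary.update(
            type="numeric",
            min=min(present),
            max=max(present),
            mean=statistics.mean(present),
            median=statistics.median(present),
        )
    elif is_string:
        # String fields get length-based statistics instead.
        lengths = [len(v) for v in present]
        summary.update(type="string", shortest=min(lengths), longest=max(lengths))
    else:
        summary["type"] = "other"
    return summary


# Made-up sample data: one numeric column and one string column.
table = {
    "age": [34, 51, None, 28],
    "city": ["Denver", "Irvine", "Denver", None],
}
for name, column in table.items():
    print(name, summarize_field(column))
```

The point of the sketch is simply that one pass over a column can report missing and unique counts for every type, while the remaining statistics depend on what kind of data the column holds.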
Usually we’ll use the Field Summary Tool if we’re unfamiliar or unsure about the data we’re using, before we plug it into a Predictive model. Understanding your data before doing any predictive analysis can save many headaches, whether from errors that arise when running your model or from trying to work out why your output is less than desirable.
Many of the suggestions provided by the Field Summary Tool can help guide you to either categorize your data differently, or even to supplement your data with less biased data. For example, for String data types, a common remark is that some values in a certain field have a small number of value counts. This can greatly affect a model, as it can bias the estimate for that variable, so the model doesn’t accurately capture the true effect of that variable.
We can agree that the plethora of predictive tools available is great, but it’s not much help if we feed it bad data! The Field Summary Tool will be there when you’re confused or heartbroken, lending you a helping hand in trying to understand what went wrong (or you can avoid the heartache by heeding its advice before casting your heart into those predictive tools!).
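The check behind the "small number of value counts" remark can be sketched in a few lines of Python. This is a hedged illustration, not the Field Summary Tool's own logic: the function name `rare_values`, the 5% threshold, and the sample `region` column are assumptions chosen for the example.

```python
from collections import Counter


def rare_values(values, min_share=0.05):
    """Return {value: count} for values whose share of non-missing rows
    is below min_share. Hypothetical helper; the threshold is an assumption.
    """
    present = [v for v in values if v is not None and v != ""]
    total = len(present)
    if total == 0:
        return {}
    counts = Counter(present)
    return {v: c for v, c in counts.items() if c / total < min_share}


# Made-up categorical column: "East" appears in only 2 of 100 rows.
region = ["North"] * 48 + ["South"] * 50 + ["East"] * 2
print(rare_values(region))  # prints {'East': 2}
```

Values flagged this way are candidates for grouping into a broader category or for collecting more data, in line with the suggestions described above.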
By now, you should have expert-level proficiency with the Field Summary Tool! If you can think of a use case we left out, feel free to use the comments section below! Consider yourself a Tool Master already? Let us know at community@alteryx.com if you’d like your creative tool uses to be featured in the Tool Mastery Series.
Stay tuned with our latest posts every #ToolTuesday by following @alteryx on Twitter! If you want to master all the Designer tools, consider subscribing for email notifications.
Hi - I've been looking at this and see lots of potential for data auditing purposes. I have a question about the R anchor, which according to your description gives us a static report that can be viewed via the Browse Tool. The report is so easy to generate and would be very useful, but how can I get this report out to a format I can distribute/download/use? What am I missing here as a newbie? Thanks!
Hi @nwhite
You can use the Render Tool, under the Reporting Category to output reporting objects to certain file types.
https://help.alteryx.com/current/PortfolioComposerRender.htm
Cheers,
Mike
Nice - thanks!
Hi, thank you for posting all the needed details of the Field Summary Tool. I am aware of the concepts and principles of data science, but I am getting stuck at a point in my project. As part of the data exploration and cleaning process, I have identified the issue with my dataset using the Field Summary Tool. Since I am trying to automate the whole workflow in Alteryx, I was wondering how I can use the values of the columns returned via the Field Summary as input to the Select Tool to deselect columns which will not contribute to the model.
I can only view the output, report and interactive dashboard.
Please help.
@navneet_ramachandran You can use a combination of the Dynamic Rename and the Dynamic Select tools. Here's an example: https://gallery.alteryx.com/#!app/Dynamically%2Bselecting%2Bcolumns%2Bbased%2Bon%2Bthe%2Bdata/5dae32...
Thank you NeilR, this is exactly what I was looking for.
Can you also point me to a tutorial where I could learn ML specific content that gives a large gamut of examples of ML modelling like all supervised, unsupervised and reinforcement learning?
@navneet_ramachandran check out our Tool Mastery series for information about specific tools like Boosted Model for supervised learning and K-Centroids for unsupervised, among others. Also check out our Data Science blog, including this Assisted Modeling article.