09-24-2015 03:50 PM - edited 08-03-2021 09:40 AM
This article is part of the Tool Mastery Series, a compilation of Knowledge Base contributions to introduce diverse working examples for Designer Tools. Here we’ll delve into uses of the Cross Tab Tool on our way to mastering the Alteryx Designer:
Sometimes you look at the steaming pile of data before you and wonder how you’ll ever get it in the form you need. Every option seems to require a great deal of manual labor, and as a lazy... er, that is, as a data blending professional, that is simply something you will not abide.
In situations like these, you may want to consider shaking things up. There's no better tool for this than the Cross Tab, a powerful tool that lets you reshape your data any which way so you can approach your problem from a new angle. In this article, I will demonstrate a few use cases to showcase how you can leverage this awesome tool.
The data?
You receive a list that looks like Field_1 below. The PMID refers to an ID number for a medical journal article in a database. Each FAU is another author on the paper. There may be any number of authors for a paper.
The goal?
A table with ID numbers in the first field and columns corresponding to authors.
How?
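The reshape can also be sketched outside Designer. The Python below is an illustration only, not the workflow from the original article: it assumes each input line has the form "TAG - value", and the sample PMIDs and author names are made up. An author's position within its PMID group plays the role of a running count, and that count becomes the column name.

```python
def cross_tab_authors(lines):
    """Turn a flat PMID/FAU list into one row per PMID with Author_N columns."""
    authors_by_pmid = {}  # PMID -> authors, in order of appearance
    current = None
    for line in lines:
        tag, _, value = line.partition("-")  # split on the first hyphen only
        tag, value = tag.strip(), value.strip()
        if tag == "PMID":
            current = value
            authors_by_pmid.setdefault(current, [])
        elif tag == "FAU" and current is not None:
            authors_by_pmid[current].append(value)

    # The widest paper decides how many author columns are needed.
    width = max((len(a) for a in authors_by_pmid.values()), default=0)
    header = ["PMID"] + [f"Author_{n}" for n in range(1, width + 1)]
    table = [
        [pmid] + authors + [None] * (width - len(authors))  # pad short rows
        for pmid, authors in authors_by_pmid.items()
    ]
    return header, table


# Made-up sample in the assumed "TAG - value" layout.
sample = [
    "PMID- 1001",
    "FAU - Smith, Jane",
    "FAU - Lee, Min",
    "PMID- 1002",
    "FAU - Garcia, Ana",
]
header, table = cross_tab_authors(sample)
```

Papers with fewer authors than the widest one are padded with None, which mirrors the Null cells a cross tab leaves behind.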
The data?
A handful of numerical fields, shown below. They are grouped by a category field and you’ve added a unique RecordID field.
The goal?
Rolling averages for each column, within their respective category.
How?
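As a rough sketch of the calculation itself (not the Designer workflow), the Python below computes a trailing rolling average per field and restarts it for each category. The window size of 3, the field name "X", and the choice to average over fewer rows at the start of a category are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def rolling_averages(records, fields, window=3):
    """Trailing rolling mean per field, computed separately for each category."""
    # One fixed-length buffer per (category, field) pair.
    buffers = defaultdict(lambda: deque(maxlen=window))
    out = []
    for rec in sorted(records, key=lambda r: r["RecordID"]):
        row = {"RecordID": rec["RecordID"], "Category": rec["Category"]}
        for field in fields:
            buf = buffers[(rec["Category"], field)]
            buf.append(rec[field])  # the oldest value drops out automatically
            row[f"{field}_avg"] = sum(buf) / len(buf)
        out.append(row)
    return out


# Hypothetical records: one numeric field "X" and two categories.
records = [
    {"RecordID": 1, "Category": "A", "X": 2.0},
    {"RecordID": 2, "Category": "A", "X": 4.0},
    {"RecordID": 3, "Category": "B", "X": 10.0},
    {"RecordID": 4, "Category": "A", "X": 6.0},
    {"RecordID": 5, "Category": "A", "X": 8.0},
]
result = rolling_averages(records, ["X"])
```

Keying the buffers on both category and field is what keeps one category's values from leaking into another's average.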
The data?
You have a list of all possible combinations of 5 items. Each combination occupies one row per item it contains, and each row lists that item's weight and value. For example, Combination 123 will be represented three times, with information for item1, item2, and item3.
The goal?
You wish to optimize your selection of items to meet certain criteria, such as minimum weight and maximum value.
How?
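The underlying idea can be sketched in Python: collapse the item rows into one total per combination, then filter and rank the totals. This is only an illustration. The item weights and values are invented, and the selection rule used here (stay under a weight cap, then take the highest value) is one assumed reading of "certain criteria".

```python
from itertools import combinations

# Hypothetical items: item number -> (weight, value).
ITEMS = {1: (4, 10), 2: (3, 7), 3: (5, 12), 4: (2, 3), 5: (6, 14)}


def long_rows(items):
    """Yield one row per (combination, item), like the input described above."""
    for size in range(1, len(items) + 1):
        for combo in combinations(sorted(items), size):
            name = "".join(str(i) for i in combo)
            for i in combo:
                weight, value = items[i]
                yield {"Combination": name, "Item": i, "Weight": weight, "Value": value}


def summarize(rows):
    """Collapse the item rows into total weight and value per combination."""
    totals = {}
    for row in rows:
        total = totals.setdefault(row["Combination"], {"Weight": 0, "Value": 0})
        total["Weight"] += row["Weight"]
        total["Value"] += row["Value"]
    return totals


def best_combination(totals, max_weight):
    """Highest total value within the weight cap; ties go to the lighter option."""
    feasible = {k: t for k, t in totals.items() if t["Weight"] <= max_weight}
    if not feasible:
        return None
    return max(feasible, key=lambda k: (feasible[k]["Value"], -feasible[k]["Weight"]))


totals = summarize(long_rows(ITEMS))
best = best_combination(totals, max_weight=10)
```

With these made-up numbers, the 31 combinations reduce to one row each, and the best choice under a weight cap of 10 is combination "15".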
A downside of the Cross Tab tool is that it doesn't play nice with special characters in field headers, including spaces. This means that if you have a field header "a a", it will actually come out as "a_a". I know this can be a bit inconvenient, but when we were developing the Alteryx engine we prioritized speed and efficiency over keeping the field headers looking nice. Don't worry though - there's a perfectly doable solution to this problem (more than one actually!) that makes use of an awesome tool called Dynamic Rename. This is the way I usually like to go about it:
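To illustrate the idea behind the rename step, here is a small Python sketch. It is not the Dynamic Rename configuration itself. It assumes the clean-up amounts to replacing every character other than letters, digits, and underscores with "_", which only approximates the behavior described above. `sanitize` and `restore_headers` are hypothetical helper names.

```python
import re


def sanitize(name):
    """Approximate the header clean-up: special characters become underscores."""
    return re.sub(r"[^0-9A-Za-z_]", "_", name)


def restore_headers(columns, original_values):
    """Map cleaned-up column names back to the values they came from."""
    # Lookup from the sanitized form to the original header value.
    lookup = {sanitize(value): value for value in original_values}
    # Columns with no match (for example, grouping fields) pass through unchanged.
    return [lookup.get(col, col) for col in columns]


restored = restore_headers(["Region", "a_a", "b_c"], ["a a", "b-c"])
```

Note that two original values that clean up to the same name, such as "a a" and "a_a", cannot be told apart afterwards, so a lookup like this only works when the sanitized names are unique.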
For the workflows shown in these use cases, please see the supplementary Alteryx package. Note that you may receive an error upon extracting the content, but this won't affect running the workflow.
By now, you should have expert-level proficiency with the Cross Tab Tool! If you can think of a use case we left out, feel free to use the comments section below! Consider yourself a Tool Master already? Let us know at community@alteryx.com if you’d like your creative tool uses to be featured in the Tool Mastery Series.
Stay tuned with our latest posts every Tool Tuesday by following Alteryx on Twitter! If you want to master all the Designer tools, consider subscribing for email notifications.
Updated Use Case 1 to show @PeterS's Run Total method from "Parsing Data with Unknown Number of Fields"
Reformatted Use Case 2 Results to eliminate precision warnings.
Included the latest version of my “Combination” macro, which can also be found on the Gallery here.
Something to remember as well (I got caught on this today), after the cross-tab the columns will be in alphabetic order :(
Very nice explanation, Thx for sharing
Great use cases, make things definitely more digestible.
I have a question about Use Case 1. In the first Cross Tab tool, I found that using Concatenate rather than First leaves the cells without authors empty rather than Null, so there is no need for the steps after the Select tool that remove the Nulls.
Is using Concatenate OK in this case, or does it narrow the scope of the workflow or risk causing issues?
Regards
AliAS
Hi,
Is there a way to add subtotals like a pivot table in Excel? For example, I can use Cross Tab to create columns for APAC and AMER, but how do I get yearly total lines inserted after every year?
Thanks in advance!
Raw Data
Year | Product | Region | Volume |
2018 | A | APAC | 10 |
2018 | A | AMER | 12 |
2018 | B | APAC | 20 |
2019 | A | AMER | 5 |
2019 | B | AMER | 13 |
Required output
Year | Product | APAC | AMER |
2018 | A | 10 | 12 |
2018 | B | 20 | |
2018 Total | | 30 | 12 |
2019 | A | | 5 |
2019 | B | | 13 |
2019 Total | | 0 | 18 |
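For illustration, the layout asked for above can be sketched in Python: pivot the regions into columns, then add a total row after each year's detail rows. This only shows the logic and is not a Designer workflow. The region column order is passed in as an assumption.

```python
def pivot_with_subtotals(rows, regions=("APAC", "AMER")):
    """Pivot Region into columns and append a total row after each year."""
    cells = {}  # (year, product) -> {region: volume}
    for year, product, region, volume in rows:
        cell = cells.setdefault((year, product), {})
        cell[region] = cell.get(region, 0) + volume

    out = []
    for year in sorted({y for y, _ in cells}):
        totals = dict.fromkeys(regions, 0)
        for key in sorted(k for k in cells if k[0] == year):
            cell = cells[key]
            # Missing regions stay empty (None) in the detail rows.
            out.append([str(year), key[1]] + [cell.get(r) for r in regions])
            for r in regions:
                totals[r] += cell.get(r, 0)
        out.append([f"{year} Total", ""] + [totals[r] for r in regions])
    return out


raw = [
    (2018, "A", "APAC", 10),
    (2018, "A", "AMER", 12),
    (2018, "B", "APAC", 20),
    (2019, "A", "AMER", 5),
    (2019, "B", "AMER", 13),
]
report = pivot_with_subtotals(raw)
```

The totals are computed separately from the detail rows and then interleaved by year. The same split between detail and totals should carry over to a workflow, though that is not covered in the article.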
You may run into a case where you add or update a Cross Tab tool and subsequent tools reference new fields created by that Cross Tab. This produces an error because Alteryx won't recognize the new fields until the workflow is run.
My solution: Create a "dummy" Text Input with the fields needed in the subsequent tool(s). No actual data rows are required in this input, simply the headers. Then use a Union tool to combine this with the output from the Cross Tab tool. The reason this works is because now the workflow will always be expecting these fields since they're coming from the Text Input no matter what. Because there are no rows of data coming from this source, though, the actual data remains as expected, but there will be no errors!
Hopefully this helps other folks in the future as well.
Text Input:
Workflow snippet:
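The trick in this comment can be mimicked in a few lines of Python, shown here only to illustrate why it works. A stream is modeled as a (header, rows) pair, `union_by_name` is a hypothetical helper that stacks streams by column name, and the field names are made up.

```python
def union_by_name(*streams):
    """Stack (header, rows) streams by column name, filling gaps with None."""
    columns = []
    for header, _ in streams:
        for col in header:
            if col not in columns:
                columns.append(col)  # first appearance fixes the column order

    rows = []
    for header, data in streams:
        for record in data:
            by_name = dict(zip(header, record))
            rows.append([by_name.get(col) for col in columns])
    return columns, rows


# Header-only stand-in for the "dummy" Text Input: fields, but no data rows.
expected_fields = (["Year", "APAC", "AMER", "EMEA"], [])
# Stand-in for the Cross Tab output, which happens to lack an EMEA column.
cross_tab_output = (["Year", "AMER", "APAC"], [[2018, 12, 30]])

columns, rows = union_by_name(expected_fields, cross_tab_output)
```

Because the header-only stream contributes columns but no rows, downstream steps always see the full set of fields, and any field the cross tab did not produce comes through as None.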
This is helpful and interesting. Thank you.