Tool Mastery | Cross Tab
This article is part of the Tool Mastery Series, a compilation of Knowledge Base contributions that introduce diverse working examples for Designer tools. Here we’ll delve into uses of the Cross Tab Tool on our way to mastering Alteryx Designer.
Sometimes you look at the steaming pile of data before you and wonder how you’ll ever get it into the form you need. Every option seems to require a great deal of manual labor, and as a lazy... er, that is, as a data blending professional, that is simply something you will not abide.
In situations like these, you may want to consider shaking things up. There's no better tool for this than the Cross Tab, a powerful tool that reshapes your data any-which-way so you can approach your problem from a new angle. In this article, I will demonstrate a few use cases to showcase how you can leverage this awesome tool.
Use Case 1: Extracting dynamic data made easier by assigning groups
The data?
You receive a list that looks like Field_1 below. The PMID refers to an ID number for a medical journal article in a database. Each FAU is another author on the paper. There may be any number of authors for a paper.
The goal?
A table with ID numbers in the first field and columns corresponding to authors.
How?
- Prepare the data by filtering and splitting away the identifier. The third column above, "Field_12", shows the usable data.
- Use a Multi-Row Formula to identify the ID column and author columns uniquely. In this case, each ID number is represented by ‘0’, and authors are counted up from zero until they hit another ID. See the “Headers” column.
- Use another Multi-Row Formula to tie each run of headers to the paper it belongs to. This is basically a RecordID: it identifies a single paper in the database. See the “Groups” column above.
- Cross Tab! By using these identifying columns, you can pivot your data so that each of the Headers creates a column and each of the Groups creates a row. See the configuration window in the first image above.
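For readers who think in code, the steps above can be sketched in plain Python. This is only an analogy to the Designer workflow, not what the tools do internally; the sample rows and the function names `add_headers_and_groups` and `cross_tab` are made up for illustration.

```python
# Hypothetical sample: (tag, value) pairs after the identifier has been split off.
rows = [
    ("PMID", "1001"),
    ("FAU", "Smith, A"),
    ("FAU", "Jones, B"),
    ("PMID", "1002"),
    ("FAU", "Lee, C"),
]

def add_headers_and_groups(rows):
    """Mimic the two Multi-Row Formula steps.

    Header restarts at 0 on every ID row and counts up for each author.
    Group increases by one on every ID row, so it identifies a single paper.
    """
    out = []
    header, group = 0, 0
    for tag, value in rows:
        if tag == "PMID":
            header = 0
            group += 1
        else:
            header += 1
        out.append({"Group": group, "Header": header, "Value": value})
    return out

def cross_tab(records):
    """One row per Group, one column per Header; cells with no data stay None."""
    headers = sorted({r["Header"] for r in records})
    table = {}
    for r in records:
        row = table.setdefault(r["Group"], dict.fromkeys(headers))
        row[r["Header"]] = r["Value"]
    return headers, table

headers, table = cross_tab(add_headers_and_groups(rows))
for group, row in table.items():
    print(group, [row[h] for h in headers])
```

Papers with fewer authors than the longest author list end up with empty (None) cells in the trailing columns, which is the same shape the Cross Tab gives you as Nulls.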
Use Case 2: Performing calculations dynamically for any number of fields
The data?
A handful of numerical fields, shown below. They are grouped by a category field and you’ve added a unique RecordID field.
The goal?
Rolling averages for each column, within their respective category.
How?
- Instead of writing a Multi-Row Formula for each column, Transpose everything down to a single column, and tack on the Key Fields “RecordID” and “Category.” See the configuration window in the first picture. This results in the below output.
- While it may appear to be even more difficult to work with, this allows you to calculate a rolling average in one fell swoop. Use a Multi-Row Formula tool to calculate an average. You can easily avoid picking up the wrong values by using the Group By option: check off “Category” and “Name”. Make sure also to set Values for Rows that don’t Exist to the closest valid row.
- Restructure using Cross Tab! (Group by “RecordID”, Header “Name”, Data “r3”)
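The same Transpose, Multi-Row, Cross Tab round trip can be sketched in plain Python. Treat this as a rough analogy under stated assumptions: the sample data is invented, and I am assuming a trailing three-row window (the current row plus the two before it), since the exact formula behind "r3" is not spelled out here.

```python
from collections import defaultdict

# Hypothetical wide input: two numeric fields plus RecordID and Category.
wide = [
    {"RecordID": 1, "Category": "A", "X": 1.0, "Y": 10.0},
    {"RecordID": 2, "Category": "A", "X": 2.0, "Y": 20.0},
    {"RecordID": 3, "Category": "A", "X": 3.0, "Y": 30.0},
    {"RecordID": 4, "Category": "B", "X": 4.0, "Y": 40.0},
    {"RecordID": 5, "Category": "B", "X": 6.0, "Y": 60.0},
]
KEYS = ("RecordID", "Category")

def transpose(rows):
    """Turn every non-key field into its own (Name, Value) row."""
    return [
        {"RecordID": r["RecordID"], "Category": r["Category"],
         "Name": name, "Value": value}
        for r in rows
        for name, value in r.items()
        if name not in KEYS
    ]

def rolling_average(long_rows, window=3):
    """Trailing average within each (Category, Name) group.

    Rows that would fall before the start of a group are replaced by the
    closest valid row, i.e. the first row of that group.
    """
    groups = defaultdict(list)
    for r in long_rows:
        groups[(r["Category"], r["Name"])].append(r)
    for members in groups.values():
        values = [m["Value"] for m in members]
        for i, m in enumerate(members):
            picked = [values[max(j, 0)] for j in range(i - window + 1, i + 1)]
            m["r3"] = sum(picked) / window
    return long_rows

def cross_tab(long_rows):
    """Group by RecordID, use Name as the header and r3 as the data."""
    table = defaultdict(dict)
    for r in long_rows:
        table[r["RecordID"]][r["Name"]] = r["r3"]
    return dict(table)

result = cross_tab(rolling_average(transpose(wide)))
for record_id, row in result.items():
    print(record_id, row)
```

The point of the detour through the long format is that one grouped calculation covers every field, so adding a third or a thirtieth numeric column changes nothing in the logic.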
Use Case 3: Tricky logic made easier with Cross Tab methodologies
The data?
You have a list of all possible combinations of 5 items. For each combination, a number of rows corresponding to the number of items lists each item's weight and value - i.e. Combination 123 will be represented three times, with information for item1, item2, and item3.
The goal?
You wish to optimize your selection of items to meet certain criteria, such as minimum weight and maximum value.
How?
- Use a Formula tool to add a column for "Weight" as shown in the first image.
- Use the Cross Tab with the "Sum" methodology to find the combined weight of all the items in each combination. The "Weight" header will group all the "kg" values together, and grouping by "Combinations" will create one row for every combination.
- Repeat this for "Value" ($).
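As a rough Python analogy for the Sum methodology, the sketch below totals weight and value per combination and then picks a winner. The item figures, the `usd` field name, and the 6 kg cap are all hypothetical; swap in whatever criteria you actually care about.

```python
from collections import defaultdict

# Hypothetical long-format rows: one row per item in each combination.
rows = [
    {"Combination": "12", "Item": "item1", "kg": 2.0, "usd": 30.0},
    {"Combination": "12", "Item": "item2", "kg": 3.0, "usd": 50.0},
    {"Combination": "13", "Item": "item1", "kg": 2.0, "usd": 30.0},
    {"Combination": "13", "Item": "item3", "kg": 4.0, "usd": 60.0},
    {"Combination": "123", "Item": "item1", "kg": 2.0, "usd": 30.0},
    {"Combination": "123", "Item": "item2", "kg": 3.0, "usd": 50.0},
    {"Combination": "123", "Item": "item3", "kg": 4.0, "usd": 60.0},
]

# Output header -> source field, like the constant header column added
# with the Formula tool before each Cross Tab.
HEADERS = {"Weight": "kg", "Value": "usd"}

def sum_cross_tab(rows):
    """Group by Combination and sum each source field under its header."""
    totals = defaultdict(lambda: dict.fromkeys(HEADERS, 0.0))
    for r in rows:
        for header, field in HEADERS.items():
            totals[r["Combination"]][header] += r[field]
    return dict(totals)

totals = sum_cross_tab(rows)

# Example criterion (an assumption): highest value among combinations
# whose combined weight does not exceed the cap.
MAX_WEIGHT = 6.0
eligible = {c: t for c, t in totals.items() if t["Weight"] <= MAX_WEIGHT}
best = max(eligible, key=lambda c: eligible[c]["Value"])
print(best, totals[best])
```

Once every combination is a single row with its totals side by side, the "tricky logic" reduces to an ordinary filter and sort.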
Pro Tip: Renaming Fields
A downside of the Cross Tab tool is that it doesn't play nice with special characters in field headers, including spaces. This means that if you have a field header "a a", it will actually come out as "a_a". I know this can be a bit inconvenient, but when we were developing the Alteryx engine we prioritized speed and efficiency over keeping the field headers looking nice. Don't worry though - there's a perfectly doable solution to this problem (more than one actually!) that makes use of an awesome tool called Dynamic Rename. This is the way I usually like to go about it:
For the workflows shown in these use cases, please see the supplementary Alteryx package. Note that you may receive an error upon extracting the content, but this won't affect running the workflow.
Additional Information
By now, you should have expert-level proficiency with the Cross Tab Tool! If you can think of a use case we left out, feel free to use the comments section below! Consider yourself a Tool Master already? Let us know at community@alteryx.com if you’d like your creative tool uses to be featured in the Tool Mastery Series.
Stay tuned with our latest posts every Tool Tuesday by following Alteryx on Twitter! If you want to master all the Designer tools, consider subscribing for email notifications.
Updated Use Case 1 to show @PeterS's Run Total method from "Parsing Data with Unknown Number of Fields"
Reformatted Use Case 2 Results to eliminate precision warnings.
Included the latest version of my “Combination” macro, which can also be found on the Gallery here.
Something to remember as well (I got caught on this today): after the Cross Tab, the columns will be in alphabetical order :(
Very nice explanation, Thx for sharing
Great use cases, make things definitely more digestible.
I have a question about Use Case 1. In the first Cross Tab tool, I found that using Concatenate rather than First leaves the cells without authors empty rather than Null, so there is no need for the steps after the Select tool that remove the Nulls.
Is using Concatenate okay in this case, or does it narrow the scope of the workflow or risk causing issues?
Regards
AliAS
Hi,
Is there a way to add subtotals like a pivot table in Excel? For example, I can use Cross Tab to create columns for APAC and AMER, but how do I get a yearly total line inserted after every year?
Thanks in advance!
Raw Data
Year | Product | Region | Volume |
2018 | A | APAC | 10 |
2018 | A | AMER | 12 |
2018 | B | APAC | 20 |
2019 | A | AMER | 5 |
2019 | B | AMER | 13 |
Required output
Year | Product | APAC | AMER |
2018 | A | 10 | 12 |
2018 | B | 20 | |
2018 Total | | 30 | 12 |
2019 | A | | 5 |
2019 | B | | 13 |
2019 Total | | 0 | 18 |
You may run into the case where you add or update a Cross Tab tool and subsequent tools reference new fields created by that Cross Tab. This will error because Alteryx won't recognize these new fields until the workflow has been run.
My solution: Create a "dummy" Text Input with the fields needed in the subsequent tool(s). No actual data rows are required in this input, only the headers. Then use a Union tool to combine this with the output from the Cross Tab tool. This works because the workflow will now always expect these fields, since they come from the Text Input no matter what. Because there are no rows of data coming from this source, the actual data remains as expected, but there will be no errors!
Hopefully this helps other folks in the future as well.
This is helpful and interesting. Thank you.