Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Community is experiencing an influx of spam. As we work toward a solution, please use the 'Notify Moderator' option on the ellipsis menu to flag inappropriate posts.

Tool Mastery

Explore a diverse compilation of articles that take an in-depth look at Designer tools.
Become a Tool Master

Learn how you can share your expertise with the Community

LEARN MORE

Tool Mastery | Cross Tab

AlexKo
Alteryx Alumni (Retired)
Created

Cross Tab.pngThis article is part of the Tool Mastery Series, a compilation of Knowledge Base contributions to introduce diverse working examples for Designer Tools. Here we’lldelve into uses of the Cross Tab Tool on our way to mastering the Alteryx Designer:

 

Sometimes you look at the steaming pile of data before you and wonder how you’ll ever get it in the form you need. Every option seems to require a great deal of manual labor, and as a lazy– er that is, as a data blending professional, that is simply something you will not abide.

 

In situations like these, you may want to consider shaking things up.There's no better tool for this than the Cross Tab, apowerful tool that allows you to reshape your data any-which-way, allowing you to approach your problem from a new angle. In this article, I will demonstrate a few use cases to showcase how you can leverage this awesome tool.

 

Use Case 1: Extracting dynamic data made easier by assigning groups

 

The data?

 

You receive a list that looks like Field_1 below. The PMID refers to an ID number for a medical journal article in a database. Each FAU is another author on the paper. There may be any number of authors for a paper.

 

 

idea Skyscrapers

 

 

The goal?

A table with ID numbers in the first field and columns corresponding to authors.

 

2019-03-01_10-13-18.png

 

 

How?

  1. Prepare the data by filtering and splitting away the identifier. The third column above, "Field_12", shows the usabledata.
  2. Use a Multi-Row Formula to identify the ID column and author columns uniquely. In this case, each ID number is represented by ‘0’, and authors are counted up from zero until they hit another ID. See the “Headers” column.
  3. Use another multi-row to associate each group of headers to one another. This is basically a RecordID – it identifies a single paper in the database. See the “Groups” column above.
  4. Cross-Tab! By using these identifying columns, you may shift your data so that each of the Headers creates a column, and each of the Groups create a row. See the configuration window in the first image above.

 

Use Case 2: Performing calculations dynamically for any number of fields

 

The data?

A handful of numerical fields, shown below. They are grouped by a category field and you’ve added a unique RecordID field.

 

idea Skyscrapers

 

 

 

 

The goal?

Rolling averages for each column, within their respective category.

 

2019-03-01_10-18-17.png

 

 

How?

  1. Instead of writing a multi-row formula for each column, Transposeeverything down to a single column, and tack on the Key Fields “RecordID” and “Category.” See the configuration window in the first picture. This results in the below output.

    2019-03-01_10-19-09.png
     
  2. While it may appear to be even more difficult to work with, this allows you to calculate a rolling average in one fell swoop. Use a Multi-Row tool to calculate an Average. You can easily avoid picking up the wrong values by using the Group By option – check off “Category” and “Name”. Make sure also to set Values for Rows that don’t Existto the closest valid row.

    2019-03-01_10-19-52.png
     
  3. Restructure using Cross Tab! (Group by “RecordID”, Header “Name”, Data “r3”)

 

Use Case 3: Tricky logic made easier with Cross Tab methodologies

 

The data?

You have a list of all possible combinations of 5 items. For each combination, a number of rows corresponding to the number of items lists each item's weight and value - i.e. Combination 123 will be represented three times, with information for item1, item2, and item3.

 

crosstab3a.jpg

 

 

 

The goal?

You wish to optimize your selection of items to meet certain criteria, such as minimum weight and maximum value.
 

2019-03-01_10-24-22.png

 

How?

  1. Use a Formulatool to add a column for "Weight" as shown in the first image.
  2. Use theCross Tabwith the "Sum" methodology to find the combined weight of all the items in each combination. The "Weight" header will group all the "kg" values together, and grouping by "Combinations" will create one row for every combination.
  3. Repeat this for "Value" ($).
     

Pro Tip: Renaming Fields

 

A downside of the Cross Tab tool is that it doesn't play nice with special characters in field headers, including spaces. This means that if you have a field header "a a", it will actually come out as "a_a". I know this can be a bit inconvenient, but when we were developing the Alteryx engine we prioritized speed and efficiency over keeping the field headers looking nice. Don't worry though - there's a perfectly doable solution to this problem (more than one actually!) that makes use of an awesome tool called Dynamic Rename. This is the way I usually like to go about it:

 

Capture.PNG

 

For the workflows shown in these use cases, please see the supplementary Alteryx package. Note that you may receive an error upon extracting the content, but this won't affect running the workflow.

Additional Information

Click on the corresponding language link below to access this article in another language -

By now, you should have expert-level proficiency with the Cross TabTool! If you can think of a use case we left out, feel free to use the comments section below! Consider yourself a Tool Master already? Let us know at community@alteryx.com if you’d like your creative tool uses to be featured in the Tool Mastery Series.

 

Stay tuned with our latest posts every Tool Tuesday by following Alteryx on Twitter! If you want to master all the Designer tools, consider subscribing for email notifications.




 
Attachments
Comments
AlexKo
Alteryx Alumni (Retired)

Updated Use Case 1 to show @PeterS's Run Total method from "Parsing Data with Unknown Number of Fields"

Reformatted Use Case 2 Results to eliminate precision warnings.

Included the latest version of my “Combination” macro, which can also be found on the Gallery here.

cmcclellan
13 - Pulsar

Something to remember as well (I got caught on this today), after the cross-tab the columns will be in alphabetic order :( 

rfcnobrega
6 - Meteoroid

Very nice explanation, Thx for sharing

AliAS2020
8 - Asteroid

Great use cases, make things definitely more digestible.

I have a question about Use Case 1, in the 1st cross tab tool I found that using Concatenate rather than First leaving the cells without authors empty rather than Null, so there is no need for the other steps after the select tool to remove the Null.

 

Does using Concatenate in this case ok? or it make the scope of the workflow scope smaller or might cause issues?

 

Regards

 

AliAS

Harshmore
5 - Atom

Hi,

 

Is there a way to add subtotal like a pivot table in excel. For example i can use cross tab to create column for APAC and AMER but how do i get the yearly total lines inserted after every year.

 

Thanks in advance!

 

Raw Data

YearProductRegionVolume
2018AAPAC10
2018AAMER12
2018BAPAC20
2019AAMER5
2019BAMER13

 

Required output

YearProductAPACAMER
2018A1012
2018B20 
2018 Total 3012
2019A 5
2019B 13
2019 Total 018
Kenda
16 - Nebula
16 - Nebula

You may run into the case where you add or update a Cross Tab tool that has subsequent tools referencing new fields created as a result of that Cross Tab. This would error because Alteryx won't recognize these new fields until the workflow is ran.

 

My solution: Create a "dummy" Text Input with the fields needed in the subsequent tool(s). No actual data rows are required in this input, simply the headers. Then use a Union tool to combine this with the output from the Cross Tab tool. The reason this works is because now the workflow will always be expecting these fields since they're coming from the Text Input no matter what. Because there are no rows of data coming from this source, though, the actual data remains as expected, but there will be no errors!

 

Hopefully this helps other folks in the future as well.

 

 

Text Input:

Kenda_0-1639487546739.png

 

 

Workflow snippet:

Kenda_1-1639487734527.png

 

 

mvtejano
7 - Meteor

This is helpful and interesting. Thank you.