Have you ever wanted to do a Cross Tab, but needed the results in a particular order? You've probably discovered that Cross Tab will dutifully change your data to column headings and keep the rest of the data correlated, but that the output columns will be arranged in alphabetical order. It's as if there were a Select tool with the "Sort on Field Name" Option thrown into your workflow without you putting it there.
You may have noticed that the Cross Tab tool will insert underscores in your headers. There may be times when that's OK, but for those times when it's not, how do you get the original format of the data back, without the underscores? Attached is an example that provides a solution. Feel free to bypass the rest of this posting and go directly to the example.
If you'd like a narrative of what is going on in the example, read on. The Text Input tool contains the following data:
We want to cross tab our data, so fields in Dept Name become our new headers with Cost in Millions as our values.
So now our data looks like this:
where the headers have been changed to include underscores. Very uncool. So in a separate workflow stream, we're going to get the original format.
Attach a Select tool to the Text Input:
And select Program Type and Dept Name. We don't need the Cost in Millions anymore. We're going to use the Dept Name - which is in the original format - and eventually use it to replace the underscored headers.
After the select, our data looks like this.
However, we only need one group for Dept Name. You can see that the Dept Name is repeated for each Program Type. So attach a Unique Tool and check Dept Name in the configuration window.
Now we're left with the following data:
This is where it gets interesting. We need to make a copy of the Dept Name, but we want to make it look exactly like the headers with the underscores in it. Why? We'll use this new field to align our data in a subsequent Union tool. In the meantime, back to creating the new field with underscores in the data.
This REGEX_Replace expression will take the current Dept Name field and replace the colons and spaces with underscores. In the real world, your actual data may contain other characters that were replaced with underscores by the Cross Tab tool. You'll need to modify this expression accordingly. In this example, the result of the REGEX_Replace statement looks like this:
where Dept Name2 looks like the headers from the Cross Tab tool.
So, rather ironically, we're going to cross tab this data. Connect a Cross Tab tool and, in the configuration, we'll group by Program Type, make Dept Name2 the new headers, and use Dept Name as the values:
The result looks like this:
We're ready to union our data at this point. Since the headers from both Cross Tab tools look the same, we can select 'Auto Configure by Name'.
Make sure the Cross Tab with the original format is attached to the Union Tool first.
By specifying the output order in the Union Tool configuration, we ensure the original field data is on top (rather than the bottom).
For the grand finale, we'll add a Dynamic Rename tool and configure the tool to 'Take Field Names from First Row of Data'. Notice that Program Type is not checked in the configuration. That's because this field is already in the format we want, so we unselect that field.
Now we have our data, cross-tabbed, with the original format. Our headers look the way we want them to!
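Outside of Alteryx, the same rename trick can be sketched in a few lines of plain Python. The department names and values below are hypothetical, and the Cross Tab character substitution is simplified to just colons and spaces:

```python
import re

# Original headers, as they would come out of the Unique tool.
original = ["Human Resources", "R: D"]

# Mimic the Cross Tab tool's underscore substitution to build a row whose
# keys match the mangled headers and whose values are the original names.
header_row = {re.sub(r"[: ]", "_", d): d for d in original}

# A hypothetical row from the first Cross Tab, with mangled headers.
data_rows = [{"Human_Resources": 1.2, "R__D": 3.4}]

# Union: put the original-name row on top of the data.
unioned = [header_row] + data_rows

# Dynamic Rename: take field names from the first row of data.
new_names = unioned[0]
renamed = [{new_names[k]: v for k, v in row.items()} for row in unioned[1:]]
```

This mirrors the workflow's outcome (headers restored to their original format), not the exact tool chain.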
The Report Map Tool (master it here) allows you to create thematic maps by selecting a "Theme" field on the Data tab and then further defining the theme on the Layers tab, for example:
The above example creates a map of Texas showing average annual rainfall totals where orange is the least rainfall and blue the most:
Pretty nice, right? But what if you want to change the map and instead of applying the theme to the fill color for the polygons, you want to apply the theme to the outline and just show that?
That is a little trickier because, while the Report Map Tool allows you to adjust the outline color and size of your polygons, it doesn't automatically apply the theme to them, so a workaround needs to be built.
You could feed in each polygon as an individual layer but that is difficult to manage - to keep the color gradient consistent, making sure they are in the right order. And what if a new rainfall range is introduced? You might have to adjust a couple of layers to account for it.
A better option would be to turn the polygon outlines into polylines themselves. That would allow you to apply a theme right to the outlines.
In order to do this, we will use the following tools:
A RecordID is assigned so that we can pull the data apart and put it back together again.
The polygons are split into detailed regions using the Poly-Split tool and rows flagged as 'holes' are removed.
The polygons are split into points.
Those points are reassembled as a sequence polyline. To create the polyline, the data is grouped by the RecordID to keep each polyline separate. (A polyline contains multiple line segments; a line has one start point and one end point, but can have any number of points in between. A polyline can be quite complex, as in the case of road systems or rivers.)
The sequence polylines are joined back to the original data set.
Using the reporting tools to create the maps with rainfall range as the thematic field.
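As a rough illustration of the split-and-reassemble steps above, here is the outline-to-polyline idea in plain Python. The RecordIDs and coordinates are made up; in the workflow, real spatial objects come from the spatial tools:

```python
# Hypothetical polygons: RecordID -> list of vertex points.
polygons = {
    1: [(0, 0), (4, 0), (4, 3), (0, 3)],
    2: [(5, 5), (7, 5), (6, 8)],
}

# Rebuild each polygon's points as one polyline per RecordID, closing the
# ring so the outline ends where it started. Grouping by RecordID keeps
# each outline separate, as in the Poly-Build step.
polylines = {}
for record_id, ring in polygons.items():
    polylines[record_id] = ring + [ring[0]]
```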
With that workaround you can create a map that looks like this:
For details on tool configurations, see the attached workflow.
The Dynamic Rename Tool is part of the developer category of tools. It allows the user to quickly rename any or all fields within an input stream by employing the use of different methods.
The user has the option to rename only certain fields, all fields, or even dynamic/unknown fields at runtime (e.g. after a Cross Tab Tool). The options for renaming fields are:
The Make Columns Tool takes rows of data and arranges them by wrapping records into multiple columns. The user can specify how many columns to create and whether records should be laid out horizontally or vertically.
Let's take a look at this data set that needs to be combined into one field for reporting purposes:
but it shouldn't end up looking like this:
That is no better for reporting purposes than the data we started with because we only want to see areas that need improvement and we don't want to just combine values separated by commas.
So how can we get it to look like this?
A formula such as this one could work:
If contains([ProdDessert], "Improve") then [ProdDessert] else "" endif + " & " +
If contains([ProdSupplies], "Improve") then [ProdSupplies] else "" endif + " & " +
If contains([ProdFood], "Improve") then [ProdFood] else "" endif + " & " +
If contains([ProdFoodSvc], "Improve") then [ProdFoodSvc] else "" endif
But that's a lot of typing! And it will require cleaning up of "&" and spaces where they aren't needed.
So let's look at a dynamic way of doing this:
First, using a Transpose tool, the columns are turned into rows and the data is filtered for rows that contain the word "Improve":
Then, using a Summarize tool, the remaining rows are concatenated together using a space as the delimiter:
Lastly, some cleanup is done in a Formula tool:
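The same transpose-filter-concatenate pattern can be sketched in plain Python. The field names and values below are hypothetical:

```python
# One hypothetical record with the survey fields as columns.
row = {
    "ProdDessert": "Improve Quality",
    "ProdSupplies": "Meets Standards",
    "ProdFood": "Improve Delivery",
    "ProdFoodSvc": "Meets Standards",
}

# Transpose + Filter: turn columns into rows, keep only the values that
# contain the word "Improve".
flagged = [v for v in row.values() if "Improve" in v]

# Summarize: concatenate the remaining rows with a space delimiter. Because
# only matching values were kept, no "&" or space cleanup is needed here.
combined = " ".join(flagged)
```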
And the data is ready for reporting!
Also, take a look at the attached workflow that walks through these steps.
This article was put together to resolve a common issue with cleansing your data, as well as to show some tools and techniques that newer users don't normally reach for. The goal of the article is to introduce newer users to these tools, open up their creativity, and hopefully take them to the next level!
In this use case, the data in the attached workflow is messy with capitalized strings all over the place. We want to format the data by removing some of the capitalization, but not all of it.
Note: If we wanted to make every first letter of the word capitalized we can use the Formula Tool and the TitleCase(String) function. This would make BEAR the WEIGHT - Bear The Weight. See the difference?
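For reference, Python's str.title() gives a rough stand-in for the TitleCase(String) behavior described above (a simplified illustration, not the Alteryx engine itself):

```python
# str.title() capitalizes the first letter of each word and lowercases the
# rest, mirroring the BEAR the WEIGHT -> Bear The Weight example.
title_cased = "BEAR the WEIGHT".title()
```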
The tools that we will be using in this exercise are the Record ID, Text to Columns, RegEx, Formula, Tile, and Cross Tab Tools.
The exercise will show you the importance of the Record ID Tool, the flexibility of the Text to Columns and RegEx Tools, the under-used Tile Tool, the creativity of the Formula Tool, and the not-so-scary Cross Tab Tool when the data is configured properly.
We hope that these exercises and use cases open up your mind to the greatness of Alteryx!
See attached workflow and enjoy!
Did you know the average football game lasts 3 hours and 12 minutes and only amounts to roughly 11 minutes of play? Now, I love trying to eat Doritos through my TV screen as much as the next guy, but for me the highlights are definitely a better watch. The Summarize Tool would probably agree - the most effective communication of your data is the most concise summary of it. Whether it’s concatenating strings for storage, merging reports to have better readability, getting your spatial objects to interact, or even calculating averages and other formulas on groupings of data, the Summarize Tool can reframe your data to be more informative. This article provides a few examples on how.
The Transpose tool pivots data for all selected fields. The column headers are listed in the name field and the corresponding data items are listed in the value field. You can also select key fields which will remain unchanged through the transformation. The transpose tool is often used in conjunction with the Cross Tab tool, which essentially works in the opposite direction.
Is there a way to avoid the Cross Tab Tool from transferring all the input information alphabetically? Simply add a RecordID to your records and add the RecordID field as a grouping field in your Cross Tab Tool to keep the order!
We frequently get questions about how to flag rows in a data set that are missing values in any column.
In the example below, we will demonstrate how to add a new column to the data that gives a count of null or empty values per row. The end result will give a count of 1 to each of the highlighted rows in the image. The new column can be used in a filter tool to isolate rows of data that have missing values.
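As a sketch of the idea in plain Python (hypothetical rows, with None and "" standing in for Null and empty values):

```python
# Hypothetical records, some with missing values.
rows = [
    {"Name": "Ann", "City": "Austin"},
    {"Name": "Bob", "City": None},
    {"Name": "",    "City": "Denver"},
]

# Add a new column counting null or empty values per row.
for row in rows:
    row["MissingCount"] = sum(v is None or v == "" for v in row.values())

# A downstream filter can then isolate rows with missing values.
flagged = [r for r in rows if r["MissingCount"] > 0]
```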
Sometimes you look at the steaming pile of data before you and wonder how you'll ever get it in the form you need. Every option seems to require a great deal of manual labor, and as a lazy... er, that is, as a data blending professional, that is simply something you will not abide.
The Excel number for a modern date is most easily calculated as the number of days since 12/30/1899 on the Gregorian calendar.
Attached please find a v10.0 workflow that converts a double (Excel date) to a Datetime in Alteryx.
It uses this function in a Formula Tool:
where Datebase is a datetime value set to "1899-12-30 00:00:00"
and Field1 is a double value of 42397
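A sketch of the same conversion in plain Python, using the sample value above:

```python
from datetime import datetime, timedelta

# The Excel serial date base on the Gregorian calendar.
datebase = datetime(1899, 12, 30)

# The Excel date as a double (sample value from the text).
field1 = 42397.0

# Add the serial number as days to the base date to get the datetime.
converted = datebase + timedelta(days=field1)
```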
How do you use the Arrange Tool in Alteryx?
The Arrange tool allows you to manually transpose and rearrange your data fields for presentation purposes. Data is transformed so that each record is turned into multiple records, and columns can be created by using field description data.
Configure the Arrange tool:
Key Fields: Select columns from your data stream.
Output Fields: Create and manipulate output fields. To create a new output field, click Column and select Add to open the Add Column window.
Column Header: Enter the name of the new column of data.
Fill in Description Column: Select Add New Description to create a column containing the description value of the selected fields.
Please find the example Arrange.yxmd attached.
Recently a couple of questions came across the Customer Support desk asking how a fiscal calendar could be incorporated into a workflow. Alteryx doesn't natively have a tool to create one, but it does have a number of tools that can be combined to build a fiscal calendar. Here is an example of how this can be done.
1. Determine the start and end dates of the fiscal calendar and enter them into a Text Input tool, each on an individual row
2. Connect a TS Filler tool to generate dates between the start and end dates
3. Add a Select Tool to remove unnecessary fields
4. Add a Multi-Row Formula Tool to create a Day of Week field by assigning each row a day of the week from 1 to 7
5. Add another Multi-Row Formula Tool to calculate the fiscal week. Whenever the Day of Week is 1 add a value of 1 to the previous row’s fiscal week value. This will create a running week value for the entire year
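The five steps above can be sketched in plain Python (the fiscal year dates are hypothetical):

```python
from datetime import date, timedelta

# Step 1-2: start and end dates, and the dates generated between them.
start, end = date(2017, 1, 1), date(2017, 12, 31)

# Steps 4-5: assign a 1-7 day-of-week counter, and bump the fiscal week
# whenever the counter rolls back to 1, giving a running week for the year.
rows = []
day = start
day_of_week, fiscal_week = 0, 0
while day <= end:
    day_of_week = day_of_week % 7 + 1   # cycles 1..7, like the first
                                        # Multi-Row Formula
    if day_of_week == 1:
        fiscal_week += 1                # second Multi-Row Formula
    rows.append((day, day_of_week, fiscal_week))
    day += timedelta(days=1)
```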
An example workflow is attached. The attached workflow also shows how the fiscal month and week of month may be added. (Workflow is in Alteryx 10.6 version.)
You monitor the mileage of multiple trucks as they deliver shipments over the course of a week and record additional information regarding each truck in a file (Truck Metrics). Each truck’s cumulative mileage per day is recorded in a separate file (Truck Mileage). Your goal is to update the mileage-related fields in Truck Metrics with the values recorded in Truck Mileage. Note: today is Tuesday, so only fields for Monday and Tuesday will be updated in the Truck Metrics file.
Manually Select Fields to Update
Whether the data is uniquely identified by a Truck Number (or Record ID) or identically ordered in both files, data from Truck Metrics (the table to be updated) and Truck Mileage (the data used for updating) can be Joined together. Then, using the Select functionality within the Join tool, I can manually choose the fields from Truck Mileage that I want to use to replace the fields that need to be updated (Figure 1).
Figure 1: Fields from Truck Mileage (Yellow) replace fields from Truck Metrics (Blue). Note that fields in yellow are selected while fields in blue have been deselected. Fields that need to be included from Truck Metrics (Red) remain selected.
Fantastic! A simple, straightforward way to update fields! But, as any analyst knows, working with data is rarely simple or straightforward. What if you’re dealing with 20 fields that need to be updated, not just 2? In that case, manually selecting fields to update is not only tedious but also error-prone. For these types of situations, I recommend a process that allows for a more dynamic approach.
'Verticalize' the Data to Dynamically Update Fields
Transposing, or ‘verticalizing’, data allows for a more dynamic workflow when you have unknowns in your processing. In a scenario such as this one, you may have an unknown or changing number of fields that will need to be updated in Truck Metrics. Using this approach, we’ll first Transpose both data sets to configure the Field Name and its associated value in a single row (Figure 2).
Figure 2: The data is transposed from Truck Mileage and Truck Metrics. The highlighted fields in Truck Metrics indicate which fields will be updated, as the same fields exist in Truck Mileage.
Then, we’ll Join our datasets based on two fields: Truck Number AND Name. This ensures that fields in Truck Mileage will match to the correct fields in Truck Metrics, assuming the field names in both tables are named the same way. The only selecting we’ll have to do is to make sure all fields from the Right data source (in this case, Truck Metrics) are deselected (Figure 3). This ensures that, where Truck Numbers and field names match, the updated values will be used.
Figure 3: The Joined fields indicate the fields that exist in both Truck Mileage and Truck Metrics. Fields in yellow (Truck Mileage) are selected to reflect updated values in downstream tools. Fields in blue (Truck Metrics) are deselected.
Note that any unmatched fields have fallen out of the Right side of the Join:
To add them back into the data stream, simply Union the Center and Right Joins together, setting the tool to “Auto Configure by Name”. Then, to rearrange the data into its original format, use the Cross Tab tool (Figure 4). And, voila! Your data is updated!
Figure 4: The updated fields are highlighted in yellow. All other fields originally included in Truck Metrics are included in the final results as well.
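The net effect of the verticalize-join-union-crosstab chain can be sketched in plain Python. The trucks and fields below are hypothetical, and the chain is simplified to a per-truck merge rather than the actual tool sequence:

```python
# Hypothetical tables: Truck Number -> {field name: value}.
metrics = {101: {"Mon": 0, "Tue": 0, "Driver": "Ann"}}  # table to update
mileage = {101: {"Mon": 120, "Tue": 260}}               # updated values

updated = {}
for truck, fields in metrics.items():
    new_fields = dict(fields)
    # Join on truck AND field name: matched pairs take the Truck Mileage
    # value, while unmatched fields (the Right join output, here "Driver")
    # fall through unchanged, as the Union step restores them.
    new_fields.update(mileage.get(truck, {}))
    updated[truck] = new_fields
```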
Question I have a table of sales data with each column being a week's worth of sales. I only want records that have data in each of those fields and want to filter out all records that have Null values. How can I do this?
Answer There are two basic elements necessary to make this happen. The first is that all records in the original table have a unique ID. If you do not have a unique ID in your data, go ahead and add a Record ID Tool.
In the sample data you can see we will want data from Rows 1 and 6 while filtering out each of the other records because they contain null values.
From here we will use the Transpose Tool to pivot your data into 3 separate columns. In the Transpose configuration, choose your unique ID as the KEY FIELD and make sure all other fields are selected as DATA FIELDS.
The result is that you will have your unique ID field, a field called [Name] which contains the names of each of the fields in your data, repeated for every unique ID in your original data, and a [Value] field which contains the individual values for each of the records for each of the columns in the original data.
Now we want to search for Nulls and get a comprehensive list of the UniqueID values that do not contain Null values. This is the time to bring in a Summarize tool: GroupBy your unique ID field, and then use the CountNull action.
The result is a list of how many nulls exist in each of your unique ID groups.
Next we can simply filter for the records that have 0 null values and then use the unique IDs to join back to the original data, pulling only those records.
It's important to note here that because I'm only interested in the original fields I intentionally chose to deselect the unique ID and the Null Count fields from the output of the join so that I am left with only those records that have data in all of the weeks.
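The whole approach can be sketched in plain Python (hypothetical values, with None standing in for Null):

```python
# Hypothetical table: unique ID -> {week: sales value}.
table = {
    1: {"Week1": 10, "Week2": 12, "Week3": 9},
    2: {"Week1": 7,  "Week2": None, "Week3": 11},
}

# Summarize: group by the unique ID and count the nulls per group
# (the transpose to Name/Value rows is implicit in iterating the values).
null_counts = {rid: sum(v is None for v in row.values())
               for rid, row in table.items()}

# Filter to 0 nulls, then "join back" to pull only the complete records.
complete = {rid: table[rid] for rid, n in null_counts.items() if n == 0}
```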
See the attached v10.5 workflow for an example of the approach above.