You may have noticed that the Cross Tab tool will insert underscores in your headers. There may be times when this is ok, but for those times when it’s not, how do you get the original format of the data back - without the underscores? Attached is an example that provides a solution. Feel free to bypass the rest of this posting and go directly to the example.
If you'd like a narrative of what is going on in the example, read on. The Text Input tool has data that has the following data:
Wewant to cross tab ourdata, so fields in Dept Name become ournew headers with Cost in Millions as ourvalues.
So now ourdata looks like this:
where the headers have been changed to include underscores. Very uncool. So in a separate workflow stream, we're going to get the original format.
Attach a Select tool to the Text Input:
And select Program Type and Dept Name. We don't need the Cost in Millions anymore. We going to use the Dept Name - which is in the original format - and eventually use this to replace the data with underscores.
After the select, our data looks like this.
However, we only need one groupfor Dept Name. You can see it the Dept Name is repeated for each Program Type. So attach a Unique Tool and check Dept Name in the configuration window.
Now we're left with the following data:
This is where it gets interesting. We need to make a copy of the Dept Name, but we want to make it look exactly like the headers with the underscores in it. Why? We'll use this new field to align our data in sequent Union tool. In the meantime, back to creating the new field with underscores in the data.
This REGEX_Replace expression will take the current Dept Name field and replace the colons and spaces with underscores. In the real world, your actual data may contain other characters that were replaced with underscores by the Cross Tab tool. You'll need to modify this expression accordingly. In this example, the result of the REGEX_Replace statement looks like this:
where Dept Name2 looks like the headers from the Cross Tab tool.
So rather ironically we're going to cross tab this data. Connect a Cross Tab tool and in the configuration, we'll group by Program Type, make the new headers Dept Name2 and the values will be Dept Name:
The result looks like this:
We're ready to union our data at this point. Since the headers from both Cross Tab tools look the same, we can select 'Auto Configure by Name'.
Make sure the Cross Tab with the original format is attached to the Union Tool first.
By specifying the output order in the Union Tool configuration, we ensure the original field data is on top (rather than the bottom).
For the grand finale, we'll add a Dynamic Rename tool and configure the tool to 'Take Field Names from First Row of Data'. Notice that Program Type is not checked in the configuration. That's because this field is already in the format we want, so we unselect that field.
Now we have our data, cross-tabbed, with the original format. Our headers look the way we want them to!