Dynamic Regular Expressions to Split Columns by Delimiters
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi,
I'm trying to split columns dynamically by different delimiters, but I'm having some issues with the RegEx element and how to pass this through the RegEx tool dynamically.
Here's a view of my data:
I'm looking to split the "expected_pattern" field by the delimter found in the "expected_delimiter" field. This can be different depending on the data source. I've created a column called "Max_Count of Delimiters" to get the total number of splits I'll need to do. I've done this so all "expected_pattern" fields are split. If there are blank fields, that's fine.
I did some searching on how to generate the dynamic RegEx for this, which is the value appearing in the "Regex" column. Here's the formula I used to create this:
REPLACE(REPLACE("^(?:([^DELIM]+)DELIM){COUNT}", "DELIM", [expected_delimiter]), "COUNT", ToString([Max_Count of Delimiters]))
How can I use this RegEx to split each row dynamically by the delimiter in the "expected_delimiter" column by the total number of splits needed from the "Max_Count of Delimiters" column?
Thanks,
Mark
Solved! Go to Solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi, @mholland
Maybe you can upload input and output sample data (as table, not screenshot) as your want ?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Sure, I've attached a file that shows the data and how I'd like it to appear in the end:
- In this example the Max_Count of Delimiters is 8, therefore 9 columns should be created
- In the first row, the delimiter is "|". When split, this creates 8 fields
- In the second row, the delimiter is "_". When split, this creates 9 fields
- In the third row, the delimiter is "|". When split, this creates 5 fields
Does that help?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi, @mholland
The following solution does not use RegEx, but in addition to manually specifying the delimiter, it is possible to dynamically split columns. Of course, you can also use the Dynamic Replace Tool to achieve a fully dynamic splitting effect. As for whether you must use RegEx to achieve a completely dynamic effect, I look forward to the answers of others.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi,
With your solution, I like that it splits the fields into a column. I could use this later on.
But is there any way to split the column by a delimiter from another column, rather than having to manually input the delimiter?
I'll have a lot of variations on this and want to avoid having to do any manual changes to steps.
Any thoughts?
Thanks,
Mark
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@mholland
If we are having a few and fixed delimiters then we can go with a rather static way, similar with the approach with @flying008 .
Or we can go with a Batch Macro, which is more dynamic.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi, @mholland
Yes, use the Dynamic Replace Tool to achieve a fully dynamic splitting effect. But the workflow will get complicated. You can also use macros to solve it.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Qiu ,
Thanks for this. We're almost there. The split by delimiter part is working. But it's not splitting dynamically by the max count of delimiters. I have another set of strings with 14 delimiters, using this method seems to lock in only 8.
Also, if you look at your example, you'll see on the first row that the last field has not been split. There should be 9 columns in this example, rather than 8.
Ideally I would want the number of columns to be dynamic, based on the max count of delimiters plus 1.
Any ideas on how to do that?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@mholland
Thanks for the feedback.
Then we can go with the approach of @flying008 , Split the data to rows, then use the Tile and Cross Tab.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@Qiu and @flying008 ... Nailed it! Thank you so much for your help on this, it's working perfectly.
