Parsing concatenated string
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hello,
I'm pretty new to Alteryx and am trying to process a dataset that includes a concatenated string column comprised of user-entered items and quantities. I suspect RegEx would be the way to handle this, but don't have enough experience with it and was hoping someone could help out. I have attached a very simplified example of what the incoming data would look like and the desired output.
- Input data is from a user-entered list of items (I have used fruit in the example to simplify matters: apples, bananas, grapes, pears and oranges).
- There may be spaces between the items or not ("1apple 1 banana", "1 apple1banana", "1apple 1banana", etc.)
- The items may be entered as singular or plural ("apple" or "apples") but this does not affect the quantity recorded.
- If the item in the list is preceded by an integer, that integer represents the quantity. If there is no integer, the quantity is 1. If the item is not present in the list, it is recorded as either null or 0 (doesn't matter which, easily changed later, 0 is better)
- The items of interest in the list are contained in a separate text input
- Anything not processed is collected in another column to later be checked for typos, etc.
Basically I want to parse the input data for the items in the list and store the quantity of each item in a separate column, with the remaining string in a separate column.
Any help would be greatly appreciated!
Thank you 🙂
In the example attached:
Input data:
Input |
1BANANAAPPLE 2ORANGES1GRAPES 1PEAR |
1APPLESPEARS 2BANANA ORANGE |
1GRAPE1ORANGES PEAR1APPLESPEARS |
Item list:
Items of Interest |
Apple |
Banana |
Grape |
Desired output:
Input | Apple | Banana | Grape | Other |
1BANANAAPPLE 2ORANGES1GRAPES 1PEAR | 1 | 1 | 1 | 2ORANGES1PEAR |
1APPLESPEARS 2BANANA ORANGE | 1 | 2 | PEARSORANGE | |
1GRAPE1ORANGES PEAR1APPLESPEARS | 1 | 1 | 1ORANGESPEARPEARS |
Solved! Go to Solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Ross_K, I was able to process your data and get it to a point where you are able to view the Items of Interest in a tabular format. As for those that don't, I'll get back to you once I spend more time with it. Hope this helps.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @Ross_K ,
I'm attaching a solution where I'm using the regex tool to separate all appearances of the 3 fruits and also a regex_replace function to build your 'others' column.
Best,
Fernando Vizcaino
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@Ross_K - attached is my updated solution. I confess using @fmvizcaino's approach for the Other's path. His regex piece was my missing link.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thank-you both @fmvizcaino and @AbhilashR for the prompt replies and accurate solutions. I decided to go with that of @fmvizcaino given the elegance and minimal steps of the solution, but both achieve the result very well.
Thanks 🙂
