Hi All- Trying to find the best way to parse out the input table below:
Started by using text to columns with a Delimiter + Find and replace but that didnt seem to work so trying Regex with 3 different flows but need some assistance with forming the expression. Can you please guide me to the best solution? Thank you!
Input | |
ID | Text |
1 | Acct: 1234 Acct Name: test Ref: test2 |
2 | Acct: 12345 Acct Name: test1 |
3 | Ref: test3 |
*This cell has embedded newlines.
Output | |||
ID | Account | Account Name | Reference |
1 | 1234 | test | test2 |
2 | 12345 | test1 | |
3 | test3 |
Solved! Go to Solution.
Hi @ah108
Here is how you can do it.
Workflow:
1. Using text to column to split to rows on newline character.
2. Using regex tool to split into name and value.
3. Using crosstab with key as ID converting to table with name as name and value as value.
You can rename as needed.
Hope this helps : )
Thanks for your assistance! It worked with most scenarios but forgot to consider the below:
There is a possibility that all tags - Acct, Acct Name, & Ref - might be more than one row so the current WF is not working for these.
4 | Acct: 1234678910111213141516 17 Acct Name: test1 test2 test3 test4 test5 Ref: test2 |
Hi @ah108
Here is how you can do it. I am assuming Acct, Acct Name, & Ref are the only existing tags.
1. Using formula tool I am replacing Acct, Acct Name, & Ref with |Acct, |Acct Name, & |Ref (adding pipe before the tag so that its easier to split). I am using unmarked group (?:) regex method to create keyword list which I picked up from @Jean-Balteryx recently 🙂
2. Using text to column to split on pipe. This way even if there is line break it wont affect.
3. Using data cleansing to remove line breaks.
4. Using regex to split to col name and value.
5. Using crosstab to convert it to table.
Hope this helps : )
This worked perfectly! Thank you!