Regex to extract Text and Number into different columns
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi Champions,
i need help in extracting text and numbers into different columns from this
What I have is this:-
" Arizona 0 0 (US-AZ) Colorado 0 0 (US-CO)"
"Qatar (QA) 0 0 United Arab Emirates 147 0 (AE)"
" Missouri 70 0 (US-MO)"
" Utah (US- 0 0 UT) Total 217 0"
What I need is, i.e. Arizona in one column and zero in another
Arizona-0, Colorado-0
Qatar-0, United Arab Emirates 147
Missouri-70
Utah-0, Total 217
Solved! Go to Solution.
- Labels:
- Regex
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
What does the comma mean? Should the output look like this:
Arizona | 0 | Colorado | 0 |
Or should each pairing be in a new row?
Arizona | 0 |
Colorado | 0 |
Also, there is very little standardization in your data (Very important for Regex). Is what you have included, representative of every case in your data?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi, @Pranab_C
FYI.
Input | |||||
Txt | |||||
Arizona 0 0 (US-AZ) Colorado 0 0 (US-CO) | |||||
Qatar (QA) 0 0 United Arab Emirates 147 0 (AE) | |||||
Missouri 70 0 (US-MO) | |||||
Utah (US- 0 0 UT) Total 217 0 | |||||
Output | |||||
Txt | Match | State | ID | SKU | Num |
Arizona 0 0 (US-AZ) Colorado 0 0 (US-CO) | Arizona 0 0 Colorado 0 0 | Arizona | 0 | Colorado | 0 |
Qatar (QA) 0 0 United Arab Emirates 147 0 (AE) | Qatar 0 0 United Arab Emirates 147 0 | Qatar | 0 | United Arab Emirates | 147 |
Missouri 70 0 (US-MO) | Missouri 70 0 | Missouri | 70 | ||
Utah (US- 0 0 UT) Total 217 0 | Utah 0 0 Total 217 0 | Utah | 0 | Total | 217 |
REGEX_Replace(REGEX_Replace(REGEX_Replace([Txt], '\s\([a-z\-]+?\)(?=[^"$])', ''), '\([a-z\-]+?([\d\s]+?)[a-z]+?\)(?=[^"$])', '$1'), '\s\([a-z\-]+?\)(?="|$)', '')​
Parse:
^["\s]+([[:alpha:]]+)\s+(\d+)\s[\d\s]+(?:([[:alpha:]\s]+)\s+(\d+))?​
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thank you so much for taking a look at this, worked for all except the row that contains
"Qatar 0 0 United Arab Emirates 147 0"
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi, @Pranab_C
So, like this string of "Qatar 0 0 United Arab Emirates 147 0", what result are you want get ?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Input
Qatar 0 0 United Arab Emirates 147 0
Desired Output
Qatar| 0 | United Arab Emirates| 147|
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thank you once again for quickly taking a look at this, can you share the workflow that you have created.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi, @Pranab_C
There is only 2 step for workflow, and i post all expression in above post, you just copy it to your canvas to get it.
If you can post your want output result data as table, maybe i rebuild the flow as pre.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
