Using Regex to extract a specific string from a column and placing it another column
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi I have this type of data can post the original due to sensitivity. As you can see there is this ID in random places in data 2ABC-20-06-XXX-026 all IDs are of same number of 18 Characters and all have XXX in common but there position in the text is different as can be seen below in the Input column. I want the ID extracted and placed in output column as can be seen!! Please need help as i am a newbie to regex
Input | Output |
This is 2ABC-20-06-XXX-026 and it is 26 | 2ABC-20-06-XXX-026 |
Can I be 2ABC-20-06-XXX-027 what can I be | |
Are you sure 2ABC-20-06-XXX-028 can it be true | |
how doubt full ssss deec eljjdjdjd 2ABC-20-06-XXX-029 | |
Wwyy abdhkjkjkdjkfj 2ABC-20-06-XXX-030 cjdnknskk shdjdhdu | |
2ABC-20-06-XXX-678 as it can be airport |
Solved! Go to Solution.
- Labels:
- Expression
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
is XXX always numbers or can it be letters?
regex_replace([Input ],".*(2ABC-20-06-\w{3}-\d{3}).*","$1") would work for either letters or numbers.
regex_replace([Input ],".*(2ABC-20-06-[a-zA-Z]{3}-\d{3}).*","$1") - looks for 3 letters.
edited both to allow for different trailing identifiers.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
XXX is actually alphabets and always XXX in each ID
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
hi @ahsansalik
if the position of the hyphen "-" is fixed, you can do something like this. I am assuming that the actual alphanumeric values between the hypens can vary.
Cheers,
Dawn
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
It looks to me that the XXX is always after 11 (random) places and before 4 random places, right?
A random character is represented by a . in RegEx. Thus my simplest parsing approach which is quite comprehensive for newbies would be:
(...........XXX....)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Or if only the last places change in the ID, it'll be parsing:
(2ABC-20-06-XXX....)
Please mark it as a solution if that answers the question - it helps Alteryx and other members to distinguish between solved and unsolved topics.
Thanks
Matthias
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks all i was just testing the solutions in order to verify the output ... Thank you all solutions work like a Gem... really grateful
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks man it works perfectly
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Dawn can you please share your work flow
