Regex - match everything after the second to last dash
Hi all, I am trying to match everything after the second-to-last dash in a file that contains strings with hyphens or dashes. Each line is a string made up of letters, numbers, and dashes. I would just like to extract everything after the second-to-last dash.
See examples below:
ab2372347234tfashdjfawhewe-c3
ab534572hasdfaj32rahafd32135466324-c1
dude-fhdhe12-ab5438472dsfawe35b3-c4
abtt78923dahdfajh23498kljcxcze
8ver238234923adfadhfahdabc33
gretorg-xsss47-hireball-rhetaama3-01
cn3-vov-4
YCMCDE7888132892-5
gretvrg-dae01-hetprotesh-dug-02
pret-grab01-vov-sebastian-03
Therefore, for "gretvrg-dae01-hetprotesh-dug-02" I would just like "dug-02", and the next step would be to have "dug" in one column and "02" in another. For "cn3-vov-4" I would like "vov-4", and the next step would be to have "vov" in one column and "4" in another. For "dude-fhdhe12-ab5438472dsfawe35b3-c4" I would like "ab5438472dsfawe35b3-c4", and the next step would be to have "ab5438472dsfawe35b3" in one column and "c4" in another. And so on.
Some have four dashes, some have three, two, or one, and some have none, as you can see.
That is why I would like to grab everything after the second-to-last dash. The ones that don't have a dash are no big deal, because I am planning to just bring those in at the end anyway. I have been working on this all day without success, so I would appreciate any help.
I tried this in a RegEx tool with the Tokenize output method. The closest I can get is the following, which returns everything after the last dash: [^-]+$
Thank you so much for the help!
Does this help?
.*-(.*?-.*)
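For readers who want to try the expression outside the RegEx tool, here is a minimal sketch in Python, assuming Python's re module treats this particular pattern the same way the tool does. The function name after_second_to_last_dash is made up for illustration. Note that a string with fewer than two dashes does not match at all, so it comes back as None.

```python
import re

# The pattern from the reply above, unchanged.
PATTERN = re.compile(r".*-(.*?-.*)")

def after_second_to_last_dash(text):
    """Return the text after the second-to-last dash, or None when
    the string has fewer than two dashes (the pattern cannot match)."""
    match = PATTERN.fullmatch(text)
    return match.group(1) if match else None

# A few of the sample strings from the question.
samples = [
    "gretvrg-dae01-hetprotesh-dug-02",
    "cn3-vov-4",
    "dude-fhdhe12-ab5438472dsfawe35b3-c4",
    "YCMCDE7888132892-5",
    "abtt78923dahdfajh23498kljcxcze",
]

for sample in samples:
    print(sample, "->", after_second_to_last_dash(sample))
```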
Cheers,
Mark
Chaos reigns within. Repent, reflect and restart. Order shall return.
Please Subscribe to my youTube channel.
@MarqueeCrew It does work, I believe! I am going to run it through a few more files but yes I appreciate your help. Can you clarify/explain what your solution is doing, though, so I may better understand the regex syntax? Thank you so much for the help!
Try regex101.com and enter the expression there. It will give you a very thorough explanation.
Glad to have helped.
Cheers,
Mark
Okay, that makes sense conceptually, at a high level. Looking at the syntax, can you explain what each part of your expression does? I am having trouble working that out from what you provided.
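.* matches any character zero or more times. It is greedy, so it takes as much of the string as it can while still letting the rest of the pattern match.
- matches the dash just before the group of characters contained in the parentheses. Because the group itself must contain a dash, this ends up being the second-to-last dash.
(.*?-.*) is the group that captures the desired text.
.*? matches any characters, as few as possible, up to the first occurrence of the next part of the pattern.
- matches a dash, which here is the last one.
.* matches everything else.
By defining the group as the text before and after the last dash, I achieved your objective. Hopefully this solves the challenge.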
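For the follow-up step in the question (one piece per column), one option is to give each piece its own capture group. The sketch below is in Python and is only an illustration: the pattern .*-(.*?)-(.*) is a small variation on the expression above, not something posted in this thread, and the function name split_last_two_segments is hypothetical.

```python
import re

# Variation on the original pattern: the dash between the two pieces
# is moved outside the groups, so each piece is captured separately.
SPLIT_PATTERN = re.compile(r".*-(.*?)-(.*)")

def split_last_two_segments(text):
    """Return (second_to_last, last) as a tuple, or None when the
    string has fewer than two dashes."""
    match = SPLIT_PATTERN.fullmatch(text)
    return match.groups() if match else None

print(split_last_two_segments("gretvrg-dae01-hetprotesh-dug-02"))
print(split_last_two_segments("cn3-vov-4"))
print(split_last_two_segments("YCMCDE7888132892-5"))
```

In a tool that writes one output column per capture group, the two groups would correspond to the two columns described in the question.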
Will we meet at inspire in Nashville?
Cheers,
Mark
@MarqueeCrew Thanks for the explanation(s). In response, what does the ? do in .*? before the dash? And what in this solution makes it look at the end (i.e. last two dashes) and not any prior sequence of that (i.e. what in the solution is telling it to look at the last two dashes and not the first two or middle two dashes)?
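.*- is greedy: it finds the last dash that still lets the rest of the pattern match.
.*?- is lazy (that is what the ? does): it finds the first dash.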
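The difference between the greedy and the lazy form is easiest to see when each one is run on its own. This is a small Python sketch with made-up function names, assuming the same greedy and lazy rules apply as in the RegEx tool.

```python
import re

def greedy_prefix(text):
    """Match '.*-' from the start: as many characters as possible,
    so the match ends at the LAST dash in the string."""
    return re.match(r".*-", text).group(0)

def lazy_prefix(text):
    """Match '.*?-' from the start: as few characters as possible,
    so the match ends at the FIRST dash in the string."""
    return re.match(r".*?-", text).group(0)

sample = "gretvrg-dae01-hetprotesh-dug-02"
print(greedy_prefix(sample))  # gretvrg-dae01-hetprotesh-dug-
print(lazy_prefix(sample))    # gretvrg-
```

In the full expression .*-(.*?-.*) the leading greedy part cannot stop at the very last dash, because the group still needs a dash of its own. It backs off by one dash, which is why the group starts right after the second-to-last dash.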
@MarqueeCrew Okay I think I'm starting to get it more. Is that a rule with the ? then (i.e. does always putting a ? before something mean you find the first occurrence of that thing/sequence?)
https://www.oreilly.com/library/view/mastering-regular-expressions/0596528124/
You can also search YouTube. I have an intro here:
Alteryx RegEx Beginner Tutorial
https://youtu.be/pTZj2U2SDFA
Cheers,
Mark
