Hi!
Could someone please help me write a regex to pull specific rows of data from the attached file below (highlighted)? Basically, I'm trying to pull the rows starting with "Total ###" (3 digits). There are several rows starting with "Total", but I just want the ones followed by 3 digits and then "-". Thanks so much in advance!
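For reference, the requirement described above can be sketched in Python's `re` module. This is only an illustration: the sample rows are made up (the attached file is not reproduced here), and the Alteryx RegEx tool may need the pattern adapted. `^Total ` anchors the match at the start of the row, `(\d{3})` captures exactly three digits, and `-` requires the dash right after them.

```python
import re

# Hypothetical sample rows; the real attached file is not available here.
rows = [
    "Total 123- Widgets 45.00",
    "Total Sales 99.00",
    "Total 12- Short code",
    "Total 4567- Long code",
    "Subtotal 123- Other",
]

# "Total ", then exactly three digits, then a dash, at the start of the row.
pattern = re.compile(r"^Total (\d{3})-")

matches = []
for row in rows:
    m = pattern.match(row)
    if m:
        # Keep the captured 3-digit code together with the full row.
        matches.append((m.group(1), row))

for code, row in matches:
    print(code, "|", row)
```

Only the first row is kept: the others either lack digits after "Total", have two or four digits before the dash, or do not start with "Total".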
Hi @TomWelgemoed ,
Thank you so much for your help! The solution is great. However, I have a question for future reference: what is really the difference between "+" and "*"?
In your expression, you used (\d+) instead of (\d*). I'm aware that * means "zero or more" and + means "one or more". Would you mind providing an example of when I should use * instead of +, or vice versa? I find this super confusing sometimes.
Thanks again for your help! Really appreciate this.
Hi @TomWelgemoed ,
Just tagged you here again in case you missed my last reply. Please see my previous reply when you have a chance. Thanks!
Hi @Wynn ,
Firstly, apologies for the delay. I did see your note and it was on my list to respond to!
Secondly, good on you for trying to really understand it.
The reason I didn't get back to you straight away is that I needed to understand it better myself. I'm no regex expert, and often have to test things to learn them myself. If you don't know it already, I can really recommend https://regexr.com/ for trying out regex - you can just cut & paste your text there and try out different methods.
To test your question, I simply expanded our set of data with some other numeric examples and then compared the 2 outputs. So take this as an investigative result, rather than an expert's view:
To me it looks like the \d* method only really works on values that start with a number - it doesn't appear to be very good at picking out the digits that follow text. See image below. This is most likely because \d* is allowed to match zero digits, so it can succeed with an empty match at the very start of the text before it ever reaches the digits. The \d+ method, however, does appear to be very good at picking out the first group of digits it comes across (that match the pattern specified). Also, when I put the \d* method on regexr.com, a warning pops up on the top right, saying that this method may have some undesired effects. So my 2 cents: probably best to avoid it as \d+ seems to cover all the required scenarios. Attaching the workflow for you.
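The comparison above can be reproduced outside regexr with a small Python sketch. This assumes Python's `re` module rather than the Alteryx RegEx tool, and the sample strings are invented for illustration. It also shows one case where * is the better choice: when the digits are genuinely optional between two fixed pieces of text.

```python
import re

# Made-up test strings: digits after text, digits only, and no digits at all.
samples = ["Total 123-", "abc456", "789", "no digits"]

def first_match(pattern, text):
    """Return the text of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# \d+ needs at least one digit, so the search skips ahead to the first digit run.
plus_results = {s: first_match(r"\d+", s) for s in samples}

# \d* accepts zero digits, so it matches the empty string at position 0
# unless the text happens to start with a digit.
star_results = {s: first_match(r"\d*", s) for s in samples}

print(plus_results)
print(star_results)

# Where * is useful: the digits are optional but the surrounding text is fixed.
optional_digits = re.search(r"Total (\d*)-", "Total -")   # matches, empty group
required_digits = re.search(r"Total (\d+)-", "Total -")   # no match at all
print(repr(optional_digits.group(1)), required_digits)
```

With \d+ the digits are found in the first three strings and nothing is returned for the last one. With \d* only "789" yields digits, and the other three give an empty match. So + fits when the digits must be present, and * fits when they may be missing and the rest of the pattern pins down where to look.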
Best,
Tom
Hi @TomWelgemoed ,
Thanks so much for taking the time to answer my questions. It is really helpful. I truly appreciate your effort.