Alteryx Designer Desktop Discussions

SOLVED

Hopefully easy Regex problem

tristank
11 - Bolide

Good morning all. I am working through a challenge and am struggling to parse the final section.

 

Here is an example row:

0000001Wii Sports Wii (2006) CAT:Sports PUB:Nintendo $41.49million, $29.02million, $3.77million, $8.46million, $82.74million

 

And here is my regex so far:

^(\d{7})(.*) (\<\w+\>) \((\d{4})\) CAT:(.*) PUB:(.*) \$?(.*)

 

What I'm struggling with is separating the sales from the publisher. Since publisher names can contain multiple words, I wanted to split on the first appearance of the dollar sign. I thought my lazy quantifier would achieve this, but I can never get it to work. Instead, it splits at the last appearance (82.74 million).

 

I could approach the problem in a different way, but long term I really want to figure out the lazy/greedy quantifier and why I can never get it to work. Thanks, community, for any insights, and have a beautiful Thursday.

FinnCharlton
13 - Pulsar

Hey @tristank, you need a '?' after the '.*' inside the PUB brackets, before the '\$', i.e.:

^(\d{7})(.*) (\<\w+\>) \((\d{4})\) CAT:(.*) PUB:(.*?) \$?(.*)

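Before, the PUB:(.*) was acting greedy, taking as much as possible and therefore going up to the last $. With the '?' added, (.*?) acts lazy, taking as little as possible while still letting the rest of the pattern match, so in this row the split happens just before the first $. One caveat: the '?' in '\$?' makes the dollar sign optional rather than lazy, so a publisher with a multi-word name would be cut at its first space; writing '\$' without the '?' forces the lazy group to run up to the first $. Hope this helps!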
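
For anyone who wants to check the greedy/lazy behaviour outside Alteryx, here is a minimal sketch in Python. It is an illustration under a few assumptions, not the Alteryx tool itself: it uses Python's re module, it swaps the \< and \> anchors (which appear to be start/end-of-word anchors in the original) for \b because Python's re does not have them, and the second row with a two-word publisher is made up for the demo. It compares the original greedy group, the lazy group with an optional $, and the lazy group with a mandatory $.

```python
import re

# Example row from the question.
row = (
    "0000001Wii Sports Wii (2006) CAT:Sports PUB:Nintendo "
    "$41.49million, $29.02million, $3.77million, $8.46million, $82.74million"
)
# Hypothetical row (not from the thread) with a two-word publisher.
multi_word_row = (
    "0000002Some Game PS2 (2004) CAT:Action PUB:Two Words "
    "$1.00million, $2.00million"
)

# Shared start of the pattern. \b stands in for \< and \> (assumption for Python).
PREFIX = r"^(\d{7})(.*) (\b\w+\b) \((\d{4})\) CAT:(.*) "

greedy = re.compile(PREFIX + r"PUB:(.*) \$?(.*)")          # original pattern
lazy_optional = re.compile(PREFIX + r"PUB:(.*?) \$?(.*)")  # lazy group, optional $
lazy_required = re.compile(PREFIX + r"PUB:(.*?) \$(.*)")   # lazy group, mandatory $


def publisher_and_sales(pattern, text):
    """Return the (publisher, sales) pair captured by groups 6 and 7."""
    m = pattern.match(text)
    return m.group(6), m.group(7)


variants = [
    ("greedy", greedy),
    ("lazy, optional $", lazy_optional),
    ("lazy, required $", lazy_required),
]
for name, pattern in variants:
    for text in (row, multi_word_row):
        print(f"{name:18}", publisher_and_sales(pattern, text))
```

On the example row, the greedy version leaves only "82.74million" in the last group, while both lazy versions give "Nintendo" as the publisher. On the made-up two-word row, only the version with the mandatory $ keeps "Two Words" together.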

cjaneczko
13 - Pulsar

Try this.

(\d+)(\w.*?)\s+(\w+)\s+\((\d+)\).*?CAT:(\w+).*?PUB:(\w+(?:\s+\w+)*)\s+(\$[\d.]+million)(?:,\s*)?(\$[\d.]+million)?(?:,\s*)?(\$[\d.]+million)?(?:,\s*)?(\$[\d.]+million)?(?:,\s*)?(\$[\d.]+million)?
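
In case it helps to see what each group of this pattern captures, here is a minimal Python sketch that runs it unchanged against the example row from the question. Using Python's re module instead of the Alteryx RegEx tool is an assumption for illustration, and the names PATTERN and sales are made up for the demo. Note that the pattern spells out five optional sales groups, so it captures at most five figures per row.

```python
import re

# Example row from the question.
row = (
    "0000001Wii Sports Wii (2006) CAT:Sports PUB:Nintendo "
    "$41.49million, $29.02million, $3.77million, $8.46million, $82.74million"
)

# The pattern from the reply, split across lines only for readability.
PATTERN = re.compile(
    r"(\d+)(\w.*?)\s+(\w+)\s+\((\d+)\).*?CAT:(\w+).*?PUB:(\w+(?:\s+\w+)*)\s+"
    r"(\$[\d.]+million)(?:,\s*)?(\$[\d.]+million)?(?:,\s*)?(\$[\d.]+million)?"
    r"(?:,\s*)?(\$[\d.]+million)?(?:,\s*)?(\$[\d.]+million)?"
)

m = PATTERN.search(row)
game_id, title, platform, year, category, publisher = m.groups()[:6]

# Groups 7 to 11 hold the sales figures; unmatched optional groups are None.
# Strip the leading "$" and trailing "million" to get plain numbers.
sales = [float(s[1:-len("million")]) for s in m.groups()[6:] if s]

print(game_id, title, platform, year, category, publisher)
print(sales)  # [41.49, 29.02, 3.77, 8.46, 82.74]
```

Because the publisher group is written as \w+(?:\s+\w+)*, it can span several words but stops before the $, which is not a word character.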

 

 


tristank
11 - Bolide

Thanks @FinnCharlton, that makes a lot of sense. I guess I was thinking it would just go to the first '$'. Will hit you up on convo next time I have a regex problem ;)

tristank
11 - Bolide

And thanks @cjaneczko, I learned a lot reading through your solution!
