Alteryx Designer Desktop Discussions

fmilliren · ‎03-08-2024

Edited for copy/paste issues on expressions.

I am trying to build an expression for REGEX_CountMatch() but I'm struggling to get it to work. I have two different potential data scenarios.

I have no problem individually identifying if the first or last character is the caret, a comma or whatever else I'm looking for. So if I could get the "middle" part to work that would be acceptable, though a single REGEX to evaluate the entire string is what I would prefer to have.

I know I could run the text to columns tool and push them into Rows then filter and then count/summarize but I was looking for something a little slicker, less cumbersome and more easily reproduceable in workflows using REGEX in a formula tool rather than building out a whole separate stream in the workflows. I've used the REGEX_CountMatches in more simple scenarios and it's worked wonderfully.

I have also tried solutions from bingchat and the Alteryx suggested regex calculator (regex101 website) to no avail.

---------------------------------------------------------------------------------------------------

1st scenario - I would like to count the instances of ^ but only where it is the only character between the comma delimiters even if at the beginning or end of the string by itself.

So the count result should be 5

^,ABC,^,^,C^,D1E,F^G,H33,^,JKL,^2,^

I've tried the following with little success, most return 0 for the count.

REGEX_CountMatches([MY_FIELD], '\^,') -- This kind of works but also counts something like "C^", which I don't want

REGEX_CountMatches([MY_FIELD], ',\^')

REGEX_CountMatches([MY_FIELD], ',\^,')

REGEX_CountMatches([MY_FIELD, '^\^+$') -- I even tried this one where the data had two caret symbols instead of just one.

REGEX_CountMatches([MY_FIELD], '^\^')

REGEX_CountMatches([MY_FIELD], '\^+$')

REGEX_CountMatches([MY_FIELD],'(?:(?<=^),|,(?=,)|,(?=$))')

REGEX_CountMatches([MY_FIELD],'(?<=,|^)\^+(?=,|$)')

REGEX_CountMatches([MY_FIELD],'?<=,|^)\^+(?=,|$')

---------------------------------------------------------------------------------------------------

2nd scenario - I would like to count the instances where the delimiters are together, no data in between plus when they begin/end the string

So the count result should be 6

,ABC,,DEF,,,G11,23H,,

I've tried the following with no success, all return 0 for the count.

REGEX_CountMatches([MY_FIELD], ',{2}')

REGEX_CountMatches([MY_FIELD], ",{2}")

REGEX_CountMatches([MY_FIELD], ",,")

REGEX_CountMatches([MY_FIELD],'(?:(?<=^),|,(?=,)|,(?=$))')

dsandmann · ‎03-08-2024

For the first case, REGEX_CountMatches([MY_FIELD], '(?:^|,)\^(?=$|,)')

The matching function starts at location 0, so doing a lookbehind at the start won't work since it does not exist - I'm sure you were getting an error on those attempts. This match take the start or a comma, and looks ahead for the end or another comma.

For the second one - REGEX_CountMatches([MY_FIELD], ',(?=$|,)')+IIF(LEFT([MY_FIELD],1)=',', 1, 0)

Similar to the above, but because of the way the function solves, you need to separately test for does it go ,[,|end] vs does it start with ,. When this function brings parts of the string in as match, it "locks" that bit of the string, and will not try to count it again. In the event that the string you're testing starts with multiple commas this would always result in it missing that first case since we can't use lookbehind.

Highly recommend continuing to use https://regex101.com/ for future testing - it will also help you understand what is happening within the regex string, but there will always be some ambiguity

fmilliren · ‎03-08-2024

works like a dream thank you!

Alteryx Designer Desktop Discussions

In need of assistance with building expression for REGEX_CountMatches