If anyone knows why the below formula in Formula tool is returning (?=.*\\b\\b) everytime then let me know....I'm just following CHATGPT
IIF(Length(Trim([K1_2]))>0, "(?=.*\\b" + Regex_Replace([K1_2], "([.^$|()\\[\\]{}*+?\\\\])", "\\\\$1") + "\\b)", "")
tell Chat Gpt that you are using perl regex --- that might help this:
(?=.*\\b" + Regex_Replace([K1_2], "([.^$|()\\[\\]{}*+?\\\\]) --- is a mangled mess of multiple regex terms thrown together in a string.
(?=.*\\b" ---> probably a regex lookup group... here's it's a string (it's in quotes -- -not a regex function)
Regex_Replace([K1_2], "([.^$|()\\[\\]{}*+?\\\\] --- not set up for alterx regex looking for a path you'd use a single \ for an escape character --- not two.... --- this "\\\\$1 is probably off --- you probably want your group \{yourgroup} ---this would give you \\{your group}
i dont follow what you've said but that's not unexpected. Co-pilot which i'm also consulting with this has issued this reply when i fed it through on the chat feed for the solution
great question—and yes, that comment from the Alteryx discussion forum does highlight some important regex nuances, but it also contains a few misunderstandings about how Alteryx handles regex and string escaping.
Let’s break it down and clarify what’s actually relevant and helpful for your situation:
Alteryx uses Perl-style regex:
✔️ Correct. Alteryx’s Regex_Replace function is based on Perl-compatible regular expressions (PCRE), so understanding Perl regex behavior is helpful.
Escaping special characters:
✔️ Correct. You need to escape special regex characters like . ^ $ | ( ) [ ] { } * + ? \ when building dynamic patterns.
Avoiding mangled strings:
✔️ Important. If you’re building a regex pattern as a string, you must escape properly within Alteryx’s formula syntax, which is not the same as writing raw regex in a script.
“You probably want your group {yourgroup}”:
❌ Not applicable here. You're not using named groups or backreferences—you're escaping special characters dynamically. \\\\$1 is correct in Alteryx to insert a literal backslash before the matched character.
“Use a single \ for an escape character”:
❌ In Alteryx formulas, you need double backslashes (\\) to represent a single backslash in the final regex string. And because Alteryx itself escapes backslashes, you often need four backslashes (\\\\) in the formula to produce two in the output regex.
Here’s the correct version again for clarity:
Would you like to test this in a sample workflow or apply it across multiple fields dynamically?
Regen is probably just coding that I dont understand so if AI can't fix it I'm clueless
you can trust copilot --- or you can provide me an example of what you are trying to do --- and what you are trying to get out of it...
1) I am correct. Copilot is junk.
2) I can help you. Copilot can't help.
It's up to you how you want to proceed.
Copilot will try to give you overly complicated regex when 9 times out of 10 something much more straight forward will work.
Provide a sample string to test this against.
there are two threads to the workflow, one is a list of invoices and one is a reference sheet to correctly codify a concatenated column within the list of invoices. So the concenate is vendor + expense type. But this concenate will bring back values that should be categorized together but won't be because they differ slightly. Once they are categorized correctly, the sum of the invoices in the same category is tallied.
The other thread is the list of the codified categorisations. So it guides the words present in the concanteate that can be predicted and brings it to a codified vendor if all the words in the concatenate are matched.
So for example the concenate may generate Great Bank TA Networking fees on one invoice and GM TA Networking fees for another invoice. But both these have to be classified and amount tallied as Great Bank TA Networking fees for an eventual list of all vendors. There are two lines on the reference sheet for this one: 1- Great Bank TA Networking 2- GM TA Networking both leading to the codifiedvendor column which has a value of Great Bank TA Networking fees .
What i spend yesterday trying to do is setting up the workflow thread of the reference part so it can be set against the list of invoices. Following copilot and chatgbp i put it through so columns were renamed in a select tool and the multirow field formula moved the keywords into lower case.
THe formula tool that isn't working (is just returning lines of a formula as an output) was next. Not sure what its trying to achieve, probably trimming the keywords so just going to move on and do the next step.
Hopefully ive explained it in a way it can be understood. Maybe there are other tools that can do the same job only better, Maybe Fuzzy match can be utilized. Ive never used anything but the easy headline tools before.
Let me know if you have questions and thanks for your help.
It would be helpful if you could send a list of examples so that the solution is dynamic.
ok, to explain with the files that are altered to remove client data. The two files should work in an alteryx workflow as follows:
1- each line of the Proprietary data file should have the words taken from Column A
2- Those words should be taken into the split mappings file and if there is an exact match to all of the keywords in a row from columns A-D (a lot of cells are blank so needs to be worked in) then there is a match
3- the consequence of this match is the codified vendor in column E is selected
4- the amount of the invoice on the proprtiary data is tallied to that codified vendor
So take an invoice in the prop data is
The words GB, Collateral, Management are found on the keywords section of the Split mappings attachment
so the figure of the invoice 43852.46 should be tallied to the codified vendor: Great Bank Collateral Management Fund Lvl.
The reason for doing this is I need a list to show where the money on these invoices is going. The invoice map in column A of the prop data is too subject to small differences, so the aim is to filter it through keywords, codify it /standardize it and then tally the amounts to the codified/standardized list of vendors.
You are looking to fuzzy match. This is the wrong approach. A better approach is:
1) recordid.
2) split column 1 to rows on space.
3) wordmappsource recordid
3) join tool vs your word/mapper source.
4) summarize tool ->group by recordID ist/recordid wordmapper source count on recordid list
5) summarize tool -> recordid list -> max count
6) join this entry back to your prior lists --- so you can identify which item matched the most words from your word mapper --- and the join back to get the value.