Alteryx Designer Desktop Discussions

SOLVED

RegEx Parse not working for large field

gwood97
6 - Meteoroid

I'm attempting to parse data from an Outlook export to extract IDs, which are in a specific format.

This works in the majority of cases; however, where the body of the email is particularly large, it doesn't seem to find the IDs.

I've attempted to increase the field sizes, but that doesn't seem to have worked.

Can somebody please assist with this?

8 REPLIES
joelmiller66
9 - Comet

@gwood97 do you have a sample of the RegEx you are using, or a sample of the data?

My first thought would be the field size as well, but the issue might be with extra characters in the data, in which case you might have to adjust your RegEx.

gwood97
6 - Meteoroid

Unfortunately, I'm unable to share the Outlook data, but the RegEx is: ([A-Z]{1}\S{3}-\d{5})

An example of the output would be: FATC-00100 or S123-00175

DataNath
17 - Castor

@gwood97 is the issue that you're not finding any IDs at all, or just not all of them? If it's the latter, and these larger fields can contain multiple IDs, then you'll need to use RegEx in Tokenize mode rather than Parse, so that you return every instance of your pattern rather than just the first. Without a sample I also can't say whether you might want to make the pattern a little more specific than \S, so I can't comment much on that.

 

Parse - Returns the first pattern match:

[Image: DataNath_0-1661933949850.png]

Tokenize - Returns all matches:

[Image: DataNath_1-1661933976024.png]
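
For readers without the screenshots, the Parse vs Tokenize difference can be sketched outside Alteryx. This is only an illustration: Python's `re` module stands in for the RegEx tool, and the email body below is made up. `search` returns the first match (Parse-like), while `findall` returns every match (Tokenize-like).

```python
import re

# Pattern quoted earlier in the thread; Python's re module is a stand-in
# for the Alteryx RegEx tool, not a reproduction of it.
ID_PATTERN = re.compile(r"([A-Z]{1}\S{3}-\d{5})")

# Hypothetical email body containing two IDs.
body = "Form FATC-00100 was raised. It supersedes S123-00175 from last week."

# Parse-like behaviour: only the first match is returned.
first = ID_PATTERN.search(body)
first_id = first.group(1) if first else None

# Tokenize-like behaviour: every non-overlapping match is returned.
all_ids = ID_PATTERN.findall(body)

print(first_id)
print(all_ids)
```

If neither approach returns anything for a record that visibly contains an ID, the pattern mode is probably not the cause.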

gwood97
6 - Meteoroid

I've tried Tokenize and the same issue occurs: in certain instances it doesn't find any IDs (returns Null), even though they follow the format of the RegEx. FATC-00100 is an example of this, although it is mentioned at the bottom of the email chain.

I changed the size of the field to 10,000 to ensure all the data pulls through, but it still doesn't seem to pick it up.
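
To illustrate why field size matters when the ID sits at the bottom of a long chain, here is a rough sketch in Python. It assumes that a fixed-size string field simply drops everything past its size; the `truncate` helper and the filler text are hypothetical and only mimic that behaviour.

```python
import re

ID_PATTERN = re.compile(r"[A-Z]\S{3}-\d{5}")

# Hypothetical long email chain: lowercase filler, with the ID only at the bottom.
filler = "lorem ipsum dolor sit amet " * 500  # 13,500 characters of filler
body = filler + "Click here to open this form for FATC-00100 in SYSTEM NAME"

def truncate(text: str, field_size: int) -> str:
    """Mimic a fixed-size string field by keeping only the first field_size characters."""
    return text[:field_size]

for size in (10_000, 20_000):
    ids = ID_PATTERN.findall(truncate(body, size))
    print(size, len(body), ids)
```

In this mock-up a size of 10,000 cuts the text off before the ID, so nothing is found, while 20,000 keeps the whole body. Comparing the actual length of the longest email body against the field size is a quick way to rule this out.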

DataNath
17 - Castor

Is there no way you can provide a sample of this @gwood97, even if you remove sensitive info or mock up the text around the target? It's hard to know what's going on without being able to replicate the behaviour. Field size and Parse vs Tokenize are the only immediate things I can think to check, but if the whole chain is definitely pulling through with the extended size you're setting and you've tried both RegEx methods, then there must be other issues at play.

gwood97
6 - Meteoroid

A sample would be: "Click here to open this form for FATC-00100 in SYSTEM NAME". That is the body of the email extract.

Please let me know if that is sufficient.

DataNath
17 - Castor

@gwood97 is this a true example of one of the longer fields you're having issues with? If not, then we'll need a closer example, as that snippet works as required when tested in https://regex101.com/ and in Alteryx itself:

[Image: DataNath_0-1661974582449.png]
[Image: DataNath_1-1661974592284.png]
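
The same check can be reproduced in a few lines of Python, again only as a stand-in for the RegEx tool. The `stricter` pattern is an assumption about what "more specific than \S" might look like (uppercase letters or digits only, on word boundaries), and the `noisy` string is made up to show where the two patterns differ.

```python
import re

# Sample body quoted in the thread.
sample = "Click here to open this form for FATC-00100 in SYSTEM NAME"

# Pattern quoted in the thread.
original = re.compile(r"([A-Z]{1}\S{3}-\d{5})")

# Assumed tighter variant: one uppercase letter, three uppercase letters or
# digits, a hyphen, then five digits, anchored on word boundaries.
stricter = re.compile(r"\b([A-Z][A-Z0-9]{3}-\d{5})\b")

# Hypothetical text where \S lets punctuation into the ID.
noisy = "Ref A+/=-12345 attached"

sample_original = original.findall(sample)
sample_stricter = stricter.findall(sample)
noisy_original = original.findall(noisy)
noisy_stricter = stricter.findall(noisy)

print(sample_original, sample_stricter)
print(noisy_original, noisy_stricter)
```

Both patterns find FATC-00100 in the sample, which suggests the pattern itself is not what drops the ID in the longer records.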

gwood97
6 - Meteoroid

I've now realised it was an upstream flow issue filtering out the data - thank you for your help!
