I'm attempting to parse data from an Outlook export to extract IDs, which are in a specific format.
This works in most cases; however, where the body of the email is particularly large, it doesn't seem to find the IDs.
I've attempted to increase the field sizes, but that doesn't seem to have worked.
Can somebody please assist with this?
@gwood97 do you have a sample of the RegEx you are using, or a sample of the data?
My first thought would be the field size as well, but the issue might be with extra characters in the data, in which case you might have to adjust your RegEx.
Unfortunately, I'm unable to share the Outlook data, but the RegEx is: ([A-Z]{1}\S{3}-\d{5})
An example of the output would be: FATC-00100 or S123-00175
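For anyone wanting to check the pattern outside Alteryx, here is a minimal sketch using Python's re module as a stand-in (an assumption: Alteryx's own engine and tool settings may differ in details such as case sensitivity). The helper name is_id is hypothetical. It confirms that both example IDs match, and also shows that \S is loose enough to accept punctuation.

```python
import re

# Pattern as quoted in the thread: one uppercase letter, three
# non-whitespace characters, a hyphen, then five digits.
ID_PATTERN = re.compile(r"([A-Z]{1}\S{3}-\d{5})")


def is_id(text: str) -> bool:
    """Return True when the whole string is exactly one ID (hypothetical helper)."""
    return ID_PATTERN.fullmatch(text) is not None


print(is_id("FATC-00100"))  # True
print(is_id("S123-00175"))  # True
print(is_id("fatc-00100"))  # False: Python's re is case-sensitive by default
print(is_id("A(b)-12345"))  # True: \S also accepts punctuation
```

If the IDs only ever contain letters and digits, something like [A-Z][A-Z0-9]{3}-\d{5} would be a tighter option, but that is a guess about the ID format rather than something confirmed here.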
@gwood97 is the issue that you're not finding any IDs at all, or just not all of them? If it's the latter and these larger fields can have multiple IDs within them then you'll need to use RegEx in Tokenize mode, rather than Parse, so that you return every instance of your pattern rather than just the first. Without you sharing a sample I also don't know whether you might want to make the pattern a little more specific than using \S but again can't comment much.
Parse - Returns the first pattern match:
Tokenize - Returns all matches:
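The first-match versus all-matches difference can be sketched outside Alteryx as well. In this rough Python analogue, re.search plays the role of Parse and re.findall the role of Tokenize; the body text is made up for illustration and the mapping to the Alteryx modes is an approximation, not a statement about how the tool is implemented.

```python
import re

ID_PATTERN = re.compile(r"([A-Z]{1}\S{3}-\d{5})")

# Hypothetical email body containing two IDs.
body = "Forms raised: FATC-00100 and S123-00175 in SYSTEM NAME"

# Parse-like behaviour: stop at the first match.
first = ID_PATTERN.search(body)
first_id = first.group(1) if first else None

# Tokenize-like behaviour: collect every non-overlapping match.
all_ids = ID_PATTERN.findall(body)

print(first_id)  # FATC-00100
print(all_ids)   # ['FATC-00100', 'S123-00175']
```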
I've tried with Tokenize and the same issue occurs: for certain records it doesn't find any IDs (returns Null), even though they follow the format of the RegEx. FATC-00100 is an example of this; it is mentioned at the bottom of the email chain.
I changed the size of the field to 10,000 to ensure all the data pulls through, but it still doesn't seem to pick it up.
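To illustrate why field size is worth ruling out, here is a small Python sketch of what truncation would do to an ID sitting at the bottom of a long body. The padding text is invented, the 10,000 limit mirrors the field size mentioned above, and slicing a string is only a stand-in for however a field might actually be cut short.

```python
import re

ID_PATTERN = re.compile(r"([A-Z]{1}\S{3}-\d{5})")

# Hypothetical long email chain: lowercase filler, then the ID at the very end.
padding = "lorem ipsum " * 2000  # 24,000 characters, none uppercase
body = padding + "Click here to open this form for FATC-00100 in SYSTEM NAME"

FIELD_SIZE = 10_000  # assumed limit, mirroring the size set in the workflow
truncated = body[:FIELD_SIZE]

print(ID_PATTERN.findall(body))       # ['FATC-00100']
print(ID_PATTERN.findall(truncated))  # []
```

Length on its own does not stop the pattern from matching; the ID is only missed if the text containing it never reaches the RegEx step.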
Is there no way you can provide a sample of this @gwood97, even if you remove sensitive info/mock up text around the target? Hard to really know what's going on without being able to replicate the behaviour. Field size and using Parse vs Tokenize are the only immediate things I can think to check but if the whole chain is definitely pulling through with the extended size you're setting and you've tried with both RegEx methods then there ought to be other issues at play.
Sample would be: Click here to open this form for FATC-00100 in SYSTEM NAME - that is the body of the email extract
Please let me know if that is sufficient
@gwood97 is this a true example of one of the longer fields you're having issues with? If not, then we'll need a closer example, as that snippet works as required when tested in https://regex101.com/ and in Alteryx itself.
I've now realised it was an upstream flow issue filtering out the data - thank you for your help!