I have a .txt file where the data is somewhat unstructured, as shown in the two attachments, especially page 2. Highlighted in both are the data I would like to bring into Alteryx as a table with the field names (e.g. PAYT GRP ID/NAME) and their corresponding values (e.g. 24/MONTHLY AFIS). What is the best way to capture this information and bring it in? The other issue is that this txt document can run anywhere from 3 to 15 pages, but I only need the information from pages 1 & 2. Thanks.
Hi @Jake5
Sorry for the delay in answering, I just wanted to take the time to do it properly.
1-Go to this website: https://regex101.com/
2-Fill the test string field with this full text (including the spaces before/after):
INCLUDE/EXCLUDE GROUP ID/NAME: 5/QUARTERLY FI - A RKCONNECT
3-Fill the regular expression field with this:
\s{4,}(.*?)\s{4,}(.*?)\s{4,}.*
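Description of what each part of the expression matches, numbered left to right (only parts 2 and 4 are in parentheses, so those are the two values that get captured):
Part 1-Four or more whitespace characters
Part 3-Four or more whitespace characters
Part 5-Four or more whitespace characters and everything after them
Part 2-Smallest sequence of characters between Part 1 and Part 3
Part 4-Smallest sequence of characters between Part 3 and Part 5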
Hope this makes sense, you will get the hang of it with practice.
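If you would rather try the expression outside regex101, here is a minimal Python sketch of the same idea. The function name parse_line is just for illustration, and the runs of six spaces in the sample line are an assumption, since the exact spacing of the real report is not reproduced in this post.

```python
import re

# Expression from the steps above. Only the two (.*?) pieces are in
# parentheses, so they are the two captured values.
PATTERN = re.compile(r"\s{4,}(.*?)\s{4,}(.*?)\s{4,}.*")


def parse_line(line):
    """Return (field_name, value) for one report line, or None if no match."""
    match = PATTERN.match(line)
    return match.groups() if match else None


# Hypothetical sample line: the six-space gaps are assumed, not taken
# from the real file.
sample = (
    " " * 6
    + "INCLUDE/EXCLUDE GROUP ID/NAME:"
    + " " * 6
    + "5/QUARTERLY FI - A RKCONNECT"
    + " " * 6
)

field_name, value = parse_line(sample)
print(repr(field_name))
print(repr(value))
```

The single spaces inside the field name and the value are left alone because each separator needs at least four whitespace characters in a row.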
Felipe - I really appreciate that you took the time to explain this....very helpful! For the '4 or more' expression in this formula, I'm trying to understand how you decided on 4. Is it because any values in groups two and four had 3 or fewer spaces so the '4' expression avoids grouping it with a spaced value?
@Jake5 to be honest with you, I tried 4 and it worked, haha, but some other values would work too.
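To see which other values would work, here is a small Python sketch that tries several minimum gap sizes on one line. The helper name split_fields and the six-space gaps in the sample line are assumptions made for illustration; the real file may be spaced differently.

```python
import re


def split_fields(line, min_gap):
    """Apply the same expression, using min_gap as the minimum run of whitespace."""
    pattern = re.compile(
        r"\s{%d,}(.*?)\s{%d,}(.*?)\s{%d,}.*" % (min_gap, min_gap, min_gap)
    )
    match = pattern.match(line)
    return match.groups() if match else None


# Hypothetical sample line with assumed six-space gaps between columns.
sample = (
    " " * 6
    + "INCLUDE/EXCLUDE GROUP ID/NAME:"
    + " " * 6
    + "5/QUARTERLY FI - A RKCONNECT"
    + " " * 6
)

for min_gap in (1, 2, 4, 6, 7):
    print(min_gap, split_fields(sample, min_gap))
```

On this sample, 1 is too small because it splits on the single spaces inside the field name, and 7 is too large because no gap is that wide. Anything bigger than the widest gap inside a value and no bigger than the narrowest gap between columns should give the same result as 4.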
Hi, Felipe. It's been a while, but I am hoping you can assist with this question. The attached txt file recently underwent a formatting change, and it seems to have disrupted how records within the Value column display within the Regex Configuration tool. Take page 1 of the txt file as an example: I would expect the values beginning in row 10 to parse so that the 2nd value in the row (for example, the value EXCLUDE for FUND SELECTION OPTION) displays in the Value column. But right now Alteryx is showing this as blank. Are you able to help tweak the Regex configuration so that EXCLUDE, INCLUDE, etc. appear in the Value column? I have included the workflow for reference. Thanks.