Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

How to trap fields/values from a .txt input

Jake5
8 - Asteroid

I have a .txt file where the data is somewhat unstructured as shown in the two attachments, especially page 2.  Highlighted in both are the data I would like to bring into Alteryx as a table with the field names (e.g. PAYT GRP ID/NAME) and their corresponding values (e.g. 24/MONTHLY AFIS).  How is the best way to trap this information and bring it in.  The other issue is this txt document can have a page range of anywhere between 3 - 15 but I'm only needing the information from pages 1 & 2.  Thanks.  

22 REPLIES 22
Jake5
8 - Asteroid

Felipe - I'm trying to accomplish a similar objective with the attached .txt file, but I don't understand how to adapt your Regex formula to work with this data set.  When running it, I can't get the expected values within the Key and Value fields as I don't understand how this formula works.  Also attached is a screencapture (.png) of the fields I'm trying to make use of along with the workflow.  Are you able to take a look and advise how to get this regex to work?  Thanks.  

Felipe_Ribeir0
16 - Nebula

Hi @Jake5 

 

Try the attached workflow.

 

keu.png

jdminton
12 - Quasar

@Jake5 for the Field 1 error, you need to uncheck the box in the input configuration that says "First Row Contains Field Names"

Snag_7eccffa.png

Jake5
8 - Asteroid

Thank you Felipe.  That achieved the desired results, but can you share what update you made compared to the workflow I sent you?  The regex formula looks the same as what I included in mine.  

Jake5
8 - Asteroid

Thank you, Felipe.  So something I just realized is for this input, there can be instances where page 1 layout can repeat in subsequent pages, the difference from one page to the next being the field value for INCLUDE/EXCLUDE GROUP ID/NAME:.  Using the attached txt file, page 1 has INCLUDE/EXCLUDE GROUP ID/NAME: = 5/QUARTERLY FI - A RKCONNECT, page 5 = 11/QUARTERLY TPA - R RKCONNECT and so forth.  I need the workflow to identify all pages where this layout is present in order to render the attached, desired output.  My challenge is trying to trap the unique value for INCLUDE/EXCLUDE GROUP ID/NAME from each page.  Are you able to assist?     

Felipe_Ribeir0
16 - Nebula

Hi @Jake5 

 

Try the attached version

 

include.png

 

 

Jake5
8 - Asteroid

Hi, Felipe.  The output seems to work with that input file but I just tried running the workflow against this version (see attached Sample file_R00848_v3.txt.)  There seems to be an issue with the page numbers that are being assigned as the highest page number the multi-row formula generates is 9 but this is a 37 page document.  In this version the values I"m after are on pages 1, 17, 29, and 33.  Sorry to continue to ask for your help - you've been amazing!  Can you look over and advise?  Thank you!     

Felipe_Ribeir0
16 - Nebula

Hi @Jake5 

 

You are right, there is an error unable to parse page numbers above the page 10. Please replace the expression of the multi row formula by this one, it should work now.

 

if CONTAINS([Field_1], 'PACKET PAGE:') THEN TONUMBER(REGEX_REPLACE([Field_1], '(.*?)(\d+)(.*)', '$2')) ELSE [Row-1:Page] ENDIF

(Basically, i just added the red ? to the previous version)

Jake5
8 - Asteroid

Yes - that worked - thank you!

Jake5
8 - Asteroid

Hi, Felipe

 

In the attached workflow you provided, you used \s{4,}(.*?)\s{4,}(.*?)\s{4,}.* as the Regex formula (Regex tool).  Can you help me understand how this formula is able to effectively parse Field_1 into the Key and Value fields?  I'm just wanting to learn for my own understanding.  Thanks!

Labels
Top Solution Authors