
Alteryx Designer Desktop Discussions


Data Extract

Tim6
8 - Asteroid

Hi There, 

 

I have a PDF file that has been converted to Excel, and the resulting Excel file is in a messy format.

Is there a tool that will allow me to pull out all the rows relating to contract IDs? The issue is that the contract ID column can also contain other things, like a page name or null/blank cells.

 

The contract ID is formatted as XXXXXXX-XXX (7 digits, a hyphen, then 3 digits).

Ex. 1234567-123

 

Is there a tool I can use to filter the data for contract IDs in the format above?

 

Thank you, 

Jessica

2 REPLIES
marcusblackhill
12 - Quasar

Hey @Tim6 !

 

I think the RegEx tool will work for you, configured with the "Tokenize" output method and the regular expression "\d+-\d+".
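For anyone who wants to check the pattern outside Designer, here is a rough Python sketch of what tokenizing with "\d+-\d+" would pull out. The sample rows and the `tokenize` helper are made up for illustration, and this only approximates the RegEx tool's behavior; it is not the tool itself.

```python
import re

# Hypothetical sample of the messy contract ID column (not from the original post).
rows = ["1234567-123", "Page 3", "", "Contract 7654321-001 (renewed)", None]

# Same pattern as suggested above: one or more digits, a hyphen, one or more digits.
TOKEN = re.compile(r"\d+-\d+")

def tokenize(value):
    """Return every digits-hyphen-digits token in a cell; empty list for nulls."""
    if value is None:
        return []
    return TOKEN.findall(value)

# Flatten the tokens found across all rows.
tokens = [tok for row in rows for tok in tokenize(row)]
print(tokens)  # ['1234567-123', '7654321-001']
```

Note that "\d+-\d+" accepts any number of digits on either side of the hyphen, so a value like "1-2" would also be picked up. If only the 7-digit, 3-digit format should match, "\d{7}-\d{3}" is the stricter choice.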

 

Hope that helps!

echuong1
Alteryx Alumni (Retired)

You can use the REGEX_Match() function in a Filter tool to identify any row that contains the specified format.

 

REGEX_Match([Field1], ".*\d{7}-\d{3}.*" )

 

This looks for any value that contains 7 digits, a hyphen, and 3 digits; the leading and trailing .* allow other text around the ID. Anything that matches comes out of the T output, and anything that doesn't comes out of the F output.
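As a way to sanity-check the expression before wiring it into a workflow, the same T/F split can be roughly sketched in Python. This assumes the match is tested against the whole field value (which is why the pattern is wrapped in .*); the `split_rows` helper and the sample values are hypothetical.

```python
import re

# Same expression as in the filter: 7 digits, a hyphen, 3 digits, with any text around it.
PATTERN = re.compile(r".*\d{7}-\d{3}.*")

def split_rows(rows):
    """Split rows into (matched, unmatched), mimicking a filter's T and F outputs."""
    matched, unmatched = [], []
    for row in rows:
        # Nulls can never match, so they go to the F side.
        if row is not None and PATTERN.fullmatch(row):
            matched.append(row)
        else:
            unmatched.append(row)
    return matched, unmatched

# Hypothetical sample column: real IDs mixed with page names, blanks, and nulls.
sample = ["1234567-123", "Page 3", "", None, "ID: 7654321-001", "12-345"]
true_rows, false_rows = split_rows(sample)
print(true_rows)   # ['1234567-123', 'ID: 7654321-001']
print(false_rows)  # ['Page 3', '', None, '12-345']
```

Page names, blanks, nulls, and short number pairs all land on the F side, which is the cleanup the original question asks for.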

 


 
