

Parsing Data with Regex

cherylwinschuh
7 - Meteor

Hello,

 

I am working with data that comes in from a PDF through the PDF Input tool (not the Auto Insights PDF tool).

 

Using regex, I was able to break the data into columns. I am now running into an issue where some of the elements I keyed on are missing from some lines of the data.

For example, I searched the data for "CP" and replaced it with |CP. If a line does not have CP, it throws off my columns.

(\s[C]+[P]) replaced with |CP
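
One way to keep the columns aligned when "CP" is missing is to parse each line with capture groups and make the CP group optional, instead of inserting a delimiter only where CP occurs. The Python sketch below illustrates the idea; the sample lines and the four-field layout are assumptions, since the actual PDF text is not shown in the post. The same pattern idea should carry over to a regex tool that outputs one column per capture group.

```python
import re

# Hypothetical sample lines; the real layout of the PDF text is an assumption.
lines = [
    "PO12345 INV9876 CP 100.00",
    "12345 9876 100.00",  # no letter prefixes and no CP marker
]

# One capture group per column. The CP group sits inside an optional
# non-capturing group, so a line without CP still yields four columns.
pattern = re.compile(r"^(\S+)\s+(\S+)(?:\s+(CP))?\s+(\S+)$")

def parse_line(line):
    """Return a fixed-width list of columns, or None if the line does not match."""
    m = pattern.match(line.strip())
    if m is None:
        return None
    # A group that did not participate in the match is None; store it as "".
    return [g or "" for g in m.groups()]

rows = [parse_line(line) for line in lines]
for row in rows:
    print("|".join(row))
```

Because a missing CP becomes an empty column rather than a missing delimiter, every parsed line has the same number of fields. Note also that `[C]+` in the original pattern matches one or more C characters; a plain `CP` is tighter if only that exact marker is expected.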

 

I tried adding a | as a delimiter wherever there were 2 or more spaces, but some lines do not have 2 spaces between "columns".

\s{2}\b replaced with |
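
As a side note on this pattern: `\s{2}` matches exactly two whitespace characters, so a run of three spaces leaves one space behind. A quantifier of `{2,}` consumes the whole run. The Python sketch below compares the two on a hypothetical line (the sample text is an assumption, not taken from the actual PDF). It does not help with lines where columns are separated by a single space.

```python
import re

# Hypothetical line: three spaces after the first field, two after the second.
line = "PO12345   INV9876  100.00"

# Original pattern: exactly two whitespace characters before a word boundary.
exact_two = re.sub(r"\s{2}\b", "|", line)

# Alternative: any run of two or more whitespace characters.
two_or_more = re.sub(r"\s{2,}", "|", line)

print(exact_two)    # a stray space remains before the first delimiter
print(two_or_more)  # each run of spaces collapses to a single delimiter
```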

 

The PO number and invoice number can vary; some do not have letters in front of them.

 

Does anyone know of a better formula/regex to parse the data and not have misalignment due to missing information?

 

thanks

Cheryl

 

1 REPLY
QuentinS
Alteryx

Hi @cherylwinschuh,

This would be my approach; please see the attached workflow.
