Alteryx Designer Desktop Discussions

cgoodman · ‎06-21-2021

Hey Community

It's Monday morning and my brain hasn't kicked in and I know community is really fast at answering questions.

I've used the intelligence suite tool to OCR a PDF document (it's a bank statement), now I need to parse this out into the required parts.

Roughly the documents is

Date Narrative Type Debit amount Credit amount Balance

Sometimes the narrative is split over several lines, I've tried text to columns on a new line delimiter (\n) however due to the narrative spreading over several lines it isn't consistent. For example if there was always an empty row it would be fine to then use a multi-row formula to group back to the transaction line items, but erroneous new lines are appearing which makes this logic fail.

What I wanted to then do is take something that would be consistent at the start of each transaction which is the date, and using regex_replace I can match on date and replace with a pipe delimiter

REGEX_Replace([text], "\d{2}\/\d{2}\/\d{4}", "|")

However this loses the date. Is there a way therefore to maintain the date but just prefix this with the pipe delimiter?

Chris
Check out my collaboration with fellow ACE Joshua Burkhow at AlterTricks.com

mpennington · ‎06-21-2021

@cgoodman ,

I am currently on a machine without an active license, but I believe this could be accomplished with capturing groups. I'm certain I've done this with other characters, but given that the Pipe is a RegEx metacharacter, it may have be escaped. So maybe something like this:

REGEX_Replace([text], "(\d{2}\/\d{2}\/\d{4})", "\|$1")

At any rate, hope it works for you. Good luck.

cgoodman · ‎06-21-2021

Yeah that looks to have worked. I was expecting anything in the replace part of the formula to just be put in as a text string.

Chris
Check out my collaboration with fellow ACE Joshua Burkhow at AlterTricks.com

Alteryx Designer Desktop Discussions

Help with parsing out a pdf document

Re: Is there any way the computer vision tools can...

Re: Batch Macro

Re: How to get cell reference address from excel

Re: Replacing Forecast columns with Actual Data

Re: Row creation