Hi
I am very new to Alteryx and have a side project I am working on. I have a single .pdf file with multiple Tax invoices. I have parsed that file to a txt doc. I am trying and completely failing at extracting only the required information from that document. All that is required from the doc is the Cell Number, Invoice Number, the billed items for each invoice and the amount of those items. The number of billed items varies for each invoice. I have attached a small section of the txt doc and the desired excel output. If anyone could help with this problem it would be greatly appreciated.
Solved! Go to Solution.
It looks like the 3 amounts on the itemised lines are Net VAT and Gross, but you've only asked for Net. Let me know if you want the other 2 as well.
Hi @Chris_Waspe
@DavidP 's solution looks great. Another approach is to start by filtering out only the pieces you need, and then creating one row with all of those pieces by using a Summarize tool. From there, a few RegEx parses (one to parse out the cellular and invoice numbers, one to parse multiple billed items into rows, and the final one to parse out the billed items into columns), and a record ID tool to add unique identifiers per row.
Give it a whirl, and let me know if it works for you. I parsed into billed and VAT, in case you needed both. If you only need the amount without the VAT, then modify the Regular Expression in the last RegEx tool to be (.*?)\s(\d+\.\d+)
Thank you very much for the assistance. Just required a tiny bit of tweaking but did exactly what was required. Don't need the other two amounts but can get there from what you provided. Thanks again
I'm new to this, I have something similar which i'm trying to figure out, in the multirow formula you are using this as key record identifier,
if [Field_1]='","Copy Tax Invoice' then [Row-1:Invoice ID]+1 else [Row-1:Invoice ID] endif
how would I create an identifier if first line of record only had "Customer.....:x" ( X is the difference customer name)
where the word customer appeared, it would be my a record ID but not sure how I'd create something to validate each "Customer.....:x" to assign as record id
sample:
CUSTOMER:CALL 2 SEE
CUSTOMER:0124874 Yodamo Ltd. Account number: 3e56gh
Company Registration No.
CALL 2 SEE Ref number: 0674345
20154/00765
PO BOX 1000 ID Registration No. 4014563DF1 Invoice number: xxdetgdwe
- Invoice date: Jan 1, 2020
Your registration
COLUMBIA Registered Office 161511
CUSTOMER:JETSKIESE
CUSTOMER:0124875 Yodamo Ltd. Account number: 3e5448gh
Company Registration No.
JETSKIESE Ref number: 0745342
1545412/0007
PO BOX 1000 ID Registration No. 4014G345 Invoice number: xx4567Ge
- Invoice date: Jan 1, 2020
Your registration
COLUMBIA Registered Office 954287
thanks David, that helps, i'm also looking at a field "DID...…:" if i apply the same filter to just "DID" it works fine, but if i try to match 'DID...…:' then it wont work - do i need to apply something different to the '...…:' characters?
@berty, probably best to start a new post on this.
Please tag me in the post and I can take a look.
In the new post, please provide a bit more detail on your last question.