Community Spring Cleaning week is here! Join your fellow Maveryx in digging through your old posts and marking comments on them as solved. Learn more here!

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Help with processing data from a report that contains unwanted headers

chc9
7 - Meteor

Hi,

I get raw data in report format that looks like this:

 

Customer:   78566      
Address:    BOX HILLHOSPITAL     
EASTERNHEALTH-ACCTS PAYABLE     
8 ARNOLDSTREET     
BOX HILLVIC3128     
AU      
       
       
MaterialDescriptionList priceCurrCustomer priceCurrDeal type
       
9300633714444WOOLWORTHS TUNA IN SWEET CHILLI SAUCE 95G0.885AUD0.797AUDSold-to Contr Price
0066613004026BRUNSWICK SARDINES IN OLIVE OIL 106G0.889AUD0.801AUDSold-to Contr Price

 

Please can you help me reformat the header so my output data looks like this.  Thank you.

 

CustomerMaterialDescriptionList priceCurrCustomer priceCurrDeal type
785669300633714444WOOLWORTHS TUNA IN SWEET CHILLI SAUCE 95G0.885AUD0.797AUDSold-to Contr Price
785660066613004026BRUNSWICK SARDINES IN OLIVE OIL 106G0.889AUD0.801AUDSold-to Contr Price

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

7 REPLIES 7
apathetichell
18 - Pollux

H,

 

Are you looking for something like this? It takes away the header records and isolates the field names. It then uses dynamic rename to turn them into the column headers...

 

Hope this helps.

chc9
7 - Meteor

Thanks!

I ran your workflow using my input data from the excel and the output failed.

Pls can you have another look using my excel example data?

 

Thanks!

Ben_H
11 - Bolide

Hi @chc9,

 

I've had a go at this - it assumes that the structure of your input is going to be generally the same going forward.

 

Ben_H_0-1616669660401.png

The first thing I've done here is try to identify the specific rows we want, so I've added a record ID first.

 

Then I've tranposed the data and done two things

 

1. Identify the rows that contain "Customer:" to get the ID we need

2. Count the number of non-null fields in each row - the assumption being that the max non-null count will be the table rows.

 

I then join back to the main stream, just the rows that have useful data in.

 

Create a column for customer ID and fill it down using a multirow formula.

 

I then update the column names using a dynamic replace and filter out a repeat of the column names later (your input had two tables with the same structure) to give the following output

 

Ben_H_1-1616669927347.png

 

Hope that helps,

 

Regards,

 

Ben

 

apathetichell
18 - Pollux

Got it! You want the customer number from the header replicated in each row... If your formatted is fixed you can use the attached.

 

Sorry about the confusion...

 

 

 

 

chc9
7 - Meteor

This is a brilliant solution!  Thanks.

chc9
7 - Meteor

Hey Ben,

For purposes of my original question, I tidied up the source data into excel but my actual source data comes in txt format (attached).  Can you pls help me read that into your workflow?  I can't figure out how to import & parse properly a txt file.

Thanks

Harry

Ben_H
11 - Bolide

Hi @chc9,

 

I've added an alternate method to read from text files (see attached)

 

The methodology is a bit different as you have to deal with parsing out the text file.

 

Broadly what I've done is -

 

  • Read in the file as a flat file (fixed length)
  • Used regex to replace any instances of 2 or more whitespace characters with a | symbol
  • Split this into a table using text to columns
  • Cleaned up any misalignments

This will work as long as this structure is fixed.

 

If there are changes to the structure this example should give you enough to work out how to correct it/make something that handles variation better.

 

Regards,

 

Ben

Labels