Hello,
I have a directory of 100 CSV files and would to combine them into one table for analysis. There are a few issues with the underlying data.
Unnamed: 0 | Unnamed: 1 | Unnamed:2 | Unnamed: 3 | |
0 | ||||
1 | ||||
2 | ||||
3 | ||||
... | ||||
8 | Product ID | Product Number | Name | Address |
9 | 1232 | XX |
I've tried utilizing a batch macro to bring in all the CSVs but the columns will never line up and it ends up being completely disjointed. The goal is an output table like the below:
Product ID | Product Number | Name | Address | CSV File Name |
1232 | XX | File1 | ||
1233 | XYZ | File1 | ||
1234 | ABC | File2 | ||
12345 | ABCD | File3 |
Appreciate any help / thoughts on how to best approach this.
Thank you,
Can you post an actual CSV files with a few data rows, mocked up if necessary?
You could use a Formula tool to identify your true header row, append the header row number to the original input data, use a filter like [Row Number] >= [Header row number], then use Dynamic Rename, take field names from first row of data.
Then for your batch macro, use the macro from option 3 in this article: But what happens if your field schema is different from one file to another?
The Ultimate Input Data Flowchart
Community > Designer > Designer Knowledge > The Ultimate Input Data Flowchart
https://community.alteryx.com/t5/Alteryx-Designer-Knowledge-Base/The-Ultimate-Input-Data-Flowchart/t...
Chris
Hi @bchen1
I would try with the batch macro too, good try. For this case, maybe if you go inside of the batch macro and
1)Do not pick the first row as header
2)Filter the dataset in a way that keep the header and the data
3)Use the dynamic rename to pick the first row as header
4)select the proper columns
Maybe the output will work as expected.
Attaching a few mocked up CSVs to help illustrate thank you! @ChrisTX
User | Count |
---|---|
106 | |
85 | |
76 | |
54 | |
40 |