Hello,
I have a few hundred Excel files, each with about 30 sheets. On the same sheet in every file, in the same place on every sheet, there are some numbers I need to aggregate. I've taken that sheet out, put in some dummy data, and attached it below.
The data I need is column A (just the retailer name, not any of the product details) and, for columns DG to FF, row 3 (the date) and rows 1214 to 1381 (the data).
The output I need is as follows:
FILENAME | RETAILER | DATE | VALUE |
File 1 | Retailer 1 | 15/06/20 | 3000 |
File 2 | Retailer 1 | 22/06/20 | 2000 |
File 3 | Retailer 1 | 29/06/20 | 3000 |
File 1 | Retailer 2 | 15/06/20 | 2000 |
File 2 | Retailer 2 | 22/06/20 | 20002 |
File 3 | Retailer 2 | 29/06/20 | 200 |
I've run into so many problems I don't even know where to begin. I can't work out how to carry the retailer name down, how to transpose the date row into a column, how to aggregate the numbers, or how to get Alteryx to plug through 200 files in one workflow, doing the same thing to each of them and aggregating all the results into a single Excel output.
Any help you could provide on this, my fourth very late night in the office, would be appreciated!!
Thank you
Sam
Hi @pupmup,
Hopefully this helps you get home earlier tonight. This should work with real data and with bringing in multiple files at once in the Input Data tool. I brought the data in at the first line instead of line 3, as the repeating dates would require changing data downstream to get the dates correct; a Select tool and a Select Records tool make it easier to keep the data clean down the road. To identify the retailer I used the second field, as the retailer name is associated with "NET REV". From there, a Multi-Row Formula tool replicates the retailer name down; then a Transpose tool, data cleansing and a field-type change via a Multi-Field Formula tool, and a Summarize tool put your data in the right format and update the field names.
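For anyone following the same steps outside Alteryx, the tool chain above maps onto a short pandas script. This is only a sketch with made-up stand-in data (the retailer names, dates and values below are illustrative, not from the real files): a forward-fill carries the retailer name down, melt transposes the date columns into rows, and groupby/sum aggregates.

```python
import pandas as pd

# Illustrative stand-in for one file's extracted block: the retailer name
# only appears on its first row, and each date is a separate column.
raw = pd.DataFrame({
    "Retailer": ["Retailer 1", None, None, "Retailer 2", None],
    "15/06/20": [3000, 100, 200, 2000, 50],
    "22/06/20": [2000, 300, 400, 20002, 60],
})

# 1. Carry the retailer name down (Multi-Row Formula equivalent).
raw["Retailer"] = raw["Retailer"].ffill()

# 2. Turn the date columns into rows (Transpose equivalent).
long = raw.melt(id_vars="Retailer", var_name="DATE", value_name="VALUE")

# 3. Aggregate per retailer and date (Summarize equivalent).
out = long.groupby(["Retailer", "DATE"], as_index=False)["VALUE"].sum()
print(out)
```

The same three ideas (fill down, unpivot, group-and-sum) are what the Alteryx tools are doing under the hood.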
I think I managed to put something together that you might be able to work with.
The Directory tool loads the list of files from the folder (I only have one file; perhaps try it with one first to check the results are what you want).
Three Formula tools feed three Dynamic Input tools to load three cell ranges separately: A1214:C1389 for finding the retailer, DG3:FF3 for the dates, and DG1214:FF1381 for the data.
I then had to do some jiggery pokery to find a way to bring it all together.
There are a lot of nulls in the data, which I filtered out.
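As an aside, the three-range trick can be mimicked outside Alteryx too. This sketch builds a tiny dummy workbook so it is self-contained (the cell contents are invented); the real ranges from the workflow appear only in a comment, and pandas' header/skiprows/nrows/usecols parameters do the rectangular selection:

```python
import pandas as pd
from openpyxl import Workbook

# Build a tiny dummy workbook so this example runs on its own.
wb = Workbook()
ws = wb.active
ws["A1"] = "Retailer 1 NET REV"   # pretend retailer row
ws["A3"] = "15/06/20"             # pretend date header cell
ws["A5"] = 3000                   # pretend data cell
wb.save("dummy.xlsx")

# header=None + skiprows + nrows + usecols select a rectangular range.
# Against a real file, the workflow's ranges would look like:
#   pd.read_excel(path, header=None, skiprows=2, nrows=1, usecols="DG:FF")      # DG3:FF3
#   pd.read_excel(path, header=None, skiprows=1213, nrows=168, usecols="DG:FF") # DG1214:FF1381
dates = pd.read_excel("dummy.xlsx", header=None, skiprows=2, nrows=1, usecols="A")
print(dates.iloc[0, 0])  # the value stored at A3
```

Each range comes back as its own small DataFrame, which is the same shape of data the three Dynamic Input tools produce.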
Thank you so much for your help @T_Willins !! I can't express how grateful I am. I've poked through all the various steps to make sure I understand them, and it's a joy indeed to see it working on one file at a time! I only have one question, and that's how to get the process to run through each Excel in the directory.
I have tried replacing the file name in the input file path with a *, so it reads \\network\directory\*.xlsx, and while this appears to read every file in the directory at the input stage, it only outputs about 1.5 files' worth of data at the other end.
I think doing this reads all the data into one big table, but then the Select Records tool still keeps just the top few hundred rows and discards all the other files' worth of data.
Is there a tool that allows me to say "run this entire workflow for a file in the directory, then move to the next file and repeat, until all files are gone", instead of the "read all files at once and then run the workflow once on the aggregate"?
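In Alteryx that per-file pattern is usually handled with a batch macro. For comparison, outside Alteryx the same idea is literally a loop: run the whole pipeline once per file, then stack the per-file results. The UNC path and the `process_one` helper below are illustrative placeholders, not the real pipeline:

```python
import glob
import pandas as pd

def process_one(path: str) -> pd.DataFrame:
    # Stand-in for the full extract/reshape pipeline applied to a single
    # file; here it only records the filename to keep the loop visible.
    return pd.DataFrame({"FILENAME": [path], "VALUE": [0]})

# Run the pipeline once per file, then concatenate the per-file outputs --
# the opposite of reading every file into one table and trimming it once.
paths = sorted(glob.glob(r"\\network\directory\*.xlsx"))  # illustrative path
frames = [process_one(p) for p in paths]
result = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

Because each file is trimmed and reshaped before concatenation, no row-selection step ever sees more than one file's worth of data.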
(I will obviously look by myself as well!)
Thank you again
Thank you so much for this, I can't express how grateful I am to have had the assistance!
I only have one question left, and that's how to get the workflow to run on multiple files in sequence. I have tried replacing the filename with an asterisk, but I think that just aggregates all the directory data into one lump, then discards 99% of it at the Select Records tool later on.
Any suggestions you can provide (while I do my own research!) would be lovely.
thank you so much again!!
Sam
Thank you very much for your help @DavidP ! I really appreciate you spending the time helping me out 🙂