This site uses different types of cookies, including analytics and functional cookies (its own and from other sites). To change your cookie settings or find out more, click here. If you continue browsing our website, you accept these cookies.
I am working on a project, users will update comment fields in an Excel file that lives on a shared folder.
My Alteryx Workflow will be scheduled to run at the end of the day to capture all the updated info from user comments in all 29 files, and push that into a table in our Snowflake database.
The issue is this:
1. There will be 29 separate files in the shared folder, one for each of our business areas. The names will be in a format like...
"Daily Report SOUTHEAST 20181120.xlsx"
2. The tabs in the Excel spreadsheet will be named similarly to the files - they'll have the date dynamically, and the area in it - so the tab I need to capture in the file in #1 above might be called OUTS SOUTHEAST 20181120... the tab will always be the 2nd tab, but they will be named uniquely - 29 different names :(
3. Because of the file and tab naming, I don't believe I can just use a simple wildcard with an Input Data tool. My experiments with that have failed so far.
I believe I need to create a Macro tied to a Dynamic Input tool for this to function properly, but I haven't done something like that yet, so I am lacking confidence in that area :)
Has anyone seen a situation like this? Am I on the right track?
I'd use a Directory Tool to get a list of all the files in the shared directory that meet your criteria. Then, I'd use a Dynamic Input tool configured to pull in a list of Sheet Names for each file (and to output the file name).
From there, you can use a filter, either by the sheet names per your naming convention rules, or to pull in specific rows from the list of sheet names.
Then, a second Dynamic Input tool can be used to pull in your data.
I don't have time right this minute, but I can put together a proof of concept a little later today if the above description does not help.
I got an example workflow working which does something like this. This example does not include test files - let me know if you encounter issues pointing it to your own excel files.
What this workflow does is get a list of excel files in a directory, pull in all the sheets from it, give each sheet a recordid, filter for the second sheet in each file, and then pulls in the data from that sheet.
Hope this helps to point you in the right direction!
That'll help read in the data and list out all of the sheets of all of the Excel files in the folder. Then you can use the Tile tool or Multi-row tool to figure out the sheet name of the 2nd sheet in each workbook and then start your extraction / processing from there ...
OK, I was able to get this pointed to my local files, but it is generating an error on the 2nd formula tool. Image attached - the [SHEET NAMES] field isn't carrying through the workflow for some reason. Also, when I try to run the workflow... it gives me an odd error about Schema in many of the files not matching the 1st file in the set:
Error: Dynamic Input (2): The file "C:\Users\o1xxxx\Documents\Alteryx Workflows\Daily Outs\User Comment Files\Daily Outs Report DAKOTAS 20181116.xlsx|||<List of Sheet Names>" has a different schema than the 1st file in the set.
As far as I can tell, except for the file and tab naming, everything is identical :(
It seems that the Dynamic Input tool (and the input tool itself) only allocate enough space for the longest sheet name each time they read in a file. Because of this, the process cannot or does not read in additional excel files if the length of sheet names changes.
To get around this, I've attached a batch macro that I built in September which should resolve this issue.
If you replace the first Dynamic Input tool in this example with this macro, it should pull in the sheet names correctly.
I had to attach it as a YXZP file because of errors with the Community, but Alteryx can extract that without issue.
The macro should be relatively straightforward to configure, but if you run into issues, let me know!
I think i am one step closer! ;) I think it's just configuring this workflow properly... for some reason, it's taking the fields from the LAST tab in the workbook instead of the 2nd. Not sure why, but I am playing around with the various configurations to see what effect it has