New question. I now have a workflow that runs with all my files, which gives me a lot and is very messy. However, this is a step in the right direction! I am trying to delete all the sheets in these files that start with “DECS” because I don’t need them and it is just adding a lot of unnecessary run time and data. Is there a way to do this? I know sample let’s you skip every N or lets you start at a certain row, etc., but those won’t work for my case. Any help?
Solved! Go to Solution.
Hi @jakriete0416,
I didn't get a complete picture of what you are attempting to do. But making an assumption of your process, you reading multiple excel files which in-turn have multiple sheets (tabs). Off of which you are looking to exclude sheets that start with the characters 'DECS', yes? If the answer was yes, then below might be one way to solve your problem:
1) Read all excel files and their corresponding tabs.
2) Create a filter to exclude sheets with 'DECS' tab names
3) Modify the filepath to include non-DECS excel sheet names
4) Use a dynamic input to read the multiple sheets downstream
The workflow may look something like this:
The example I have above is meant for only one file, but you can extrapolate it to include multiple files by including the Dynamic Input inside a batch macro.
Hope this helps. Let us know if this is not what you were looking for.
The way I understand your question is that you have a directory that contains various Excel files and within each contain various sheets. You already have a workflow which is able to run through and dynamically read in each sheet from each file, but the output results in a very messy data table.
I assume your current solution utilizes a (batch) macro which reads in each file/sheet to avoid any schema validation errors. I also assume that the process has an initial step of reading in each of the Excel files and returning the sheet names then constructing a file path which is ultimately fed into the batch macro.
If this is correct, you should be able to put a filter after the Input tool that reads in only the sheet names using the expression STARTSWITH([SheetName], 'DECS').
If my assumption is not correct, please clarify your request and share any details on what you've already created. I'm sure this community can share some options, but sharing as much as you could would help us to understand your problem better and suggest options that can be easily added to your existing setup.
Jimmy
Teknion Data Solutions
I am using a read all excel tool which helps me easily bring in all the files with multiple sheets.. with that, I can't use that filter. however, I used a Contains filter instead for file name which seems to have worked. Now it is just a matter of making my data less messy because everything is coming in on random columns and rows.. I am new to Alteryx. 🙂
Would you be able to post a sample dummy workflow that replicates the steps you are taking? It will help us better understand the challenge you are facing.
Being that the tool you're using is not a standard Alteryx tool, can you either share the macro here or the url it can be downloaded from?
If you look at my other reply, I used that link to use that Alteryx tool to bring in all the files I need. Now they are in (hooray!), but it is a mess. So, to remove the sheets I didn't need, since I could not just remove sheets, I did a !Contains("DECS") for the filenames so it removed the rows including those. The data is still a mess though and does not look the way I need. I will show you what one looks like before and what it looks like after..
Attached is the before files/sheets.. they all have the same format. The second is what they look like now.. a mess.
Looking at the image you shared and the settings in the macro you used, I suspect the macro reads the all files from row 1, while the data in your source file starts at row 14. Maybe that is causing the mess?
Like most things in Alteryx, there are multiple ways of solving a given problem.. and to that end have attached a sample flow which should help you ingest multiple files & their tabs (excluding ones that start with DECS).
I have assumed all your files have their data starting at row 14, and so have hard coded the Start Data Import on Line value to 14 in SecondMacro.yxmc. Feel free to modify as maybe the case in your workflow.
Hope this works out for you!