Loop through files in a directory, select specific columns and then append
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I am currently pulling in files using a directory tool with a wildcard on the file name and then feeding that to a dynamic input tool. An issue that has come up is that the number of overall columns can vary from each xlsx, however the main columns that I want will always exist- those other columns can be dropped.
Using the current method, starting at the top of the files in the folder, they will all be brought in unless the schema doesnt match the first file that was read in.
I am thinking i need to be able to loop through each file, select the main columns that i know exist in each from that specific sheet, then append them as those columns are extracted.
Any ideas on what I might need to do to get this working? The above was just my thoughts on what i think i need to do, please feel free to recommend other options.
Thanks for the help!
Solved! Go to Solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @JMart2135
If you turn this into a batch macro you could definitely use a select tool to curate the fields, otherwise in the interface designer of the batch macro there are a few options that can handle this scenario. I personally find myself using the 2nd option frequently, which acts similar to a union tool.
Let me know if you need help with how to make a batch macro
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @JMart2135 ,
I've built some tools to do exactly what you want.
I've attached them both long with an example workflow.
The first tool reads in the sheet names of all files in the directory or sub-directory, and then appends them to the filename, so you can filter down the sheets you want in case the humans change them over time. The second tool then dynamically reads them all in and gets around the error you will receive in the Dynamic Input tool if the schema differs. You then simply need to select the columns you want using a select tool.
Hope this helps.
M.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
@Luke_C attached are the files we discussed
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi @JMart2135
See attached. I put the steps you use to clean/select the columns into the batch macro. The control parameter is based on the full path input from the directory tool. It dynamically updates the input tool for each file in the directory location. Hope this helps.
