Hi. I want to compare that two folders have the same folder structure location wise also and both of the folder contains same number of files. And also both folder contains the same files with same names (If it is excel file then same sheet name) and also same columns. In short i need to compare that my template folder and my source file folder contains same structure. Thanks in advance.
Solved! Go to Solution.
I'm going to break this request down into a few pieces, since this starts at a pretty easy request and then adds complexity.
Compare folders to identify structure and files/filenames:
You can use two Directory Tools to get this information, and use a Join tool on the Fullpath field to find out if all files exist.
You can use a Summarize after the Directory tools, Group By Directory, followed by a Join on Directory to find out if all folders exist as expected.
Note that the Directory tool does not include empty folders in its results. If you need this functionality you may need to look at options using the Run Command tool.
Excel sheet names - using the Extension in the FileName or Fullpath field from the Directory tool, you can filter for those that end with .xlsx, .xls, or .xlsm. Then you can connect this to a Dynamic Input tool, configured to produce a list of Sheet names for each file. You can join this on the File Name and Sheet Name to make sure this matches.
Column structure of all files:
This one is tricky for a few reasons. Non-excel files could have different delimiters (comma, tab, pipe, semicolon, etc), and fixed width files could have different definitions as well. Identifying this information dynamically could be tough, so you may need some way of categorizing this information beforehand.
Assuming that you can complete that categorization, I would look into a Batch macro which accepts the filename, checks both locations for it, and loads with the appropriate configuration. Then you can compare column names/types/sizes using the Field Info tool and a Join, and output the results.
This should theoretically allow you to build out what you need for a full column compare, but it would require a fairly robust batch macro to make it work.
This is how I would do it:
1. Comparing 2 folder structures with the Directory tool and a join tool is quite straightforward. You can then use a number of techniques to identify any filenames that didn't joint from either folder.
2. Comparing Excel files and their Sheet Names is a bit trickier, because the only way to load sheet names from various Excel files with different schemas is by using a batch macro. However, this is not difficult.
Have a look at the attached workflow. I created 2 folders (Folder1 and Folder2) each with 2 (identical) files. One of the files is an Excel file with 2 sheets. Use a fornula tool to add <List of Sheet Names> to the fullpath of each file. This then feeds a very simple macro that feeds fullpath to a dynamic input tool to load the List of Sheet Names for each Excel file one by one.
You'll have to copy the macro into your Macro folder (or create one and point Alteryx to it) to get it to work.