Start Free Trial

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

Processing Excel files from different customers ALL with different schemas

LizPerry
7 - Meteor

I need to import 1000+ Excel files, some of our customers submit data which can earn them rebate. We have a desired template but over the last few  years this has not been enforced and I now need to import the data for 2 reasons, the first to monitor who is submitting data & assessing the format, the second to store the data for reporting purposes.

I have a flow with a dynamic Directory connection and two macros, one to find all files in the folder, the other to find all sheet names within the files to bring all sheets into my flow.

Where I'm really having problems though is that the initial dataset is one huge dump of all files and every combination of headings, I can't seem to separate out which headings are in which file so that I can assess them for the correct format and build a usable dataset, I have over 380 fields because of this variation over the years. Is there a way of building a view to show each individual file, each sheet name within the file and the contents of the first 5 rows?

I've attached sample data to show some of the variety, note the different sheets are all in different files from different customers.

1 REPLY 1
jrlindem
12 - Quasar

There's a lot going on here.  So I'll start with some general recommendations:

  • First, if there an incentive like a rebate, then enforcing a standard format to earn the rebate becomes easier!  I'd start there
  • Batch Macro is useful for Alteryx to consider, assess, and import data for each of the file as they are
  • Field Info tool could be useful to assess and compare all those different field headers
    • You may need to build in a crosswalk for those various field names and "map" them.  New values added as new headers become known
  • Using Transpose, Crosstab, and Summarize Tools can help assess fields, quantity, etc.

Overall, though, with so many files, it's not so much a matter of bringing it all in.  I'd be curious what you can even do to clean 'em up after you figure out how to get all those variations in your workflow...  I think your best bet is to hedge the incentive of the rebate and force conformity towards an acceptable data profile and go from there.  Only due to the volume of sheets.

 

Hope that helps, -Jay

Labels
Top Solution Authors