Excel Input with Columns not matching
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I am having an issue, as I am pulling in several excel spreadsheets, and need to run this workflow monthly. Issue, a field called "MBP-PED-BEG-DATE" has some monthly spreadsheets with the title with a space in front of it, and others with two spaces. Only field that does that. How do I check if the field has a space or two, and strip them off before proceeding so it will count everything?
Solved! Go to Solution.
- Labels:
- Alias Manager
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
I can answer this one: it's actually pretty simple.
In your input tool it asks you for the title of the spreadsheet. Input an asterisk* where there may be fields that differ. That way it will still pull the base format.
Example: inputting C:\users\files\whatever\*PED-BEG-DATE*.xlsx|||Sheet1 will pull all files that have "PED-BEG-DATE" in the title and are in the same folder, regardless of any whitespace before or after the title.
As well, if the filenames are generated in Alteryx, adding a data cleansing tool after it generates would serve a similar purpose.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Not sure that I understand as it did not work for me. It is only a column name that changed in some of the months, and not all.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Is it the filename or a column in the file that you are having an issue with?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Column name. First 5 spreadsheets have this date column named [MBP-PED-BEG-DATE], then they start changing, some have that and some have [ MBP-PED-BEG-DATE]. That space is what is causing a problem.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Ah, I see. Yeah, the asterisk only works for filenames.
Use the data cleansing tool; it works on headers/column names too.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thanks, will give that a try and let you know tomorrow if it works!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Ok! Don't forget to mark it as a solution if it is correct so others can learn from it too!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
If I understand correctly, I think the problem you might encounter here is that if you try and load multiple sheets from excel files where a field name in one or more has a different name from the rest, you'll get a message saying that the file/sheet has a different schema from the rest and will be skipped.
Even if you manage to load all the Sheets (using a batch macro for instance), the field that has different names will be in different columns.
One way yo overcome this is to check the box First row contains data in the Input Data tool. You can then use a Dynamic Rename tool with option, Take field names from first row to rename your columns and then use a filter tool to remove all the rows that contain the field names.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Thank you, this worked perfectly!
