I am trying to build an archive of data files spread across several sub-folders of one master folder. There are 70+ files in total, each averaging around 20k rows, and most of the files have the same name because they were generated by the same cloud system. It is a database of customer lead scores over a period of time, so most of the file entries are duplicates. The only thing we can use to differentiate them is the date each file was created/generated by the cloud. So far I have been able to merge all the files from all sub-folders into a single database file using a batch macro. But in order to differentiate each entry in the master file, I need to add the file creation date, which is not a column in each file. Any ideas on how I can do this?
Assuming your control parameter is just feeding a list of file paths, you could instead retrieve these paths from the Directory tool, as this will give you the extra info you need, such as [Creation Time] (and others).
You can then maintain this information once you've brought the actual data in (in the example below I'm using a Dynamic Input):
Note that I've just used a Select Records tool (limited to 1 row) to replicate what would be your batch macro.
There's also a great demo provided in your Alteryx Designer if you want to see some additional examples; just go to: Help -> Sample Workflows -> Learn One Tool at a Time -> Developer -> Dynamic Input
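If it helps to see the same logic outside Designer, here's a minimal Python sketch of the idea (the folder name, the CSV assumption and the column names are all just placeholders): walk every sub-folder of the master folder, grab each file's creation time, stamp it onto the rows, then union everything together.

```python
import os
import glob
import datetime
import pandas as pd

# Placeholder master folder; adjust the pattern if your files aren't CSVs.
master_folder = "archive"

frames = []
for path in glob.glob(os.path.join(master_folder, "**", "*.csv"), recursive=True):
    # On Windows, getctime is the file creation time (on Unix it's the change time).
    created = datetime.datetime.fromtimestamp(os.path.getctime(path))
    df = pd.read_csv(path)
    df["CreationTime"] = created   # the extra column that isn't in the source file
    df["FullPath"] = path          # handy for tracing rows back to their file
    frames.append(df)

# Union all the files; rows from identically named files are now
# distinguishable by CreationTime.
merged = pd.concat(frames, ignore_index=True)
merged.to_csv("merged_with_creation_dates.csv", index=False)
```

The Directory tool + Dynamic Input combination in the attached workflow is doing essentially the same thing, just natively in Designer.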
Did you download my attached yxzp file above and run it? Hopefully you can see everything I’m referring to here.
The control parameter doesn’t actually need any configuration; this tool is what makes the macro a batch macro, and will say ‘run it for every record you have feeding this’. The upside-down '?' you see feeding the batch macro is what's going into our control parameter.
The macro input itself (shown below) will contain many rows, but we only want to run one row at a time. If your data is structured slightly differently you’ll find that the dynamic input tool will start to skip files because they don’t match the source input. This is why we’re batching it; basically saying run it independently as many times as there are rows, and then just union them together.
Note that there are 56 rows here, but the Dynamic Input tool can sometimes struggle if the files have different structures
The filter is only in place to ensure we’re just taking one row at a time. The way it’s configured is arbitrary, because it is the control parameter that will update this (via the action tool) for every single run:
Here it's taking the top row, but this filter will eventually match every record that's being fed in by the control parameter (we're only seeing 'one run' here)
The action tool is needed for the control parameter to update the filter so it matches the path of row 1 on run 1, row 2 on run 2, etc. It’s fairly simple to configure: just leave it as ‘update value’ and choose the static value in your filter (hence why it was an arbitrary value), which it will update on each run, as shown below:
Here we're telling the action tool to replace the value in [FullPath]=[Value] on each run. Hence the macro runs as many times as there are rows going into your control parameter (the upside-down '?')
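Purely as an analogy for what the batch macro is doing (not something you need to build), each run is roughly the following, sketched in Python with placeholder file paths: the control parameter hands the macro a single FullPath, the filter keeps only that file, the macro brings its data in, and the outputs of every run are unioned at the end.

```python
import os
import datetime
import pandas as pd

def run_macro_once(full_path: str) -> pd.DataFrame:
    """One 'run' of the batch macro: process exactly one file path."""
    df = pd.read_csv(full_path)  # assuming CSV, purely for illustration
    df["CreationTime"] = datetime.datetime.fromtimestamp(os.path.getctime(full_path))
    return df

# The control parameter's job: call the macro once per row, then union the outputs.
paths = ["leads_export.csv", "subfolder/leads_export.csv"]  # placeholder paths
merged = pd.concat([run_macro_once(p) for p in paths], ignore_index=True)
```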
Hopefully that answers your question? If yours still isn't working feel free to package it and I can take a look.