I am importing 40+ CSV files into a workflow, with number of rows each anywhere form 50 to 60k+.
I would like to merge them, and then output them again as separate files but based on these rules (in this rule priority):
1. New output file and file name, every time the input file name is different. (this is already the default behavior of the output tool if you tick "take file/table name from field" and specify the fullpath field)
AND
2. New output file and file name, every time the input file reaches 8000 rows in size.
Example:
Bananas.csv (340 rows of banana items)
Peaches.csv (1200 rows of peaches)
Apples.csv (23534 rows of apples)
Watermelons.csv (600 rows of watermelons)
Should be output to:
Bananas (Batch 1).csv
Peaches (Batch 2).csv
Apples-1 (Batch 3).csv - First 8k rows of original file
Apples-2 (Batch 4).csv - Next 8k rows
Apples-3 (Batch 5).csv - Remaining 7534 rows of original file
Watermelons (Batch 6).csv
Any ideas? I imagine this will involve a lot of regex, either regex matching or regex count of certain filepaths, then using a multirow to write parts of the naming in different columns, then appending all fields together as a string in another column, but I can't really come up with the logic for this yet.
Solved! Go to Solution.
Hi @BrandonB and thanks for your input, I'm currently trying to set it up as you described in your first reply.
Also, I need the batch number at the end of each file name, and I tried using a multi row formula for that but doesn't seem to work:
If [Row-1:FileName]!=[FileName] then [Row-1:Counter]+1 else [Counter] endif
Used the browse tool to see if I in fact have 40+ unique counter values, and there are only 3. Even though the same browse tool shows me there are 40+ unique file names. Any ideas why?
Also, [Counter] is stuck on 1. I thought after using running total, it would increment in numbers each time the file name changed.
Example workflow is attached. Ignore the text input and the generate rows (This was done for me to create sample data). At this point you would already have your data stacked. Connect a Record ID tool to your data and in the formula after the generate rows, use the Record ID instead of RowCount. The rest of the logic should follow. Finally, you would just have your Output Data tool at the end where you take the new file name that you have created. Let me know if you have any questions!
Thanks for your help BrandonB, that's what I was looking for.