Hello community! How can I dynamically set the number of records to keep in the Select Records tool? (Or do this another way entirely!)
Every week new CSV files are generated and I do not have control over the format/layout of these files - they are what they are - but they always have the same fields and types in the same order. These CSV files usually come with 9 rows of info that I want to throw out. But sometimes they don't have these extra rows at the top. (actually 8 rows of strings, then 1 row of [null]).
Here is an example of the first 10 records when it includes the extra rows:
And then I do this:
How do I error-trap for those times where the CSV file doesn't have the extra rows? Is there a way to change the select records config to 1+ if they aren't there? Or can I skip the select records tool entirely if they aren't there?
You say that it always has the same fields in the same order, so what you really care about is finding what is on row 10 ('RPT_SHORT_DESC, ITM_SCN...'), because you don't know which row it will appear on. Here's how I might approach it:
1. If you have multiple CSV files (an unknown quantity of files), you'll likely want to use a Dynamic Input tool or turn this into a macro.
2. For finding what to throw out: since the fields and their order don't change, only which row the header appears on, use a Multi-Row Formula tool to look for that string specifically, flagging all rows before it as 0 and the header row and everything after it as 1. Then pass the data through a Filter tool to keep the rows you want, parse to columns, and use a Dynamic Rename to get the column names correct.
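The flag-and-filter idea in step 2 can be sketched outside Alteryx in plain Python. This is just an illustration of the logic, not the workflow itself; `RPT_SHORT_DESC` is the field name mentioned in the thread, and the sample data below is hypothetical.

```python
def strip_preamble(lines):
    """Drop any rows that appear before the real header row.

    Mirrors the Multi-Row Formula approach: rows before the header
    are flagged for discard, the header and everything after are
    kept. If no preamble exists, nothing is dropped.
    """
    for i, line in enumerate(lines):
        if line.startswith("RPT_SHORT_DESC"):
            return lines[i:]   # header found: keep from here onward
    return lines               # no header marker: keep everything


# With the 9-row preamble present (8 info rows, then a blank row):
raw = ["info"] * 8 + [""] + ["RPT_SHORT_DESC,OTHER_FIELD", "a,b", "c,d"]
print(strip_preamble(raw))  # -> ['RPT_SHORT_DESC,OTHER_FIELD', 'a,b', 'c,d']
```

Because the function only anchors on the header string, a file that arrives without the extra rows passes through unchanged, which is exactly the error-trapping behavior asked about.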
Brilliant! I had never used the Multi-Row Formula before, so I had to learn something new, and it works! I can definitely filter on that.
And yes, I do have a varying number of CSV files each time... I am currently pulling them all in at once using an asterisk for the filename and using the filenames to populate a new field. Then I am cleaning them and adding them to a Calgary database for use by multiple users in future applications.
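For readers outside Alteryx, the wildcard-input step described above (read every matching CSV and record which file each row came from) can be sketched with the standard library. The pattern and file contents here are hypothetical stand-ins for the weekly files.

```python
import csv
import glob

def read_all_csvs(pattern):
    """Read every CSV matching a glob pattern, prepending the
    source filename to each row, similar to a wildcard input
    that writes the file name into a new field."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="") as fh:
            for row in csv.reader(fh):
                rows.append([path] + row)
    return rows
```

The combined rows could then be passed through the preamble-stripping step per file before being appended to the shared database.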
Would you mind walking me through why I may want to use a Dynamic Input instead? I am early enough in development that I could change course if it will help me, so I am open to ideas.
Actually, in your case the *.csv wildcard on a single Input tool may work best, as the Dynamic Input tool requires a template file. Since your challenge is that the file structure may differ, I think you'll keep things simpler by sticking with your current wildcard input setup.
The Dynamic Input tool is very useful if you need to read in a list of files, or especially if you need to modify database SQL queries on the fly.
And yes - Multi-row formula tool is one of my favorites :smileyvery-happy: