This seems like a trivial problem, but I was wondering what some best practices are.
I'm doing a bulk ingest from a single folder of CSV files into SQL Server 2012. Each CSV file contains the fields (Key, Value, DateIssued). I'm using the SQL Server Bulk Loader (ssvb) because the files are massive. New CSV files land in the folder every day, and I run the bulk-ingest job once a week.
1) Upsert. Newer CSV files supersede data in older files, so I sort the directory by create time and process the files in time order. In Alteryx's bulk loader, how do I specify that if the incoming DateIssued value (a non-primary-key field) is newer than the existing row's, the existing row in the SQL database should be overwritten, and otherwise a new row inserted? (There isn't an update-if-newer option in the ssvb tool.)
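To make the intent concrete, here's a minimal Python sketch of the update-if-newer-else-insert rule I'm after. The function name is made up and a plain dict stands in for the SQL table; the real job would do this against SQL Server:

```python
from datetime import date

def upsert_if_newer(table, key, value, date_issued):
    """Insert a row when `key` is absent; overwrite the existing row only
    when the incoming DateIssued is newer. `table` is a dict standing in
    for the SQL table: key -> (value, date_issued)."""
    existing = table.get(key)
    if existing is None or date_issued > existing[1]:
        table[key] = (value, date_issued)

table = {}
upsert_if_newer(table, "A", 1, date(2023, 1, 1))   # key absent: insert
upsert_if_newer(table, "A", 2, date(2023, 2, 1))   # newer: overwrite
upsert_if_newer(table, "A", 0, date(2022, 12, 1))  # older: ignored
```

After the three calls above, key "A" holds the value from the newest file only.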
2) Best practices for tracking files already processed. How do I keep track of the files I've already processed so I can skip them the next time the Alteryx job runs? Can I store the processed filenames in a table and have Alteryx skip every filename already in that table?
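For context, this is the skip logic I have in mind, sketched in Python with hypothetical names (in the real workflow the processed set would be read from a SQL tracking table, and create times would come from the file system):

```python
def files_to_process(all_files, processed):
    """Return the files not yet ingested, oldest first.
    `all_files` is a list of (filename, create_time) pairs;
    `processed` is the set of filenames already recorded in the tracking table."""
    pending = [f for f in all_files if f[0] not in processed]
    return sorted(pending, key=lambda f: f[1])

all_files = [("b.csv", 2), ("a.csv", 1), ("c.csv", 3)]
processed = {"a.csv"}
print(files_to_process(all_files, processed))  # → [('b.csv', 2), ('c.csv', 3)]
```

After a successful run, each ingested filename would be appended to the tracking table so the next weekly run filters it out.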
Thanks!