Hi Community,
I need to stitch together certain files from an SFTP server, which holds one file per day.
Each file has 90 days of data in it, so any date range longer than 90 days requires multiple files. The issue is that each file is most accurate for the days further in the past, so we could be missing data points if we simply stitched files together every 90 days. To fix this, the files should be stitched together in a way that prioritizes the last 30 days of each file. Below is an example.
Date Range Pull: 1/1/2021 – 6/1/2021 = 151 days
Files used:
- The most recent date in the range pull dictates which file we start with.
- 30 days prior to 6/1/2021
- 31 days prior to 5/2/2021
- This step is 31 days rather than 30 because the number of days between 1/1/2021 and 6/1/2021 is not divisible by 30. Instead of pulling another file just for the one extra day, we are fine with using the 4/1/2021 file and taking the 31st day from it.
- The last file can contribute its last 30-60 days, but if more than 60 days remain, we must pull a new file.
How the files will be stitched together:
- Uses data from 3/3/2021 – 6/1/2021
- Uses data from 2/1/2021 – 3/2/2021
- Uses data from 1/1/2021 – 2/1/2021
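
To make the rule concrete, here is a rough Python sketch of how the file dates and date windows could be planned. It is only my reading of the example, not a confirmed approach. It assumes a file dated D covers D minus 90 days through D, that "the last 30 days" means the oldest 30 days in a file, and that windows should not overlap (so the earliest window ends on 1/31/2021 rather than 2/1/2021). The function name `plan_stitch` and the constants are made up for illustration.

```python
from datetime import date, timedelta

FILE_SPAN_DAYS = 90   # assumption: a file dated D covers D-90 .. D
WINDOW_DAYS = 30      # preferred slice, taken from the oldest end of a file
MAX_LAST_WINDOW = 60  # the earliest file may contribute up to this many days


def plan_stitch(start: date, end: date):
    """Return (file_date, slice_start, slice_end) tuples, newest first."""
    if start > end:
        raise ValueError("start must not be after end")
    # The file dated `end` anchors the pull and supplies everything it covers.
    first_start = max(start, end - timedelta(days=FILE_SPAN_DAYS))
    plan = [(end, first_start, end)]
    cursor = first_start - timedelta(days=1)  # newest day not yet covered
    while cursor >= start:
        remaining = (cursor - start).days + 1
        # Up to 60 remaining days fit in one last file; otherwise take 30.
        take = remaining if remaining <= MAX_LAST_WINDOW else WINDOW_DAYS
        slice_start = cursor - timedelta(days=take - 1)
        # Choose the file whose oldest day is slice_start.
        file_date = slice_start + timedelta(days=FILE_SPAN_DAYS)
        plan.append((file_date, slice_start, cursor))
        cursor = slice_start - timedelta(days=1)
    return plan


if __name__ == "__main__":
    for file_date, lo, hi in plan_stitch(date(2021, 1, 1), date(2021, 6, 1)):
        print(f"file {file_date}: use {lo} to {hi}")
```

For the 1/1/2021 – 6/1/2021 pull this picks the files dated 6/1/2021, 5/2/2021 and 4/1/2021, which matches the example. Reading and concatenating the actual files would depend on the file format, so that part is left out.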