Alteryx Designer Desktop Discussions

Suggestion for Directory Structure for Data Files Download

StockMarket
8 - Asteroid


Hello


I have a list of URLs to download from a server. The files are of different types, such as CSV, PDF and text. New files need to be downloaded from that server for every day on which the stock markets were open.


Once I have downloaded these individual files, I need to clean the data and then upload it into the database, into the respective tables. One table per file type already exists in the database, and the new data simply needs to be appended to the respective table on a daily basis.


Now, I am unable to figure out whether I should download all these 10+ data files into a single folder for "EACH DATE"

OR

whether I should have a single folder for each "FILE CATEGORY" and put the files for all the different dates into that folder. That would look something like this, for example:


C:\Data\Alteryx\PR-Files
    Date1File
    Date2File
    Date3File
C:\Data\Alteryx\DAT-Files
    Date1File
    Date2File
    Date3File
C:\Data\Alteryx\EQUITIES-Files
    Date1File
    Date2File
    Date3File

 

Should the download directory structure be "DATE WISE" or "FILE NAME WISE" in my case, where 10+ different files are downloaded daily and need to be processed and then appended to the respective database tables?


Which method makes more sense from an Alteryx workflow efficiency perspective? Which method will give me more control over designing the workflow and then running it on a daily basis?


I also need to include a step at the end of these workflows that will VERIFY that all these 10+ files have been downloaded, processed and uploaded to the respective database tables. I need this FINAL REPORT for each of the 10+ individual files on a daily basis, so that I can just look at the report and be sure that the whole workflow completed successfully for all of them. If there was a failure for any particular file, the report should automatically HIGHLIGHT/NOTIFY that issue so that I can fix the problem for that file.


So I want to choose the directory structure that will allow me to perform this kind of VERIFICATION and generate the final report of success or failure for each data file, for every single date.
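
As an illustration of the verification step described above, here is a minimal Python sketch (not an Alteryx workflow) that covers only the download stage. It assumes a hypothetical layout of one folder per file category with the date stamped into each file name, e.g. `<base_dir>/Stocks/Stocks-20210315.csv`; the function name `download_report`, the category names and the `OK`/`MISSING` labels are made up for the example.

```python
from datetime import date
from pathlib import Path


def download_report(base_dir, categories, day):
    """Return {category: "OK" or "MISSING"} for the given day's downloads."""
    stamp = day.strftime("%Y%m%d")
    report = {}
    for category in categories:
        # Assumed layout: <base_dir>/<category>/<category>-<YYYYMMDD>.<any extension>
        folder = Path(base_dir) / category
        found = False
        if folder.is_dir():
            for candidate in folder.glob(f"{category}-{stamp}.*"):
                # An empty file is treated as a failed download.
                if candidate.is_file() and candidate.stat().st_size > 0:
                    found = True
                    break
        report[category] = "OK" if found else "MISSING"
    return report


if __name__ == "__main__":
    # Hypothetical base folder and categories, for illustration only.
    result = download_report(r"C:\Data\Alteryx", ["Stocks", "Commodities", "Forex"], date(2021, 3, 15))
    for category, status in result.items():
        print(f"{category}: {status}")
```

The same per-category, per-date loop could in principle be extended to the processing and upload stages (for example by comparing row counts), but that depends on the database and is not shown here.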

 

Please suggest any ideas you have in this regard.


Thanks a lot


PS: You might have a look at my previous thread as well, where I generated these URL lists to be downloaded on a daily basis - https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Workflow-to-generate-List-of-URL-to-be...

afv2688
16 - Nebula

Hello @StockMarket ,

 

What you are asking is, for me, too business specific; this kind of logic should probably be discussed with your team rather than here.

 

Anyway, having said that, and without taking responsibility for whatever happens next, I would recommend keeping files that share the same structure together, with the date included in the file name. Files downloaded from the same URL tend to share the same structure. You can then use batch macros to automate the ETL, analytics and other processes for each file. You can build the logic for each file type in its own macro and then merge the results afterwards.

 

Again, this is how I like to do it, nothing else.

 

Regards

StockMarket
8 - Asteroid

Thanks again for the inputs @afv2688 

 

Actually, this particular question is not as business specific as it might initially appear, because it is about best practices for designing an Alteryx workflow that reads data from multiple files into a single workflow, which is a common scenario in many use cases.

 

Having said that, if I understand your reply correctly, you are suggesting that I "ADD the DATE" to the original file names, e.g.:

 

Stocks-20210315

Stocks-20210316

Stocks-20210317

Commodities-20210315

Commodities-20210316

Commodities-20210317

Forex-20210315

Forex-20210316

Forex-20210317

And So On for all different file types.

 

And then design a batch macro that works on an individual file of Stocks, Commodities or Forex, and make it loop through all the different DATES for that particular file type, right?
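
To make the "one file type, loop over its dates" idea concrete outside Alteryx, here is a small Python sketch. It assumes the naming convention from the examples above (`<Category>-<YYYYMMDD>`, letters only in the category); the pattern and the function name `group_by_category` are illustrative, not part of any Alteryx feature.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed convention: letters, a hyphen, then an 8-digit date stamp.
NAME_PATTERN = re.compile(r"^(?P<category>[A-Za-z]+)-(?P<stamp>\d{8})$")


def group_by_category(stems):
    """Group file names (without extension) into {category: sorted list of dates}."""
    groups = defaultdict(list)
    for stem in stems:
        match = NAME_PATTERN.match(stem)
        if match is None:
            continue  # skip names that do not follow the assumed convention
        # strptime raises ValueError if the stamp is not a real calendar date.
        day = datetime.strptime(match["stamp"], "%Y%m%d").date()
        groups[match["category"]].append(day)
    return {category: sorted(days) for category, days in groups.items()}


if __name__ == "__main__":
    names = ["Stocks-20210316", "Stocks-20210315", "Forex-20210315"]
    for category, days in group_by_category(names).items():
        # One pass per file type, iterating over its dates in order.
        print(category, [d.isoformat() for d in days])
```

Because the category and date are both recoverable from the file name, this kind of grouping works whether the files sit in one folder or in separate folders.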

 

You haven't said whether I should have separate folders for Stocks, Commodities and Forex in this case, or date-wise folders of 20210315, 20210316 and 20210317 respectively. Which of these two would you suggest, once I add the date to the file names as you suggested earlier?

 

For running the batch macros, which directory structure would make more sense? Or does it make no difference at all whether the files of the above 3 types are mixed together in the same folder or systematically arranged into their respective folders?

 

Any ideas, anyone? Aren't there any best practices or suggested approaches for such use cases?

 

Thanks and Regards

 

 

afv2688
16 - Nebula

Hello @StockMarket ,

 

Regarding the folders, I would go for a separate folder for each file type. For me, it's easier to look for them afterwards that way.
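
As a rough sketch of what that layout could look like when building the download target, the Python snippet below puts each file in a folder named after its type and stamps the date into the file name. The base folder, the category name, the `csv` extension and the helper name `target_path` are all assumptions made for the example.

```python
from datetime import date
from pathlib import PureWindowsPath


def target_path(base_dir, category, day, extension):
    """Build the path: base folder, then category folder, then Category-YYYYMMDD.ext."""
    file_name = f"{category}-{day.strftime('%Y%m%d')}.{extension}"
    # PureWindowsPath only formats the path; it does not touch the file system.
    return PureWindowsPath(base_dir) / category / file_name


if __name__ == "__main__":
    print(target_path(r"C:\Data\Alteryx", "Stocks", date(2021, 3, 15), "csv"))
    # prints C:\Data\Alteryx\Stocks\Stocks-20210315.csv
```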

 

Whatever you go for in the end, I would pay attention to always keeping your data in the same hierarchy (if it has the same value). This is my "best practices" advice.

 

Regards
