Hi All,
I am facing some challenges while trying to migrate few large tables (~ 50 GB volume) from MS Access Database to Snowflake . I have tried different ways by changing the Design without any success. All the Access DB files are sitting in one of the File Server . We need to process these files individually in order to maintain proper file validation like the file count , load status , file name , etc... . So we are using the Batch Macro to load the files iteratively. Below are some of the different design approach we took thinking it will improve the performance .
Design 1 : Using Input / Output tools
When Alteryx try to read those files from network location there is performance bottleneck on the Source side . I tried to increase the Cache size but there is no improvement in the read performance . Also writing to snowflake was slow using the Snowflake ODBC .
Design 2 : Using Input & IN DB Tools
Changed the Target design by replacing the Output Tool with Write In DB tool to achieve better performance . So connected the Input Macro with In DB tools . We are reading the files iteratively and using Data Stream In creating a temporary table to load data into actual Snowflake table using Write In DB tool. Once the data is loaded I am using Data Stream Out tool to extract only the filename which was one of the metadata column to update the File Validation table using POST SQL in Output tool.
Any suggestions on how to optimize the existing Design and improve the extract and load performance with be highly appreciated. Looking forward for your responses .
Thanks
SA
Hey @ansars ,
1. Reading from file server: Activating the AMP engine is the only option. Other than that, the slowness you are experiencing is related to your network/VPN and not alteryx. I would suggest scheduling this part of the process to run at night and save everything to a local folder in the server itself in yxdb format.
2. Writing to Snowflake: I suggest using the Snowflake bulk loader option.
Best,
Fernando Vizcaino
Hi Fernando,
Thanks for your quick response. Snowflake bulk loading solved the performance issue .
Question about the Snowflake Bulk loader. Can it work on the Server? and does anyone have some example of the ODBC connection string?
Trying to get the SF bulk loader to work on server and running into issues - is there a workaround?