Performance issues while loading data into Snowflake from MS Access Database
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi All,
I am facing some challenges while trying to migrate few large tables (~ 50 GB volume) from MS Access Database to Snowflake . I have tried different ways by changing the Design without any success. All the Access DB files are sitting in one of the File Server . We need to process these files individually in order to maintain proper file validation like the file count , load status , file name , etc... . So we are using the Batch Macro to load the files iteratively. Below are some of the different design approach we took thinking it will improve the performance .
Design 1 : Using Input / Output tools
When Alteryx try to read those files from network location there is performance bottleneck on the Source side . I tried to increase the Cache size but there is no improvement in the read performance . Also writing to snowflake was slow using the Snowflake ODBC .
Design 2 : Using Input & IN DB Tools
Changed the Target design by replacing the Output Tool with Write In DB tool to achieve better performance . So connected the Input Macro with In DB tools . We are reading the files iteratively and using Data Stream In creating a temporary table to load data into actual Snowflake table using Write In DB tool. Once the data is loaded I am using Data Stream Out tool to extract only the filename which was one of the metadata column to update the File Validation table using POST SQL in Output tool.
Any suggestions on how to optimize the existing Design and improve the extract and load performance with be highly appreciated. Looking forward for your responses .
Thanks
SA
- Labels:
- Optimization
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hey @ansars ,
1. Reading from file server: Activating the AMP engine is the only option. Other than that, the slowness you are experiencing is related to your network/VPN and not alteryx. I would suggest scheduling this part of the process to run at night and save everything to a local folder in the server itself in yxdb format.
2. Writing to Snowflake: I suggest using the Snowflake bulk loader option.
Best,
Fernando Vizcaino
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Hi Fernando,
Thanks for your quick response. Snowflake bulk loading solved the performance issue .
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Notify Moderator
Question about the Snowflake Bulk loader. Can it work on the Server? and does anyone have some example of the ODBC connection string?
Trying to get the SF bulk loader to work on server and running into issues - is there a workaround?
