Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.


SOLVED

Best practice for Snowflake & Alteryx

knozawa
11 - Bolide

Hello,

 

I have several questions about Snowflake connecting from Alteryx.

 

1. Bulk loader without primary key field

Would it be possible to upload data into Snowflake using the bulk loader when the data does not include a primary key field?

Background: The target table in Snowflake has a primary key column defined with AUTOINCREMENT.

 

2. Insert new data into Snowflake using bulk load

It seems like "Update: Insert if New" is not available for bulk load.  In that case, "Append Existing" is probably the best option.  If there are any suggestions, please let me know!

 

3. Filling in missing values in an existing table

What is the best way to modify data in Snowflake?

Background:

Some rows are missing values, and I would like to fill them in.  Would it be better to delete the specific rows and append the corrected data?  Or modify the data using a SQL statement in the Alteryx Output tool?  Or clone the original table --> create a modified table --> replace the original?

 

4. Normal Input/Output tools vs. In-Database connection tools

What are the advantages/disadvantages of using the In-Database connection tools?

 

Sincerely,

knozawa

cmcclellan
13 - Pulsar

Hopefully these answers will help 🙂 

 

1) Yes, I'm pretty sure you can do this.  In the file format you can specify that the PK field is not in the file, so Snowflake will generate the value.  I can dig out some details from an old project if you need more info 🙂 

 

2) I don't think Snowflake supports UPSERT during loading.  Usually I've uploaded to an empty staging table and then used SQL to achieve the same thing.
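For reference, that staging-table pattern can be sketched in Snowflake SQL roughly as follows (all table and column names here are hypothetical placeholders, not from the original workflow):

```sql
-- Bulk load the batch into an empty staging table first (via the Alteryx
-- bulk loader), then merge it into the target table in one statement.
MERGE INTO my_table AS t
USING my_stage_table AS s
    ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET t.amount = s.amount
WHEN NOT MATCHED THEN
    INSERT (id, amount) VALUES (s.id, s.amount);
```

This gives the same effect as "Update: Insert if New", but as a single set-based operation, which is the access pattern Snowflake is optimized for.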

 

3) It depends on your situation.  Snowflake isn't designed for single-record maintenance; you can do it, but don't try it for a large number of records and expect good performance.  It can take seconds to bulk load millions of rows, but hours to edit a thousand records individually.  If you understand how Snowflake works, you can design a really fast workflow.
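To illustrate the set-based approach for question 3 (table and column names are hypothetical): bulk load the corrected values into a helper table, then fill in the gaps with a single UPDATE rather than editing rows one at a time.

```sql
-- "fixes" is a helper table of corrected values, bulk-loaded beforehand.
-- One set-based UPDATE fills all missing values at once.
UPDATE my_table t
SET amount = f.amount
FROM fixes f
WHERE t.id = f.id
  AND t.amount IS NULL;
```

This avoids both the delete-and-append round trip and the clone/replace dance for simple fill-in corrections.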

 

4) The same as with every other database: In-DB means the data stays in the database.  It's VERY fast and efficient, and if you're just transforming data (not loading or unloading), In-DB will make things really fast.

knozawa
11 - Bolide

@cmcclellan,

 

Thank you very much for your detailed reply!

 

About the #1 question (auto-increment PK using the bulk loader): when I tried to upload data into Snowflake without a primary key field using bulk loading, I received the following error message:

The COPY failed with error: Number of columns in file (2) does not match that of the corresponding table (3), use file format option error_on_column_count_mismatch=false to ignore this error

 

I've asked the same question on Stack Overflow as well: https://stackoverflow.com/questions/60326474/autoincrement-primary-key-for-snowflake-bulk-loading/60...

 

It seems like there is a way to work around the issue, as you said.  Do you know if we can achieve it using the Output tool by modifying the pre/post SQL, or do we need to use SnowSQL and the cmd tool?

 

Sincerely,

knozawa

 

cmcclellan
13 - Pulsar

Yes, I agree with the SO reply: leave the column out of the COPY statement and it should work.  The autoincrement field is implied because it has to be there 🙂 
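As a sketch of what that COPY could look like (assuming a hypothetical three-column table where `id` is the AUTOINCREMENT column and the file contains only the other two fields), using Snowflake's transformation syntax to map the staged file columns onto the non-PK columns:

```sql
-- Target table: my_table (id AUTOINCREMENT, name, amount)  -- hypothetical
-- List only the non-identity columns; Snowflake generates id.
COPY INTO my_table (name, amount)
FROM (SELECT t.$1, t.$2 FROM @my_stage t)
FILE_FORMAT = (TYPE = CSV);
```

In Alteryx this kind of statement could be run from the Output tool's pre/post SQL or via SnowSQL; the bulk loader itself generates its own COPY, which is why it hits the column-count mismatch.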

steven4320555
8 - Asteroid

In @cmcclellan 's point 3: "It can take seconds to bulk load millions of rows, but hours to edit a thousand records individually.  If you understand how Snowflake works then you can design a really fast workflow" 

 

This is a very interesting insight!  Snowflake is optimised for bulk operations and analytics, not for transactional, single-row operations.
