Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Snowflake in-db browse very slow

fmvizcaino
17 - Castor
17 - Castor

Hi community,

 

I have been facing some issues with a client’s project that uses Snowflake and in-DB.

 

When I run this workflow, it takes 3.5 seconds to complete, but the same workflow with the in-DB Browse takes 30 minutes. Any idea what could be causing this?

The ODBC configuration has nothing out of the ordinary as well.

 

Thank you,

Fernando Vizcaino

14 REPLIES 14
apathetichell
18 - Pollux

Are you using DSII or the "recommended" driver? Is this on SSO?

 

my test run was 100 record browse in 4.5 seconds In-DB - (source has about 20 fields  and "a lot" of records)

fmvizcaino
17 - Castor
17 - Castor

Hi @apathetichell ,

 

Thank you for your answer.

We did our test with the drivers available on the data source's page (not sure what DSII means 😅). https://help.alteryx.com/20221/designer/snowflake 

No SSO.

 

Best,

Fernando Vizcaino

 

ArtApa
Alteryx
Alteryx

Hi @fmvizcaino - Do I understand correctly that the workflow runs slower if you use a Browse In-DB tool?

 

https://help.alteryx.com/20221/designer/browse-db-tool

 

If that's the case then check the configuration of your Browse In-DB. In-DB tools push down processing to the data source side. However, Browse In-DB will download data from your data source to the Desktop to allow you to see a preview in the results window. This looks like a bottle neck, which causes a delay.

 

 

apathetichell
18 - Pollux

@fmvizcaino On that page there are two distinct drivers - the "approved" SIMBA driver and the "Not Certified Driver From External Provider" - the later is the Snowflake provided DSII driver. I use it. I set up via SAML/SSO - but my hunch is service account would be just as good. 

 

my related question would be if you are setting a default compute warehouse and if it's a size x-small... I'd expect this would affect input data as well though.

fmvizcaino
17 - Castor
17 - Castor

Thank you both for the answers.

@ArtApa , I understand that the browse in-db tool downloads the data, but from 3.5 seconds to 30 minutes for the basic 100 rows download seems too much. 

 

@apathetichell , I didn't know that, thanks! We are using a service account to run and have tested both drivers.

 

apathetichell
18 - Pollux

if you want to have a call on this - I can try to walk you through some config options - this behavior is pretty odd. My config is incredibly simple for the DSII and my connection string in-DB is pretty straight forward too. 

 

If I didn't see anything glaringly wrong I'd recommend creating a new configuration in ODBC 64 for a system vs a user (or a user vs a system) and then creating a new In-DB connection. I'd also recommend leaving default compute warehouse blank - if possible, and retesting this on a really simple query (refresh tables and choose a random table)... If there is a syntax issue in the query Snowflake might get unhappy....  

 

ddiesel
13 - Pulsar
13 - Pulsar

Hi @fmvizcaino!

 

I'm subscribing to this post because I'd love to hear of your outcome. I haven't experienced this issue specifically with Snowflake, but there have been other times I have noticed significant latency in the In-Database tools vs. regular input. Please let us know what you find out.


Thanks,

Deb

Cedric
8 - Asteroid

I'm having the same issue, roughly. I have a query that is about 900k rows (250MB, more or less) and it took 4s in the Snowflake UI and took me 4 hours in my INDB data-stream-out tool. 

That is... ridiculous. 

I'm using the Snowflake DSII Driver rather than the Simba driver; it shows up as Snowlake ODBC in my configuration on the read side (and on the write side, I am getting the bulk loader, which I think would in theory be a good thing). The instance I have in Snowflake is an x-small, but, again, 4 hours for 250MB is just... not right. 

fmvizcaino
17 - Castor
17 - Castor

Hey @Cedric ,

 

In my case, the issue was related to the query generated by Alteryx. Our table had more than 2 hundred column and even using select *, Alteryx were translating that into a ginormous query instead of keeping the *.

 

You can use the dynamic output in-db to check your query. 

fmvizcaino_0-1670455287884.png

 

In addition to that, for in-db workflows with a lot of tools, I suggest including temporary tables in a few places there with the Write Data In-DB tool to simplify the resulting query.

fmvizcaino_1-1670455443266.png

 

Best,

Fernando Vizcaino

 

 

 

Labels