Alteryx Connect

Delta loaders for Connect

We're working through an implementation of Connect, and it appears that every time Connect scans the Alteryx or Tableau environment, it does a full reload of all canvases / workbooks.

However, we have several thousand Tableau dashboards and Alteryx canvases, so the loaders take a long time to run. We'd want to run them every few hours so that Connect has up-to-date information (or at most 24 hours out of date), but running a full export and scan each time makes the load take a very long time.

 

Can we change the default behaviour for all the loaders to use a delta-load rather than doing a full scan - i.e. only pull out assets that have changed since the last load?

 

cc: @nganesha @Kosi

3 Comments
Alteryx Certified Partner

Including an option for a full reload would be useful too, but I agree that a delta loader setup would be significantly better than full reloads every time.

Alteryx Partner

+1

Alteryx
Status changed to: Comments Requested

Hi @SeanAdams,

What delta do you have in mind?

 

Loading of metadata works in two steps. 

1) load the data into stage

2) create/update/delete assets in Connect

 

You are right that the first step is always a "full load", while the second one is actually a "delta".
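To make the second step concrete, here is a minimal sketch of how a full staged snapshot could be compared against the assets already present, so that only the differences are applied. This is an illustration only, not Connect's actual implementation; the function name `diff_stage`, the dictionary layout, and the sample asset ids are all assumptions.

```python
from typing import Dict, List, Tuple

Asset = Dict[str, str]  # hypothetical asset record: field name -> value


def diff_stage(
    staged: Dict[str, Asset], current: Dict[str, Asset]
) -> Tuple[List[str], List[str], List[str]]:
    """Compare a full staged snapshot with existing assets, keyed by asset id.

    Returns the ids to create, update, and delete.
    """
    to_create = sorted(k for k in staged if k not in current)
    to_delete = sorted(k for k in current if k not in staged)
    to_update = sorted(
        k for k in staged if k in current and staged[k] != current[k]
    )
    return to_create, to_update, to_delete


# Hypothetical sample data: step 1 staged everything, step 2 applies only changes.
current = {
    "wb-1": {"name": "Sales", "owner": "ann"},
    "wb-2": {"name": "Costs", "owner": "bob"},
}
staged = {
    "wb-1": {"name": "Sales", "owner": "ann"},   # unchanged
    "wb-2": {"name": "Costs", "owner": "cara"},  # owner changed
    "wb-3": {"name": "Stock", "owner": "dan"},   # new
}
create, update, delete = diff_stage(staged, current)
```

In this sketch the comparison still needs the complete staged snapshot, which is why the first step remains a full load.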

 

The question is how the "delta" on the data source side should happen. There is no generic way to distinguish what has changed since the last load, so the comparison has to happen somewhere. There may be a way for some of the technologies that hold a "last update" date (such as a filesystem), but even then the loader would have to rely on knowing when the source was last harvested, e.g. 1 week ago, so that it could ask "what has changed in the last week?". That means that if the loader doesn't run properly, the system becomes inconsistent, because there is suddenly a week of changes that is not reflected.
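As a rough sketch of the "last update" approach and the consistency concern, the loader could keep a watermark (the start time of the last successful run) and only move it forward when a run completes. Nothing here is an existing Connect feature; `changed_since`, `run_delta_load`, and the sample data are hypothetical names used for illustration, and the sketch assumes the source exposes a reliable modification timestamp per item.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional


def changed_since(
    modified: Dict[str, datetime], watermark: Optional[datetime]
) -> List[str]:
    """Return ids modified after the watermark; no watermark means a full load."""
    if watermark is None:
        return sorted(modified)
    return sorted(i for i, ts in modified.items() if ts > watermark)


def run_delta_load(
    modified: Dict[str, datetime],
    watermark: Optional[datetime],
    run_started: datetime,
    process: Callable[[str], None],
) -> Optional[datetime]:
    """Process changed items and return the watermark to persist."""
    try:
        for item_id in changed_since(modified, watermark):
            process(item_id)
    except Exception:
        # Failed run: keep the old watermark so the next run retries the same window.
        return watermark
    return run_started


# Hypothetical source: item id -> last modification time.
source = {
    "wb-1": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "wb-2": datetime(2024, 1, 9, tzinfo=timezone.utc),
}
last_ok = datetime(2024, 1, 5, tzinfo=timezone.utc)
now = datetime(2024, 1, 10, tzinfo=timezone.utc)

loaded: List[str] = []
new_mark = run_delta_load(source, last_ok, now, loaded.append)
```

Keeping the old watermark after a failure means the missed window is picked up on the next run rather than lost. A timestamp filter alone would not reveal items deleted at the source, so an occasional full reload would probably still be needed.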

 

We have thought about other ways, but we haven't found any that is faster than the current one. If you have any idea how the comparison could technically be done in the data source, I would be very interested to hear it.