Hi all,
I’ve got an issue getting largish data sets from a SQL Server data warehouse in the UK to an AWS Tableau Server deployment in the US (these can’t easily be moved due to other dependencies). I originally posted with some more details on the Tableau community forums with limited success, but I’m hoping there might be an Alteryx fix.
Analysis
It seems that the bottleneck is the slowness of the standard protocol (TDS over TCP) over this distance – see the more detailed notes in the post referenced above. It’s taking 2 hours for 10M rows (4.5GB) and often timing out. We anticipate larger data sources in future, so we really want a scalable solution.
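For a rough sense of why this looks like a protocol/latency problem rather than raw bandwidth, the figures above work out to only a few megabits per second of effective throughput (back-of-the-envelope, taking 4.5 GB as 4.5 × 10⁹ bytes):

```python
# Effective throughput implied by the observed transfer:
# ~4.5 GB moved in ~2 hours.
size_bits = 4.5e9 * 8      # 4.5 GB expressed in bits
seconds = 2 * 60 * 60      # 2 hours
throughput_mbps = size_bits / seconds / 1e6
print(round(throughput_mbps, 1))  # ~5.0 Mbps effective
```

Around 5 Mbps is far below what a transatlantic link can carry, which is consistent with a chatty, latency-bound protocol rather than a saturated pipe.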
My current workaround is to extract the data locally using Alteryx (4 minutes for the extract) and then publish directly using the Publish to Tableau Server Alteryx tool. However, as stated elsewhere, this doesn’t allow adding some key metadata: dynamically calculated fields, default formatting, aggregation and comments, hierarchies, re-casting some measures as dimensions (e.g. hour of day), embedded data source filters, etc. I know that I can add this metadata in a Tableau workbook, and could even publish that up as a data source with the metadata included, but I don’t know a way to automate this nicely.
Can anyone help? (We have Desktop with the scheduler and may look at Alteryx Server soon.) Is tabcmd something that could push up a workbook after the Alteryx flow has updated the TDE? All possible solutions welcomed!
Thanks in advance!
Hello,
I have been seeing some similar use cases come up. You’re correct, tabcmd can help in this situation – in particular, the --replace option, which lets you take the metadata layer of an existing published data source and put it on top of a new .tde file. If you use the Run Command tool in your workflow, you should be able to call tabcmd after the .tde has been created to append to or correct the metadata layer on the new .tde.
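A minimal sketch of what that Run Command call might look like (in practice the Run Command tool would typically invoke a .bat file wrapping these lines; the server URL, project, data source name, and file path below are all hypothetical placeholders):

```shell
# Sketch: republish the freshly built extract over an existing published
# data source, keeping the metadata already defined on the server.
# All names below are hypothetical placeholders.
EXTRACT="C:/extracts/sales.tde"

# Log in once per session (credentials shown as placeholders):
# tabcmd login -s https://tableau.example.com -u svc_account -p "****"

# --replace swaps the extract data while leaving the published data
# source (and its metadata layer) in place:
CMD="tabcmd publish \"$EXTRACT\" --name \"Sales\" --project \"Finance\" --replace"
echo "$CMD"
```

The key point is `--replace` rather than `--overwrite`: replace updates the data behind the existing data source, so calculated fields, hierarchies, and other metadata defined on the server survive the refresh.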
For anyone coming to this later, we ended up using a US-based Snowflake instance, and the workflow was:
A slightly improved version would be to run step 2 only on the first execution: you lose history on the table if you drop it, so on subsequent runs it’s better to just truncate.
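The truncate-vs-drop choice can be sketched as a tiny helper that picks the preparation statement for the load step (the table and stage names are hypothetical; the real workflow would run these statements against Snowflake):

```python
# Sketch of the improved load step: create the Snowflake table only on
# the first run, then TRUNCATE (not DROP) on later runs so the table's
# history and grants survive. Table and stage names are hypothetical.

def load_statements(first_run: bool, table: str = "ANALYTICS.PUBLIC.SALES"):
    if first_run:
        prep = f"CREATE TABLE {table} ( /* columns elided */ )"
    else:
        # TRUNCATE empties the table but keeps its definition, grants,
        # and time-travel history intact.
        prep = f"TRUNCATE TABLE {table}"
    return [prep, f"COPY INTO {table} FROM @extract_stage"]

print(load_statements(False)[0])  # TRUNCATE TABLE ANALYTICS.PUBLIC.SALES
```

Dropping and recreating the table on every run also invalidates any grants or dependent objects, which is another reason to prefer truncation once the table exists.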