Does anyone know how scheduling/refreshing live data (web scraping) from Alteryx to Tableau Server works?
I created a web scraping workflow in Alteryx and published it to Tableau Server.
I am not sure what the best way to schedule/refresh the data is.
In Alteryx, I am currently using Local Machine and running the workflow from its original location on disk.
In Tableau, I am not sure if I should use an extract (TDE) to schedule/refresh the data or use live data. If my data is live, do I still need to refresh it?
Thank you,
Kazumi
If you are using Alteryx to do the web scraping and data prep, you would then use the Publish to Tableau tool to push the TDE directly to Tableau Server from Alteryx. If you are running Alteryx Server, you would publish your Alteryx workflow there and schedule it to run from there. If you have a desktop automation license, you could also schedule it on your local machine running Designer, but the effect will be the same in both cases: Alteryx pulls and preps your data and then pushes the TDE directly to Tableau Server.
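For the local-machine option, here is a rough sketch of a script that an external scheduler (for example Windows Task Scheduler) could call instead of the built-in scheduler. It assumes the command-line engine `AlteryxEngineCmd.exe` that comes with the desktop automation license; the install path and the workflow path below are assumptions for illustration, so adjust both for your machine.

```python
import subprocess
from pathlib import PureWindowsPath

# Assumed install location of the Alteryx command-line engine; verify on your machine.
ENGINE_CMD = PureWindowsPath(r"C:\Program Files\Alteryx\bin\AlteryxEngineCmd.exe")

# Hypothetical workflow path, for illustration only.
WORKFLOW = PureWindowsPath(r"C:\workflows\web_scrape.yxmd")


def build_command(engine=ENGINE_CMD, workflow=WORKFLOW):
    """Return the argument list that runs one workflow from the command line."""
    return [str(engine), str(workflow)]


def run_workflow():
    """Run the workflow once and return the engine's exit code."""
    return subprocess.run(build_command()).returncode
```

Calling `run_workflow()` on a schedule has the same effect as the other options: the workflow runs, and its Publish to Tableau step pushes the TDE to Tableau Server.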
Hi Jason,
Thank you for your reply. Even if I schedule the workflow on my local machine running Designer, I still need to schedule a refresh of the TDE on Tableau Server, is that correct? In addition, if the data source is "Live" instead of "Extract", I don't have to schedule a data refresh on Tableau Server (but I still need to schedule the workflow on my local machine running Designer), correct?
Kazumi
Hi @knozawa,
Live vs. Extract is a Tableau-side distinction. Connecting Live means you connect directly to the full data in its most recent form (this is independent of how your Alteryx workflow is scheduled). An extract is simply a snapshot of the data, so if you are connecting to the server with an extract (even if you schedule the workflow on your local machine running Designer), you will need to schedule a refresh of the extract on Tableau Server. If you are connecting Live, no refresh is needed: you just schedule your workflow, and Tableau will always use the most recent data.
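To make the difference concrete, here is a toy model in Python. None of these classes are Alteryx or Tableau APIs; they are hypothetical stand-ins that only mimic the behaviour described above, where a live connection reads the source at query time and an extract serves a copy until it is refreshed.

```python
# Toy model only: none of these classes are Alteryx or Tableau APIs.

class PublishedSource:
    """Stands in for the data that the Alteryx workflow overwrites on each run."""
    def __init__(self, rows):
        self.rows = list(rows)

    def overwrite(self, rows):
        # What a scheduled workflow run does: replace the published data.
        self.rows = list(rows)


class LiveConnection:
    """Reads the source at query time, so it always sees the latest run."""
    def __init__(self, source):
        self.source = source

    def query(self):
        return list(self.source.rows)


class Extract:
    """Copies the source once and serves that copy until refreshed."""
    def __init__(self, source):
        self.source = source
        self.snapshot = list(source.rows)

    def refresh(self):
        self.snapshot = list(self.source.rows)

    def query(self):
        return list(self.snapshot)


source = PublishedSource(["price=10"])
live = LiveConnection(source)
extract = Extract(source)

source.overwrite(["price=12"])   # the scheduled workflow runs again
stale = extract.query()          # the extract still holds the old snapshot
fresh = live.query()             # the live connection already sees the new data
extract.refresh()                # the scheduled extract refresh
after_refresh = extract.query()  # now the extract matches the source
```

In this sketch the extract only catches up after `refresh()` is called, which is why an extract needs its own refresh schedule on top of the workflow schedule.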
Best,
That's an excellent answer, Sophia.
In terms of tuning/optimization, can you let me know which of these two is better:
1. Publishing data to Tableau Server directly from Alteryx and using it as a Live connection, or
2. Publishing data to an RDBMS (e.g. SQL Server) using Alteryx, creating an extract, scheduling it, and using that.
Thanks,
Srini