I'm only just starting to explore the Python and HTML SDKs, but I think this functionality would be really useful for Alteryx tools.
I foresee cases where a custom tool is developed and we want to install it for 20+ users. Rather than having each user manually open and install the file, then troubleshooting for each of them (which would also become challenging if we want to deploy an enhancement to the tool in the future), I'd like a method (preferably via the command line) to automatically install a tool for a user without any interaction or input.
This would allow for targeted tool deployment as well as large-scale tool maintenance as custom Python tools mature in the enterprise space.
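As a sketch of what an unattended install could do today: a YXI is (as far as I can tell) just a ZIP archive containing the tool folder and its Config.xml, and per-user tools appear to live under %APPDATA%\Alteryx\Tools. Assuming that holds, something like the following could be pushed out by a deployment script - note it deliberately skips the pip dependency-install step the normal installer performs:

```python
# Hypothetical unattended YXI install: treat the .yxi as a ZIP archive and
# extract it into the per-user Alteryx tools directory. The paths and package
# layout are assumptions and may differ by Alteryx version.
import os
import zipfile

def install_yxi(yxi_path: str) -> None:
    tools_dir = os.path.join(os.environ["APPDATA"], "Alteryx", "Tools")
    os.makedirs(tools_dir, exist_ok=True)
    with zipfile.ZipFile(yxi_path) as yxi:
        yxi.extractall(tools_dir)  # tool folder + Config.xml land in Tools

if __name__ == "__main__":
    install_yxi(r"\\share\deploy\MyCustomTool.yxi")  # hypothetical UNC path
```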
Transferring records through the Python SDK's RecordRef seems to be slow when sending large amounts of data to the Alteryx Engine (e.g. the discussion here). Although I'm unclear on the exact specifics, it seems there's a copy-and-convert process in play.
Apache Arrow appears to address exactly this kind of issue, and the roadmap and specs are impressive! It seems (again, I have no insight into the Alteryx Engine specifics) that something like this would be excellent for expanding SDK use cases, as well as for other connectors such as the Apache Spark connector.
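For illustration only (this is not Alteryx Engine code), here's a minimal pyarrow sketch of the columnar hand-off Arrow is built around; the pyarrow calls are real, but their relevance to the engine's internals is my assumption:

```python
# Minimal illustration of Arrow's columnar interchange: data moves between
# representations as whole columns rather than record-by-record copies.
import pandas as pd
import pyarrow as pa

df = pd.DataFrame({"id": range(1_000_000), "value": [0.5] * 1_000_000})

table = pa.Table.from_pandas(df)  # pandas -> Arrow columnar format
round_trip = table.to_pandas()    # Arrow -> pandas for downstream use
```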
And it looks like it'd be fun to build into Alteryx! 🙂
There should be a Python Tool that is just a code paste (more like the R tool) and allows selection/packaging of venvs, similar to an IDE, or we should be able to package scripts with workflows/macros.
A Python tool that is easily integrated into macros for powerful and quick custom tools, while avoiding Jupyter's failures, would be incredibly beneficial. This would highlight how Python and Alteryx can work together, and don't need to be all-or-nothing competitors in the ETL space.
Jupyter is not a tool that should be used for production-level processes - it is for teaching. Nobody has Airflow or Luigi spinning up Jupyter and executing code in their ETL pipeline, so our workflows shouldn't either. Yes, yes, I have used the SDK to work around this, and I have also run scripts from the cmd tool, but the first solution is time-consuming and imposes a high skill wall, and the latter has a lot of moving, non-packaged parts (sketched below).
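For context, the cmd-tool workaround boils down to invoking a script with a specific venv's interpreter, roughly like this (all paths are hypothetical):

```python
# What the cmd-tool workaround amounts to: run a script under a specific
# venv's interpreter. Every path here is hypothetical.
import subprocess

VENV_PYTHON = r"C:\venvs\etl_tools\Scripts\python.exe"  # assumed venv location
SCRIPT = r"C:\workflows\scripts\transform.py"           # assumed script

subprocess.run([VENV_PYTHON, SCRIPT], check=True)
```

Every one of those pieces (the venv, the script, the calling workflow) has to be deployed and versioned separately, which is exactly the non-packaged-parts problem.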
You guys already have the API to do this, and venv management from the SDK, so I don't think it would be expensive to implement.
Per my initial community posting, it seems that in environments where the firewall blocks pip, the YXI installation process takes longer than it needs to. In my experience it took 9 minutes 15 seconds for a 'simple' custom tool (one dependency wheel included in the YXI).
My 'Idea' is to provide a configuration option to install the YXI files 'offline': that is, to skip the pip install --upgrade steps, and perhaps specify the --find-links and --no-index options with the pip install -r requirements.txt command. The --no-index option would assume that the developer has included the dependency wheel files in the YXI package. If possible, a second config option to set the path passed to --find-links would help companies that keep their dependencies in a central location.
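For reference, the offline-style install being proposed would look roughly like this (the wheels directory name and share path are just examples):

```
# Install only from wheels bundled in the YXI, never touching the network:
pip install --no-index --find-links=./wheels -r requirements.txt

# Or point --find-links at a central company wheel repository:
pip install --no-index --find-links=\\fileserver\python-wheels -r requirements.txt
```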
Alteryx should really get into the business of having a metadata management tool integrated into the UI.
We recently started training new users with Alteryx, making it more widespread than the few (fewer than 10) who already had it. The good and bad of Alteryx is that it really gives us the power to do work we've never done before. That also means we've never had to worry much about resource contention before either. But with great power comes great responsibility, and now we do need to find a way to manage this power.
There are some good new ideas out there that use the concept of crowd-sourced data governance (specifically, I just checked out Alation). But why should we then have to go to two separate platforms to examine our data?
I already know that having to spend more time switching back and forth between two platforms to get the same job done with greater care will result in many people taking shortcuts. Why couldn't this, or something like it, be integrated into Alteryx?
The Python SDK offers the possibility to automatically install Python packages through pip when a tool is installed, via the requirements.txt. Some changes to the tool's virtual environment might not be covered by this: for example, downloading and configuring language models for spaCy cannot be solved through requirements.txt alone (similarly for training corpora for NLTK).
So, as an idea for future versions of the SDK: allow us to specify a Python script that is run when a tool is installed. That way we could set up the environment, load additional resources, etc.
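A minimal sketch of what such a post-install script could do; the spaCy and NLTK download calls are real, while the script name and the install-time hook are the hypothetical part:

```python
# post_install.py -- hypothetical script the SDK would run once after creating
# the tool's virtual environment.
import subprocess
import sys

# Download and link a spaCy language model into this venv.
subprocess.check_call([sys.executable, "-m", "spacy", "download", "en_core_web_sm"])

# Fetch NLTK tokenizer data / training corpora.
import nltk
nltk.download("punkt")
```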
Right now, as far as I know, you need to add each DB connection manually. This works... but is quite time-consuming when trying to run tasks against a cluster of prod databases. It would be awesome to pass a JSON config file (example below) to the Alteryx Engine and have Alteryx create those connections upon parsing the file. This would save tons of time, and allow teams to share a central config file with consistent aliases across their clusters to ensure their app connections point to the same DBs across workflows. It would also make onboarding a breeze for new developers on the team.
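The original example isn't reproduced here, but a config along these lines illustrates the idea (all field names and values are hypothetical):

```json
{
  "connections": [
    {
      "alias": "prod_orders",
      "driver": "ODBC",
      "connection_string": "DSN=ProdOrders;DATABASE=orders",
      "credential": "svc_alteryx"
    },
    {
      "alias": "prod_customers",
      "driver": "ODBC",
      "connection_string": "DSN=ProdCustomers;DATABASE=customers",
      "credential": "svc_alteryx"
    }
  ]
}
```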
Allow the workflow to run again within a few minutes, auto-stop after 3 failed runs, and send an email to the owner.
If there was a network error, or a server- or file-unavailable type of error, the workflow should be able to rerun in a couple of minutes, and send a failure message to the owner if it fails 3 consecutive times. The scheduler UI should include an option to allow rerunning and failing after a user-selected number of tries (not more than 10).
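A rough sketch of the retry semantics being asked for; run_workflow and notify_owner are hypothetical placeholders for whatever the scheduler actually does:

```python
# Sketch of the requested scheduler behavior: retry transient failures a few
# minutes apart, and email the owner after N consecutive failures.
import time

MAX_TRIES = 3           # user-selected in the proposed UI, capped at 10
RETRY_DELAY_SECS = 120  # "a couple of minutes"

def run_with_retries(run_workflow, notify_owner):
    for attempt in range(1, MAX_TRIES + 1):
        try:
            run_workflow()
            return  # success: stop retrying
        except (ConnectionError, FileNotFoundError):  # transient error types
            if attempt == MAX_TRIES:
                notify_owner(f"Workflow failed {MAX_TRIES} consecutive times")
                raise
            time.sleep(RETRY_DELAY_SECS)
```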