When building custom tools for Alteryx using the Python SDK, there is no current way to test these outside of the Alteryx Designer.
This means that your development process is:
- write some code (with no code sense, IntelliSense, or auto-complete, because Jupyter, VS Code, Visual Studio, etc. cannot access AlteryxEngine or any of the other imports)
- copy that .py module into your C:\Users\<username>\AppData\roaming\Alteryx\Tools\<toolname>
- fire up Alteryx
- drop this new custom tool on a canvas
- run it to see if you get any errors
- then copy these errors out of Alteryx result window into Notepad to be able to read them
- then go back into your development environment to make changes
This is very painful, and it will scare most people away from learning how to create custom tools, since it is not only inefficient but also intimidating and frustrating for beginners.
Could we instead create mock Python libraries and a development harness (like Google does with Android development in Eclipse) in this SDK, where:
- you have full code intelligence (IntelliSense, autocomplete)
- you can simulate engine events in a test harness (for example, in the Android SDK you can simulate the user rotating their phone, turning off GPS, hitting a volume button, etc.)
- you can also write test cases which can run automatically
- only once you know that your tool works do you drop it into the Alteryx Designer environment.
NOTE: This IDE way of thinking also allows you to move the configuration pieces (such as the number of inputs) out of raw code and into configuration options.
Although you may be able to do remote debugging with platforms like PyCharm, that does not give you the full ability to check in the code of your tool along with all of its test cases, in a harness that lets you automatically check different events and make sure that your tool works before deploying.
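The mock-library and test-harness idea above could be sketched roughly as follows. This is only an illustration: `FakeEngine`, `UppercaseTool`, `run_tool` and the `on_init` / `on_record` / `on_close` event names are hypothetical stand-ins invented for this sketch, not the real Alteryx Python SDK interface.

```python
class FakeEngine:
    """Hypothetical stand-in for the engine object that Designer hands to a tool."""

    def __init__(self):
        self.messages = []

    def output_message(self, tool_id, status, message):
        # Record messages instead of sending them to the Designer results window.
        self.messages.append((tool_id, status, message))


class UppercaseTool:
    """Toy tool used only to exercise the harness (not a real SDK plugin)."""

    def __init__(self, tool_id, engine):
        self.tool_id = tool_id
        self.engine = engine
        self.field = None
        self.output = []

    def on_init(self, config):
        # Hypothetical "tool configured" event.
        self.field = config.get("field")
        if self.field is None:
            self.engine.output_message(self.tool_id, "error", "No field selected")
            return False
        return True

    def on_record(self, record):
        # Hypothetical "record arrived" event: upper-case the configured field.
        out = dict(record)
        out[self.field] = str(record[self.field]).upper()
        self.output.append(out)

    def on_close(self):
        # Hypothetical "all records pushed" event.
        self.engine.output_message(
            self.tool_id, "info", f"{len(self.output)} records written"
        )


def run_tool(tool_cls, config, records):
    """Simulate the engine lifecycle: init, one event per record, then close."""
    engine = FakeEngine()
    tool = tool_cls(tool_id=1, engine=engine)
    if tool.on_init(config):
        for record in records:
            tool.on_record(record)
        tool.on_close()
    return tool.output, engine.messages
```

With something like this, a test case can call `run_tool(UppercaseTool, {"field": "name"}, [{"name": "ada"}])` and assert on the returned records and messages without opening Designer.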
I'm only just starting to explore the Python and HTML SDKs, but I think this functionality would be really useful for Alteryx tools.
I foresee cases where a custom tool is developed and we want to install it for 20+ users. Rather than having each user manually open and install the file, and troubleshooting for each of them (which could also become challenging if we want to deploy an enhancement to a tool in the future), I'd like a method (preferably via command line) to automatically install a tool for a user without any interaction/input.
This would allow for targeted tool deployment as well as large-scale tool maintenance as custom Python tools mature in the enterprise space.
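As a rough sketch of what an unattended install could look like: the snippet below assumes that a .yxi file is an ordinary zip archive and that extracting it into the user's tools folder is sufficient. It does not create a virtual environment or install any requirements, which the real installer may do. `install_yxi` is a hypothetical helper name.

```python
import zipfile
from pathlib import Path


def install_yxi(yxi_path, tools_dir):
    """Extract a .yxi archive (assumed to be a zip file) into tools_dir.

    Returns the sorted names of the top-level folders found in the archive.
    """
    tools_dir = Path(tools_dir)
    tools_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(yxi_path) as archive:
        archive.extractall(tools_dir)
        # Collect the first path component of every entry that lives in a folder.
        top_level = sorted(
            {name.split("/")[0] for name in archive.namelist() if "/" in name}
        )
    return top_level
```

A deployment script could then loop over users or machines and call `install_yxi` with each target tools folder, with no interaction required.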
There should be a Python tool that is just a code paste (more like the R tool) and allows selection/packaging of venvs, similar to an IDE, or we should be able to package scripts with workflows/macros.
A Python tool that is easily integrated into macros for powerful and quick custom tools, while avoiding Jupyter's failures, would be incredibly beneficial. This would highlight how Python and Alteryx can work together and don't need to be all-or-nothing competitors in the ETL space.
Jupyter is not a tool that should be used for production-level processes - it is for teaching. Nobody has Airflow or Luigi spinning up Jupyter and executing code in their ETL pipeline, so our workflows shouldn't either. Yes, I have used the SDK as a workaround, and I have also run scripts from the cmd tool, but the first solution is time-consuming and imposes a high skill wall, and the latter has a lot of moving, non-packaged parts.
You guys have the API to do this and venv management from the SDK already so I don't think it would be expensive to implement.
Transferring records through the Python SDK's RecordRef seems to be slow when sending large amounts of data to the Alteryx Engine (e.g. discussion here). Although I am unclear on the exact specifics, it seems that there is a copy-and-convert process in play.
Apache Arrow appears to be addressing this issue, and its roadmap and specs are impressive! It seems (again, I have no understanding of the Alteryx Engine specifics) that something like this would be excellent for expanding SDK use cases, as well as for other connectors such as the Apache Spark connector.
And it looks like it'd be fun to build into Alteryx! 🙂
Alteryx should really get into the business of having a metadata management tool integrated into the UI.
We recently started training new users with Alteryx, making it more widespread than the few (fewer than 10) who already had it. The good and bad of Alteryx is that it really gives us the power to do work we've never done before. That also means that we haven't had to worry too much about resource contention before either. But with great power comes great responsibility, and now we do need to find a way to manage this power.
There are some good new ideas out there that use the concept of cloud-sourced data governance (specifically, I just checked out Alation). But why should we have to go to two separate platforms then to examine our data?
I already know that having to spend more time switching back and forth between two platforms to get the same job done with greater care will result in many people taking shortcuts. Why couldn't this or something like it be integrated into Alteryx?
Per my initial community posting, it seems that in environments where the firewall blocks pip the YXI installation process takes longer than it needs. My experience was 9:15 minutes for a 'simple' custom tool (one dependency wheel included in the YXI).
My 'Idea' is to provide a configuration option to install the YXI files 'offline'. That is, to skip the pip install --upgrade steps, and perhaps specify the --find-links and --no-index options with the pip install -r requirements.txt command. The --no-index option would assume that the developer has included the dependency wheel files in the YXI package. If possible, a second config option to add the path to the dependencies for the --find-links option would help companies that have a central location for storing their dependencies.
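To make the proposed 'offline' install concrete, here is a minimal sketch of how such a command could be assembled. `--no-index`, `--find-links` and `-r` are existing pip options; the helper name `offline_pip_command` and the idea of passing a list of wheel folders (the wheels bundled in the YXI, plus an optional central location) are assumptions for illustration, not how the YXI installer currently works.

```python
import sys


def offline_pip_command(requirements, wheel_dirs, python=sys.executable):
    """Build a pip command that installs only from local wheel folders."""
    if not wheel_dirs:
        raise ValueError("at least one local wheel folder is required with --no-index")
    # --no-index stops pip from contacting the package index at all.
    cmd = [python, "-m", "pip", "install", "--no-index"]
    for wheel_dir in wheel_dirs:
        # --find-links may be repeated, once per folder that holds wheels.
        cmd += ["--find-links", str(wheel_dir)]
    cmd += ["-r", str(requirements)]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`. Because the index is never contacted, a firewall that blocks pip should no longer cause long waits.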
The Python SDK offers the possibility to automatically install Python packages through pip when installing a tool, using the requirements.txt. Some changes to the virtual environment of the tool might not be covered by this. For example, downloading and configuring language models for spaCy cannot be solved through the requirements.txt alone (similarly for training corpora for NLTK).
So, as an idea for future versions of the SDK: Allow us to specify a Python script that is run when a tool is installed. This way we might be able to set up the environment, load additional tools etc.
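As an illustration of what such an install-time script might contain, here is a small sketch. The hook itself is the proposal and does not exist in the SDK as far as this post knows; `post_install_commands` and `run_post_install` are hypothetical names, and the model and corpus names are just examples. The two command lines (`python -m spacy download ...` and `python -m nltk.downloader ...`) are the usual download entry points of those libraries and need network access.

```python
import subprocess
import sys


def post_install_commands(python=sys.executable,
                          spacy_model="en_core_web_sm",
                          nltk_corpus="punkt"):
    """Return the extra setup commands to run inside the tool's environment."""
    return [
        [python, "-m", "spacy", "download", spacy_model],
        [python, "-m", "nltk.downloader", nltk_corpus],
    ]


def run_post_install(commands, runner=subprocess.run):
    """Run each command in order; check=True stops at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)
```

Here `python` would be the interpreter of the tool's own virtual environment, so that the downloads land where the tool can find them.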
I have no idea how many people are using the .Net API to build custom tools, but found an issue with its assembly scanning.
It doesn't pick up classes where IPlugin is implemented in an abstract base class. This can be worked around by moving the interface onto the concrete implementation, but I think it should pick up any concrete class implementing IPlugin, regardless of whether the interface is declared on the class itself or on a base class.
When developing an HTML GUI for an Alteryx tool, it has to be hand-coded.
There are 2 main challenges here:
a) It is not approachable for new folk. If we want the HTML SDK to be adopted more broadly, then it needs a graded learning curve where people without coding experience can use it and grow in confidence.
b) It's not efficient. The only way to know if you've done something right or wrong is to type it up in Notepad, and then try it in Alteryx and see what breaks.
Could we instead move to an IDE-type approach like Visual Studio (screenshot below)?
- the user can drag & drop tools from the toolbox (left)
- position them visually in the design surface (center)
- while still having the ability to set custom properties or behaviours (right)
- and jump straight into code if you're comfortable (bottom)
And when you're ready to test it, you hit "start", and any errors or issues are reported at the bottom of the screen.
Allow a failed workflow to run again within a few minutes, auto-stop after 3 failed runs, and send an email to the owner.
For network errors, server-unavailable errors, file-unavailable errors and similar, the workflow should be able to rerun after a couple of minutes, and send a failure message to the owner if it fails 3 consecutive times. The scheduler UI should contain an option to allow reruns and to fail after a user-selected number of tries (not more than 10).
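The requested behaviour can be sketched as a small retry loop. This is a generic illustration, not scheduler code: `run_with_retries`, its parameters and the choice of which exception types count as transient are all assumptions made for the example.

```python
import time

# Error types treated as transient in this sketch (network, server, missing file).
TRANSIENT_ERRORS = (ConnectionError, TimeoutError, FileNotFoundError)


def run_with_retries(run, notify, max_attempts=3, delay_seconds=120, sleep=time.sleep):
    """Call run(); retry on transient errors, notify the owner if all attempts fail.

    Returns True if a run succeeded, False if every attempt failed.
    """
    if not 1 <= max_attempts <= 10:
        raise ValueError("max_attempts must be between 1 and 10")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            run()
            return True
        except TRANSIENT_ERRORS as exc:
            last_error = exc
            if attempt < max_attempts:
                # Wait a couple of minutes before the next attempt.
                sleep(delay_seconds)
    notify(f"Workflow failed {max_attempts} consecutive times: {last_error}")
    return False
```

Here `run` stands for whatever starts the workflow and `notify` for whatever emails the owner. Any other exception type is deliberately not retried and propagates immediately.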