Alteryx Designer Ideas

Python Tool: Managing Virtual Environments (venv) and Requirements

With a growing number of projects that use different machine learning models, it's becoming difficult to manage package versions across workflows. Currently, the Python tool has a single virtual environment, so every project has to develop its models against the same Python and package versions as the Python tool venv. While this rarely affects the code itself, it becomes a problem as soon as we store and load pickled models, which are sensitive to even minor changes in package versions.
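To make the pickle problem concrete, here is a minimal sketch of one way to guard against it: save the package versions next to the model and refuse to load when they differ. The function names (`installed_versions`, `dump_with_versions`, `load_with_versions`) and the file layout are assumptions made for illustration. They are not part of the Python tool or of any Alteryx API.

```python
import pickle
from importlib import metadata


def installed_versions(packages, get_version=metadata.version):
    """Return {distribution name: version}, or None for missing packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = get_version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def dump_with_versions(model, path, packages, get_version=metadata.version):
    """Pickle the model together with the versions it was built against."""
    payload = {
        "versions": installed_versions(packages, get_version),
        # Nested bytes, so the versions can be checked before the model
        # itself is unpickled.
        "model": pickle.dumps(model),
    }
    with open(path, "wb") as fh:
        pickle.dump(payload, fh)


def load_with_versions(path, get_version=metadata.version):
    """Load the model only if the recorded versions match the current ones."""
    with open(path, "rb") as fh:
        payload = pickle.load(fh)
    saved = payload["versions"]
    current = installed_versions(saved, get_version)
    mismatches = {
        name: (saved[name], current[name])
        for name in saved
        if saved[name] != current[name]
    }
    if mismatches:
        raise RuntimeError(f"Package versions differ from save time: {mismatches}")
    return pickle.loads(payload["model"])
```

A check like this only detects the mismatch. With a single shared venv there is still no way to resolve it per project, which is the point of the idea.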

This is even more of a problem on Alteryx Server, where different teams might need different packages. Currently, only the server admin can install packages on the server, and there can be only one version of each package.

So more robust venv management in the Python tool would be much appreciated!

2 Comments
Comet

@BlytheE- for visibility 🙂

My thoughts on this are not well developed ... however, for the past few months I've been leaning towards 'images' as the best solution for this issue, whether Docker images or something similar. The flow would go something like this:

  1. Alteryx ships a default image with the Python tool that the user could run out of the box as a stand-alone Python container, hosting the Jupyter server and Python code with the default packages already installed.
  2. Users who need to install additional packages would create a new image layered on top of the default image.
  3. Users would have a control (a drop-down or whatever) to select the image they want to use for the Python tool.
  4. The image's Dockerfile (and/or Compose YAML file) and settings would be saved with the Python tool, so they can be copied to new workflows or sent to other users.
  5. Because images share layers, this should take less disk space than an equivalent virtual environment solution.
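Step 2 above could look roughly like the sketch below, which renders the text of a Dockerfile that layers pinned packages on top of a base image. Everything here is an assumption for illustration: no default Python tool image exists today, so the base image name in the usage example is made up, and `render_dockerfile` is a hypothetical helper that also assumes `pip` is on the PATH inside the base image.

```python
import shlex


def render_dockerfile(base_image, packages):
    """Render a Dockerfile that adds pinned packages on top of a base image."""
    lines = [f"FROM {base_image}"]
    if packages:
        # Quote each requirement so specifiers such as "pkg>=1.0" are not
        # treated as shell redirections by the RUN instruction.
        specs = " ".join(shlex.quote(spec) for spec in packages)
        lines.append(f"RUN pip install --no-cache-dir {specs}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical base image name; the versions are only examples.
    print(render_dockerfile(
        "example/python-tool-base:1.0",
        ["scikit-learn==1.3.0", "xgboost==2.0.3"],
    ))
```

Saving the rendered text with the workflow would cover step 4, since the image can be rebuilt from it on another machine.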

Downsides:

  • Packaging Docker with Alteryx might be a hard sell! However, it might be a nice optional 'add-on' for environments that already have Docker installed.
  • Image management would become a 'thing', and users would need to understand how to do it, or have a tool to help them decide which images to keep.

Anyway, those are just my thoughts ... 🙂

Asteroid

Good thoughts, @c2willis! Docker containers for workflows more generally would be a great way to make deployment more feasible, especially when different machines have different SDKs and tools installed (e.g. several machine learning packages for Python require Windows C++ SDK installations, which might not be available on all machines).

I fear that integrating Docker would be many months of hard work for the Engine team and might be a bit out of scope. For the narrower use case of the Python tool, virtual environments are already a well-established and easy-to-configure system. Furthermore, Alteryx already makes use of virtual environments for the Python SDK custom tools. The challenge might be exposing the venv settings within the Python tool.
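For reference, creating a per-project environment needs little more than the standard library. The sketch below uses Python's `venv` module and `pip install -r`. The helper names (`venv_python`, `create_project_env`) and the idea of one directory per project are assumptions. How the Python tool would then select such an environment is exactly the open question, and nothing here reflects an existing Alteryx feature.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir):
    """Return the interpreter path inside a venv (layout differs on Windows)."""
    env_dir = Path(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_project_env(env_dir, requirements=None):
    """Create a venv in env_dir and optionally install a requirements file."""
    # with_pip=True bootstraps pip into the new environment via ensurepip.
    venv.EnvBuilder(with_pip=True).create(env_dir)
    python = venv_python(env_dir)
    if requirements is not None:
        # Run pip with the new environment's own interpreter so packages
        # are installed there and not into the calling environment.
        subprocess.run(
            [str(python), "-m", "pip", "install", "-r", str(requirements)],
            check=True,
        )
    return python
```

Calling something like `create_project_env("envs/churn_model", "requirements.txt")` (both paths are made-up examples) once per project would give each model its own pinned package set.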