Alteryx Designer Desktop Ideas

Python Tool: Managing Virtual Environments (venv) and Requirements

With a growing number of projects that involve different machine learning models, it is becoming difficult to manage package versions across workflows. Currently, the Python tool has a single virtual environment, so every project has to develop its models with the same Python and package versions as the Python tool's venv. The code itself usually tolerates this, but it becomes a problem as soon as we store and load pickled models, which are sensitive to even minor changes in package versions.
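One way to at least make such mismatches visible is sketched below. It is an assumption for illustration, not something the Python tool provides: the versions of the relevant packages are written to a JSON sidecar file next to the pickle and compared before loading. The function names and the `.versions.json` suffix are hypothetical.

```python
import json
import pickle
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version (None if absent)."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def dump_with_versions(model, path, packages):
    """Pickle `model` and write the package versions to a JSON sidecar file."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)
    with open(str(path) + ".versions.json", "w", encoding="utf-8") as fh:
        json.dump(installed_versions(packages), fh, indent=2)


def version_mismatches(path):
    """Return {package: (saved, current)} for versions that changed since saving."""
    with open(str(path) + ".versions.json", encoding="utf-8") as fh:
        saved = json.load(fh)
    current = installed_versions(saved)
    return {
        name: (saved[name], current[name])
        for name in saved
        if saved[name] != current[name]
    }


def load_checked(path):
    """Unpickle the model only if the recorded package versions still match."""
    mismatches = version_mismatches(path)
    if mismatches:
        raise RuntimeError(f"Package versions changed since saving: {mismatches}")
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

The versions live outside the pickle on purpose: the check can then run before unpickling, which is the step that tends to break when packages change.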

This is an even bigger problem on Alteryx Server, where different teams might need different packages. Currently, only the server admin can install packages on the server, and there can be only one version of each package.

So more robust venv management in the Python tool would be much appreciated!

8 Comments
cam_w
11 - Bolide

@BlytheE - for visibility 🙂

My thoughts on this are not well developed. However, for the past few months I've been leaning towards 'images' as the best solution for this issue, whether Docker images or something similar. The flow would go something like this:

  1. Alteryx ships a default image with the Python tool that users could run out of the box as a stand-alone Python container, hosting the Jupyter server and Python code with the default packages already installed.
  2. Users who need to install additional packages would create a new image layered on top of the default image.
  3. Users would have a control (a drop-down or similar) to select the image they want to use for the Python tool.
  4. The image's Dockerfile (and/or Compose YAML file) and settings would be saved with the Python tool so they can be copied to new workflows or sent to other users.
  5. The layering of images should reduce the amount of space required compared to an equivalent virtual environment solution.

Downsides:

  • Packaging Docker with Alteryx might be a hard sell! However, it might be a nice optional 'add-on' for environments that already have Docker installed.
  • Image management would become a 'thing', and users would need to understand how to do it, or have a tool to help them decide what to keep.

Anyway, those are just my thoughts ... 🙂

chrisha
11 - Bolide

Good thoughts, @cam_w! Docker containers for workflows more generally would be a great way to make deployment more feasible, especially when different machines have different SDKs and tools installed (e.g. several machine learning packages for Python require Windows C++ SDK installations, which might not be available on all machines).

I fear that integrating Docker will be many months of hard work for the Engine team and might be a bit out of scope. For the narrower use case of the Python tool, virtual environments are already a well-established and easy-to-configure system. Furthermore, Alteryx already makes use of virtual environments for the Python SDK custom tools. The challenge might be integrating the details for the venv within the Python Tool.
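To illustrate how little is needed on the Python side, here is a minimal sketch that creates a per-project environment with the standard library's `venv` module. It runs outside Alteryx and says nothing about how the Python tool would select such an environment; the function name `create_project_env` is hypothetical.

```python
import os
import venv
from pathlib import Path


def create_project_env(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return its interpreter path."""
    env_dir = Path(env_dir)
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    # Windows places the interpreter under Scripts\, POSIX systems under bin/.
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"
```

The returned interpreter could then install a project's pinned packages, for example by running it with `-m pip install -r requirements.txt`.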

TimN
13 - Pulsar

We need this as well.

Thanks.

claugreco
5 - Atom

Managing Virtual Environments within the Python tool will be very beneficial for us too.

YEM
8 - Asteroid

If you upgrade Alteryx Server, all of the packages you installed are removed.

For example, let's say that in the past you did something like this:

from ayx import Alteryx
Alteryx.installPackage(package="tableau-api-lib")

You can locate the Python executable that comes with Alteryx Server like this:

import sys
print(sys.executable)

c:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\python.exe

This tells us that the tableau-api-lib package installed above ends up here:

c:\Program Files\Alteryx\bin\Miniconda3\envs\JupyterTool_vEnv\Lib\site-packages

When you upgrade Alteryx Server, it apparently rebuilds this folder from scratch, so all custom modules installed with pip are lost. Bummer!

Being able to invoke a virtual environment that is maintained by us would avoid this upgrade fire.
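A possible stopgap, sketched under assumptions rather than as an Alteryx feature: keep extra packages in a folder outside the Alteryx install directory (filled, for example, with `pip install --target <folder> <package>`) and make that folder importable at the top of the Python tool code. The helper name `use_package_dir` and any folder you pass to it are hypothetical.

```python
import site
import sys
from pathlib import Path


def use_package_dir(package_dir):
    """Make packages stored in package_dir importable in the current session."""
    package_dir = str(Path(package_dir).resolve())
    if package_dir not in sys.path:
        # addsitedir appends the directory to sys.path and processes its .pth files.
        site.addsitedir(package_dir)
    return package_dir
```

Because the folder is appended to `sys.path`, packages already present in the Python tool's own environment still take precedence, so this does not give per-project versions of those packages. The packages in the folder must also be built for the same Python version as the tool.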

Thableaus
17 - Castor

@YEM I feel your pain, but it's strongly advisable to back up your Python and R packages before an upgrade, largely because of the point you mentioned!
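For the Python side, such a backup could be as simple as the sketch below, which writes the installed packages and their versions to a requirements-style file. It is a rough illustration, not an official Alteryx procedure, and `snapshot_requirements` is a hypothetical helper name.

```python
from importlib import metadata


def snapshot_requirements(path):
    """Write one 'name==version' line per installed distribution to path."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs that lack a name in their metadata
            pins.add(f"{name}=={dist.version}")
    lines = sorted(pins, key=str.lower)
    with open(path, "w", encoding="utf-8") as fh:
        fh.writelines(line + "\n" for line in lines)
    return lines
```

Run it from inside the Python tool so that it captures that environment, and store the file outside the Alteryx install folder. After the upgrade, the file can be passed to `pip install -r` using the interpreter reported by `sys.executable`.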

AlteryxCommunityTeam
Alteryx Community Team
Status changed to: Accepting Votes
clmc9601
13 - Pulsar

Yes, this would be very helpful!