
JeffA
Alteryx

Welcome to the Jupyter Flow Basics guide! If you're well versed in Jupyter and Python virtual environments, you may prefer the Jupyter Flow Help docs, which cut straight to the chase, or you may want to skip to section V of this guide. Visit Introducing the Jupyter Flow tool for an introduction to the tool and its capabilities.

 

Here you will learn how to install Python, manage virtual environments, create and integrate Jupyter notebooks into Alteryx workflows, export and share those workflows, and run them on Server. Sections I through IV are intended for users who have no experience with Python, virtual environments, Jupyter, or creating Jupyter notebooks.

 

Before starting this guide, make sure you have Alteryx Designer version 2020.4 or later, and have installed Jupyter Flow onto your system.

 

Note: This guide uses Windows command-line commands. The commands will differ on other operating systems.

 

 

 

I. Install Python 3.8.5

You can skip this step if you already have Python 3.8.5 installed or have a preferred method for creating Python installations. Note that this tool has only been tested with Python 3.8.5 environments.

 

  1. Download the correct Python installer for your system

  2. Follow the instructions to install Python 3.8.5

  3. Open a new command prompt, type “python”, and press “Enter”. If you get a Python interpreter like this, you can skip steps 4 through 6:

    index.png

  4. Find out where Python was installed

    1. If installed as a user, you may find it at "C:\Users\your_username\AppData\Local\Programs\Python\Python38\"

  5. Add this path to your user PATH environment variable

  6. If you would prefer not to add this to your PATH, simply use the full path to the Python executable for the rest of this guide.

    1. Wherever this guide shows the following code, replace python with the full path to your installed Python executable (e.g., "C:\Users\your_username\AppData\Local\Programs\Python\Python38\python.exe"):

      python <<some_commands>>
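
As a quick check, the following sketch reports which interpreter is running and whether it matches the version this guide was tested with. The function name is_tested_version is illustrative only and is not part of Jupyter Flow.

```python
import sys

# The version this guide says the tool has been tested with.
TESTED_VERSION = (3, 8, 5)


def is_tested_version(version_info=sys.version_info):
    """Return True if major.minor.micro equals the tested version."""
    return tuple(version_info[:3]) == TESTED_VERSION


# Show which executable is running and whether it matches.
print("Interpreter:", sys.executable)
print("Version:", ".".join(str(part) for part in sys.version_info[:3]))
print("Matches tested version:", is_tested_version())
```

Save this as a .py file and run it with the same python command (or full path) you plan to use for the rest of the guide.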

 

 

 

II. Create and Configure a Virtual Environment

You can skip this section if you already know how to create a virtual environment and add packages to it using pip.

 

  1. Open a new command prompt
  2. Create a new folder in which to store your virtual environments:
    mkdir my_envs
  3. Navigate into your new folder:
    cd my_envs
  4. Create a new virtual environment:
    python -m venv first_environment
  5. See your new environment by listing folder contents:
    dir
  6. Install pandas package into the new environment:
    first_environment\Scripts\pip.exe install pandas
    JeffA_0-1624917968323.png

     

Note: If you would like to pass data between Alteryx workflows and your notebook, you will need to use pandas at this time. Notebooks can run without pandas, but pandas is currently the only way to pass data into or out of a notebook.
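
If you prefer to script the environment setup, the same result as "python -m venv" can be sketched with Python's standard venv module. The helper names create_environment and pip_path are made up for this example, and the Scripts\pip.exe location assumes a Windows environment as used throughout this guide.

```python
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment, similar to `python -m venv <env_dir>`."""
    env_path = Path(env_dir)
    venv.create(env_path, with_pip=with_pip)
    return env_path


def pip_path(env_dir):
    """Return where pip.exe is expected inside a Windows virtual environment."""
    return Path(env_dir) / "Scripts" / "pip.exe"


# Example usage (commented out so nothing is created by accident):
# env = create_environment(Path("my_envs") / "first_environment")
# print("Install pandas with:", pip_path(env), "install pandas")
```

Packages are still installed with the environment's own pip, exactly as in step 6 above.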

 

 

 

III. Install and Run Jupyter Notebook

You can skip this step if you already know how to use Jupyter. Simply note two things: DLL kernel errors may be fixable by installing pywin32==300 in the environment you use to run Jupyter, and you should install jupyter_client 6.1.12 if installing jupyter in your environment.

 

 

  1. In your environment, install jupyter:
    first_environment\Scripts\pip.exe install jupyter jupyter_client==6.1.12
  2. Jupyter on Windows can currently fail with DLL kernel errors, which may be fixable by pinning pywin32 to version 300:
    first_environment\Scripts\pip.exe install pywin32==300
  3. Open Jupyter
    first_environment\Scripts\jupyter.exe notebook
  4. If the previous command succeeded, you should see the Jupyter notebook page in your default browser:

    JeffA_0-1624913434121.png
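
To confirm that the pinned versions from this section are actually installed, a small check along these lines may help. It is a sketch using the standard importlib.metadata module; the names EXPECTED_PINS, check_pin, and check_all are made up for this example. Run it with the environment's own interpreter (first_environment\Scripts\python.exe) so that it inspects the right site-packages.

```python
from importlib import metadata

# Pins mentioned in this guide; adjust if your setup differs.
EXPECTED_PINS = {"jupyter_client": "6.1.12", "pywin32": "300"}


def check_pin(package, expected):
    """Return 'ok', 'missing', or 'mismatch (found X)' for one package."""
    try:
        found = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "missing"
    return "ok" if found == expected else f"mismatch (found {found})"


def check_all(pins=EXPECTED_PINS):
    """Check every pinned package and return a name -> status mapping."""
    return {package: check_pin(package, expected) for package, expected in pins.items()}


for name, status in check_all().items():
    print(f"{name}: {status}")
```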

 

 

 

IV. Create a Simple Jupyter Notebook

You can skip this step if you already know how to create and run Jupyter notebooks.

 

 

  1. Click "New" -> Python3

    JeffA_1-1624913618673.png

  2. You should now see an empty notebook

    JeffA_2-1624913688687.png

  3. Add the following code to the notebook:
    import pandas as pd
    data = pd.DataFrame({"text": ["Jupyter", "by", "itself"], "number": [1,2,3]})
    data
  4. Run the notebook. You should see something like this:

    JeffA_0-1624926541451.png

  5. In the last cell of the notebook, type the following command and run the cell:
    !dir
  6. You should now see the following output. Copy the directory path and save it for later:

    JeffA_5-1624940038786.png

     

  7. Select "Untitled" at the top of the notebook to change its name:

    JeffA_2-1624933979139.png

  8. Give your notebook a name and click "Rename":
    JeffA_3-1624934022907.png
  9. Save the notebook using ctrl+s
  10. If you would like to know more about Jupyter, see the jupyter website
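
As an alternative to "!dir" in step 5, the notebook's folder can also be printed from Python itself. This sketch assumes the kernel's working directory is the folder containing the notebook, which is the usual default when a notebook is opened from the Jupyter file browser; notebook_directory is a made-up helper name.

```python
from pathlib import Path


def notebook_directory():
    """Return the folder the notebook kernel is currently running in."""
    return Path.cwd()


# Prints an absolute path that you can copy for use in section V.
print(notebook_directory())
```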

 

 

 

V. Run a Jupyter Notebook from an Alteryx Workflow

Summary: After you specify a notebook and a site-packages folder, the tool builds an environment for the notebook to run in and then runs the notebook. The environment builds only the first time a new set of packages is specified, but the build can take some time. If the packages or their versions change, the environment builds again.

 

 

  1. Open Alteryx Designer and open a new workflow
  2. Find the Jupyter Flow tool in the search bar or Laboratory and drag the tool to the canvas:

    JeffA_1-1624932095715.png

  3. At a minimum, you must provide the tool with the notebook and the environment packages
  4. Click the Browse button next to the "Notebook" field and paste the notebook's directory name (saved previously) into the navigation bar and press "Enter":

    JeffA_8-1624939074222.png

  5. Select the notebook previously created and click "Open":

    JeffA_1-1624934350072.png

  6. Back in the configuration pane, click the "Browse" button next to the "Packages" field. Navigate to your environment's Lib/site-packages folder and click "OK":

    JeffA_7-1624938985863.png

  7. The configuration pane should now look something like this:

    JeffA_3-1624934614362.png

  8. Run the workflow
  9. The first time Jupyter Flow runs with a new environment, you will see a message indicating that the environment is building. The amount of time this takes varies greatly depending on the size of your site-packages folder. Once your environment has been built, this step will no longer occur unless the environment changes.

    JeffA_4-1624936684506.png

  10. My first run took 2 minutes and 24 seconds:

    JeffA_5-1624936740326.png

  11. Run the workflow again. Now that the environment has been built, the environment build step is gone. My second run took 3.9 seconds:

    JeffA_6-1624936812880.png

  12. Note that no data is passing through the notebook yet. This setup is only useful if you want to schedule your notebooks but do not care about passing Alteryx data into or out of them. Read on to find out how to connect Alteryx data to your notebook!
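
The rebuild rule from the summary (build once per distinct set of packages and versions) can be pictured as a fingerprint over the installed packages. The sketch below only illustrates that idea. It is not how Jupyter Flow is documented to detect changes, and packages_fingerprint is a hypothetical helper. It relies on the fact that pip records each installed distribution in a "<name>-<version>.dist-info" folder inside site-packages.

```python
import hashlib
from pathlib import Path


def packages_fingerprint(site_packages):
    """Hash the *.dist-info folder names, which encode package names and versions."""
    names = sorted(entry.name for entry in Path(site_packages).glob("*.dist-info"))
    digest = hashlib.sha256("\n".join(names).encode("utf-8"))
    # A short prefix is enough to tell two package sets apart in this sketch.
    return digest.hexdigest()[:15]
```

Calling the function on the same site-packages folder before and after a pip install should give two different values, which is the situation in which you would expect the tool to build the environment again.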

 

 

 

VI. Pass Data Through your Notebook

Summary: Adding #ayx_input=<<name of alteryx input connection>> above a variable assignment will replace that variable with a dataframe representation of the data flowing through the input connection specified. Adding #ayx_output=<<output anchor number>> above a variable or a variable assignment will output the variable (assuming it's a pandas dataframe) to the specified output anchor in the workflow.

 

 

  1. Open the Jupyter notebook
  2. Add a cell to the bottom of the notebook with the following code and save the notebook:
    #ayx_output
    data
  3. The notebook should look like this:

    JeffA_4-1624939953619.png

  4. Run the workflow
  5. Select the first output anchor of the tool. It should show the following outputs:

    JeffA_3-1624939915433.png

  6. In the Jupyter notebook change "#ayx_output" to "#ayx_output=3" and save:

    JeffA_0-1624940500594.png

  7. Run the workflow
  8. Select the third output anchor of the tool. It should show the following outputs and the first anchor should no longer have any outputs:

    JeffA_7-1624937767563.png

  9. Drag a TextInput tool to the canvas and configure as follows:

    JeffA_1-1624937283443.png

  10. Connect the TextInput tool to the input anchor:
    JeffA_2-1624937330857.png

  11. In your notebook, add "#ayx_input" above the assignment of "data" and save the notebook:

    JeffA_6-1624938720013.png

  12. Run the workflow
  13. The third output anchor should now contain the data from the Text Input tool:

    JeffA_9-1624938088277.png

  14. Add a new TextInput tool and configure as shown:

    JeffA_0-1624939760696.png

  15. Connect the TextInput tool to the input anchor:

    JeffA_0-1624938386115.png

  16. The input tag must now specify which connection to use.
  17. Open the Jupyter notebook and change "#ayx_input" to "#ayx_input=#2":

    JeffA_2-1624938515310.png

  18. Run the workflow again
  19. The third output anchor should show data from the second Text Input tool:

    JeffA_1-1624939800472.png

  20. Jupyter Flow can make use of multiple inputs at the same time. Simply use the "#ayx_input=" tag plus the name of the input connection.
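
Before running the workflow, it can be handy to list which tags a notebook contains. The following sketch reads the .ipynb file (which is JSON) and reports every "#ayx_input" and "#ayx_output" line in its code cells. It is not the tool's own parser; find_tags and TAG_PATTERN are made-up names, and the pattern covers only the tag forms shown in this guide.

```python
import json
import re

# Matches "#ayx_input", "#ayx_output", and the "=<value>" forms shown in this guide.
TAG_PATTERN = re.compile(r"^#ayx_(input|output)(?:=(.+))?$")


def find_tags(notebook_path):
    """Return (cell_index, direction, value) for each tag line in the code cells."""
    with open(notebook_path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    found = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as one string or as a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        for line in text.splitlines():
            match = TAG_PATTERN.match(line.strip())
            if match:
                # value is None for a bare tag such as "#ayx_output".
                found.append((index, match.group(1), match.group(2)))
    return found
```

For the notebook at the end of this section, the result would contain one "input" entry with the value "#2" and one "output" entry with the value "3".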

 

 

 

VII. Share Your Jupyter Workflow

Summary: Using Alteryx Designer's built-in export workflow feature will export all of the assets required for another user (or a Server) to run a workflow containing this tool. The environment packages will be exported inside the .yxzp file.

 

 

  1. In the configuration pane, deactivate the "Packages" toggle:

    JeffA_1-1624942361034.png

  2. Save your workflow
  3. Navigate to Options -> Export Workflow:

    JeffA_2-1624942428911.png

  4. Ensure your notebook (.ipynb file) and the zip app (.pyz file) are selected in the "Export Workflow" dialogue:

    JeffA_4-1624942519223.png

  5. Click "Save"
  6. You may now share the exported ".yxzp" file with anyone else who has the Jupyter Flow tool installed on their machine.
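
To double-check an export before sharing it, you can list the notebook and zip app inside the package. This sketch assumes the .yxzp file is a standard zip archive that Python's zipfile module can read; exported_assets is a made-up helper name.

```python
import zipfile


def exported_assets(package_path, extensions=(".ipynb", ".pyz")):
    """List archive members whose names end with one of the given extensions."""
    with zipfile.ZipFile(package_path) as archive:
        return [name for name in archive.namelist() if name.lower().endswith(extensions)]


# Example usage (replace the path with your own exported file):
# print(exported_assets(r"C:\path\to\your\workflow.yxzp"))
```

If the returned list lacks either the .ipynb or the .pyz file, revisit step 4 and make sure both assets are selected.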

 

 

 

VIII. Run Your Jupyter Workflow from Server

  1. In the configuration pane, deactivate the "Packages" toggle as shown in the "Share your Jupyter Workflow" section.
  2. Ensure the tool has been installed on your desired gallery instance
  3. Using your preferred method, save your Jupyter workflow to your gallery
  4. Run the workflow on Server just like any other Alteryx workflow

 

Banner image by Beate Bachman

Comments
atcodedog05
22 - Nova

@JeffA This is a super helpful article 🙂👍

JeffA
Alteryx

Great to hear @atcodedog05. Let me know how well the tool works for you!

jrgo
13 - Pulsar

@JeffA 

 

I'm trying to learn this tool, but I get stuck on step V.9 (first run), which finishes with an error directing me to review the log file (below).

 

C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\papermill\iorw.py:50: FutureWarning: pyarrow.HadoopFileSystem is deprecated as of 2.0.0, please use pyarrow.fs.HadoopFileSystem instead.
  from pyarrow import HadoopFileSystem
Input Notebook:  C:\Users\<redact>\my_env\first_notebook_post_processed.ipynb
Output Notebook: C:\Users\<redact>\my_env\first_notebook_post_processed.ipynb
Executing:   0%|          | 0/3 [00:00<?, ?cell/s]Kernel 'Python 3 (ipykernel)' is referencing a kernel provisioner ('local-provisioner') that is not available.  Ensure the appropriate package has been installed and retry.
Executing:   0%|          | 0/3 [00:00<?, ?cell/s]
Traceback (most recent call last):
  File "C:\ProgramData\Alteryx\Tools\JupyterFlow_venv\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\ProgramData\Alteryx\Tools\JupyterFlow_venv\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\<redact>\.jupyter_flow\b01b3cec3977e9e-cv-0-1.pyz\__main__.py", line 3, in <module>
  File "C:\Users\<redact>\.jupyter_flow\b01b3cec3977e9e-cv-0-1.pyz\_bootstrap\__init__.py", line 233, in bootstrap
  File "C:\Users\<redact>\.jupyter_flow\b01b3cec3977e9e-cv-0-1.pyz\_bootstrap\__init__.py", line 36, in run
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\click\core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\click\core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\click\core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\papermill\cli.py", line 250, in papermill
    execute_notebook(
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\papermill\execute.py", line 107, in execute_notebook
    nb = papermill_engines.execute_notebook_with_engine(
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\papermill\engines.py", line 49, in execute_notebook_with_engine
    return self.get_engine(engine_name).execute_notebook(nb, kernel_name, **kwargs)
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\papermill\engines.py", line 343, in execute_notebook
    cls.execute_managed_notebook(nb_man, kernel_name, log_output=log_output, **kwargs)
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\papermill\engines.py", line 402, in execute_managed_notebook
    return PapermillNotebookClient(nb_man, **final_kwargs).execute()
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\papermill\clientwrap.py", line 43, in execute
    with self.setup_kernel(**kwargs):
  File "C:\ProgramData\Alteryx\Tools\JupyterFlow_venv\lib\contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\nbclient\client.py", line 456, in setup_kernel
    self.start_new_kernel(**kwargs)
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\nbclient\util.py", line 78, in wrapped
    return just_run(coro(*args, **kwargs))
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\nbclient\util.py", line 57, in just_run
    return loop.run_until_complete(coro)
  File "C:\ProgramData\Alteryx\Tools\JupyterFlow_venv\lib\asyncio\base_events.py", line 616, in run_until_complete
    return future.result()
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\nbclient\client.py", line 412, in async_start_new_kernel
    await ensure_async(self.km.start_kernel(extra_arguments=self.extra_arguments, **kwargs))
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\nbclient\util.py", line 89, in ensure_async
    result = await obj
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\jupyter_client\manager.py", line 331, in _async_start_kernel
    kernel_cmd, kw = await ensure_async(self.pre_start_kernel(**kw))
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\jupyter_client\utils.py", line 33, in ensure_async
    return await obj
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\jupyter_client\manager.py", line 295, in _async_pre_start_kernel
    self.kernel_spec,
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\jupyter_client\manager.py", line 131, in kernel_spec
    self._kernel_spec = self.kernel_spec_manager.get_kernel_spec(self.kernel_name)
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\jupyter_client\kernelspec.py", line 294, in get_kernel_spec
    return self._get_kernel_spec_by_name(kernel_name, resource_dir)
  File "C:\Users\<redact>\.shiv\b01b3cec3977e9e-cv-0-1_830b0357a27e50e91c83b7a77e0750042db9a80cd6fde18310e1a11b29ef90c1\site-packages\jupyter_client\kernelspec.py", line 258, in _get_kernel_spec_by_name
    raise NoSuchKernel(kernel_name)
jupyter_client.kernelspec.NoSuchKernel: No such kernel named python3

 

 

I ran this in Designer 21.2 with AMP disabled. I don't think it matters, but in case it helps, Designer is running in a Parallels VM on a Mac. I have no issues with the regular Python tool either (other than the AMP bug).

jrgo_0-1630103475486.png

 

Thanks,

Jimmy

JeffA
Alteryx

Thanks for trying it out @jrgo ! Sorry for the delay in responding. I'm going to work on reproducing this issue for you.

 

In the meantime, here are a few clarifying questions that would help a lot:

1. If you run the notebook using your virtual environment's installed Jupyter instance, does it work?

2. Could you run pip freeze for your virtual environment and upload the output? If you used the steps above to build it, you would run

C:\path\to\your\environment\root\folder\Scripts\pip.exe freeze

3. Could you rename your notebook's extension from .ipynb to .txt and upload that text here? Be sure to remove all sensitive code first. I'm just interested in the metadata of the notebook.

jrgo
13 - Pulsar

@JeffA 

 

Thank you for looking into this. Here are my responses to the questions you asked:

 

1. If you run the notebook using your virtual environment's installed Jupyter instance, does it work? Yes

jrgo_0-1630520240675.png

2. Could you run pip freeze for your virtual environment and upload the output? Unable to upload files so shared in this code block

argon2-cffi==20.1.0
attrs==21.2.0
backcall==0.2.0
bleach==4.1.0
cffi==1.14.6
colorama==0.4.4
debugpy==1.4.1
decorator==5.0.9
defusedxml==0.7.1
entrypoints==0.3
ipykernel==6.2.0
ipython==7.26.0
ipython-genutils==0.2.0
ipywidgets==7.6.3
jedi==0.18.0
Jinja2==3.0.1
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client==7.0.1
jupyter-console==6.4.0
jupyter-core==4.7.1
jupyterlab-pygments==0.1.2
jupyterlab-widgets==1.0.0
MarkupSafe==2.0.1
matplotlib-inline==0.1.2
mistune==0.8.4
nbclient==0.5.4
nbconvert==6.1.0
nbformat==5.1.3
nest-asyncio==1.5.1
notebook==6.4.3
numpy==1.21.2
packaging==21.0
pandas==1.3.2
pandocfilters==1.4.3
parso==0.8.2
pickleshare==0.7.5
prometheus-client==0.11.0
prompt-toolkit==3.0.20
pycparser==2.20
Pygments==2.10.0
pyparsing==2.4.7
pyrsistent==0.18.0
python-dateutil==2.8.2
pytz==2021.1
pywin32==300
pywinpty==1.1.3
pyzmq==22.2.1
qtconsole==5.1.1
QtPy==1.10.0
Send2Trash==1.8.0
six==1.16.0
terminado==0.11.1
testpath==0.5.0
tornado==6.1
traitlets==5.0.5
wcwidth==0.2.5
webencodings==0.5.1
widgetsnbextension==3.5.1

3. Could you rename your notebook's extension from .ipynb to .txt and upload that text here? Again, unable to upload files so shared in code block, but I've only tested this with what's included on this KB post so nothing sensitive.

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "c4b89aa7",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>text</th>\n",
       "      <th>number</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Jupyter</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>by</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>itself</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      text  number\n",
       "0  Jupyter       1\n",
       "1       by       2\n",
       "2   itself       3"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "data = pd.DataFrame({\"text\": [\"Jupyter\", \"by\", \"itself\"], \"number\": [1,2,3]})\n",
    "data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "3ad334ce",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>text</th>\n",
       "      <th>number</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Jupyter</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>by</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>itself</td>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "      text  number\n",
       "0  Jupyter       1\n",
       "1       by       2\n",
       "2   itself       3"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#ayx_output\n",
    "data"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}

 

JeffA
Alteryx

@jrgo try doing this in your environment and re-running the tool:

C:\path\to\your\environment\root\folder\Scripts\pip.exe install jupyter-client==6.1.12

That worked for me. If it works for you, great! I'll still think about the best long-term solution for the tool though.

jrgo
13 - Pulsar

Thanks @JeffA, that fixed it for me too. Do you think the issue had to do with my machine, or is it just a compatibility issue with the version of Jupyter that was installed and/or the version of Designer I used? I'm asking because, if it's the latter, and until a long-term solution is found, do you plan to update this guide to keep others from hitting the same error?

 

Thanks again for your help!

 

-Jimmy

JeffA
Alteryx

Glad it worked @jrgo. I'll write this up as a problem/solution on community.

 

To answer your question, I think it is a compatibility issue with the version of jupyter_client (jupyter only has two versions, 0.0.0 and 1.0.0). I reproduced the issue very easily and would expect anyone running this guide today to run into the issue as well. So I've submitted an update to the guide. However, I'm not happy that there even is a compatibility issue, so I'm going to investigate further and find out if this is something I can fix permanently on the tool code side.

 

Cheers!

alberto_hernie
9 - Comet

Hi @JeffA 

 

I've been testing the tool. Initially I had the issue related to the jupyter client, but installing that specific version resolved it, and I've been able to successfully use the tool in Designer and also on the Gallery.

 

I have a question, probably related to my limited Python knowledge. What's the expected way of debugging the script inside the tool? I'm thinking of the case where it has to be debugged by a different person than the one who created the script.

 

If I get the workflow from someone else, I get (as assets) the notebook and also the zip app with the environment required to run the Python script, but this venv is not installed on my machine, so even if I open the notebook I cannot run it since I don't have that environment. How can I unpack the zip app or link the notebook to the zip app, so I can see and run the code in the notebook?

 

 

Thanks in advance for your help.

 

Regards,

Alberto HM

JeffA
Alteryx

@alberto_hernie great question.

 

Currently the only way to debug the notebook without the original virtual environment is to run the notebook from the command line using the .pyz file.

 

From the command line, you do the following:

 

the_jupyter_python_interpreter.exe your_pyz_file.pyz your_post_processed_notebook.ipynb some_output_notebook_name.ipynb

 

 

Your _post_processed notebook should be in the same folder as your original notebook. You definitely want to run that instead of the original notebook, as it contains references to the data being piped in/out of the notebook.

 

For example:

 

C:\Users\<your_un>\AppData\Roaming\Alteryx\Tools\JupyterFlow_venv\python.exe 99b9e8838bc10d0-cv-0-1.pyz notebook_post_processed.ipynb notebook_post_processed_output.ipynb

 

 

Also, the user will usually need to turn on the "Back up data cache" advanced option in the tool and run it again so that the same data that flowed through the tool on your last tool run will still be available for the notebook to use:

JeffA_1-1631817112383.png

 

Apologies that this wasn't documented anywhere. For now I'll add this to the documentation and think about a simpler way for other users to debug for the future!

 

Thanks!

 

- Jeff

 

P.S. During a workflow run, the .pyz file gets extracted into its site-packages under C:\Users\<your_un>\.shiv (or the admin root\.shiv). It's possible to set a remote-pdb breakpoint in there if dependencies are causing problems (if you're comfortable with Python of course 😉 )

jeneir
8 - Asteroid

Hi!

 

I got a problem at V.8, when running the workflow.

Error: Jupyter Flow (3): Traceback (most recent call last): File "main.py", line 3, in <module>
ModuleNotFoundError: No module named 'moonbuggy_toolkit'

 

Tool settings:
Notebook: C:\Users\<username>\my_envs\first_environment\Scripts\first_notebook.ipynb
Packages: C:\Users\<username>\my_envs\first_environment\Lib\site-packages

 

Running the script in Jupyter (browser) works fine.

 

Virtual Environment is Python 3.8.5
Alteryx Designer 2021.3.2

 

JeffA
Alteryx

@jeneir sorry it's not working for you! I'll take a look on my end when I get a chance!

JeffA
Alteryx

@jeneir Looks like the tool itself is the problem, not your jupyter script. The tool can't even get the script started because it's not finding a dependency.

 

I'm going to take a wild guess that if you just plop a jupyter flow tool down on an empty canvas, it throws the same error. But I haven't been able to reproduce this on my end. There may be something wrong with the tool's installation. Try reinstalling, and if you can get me some more debugging information, that'd be great!

jeneir
8 - Asteroid

@JeffA thanks for checking it out. I'm not an admin on my machine; might that affect the tool?

I can try and get more debugging info.

 

yurunsang_rbsi
5 - Atom

Hi Jeff,

 

Thanks for your great tutorial and the tool. I am testing it on my side and following your steps exactly. Below is what I have in the notebook, but when I run it in Alteryx, the outputs are not returning anything. It does say it successfully saved the post-processed notebook, but when I open that notebook, it is empty. Do you know what the issue is?

yurunsang_rbsi_0-1658929881216.png

 

Thanks for any help in advance!

 

JeffA
Alteryx

@yurunsang_rbsi Could you send me the .yxzp of your workflow?