Import Python Library
Does anyone know how to import Python libraries (like pandas) in Alteryx?
Thanks.
In fact, I'm trying to run a Python script (test.py) whose content is:

# import libraries
import csv
import pandas as pd
import numpy as np

# read the Excel file and store it in the file variable
file = "input.xlsx"
xl = pd.ExcelFile(file)

# define the DataFrame df1: contains column metadata
df1 = xl.parse('sheet_1')

# define the DataFrame df2: contains line metadata
df2 = xl.parse('sheet_1')

# store the Excel data as CSV for column & line metadata
df_columns = df1.to_csv("output1.csv", index=False)
df_lines = df2.to_csv("output2.csv", index=False)
But I get errors. I think Alteryx is not able to find the libraries, or I'm not doing it right.
Thanks for your help.
Shaikle
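A common cause of errors like this is that pandas and numpy are not installed in the Python environment Alteryx actually invokes, which is often a bundled interpreter rather than the system one. Below is a minimal sketch of one way to guard against that from within the script itself; the package names come from the script above, and everything else is an assumption rather than a confirmed Alteryx API:

import subprocess
import sys

# Install any missing packages into the same interpreter that is
# running this script. sys.executable points at that interpreter,
# so the pip installs land where the imports will be resolved.
for pkg in ("pandas", "numpy"):
    try:
        __import__(pkg)
    except ImportError:
        subprocess.check_call([sys.executable, "-m", "pip", "install", pkg])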
The given solution works only when the library needs to be imported in the "Python SDK" tool from the "SDK Example" palette. How do you install packages when running Python code from the "Apache Spark Code" tool in the Developer palette?
To use packages in the Apache Spark Code tool, they need to be installed on the Apache Spark cluster your workflow connects to. How to install them depends on two things: whether you want them installed permanently or only for the job/workflow you are running, and the type of Apache Spark cluster you are using (on-premises, Databricks, or Microsoft Azure HDInsight).
If you want to install the packages on the cluster permanently, the instructions depend heavily on the type of Apache Spark cluster you are using:
- On-premises (i.e., a Livy cluster): The packages need to be installed using pip (or whatever Python package manager your servers use), preferably on each server in the cluster. You can install on just one or a few servers rather than all of them, but each time a job that uses the package runs, the package will be copied to every worker that doesn't already have it. There are scripts and tools available to make this easier on a large cluster; see the sketch after this list.
- Microsoft Azure HDInsight: Essentially the same as above, except you can do it through the Azure web interface, and getting the package onto each worker in the cluster is easier.
- Databricks: Simplest of all. Databricks refers to this in its documentation as "installing a library," and the process is the same for Python, Java, Scala, and R libraries. Their documentation is at https://docs.databricks.com/user-guide/libraries.html
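For the on-premises case above, here is a minimal sketch of scripting the per-node install over SSH. The host names, the bare ssh invocation, and python3 as the cluster's interpreter are all assumptions; substitute your own inventory and automation tooling (Ansible, parallel-ssh, and so on):

import subprocess

# Hypothetical worker hosts; replace with your cluster's inventory.
nodes = ["spark-worker-01", "spark-worker-02", "spark-worker-03"]

for node in nodes:
    # Run pip under the cluster's Python so the package becomes
    # visible to the Spark executors on that node.
    subprocess.check_call(
        ["ssh", node, "python3", "-m", "pip", "install", "pandas"]
    )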
If you only want the packages for a single job/workflow, the instructions are simpler and nearly identical for each connection type. You do this in the connection configuration dialog where you set up the Apache Spark connection. Whether the connection is on-premises, Databricks, or Microsoft Azure HDInsight, you have the option to add libraries to your connection string; simply add the library in that part of the dialog. The exact instructions are in the Alteryx help, and since they may change after this reply is written, I'll just link to the documentation here (a quick verification sketch follows the links):
- On-premises (i.e., Livy) or Microsoft Azure HDInsight: https://help.alteryx.com/current/DataSources/SparkDirect.htm, under Advanced Options
- Databricks: https://help.alteryx.com/current/DataSources/SparkDatabricks.htm
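Once the library is added by either route, a quick way to confirm it is importable on both the driver and the executors is to run a check like the following inside the Apache Spark Code tool. This is a sketch that assumes a SparkContext is available as sc, as is typical in a PySpark session:

import pandas as pd

# Driver-side check.
print("driver pandas:", pd.__version__)

def executor_check(_):
    # Importing inside the function makes the import happen on the executor.
    import pandas
    return pandas.__version__

# sc is assumed to be the SparkContext provided by the tool / session.
versions = sc.parallelize(range(2), 2).map(executor_check).distinct().collect()
print("executor pandas:", versions)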
Senior Software Engineer
Alteryx
