I'm trying to use the Python tool to bring in data using pd.read_excel("/path/"). However, when I do, I get an error:
ModuleNotFoundError Traceback (most recent call last)
c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\io\excel.py in __init__(self, io, **kwds)
351 try:
--> 352 import xlrd
353 except ImportError:
ModuleNotFoundError: No module named 'xlrd'
Does that mean xlrd is a dependency for pandas when trying to use read_excel? If that's the case, is there any way to use pandas for reading in outside data files?
To read in using jupyter notebooks (python tool) the the syntax is slightly different- you must use "Alteryx.read("#1")" you can read more about it Here or you can type in Alteryx.help() in the notebook to learn more.
hope that helps?
Sorry, I forgot to mention the file is outside Alteryx. I'm reading the data in from a fixed location/path, its not something inside the workflow.
I don't think its an Alteryx.read or Alteryx.write issue, because so far, I've only been getting this error with Excel files. CSVs, for example, seem to read in fine with traditional pandas syntax.
This is working for me fine:
from ayx import Alteryx
import pandas as pd
df = pd.read_csv("S:/PATH/TO/FILE/Data.csv")
Alteryx.write(df,1)
This is not:
from ayx import Alteryx
import pandas as pd
df = pd.read_excel("S:/PATH/TO/FILE/Data.xlsx")
Alteryx.write(df,1)
I'm wondering if it has something to do with the fact that Excel files aren't flat. Wouldn't they need a parser, then? Must be that pandas parses Excel files with xlrd? I never knew that, but it would make sense to me (given my very limited understanding of python under the hood!).
oh! sorry @DanWhalen i guess i misunderstood! I haven't tried reading in data that way. I guess you could still throw in an input tool and place the file path there instead? (i know that defeats what you are trying to do, but it works!)
You can read why I'm trying to dodge the Input Data tool on this post:
https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Reading-entire-folder-of-Excel-files-a...
But it doesn't matter what the resolution to the above question is, I still would like to find out exactly what this error message is trying to tell me.
I'm assuming this is saying Alteryx would need to add the xlrd package to the Python env in order for me to use pd.read_excel? I base that off of the parts that say "ModuleNotFoundError: No module named 'xlrd'" and "ImportError: Install xlrd >= 0.9.0 for Excel support", but most of this error message is gobbledegook to me.
Can anyone confirm? Full text of the error message is below:
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\io\excel.py in __init__(self, io, **kwds)
351 try:
--> 352 import xlrd
353 except ImportError:
ModuleNotFoundError: No module named 'xlrd'
During handling of the above exception, another exception occurred:
ImportError Traceback (most recent call last)
<ipython-input-2-d145fb7caa14> in <module>
1 from ayx import Alteryx
2 import pandas as pd
----> 3 df = pd.read_excel("C:/Users/Desktop/test/Data 1.xlsx",engine=None)
4 Alteryx.write(df,1)
c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
176 else:
177 kwargs[new_arg_name] = new_arg_value
--> 178 return func(*args, **kwargs)
179 return wrapper
180 return _deprecate_kwarg
c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
176 else:
177 kwargs[new_arg_name] = new_arg_value
--> 178 return func(*args, **kwargs)
179 return wrapper
180 return _deprecate_kwarg
c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\io\excel.py in read_excel(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, **kwds)
305
306 if not isinstance(io, ExcelFile):
--> 307 io = ExcelFile(io, engine=engine)
308
309 return io.parse(
c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\io\excel.py in __init__(self, io, **kwds)
352 import xlrd
353 except ImportError:
--> 354 raise ImportError(err_msg)
355 else:
356 ver = tuple(map(int, xlrd.__VERSION__.split(".")[:2]))
ImportError: Install xlrd >= 0.9.0 for Excel support
The read_excel function in Pandas has a dependency on the 'xlrd' package, which is not included by default in the Miniconda installation in Alteryx Designer.
You can install the xlrd package in one cell with the following syntax:
from ayx import Package
Package.installPackages('xlrd')
Then, in a separate cell, you can load your Excel file:
import pandas as pd
df = pd.read_excel('C:/users/ajkramer/Desktop/sabr.xlsx')
df.head()
I don't seem to be able to run Package.installPackages() via a proxy.
I have a proxy (and credentials) defined in the Designer User Settings. The Python Tool appears to know the "Use Proxy" checkbox is checked, but doesn't seem tolook at the credentials, however, because I can't get a connection. I've checked, double-checked and triple-checked the credentials and I'm certain they are correct.
Installing... (this may take a minute depending on the package size, dependencies, and other factors) Collecting xlrd Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))': /simple/xlrd/ Retrying (Retry(total=3, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))': /simple/xlrd/ Retrying (Retry(total=2, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))': /simple/xlrd/ Retrying (Retry(total=1, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))': /simple/xlrd/ Retrying (Retry(total=0, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))': /simple/xlrd/ Could not fetch URL https://pypi.python.org/simple/xlrd/: There was a problem confirming the ssl certificate: EOF occurred in violation of protocol (_ssl.c:749) - skipping Could not find a version that satisfies the requirement xlrd (from versions: ) No matching distribution found for xlrd
Anyone know how to configure the Python Tool to use a proxy?
Hi,
I would recommend reaching out to Alteryx Customer Support (support@alteryx.com) for assistance with the proxy.
Thanks,
Andrew