Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Community is experiencing an influx of spam. As we work toward a solution, please use the 'Notify Moderator' option on the ellipsis menu to flag inappropriate posts.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

pandas read_excel from outside file (xlrd needed?)

DanWhalen
7 - Meteor

I'm trying to use the Python tool to bring in data using pd.read_excel("/path/"). However, when I do, I get an error:

 

ModuleNotFoundError Traceback (most recent call last)
c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\io\excel.py in __init__(self, io, **kwds)
351 try:
--> 352 import xlrd
353 except ImportError:

ModuleNotFoundError: No module named 'xlrd'


Does that mean xlrd is a dependency for pandas when trying to use read_excel?  If that's the case, is there any way to use pandas for reading in outside data files?

17 REPLIES 17
Derangedvisions
11 - Bolide

To read in using jupyter notebooks (python tool) the the syntax is slightly different- you must use "Alteryx.read("#1")" you can read more about it Here or you can type in Alteryx.help() in the notebook to learn more.

hope that helps?

BenMoss
ACE Emeritus
ACE Emeritus
I'm thinking along the same lines as the poster above.

It sound spoke your trying to read in an excel file with python, why not just use the input data tool and then stream the data into python by connecting it to the python tool and then using the syntax notes by the user above.

Ben
DanWhalen
7 - Meteor

Sorry, I forgot to mention the file is outside Alteryx.  I'm reading the data in from a fixed location/path, its not something inside the workflow.

 

I don't think its an Alteryx.read or Alteryx.write issue, because so far, I've only been getting this error with Excel files.  CSVs, for example, seem to read in fine with traditional pandas syntax.

 

This is working for me fine:

from ayx import Alteryx
import pandas as pd
df = pd.read_csv("S:/PATH/TO/FILE/Data.csv")
Alteryx.write(df,1)

 

This is not:

from ayx import Alteryx
import pandas as pd
df = pd.read_excel("S:/PATH/TO/FILE/Data.xlsx")
Alteryx.write(df,1)

 

I'm wondering if it has something to do with the fact that Excel files aren't flat.  Wouldn't they need a parser, then?  Must be that pandas parses Excel files with xlrd?  I never knew that, but it would make sense to me (given my very limited understanding of python under the hood!).

BenMoss
ACE Emeritus
ACE Emeritus
That to me doesn't explain why you can't use the input data tool and browse to the path you mention?

Ben
Derangedvisions
11 - Bolide

oh! sorry @DanWhalen i guess i misunderstood! I haven't tried reading in data that way.  I guess you could still throw in an input tool and place the file path there instead? (i know that defeats what you are trying to do, but it works!)

DanWhalen
7 - Meteor

You can read why I'm trying to dodge the Input Data tool on this post:
https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Reading-entire-folder-of-Excel-files-a...

 

But it doesn't matter what the resolution to the above question is, I still would like to find out exactly what this error message is trying to tell me.

 

I'm assuming this is saying Alteryx would need to add the xlrd package to the Python env in order for me to use pd.read_excel?  I base that off of the parts that say "ModuleNotFoundError: No module named 'xlrd'" and "ImportError: Install xlrd >= 0.9.0 for Excel support", but most of this error message is gobbledegook to me.

 

Can anyone confirm?  Full text of the error message is below:

 

---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\io\excel.py in __init__(self, io, **kwds)
351 try:
--> 352 import xlrd
353 except ImportError:

ModuleNotFoundError: No module named 'xlrd'

During handling of the above exception, another exception occurred:

ImportError Traceback (most recent call last)
<ipython-input-2-d145fb7caa14> in <module>
1 from ayx import Alteryx
2 import pandas as pd
----> 3 df = pd.read_excel("C:/Users/Desktop/test/Data 1.xlsx",engine=None)
4 Alteryx.write(df,1)

c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
176 else:
177 kwargs[new_arg_name] = new_arg_value
--> 178 return func(*args, **kwargs)
179 return wrapper
180 return _deprecate_kwarg

c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
176 else:
177 kwargs[new_arg_name] = new_arg_value
--> 178 return func(*args, **kwargs)
179 return wrapper
180 return _deprecate_kwarg

c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\io\excel.py in read_excel(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, **kwds)
305
306 if not isinstance(io, ExcelFile):
--> 307 io = ExcelFile(io, engine=engine)
308
309 return io.parse(

c:\program files\alteryx\bin\miniconda3\pythontool_venv\lib\site-packages\pandas\io\excel.py in __init__(self, io, **kwds)
352 import xlrd
353 except ImportError:
--> 354 raise ImportError(err_msg)
355 else:
356 ver = tuple(map(int, xlrd.__VERSION__.split(".")[:2]))

ImportError: Install xlrd >= 0.9.0 for Excel support

AndrewKramer
Alteryx Alumni (Retired)

The read_excel function in Pandas has a dependency on the 'xlrd' package, which is not included by default in the Miniconda installation in Alteryx Designer.

 

You can install the xlrd package in one cell with the following syntax:

from ayx import Package
Package.installPackages('xlrd')

 

Then, in a separate cell, you can load your Excel file:

import pandas as pd
df = pd.read_excel('C:/users/ajkramer/Desktop/sabr.xlsx')
df.head()

 

 

carl_steinhilber
8 - Asteroid

I don't seem to be able to run Package.installPackages() via a proxy.

I have a proxy (and credentials) defined in the Designer User Settings. The Python Tool appears to know the "Use Proxy" checkbox is checked, but doesn't seem tolook at the credentials, however, because I can't get a connection. I've checked, double-checked and triple-checked the credentials and I'm certain they are correct.

 

Installing... 
(this may take a minute depending on the package size, dependencies, and other factors)
Collecting xlrd
  Retrying (Retry(total=4, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))': /simple/xlrd/
  Retrying (Retry(total=3, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))': /simple/xlrd/
  Retrying (Retry(total=2, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))': /simple/xlrd/
  Retrying (Retry(total=1, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))': /simple/xlrd/
  Retrying (Retry(total=0, connect=None, read=None, redirect=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 407 Proxy Authentication Required',))': /simple/xlrd/
  Could not fetch URL https://pypi.python.org/simple/xlrd/: There was a problem confirming the ssl certificate: EOF occurred in violation of protocol (_ssl.c:749) - skipping
  Could not find a version that satisfies the requirement xlrd (from versions: )
No matching distribution found for xlrd

 

Anyone know how to configure the Python Tool to use a proxy?

AndrewKramer
Alteryx Alumni (Retired)

Hi, 

 

I would recommend reaching out to Alteryx Customer Support (support@alteryx.com) for assistance with the proxy.

 

Thanks,

Andrew

Labels