Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Alteryx.write error - "input must be a pandas series"

6 - Meteoroid

I have multiple datasets with column headers in Japanese. I ran the headers through Google Translate and put the original and translated header values in a spreadsheet so that when we receive updated datasets in the future, I can use this as a template to easily find and replace headers with the translation. I created what I thought was a pretty straightforward Python script to load each dataset into a dataframe, union all of the dataframes based on the common set of fields between them, and then rename the headers with the translations.

 

from ayx import Alteryx
import pandas as pd

# Import data & translation template
df_2401_2t = pd.read_excel('**filepath + sheet name**')
df_2401_4t = pd.read_excel('**filepath + sheet name**')
df_2601 = pd.read_excel('**filepath + sheet name**')
df_trans = pd.read_excel('**filepath + sheet name**')

frames = [df_2401_2t, df_2401_4t, df_2601]

# Union df on common field names
df = pd.concat(frames, join = 'inner', ignore_index = True)

# Find & replace headers with translations
for col in df.columns:
    for i in df_trans.index:
        if col == df_trans['Original'][i]:
            df.rename({col: df_trans['Translation'][i]}, axis = 1, inplace = True)

# Output df to workflow
Alteryx.write(df, 1)

 

However, I'm getting the following error when I try to write df back out to Alteryx:

 

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-19-775ca6c91598> in <module>
----> 1 Alteryx.write(df, 1)

e:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\site-packages\ayx\export.py in write(pandas_df, outgoing_connection_number, columns, debug, **kwargs)
     86     """
     87     return __CachedData__(debug=debug).write(
---> 88         pandas_df, outgoing_connection_number, columns=columns, **kwargs
     89     )
     90 

e:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\site-packages\ayx\CachedData.py in write(self, pandas_df, outgoing_connection_number, columns, output_filepath)
    430             # (check only first non-null value in column -- tradeoff for efficiency)
    431             col_contains_bytearrays = coltype == "object" and isinstance(
--> 432                 firstValidValue(pandas_df[colname]), (bytearray, bytes)
    433             )
    434             try:

e:\program files\alteryx\bin\miniconda3\envs\jupytertool_venv\lib\site-packages\ayx\DataUtils.py in firstValidValue(pd_series)
     46 def firstValidValue(pd_series):
     47     if not isinstance(pd_series, pd.core.series.Series):
---> 48         raise TypeError(f"input must be a pandas series, not a {type(pd_series)}")
     49     if hasattr(pd_series, "first_valid_index"):
     50         first_valid_index = pd_series.first_valid_index()

TypeError: input must be a pandas series, not a <class 'pandas.core.frame.DataFrame'>

I'm not really following here: it's telling me the input must be a pandas Series, but I thought Alteryx.write() required a pandas DataFrame, which is exactly what df is. If anyone can point me in the right direction, it would be much appreciated.
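A note on reading the traceback: the TypeError is not raised for df itself but for pandas_df[colname] inside CachedData.write. In pandas, selecting a single label returns a DataFrame rather than a Series when that label occurs more than once in the columns. The sketch below reproduces only that pandas behaviour. The frame and its column names are made up for illustration, and whether duplicate labels are actually the cause here is an assumption to verify against the real data.

```python
import pandas as pd

# Hypothetical frame: two columns share the label "Name",
# as could happen after a rename maps two headers to one translation.
df = pd.DataFrame([[1, 2, 3]], columns=["Name", "Date", "Name"])

unique_col = df["Date"]  # a unique label selects a Series
dup_col = df["Name"]     # a repeated label selects a DataFrame holding both columns

print(type(unique_col).__name__)
print(type(dup_col).__name__)
```

If df.columns.is_unique is False on the real df, that would be consistent with the error message above.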

Alteryx Certified Partner

Does the same error occur if you replace

Alteryx.write(df, 1)

with

Alteryx.write(df_2401_2t, 1)

 

Could you share the 2401_2t xlsx file? My feeling is there's a value in this that's confusing Alteryx.

You could also try running df = df.reset_index() just before writing the output, though it's probably not that.

6 - Meteoroid

Thanks, Philip. I tried writing df_2401_2t as you suggested and that works fine:

 

SUCCESS: writing outgoing connection data 1

I also tried df.reset_index(), but I still get the same error. Unfortunately I can't share the .xlsx files because they contain proprietary data.
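Since the untranslated frame writes fine and reset_index() (which only touches the row index, not the column labels) makes no difference, one plausible place to look is the rename step: if two original headers share the same translation, the renamed df ends up with repeated column labels. The sketch below is a minimal check under that assumption. The sample headers and values are invented stand-ins for df_trans and df, and the suffixing scheme is just one possible workaround, not something the ayx package requires.

```python
import pandas as pd

# Hypothetical stand-in for df_trans: two originals share one translation.
df_trans = pd.DataFrame({
    "Original": ["氏名", "名前", "日付"],
    "Translation": ["Name", "Name", "Date"],
})
# Hypothetical stand-in for the concatenated data.
df = pd.DataFrame([["a", "b", "2020-01-01"]], columns=["氏名", "名前", "日付"])

# Template rows whose translation appears more than once.
collisions = df_trans[df_trans["Translation"].duplicated(keep=False)]

# Rename in one pass using an Original -> Translation mapping.
mapping = dict(zip(df_trans["Original"], df_trans["Translation"]))
renamed = df.rename(columns=mapping)

# Labels that occur more than once after renaming.
dup_labels = renamed.columns[renamed.columns.duplicated()].unique().tolist()

# One possible workaround: suffix repeats so every label is unique.
counts = {}
unique_cols = []
for label in renamed.columns:
    n = counts.get(label, 0)
    unique_cols.append(label if n == 0 else f"{label}_{n + 1}")
    counts[label] = n + 1
renamed.columns = unique_cols

print(collisions)
print(dup_labels)
print(list(renamed.columns))
```

If collisions comes back empty for the real template, duplicate translations are not the cause and the problem lies elsewhere. Note that the suffixing could itself clash with a header that already ends in something like _2, so editing the translations in the spreadsheet is probably the cleaner fix.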
