Hello everyone.
My company recently upgraded Alteryx Designer from version 2019.2 to 2021.1, and since then I have been having trouble with the Alteryx.write() function inside the Python tool. I have two tables that were pivoted inside the script using more than one index and some columns. After resetting the indices, I call Alteryx.write(), which gives me the following error:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2888             try:
-> 2889                 return self._engine.get_loc(casted_key)
   2890             except KeyError as err:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-45-fc7f657f3e2e> in <module>
----> 1 Alteryx.write(teste, 2)

c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\ayx\export.py in write(pandas_df, outgoing_connection_number, columns, debug, **kwargs)
     85     When running the workflow in Alteryx, this function will convert a pandas data frame to an Alteryx data stream and pass it out through one of the tool's five output anchors. When called from the Jupyter notebook interactively, it will display a preview of the pandas dataframe. An optional 'columns' argument allows column metadata to specify the field type, length, and name of columns in the output data stream.
     86     """
---> 87     return __CachedData__(debug=debug).write(
     88         pandas_df, outgoing_connection_number, columns=columns, **kwargs
     89     )

c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\ayx\CachedData.py in write(self, pandas_df, outgoing_connection_number, columns, output_filepath)
    426
    427         for index, colname in enumerate(pandas_df.columns):
--> 428             coltype = str(pandas_df.dtypes[index])
    429             # does the column contain bytearrays? then its probably a blob
    430             # (check only first non-null value in column -- tradeoff for efficiency)

c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    880
    881         elif key_is_scalar:
--> 882             return self._get_value(key)
    883
    884         if (

c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
    989
    990         # Similar to Index.get_value, but we do not fall back to positional
--> 991         loc = self.index.get_loc(label)
    992         return self.index._get_values_for_loc(self, loc, label)
    993

c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2889                 return self._engine.get_loc(casted_key)
   2890             except KeyError as err:
-> 2891                 raise KeyError(key) from err
   2892
   2893         if tolerance is not None:

KeyError: 0
For the first table, I used a crude workaround: reassigning the column labels from a plain list with
dataframe.columns = dataframe.columns.tolist()
which worked pretty well. However, for the second table the error persists. I've tried running both on our old Gallery (still on the 2019 version) and they worked fine.
I've googled a lot to try to find a proper solution to this error. Am I missing something important here? Any help or suggestion would be appreciated.
Notes: 1) The error is the same for each function; 2) I am able to plot the dataframes inside the Jupyter Notebook normally; 3) Other tables inside the Alteryx workflow are running OK.
Thanks!
This annoying error means that pandas cannot find the requested key among the labels it is searching (for a dataframe, usually a column name). Before doing anything else with the dataframe, use print(df.columns) to check whether the column actually exists.
print(df.columns)
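As a sketch of that check, the snippet below builds a small made-up table, pivots it, and lists the column labels that are not plain strings. The names `raw`, `pivoted`, `non_string_labels` and the sample values are illustrative, not taken from the original workflow. One plausible cause of `KeyError: 0` at `pandas_df.dtypes[index]` is a column index that mixes strings and integers after pivoting, so that pandas treats the integer as a label rather than a position. That is an assumption about the asker's data, not something confirmed in this thread. Converting every label to a string is one possible workaround.

```python
import pandas as pd

# Illustrative data only; not the tables from the question.
raw = pd.DataFrame({
    "region": ["N", "N", "S", "S"],
    "year": [2019, 2020, 2019, 2020],
    "sales": [10, 20, 30, 40],
})

# Pivoting on an integer column yields integer column labels; reset_index()
# then adds the string label "region", so the labels become a mix of types.
pivoted = raw.pivot_table(index="region", columns="year", values="sales").reset_index()

print(pivoted.columns)

# Labels that are not plain strings are the ones worth a closer look.
non_string_labels = [c for c in pivoted.columns if not isinstance(c, str)]

# Possible workaround (an assumption, untested against Alteryx.write itself):
# make every column label a plain string.
pivoted.columns = [str(c) for c in pivoted.columns]
```

In the workflow from the question, the same conversion would be applied to `teste` before calling `Alteryx.write(teste, 2)`. Whether it resolves the error for the second table has not been verified here.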
I was getting a similar error in one of my scripts. It turned out that a particular index label was missing from my dataframe because I had dropped 2 empty rows. If this is your case, run df.reset_index(inplace=True) and the error should be resolved.
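A minimal sketch of that situation, using made-up data and hypothetical variable names (`df`, `trimmed`, `renumbered`): dropping rows leaves gaps in the row labels, and resetting the index renumbers them from zero. This example passes `drop=True` so the old labels are discarded rather than kept as a new column, which is a choice made for this sketch and not part of the suggestion above.

```python
import pandas as pd

# Illustrative data only: rows 1 and 2 are empty.
df = pd.DataFrame({"value": [1.0, None, None, 4.0]})

# Dropping the empty rows keeps the original labels 0 and 3,
# so a lookup such as trimmed.loc[1] would raise KeyError: 1.
trimmed = df.dropna()
label_1_present = 1 in trimmed.index

# Renumber the rows 0..n-1; drop=True discards the old labels
# instead of inserting them as an extra "index" column.
renumbered = trimmed.reset_index(drop=True)
```

Note that this addresses missing row labels. The traceback in the question fails while looking up a column dtype, so it may or may not be the same problem.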