Dataframe output Key Error
I have a very simple script that does some calculation and creates new columns.
from ayx import Alteryx
import pandas as pd
import numpy as np
df = Alteryx.read("#1")
for i in range(len(df)):
    k = df["WeekNumber"][i]
    while k < 53:
        df[k] = df["Sum_HC"]*((1-df["WeeklyShrinkage"])**(k-df["WeekNumber"][i]))
        k+=1
Alteryx.write(df,1)
When I replace Alteryx.write(df,1) with print(df), it prints perfectly fine, but when I try to write, I get the following error:
SUCCESS: reading input data "#1"
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
c:\users\appdata\local\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2888 try:
-> 2889 return self._engine.get_loc(casted_key)
2890 except KeyError as err:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 0
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
<ipython-input-10-33adc0abc4c8> in <module>
10 k+=1
11
---> 12 Alteryx.write(df,1)
c:\users\appdata\local\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\ayx\export.py in write(pandas_df, outgoing_connection_number, columns, debug, **kwargs)
85 When running the workflow in Alteryx, this function will convert a pandas data frame to an Alteryx data stream and pass it out through one of the tool's five output anchors. When called from the Jupyter notebook interactively, it will display a preview of the pandas dataframe. An optional 'columns' argument allows column metadata to specify the field type, length, and name of columns in the output data stream.
86 """
---> 87 return __CachedData__(debug=debug).write(
88 pandas_df, outgoing_connection_number, columns=columns, **kwargs
89 )
c:\users\appdata\local\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\ayx\CachedData.py in write(self, pandas_df, outgoing_connection_number, columns, output_filepath)
426
427 for index, colname in enumerate(pandas_df.columns):
--> 428 coltype = str(pandas_df.dtypes[index])
429 # does the column contain bytearrays? then its probably a blob
430 # (check only first non-null value in column -- tradeoff for efficiency)
c:\users\appdata\local\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
880
881 elif key_is_scalar:
--> 882 return self._get_value(key)
883
884 if (
c:\users\appdata\local\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
989
990 # Similar to Index.get_value, but we do not fall back to positional
--> 991 loc = self.index.get_loc(label)
992 return self.index._get_values_for_loc(self, loc, label)
993
c:\users\appdata\local\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2889 return self._engine.get_loc(casted_key)
2890 except KeyError as err:
-> 2891 raise KeyError(key) from err
2892
2893 if tolerance is not None:
KeyError: 0
Any ideas?
Can you share a screenshot of what the dataframe looks like before you write it? Better to show the output of a Jupyter Notebook cell, as opposed to printing it.
According to the error, it looks like there's an issue with the loop's range. However, if you are able to print the Dataframe, but not write it (assuming all else is equal), this wouldn't make sense. Is there any way you can provide the data set so I can investigate further?
Try adding this line:
df.columns = df.columns.astype(str)
before you write the output. I'm confident that will solve your issue.
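For anyone wondering why this helps, here is my reading of the traceback (an interpretation, not something confirmed by Alteryx). The loop does df[k] = ... with k taken from WeekNumber, so the new column labels are integers rather than strings. The writer in ayx then calls pandas_df.dtypes[index] with a positional counter, and with integer labels mixed into the column index that call appears to be treated as a label lookup, hence KeyError: 0. Below is a minimal sketch with made-up data (the values are hypothetical, not the poster's dataset) that shows the label types before and after the fix:

```python
import pandas as pd

# Hypothetical sample data standing in for the Alteryx input "#1".
df = pd.DataFrame({
    "WeekNumber": [50, 51],
    "Sum_HC": [100.0, 200.0],
    "WeeklyShrinkage": [0.01, 0.02],
})

# Assigning with an integer key creates a column whose label is an int,
# which is what the loop in the question does with df[k].
df[52] = df["Sum_HC"] * (1 - df["WeeklyShrinkage"])

label_types_before = [type(c).__name__ for c in df.columns]

# The suggested fix: cast every column label to a string before writing.
df.columns = df.columns.astype(str)

label_types_after = [type(c).__name__ for c in df.columns]

print(label_types_before)
print(label_types_after)
```

Alternatively, creating the columns with string names in the first place, e.g. df[str(k)] = ..., should avoid the problem without the extra line.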
It worked! thank you 😁
A pandas KeyError occurs when you try to access a column or row label that doesn't exist in your DataFrame. Usually this happens because the column or row name is misspelled or has an unwanted space before or after it. Before doing anything with the DataFrame, use print(df.columns) to check whether the column exists.
print(df.columns)
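A stray space is hard to spot in the plain print output, so here is a small sketch of one way to reveal and remove it. The column names are invented for illustration; it assumes every label is a string, since the .str accessor is meant for string labels.

```python
import pandas as pd

# Hypothetical frame whose column names carry stray whitespace.
df = pd.DataFrame({" Sum_HC": [1, 2], "WeekNumber ": [3, 4]})

# The lookup fails because the real label is " Sum_HC", not "Sum_HC".
found_before = "Sum_HC" in df.columns

# tolist() prints the labels with quotes, which makes the spaces visible.
print(df.columns.tolist())

# Strip leading and trailing whitespace from every label.
df.columns = df.columns.str.strip()

found_after = "Sum_HC" in df.columns
print(df.columns.tolist())
```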
I was getting a similar error in some of my own code. It turned out that a particular index label was missing from my DataFrame because I had dropped 2 empty rows. If this is your case, you can run df.reset_index(inplace=True) and the error should be resolved.
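To illustrate the missing-index case, here is a minimal sketch with made-up data. Note that reset_index without arguments keeps the old index as a new column named "index"; passing drop=True, as assumed here, discards it instead.

```python
import pandas as pd

# Hypothetical data: three rows with the default index 0, 1, 2.
df = pd.DataFrame({"value": [10, 20, 30]})

# Dropping a row leaves a gap in the index: the labels are now 0 and 2.
df = df.drop(index=1)

# Looking up the missing label raises KeyError.
try:
    df.loc[1, "value"]
    raised = False
except KeyError:
    raised = True

# Renumber the rows 0..n-1 and discard the old index.
df = df.reset_index(drop=True)

# Label 1 exists again and points at the row that used to be label 2.
second_value = df.loc[1, "value"]
print(raised, second_value)
```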
worked for me, too! THX!
perfect! THX!
