Hi,
Is there a limit on the dataframe size when using Alteryx.write(df, 1)?
I can't seem to write my dataframe to an anchor.
My code uses zipfile and read_excel to get the data, and for some reason it is not being read in as a dataframe, although it clearly is one when I run it locally.
Is there any way to convert it again to ensure I have a dataframe in Alteryx?
Share your Python code.
from ayx import Alteryx
import pandas as pd
import zipfile
xlrd = Alteryx.importPythonModule("C:\\Users\\user\\.conda\\envs\\Python\\Lib\\site-packages\\xlrd")
archive = zipfile.ZipFile(r'O:\Alteryx\Community\Original - All fields - October 2022.zip')
xlfile = archive.open('Original - All fields - October 2022.xls')
df = pd.concat(pd.read_excel(xlfile, header = 1, sheet_name=None), ignore_index=True)
#print(df)
Alteryx.write(df, 1)
Hi @wonka1234
What do you see when you print what you called 'df'?
df = process_files(month_to_process)
print(df)
@Felipe_Ribeir0 Ah, I get "None" when I print df. Sigh, not sure where it is failing to be converted.
@wonka1234 So this is your problem: if df is not a dataframe, you cannot call Alteryx.write(df, 1).
Go back through your code to see what is missing, make sure that df is a dataframe, and it will work.
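A quick way to confirm this before calling Alteryx.write is to check the type of df. The following is only a minimal sketch: ensure_dataframe is a hypothetical helper written for illustration, not part of ayx or pandas, and the small dataframe is a stand-in for the real data.

```python
import pandas as pd


def ensure_dataframe(obj):
    """Return obj unchanged if it is a pandas DataFrame, otherwise raise.

    Hypothetical helper for illustration; not part of ayx or pandas.
    """
    if obj is None:
        # A function that builds a dataframe but never returns it yields None
        raise TypeError("df is None; check that the code building it returns a value")
    if not isinstance(obj, pd.DataFrame):
        raise TypeError(f"expected a pandas DataFrame, got {type(obj).__name__}")
    return obj


# Example with a small stand-in dataframe
df = ensure_dataframe(pd.DataFrame({"a": [1, 2]}))
print(type(df).__name__)
```

If the check raises, the problem is upstream of Alteryx.write, in whatever code assigns df.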
Did you import pandas as pd?
@apathetichell Yes, it is imported.
In the code below, I can print the df fine.
So why isn't this code working?
from ayx import Alteryx
import pandas as pd
import zipfile
xlrd = Alteryx.importPythonModule("C:\\Users\\user\\.conda\\envs\\Python\\Lib\\site-packages\\xlrd")
archive = zipfile.ZipFile(r'O:\Alteryx\Community\Original - All fields - October 2022.zip')
xlfile = archive.open('Original - All fields - October 2022.xls')
df = pd.concat(pd.read_excel(xlfile, header = 1, sheet_name=None), ignore_index=True)
#print(df)
Alteryx.write(df, 1)
and I am getting this huge error:
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2888             try:
-> 2889                 return self._engine.get_loc(casted_key)
   2890             except KeyError as err:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 0

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
<ipython-input-2-d532070a98dc> in <module>
     19 #print(df)
     20
---> 21 Alteryx.write(df, 1)

c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\ayx\export.py in write(pandas_df, outgoing_connection_number, columns, debug, **kwargs)
     85     When running the workflow in Alteryx, this function will convert a pandas data frame to an Alteryx data stream and pass it out through one of the tool's five output anchors. When called from the Jupyter notebook interactively, it will display a preview of the pandas dataframe. An optional 'columns' argument allows column metadata to specify the field type, length, and name of columns in the output data stream.
     86     """
---> 87     return __CachedData__(debug=debug).write(
     88         pandas_df, outgoing_connection_number, columns=columns, **kwargs
     89     )

c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\ayx\CachedData.py in write(self, pandas_df, outgoing_connection_number, columns, output_filepath)
    426
    427             for index, colname in enumerate(pandas_df.columns):
--> 428                 coltype = str(pandas_df.dtypes[index])
    429                 # does the column contain bytearrays? then its probably a blob
    430                 # (check only first non-null value in column -- tradeoff for efficiency)

c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    880
    881         elif key_is_scalar:
--> 882             return self._get_value(key)
    883
    884         if (

c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
    989
    990         # Similar to Index.get_value, but we do not fall back to positional
--> 991         loc = self.index.get_loc(label)
    992         return self.index._get_values_for_loc(self, loc, label)
    993

c:\program files\alteryx\bin\miniconda3\envs\designerbasetools_venv\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2889                 return self._engine.get_loc(casted_key)
   2890             except KeyError as err:
-> 2891                 raise KeyError(key) from err
   2892
   2893         if tolerance is not None:

KeyError: 0
@wonka1234 Yes, you won't be able to use Alteryx.write(df, 1) unless df is a dataframe. This is what this error is saying. What you called df is not a dataframe (as you saw when trying to print it), so it is a good idea to go back into your code and see why.
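The traceback above fails at pandas_df.dtypes[index] inside ayx, where index is a position produced by enumerate. One possible reading, which is an assumption and not something confirmed in this thread, is that some column labels are not strings (for example numeric header cells in the workbook), so pandas treats 0 as a label and cannot find it. A minimal sketch of converting every column label to a string before writing, using a small stand-in dataframe with made-up labels:

```python
import pandas as pd

# Stand-in for the concatenated workbook: one string label and one integer label
df = pd.DataFrame([[1, 2], [3, 4]], columns=["name", 2022])

# Convert every column label to a string
df.columns = df.columns.map(str)

print(list(df.columns))  # ['name', '2022']
```

If this reading applies, Alteryx.write(df, 1) would then be called as before.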
Try something like:
from ayx import Alteryx
import pandas as pd
import zipfile
xlrd = Alteryx.importPythonModule("C:\\Users\\user\\.conda\\envs\\Python\\Lib\\site-packages\\xlrd")
archive = zipfile.ZipFile(r'O:\Alteryx\Community\Original - All fields - October 2022.zip')
xlfile = archive.open('Original - All fields - October 2022.xls')
# sheet_name=None returns a dict of dataframes (one per sheet); concat them, then wrap the result in pd.DataFrame()
df = pd.DataFrame(pd.concat(pd.read_excel(xlfile, header=1, sheet_name=None), ignore_index=True))
#print(df)
Alteryx.write(df, 1)
Basically, you need to explicitly convert the result into a dataframe, so the pd.DataFrame() function is key.
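For context, with sheet_name=None pd.read_excel returns a dict of dataframes keyed by sheet name, and pd.concat accepts such a dict directly. A minimal sketch with a hand-built dict standing in for the workbook (the sheet names and values are made up for illustration):

```python
import pandas as pd

# Stand-in for pd.read_excel(xlfile, header=1, sheet_name=None)
sheets = {
    "Sheet1": pd.DataFrame({"id": [1, 2], "value": ["a", "b"]}),
    "Sheet2": pd.DataFrame({"id": [3], "value": ["c"]}),
}

# Stack all sheets into one dataframe with a fresh 0..n-1 index
df = pd.concat(sheets, ignore_index=True)

print(isinstance(df, pd.DataFrame))  # True
print(len(df))  # 3
```

Printing type(df) right before Alteryx.write(df, 1) is a cheap way to see what is actually being passed to the anchor.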