community
cancel
Showing results for 
Search instead for 
Did you mean: 

Alteryx designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Alteryx Not Reading Large File

Hello -

 

I am trying to use a fairly large .xlsx file in Alteryx.

 

I am inputting it using the Input Tool.

 

After attaching a Browse tool and running the workflow, Alteryx is showing that 0 records are being output from the input.

 

However, when I reduce the "Record Limit for All Inputs" to 100k, 300k, or even 500k records, it seems to work fine and outputs that number of records.

 

However, it has over 1 million records in the file and without a Record Limit, it outputs 0 total records.

 

Greatly appreciate any help.

Alteryx
Alteryx

@JohnKan222 Can you confirm exactly how many rows and columns your Excel file has? Excel has limits on the number of rows and columns supported by the file format. If your file is exceeding these limits it will likely cause issues when attempting to read the file. These limits are documented both in our help documentation as well as by Microsoft. I am specifically concerned with the row limit in your case as the limit for the file format is a little over 1 million (1,048,576) rows and you stated your file has over 1 million records.

 

References:

https://support.office.com/en-us/article/excel-specifications-and-limits-1672b34d-7043-467e-8e27-269...

https://help.alteryx.com/current/index.htm#AlteryxDriverXLSX.htm (See lengths and limits section)

Bolide

what ever program wrote the xlsx, if it exceeds the row limit of 1,048,076, typically the excel file is corrupted after the last record.  This is why you can set the limit option on your input tool and read up to that limit but if you remove it then it will not work because it gets to the last record and is lost on what to do next because there is no end of file as expected but there is no next record either...this is in layman's terms and the actual file code is a bit different I expect but the bottom line is the file was not written correctly because it exceeded the limit.  If you open in excel it will tell you not all rows are there...the excel program and its proprietary drivers know how to interpret the final code in the excel overhead, the custom drivers from Alteryx do not address this.  Most of us are fine with this because we know not to write data to excel if we think we might get near the 1m row mark, we write it as a csv or yxdb or other format that is not limited to ~1m rows.  If you have an affinity for excel you can use the output tool, assuming the file is written by alteryx, to write multiple files,or write to multiple tabs in the same workbook, capping them to no more than N rows per file...excel's limitation of ~1m rows is per sheet not per workbook.

 

 

Hi, 

 

It would be great if you could help me..

I have a very similar issue, with large excel files: they do not exceed 700.000 rows and 40 columns, however they are not read into Alteryx. Could this be due to the size of 100 MB?Untitled.png

 

thank you!

Bolide

Hi  @Ioana_Radoveneanu,

I have read larger files, in rows and columns as well as bytes.  Do you have an error message?  The only message in your screenshot is not a fatal error.

without seeing an error message or at least knowing how long you waited for it to read before canceling the job I can't help much.

 

Is the file on a non-local disk?  I mean is the file on network or cloud storage...and do you use vpn to reach it?  That could cause very slow reading do to network constraints.  so you may think it should read in a minute but it might take 5 dragging the data across the vpn.

 

2 things to try

1: get the file on your local hard drive so there is no network or vpn involved then try to read it.

2: Try first using the limit option within the "Input Tool" configuration to see if you can read a few records from the file.  if not then there may be a file issue.  if so then jump to 200k rows and work your way up. if it reads but you see it takes a long time then you may get a feel for how long to expect the input to take and it may be you killed the run too soon...

clipboard_image_0.png

@fharper - thank you! 

The issue was also because of the fact that the source files were on a network, which was slowing things down...

After a few hours the inputs finished loading, I was not expecting it to last that long

 

 

Meteoroid

I am having this issue with Directory Tool. My excel file is on OneDrive. If I copy the file on local machine it works fine but I want to avoid manual intervention. Any suggestions on how to resolve it?

Bolide

Onedrive is a Cloud service.  Basically it is a network drive you reach via an IP.  

 

If you can get the UNC path for your one drive folder you should be able to put that in your directory tool like any other network drive.  So instead of something like C:\Users\* in the Directory tool it would be something like this... https:\\IPaddress\YourOneDrive\* this latter being made up to illustrate...

Meteoroid

That's the problem. I am not able to get that path. Is there any workaround within Alteryx?

Bolide

I don't use Onedrive but I have a hard time imaging there is no way to identify UNC path..suggest exploring that...I will see if I can do a trial and see...out of curiosity 

Labels