Advent of Code is back! Unwrap daily challenges to sharpen your Alteryx skills and earn badges along the way! Learn more now.
Community is experiencing an influx of spam. As we work toward a solution, please use the 'Notify Moderator' option on the ellipsis menu to flag inappropriate posts.

Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.

How to output 2 Million records in excel

saralisset
6 - Meteoroid

Hello,

 

I have a workflow that generates an outuo with more than 2M records. I used "RecordID" and then a formula to create "sheets" in the excel based om records, in this case 1M per sheet. This report is taking 29 min to execute. If I remove the splitting functionality this report execute in 2min. Do you know any other way to split the data bases on amount of records that can work faster?

 

Any help would be much appreciated.

saralisset_1-1670462476014.jpeg

 

 

Thanks,

Sara

9 REPLIES 9
PanPP
Alteryx Alumni (Retired)

Hi @saralisset 

 

 

Excel has a hard 1M row limit per sheet. 

 

You can try to use the Block Until Done tool for the 1M records per sheet to see if the performance will be impacted. 

You can also try to enable AMP engine to see if performance will improve.

 

Hope this helps.

apathetichell
19 - Altair

 

As a reminder - Excel is not a database. It's a spreadsheet. There is never a case to output 2mm rows to Excel.

 

Having said that - add a record id tool. create a partition using something like ceil ([RecordID]/1000000) - add this column to your filename so that you now have a fully formed filename like "myincrediblelargeexcelfile_"+tostring([partition])+".xlsx|sheet1" - now use this field in your output data tool under "change entire path" - your error may be in your selection in your output data. Your filenaming convention seems reasonable.

 

NOTE - I'd use a number below 1,000,000 like say 700,000 because performance really starts to drag on larger excel files.

 

Did I mention that excel is not a database and you should not be storing your data there? perhaps you can look at a nice csv file  - or json?

 

also your screen grab is your sql error. You need a datasource named in odbc64 and a configured driver. Configure/text in ODBC 64 and then add it to Alteryx.

saralisset
6 - Meteoroid

thanks for answering. Your solution is exactly what I did… I’ll try with 700k but I don’t think that will help. I debugged it and performance issue is because RecordID. Any other idea? 
thanks anyways!

apathetichell
19 - Altair

Hi - can you share the error message? Assuming you have a normal size amount of ram (16gb or greater) - I highly doubt the issue is RecordID. If it is in performance profiling - turn off AMP. I just generated 100,000,000 rows. I ran RecordID. I created a partition. It took 21.6 seconds.

apathetichell
19 - Altair

Can you confirm there is only one record in the your text input which you are appending? Can you share more of your workflow? Are you by any chance running it on a Mac in a VM setting?

saralisset
6 - Meteoroid

Oh no. I missstated, I think the error is the splitting of data in different tabs. How do you output that 100M records? CSV file ?

 

qurstions:

RAM in my personal computer? Or the server? What would be the minimum required for a report like this?

What is AMP? Where should I turn it off

how do you create a partition?

 

saralisset
6 - Meteoroid

Yes, that only is replace by the location where the output will be save

saralisset
6 - Meteoroid

Running in VDI windows. 

apathetichell
19 - Altair

They are not different tabs - they must be different files... file.xlsx|sheet1...file.xlsx|sheetn - have a 1,000,000 total record limit - you would want file1.xlsx|sheet1 ... filen.xlsx|sheet1

 

Memory in VDI and performance can be significantly different than in a native windows system - and not sure about your question on Server. Is this being run on a Server or on desktop? If this is run on a VDI on desktop - please increase your memory you are allocating to your windows environment to 16gb.

 

For Server if you mean an EC2/Compute Instance - that would be the same as a windows system for this purpose.

Labels