Large Data
Hi All,
I have input data that produces an output of over 200K rows. I developed an Alteryx workflow which takes a lot of time to run (almost 1 hour) and then produces an output file of about 140 MB.
The input is actually an ODBC query, but for security purposes I have attached a dummy Excel file here.
Please help me to get a solution.
Thanks in advance.
- Labels:
- Reporting
May I clarify what the ask is?
- Are you looking for an alternative method to read/write the data faster? (200K rows is actually not a very large size for Alteryx. I have handled GB-sized data with millions of records, so maybe I am reading the question wrongly.)
- Or do you need help with a particular step in the analysis/procedure that you are stuck on?
Dawn.
Hi,
Actually, the time is consumed by the Basic Table tool and the Render tool. I am using the Render tool to create an XLSX file.
The second concern is that the XLSX file with all the formatting is reaching up to 140 MB. I also have 30 fields in the report.
The normal file size is about 40 MB when the ODBC query is executed directly from Excel with similar formatting.
Regards,
As I come across Excel files as both inputs and outputs, I would question the usability of such a large Excel file. Even at 40 MB (let alone 140 MB), just opening the Excel file to read it would be a challenge. Using the Render tool to generate such a large volume of formatted Excel does not sound very optimal to me.
I don't know if this works for you. My usual middle-of-the-road approach is to use Alteryx to build the "master data table" that contains the highest possible dimensionality I need, and then build the related charts etc. off of it.
Dawn.
Hi,
There are many financial reports that need historical data, so the volume keeps increasing. We can't apply any criteria to make the size smaller.
Please suggest a workflow that would let us reduce the size of the final output to less than 50 MB.
Regards,
Hi,
Please suggest any other tool that can be used instead of the Render tool. The Render tool's formatting consumes a lot of time, so I need to find a better solution. Even for a DB query that returns only about 20K rows, the tool's formatting takes about 40 minutes.
Please suggest.
In my experience, it's best to use the reporting tools to format Excel files only when the number of rows in your output is small.
So, I would suggest:
1) Use the normal .xlsx output (without report formatting) for this situation. Reserve the reporting tools for summarized outputs with a smaller number of rows.
2) If it is really necessary to format the output at this number of rows, do it manually after Alteryx gives you the output; that will not create the heavy file that Alteryx creates with the reporting tools. Or, if you have some knowledge of Python, you can use the Python tool together with the openpyxl library to automate the formatting in a better way (see the sketch after this list).
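To illustrate that second option, here is a minimal sketch (not a tested part of this workflow) of post-formatting a plain .xlsx written by the Output Data tool using openpyxl. The file names, the fill color, and the choice to style only the header row are my assumptions:

```python
# Minimal post-formatting sketch: Alteryx's Output Data tool writes the raw
# .xlsx, then openpyxl styles only the header row afterwards.
# "report_raw.xlsx" and the fill color are placeholder assumptions.
from openpyxl import load_workbook
from openpyxl.styles import Font, PatternFill

wb = load_workbook("report_raw.xlsx")   # file produced by the Output Data tool
ws = wb.active

header_font = Font(bold=True, color="FFFFFF")
header_fill = PatternFill(start_color="4472C4", end_color="4472C4",
                          fill_type="solid")

# Styling only row 1 keeps the file small; applying styles to every one of
# the 200K data rows is what tends to inflate the file size.
for cell in ws[1]:
    cell.font = header_font
    cell.fill = header_fill

ws.freeze_panes = "A2"                   # keep the header visible when scrolling
wb.save("report_formatted.xlsx")
```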
If the provided answer helped you to solve the problem/was correct, please accept it as a solution :)
Thanks.
The solution to your problem is: don't use Excel to store your entire data history.
Batch your output if you have to, or otherwise partition it. Alteryx (or Python, or anything else) isn't going to solve this issue until you fix the root problem.
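As one illustration of the partitioning idea (my sketch, not something from this thread), the history could be split on a date column so each year lands in its own smaller workbook; the column and file names below are hypothetical:

```python
# Hypothetical partitioning sketch: split one large history table into one
# smaller workbook per year instead of a single 140 MB file.
import pandas as pd

df = pd.read_excel("full_history.xlsx")            # or the ODBC query result
df["Year"] = pd.to_datetime(df["TransactionDate"]).dt.year

for year, chunk in df.groupby("Year"):
    chunk.drop(columns="Year").to_excel(f"report_{year}.xlsx", index=False)
```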
@apathetichell, Excel can handle 200K rows well in my experience; 200K rows alone is not what makes an Excel file take an hour to write and reach almost 150 MB. That is not realistic. The problem as I understand it, and one I am familiar with, is that the Table tool formatting is making the output file too heavy.
Excel can handle 200K rows, but this is also 30 columns, and Excel performance usually starts to decline as row and column counts increase. I also 100% agree that this is not what Render is designed for. I'm running a 1MM-row output to Excel as we speak to see how it looks from Render.
Something to keep in mind is that because Render is designed for "printed pages", it always treats the file as something to be printed. @Jayan_alteryx, have you tried the Output Data tool? At 1,000,000 rows I'm not seeing much of a difference between Output Data and Render in terms of size, though Output Data is a bit faster. You can try throwing an Auto Field tool before your Table tool, but I don't think it will change much.
I will reiterate: storing an entire transactional or other history in Excel doesn't make sense. Excel is a spreadsheet; it is not designed to be a database.
