
Alteryx Designer Desktop Ideas


Optional override for record limit on updated browse tool data profile.

As of Alteryx version 2020.3, the Browse tool no longer shows a profile of the complete dataset: profiling is capped once the cumulative record size reaches 300MB.

 

My proposed solution is an optional override of the record size limit on the Browse tool. Profiling would take longer with the override on, but it would cover the entire dataset. I would also like a general user setting that makes the Browse tool limited or unlimited by default.

 

Below is the newly added documentation for the Data Profiling Limit, which I propose should be overridable.

Data Profiling Limit
Data Profiling in the Browse tool is capped at 300 MB. This allows you to process very large datasets faster. For each record in the incoming dataset, we process the record and add the record size to a counter. Once the counter reaches 300 MB, we stop processing records.

It is important to note that there is no specific number of records that we can process. This depends on the dataset since a record size can range from 1 byte to a few thousand bytes. This record size is different from the file size, displayed in the Results grid and Data Profiling Holistic View. The file size is generally different since it has been compressed to optimize spacing.

In other words, 300 MB of record size is not the same as 300 MB of file size.
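
To make the counting rule above concrete, here is a minimal Python sketch of the behavior the documentation describes. It is not Alteryx's implementation: the function name, the list of per-record byte sizes, and the reading of "300 MB" as 300 × 1024 × 1024 bytes are all assumptions made for illustration.

```python
from typing import Iterable

# Assumption: "300 MB" is treated as 300 * 1024 * 1024 bytes.
PROFILE_LIMIT_BYTES = 300 * 1024 * 1024


def records_profiled(record_sizes: Iterable[int],
                     limit_bytes: int = PROFILE_LIMIT_BYTES) -> int:
    """Count the records processed before the running size counter reaches the limit.

    Hypothetical helper: each record is processed and its size is added to a
    counter; once the counter has reached the limit, no further records are read.
    """
    counter = 0
    processed = 0
    for size in record_sizes:
        if counter >= limit_bytes:
            break  # limit reached, stop processing records
        counter += size
        processed += 1
    return processed


# Ten records of 100 bytes each against a 350-byte limit:
# the fourth record pushes the counter to 400, so processing stops there.
print(records_profiled([100] * 10, limit_bytes=350))  # -> 4
```

The number of records profiled depends only on the record sizes, which is why the documentation says there is no fixed record count for the cap.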

 

 

 

This new limit can cause confusion when reading the data profile (e.g. if you expect the sum to be $3 million, but the Browse tool is only profiling 2% of your total records, the profile sum may only show about $60 thousand).

 

The capped profile with its 300MB cutoff is rarely useful if you use Browse tools to get a quick sense of the variable profiles on medium-sized datasets (around 1 million records), since these rarely fit within the 300MB record size limit.

 

An example is shown in the image below, where the dataset contains 855,085 records, but the Browse tool is profiling only the first 20,338 records.

 

alteryxExample1.png
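
As a quick check on the numbers in this post, the short Python sketch below computes what share of the dataset the profile covers, and what sum a partial profile would show. The helper names are hypothetical, and the partial-sum estimate assumes values are spread evenly across records, which real data may not satisfy.

```python
def profile_coverage(profiled_records: int, total_records: int) -> float:
    """Fraction of the dataset that the capped profile actually covers."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return profiled_records / total_records


def expected_partial_sum(full_sum: float, coverage: float) -> float:
    """Sum a partial profile would show, assuming evenly distributed values."""
    return full_sum * coverage


# Record counts from the example image: 20,338 of 855,085 records profiled.
coverage = profile_coverage(20_338, 855_085)
print(f"Coverage: {coverage:.1%}")  # about 2.4%

# The $3 million example: a 2% profile shows roughly $60 thousand.
print(f"Partial sum: ${expected_partial_sum(3_000_000, 0.02):,.0f}")
```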

 

Again, being able to override this 300MB record size limit would fix the problem created in the 2020.3 change to the browse tool.

 

 

 

11 Comments
KylieF
Alteryx Community Team

Thank you for your feedback and idea!

 

This is interesting feedback, and I'm sure our product team would be really interested in hearing from users about how often they run into this limitation and whether working with datasets greater than 300MB is common. It looks like this is your first time posting on the idea boards (welcome!), so I'd recommend checking out our Submission Guidelines, as they go over what we are looking for to get an idea pushed to our product team once it's posted.

MichaD
8 - Asteroid

I have the same issue working with a medium-sized dataset: 40,000 rows with 97 fields.

I used to use the Browse tool to get a quick overview of the whole dataset. This is no longer possible with the updated Browse tool.

I'd appreciate the improvements helmick2003usap recommended.

I'm switching to an older Alteryx version until this is fixed.

Struck
6 - Meteoroid

I just ran into what seems to be the same limitation.

Alteryx was not showing me all the distinct values of a field. Following another thread (https://community.alteryx.com/t5/Alteryx-Designer-Discussions/Browse-Tool-wrong-records-amount/td-p/...), I removed several columns with the Select tool and it then showed more distinct values. I am still not sure whether they are all showing up.

I think this is unacceptable. The table is only 19433 records, and the summary tool is not warning me about the fact that it hasn't processed the full table. I am on 2020.3.5....

Cheers,
Enric

 

 

DavisWard
5 - Atom

I have also run into the same limitation. I even reduced my dataset size to <1MB and it's still only profiling about 20% of my records when I select an individual field. Please confirm when this is resolved, thanks!

Struck
6 - Meteoroid

Replying to @DavisWard: you're right, the way the 300MB is counted is shady. How do they count this 300MB? For me, the profiling seems incomplete even in cases where the anchors say the amount of data is just a few MB (and by a few I mean less than 10), and the tool is neither processing all the records nor INFORMING me about it 😞

 

Hope you fix this soon

Struck
6 - Meteoroid

This seems to have been fixed in 2020.4 🎉
Can others confirm? @DavisWard ? @MichaD ? @helmick2003usap ?

helmick2003usap
5 - Atom

@Struck 

Yes, I'm not having issues in 2020.4.

missiecyclone
6 - Meteoroid

I'm still having the issue in 2021.2. At a minimum, I would like to see a warning to make it more obvious that the profile is incomplete.

m_davis337
5 - Atom

Yep, we're still having the issue as well. Would love to see either a way to increase the data profiling limit as a workflow configuration option or have a separate tool we can use to evaluate the entire data set (even if it slows my workflow).

AlteryxCommunityTeam
Alteryx Community Team
Status changed to: Accepting Votes