Alteryx Designer Desktop Discussions

Profile results unclear

ArjanF
6 - Meteoroid

Dear community,

As a new Alteryx user, I am puzzled by the behavior of the Profile in a Browse tool. In a very simple analysis I determine the length of a field, and I get the following Profile result:

ArjanF_0-1617888562681.png

The profile tells me I have approx. 8.2M records in my data set, but the counts for the distinct lengths of the field in that data set (between 2 and 18 characters) add up to only approx. 5.5M records. It seems data is missing for about 2.7M records. Generating the profile takes a lot of computation time, yet the information presented seems incomplete. I have read somewhere in the community that the result is capped at 300MB, but if that is happening, shouldn't there be a visible hint of that fact? What gives?

I hope somebody can shed some light on this. I am using Designer version 2020.4.

 

Cheers,

Arjan

6 REPLIES
TrevorS
Alteryx Alumni (Retired)

Hello @ArjanF 

Thanks for reaching out to the Community!
Are you able to share your workflow so that the Community can help to troubleshoot further?
From your screenshot, it appears you have begun to dig into the data within the Browse tool. If we can see the workflow and some sample data, it will be easier to understand what is causing the difference between the number of records you expect and the number being reported.

Here are some resources for you to check out about the browse tool and its data profiling:
Tool Mastery | Browse

Data Profiling in the Browse Tool

Browse Tool Help Doc


Thanks!
TrevorS

Community Moderator
ArjanF
6 - Meteoroid

Hello TrevorS,

It took some time to anonymize my data set, because a smaller set doesn't show the problematic behavior in the Browse tool. Based on that, I believe the behavior is caused by the size of the input file. Apologies in advance for sharing a packaged workflow of 24MB, but I could not make it smaller. The problematic behavior shows in the Browse tool marked in the picture below. I have connected a Summarize and another Browse to the same node as the problematic Browse, to make it clear where the problem is (I hope). Any pointers would be very welcome.

ArjanF_0-1618583711907.png

Thanks for any support/insights on this topic.

ArjanF

 

apathetichell
18 - Pollux

Any chance this is a memory issue on your end?

 

This is what I get when I run a Summarize on your Summarize output to get a total count:

Sum_Count
8552322

 

The Browse tool you marked also showed 8552322 records.

 

Have you tried the Frequency Table tool?
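
The cross-check discussed in this thread (count records per field length, then compare the sum of those counts with the total record count) can also be sketched outside Alteryx. The Python below is only an illustration under assumptions: the function names `length_frequency` and `unaccounted_records` are made up for this example, and the sample values are not the data from the thread.

```python
from collections import Counter

def length_frequency(values):
    """Count how many records have each string length.

    Null values (None) are counted under the key None so they are not lost.
    """
    return Counter(len(v) if v is not None else None for v in values)

def unaccounted_records(total_records, frequency):
    """Number of records in the total that the frequency table does not cover."""
    return total_records - sum(frequency.values())

# Tiny made-up sample, purely for illustration.
sample = ["ab", "abc", "abcd", "ab", None, "abcdefgh"]
freq = length_frequency(sample)

# Length counts for the non-null values, sorted by length.
print(sorted((k, v) for k, v in freq.items() if k is not None))

# A complete frequency table leaves no record unaccounted for.
print(unaccounted_records(len(sample), freq))  # 0
```

If the second number is not zero, the frequency table is partial. With the approximate figures quoted in the first post, 8.2M total records against 5.5M covered by the profile, the same subtraction gives the gap of about 2.7M records.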

ArjanF
6 - Meteoroid

Hi apathetichell,

 

The total number of records is not the problem. Memory on my end should not be the problem either: I am running on a 16 GB machine, which should be enough, I believe.

I made a picture to point it out more clearly, I hope.

ArjanF_0-1618585725568.png

 

 

Kind regards,

ArjanF

apathetichell
18 - Pollux

Honestly - I'd chalk it up to limits of Browse.

 

For this amount of data you should be using something like Field Summary to analyze the field. Clearly the data is there and the machine isn't cutting anything out, as the Summarize functions are working properly.

ArjanF
6 - Meteoroid

That might all be true but, in my humble opinion, the software should give some visual guidance to that effect (assuming data volume is triggering this behavior). With the implementation in 2020.4 I get no visible feedback that the results are unreliable because the profile is not created on the full data set! That in itself is very bad, as unaware users will reach the wrong conclusion, as I initially did. Again, in my humble opinion: either show a full and correct result, or show no result (only a message about the data size), because the current, potentially partial result leaves users guessing at the correctness of the Profile information.
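
To make the request concrete, here is a minimal sketch of a profile that stops at a size budget and says so. It does not describe how the Browse tool works internally; the byte cap, the `ProfileResult` class, and the functions `profile_with_cap` and `summary_line` are all hypothetical names invented for this illustration.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class ProfileResult:
    counts: Counter        # value -> number of profiled records with that value
    records_profiled: int  # records actually included in counts
    records_total: int     # records in the full input

    @property
    def truncated(self):
        """True when the profile covers only part of the input."""
        return self.records_profiled < self.records_total

def profile_with_cap(values, max_bytes):
    """Profile string values until a (hypothetical) byte budget is used up."""
    counts = Counter()
    used = profiled = total = 0
    capped = False
    for v in values:
        total += 1  # every record counts toward the total, profiled or not
        if capped:
            continue
        size = len(v.encode("utf-8"))
        if used + size > max_bytes:
            capped = True  # budget exhausted: stop profiling, keep counting
            continue
        used += size
        counts[v] += 1
        profiled += 1
    return ProfileResult(counts, profiled, total)

def summary_line(result):
    """Text a user interface could show next to the profile."""
    if result.truncated:
        return (f"WARNING: profile covers {result.records_profiled} "
                f"of {result.records_total} records")
    return f"Profile covers all {result.records_total} records"

print(summary_line(profile_with_cap(["aa", "bb", "cc", "dd"], max_bytes=5)))
print(summary_line(profile_with_cap(["aa", "bb", "cc", "dd"], max_bytes=100)))
```

The point of the sketch is that the full record count is tracked separately from the profiled count, so a partial result can always be flagged instead of being presented as complete.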

 

Is there a feature request / bug report section somewhere where this can be brought to the attention of the Alteryx developers?

 

Kind regards,

ArjanF
