
SOLVED

Mismatch in summary data from Summarize tool and Browse tool

pushkar_oke
6 - Meteoroid

I have a large data set of about 57 million rows, and after a full day on it, my team and I still cannot explain the following discrepancy. A Browse tool and a Summarize tool are connected to the same data set, as shown in the first screen capture. The Summarize tool output shows a total quantity of 791 million pounds, but the Browse tool shows 70 million, as seen in the other two screen captures. Because the file is so large, I have no way to identify the true value. Any help would be appreciated.

 

Workflow:

Flow.PNG

 

Summarize Output:

 

Summarize Output.PNG

 

Browse Output:

Browse Output.PNG

Jean-Balteryx
16 - Nebula

Hi @pushkar_oke ,

 

Could you send a wider screenshot of the Browse tool, please?

pushkar_oke
6 - Meteoroid

Hi Jean, please find below the screenshots of the Browse tool:

 

Browse_2.PNG

 

Browse_3.PNG

Jean-Balteryx
16 - Nebula

I would expect quantity to be an integer, but it's a Double field. Does it contain decimal values?

pushkar_oke
6 - Meteoroid

I wouldn't expect that field to contain decimals. I tried converting it to Int64 with a Select tool before the Summarize and Browse tools, but got a similar result.

Jean-Balteryx
16 - Nebula

Can you try computing min, max, and average with the Summarize tool, to check consistency with the Browse tool?

pushkar_oke
6 - Meteoroid

Yes, I did that. There was a discrepancy in Max, Avg, and Median as well. Min looks good, though.

 

Summarize Result:

Sum_Act.delivery qty: 791970318
CountNonNull_Act.delivery qty: 56901306
Min_Act.delivery qty: -283302
Max_Act.delivery qty: 166528
Avg_Act.delivery qty: 13.918315
Median_Act.delivery qty: 32
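
A quick arithmetic check suggests the Summarize row is at least internally consistent: dividing the sum by the non-null count should reproduce the reported average. The sketch below takes Sum = 791970318, CountNonNull = 56901306, and Avg = 13.918315 from the Summarize row; how the digits split into columns is an assumption, not something the post states.

```python
# Values taken from the Summarize output; the column split is an assumption.
total = 791_970_318          # Sum_Act.delivery qty
count_non_null = 56_901_306  # CountNonNull_Act.delivery qty
reported_avg = 13.918315     # Avg_Act.delivery qty

# If Sum, Count, and Avg all describe the same full data set,
# Sum / Count should match Avg up to display rounding.
computed_avg = total / count_non_null
consistent = abs(computed_avg - reported_avg) < 1e-5

print(f"computed avg = {computed_avg:.6f}, consistent = {consistent}")
```

The count also matches the roughly 57 million rows mentioned in the original question, which is what one would expect if Summarize processed every record.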

 

And Browse output:

Min-Max-Avg.PNG

 

Jean-Balteryx
16 - Nebula

I just discovered that Browse data profiling is capped at 300 MB. Is your data larger than that?
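
To illustrate why a size-capped profile can report a smaller total than a full aggregation, here is a minimal Python sketch. It is not how Alteryx implements profiling; `profile_sum`, the 8 bytes per value, and the tiny cap are all hypothetical and chosen only to make the effect visible.

```python
def profile_sum(values, bytes_per_value, cap_bytes):
    """Sum only the values that fit under a byte cap (hypothetical model)."""
    max_rows = cap_bytes // bytes_per_value
    seen = values[:max_rows]
    return sum(seen), len(seen)

# Toy data set: 10,000 values at an assumed 8 bytes each (80,000 bytes).
values = list(range(1, 10_001))

full_sum = sum(values)  # what a full aggregation sees
capped_sum, rows_seen = profile_sum(values, bytes_per_value=8, cap_bytes=8_000)

print(f"full sum   = {full_sum} over {len(values)} rows")
print(f"capped sum = {capped_sum} over {rows_seen} rows")
```

With the cap covering only a tenth of the bytes, the capped statistics describe a subset of the rows, so its sum, max, and average can all differ from the full-data figures.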

pushkar_oke
6 - Meteoroid

Yes! The data is over 2 GB. That must be it, then!

 

Are the Summarize numbers correct, then? Can I use them for further calculations?

 

Jean-Balteryx
16 - Nebula

Here is an extract from the Browse tool documentation: https://help.alteryx.com/20212/designer/browse-tool

 

Capture d’écran 2021-08-03 à 23.15.58.png

 

So you can trust the Summarize result!
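
For an independent cross-check outside Designer, one option is to stream an export of the data and accumulate the same statistics one row at a time, so the file never has to fit in memory. The sketch below assumes the data can be exported to CSV and that empty cells count as nulls; `summarize_column` is a hypothetical helper, and the small in-memory sample stands in for the real file.

```python
import csv
import io

def summarize_column(rows, column):
    """Stream rows (dicts) and accumulate sum, count, min, max, avg."""
    total, count = 0.0, 0
    lo = hi = None
    for row in rows:
        raw = row.get(column) or ""
        if raw.strip() == "":
            continue  # assumption: empty cells are treated as nulls
        value = float(raw)
        total += value
        count += 1
        lo = value if lo is None else min(lo, value)
        hi = value if hi is None else max(hi, value)
    avg = total / count if count else None
    return {"sum": total, "count_non_null": count,
            "min": lo, "max": hi, "avg": avg}

# Small stand-in for the real export; row 3 has a null quantity.
sample = io.StringIO("id,Act.delivery qty\n1,10\n2,-5\n3,\n4,25\n")
result = summarize_column(csv.DictReader(sample), "Act.delivery qty")
print(result)
```

For the real data, the same function could be fed `csv.DictReader(open(path, newline=""))`, where `path` points to the exported file; the results can then be compared against the Summarize output.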
