Alteryx Designer Desktop Discussions

Calculate Median and Percentile in Alteryx doesn't work in Large Data set

shubhdgn27
5 - Atom

Hi All,

I am trying to calculate the median and 98th percentile for a large data set with millions of records, but the results are not what I expect. When I run the same calculation on a smaller data set, it works perfectly fine.

KrishnaChithrathil
11 - Bolide

@shubhdgn27

I think this post would help you.

You can achieve the same result using the Summarize tool.

KrishnaChithrathil_0-1669105656849.png

shubhdgn27
5 - Atom

Hi Krishna,

Yes, I am already using the Summarize tool to calculate the median and percentile, but with a large data set (100 million records) the calculation was going wrong.

To work around the problem, I used the Record ID tool, and it now calculates as expected.

shubhdgn27_1-1669113116278.png

Thank you.
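
For readers who cannot see the screenshot, the general idea behind a sort-and-Record-ID workaround can be sketched in Python. This is only an illustration, not the workflow from the post: the function name `median_and_percentile` is hypothetical, and the nearest-rank percentile rule used here is an assumption that may differ from the definition the Summarize tool applies.

```python
def median_and_percentile(values, pct=98):
    """Sort the values, number the rows from 1 (like a Record ID),
    then pick the median and percentile rows by position."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("no values supplied")

    # 1-based positions of the middle row(s); they coincide when n is odd.
    mid_lo = (n + 1) // 2
    mid_hi = (n + 2) // 2
    median = (ordered[mid_lo - 1] + ordered[mid_hi - 1]) / 2

    # Nearest-rank percentile (an assumption): the smallest 1-based position
    # that covers pct% of the rows, using integer ceiling division.
    rank = max(1, (pct * n + 99) // 100)
    return median, ordered[rank - 1]


print(median_and_percentile(range(1, 101)))  # (50.5, 98)
```

Because every value is selected by its position in the fully sorted data, the result does not depend on how an aggregation tool handles large inputs internally.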

LeandroYgorLoli
11 - Bolide

I've recently run into a significant problem in my workflow when calculating medians on large datasets (more than 10 million records). While troubleshooting, I found that results computed in Python are consistent with those obtained using the Record ID tool and sorting. The precision of these medians is critical, because they feed directly into my analysis and decision-making. Could you please prioritize this issue?
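
A cross-check of this kind might look like the sketch below, which compares a sort-based median against Python's standard-library `statistics.median`. The helper name `median_by_sorting` and the randomly generated `sample` are hypothetical stand-ins; the real comparison would use the values exported from the workflow.

```python
import random
import statistics


def median_by_sorting(values):
    """Median taken from the middle position(s) of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2:  # odd count: a single middle row
        return ordered[mid]
    # even count: average the two middle rows
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical stand-in data; replace with the values exported from the workflow.
random.seed(0)
sample = [random.randint(0, 1_000_000) for _ in range(100_001)]

# The two methods should agree exactly on the same input.
assert median_by_sorting(sample) == statistics.median(sample)
```

If the two values agree with each other but not with the aggregated result, that points at the aggregation step rather than at the data.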
