Alteryx Designer Desktop Discussions


Simulation Sampling - Chunk Size

6 - Meteoroid



I want to use the Simulation Sampling tool to create re-performable (reproducible) samples over large data sets (hundreds of millions of records).


Will the chunk size parameter limit how much of the data is eligible to be selected for a sample?


Thank you,

6 - Meteoroid

After playing around with this, it seems that chunk size refers to the number of records the sampling tool looks at at one time. So if you want the sample to be re-performable and might need to re-run it or add iterations, your chunk size must be at least the number of records. Example:

Population = 50

Chunk size = 25

Iterations = 5 then 10

With these parameters, the sample results for the first 5 records change when the iterations are increased from 5 to 10, even if the seed stays the same. However, if you increase the chunk size to be greater than or equal to the total population, you get the same results for the first 5 records.
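
To illustrate why this could happen, here is a toy Python sketch. It is not Alteryx's actual implementation; it only assumes that a single seeded random generator is consumed chunk by chunk, with every iteration drawing from a chunk before the next chunk is read. The function name chunked_sample and its parameters are made up for this example.

```python
import random

def chunked_sample(population, chunk_size, iterations, seed):
    """Toy model (an assumption, not Alteryx's algorithm): one seeded
    generator is shared across chunks, and each chunk is drawn from
    once per iteration before the next chunk is processed."""
    rng = random.Random(seed)
    results = [[] for _ in range(iterations)]
    for start in range(0, len(population), chunk_size):
        chunk = population[start:start + chunk_size]
        for i in range(iterations):
            # Each draw advances the shared generator, so the number of
            # iterations changes where later chunks start in the stream.
            results[i].append(rng.choice(chunk))
    return results

population = list(range(50))

# Chunk size 25: two chunks, so the second chunk's draws shift
# when the iteration count goes from 5 to 10.
small_5 = chunked_sample(population, 25, 5, seed=42)
small_10 = chunked_sample(population, 25, 10, seed=42)
print("chunk=25, first 5 match:", small_10[:5] == small_5)

# Chunk size 50: a single chunk, so the first 5 iterations use the
# same draws regardless of how many iterations follow.
big_5 = chunked_sample(population, 50, 5, seed=42)
big_10 = chunked_sample(population, 50, 10, seed=42)
print("chunk=50, first 5 match:", big_10[:5] == big_5)
```

In this model, only the draws from the first chunk stay stable when iterations are added; everything after the first chunk depends on how many random numbers were already used. That matches the behavior described above, but whether the tool works this way internally is a guess.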
