Alteryx Designer Discussions

Find answers, ask questions, and share expertise about Alteryx Designer.

Representative Sample of Entire Store Network


I'd like to design a store-level test that pulls a sample of stores which we could confidently say represents the whole company, and I'm hoping for some ideas on how to do this scientifically using Alteryx. Our company has about 300 stores and I'd like to pull somewhere between 20 and 25 of them. Are there any ideas on how to get started after I've identified the store-level attributes we want?

Alteryx Certified Partner



Have you heard of a chi-square test?


It answers this question: is a child population (here, your sample of stores) as representative in terms of field-x as the parent population? In other words, is the distribution of values for a characteristic in the sample close enough to its distribution in the whole group?


I created a macro that lets you compare a field and its values across two sets and get the mathematical results to answer that question with a PASS/FAIL. You could run the test on multiple fields and see whether the groups are similar. Note that the macro limits the number of discrete values permitted. If you are looking at variables like distance to store, # of visits, or $ amounts, you would need to BIN or BUCKET them into groups and compare the bins and their counts. !app/Chi-square-test-macro/55955c2e398a7111688e6fff
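For anyone who wants to see the arithmetic outside of Alteryx, here is a minimal Python sketch of the same idea. It is not the macro's implementation. The function names (`bin_value`, `chi_square_check`), the region labels, and the store counts are all made up for illustration, and the PASS/FAIL rule assumes a 5% significance level. With only 20 to 25 stores the expected count per category can get small, so treat the verdict as a rough guide.

```python
from collections import Counter

# Upper 5% critical values of the chi-square distribution, keyed by
# degrees of freedom (assumption: a 5% significance level is acceptable).
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def bin_value(value, edges):
    """Return the bin index for a numeric value, given ascending bin edges."""
    return sum(value >= edge for edge in edges)


def chi_square_check(population, sample):
    """Compare sample category counts with the counts the population implies.

    Returns (statistic, degrees_of_freedom, "PASS" or "FAIL").
    Only supports up to 6 categories because of the small lookup table above.
    """
    pop_counts = Counter(population)
    sample_counts = Counter(sample)
    statistic = 0.0
    for category, pop_count in pop_counts.items():
        expected = len(sample) * pop_count / len(population)
        observed = sample_counts.get(category, 0)
        statistic += (observed - expected) ** 2 / expected
    df = len(pop_counts) - 1
    verdict = "PASS" if statistic <= CHI2_CRITICAL_05[df] else "FAIL"
    return statistic, df, verdict


# Hypothetical network of 300 stores described by one categorical field.
population = ["urban"] * 150 + ["suburban"] * 90 + ["rural"] * 60
balanced = ["urban"] * 10 + ["suburban"] * 6 + ["rural"] * 4
skewed = ["urban"] * 18 + ["suburban"] * 2

print(chi_square_check(population, balanced))
print(chi_square_check(population, skewed))

# Numeric fields are bucketed first, then compared the same way.
sales_edges = [100, 200]  # hypothetical cut points
print([bin_value(v, sales_edges) for v in (50, 150, 250)])
```

You would repeat the check for each attribute you care about and keep a candidate sample only if it passes on all of them.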


It is a start.




Alteryx ACE & Top Community Contributor

Chaos reigns within. Repent, reflect and reboot. Order shall return.

Hi @chris_cleckner,


What you are after is one of two things: either a way to test each candidate set of stores against the population, as @MarqueeCrew has suggested, or a way to cluster similar stores and pick one representative store out of each cluster. Alternatively, you could use the cluster metrics.


Take a look at the Store Clustering examples available in the Tableau & Qlik samples associated with the Starter Kits. You may get some ideas from those; however, you might want to adapt them to fit different types of variables and different numbers of clusters.
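To make the "cluster, then pick one store per cluster" idea concrete, here is a rough Python sketch. It is not taken from the Starter Kit samples. It uses a plain k-means (Lloyd's algorithm) on standardized attributes and then takes the store closest to each cluster centre. The function names, store IDs, and attribute values are hypothetical, and k-means is only one of several clustering methods you could use.

```python
import math
import random


def standardize(rows):
    """Scale each column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(columns, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]

    def nearest(p):
        return min(range(k), key=lambda j: math.dist(p, centroids[j]))

    for _ in range(iterations):
        labels = [nearest(p) for p in points]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(p) for p in points], centroids


def pick_representatives(store_ids, features, k, seed=0):
    """Return the store nearest to the centre of each non-empty cluster."""
    points = standardize(features)
    labels, centroids = kmeans(points, k, seed=seed)
    chosen = []
    for j in range(k):
        members = [i for i, label in enumerate(labels) if label == j]
        if members:
            best = min(members, key=lambda i: math.dist(points[i], centroids[j]))
            chosen.append(store_ids[best])
    return chosen


# Hypothetical stores described by (weekly sales, floor area).
store_ids = ["S01", "S02", "S03", "S04", "S05", "S06"]
features = [[100, 10], [110, 12], [500, 50], [520, 48], [900, 95], [880, 90]]
print(pick_representatives(store_ids, features, k=3))
```

For the real case you would set k to the number of stores you want (20 to 25), try a few seeds, and then check the chosen stores against the whole network before trusting them.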