It’s a measure of how many standard deviations below or above the population mean a raw score is. It is a way to compare results from a test to a normal distribution.
How is Z Score Calculated?
Z score calculations contain three simple variables
X = observed raw score
M = Mean of the test
SD = Standard Deviation of the test
The Problem I encounter when trying to apply this in Alteryx is, what if I have multiple tests for which I want to calculate the Z Score in a single data-set? Instead of doing the same repeated steps over and over for each test. I went ahead and created a macro.
How the Macro Functions:
When feeding in a dataset to the Z Score Select Macro, you are prompted to select two options. The first is the “Grouping Fields” – This will be the columns in your dataset that we will then calculate the average and standard deviation on. It can be one or multiple.
For example; I have an experiment that I am holding test every week (1-52) over 20 years. In my dataset I have 2 columns, one for week and one for year. If I want to find the observed Z Score for weeks 1-52 in 2018, my grouping fields will be both columns; week and year. If I wanted to calculate an observed Z Score for week 1 in 2010, 2011, and 2020; only week will be my grouping field. Alternatively, if I wanted to calculate an observed Z Score for every test in a given year, only year will by my grouping field.
The Second option is prompting you to select the fields you want to score. These will be the columns that contain your observed or tested values. You can select one or multiple.
The output will append a new column containing the Z score for every observed record. From there you can do as your heart desires!
there is a built in standardized z-score tool in alteryx already, but it's extremely hard to find on your own... considering the folder it's buried in is labeled "Supporting Macros", it must be a "sub"-macro they built to put inside of some other predictive tool, but they didn't make the z-score one itself directly available.
for me, it was located here. took awhile to find it-
Nice find! Using the standardized z-score macro, the z-scores turn out great. But, because I can't pass a string through the tool, I lose my unique row identifiers. Joining on Record number just jumbles everything up. Any ideas?