on 03-19-2019 01:55 PM - edited on 03-19-2019 02:02 PM by SydneyF
This article is part of the Tool Mastery Series, a compilation of Knowledge Base contributions to introduce diverse working examples for Designer Tools. Here we’ll delve into uses of the Pearson Correlation Tool on our way to mastering the Alteryx Designer:
When you are investigating a new dataset, you might be interested in measuring the correlation between different variables. There are two correlation methods available in Alteryx under the Data Investigation tab:
Pearson Correlation: Indicates the strength and direction of a linear relationship between two variables
Spearman Correlation: Assesses how well an arbitrary monotonic function could describe the relationship between two variables, without making any other assumptions about the particular nature of the relationship between the variables
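To make the difference between the two methods concrete, here is a minimal Python sketch that runs outside of Designer. It is not how the Alteryx tools are implemented; the helper names pearson, ranks, and spearman are made up for illustration, and the rank helper assumes there are no tied values. The data follows an exponential curve, which is monotonic but far from linear.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: co-variation divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

def ranks(values):
    """Ranks from 1 to n. Assumes no tied values, to keep the sketch short."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

x = [1, 2, 3, 4, 5, 6]
y = [math.exp(v) for v in x]  # monotonic, but strongly non-linear

pearson_r = pearson(x, y)
spearman_r = spearman(x, y)
print(f"Pearson:  {pearson_r:.3f}")
print(f"Spearman: {spearman_r:.3f}")
```

Because y always increases with x, the Spearman value is 1, while the Pearson value comes out noticeably lower because the points do not fall on a straight line.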
We’ll dive into the Pearson Correlation tool in this article. It is the most frequently used correlation measure in practice. If someone tells you the “correlation” between two variables without specifying the method, they’re usually talking about the Pearson method.
Before using the tool, you’ll want to make sure the variables you’re analyzing are numeric (ints, floats, and doubles all work fine). Also, make sure you don’t have nulls in the variables you’re analyzing.
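If you want to prototype these checks outside of Designer, the idea can be sketched in a few lines of Python. The field names price and units and the sample rows are hypothetical, and dropping incomplete rows is only one possible way to handle nulls; inside a workflow you would do the equivalent cleanup with upstream tools before the Pearson Correlation tool.

```python
from numbers import Real

# Hypothetical sample records; None stands in for a null value.
rows = [
    {"price": 10.0, "units": 120},
    {"price": 12.5, "units": None},   # null value
    {"price": "n/a", "units": 95},    # non-numeric value
    {"price": 15.0, "units": 80},
]
fields = ["price", "units"]

def is_numeric(value):
    """True for ints and floats, excluding booleans and NaN."""
    # bool is a subclass of int, so exclude it explicitly.
    # NaN is the only float that is not equal to itself.
    return isinstance(value, Real) and not isinstance(value, bool) and value == value

# Keep only rows where every analyzed field holds a usable number.
clean_rows = [row for row in rows if all(is_numeric(row[f]) for f in fields)]
print(f"Kept {len(clean_rows)} of {len(rows)} rows")
```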
It is usually a good idea to look at a scatterplot of your data to make sure that a linear relationship looks like a reasonable assumption. Pearson correlation isn’t a good choice if your data appears to have a quadratic, logarithmic, or other non-linear relationship.
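A small Python sketch shows why the scatterplot check matters. The data here is invented: y is exactly x squared, so y is completely determined by x, yet the Pearson correlation is essentially zero because the relationship is not linear. The pearson helper is written out by hand for illustration and is not the tool's implementation.

```python
import math

def pearson(xs, ys):
    """Pearson correlation computed from deviations around the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]  # a perfect quadratic relationship, symmetric around 0

r = pearson(x, y)
print(f"Pearson correlation for y = x^2: {r:.3f}")
```

A correlation near zero therefore does not mean the variables are unrelated, only that a straight line describes the relationship poorly.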
If you’ve decided to use the Pearson Correlation tool, the good news is it’s a pretty simple tool to configure. You really only have two choices to make.
What variables do you want to calculate correlations for?
Do you want to calculate correlations or covariances?
The tool will generate correlations between all combinations of the variables you specify, so in the example above we’ll actually be calculating 9 correlations and getting a correlation matrix as our output.
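As a rough illustration of what "all combinations" means, the Python sketch below builds a correlation matrix for three hypothetical variables (price, units, and ad_spend, with made-up values). Three variables give 3 x 3 = 9 entries, including each variable paired with itself. This only mimics the shape of the output and is not the tool's own code.

```python
import math
from itertools import product

def pearson(xs, ys):
    """Pearson correlation computed from deviations around the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

# Hypothetical data: three numeric variables with no nulls.
data = {
    "price": [10.0, 12.5, 15.0, 17.5, 20.0],
    "units": [120, 95, 80, 70, 50],
    "ad_spend": [3.0, 1.0, 4.0, 1.5, 5.0],
}

# Every ordered pair of variables, including each variable with itself.
matrix = {(a, b): pearson(data[a], data[b]) for a, b in product(data, repeat=2)}

names = list(data)
print(" " * 10 + "".join(f"{name:>10}" for name in names))
for a in names:
    print(f"{a:<10}" + "".join(f"{matrix[(a, b)]:>10.3f}" for b in names))
```

The diagonal is always 1 (each variable correlates perfectly with itself) and the matrix is symmetric, so only the entries on one side of the diagonal carry distinct information.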
The Pearson Correlation tool can also calculate covariances if you’d prefer. Think of covariances as an “unstandardized” correlation. It’s still a measure of the relationship between variables, but it’s not adjusted for the variance (i.e. “spread”) of each variable.
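The sketch below illustrates the "unstandardized" point with made-up numbers. Expressing the same prices in cents instead of dollars multiplies the covariance by 100 but leaves the correlation unchanged. The function names are hypothetical, and the sample formula (dividing by n - 1) is an assumption made here; the article does not say which convention the tool uses.

```python
import math

def covariance(xs, ys):
    """Sample covariance. Dividing by n - 1 is an assumption of this sketch."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Correlation is the covariance rescaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

price_dollars = [10.0, 12.5, 15.0, 17.5, 20.0]
price_cents = [p * 100 for p in price_dollars]  # same prices, different unit
units = [120, 95, 80, 70, 50]

cov_dollars = covariance(price_dollars, units)
cov_cents = covariance(price_cents, units)
corr_dollars = correlation(price_dollars, units)
corr_cents = correlation(price_cents, units)

print(f"Covariance  (dollars): {cov_dollars:.3f}   (cents): {cov_cents:.3f}")
print(f"Correlation (dollars): {corr_dollars:.3f}   (cents): {corr_cents:.3f}")
```

This is why correlations are easier to compare across variable pairs, while covariances keep the original units of the data.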
If you calculate correlations, you’ll get values between -1 and 1 as your output. There are different philosophies on how to judge whether a correlation is weak, moderate, or strong, and the right thresholds depend on your use case. Still, a good rule of thumb is to think of magnitudes under 0.3 as weak, 0.3 to 0.6 as moderate, and over 0.6 as strong, with the sign indicating a positive or negative relationship.
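The rule of thumb can be written as a small helper. This is only a sketch: the function name describe_correlation is hypothetical, and treating exactly 0.3 and 0.6 as "moderate" is a choice made here, since the rule of thumb does not say which side the boundaries fall on.

```python
def describe_correlation(r):
    """Return (strength, direction) for a correlation using the 0.3 / 0.6 rule of thumb."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")

    magnitude = abs(r)
    if magnitude < 0.3:
        strength = "weak"
    elif magnitude <= 0.6:  # boundary handling is an assumption of this sketch
        strength = "moderate"
    else:
        strength = "strong"

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction

for value in (-0.75, 0.45, 0.1):
    print(value, describe_correlation(value))
```

Adjust the thresholds to whatever is conventional in your field; the cutoffs are a starting point rather than a standard.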
If you’d like to experiment with the data used in this article, download the attached workflow!
By now, you should have expert-level proficiency with the Pearson Correlation Tool! If you can think of a use case we left out, feel free to use the comments section below! Consider yourself a Tool Master already? Let us know at firstname.lastname@example.org if you’d like your creative tool uses to be featured in the Tool Mastery Series.
Stay tuned with our latest posts every Tool Tuesday by following Alteryx on Twitter! If you want to master all the Designer tools, consider subscribing for email notifications.