
Alteryx Knowledge Base


Tool Mastery | Pearson Correlation

Alteryx

This article is part of the Tool Mastery Series, a compilation of Knowledge Base contributions to introduce diverse working examples for Designer Tools. Here we’ll delve into uses of the Pearson Correlation Tool on our way to mastering Alteryx Designer.

 

When you are investigating a new dataset, you might be interested in measuring the correlation between different variables. There are two correlation methods available in Alteryx under the Data Investigation tab:

 

  • Pearson Correlation: Indicates the strength and direction of a linear relationship between two variables
  • Spearman Correlation: Assesses how well an arbitrary monotonic function could describe the relationship between two variables, without assuming that the relationship is linear
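
To make the difference between the two methods concrete, here is a minimal Python sketch that works outside of Designer. It is not how the Alteryx tools are implemented. The helper names (`pearson`, `ranks`, `spearman`) and the sample data are made up for illustration, and the ranking helper assumes there are no tied values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Rank values from 1 to n (assumes no ties, to keep the sketch short)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical data: y always rises with x, but along a curve, not a line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

pearson_r = pearson(x, y)
spearman_r = spearman(x, y)
print(f"Pearson:  {pearson_r:.3f}")
print(f"Spearman: {spearman_r:.3f}")
```

Because `y` increases every time `x` does, the Spearman value comes out at 1, while the Pearson value is lower because the points do not fall on a straight line.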

 

Pearson Correlation

 

We’ll dive into the Pearson Correlation tool in this article. It is the most frequently used correlation measure in practice. If someone tells you the “correlation” between two variables without specifying the method, they’re usually talking about the Pearson method.

 

Before using the tool, you’ll want to make sure the variables you’re analyzing are numeric (ints, floats, and doubles all work fine). Also, make sure you don’t have nulls in the variables you’re analyzing.
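
If you want to see what that pre-check amounts to, the sketch below expresses it in Python. It is only an illustration of the idea, not something the tool runs. The function names, field names, and records are hypothetical.

```python
import math

def is_usable_number(value):
    """True for ints and floats that are not NaN; None, strings, and bools are rejected."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return not math.isnan(value)

def complete_rows(rows, fields):
    """Keep only the rows where every listed field holds a usable number."""
    return [row for row in rows
            if all(is_usable_number(row.get(field)) for field in fields)]

# Hypothetical records with the kinds of problems worth catching up front.
rows = [
    {"price": 10.0, "units": 3},
    {"price": None, "units": 5},          # null value: dropped
    {"price": 12.5, "units": "seven"},    # non-numeric value: dropped
    {"price": float("nan"), "units": 2},  # NaN: dropped
    {"price": 8.0, "units": 9},
]

clean = complete_rows(rows, ["price", "units"])
print(f"{len(clean)} of {len(rows)} rows are ready for correlation analysis")
```

Dropping incomplete rows is just one option. Depending on your use case, you might prefer to impute the missing values instead.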

 

It is usually a good idea to look at a scatterplot of your data to make sure that a linear relationship looks like a reasonable assumption. Pearson correlation isn’t a good choice if your data looks to have a quadratic, logarithmic, or other non-linear relationship.
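
A small Python sketch shows why this matters. The data is made up and the `pearson` helper is written out by hand for illustration. In the quadratic case, y is completely determined by x, yet the Pearson correlation is zero because the relationship is not linear.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

x = [-3, -2, -1, 0, 1, 2, 3]
y_linear = [2 * v + 1 for v in x]   # a straight line
y_quadratic = [v ** 2 for v in x]   # a parabola, symmetric around x = 0

r_linear = pearson(x, y_linear)
r_quadratic = pearson(x, y_quadratic)
print(f"linear:    {r_linear:.3f}")
print(f"quadratic: {r_quadratic:.3f}")
```

A correlation near zero therefore does not mean that two variables are unrelated, only that a straight line does not describe the relationship.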

 

If you’ve decided to use the Pearson Correlation tool, the good news is it’s a pretty simple tool to configure. You really only have two choices to make.

  1. What variables do you want to calculate correlations for?
  2. Do you want to calculate correlations or covariances?

img1.png

 

img2.png

 

The tool generates correlations between all combinations of the variables you specify, so in the example above we’re actually calculating 9 correlations and getting a correlation matrix as our output.
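
The “all combinations” idea can be sketched in Python as follows. This is an illustration under assumptions, not the tool’s implementation. The three variable names and their values are invented, and the `pearson` helper is hand-written. Three variables give a 3 by 3 matrix, or 9 correlations.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical dataset with three numeric variables.
data = {
    "price": [10, 12, 15, 11, 18],
    "units": [30, 27, 20, 29, 14],
    "ad_spend": [5, 7, 6, 9, 8],
}

names = list(data)
# One correlation for every ordered pair of variables, including each
# variable paired with itself.
matrix = {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

for a in names:
    print(a.ljust(10), "  ".join(f"{matrix[a][b]:6.3f}" for b in names))
```

The diagonal is always 1 (each variable is perfectly correlated with itself) and the matrix is symmetric, so only the values on one side of the diagonal carry new information.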

 

img3.png

 

The Pearson Correlation tool can also calculate covariances if you’d prefer. Think of a covariance as an “unstandardized” correlation. It’s still a measure of the relationship between variables, but it isn’t scaled by the standard deviation (i.e. “spread”) of each variable, so its size depends on the units of the data.
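
Here is a short Python sketch of that relationship with made-up data. It assumes the sample covariance, which divides by n - 1. Whether the tool uses that convention or another is not something this article specifies. Switching one variable from dollars to cents multiplies the covariance by 100 but leaves the correlation unchanged.

```python
import math

def covariance(xs, ys):
    """Sample covariance (divides by n - 1; an assumption for this sketch)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance rescaled by the standard deviation of each variable."""
    sd_x = math.sqrt(covariance(xs, xs))
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

# Hypothetical data: the same prices expressed in dollars and in cents.
price_dollars = [10, 12, 15, 11, 18]
price_cents = [p * 100 for p in price_dollars]
units = [30, 27, 20, 29, 14]

cov_dollars = covariance(price_dollars, units)
cov_cents = covariance(price_cents, units)
corr_dollars = correlation(price_dollars, units)
corr_cents = correlation(price_cents, units)

print(f"covariance:  {cov_dollars:.2f} (dollars) vs {cov_cents:.2f} (cents)")
print(f"correlation: {corr_dollars:.3f} (dollars) vs {corr_cents:.3f} (cents)")
```

This is why correlations are easier to compare across pairs of variables, while covariances are mostly useful when the original units matter.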

 

 img4.png

 

If you calculate correlations, you’ll get values between -1 and 1 as your output. There are different philosophies for deciding whether a correlation is weak, moderate, or strong, and the right cutoffs depend on your use case. Still, a good rule of thumb is to think of magnitudes under 0.3 as weak, 0.3 to 0.6 as moderate, and over 0.6 as strong, with the sign signifying a positive or negative relationship.
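
The rule of thumb can be written as a small Python helper. The function name is made up, and the 0.3 and 0.6 cutoffs are simply the ones quoted above, so adjust them to suit your own use case.

```python
def describe_correlation(r):
    """Label a correlation using the rule-of-thumb cutoffs of 0.3 and 0.6."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    magnitude = abs(r)
    if magnitude < 0.3:
        strength = "weak"
    elif magnitude <= 0.6:
        strength = "moderate"
    else:
        strength = "strong"
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    return strength, direction

# Hypothetical correlation values to label.
labels = [describe_correlation(r) for r in (0.12, -0.45, 0.83)]
for r, (strength, direction) in zip((0.12, -0.45, 0.83), labels):
    print(f"{r:+.2f}: {strength}, {direction}")
```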

 

If you’d like to experiment with the data used in this article, download the attached workflow!

 

By now, you should have expert-level proficiency with the Pearson Correlation Tool! If you can think of a use case we left out, feel free to use the comments section below! Consider yourself a Tool Master already? Let us know at community@alteryx.com if you’d like your creative tool uses to be featured in the Tool Mastery Series.

 

Stay tuned with our latest posts every Tool Tuesday by following Alteryx on Twitter! If you want to master all the Designer tools, consider subscribing for email notifications.
