I'm learning the Data Investigation tools and am completely new to them.
How do I determine which pairs of variables, such as (A-B) and (A-C), are strongly correlated based on the output below, and why?
Hi @poojasingh111,
Correlation is a measure of whether there is a linear link between two sets of values. It ranges from -1 to +1: the closer it is to +1 or -1, the stronger the correlation, and the closer it is to 0, the weaker.
You can read more about it here:
https://en.wikipedia.org/wiki/Correlation
https://help.alteryx.com/20221/designer/pearson-correlation-tool
Hey @poojasingh111,
The closer the value is to 1 or -1, the stronger the positive or negative correlation.
You have 1s in the output because every variable correlates perfectly with itself.
@poojasingh111 the closer the value is to 1 (positive correlation) or -1 (negative correlation), the stronger the correlation is. A value of 1 means the two variables are perfectly linearly related, i.e. a change in one is matched by a directly proportional change in the other. You should always get a diagonal of 1s through the middle of the matrix, as a variable is always perfectly correlated with itself.
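If it helps to see the idea outside Alteryx, here is a minimal Python sketch of the Pearson coefficient. The function name `pearson` and all the numbers are made up for illustration; they are not the data from the post, and this is not necessarily how the Alteryx tool computes it internally.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up values, not the data from the post.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
rising = [2 * v for v in a]         # goes up exactly in step with a
falling = [-3 * v + 10 for v in a]  # goes down exactly in step with a

r_self = pearson(a, a)        # a variable against itself
r_up = pearson(a, rising)     # perfect positive relationship
r_down = pearson(a, falling)  # perfect negative relationship
print(r_self, r_up, r_down)
```

Under these assumptions, a variable against itself and against a perfectly proportional copy both come out at 1 (up to rounding), and the perfectly opposite one comes out at -1.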
I still didn't understand. Can anyone help me understand using the example given in the post? If I take two variables (A-B), are the strongest values read vertically or horizontally?
@poojasingh111 in this example, the 2 most correlated variables are A-C, as their value is the closest to 1 or -1 (-0.897...). As the value is negative, they have a negative correlation, i.e. as one increases, the other tends to decrease.
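The rule "closest to 1 or -1" is the same as "largest absolute value". As a rough sketch in Python, with a made-up matrix for three hypothetical variables (these are not the values from the post) and a hypothetical helper `strongest_pair`:

```python
# Made-up correlation matrix for three hypothetical variables.
matrix = {
    "A": {"A": 1.0, "B": 0.42, "C": -0.85},
    "B": {"A": 0.42, "B": 1.0, "C": 0.10},
    "C": {"A": -0.85, "B": 0.10, "C": 1.0},
}

def strongest_pair(m):
    """Return (row, column, r) for the pair with the largest |r|.

    Only cells above the diagonal are checked, so the 1s on the
    diagonal and the mirrored duplicates below it are skipped.
    """
    names = list(m)
    best = None
    for i, row in enumerate(names):
        for col in names[i + 1:]:
            r = m[row][col]
            if best is None or abs(r) > abs(best[2]):
                best = (row, col, r)
    return best

pair = strongest_pair(matrix)
print(pair)
```

With these made-up numbers the A-C cell wins because 0.85 is larger in absolute value than 0.42 or 0.10; the sign only tells you the direction of the relationship.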
DataNath, are you reading the black-pencilled (vertical) or the red-pencilled (horizontal) values to find the closest value? If I understood correctly, any value near 0 is treated as a weak correlation and any value near -1 or 1 as a strong one?
Either, @poojasingh111. As the data is output as a matrix, you’ll get each value twice: the tool compares every variable with every other, so it compares A to C and also C to A, hence the duplicate. And yes, you’re absolutely right: 0 means no (linear) correlation and 1 or -1 means perfect correlation.