Alteryx Designer

Find answers, ask questions, and share expertise about Alteryx Designer.

Correlation Tool- Filtering Out Excessive Multi Collinearity

6 - Meteoroid

Hello All,


I am trying to devise a way to systematically remove variables based on an excessive correlation, The Pearson tool allows me to create an n x n matrix of correlations for all variables, however, there doesn't appear to be an obvious way to remove a correlation pairing based on a threshold value (e.g. remove if >0.8).


I have also attempted to do the same but using Spearman's coefficient but the association analysis outputs a series of info-graphics displaying the relationships and appears to be geared towards data investigation rather than processing. If there is a way to undertake the above problem using Spearman's.


I have attached an example dataset below. 


Thanks in advance,





6 - Meteoroid

Anyone got any ideas on this?

6 - Meteoroid


9 - Comet



If I understand your question, you want to systematically filter out collinear variables based on a correlation result. Is that right? The challenge is that the Pearson tool only compares on pair of variables, whereas the Association analysis tool, which does multiple pairwise comparisons, only outputs a report, not the data. 


Try the Pearson correlation Matrix macro from Laszlo Dobiasz in the Gallery.!app/Pearson-Correlation-Matrix/5e55ed0e8a933712c0404c30 

With it you can get the correlation values either as a matrix or as a long list.  Then you can use a filter to remove whatever variables you want (note that you will have 4 rows for each pair of variables), and join back to your original data. 


I've attached a workflow to get you started.


Does this solve your question?





6 - Meteoroid

Thanks Terry.


You correctly understand my intention but there does appear to be an issue with the tool (the solution itself is good): it throws out at error stating 'There were more than 16 records in the source' which I initially interpreted to mean that it was limited to 16 pairwise correlations but looking at the output, it appears to computed them all. Is it safe to ignore this error?