Test the Heteroscedasticity and Multicollinearity between variables in Alteryx
Hi All,
I would like to know if there is a way to test for heteroscedasticity and multicollinearity between variables in Alteryx. Is there any ready-to-use tool or add-on macro?
Best,
Qais
Hi @qais1975
Could you share a little more about your scenario? Are you performing a regression analysis? The testing methods and interpretations may differ based on the situation.
Heteroskedasticity:
Reviewing the plot of residuals vs. fitted values is the quickest way to look for issues (patterns such as a funnel shape, where the residuals spread out as the fitted values grow). The Linear Regression tool outputs this plot by default under "Diagnostics".
Multicollinearity:
I typically look at the Variance Inflation Factor (VIF) for multicollinearity in models. There's a tool made by Alteryx that you can download for this test:
https://help.alteryx.com/20223/designer/variance-inflation-factors-tool
https://community.alteryx.com/t5/Community-Gallery/Variance-Inflation-Factors/ta-p/878752
The Association Analysis tool might be a good starting point for simple variable relationships:
https://help.alteryx.com/20223/designer/association-analysis-tool
Otherwise, if you have a specific statistical test you'd like to deploy, the R tool in Alteryx allows you to import and use the many R packages available online.
Dear @CharlieS ,
First of all, thank you so much for your amazing reply. In my current scenario, I have data on which I would like to perform linear regression, decision tree, and neural network models. However, the data has a seasonal trend: demand for specific items peaks at specific times of the year. The standard deviation fans out, which is a sign of heteroskedasticity.
Therefore, I am asking how I can overcome this problem and apply a data transformation in Alteryx to remove the issue.
I think I am going to download the VIF tool and try it. Thank you.
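Heteroskedasticity describes situations where the variance of a model's errors is not constant, for example when the residuals spread out as the fitted values or an independent variable increase. In a linear regression it does not bias the coefficient estimates, but it makes the standard errors, and therefore the significance tests, unreliable. Addressing it can take many forms, such as transforming the dependent variable, or removing, transforming, or substituting any independent variables that introduce heteroskedasticity.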
Seasonality trends can be normalized or transformed if you're interested in studying the relationship between demand and other independent variables. The various modelling methods you mention above have different strengths and weaknesses depending on how your tests are designed.
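As one illustration of a variance-stabilizing transform, here is a minimal sketch of a log transform in plain Python. It assumes the spread of demand grows in proportion to its level (multiplicative noise) and that all values are strictly positive. The numbers are made up, with the peak season set to exactly five times the off-peak values.

```python
import math
import statistics

# Hypothetical weekly demand: the peak season has a higher level and a
# proportionally wider spread.
off_peak = [90, 100, 110, 95, 105, 100]
peak = [450, 500, 550, 475, 525, 500]  # 5x the off-peak values


def spread_ratio(low, high):
    """Ratio of the sample standard deviations (high season / low season)."""
    return statistics.stdev(high) / statistics.stdev(low)


# On the raw scale the peak season is much noisier than the off-peak season.
raw_ratio = spread_ratio(off_peak, peak)

# On the log scale a multiplicative change becomes an additive shift,
# so the two seasons end up with about the same spread.
log_ratio = spread_ratio([math.log(v) for v in off_peak],
                         [math.log(v) for v in peak])

print(f"spread ratio, raw scale: {raw_ratio:.2f}")
print(f"spread ratio, log scale: {log_ratio:.2f}")
```

A model fitted on the log scale predicts log demand, so predictions need `math.exp` applied to return to the original units. If the data contains zeros or negative values, or the spread does not scale with the level, a different transform or a weighted fit may be a better match.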
Could you share more about your scenario and the data you have to work with?
Dear @CharlieS ,
The data is private and I cannot share it. However, what I would like to do is test different techniques for removing or reducing heteroskedasticity.
Thank you.
Qais
