Alteryx Designer Desktop Discussions

Trey1 · ‎08-04-2020

Hi,

I have a list of sales data by major US city by week and am trying to find those cities in which sales week over week changes are most correlated. I want to do this with a simple linear regression for each pair of cities). So, I am trying to make a loop that will regress each city on each of the other 49 cities (minus duplicates) to give me the R^2 in a table which I can sort to find the highest R^2.

I have read about using a batch macro, an iterative macro, the step wise tool, the iterative tool, or the dynamic selection tool, and am not sure exactly which tool is best. I would appreciate any help!

EDIT: Including dummy data for clarity:

Week	New York	San Francisco	Los Angeles	Houston	Chicago
1	85%	10%	84%	67%	14%
2	84%	52%	79%	53%	79%
3	99%	27%	48%	1%	50%
4	39%	75%	79%	14%	79%
5	72%	62%	100%	79%	41%
6	57%	97%	30%	63%	89%
7	27%	64%	79%	89%	86%
8	71%	34%	14%	39%	26%

All combinations of regressions:

Y	X
New York	Chicago
New York	Houston
New York	Los Angeles
New York	San Francisco
San Francisco	Chicago
San Francisco	Houston
San Francisco	Los Angeles
Los Angeles	Chicago
Los Angeles	Houston
Houston	Chicago

I created the attached macro to loop through regressors, but I can't figure out how to then change the dependent variable and loop for additional regressors.

Thank you,

Trey

Trey1 · ‎08-06-2020

I solved this by:

Joining the list of cities on itself,

removing duplicate pairs,

splitting the pairs across two columns

creating a macro that feeds in one column for indep and one column for dep variables

Alteryx Designer Desktop Discussions

Regress all combinations of two cities out of 50