# Alteryx Designer Desktop Discussions

SOLVED

## pairwise comparisons

8 - Asteroid

I can't believe I'm not finding the easy solution here. I have three groups: a control group, a group that got product A, and a group that got product B. There has got to be a way to test the differences across all groups rather than running separate t-tests (which inflates the type I error rate with each additional test). If my outcome is the percent of people who were contacted, I want to see if the percent is different across groups.

Control Group % who were contacted: 10%

Product A group % who were contacted: 25%

Product B group % who were contacted: 33%

I shouldn't have to run a t-test comparing control to A, then another comparing control to B, and then a third comparing A to B. I know the method is pairwise comparisons, but I'm not finding how I can do this in Alteryx. I've looked on the community and, surprisingly, the answer seems to be "you can't", but this is not a rare statistical test!

Alteryx

Have you looked at the AB Testing Toolset?

Regards,
Stephen Ruhl
Principal Customer Support Engineer

Alteryx Alumni (Retired)

This seems like an appropriate use case for the Test of Means tool.

8 - Asteroid

Hi Dan-

The test of means tool, I assumed, only tested between two groups. Your reply made me think somehow the tool could do pairwise if I have 3 groups. So I created a single variable with 3 categories and made that the "group identifier." But sure enough I get an error because the tool is not expecting more than 2 groups. Is there something I can do within the tool to make this work?

Alteryx Alumni (Retired)

Just to make sure things are as expected, I assume you have recoded the variable of interest (the target) to be either 0 or 1 values. If this is true, then you can use Filter tools (with the filter based on the group identifier variable) to do a test of means for: (1) controls vs. treatment 1; (2) controls vs. treatment 2; and (3) treatment 1 vs. treatment 2.

8 - Asteroid

Hi Dan, no. If I understand your suggestion correctly, you are telling me to run 3 t-tests. This can be done, yes, but each time a test is run my chance of error (which I can't quantify) increases. There should be a way to test all 3 groups against each other in one fell swoop rather than running several independent tests.
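For anyone who wants a rough number for that inflation, here is a minimal Python sketch. It assumes a per-test significance level of 0.05, which the thread does not state, and the function names are made up for illustration. If the tests were independent, the chance of at least one false positive across m tests would be 1 - (1 - alpha)^m. The three pairwise tests here share groups, so they are not independent and that figure is only an approximation; the Bonferroni bound m * alpha holds either way.

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive across m tests,
    ASSUMING the tests are independent (an approximation here)."""
    return 1 - (1 - alpha) ** m


def bonferroni_bound(alpha, m):
    """Upper bound on the familywise error rate; valid without independence."""
    return min(1.0, alpha * m)


if __name__ == "__main__":
    alpha = 0.05  # assumed per-test level, not given in the thread
    m = 3         # control vs A, control vs B, A vs B
    print(round(familywise_error_rate(alpha, m), 4))
    print(round(bonferroni_bound(alpha, m), 4))
```

With three tests at 0.05, the independent-tests approximation comes to about 0.14 and the Bonferroni bound is 0.15.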

Alteryx Alumni (Retired)

Let me rescind my original suggestion. The Test of Means tool is the closest tool we currently have to address use cases similar to the one you have, but it is not correct in your specific use case for the reasons you indicate. However, R does have the appropriate multiple comparison of proportions test you are looking for through its pairwise.prop.test function, which provides six different methods to adjust the p-values for multiple comparisons. This weekend I'll come up with a quick macro to handle your use case, which should be a common one, and post it to this thread.
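To make the idea concrete, here is a rough sketch of pairwise proportion tests with adjusted p-values. It is written in standard-library Python, not R, and it is neither the macro nor a reimplementation of pairwise.prop.test. Each pair gets a Pearson chi-squared test with 1 degree of freedom and no continuity correction (R's prop.test applies one by default, so R's numbers will differ somewhat), followed by a Holm adjustment, which is only one of the available adjustment methods. The group sizes of 200 are hypothetical; the thread gives only the percentages.

```python
from itertools import combinations
from math import erfc, sqrt


def two_proportion_chi2(x1, n1, x2, n2):
    """Pearson chi-squared test (1 df, no continuity correction)
    comparing x1/n1 against x2/n2. Returns (statistic, p_value)."""
    pooled = (x1 + x2) / (n1 + n2)
    variance = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    if variance == 0:
        return 0.0, 1.0  # both groups all-0 or all-1: nothing to test
    stat = (x1 / n1 - x2 / n2) ** 2 / variance
    # Upper tail of a chi-squared distribution with 1 df.
    return stat, erfc(sqrt(stat / 2))


def holm_adjust(pvalues):
    """Holm step-down adjustment; returns p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * pvalues[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


def pairwise_proportion_tests(groups):
    """groups maps name -> (successes, n).
    Returns {(a, b): (raw_p, holm_adjusted_p)} for every pair."""
    pairs = list(combinations(groups, 2))
    raw = [two_proportion_chi2(*groups[a], *groups[b])[1] for a, b in pairs]
    return dict(zip(pairs, zip(raw, holm_adjust(raw))))


if __name__ == "__main__":
    # Hypothetical counts matching 10%, 25% and 33% with 200 people per group.
    groups = {"control": (20, 200), "A": (50, 200), "B": (66, 200)}
    for pair, (raw_p, adj_p) in pairwise_proportion_tests(groups).items():
        print(pair, f"raw p = {raw_p:.4g}", f"Holm p = {adj_p:.4g}")
```

The adjusted p-values are the ones to compare against the chosen significance level, since they account for having run three comparisons.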

8 - Asteroid

Hi Dan, YES. I know this can be done in SPSS because I used to do it all the time there, but that was years ago and I don't currently have SPSS. I do not have any experience in R, but I have on occasion brought in the R tool from the Developer tools after someone has sent me code (since I can't write the code myself, as I don't know R well at all). If you can post the macro so that I can copy and paste it into that tool, that would be absolutely awesome. I should have enough skills to apply my specific variable names to your code. Thank you so much! I think we're on the right path, and I do think, as you said, this will be useful for other users too.

Alteryx Alumni (Retired)

Attached is the promised macro. I've wrapped it into a sample workflow. It assumes you have raw data (i.e., the subject-level data) with the subject's response (favorable or unfavorable) as well as a test group identifier column. Since the response is categorical in nature (favorable or unfavorable response), the underlying test statistic is chi-squared distributed (not Student's t distributed). The macro offers six different approaches to correct the p-values from multiple comparisons.
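As an illustration of the data layout described above, and not of the macro's internals (which are not shown in this thread), the following Python sketch tallies subject-level rows into per-group counts and computes the Pearson chi-squared statistic for the resulting groups-by-response table. The row format (group, response), the 1/0 coding, and the function names are assumptions made for the example.

```python
from collections import Counter


def tally_by_group(rows):
    """rows: iterable of (group, response) pairs, with response coded
    1 = favorable, 0 = unfavorable. Returns {group: (favorable, total)}."""
    totals, favorable = Counter(), Counter()
    for group, response in rows:
        totals[group] += 1
        favorable[group] += response
    return {g: (favorable[g], totals[g]) for g in totals}


def chi_squared_statistic(counts):
    """Pearson chi-squared statistic for the groups x (favorable, unfavorable)
    table. Returns (statistic, degrees_of_freedom)."""
    total_n = sum(n for _, n in counts.values())
    total_x = sum(x for x, _ in counts.values())
    df = len(counts) - 1
    if total_x in (0, total_n):
        return 0.0, df  # every response identical: no variation to test
    rate = total_x / total_n
    stat = 0.0
    for x, n in counts.values():
        # Compare observed favorable and unfavorable counts with expected ones.
        for observed, expected in ((x, n * rate), (n - x, n * (1 - rate))):
            stat += (observed - expected) ** 2 / expected
    return stat, df


if __name__ == "__main__":
    # Made-up subject-level rows, purely to show the shape of the input.
    rows = ([("control", 1)] * 2 + [("control", 0)] * 18
            + [("A", 1)] * 5 + [("A", 0)] * 15)
    counts = tally_by_group(rows)
    print(counts)
    print(chi_squared_statistic(counts))
```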

This was created in Alteryx version 11.5, so you may need to update your version of Alteryx to use it. I will be going through it one more time to clean things up, and it will then be posted to the public Alteryx Analytics Gallery.

Dan

8 - Asteroid

Hi Dan, thanks for this. I have version 11, so it'll be a few days before I can get the upgraded version. Does anything in the macro change if the response is continuous as opposed to categorical? I would imagine users would be interested in both types.
