Alteryx Designer Desktop Discussions

Find answers, ask questions, and share expertise about Alteryx Designer Desktop and Intelligence Suite.
SOLVED

Market Basket Affinity tool - Tabular view as well as Directional capability

samkim87
6 - Meteoroid

My question pertains to the MB Affinity tool, which is simple and powerful, so far I'm liking it a lot.  My question is about reporting this data and whether or not it's possible to export the data in a Tabular format.

 

Take an example with 5 items, the MB affinity tool will produce a 5x5 grid with 25 values, all representing coaffinity scores.  I currently write this table to a SQL Server where I perform an Unpivot function similar to the below.

 

select u.rownames, u.Segment2, u.Affinity
from MB_STG_families A
unpivot
(
Affinity
for Segment2 in (Yuengling_Traditional_Lager,Wraps,Whiskey,Vodka,Unknown_Segment)
) u;

 

This gives me an output like the below

 

rownames Segment2 Affinity
AF___Lemonade Yuengling_Traditional_Lager 0
AF___Lemonade Wraps 0.0760472674700328
AF___Lemonade Whiskey 0.00555589852284023
AF___Lemonade Vodka 0.0209656967344384
AF___Lemonade Unknown_Segment 0.029074190231479

 

In my case I am then running this MB affinity once each for our 5 principle customer segments and comparing affinities across segments.  Without unpivoting the data I believe this analysis would be difficult.

 

My question is: Is there an ability within Alteryx to report this data in Tabular form from the get-go?  We sometimes perform the analysis on a larger set of items (in my example there are 5, in reality there are hundreds/thousands) and so this becomes more complicated.  I know that I can write a SQL query to identify column names and automatically perform the Unpivot, but I believe having the data in Tabular is more useful anyway, and am hoping Alteryx has this in mind.

 

My second question is: The coaffinity scores seem direction-agnostic, is there a vision to change this to "When Item 1 is purchased Item 2 is purchased at X score" AND "When Item 2 is purchased Item 1 is purchased at this score".  I could be misunderstanding the score, but I believe it's not capturing that nuance.

 

Thanks

3 REPLIES 3
RodL
Alteryx Alumni (Retired)

If I'm understanding your desired output, then it is basically just putting a Transpose tool after the MB Affinity tool and selecting only the RowNames as the Key Field. This will give you a 3-column data set with the pairs of product IDs and the corresponding value.

 

As to your question on the score, this process is merely showing when two products are in the same basket, so the scores for Prod A > Prod B and Prod B > Prod A should be the same.

DylanB
Alteryx Alumni (Retired)

Hi samkim87,

 

For question 1, I'm attaching a modified MB Affinity tool that has the functionality to do what you'd like by checking the "Convert data to pairwise results to reduce column count" checkbox. To my knowledge, there's not a way to do this easily in Alteryx, so I modified the R code.

 

For question 2, I think you're looking for something similar but slightly different from this metric.This metric is a similarity metric, so how similar A is to B should be how similar B is to A. What it sounds like is you want a conditional probability. It sounds like you're looking for the probability that A is picked given B is picked and vice versa. To find those frequencies, you'll want to measure the number of times each item is picked (using the sumarize tool), and the number of times each pair of items is picked (using the MB affinity tool's number of co-occurences). Then you can use those frequences joined with the individual ones to estimate the conditional probability as 

  • probability of A given B is picked ~= number of times A and B are picked / number of times B is picked
  • probability of B given A is picked ~= number of times A and B are picked / number of times A is picked

 

Regards,

Dylan

 

samkim87
6 - Meteoroid

Rod,

 

That's perfect, the transpose tool is exactly what I wanted...I love when it's that easy!

 

 

Dylan, that's exactly what I'm talking about.  Simple example being Pretzels and Beer, you're more likely to order beer AFTER pretzels than pretzels AFTER beer.  The calculation is simple enough, as you say, P(A|B) = P(A INT B)/ P(B)...i'll try that out on some smaller datasets, my concern is being able to apply that calculation over larger datasets...but I'll search around for some R scripts that may work better than, say, a custom written Formula that I apply.

 

Also thank you for the re-written R script Dylan, I appreciate it...I'm not as versed in R...yet!

 

Greg

 

 

Labels