I have loyalty data and am trying to analyze the willingness to switch between similar products (e.g. if we have three brands of sweatshirts what % solely purchase Puma, Nike, and Adidas, what % cross shop between Puma and Nike, Puma and Adidas, and Nike and Adidas, and what % cross shop between all three).
In our data, we have household #, UPC number, and transaction number of all purchases.
This is what I've built so far, but have a couple questions:
1) Does this look right so far?
2) Is there an easy way at the end to spit out the data we are looking for (venn-diagram like) or would I need to separately calculate each?
3) If I wanted to filter the number of times a household has purchased anything at the store, how would I do this (e.g. don't want a household that has come 1000 times a year – likely the free loyalty card at the check-out).
4) Similarly, if I wanted to filter so that a household has had to make two purchases in our subgroup (e.g. two sweaters from Puma Nike or Adidas) how would I do this? We are trying to avoid the case where someone has bought only one item and appears brand loyal when they in fact are not.
Solved! Go to Solution.
Your approach seems correct, but ensure you're properly associating UPCs with the correct brands and filtering the data accordingly.
You'd need to calculate each brand interaction separately, but tools like Python's matplotlib_venn can help visualize this as a Venn diagram.
To filter frequent buyers, add a condition to exclude households that exceed a certain threshold of transactions per year.
Use a filter to ensure households have made at least two purchases from the targeted brands (Puma, Nike, Adidas) before including them in your analysis. NJMCDirect