I have loyalty data and am trying to analyze the willingness to switch between similar products (e.g. if we have three brands of sweatshirts what % solely purchase Puma, Nike, and Adidas, what % cross shop between Puma and Nike, Puma and Adidas, and Nike and Adidas, and what % cross shop between all three).
In our data, we have household #, UPC number, and transaction number of all purchases.
This is what I've built so far, but have a couple questions:
1) Does this look right so far?
2) Is there an easy way at the end to spit out the data we are looking for (venn-diagram like) or would I need to separately calculate each?
3) If I wanted to filter the number of times a household has purchased anything at the store, how would I do this (e.g. don't want a household that has come 1000 times a year – likely the free loyalty card at the check-out).
4) Similarly, if I wanted to filter so that a household has had to make two purchases in our subgroup (e.g. two sweaters from Puma Nike or Adidas) how would I do this? We are trying to avoid the case where someone has bought only one item and appears brand loyal when they in fact are not.