Hi guys, I am trying to identify and count the occurrences of products that are commonly bought together. You could think of these products as complementary goods, like a cell phone and a cell phone case.
This data set contains over 1,000 products, and I am interested in seeing which products tend to be purchased with one another. Orders can range in size from 1 item to 20 items. The goal is to count how often each pair of products occurs together in a customer order and then identify the top 25% of pairs by frequency.
My first step is to group by order number to see which products (SKUs) are purchased as a set. Where I could use some help is in finding the common pairs of complementary goods within each order.
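To make the question concrete, here is a rough sketch of the approach I have in mind. It is not a finished solution. It assumes the data can be read as (order number, SKU) rows; the function names and the sample rows are made up for illustration and are not from the attached data set. It also assumes "top 25%" means ranking the distinct pairs by count and keeping the top quarter, which may not be the only reasonable reading.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def count_pairs(rows):
    """Count how many orders contain each unordered pair of SKUs.

    rows: iterable of (order_id, sku) tuples (an assumed layout).
    """
    orders = defaultdict(set)
    for order_id, sku in rows:
        # A set means a SKU repeated within one order is counted once.
        orders[order_id].add(sku)

    pair_counts = Counter()
    for skus in orders.values():
        # Sorting makes (A, B) and (B, A) map to the same key.
        pair_counts.update(combinations(sorted(skus), 2))
    return pair_counts


def top_quarter(pair_counts):
    """Return the top 25% of distinct pairs, ranked by count (rounded up)."""
    ranked = pair_counts.most_common()
    keep = math.ceil(len(ranked) * 0.25)
    return ranked[:keep]


# Made-up sample rows, only to show the shape of the input.
rows = [
    (1, "phone"), (1, "case"), (1, "charger"),
    (2, "phone"), (2, "case"),
    (3, "phone"), (3, "charger"),
    (4, "case"),
]

pair_counts = count_pairs(rows)
top_pairs = top_quarter(pair_counts)
```

Single-item orders contribute no pairs, and a 20-item order contributes at most 190 pairs, so the counting step should stay manageable. Pairs tied at the cutoff are kept or dropped arbitrarily in this sketch, so a count threshold might be a better rule if ties matter.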
I attached the data set if you want to give it a look. I would love any guidance you could provide or any insight into ways to solve this challenge.
Thanks!