• Claire Matuka

Market Basket Analysis - Part 1

Updated: Mar 11



If you have been following along with the data science content, I bet you have come across the Data Science in Marketing article. In that article, we looked at various ways data science is applied in marketing. One application that was highlighted was market basket analysis.


Market basket analysis identifies associations between products based on what customers buy together. It is used mostly by retailers, where it informs product offerings and product placement. Retailers like Amazon and Walmart are well known for using it in their marketing strategies.


How it works

Here is a brief overview of how it works. Have you ever walked into a store with the sole purpose of buying one item and somehow walked out with 10? Guess what, you are either a certified shopaholic, or it was a plan. The right products were placed at the right places for you to see.


Like, come on, why would you only buy bread when you could have bread and jam? Why would you only buy a skirt when you can buy a cute matching top to go with it as well? A new phone? Well, I am going to need a phone case, a screen cover and a car charger to go with that. Oh wow, and there is even an offer, definitely a once-in-a-lifetime deal.


It is said that "out of sight, out of mind." All the retailer needs to do is place these items in sight or offer the right discounts, and the more you spend, the more revenue they generate.


Sounds like recommendation?

The first time I heard of this concept, all I could think of was how awfully similar it sounds to recommendation. So is there a difference and if so, what is it?


There are various methods that are used for recommendation. Engines can recommend based on similarity to other users (collaborative filtering), similarity of products based on customer preferences (content-based filtering), or a combination of both (hybrid).


Though not as common as the methods listed above, we can also choose to use market basket analysis to recommend. It is worth noting that this recommendation will be exclusively based on the association between items.


So what is the difference? In market basket analysis, we look at the association between the items themselves. Items A and B are frequently bought together, so if a customer buys A, they are also likely to buy B. This method does not take the customer profile into consideration at all.
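To make the "items only, no customer profile" idea concrete, here is a minimal Python sketch. It simply counts how often pairs of items land in the same basket. The baskets and the function names (`pair_counts`, `bought_with`) are made up for illustration and are not taken from any real dataset or library.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, invented for illustration only.
# Note that no customer information is stored anywhere.
baskets = [
    {"bread", "jam"},
    {"bread", "jam", "milk"},
    {"bread", "milk"},
    {"phone", "phone case"},
]

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives each pair one canonical order, e.g. ("bread", "jam")
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

def bought_with(item, baskets):
    """Items seen in the same basket as `item`, most frequent first."""
    partners = Counter()
    for (a, b), n in pair_counts(baskets).items():
        if a == item:
            partners[b] += n
        elif b == item:
            partners[a] += n
    return partners.most_common()

print(bought_with("bread", baskets))
```

Raw counts like these are only a starting point. On their own they tend to favour items that are popular everywhere, which is why measures that normalise for popularity are useful.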



Association Rule Mining

Market basket analysis is achieved thanks to a concept known as association rule mining.


According to Wikipedia, "association rule mining is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness."


The first step involved is to check for association. So, how exactly do we calculate association? What are the measures?


To come up with an association rule, two parts are needed:

  • Antecedent (If)

  • Consequent (Then)

For example: if onions, then tomatoes; if notebooks, then pens; if fries, then a burger, etc. Each pair of items forms an itemset (i.e. onions and tomatoes are an itemset).


Once we have the rules, we can then calculate "association". For a rule with antecedent X and consequent Y, there are three main measures:

  • Support - Measures how frequently the itemset appears in the dataset

  • Confidence - Measures the percentage of all transactions satisfying X that also satisfy Y

  • Lift - Measures the ratio of the observed support to that expected if X and Y were independent
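The three measures can be sketched in a few lines of Python. This is a rough illustration rather than a production implementation: the transactions are invented, the function names are my own, and `confidence` and `lift` assume that X and Y each appear in at least one transaction (otherwise they would divide by zero).

```python
# Hypothetical transactions, invented for illustration only.
transactions = [
    {"onions", "tomatoes"},
    {"onions", "tomatoes", "fries"},
    {"onions", "burger"},
    {"fries", "burger"},
    {"tomatoes"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(x, y, transactions):
    """Of the transactions containing X, the fraction that also contain Y.

    Assumes X appears in at least one transaction.
    """
    return support(set(x) | set(y), transactions) / support(x, transactions)

def lift(x, y, transactions):
    """Observed support of X and Y together, divided by the support
    expected if X and Y were independent.

    Assumes X and Y each appear in at least one transaction.
    """
    return support(set(x) | set(y), transactions) / (
        support(x, transactions) * support(y, transactions)
    )

# Rule: if onions (X), then tomatoes (Y)
x, y = {"onions"}, {"tomatoes"}
print("support:   ", support(x | y, transactions))
print("confidence:", confidence(x, y, transactions))
print("lift:      ", lift(x, y, transactions))
```

In this toy data, onions and tomatoes appear together in 2 of 5 transactions, and each appears in 3 of 5 overall. A lift above 1 suggests the two items are bought together more often than chance would predict, a lift near 1 suggests no real association, and a lift below 1 suggests they tend not to be bought together.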


Association rule mining algorithms

How do we determine the frequent itemsets? There may be many possible itemsets in a retail shop, but we are only interested in the ones that occur most frequently. Each algorithm takes a different approach to mining frequent itemsets:


  • Apriori algorithm

  • Eclat algorithm

  • FP-growth algorithm

Specifically for market basket analysis, we will focus on the Apriori algorithm.
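As a preview, here is a minimal sketch of the idea behind Apriori, written in plain Python with no libraries. It relies on one observation: every subset of a frequent itemset must itself be frequent, so larger candidates only need to be built from smaller itemsets that already passed the threshold. The function name, the `min_support` parameter and the sample transactions are my own choices for illustration, and the sketch favours readability over speed.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent_only(candidates):
        # Keep only the candidates whose support meets the threshold.
        result = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                result[c] = s
        return result

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = frequent_only({frozenset([i]) for i in items})
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {
            a | b for a, b in combinations(list(current), 2) if len(a | b) == k
        }
        # Prune step: drop any candidate with an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = frequent_only(candidates)
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "jam", "milk"},
    {"bread", "jam"},
    {"bread", "milk"},
    {"jam", "milk"},
    {"bread", "jam", "milk"},
]

result = apriori(transactions, min_support=0.5)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With a threshold of 0.5, each single item and each pair qualifies, but the three-item set (found in only 2 of 5 transactions) does not. Association rules would then be generated from these frequent itemsets and ranked by confidence and lift.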


*****************

Conclusion

In this article, we have looked at the math behind market basket analysis. In Parts 2 and 3, we will code in Python and R, and then come up with findings as well as recommendations for a specific retail shop.

