Most of us know the feeling: Netflix seems to know which films we would like to watch next, and we end up spending the next three hours watching them. Or we spend far too much money shopping online, even though we only wanted to buy one thing. Or we start listening to songs on YouTube, and the playlist takes us to a very cool Icelandic band that we instantly like. So the question arises: how have these machines become so smart that they seem to magically know what we want? The answer is that the magic behind the scenes is done by machine learning or, more specifically, by recommendation systems. These systems use algorithms to find similar items and similar customers based on their behaviour, and then recommend items that a specific customer is likely to enjoy.
Recommendation systems are a very popular and effective paradigm in retail. With a recommendation system, shoppers can find items they like with less effort. Furthermore, they are presented with items they have never thought of buying, but which actually suit their needs. For this reason, as part of one of our projects for a footwear company, we developed BE-terna’s customer behaviour analysis platform, which provides clients with customer profiling and a recommendation system.
A recommendation system is a tool that uses a series of algorithms, data analysis and artificial intelligence (AI) to make recommendations online.
The benefits: Customer behaviour analysis
Customer behaviour analysis focuses on understanding customers: what they like, what they do not like, how they interact with items, what their value is, and so on. If we manage to model these aspects of a customer, we can anticipate their future needs.
The main benefits of a customer behaviour analysis system are:
- Boost in sales,
- Better understanding of customers and
- Long tail strategies.
The main benefit can be put in just three words: boost in sales. According to McKinsey, 35% of all Amazon purchases, and 70% of what is watched on Netflix, are driven by their recommendation systems, and introducing recommendations significantly boosted their sales. (ref. 1) Furthermore, during the COVID-19 pandemic, many retailers went online, digitalised their businesses, and changed their business culture to adapt to the new and ever-changing circumstances. According to reports, the growth of ecommerce sales in the US alone was more than 30% in 2020 (Figure 1). (ref. 2) This provides a huge amount of online data for exploration and for building machine learning systems.
Figure 1: Growth of US ecommerce sales in the period 2019-2020
The second benefit comes from a better understanding of customers. This is where customer profiling comes in. By profiling customers, we can better understand their behaviour and, consequently, better understand and meet their needs, which is ultimately rewarded with higher customer satisfaction and loyalty. Beyond increased customer satisfaction, we can easily create automated marketing campaigns and personalise them based on customer analysis.
The next benefit is a much better strategy for long tail items. The term long tail item refers to niche, hard-to-find items that are very specific and unique, and that usually only a small group of people are looking for. From a customer’s perspective, tools such as recommendation systems allow them to find products outside their immediate area, and items they would otherwise not have had access to. From a supplier’s perspective, if they hold items in a warehouse that are hidden from the customers who would like them, surfacing those items through recommendations could become very profitable. (ref. 3)
How does it work?
To achieve these results, we have set up the infrastructure and pipeline for data analysis and modelling with machine learning algorithms. Briefly, the pipeline is composed of an input module that connects to a data source and sends the data to the data analysis and human behaviour modelling module. In this module, data transformations such as cleaning and pre-processing are performed, and the data segmentation and recommendation models are built. The results of the modelling are sent to the output, where they are presented as multiple dashboards in Power BI. The high-level pipeline is presented in Figure 2.
Figure 2: High-level pipeline of customer behaviour analysis platform
Customer profiling is done by segmenting customers into clusters that exhibit similar behaviour based on different parameters derived from the data, such as the number of items bought, the value of items bought, the number of items returned, the types of items bought, etc. Segmentation is essentially grouping by behavioural similarity. The input to a segmentation is the set of parameters for which we would like to find similar groups. The result is the number of similar groups, as well as the rules by which a certain property falls into each group. Additionally, we can assign a value to each segment of a segmentation and use it as a numerical weight (score) of the segment. Different segmentations can be combined as equals, or according to assigned importance weights, to produce a customer value. Having more segmentations per property (customer, item, brand, store, etc.) gives us a more fine-grained profile of that property, since we get different views of it and, consequently, more insight into it. The customer profile is therefore the most important input for further recommendations.
An example of a segmentation flow is presented in Figure 3. There we see customers plotted by two parameters: the number of items bought (x-axis) and the margin earned on purchases (y-axis). Customers can be separated into three distinct groups, and we assign an importance label and a score to each segment:
- High margin & medium number of items bought – Best – score is 1.00
- Low margin & higher number of items bought – Medium – score is 0.66
- Low margin & low number of items bought – Low – score is 0.33
Figure 3: Example of customer segmentation
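As a rough illustration of how segment scores could be combined into a customer value, here is a minimal Python sketch. It is not the platform's implementation: the rule-based `assign_segment` function is a simple stand-in for a learned clustering, and its cutoffs, the `returns` segmentation and the weights are all hypothetical. Only the three scores (1.00, 0.66, 0.33) come from the example above.

```python
# Scores per segment, taken from the Figure 3 example.
SEGMENT_SCORES = {"best": 1.00, "medium": 0.66, "low": 0.33}


def assign_segment(items_bought, margin, margin_cutoff=100.0, items_cutoff=10):
    """Toy rule-based stand-in for a clustering model.

    The cutoffs are illustrative assumptions, not values from the project.
    """
    if margin >= margin_cutoff:
        return "best"      # high margin
    if items_bought >= items_cutoff:
        return "medium"    # low margin, higher number of items
    return "low"           # low margin, low number of items


def customer_value(segment_scores, weights=None):
    """Combine the scores from several segmentations into one value.

    segment_scores: dict mapping segmentation name -> score in [0, 1].
    weights: optional dict mapping segmentation name -> importance weight.
             If omitted, all segmentations are combined as equals.
    """
    if weights is None:
        weights = {name: 1.0 for name in segment_scores}
    total_weight = sum(weights[name] for name in segment_scores)
    weighted_sum = sum(segment_scores[name] * weights[name] for name in segment_scores)
    return weighted_sum / total_weight


# One customer seen through two segmentations ("returns" is hypothetical).
scores = {
    "margin_volume": SEGMENT_SCORES[assign_segment(items_bought=5, margin=250.0)],
    "returns": 0.33,
}
print(customer_value(scores))                                          # combined as equals
print(customer_value(scores, {"margin_volume": 3.0, "returns": 1.0}))  # weighted
```

Normalising by the total weight keeps the customer value in the same 0 to 1 range as the individual segment scores, which makes values comparable across customers regardless of how many segmentations are combined.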
There are several ways to implement recommender systems. In this case, we used a hybrid of two models:
- Collaborative filtering model
- Content based model
Collaborative filtering is an approach based on the assumption that users who bought similar items in the past will also agree on new items. Let us look at an example with two customers, Jack and Jill (Figure 4). Jill bought items A and B, and Jack bought items A, B and C. Since Jack and Jill have already agreed on two items, there is a high probability that Jill would also like item C. So, following the collaborative filtering approach, we would recommend item C to Jill next. However, collaborative filtering has some well-known problems, a major one being the cold start problem. When a new item appears, it has no interactions yet, which means that it would never appear among the recommendations.
Figure 4: Example of collaborative filtering
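The Jack and Jill example can be sketched in a few lines of Python. This is a simplified user-based variant that measures similarity with the Jaccard overlap of purchase sets. The similarity measure, the function names and the third customer, Joe, are assumptions made for illustration, not details of the model we built.

```python
def jaccard(a, b):
    """Overlap between two purchase sets: size of intersection / size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(purchases, customer, top_n=3):
    """Recommend items bought by similar customers that `customer` lacks."""
    own = purchases[customer]
    scores = {}
    for other, items in purchases.items():
        if other == customer:
            continue
        similarity = jaccard(own, items)
        # Each unseen item is credited with the similarity of its buyer.
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


purchases = {
    "Jill": {"A", "B"},
    "Jack": {"A", "B", "C"},
    "Joe": {"D"},  # hypothetical customer with nothing in common with Jill
}
print(recommend(purchases, "Jill"))  # ['C']
```

Item D is not recommended to Jill because Joe shares no purchases with her. A brand-new item that nobody has bought yet never enters `scores` at all, which is the cold start problem in miniature.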
Another common approach, which softens the weaknesses of collaborative filtering, is the content based model. It works under the assumption that what a customer liked or bought in the past will probably be liked or bought in the future. It uses the meta information of items and a profile of the user’s preferred choices. Let us look at the example of Jelena, who often buys her clothing online (Figure 5). Over the past several months, Jelena has bought several items online: first a pink skirt, then, a couple of days later, a pink T-shirt, then pink heels, and then a pink hat. It is obvious that Jelena likes pink clothes, a feature that all these items share. It is highly likely that Jelena will like a pink dress more than a black or a blue one, or a non-clothing item. So, following the content based approach, we would recommend a pink dress to Jelena next. However, the content based model also has a cold start problem: when a new user appears, they have no previous purchases to build a profile from.
Figure 5: Example of content based model
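Jelena's case can be illustrated with a small Python sketch that builds a preference profile from item attributes and scores candidate items against it. The attribute names (`colour`, `category`), the scoring rule and the candidate items are assumptions chosen for the example. A production model would use richer item metadata.

```python
from collections import Counter


def build_profile(purchased_items):
    """Count how often each (attribute, value) pair occurs in past purchases."""
    profile = Counter()
    for attributes in purchased_items:
        profile.update(list(attributes.items()))
    return profile


def score(profile, attributes):
    """Share of the profile that matches the candidate item's attributes."""
    total = sum(profile.values())
    if total == 0:
        return 0.0  # new user with no purchases: the cold start case
    return sum(profile[pair] for pair in attributes.items()) / total


# Jelena's history: skirt, T-shirt, heels and hat, all pink clothing.
history = [{"colour": "pink", "category": "clothing"} for _ in range(4)]
profile = build_profile(history)

# Hypothetical candidate items.
candidates = {
    "pink dress": {"colour": "pink", "category": "clothing"},
    "black dress": {"colour": "black", "category": "clothing"},
    "blue kettle": {"colour": "blue", "category": "kitchen"},
}
ranked = sorted(candidates, key=lambda name: score(profile, candidates[name]), reverse=True)
print(ranked)  # the pink dress ranks first
```

The pink dress matches both the colour and the category in Jelena's profile, the black dress matches only the category, and the non-clothing item matches nothing. For a user with an empty history every candidate scores zero, so the model has nothing to rank by.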
We prototyped a recommender system that uses the hybrid approach, and fine-tuned the hyperparameters to find those that yield the best results. We constructed it step by step, feeding more features to the model at every step and inspecting the results. We split the data into a train set and a test set. The train data was used for building the model, and we then checked the performance of the model on the test data. Recommendation systems can be evaluated in several ways; we used the Receiver Operating Characteristic Area Under the Curve (ROC AUC) metric, where a perfect score is 1.
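To make the metric concrete, here is a minimal sketch of ROC AUC computed directly from its ranking interpretation: the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The labels and scores are made-up test data, where 1 is assumed to mean that the customer bought the item in the test period. In practice a library implementation would normally be used instead.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of positive/negative pairs ranked correctly.

    labels: 1 for a positive example, 0 for a negative example.
    scores: model scores, higher meaning "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up test-set labels and model scores.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
print(round(roc_auc(labels, scores), 3))  # 0.833

# A model that ranks every positive above every negative scores 1.
print(roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1]))  # 1.0
```

A score of 0.5 corresponds to random ranking, so values between 0.5 and 1 indicate how much better than chance the model orders items for a customer.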
User Interface: A way towards quick software adoption
As in any AI-based project implementation, a good user experience is crucial for users to build trust in the system. The user interface is built as a Power BI application. In Power BI, a user can review input data, see segmentations by different parameters, inspect customer scores and, most importantly, see the output recommendations.
There are two possible ways of interacting with the recommendations. The first is from the perspective of which items we should recommend to customer A (Figure 6a). In this case, we select a single customer, and the leftmost table shows the recommended items for that customer.
Figure 6a: Interacting with recommendation results in Power BI - A) for recommending items to customers
The second way of using the app is from the perspective of which customers we could recommend item X to (Figure 6b). In this case, we select a single recommended item, and the bottom table shows the customers to whom we should recommend it in order to have a high probability of generating a sale.
Figure 6b: Interacting with recommendation results in Power BI - B) for recommending customers for an item
Segmentations can also be used for personalised marketing. Using the provided segmentations, a user can create personalised offers for different groups of customers based on their preferences. Let us look at an example from the footwear industry in Figure 7. Here, one segment of customers prefers athletic and sporty items, so we can make them an offer or a discount on this kind of item. A second segment might like elegant and classy models, and that is what we should offer them. A third segment might be customers with kids, so a personalised offer for them would be kids’ shoes. Then there is a fourth segment, a fifth, a sixth, and so on, each with different properties and preferences.
Figure 7: Example of a personalised offering
Here we presented the potential of BE-terna’s customer behaviour analysis platform for ecommerce, and how powerful recommendations can be in delivering benefits to the online retail industry. The possible applications of these systems are ever-growing. They can be used in personalised marketing, online advertising, finding the best offers for customers, providing discounts, recommending the next best offer, finding items frequently bought together for cross-selling, and much more.
All in all, with trends shifting towards greater digitalisation and the growth of online technologies, keeping up with the use of recommendation systems in online retail is a must for every retail company.
If you prefer watching rather than reading, watch our video on this topic: "How do machines figure out which things we want?"
- Reference No. 1: How retailers can keep up with customers
- Reference No. 2: Pandemic causes US eCommerce to surge north of 32% in Q4
- Reference No. 3: What is the Long Tail?