In today’s digital world, we’re often confronted with vast arrays of choices. In such a situation, recommendations help us to focus on which choice might suit us best. A recommender engine is a system, often in the form of an algorithm that seeks to predict the preference a user would give to a certain item. Netflix uses a recommender engine to suggest movies you might like to watch and Amazon uses one to offer you products you’re likely to buy.
Recommendations that actually match customers’ needs or desires can increase conversion rates and unlock extra revenue at checkout. That’s why well designed recommender engines are very desirable for webshops.
My client, Technische Unie, is a Dutch wholesaler of installation materials and one of the largest companies in the Netherlands. Their webshop comprises a large portion of their total revenue. In this blog I’ll take you through the process of building a recommender system for the Technische Unie and I’ll explain why this one is different than others.
The basics of recommender engines
The two common approaches to giving recommendations, item-to-item and user-to-user, are similar in the sense that they are both algorithms that produce recommendations based on data. However, the way of processing that data differs.
Knowing that I have watched a lot of comedies, the item-to-item based recommender engine is likely to suggest that I watch another comedy movie. On the other hand, the user-to-user based engine will identify similar users. Suppose you have watched ten different movies on your Netflix account. I’ve watched six movies, but these six films are ones that you’ve also watched. We could then say that we are users with similar tastes. So, whenever you watch a new movie, it is also recommended to me.
Sometimes recommender engines will combine information of both similar items and similar users to generate recommendations through a hybrid approach. However, we wanted to give recommendations based upon the user’s current basket, as opposed to recommendations based upon individual products or user preferences. At Xomnia, I recently received a training about network science. Inspired by this training, my mind started to run and think about possible ways to use network science in order to give accurate recommendations. And so, together with my team at Technische Unie, I started building the recommender engine we were looking for, not using the common item-to-item or user-to-user approach, but instead using ideas borrowed from network science.
Utilising network science
Network science is an academic field that studies complex networks, represented by elements that are (or aren’t) connected with each other. An example of a network is Facebook. If two users are friends, they are connected with each other. Representing all Facebook’s users and their interconnections in a graph, results in a large complex network (visualised below).
Facebook’s network visualised. Areas where people are highly connected through Facebook light up, while areas where people are not much connected through Facebook remain dark. Image credit: Facebook
The question that now arises is how to give recommendations based on the information that is stored in such complex networks. We started with representing Technische Unie’s data as a network. A simple representation of this network is shown below. Each circle (node) represents a product. Each line (edge) between two products represents the fact that these products have been bought together at least once.
A representation of a simple network. The circles (nodes) are products and a line (edge) between two nodes indicates that these products have been bought together at least once.
Now consider the case where, during a visit to the webshop, a user has put two items in their basket. These items are indicated in blue in the image below. But before the customer pays and checks out, we’d like to provide a recommendation of a product they might like as well.
Suppose that we may only recommend one item from the hundreds of thousands of available products. The critical question is: which one should we recommend? Considering our simple network, it would make sense to recommend one of the items that has typically been bought together with the other items in her basket. As illustrated in the image below, this leaves us with just three choices instead of hundreds of thousands. What a relief!
However, we are still restricted to recommending just one product. So, what we could do is count the number of connections an item has with the customers’ current basket. In this scenario, two out of the three possible items have two connections with products in the basket; while one item has just one connection with the basket. Assuming we prefer to recommend items with a higher degree of “basket connectivity”, we can drop the item with just one connection. This now leaves us with just two options.
The network with the “basket connectivity” values indicated in the nodes.
As the two remaining items both have the same basket connectivity, we should introduce another metric to narrow down to our final choice. One possibility is to count the number of unique customers who have purchased a specific combination of products. These amounts are indicated near the edges in the image below. Counting the number of unique customers who have bought a specific item with items in the basket results in the numbers indicated in the nodes. We can see that this amounts to 60 for one of the nodes and 50 for the other. Based upon both the basket connectivity and our newly introduced metric, the “connection weight”, we can now decide which recommendation we will give to our user: the product for which both metrics are the highest.
The network with the “basket connectivity” and the “connection weight” values indicated in the nodes. The node where both values are highest is chosen as our recommendation.
Building a more complex engine
What we have essentially done so far is create a recommender engine that generates recommendations by processing information stored in a network of products. Clearly, this recommender engine is very simple and arguably too naive. However, it illustrates the basic process that is taking place. Following this same process, we can take steps to create a more sophisticated engine, capable of detecting subtle relations in the network and generating relevant recommendations.
In the above example, we generated our recommendation by calculating two different metrics. We are not restricted, however, to just two metrics. Instead, we can continue to construct different metrics and assign these values to our nodes and edges, just like we have been doing previously. By doing so, we are essentially extracting information out of our network and representing it in a different - machine readable - form.
An example of another such metric would be the “relative connection weight”. Suppose that one hundred customers bought a certain item in combination with other items. Now, we want to know in what ratio these combinations have been bought, relative to the total amount of one hundred. If one of the combinations has been bought by 60 customers, that ratio would be 0.6 (see below image). This metric answers the question: “Relative to the total amount of customers that bought a certain basket-item in combination with another product, how often did these combinations occur?”
The network with the “relative connection weight” values indicated in the nodes. NB: the node that previously scored low is now scoring the highest, indicating that recommendations heavily depend on what metric is used.
Training the engine
Within very large and complex networks such as Technische Unie, we can train a computer to learn relationships between these metrics (which we’ll call variables from now on) and the correct recommendations (I’ll discuss later what is considered a “correct” recommendation). If the computer sees enough examples of correct recommendations, all having their own set of variables, it can learn to recognize what sets of variables are likely to belong to a correct recommendation.
So, whenever the computer is then given a new basket it has never seen before, it can analyze the associated variables and then decide which products are most likely to be an adequate recommendation.
Defining a correct recommendation
We have trained multiple models following the above setup. These models are all similar in the sense that they can take a basket of products as input and as an output return all possible recommendation choices along with the probability of those products being a correct recommendation.
Subsequently, we take the five products with the highest probability and these will be our recommendations for that basket. (Giving five recommendations was considered adequate by the Technische Unie. Fewer recommendations would decrease the chance of giving a correct recommendation and too many recommendations would result in an information overload for the user). We use the following method to define a “correct” recommendation:
- From each basket in the historic data, we remove one product (the hold-out product).
- The resulting basket is then given as input to the model.
- If one of the five recommendations that the model returns is the actual hold-out product, then we consider this a “correct” recommendation.
A small portion of the product network of the Technische Unie. Products are indicated as black dots, while a line between two products represent the fact that this combination has been bought together at least once. You can see clusters of products that have frequently been bought together.
We trained our best performing model (yes, it was XGBoost) on five months of historic data. This data forms a large and complex network of about 400.000 baskets, comprising 11 million edges. Feeding all these baskets to our model along with the information extracted from the network, resulted in a correct recommendation approximately 30% of the time when testing it on a test set of one month of data. The test set contained around 130.000 baskets, which means that for around 39.000 baskets a correct recommendation was given.
It’s important to note, however, that the customer does not actually add the product to their basket every time a correct recommendation is given. Moreover, one could argue that the way of defining a correct recommendation is slightly biased, as we are removing hold-out items from the actual baskets, thereby disturbing their original composition.
This will always leave us with an imperfect estimate of the actual performance of the model as the baskets’ compositions in the testing environment are disturbed, while the performance of the recommendations in a live environment are influenced by external factors such as where and how the recommendations are shown on the website.
During this project, we’ve used a fairly unknown technique to build our recommender engine. The model performs well according to the defined metric, and the results are promising. However, not much literature exists on recommender engines that use a network based approach.
There are therefore plenty of interesting topics to research within this context. For example, it would be interesting to see when and if network based models outperform traditional methods such as item-to-item or user-to-user. It would also be useful to know how our network based approach translates to these other methods, as there are arguably similarities between them.
For now, you should take-away the fact that modelling user-preferences is a complex task and that there is probably no ‘holy-grail’ that works best in all cases. Therefore, it is key to understand that recommender engines don’t just come in the item-to-item or user-to-user form. By using your creativity, you can come up with other methods and combine seemingly unrelated fields of science into new hybrid-form recommender systems that might work very well for your specific case.