16/4/2019 | Machine Learning

Building Python Recommendation Systems that Work™

Jakub Cwynar
Artificial Intelligence, Data Science, Ecommerce
Recommendation systems improve both customer experience and sales, and they are a must-have for modern e-commerce. A simple system can be built in less than an hour.

Why should we care about recommendations?

You may not always realize it, but so many of the websites you use on a daily basis have built-in Recommendation Systems that are driving your experience — as well as nudging you towards purchases.

They are a must-have feature for any e-commerce website. We recently built one for a major apparel retailer which increased conversion rate by 1% and improved average order value by 5.55%. It helps if you have an expert team behind your implementation, but we believe that most people can get a handle on the concepts and even have a go at building their own simple systems. Let’s get down to business…

The fundamental idea used in recommendation systems — Collaborative Filtering — works on the assumption that if two (or more) users rate common items the same way, they probably have similar taste. It is a mathematical equation with many unknowns — and the bigger the database of users and items, the more it sprawls towards infinity. But don’t let the math scare you off.

Sudoku is a mathematical equation with nine unknowns. You can do it the nerdy way, reducing it to nine linked equations, but it takes a lot of work before you get down to real business. In fact, the quickest way to complete the puzzle is usually through logical thinking and risk (guessing). Filtering uses a similar mix of math and intuition.

Let’s imagine that cinephiles Tom and Ben both use our movie website. They don’t know each other but were equally excited by Gal Gadot running through no-man’s land in Wonder Woman, both rated Harry Potter as more “Accio” than just okay, both loved Avatar, and agreed that Godzilla kind of sucked.

However, when it came to the new Star Wars movie, things went a different way. Ben rated it first and loved it. We’d assume from what we know so far that Tom would feel the same, but he wasn’t into it at all. In fact, the algorithm now thinks that Tom’s preferences are more in line with Caitlyn’s.

The system would not be wrong to recommend Star Wars to Tom based on Ben’s rating. All the data suggests that Tom will like it; but there is still always an element of guessing, as there is no real accounting for taste. The system will never be right 100% of the time but, with enough data, we can find full or partial taste matches for people; learning as much from where people’s reactions are the same as from where they differ. It can collect people in pairs or groups and make the best possible guess with the information at hand.
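To make the idea of a "taste match" concrete, here is a minimal sketch that scores how alike two users are by comparing only the movies both have rated. The 1 to 5 ratings are invented for illustration (the post does not give actual numbers), and cosine similarity is just one of several possible measures, chosen here as an assumption.

```python
import math

# Hypothetical 1-5 ratings, invented for illustration only.
ratings = {
    "Tom": {"Wonder Woman": 5, "Harry Potter": 4, "Avatar": 5,
            "Godzilla": 2, "Star Wars": 1},
    "Ben": {"Wonder Woman": 5, "Harry Potter": 4, "Avatar": 5,
            "Godzilla": 2, "Star Wars": 5},
    "Caitlyn": {"Harry Potter": 4, "Godzilla": 2, "Star Wars": 1},
}


def similarity(a, b):
    """Cosine similarity computed over the movies both users rated."""
    common = sorted(set(ratings[a]) & set(ratings[b]))
    if not common:
        return 0.0  # no shared movies, so nothing to compare
    va = [ratings[a][m] for m in common]
    vb = [ratings[b][m] for m in common]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norm


for other in ("Ben", "Caitlyn"):
    print(f"Tom vs {other}: {similarity('Tom', other):.3f}")
```

With these made-up numbers, Tom scores closer to Caitlyn than to Ben once the Star Wars rating is in, which mirrors the story above. Real systems usually normalise ratings first, because some users rate everything high and others rate everything low.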

Getting the right data is essential

We can use two different types of customer feedback to create data. The first, ‘explicit’ feedback, is when users provide clear, affirmative information through actions like rating or buying a product or watching a film on a service like Netflix. These are obvious choices, but human activity is often more subtle.

‘Implicit’ feedback is when a user gives us a suggestion of their interest by perhaps watching a trailer or reading a review. A user might click on a product but not buy it. They have signaled intent but not committed an action.

When building recommendation systems, we need to decide whether explicit or implicit feedback is of most value to us, and also how it should be weighted. Can we learn as much from intention as we can from a completed action? And how do we factor in negative implicit feedback like a user watching only the first few seconds of a movie trailer? It’s a complex area that we will debate in another post. For our current example, we can assume that rating a movie is sufficient user feedback.

We should also say that, in our example, we are not dealing with the so-called cold-start problem. When new users sign up to a service or visit a site for the first time, we don’t know much about them yet. On our movie site, the user needs to watch a few films before we can make recommendations. And when a new movie is added, it needs some ratings before it can be paired with other films. To get the ball rolling, we might make some educated guesses or ask new users a few questions when they sign up to start feeding data into the algorithm. On Netflix, you are asked to choose a few titles you like to help “jump-start” your recommendations as a new user. If you choose none, you’ll be shown a generic choice of popular titles and your activity from that point will be the basis for the process.
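The fallback described above, where a new user sees generic popular titles until they have some activity, can be sketched roughly as follows. Everything here is an assumption made for illustration: the function names, the threshold of three ratings, the rule that a rating of 4 or more counts as a "like", and the sample data.

```python
from collections import Counter


def popular_titles(ratings, n=3, liked=4):
    """Titles ordered by how many users rated them `liked` or higher."""
    counts = Counter(
        title
        for user_ratings in ratings.values()
        for title, score in user_ratings.items()
        if score >= liked
    )
    return [title for title, _ in counts.most_common(n)]


def recommend(user, ratings, personalised, n=3, min_ratings=3):
    """Fall back to popular titles until the user has rated enough films."""
    if len(ratings.get(user, {})) < min_ratings:
        return popular_titles(ratings, n)
    return personalised(user, n)


# Hypothetical data; `personalised` stands in for a trained model.
ratings = {
    "Tom": {"Wonder Woman": 5, "Avatar": 5, "Godzilla": 2},
    "Ben": {"Wonder Woman": 5, "Avatar": 5, "Star Wars": 5},
    "Caitlyn": {"Harry Potter": 4, "Star Wars": 1, "Godzilla": 2},
}
personalised = lambda user, n: [f"model pick for {user}"]

print(recommend("new_user", ratings, personalised))  # popular titles
print(recommend("Tom", ratings, personalised))       # personalised path
```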

Interpreting your information

Assuming we’ve got our users and started gathering some explicit data, the next step towards building a recommendation system is to look at the actual ratings users gave:

This matrix, called the rating matrix (R), has some missing elements because, in real life, nobody has seen every movie. At this point, we can define what our recommendation system should do. We want the system to guesstimate how Caitlyn would rate Wonder Woman and Avatar so it can then recommend one or both to her if it decides that she would give them a high rating.
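As a rough illustration, a rating matrix with gaps could be held in a NumPy array, with NaN marking a movie the user has not rated. The numbers are hypothetical and only follow the story so far; the only detail taken from the text is that Caitlyn has not rated Wonder Woman or Avatar.

```python
import numpy as np

users = ["Tom", "Ben", "Caitlyn"]
movies = ["Wonder Woman", "Harry Potter", "Avatar", "Godzilla", "Star Wars"]

# Hypothetical ratings; np.nan marks a movie the user has not rated.
R = np.array([
    [5, 4, 5, 2, 1],            # Tom
    [5, 4, 5, 2, 5],            # Ben
    [np.nan, 4, np.nan, 2, 1],  # Caitlyn
])

# The (user, movie) pairs the recommender has to fill in.
missing = [(users[u], movies[m]) for u, m in zip(*np.where(np.isnan(R)))]
print(missing)
```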

To do so, we can apply a technique called matrix factorization, more specifically SVD (Singular Value Decomposition). It describes every user and every movie with a small set of abstract factors learned from the original matrix R. The factors carry no names or movie titles; they are just numbers that capture how strongly each user and each movie relates to each abstract concept. With this information, the system can try to predict the missing fields in the R matrix by combining a user’s factors with a movie’s factors. Of course, ours is only a simplification of what is actually a much more complex, automated process.
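For readers who want to see the mechanics, below is a minimal sketch of matrix factorization trained with stochastic gradient descent on the known ratings only. It is a simplified illustration and not the code of any particular library: the ratings are hypothetical, and the number of factors, learning rate, regularisation strength and epoch count are arbitrary assumptions.

```python
import numpy as np

# Hypothetical ratings (rows: Tom, Ben, Caitlyn); np.nan means "not rated".
R = np.array([
    [5, 4, 5, 2, 1],
    [5, 4, 5, 2, 5],
    [np.nan, 4, np.nan, 2, 1],
])
known = [(u, i, R[u, i])
         for u in range(R.shape[0])
         for i in range(R.shape[1])
         if not np.isnan(R[u, i])]

rng = np.random.default_rng(0)
n_factors = 2                                    # assumed number of concepts
P = rng.normal(0, 0.1, (R.shape[0], n_factors))  # user factors
Q = rng.normal(0, 0.1, (R.shape[1], n_factors))  # movie factors
mu = np.mean([r for _, _, r in known])           # average known rating


def rmse():
    """Root mean squared error on the known ratings."""
    return np.sqrt(np.mean([(r - (mu + P[u] @ Q[i])) ** 2 for u, i, r in known]))


rmse_before = rmse()
lr, reg = 0.02, 0.02                             # assumed hyperparameters
for _ in range(1000):
    for u, i, r in known:
        err = r - (mu + P[u] @ Q[i])
        pu = P[u].copy()                         # keep old value for Q's update
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * pu - reg * Q[i])
rmse_after = rmse()

predicted = mu + P @ Q.T                         # fills in every cell of R
print(f"RMSE on known ratings: {rmse_before:.2f} -> {rmse_after:.2f}")
print("Caitlyn, Wonder Woman:", round(predicted[2, 0], 2))
print("Caitlyn, Avatar:", round(predicted[2, 2], 2))
```

The predicted rating for a user and a movie is the average rating plus the dot product of their factor vectors. Library implementations typically add per-user and per-item bias terms and tune the hyperparameters rather than hard-coding them.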


More than 80 per cent of the TV shows and movies people watch on Netflix are discovered through the platform’s recommendation system.

Josephina Blattmann, UX Planet


With the theory out of the way, we can start building the actual system. Fortunately, we don’t need to implement all the algebra magic ourselves, as there is a great Python library made specifically for recommendation systems: Surprise. In a few lines of code, we’ll have our recommendation system up and running. First, let’s import the necessary components:
from surprise import SVD
from surprise import Dataset
Recommendation systems need historical data to work properly. As we are interested in user movie ratings, we can use the famous MovieLens-100k dataset.
In Surprise, all we need to do to get this data is to use the Dataset class and then extract the training set (the dataset used for training our model):
data = Dataset.load_builtin('ml-100k')
trainset = data.build_full_trainset()

Of course, this is just an example; in real life we won’t be using MovieLens. The Surprise documentation provides a nice tutorial for loading custom datasets.

The library comes with the SVD technique we discussed earlier straight out of the box:

algo = SVD()
algo.fit(trainset)
We create an object representing our model and train it on MovieLens. We are interested in predicting every user’s ratings for movies they haven’t seen, for which Surprise also has a tool:
testset = trainset.build_anti_testset()
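The build_anti_testset method returns a list of user-movie pairs not present in the training set; in other words, movies that users haven’t yet rated. Since there is no real rating for these pairs, each one is filled with the global mean rating as a placeholder, which is why r_ui is identical in every prediction. Predicting ratings for these blank fields is as simple as running one line: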
predictions = algo.test(testset)
We get a list of Prediction objects describing users, movies and predicted ratings:
[Prediction(uid='196', iid='302', r_ui=3.52, est=3.99, details={'was_impossible': False}),
Prediction(uid='196', iid='377', r_ui=3.52, est=2.75, details={'was_impossible': False}),
Prediction(uid='196', iid='51', r_ui=3.52, est=3.73, details={'was_impossible': False}),
Prediction(uid='196', iid='346', r_ui=3.52, est=3.50, details={'was_impossible': False}),
Prediction(uid='196', iid='474', r_ui=3.52, est=4.16, details={'was_impossible': False}),
Prediction(uid='196', iid='265', r_ui=3.52, est=3.76, details={'was_impossible': False}),

This list might look overwhelming, but we are only interested in three fields:

  • uid: the ID of the user for whom we make the prediction
  • iid: the item ID (here we treat movies as items)
  • est: the rating we expect the user to give the item

The actual recommendation happens when we display the top rated results to the user as something they might be interested in. There is a nice guide for that in the Surprise documentation.
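As a sketch of that last step, the function below groups predictions by user and keeps the highest estimated ratings. To stay self-contained it works on plain tuples laid out in the same field order as the Prediction objects (uid, iid, r_ui, est, details); the function name top_n and the sample list are illustrative, with the values copied from the output shown earlier.

```python
from collections import defaultdict


def top_n(predictions, n=3):
    """Group predictions by user and keep the n highest estimated ratings."""
    per_user = defaultdict(list)
    for uid, iid, _r_ui, est, _details in predictions:
        per_user[uid].append((iid, est))
    return {
        uid: sorted(items, key=lambda pair: pair[1], reverse=True)[:n]
        for uid, items in per_user.items()
    }


# Stand-in tuples mirroring the predictions printed above.
sample = [
    ("196", "302", 3.52, 3.99, {"was_impossible": False}),
    ("196", "377", 3.52, 2.75, {"was_impossible": False}),
    ("196", "51", 3.52, 3.73, {"was_impossible": False}),
    ("196", "346", 3.52, 3.50, {"was_impossible": False}),
    ("196", "474", 3.52, 4.16, {"was_impossible": False}),
    ("196", "265", 3.52, 3.76, {"was_impossible": False}),
]

for iid, est in top_n(sample, n=3)["196"]:
    print(f"user 196 -> movie {iid} (estimated rating {est})")
```

For user 196 this picks movies 474, 302 and 265, the three highest estimates in the sample.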

In future posts, we’ll talk about how to display recommendations in a more effective way and how to choose the right data for your system. If you want to learn a little more right now, these links are a pretty good place to start:

Mirumee guides clients through their digital transformation by providing a wide range of services from design and architecture, through business process automation, to machine learning. We tailor services to the needs of organizations as diverse as governments and disruptive innovators on the ‘Forbes 30 Under 30’ list. Find out more by visiting our services page.
