Marketplaces select products that closely match each customer's needs, and streaming services build personalized music and video compilations. Recommendation systems make this possible. Let’s look at how to develop one.
Today's online services act as virtual advisors, offering users products and services tailored to their interests. They have taken on this role thanks to recommendation systems based on Machine Learning (ML) and Artificial Intelligence (AI). These systems analyze user behavior online and generate personalized selections that are highly likely to match each user's preferences. The consulting company Mordor Intelligence forecasts that the global market for recommendation systems will grow sevenfold between 2020 and 2026, reaching $15.13 billion.
Recommendation systems have become an indispensable tool for businesses. According to a 2021 McKinsey study, 78% of respondents choose, recommend or are willing to pay more for a brand that provides a personalized service or experience. This is true not only for online retailers, but also for social media, video streaming services, and grocery delivery services. However, developing a model that can reliably analyze constantly changing user information requires a lot of effort. Let’s break down the main development stages and the nuances worth knowing.
Recommendation system development stages
1. Defining the problem
Before asking IT specialists to develop a recommendation system, the company needs to define the specific business problems the system is going to solve. The ultimate goal of the project may be to increase sales and the average order value, speed up the buying process, manage demand more effectively, save on POS (point-of-sale) promotional materials, or increase user loyalty. The data collection parameters and the scope of the recommendation system depend on the goal chosen.
2. Calculating the project costs
The cost of developing a recommendation system depends on a number of factors, including:
- the total amount of data in the database;
- the amount of information that is updated daily;
- the structure and variety of data;
- the complexity of the subject area;
- the required speed of operation;
- the expected system accuracy;
- the required infrastructure.
Systems have varying degrees of accuracy. Accuracy is how well the recommendations meet user expectations. In the initial stages of building the model, it is important to determine what level of accuracy makes sense for the company, that is, what level delivers the desired effect relative to the development investment. The degree of accuracy depends on:
- the amount of data: the more user information the recommendation system collects and analyzes, the more accurate it will be;
- the algorithms used: for example, the system can rely on a single method of generating recommendations (filtering based on content or filtering based on user behavior) or combine several of them.
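To make filtering based on user behavior more concrete, here is a minimal sketch of one common variant, user-based collaborative filtering with cosine similarity. The ratings, user names and item names are invented for illustration, and a production system would normally rely on a dedicated library rather than plain dictionaries.

```python
from math import sqrt

# Hypothetical interaction data: user -> {item: rating}. Not from a real service.
ratings = {
    "alice": {"book_a": 5, "book_b": 4},
    "bob": {"book_a": 4, "book_b": 5, "book_c": 5},
    "carol": {"book_b": 2, "book_c": 1, "book_d": 5},
}

def cosine(u, v):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(user, ratings, top_n=2):
    """Rank items the user has not rated by the similarity-weighted
    average rating given by other users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # skip items the user already knows
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(scores, key=lambda i: scores[i] / weights[i], reverse=True)
    return ranked[:top_n]

print(recommend("alice", ratings))
```

Content-based filtering would follow the same pattern, but compare item descriptions instead of user rating vectors; a hybrid system blends the two scores.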
As the business grows, the model will likely need to be scaled up and retrained. Consumption trends are changing, which means that the model needs to be constantly updated with new data to improve its accuracy.
The cost of a recommendation system varies widely. On top of the initial investment, 10-15% of that amount is spent annually on maintaining the solution. The payback period varies as well; on average, however, the system pays for itself within two to three years.
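The relationship between the initial investment, annual maintenance and the payback period can be sketched with simple arithmetic. The function name and all figures below are hypothetical and serve only to show the calculation; they are not estimates for any real project.

```python
def payback_years(initial_cost, annual_benefit, maintenance_rate):
    """Years until the cumulative net benefit covers the initial investment.

    maintenance_rate is the share of the initial cost spent on upkeep
    each year (for example 0.10 to 0.15).
    """
    net_per_year = annual_benefit - maintenance_rate * initial_cost
    if net_per_year <= 0:
        return None  # under these inputs the system never pays for itself
    return initial_cost / net_per_year

# All figures are invented for illustration.
years = payback_years(initial_cost=100_000, annual_benefit=52_500,
                      maintenance_rate=0.125)
print(years)
```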
3. Collecting and analyzing data about the system's future users
The model's ability to generate relevant suggestions directly depends on how much data it collects. IT professionals mainly use two methods to collect data:
- surveying users about their preferences;
- tracking actions on a site or in an app (purchase history, pages viewed, products liked, etc.).
Combining these practices makes it possible to take into account both the interests of each individual and their experience with the platform.
However, in some cases, less conventional ways of collecting data are used:
- purchasing data (possible only if users have signed documents confirming their consent to the processing of their personal data and its transfer to third parties);
- integration with other services for the exchange of information;
- market research to enrich existing data.
These less conventional methods can be more effective for companies that are new to the market and therefore do not yet have a large amount of data. They are also often used to improve the performance of recommendation systems that are already configured, when all other resources have been exhausted.
The data obtained must then be analyzed. Techniques used for this purpose include:
- duplicate detection techniques that find and filter out repeated content;
- clustering methods that group similar objects;
- association rule mining algorithms that help find patterns between related events.
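Two of the techniques listed above, duplicate detection and association rule mining, can be illustrated with a small sketch. The order log is hypothetical, and the rule metrics shown (support and confidence) are the standard basic measures rather than a full mining algorithm such as Apriori. Clustering is left out here because it usually relies on a dedicated library.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order log (order_id, items); the last row duplicates the first.
orders = [
    (1, ("tea", "cookies")),
    (2, ("tea", "cookies", "honey")),
    (3, ("coffee", "cookies")),
    (4, ("tea", "honey")),
    (1, ("tea", "cookies")),
]

# Duplicate detection: keep the first occurrence of each identical record.
unique_orders = list(dict.fromkeys(orders))

# Count how often single items and item pairs appear in the same order.
item_counts = Counter()
pair_counts = Counter()
for _, items in unique_orders:
    basket = sorted(set(items))
    item_counts.update(basket)
    pair_counts.update(combinations(basket, 2))

def rule_metrics(antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    pair = tuple(sorted((antecedent, consequent)))
    together = pair_counts[pair]
    support = together / len(unique_orders)        # share of all orders
    confidence = together / item_counts[antecedent]  # share of antecedent orders
    return support, confidence

support, confidence = rule_metrics("honey", "tea")
print(f"honey -> tea: support={support:.2f}, confidence={confidence:.2f}")
```

A rule with high confidence and reasonable support (here, customers who buy honey also tend to buy tea) is a candidate for a "frequently bought together" block.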
4. Developing a prototype
A prototype is an initial working version of the system that produces only basic results. It is used to check whether the required accuracy can be achieved with the existing data set, and to evaluate the effectiveness of the future model.
At this stage, specialists recalculate the costs of developing and maintaining the system, and the company makes a final decision on the project's feasibility.
5. Additional data analysis and processing
At this stage, specialists:
- exclude attributes (features) that carry no useful information and interfere with building recommendations. For example, an online bookstore does not need the user's apartment number to form recommendations, while the country, city or even street can be useful;
- bring the values to a single format. Sometimes attributes are presented as strings or images, which makes it impossible to process them directly; in this case encoding is used to convert the attributes to a single numeric format. Sometimes the information in an attribute must be transformed before it becomes useful. This is the task of feature generation.
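The three operations described above (dropping uninformative attributes, encoding, and feature generation) can be sketched as follows. The records and field names are hypothetical, and one-hot encoding is only one of several possible encoding schemes.

```python
from datetime import date

# Hypothetical raw user records; the field names are illustrative only.
users = [
    {"city": "Berlin", "apartment": "12", "registered": date(2023, 1, 15)},
    {"city": "Paris", "apartment": "7", "registered": date(2024, 6, 1)},
    {"city": "Berlin", "apartment": "3", "registered": date(2024, 11, 20)},
]

# Attribute selection: fields assumed to carry no signal for recommendations.
DROPPED = {"apartment"}

# Encoding: each distinct city gets its own 0/1 column (one-hot encoding).
cities = sorted({u["city"] for u in users})

def to_features(user, today):
    """Turn one raw record into a flat list of numbers."""
    kept = {k: v for k, v in user.items() if k not in DROPPED}
    one_hot = [1 if kept["city"] == c else 0 for c in cities]
    # Feature generation: a raw date becomes "days since registration".
    account_age_days = (today - kept["registered"]).days
    return one_hot + [account_age_days]

today = date(2025, 1, 1)
matrix = [to_features(u, today) for u in users]
print(cities, matrix)
```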
6. Launching the system and A/B testing
In the final step, experts refine the recommendation system and integrate it into the existing infrastructure. They then evaluate the model's effectiveness to make sure that it produces relevant recommendations. A/B testing is most often used for this purpose: the method compares two versions of the product and determines which of them is better suited to achieving the objectives set. The comparison is usually based on CTR (click-through rate), one of the most common metrics in online marketing, which shows the ratio of clicks to impressions.
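As an illustration, the sketch below computes CTR for two variants and checks whether the difference is likely to be more than noise. The two-proportion z-test is one common choice for this check and is an assumption here, not something the process above prescribes; the click and impression counts are invented.

```python
from math import erf, sqrt

def ctr(clicks, impressions):
    """Click-through rate: clicks divided by impressions."""
    return clicks / impressions

def two_proportion_z_test(clicks_a, shown_a, clicks_b, shown_b):
    """Return (z, two-sided p-value) for the CTR difference between B and A."""
    p_a, p_b = ctr(clicks_a, shown_a), ctr(clicks_b, shown_b)
    pooled = (clicks_a + clicks_b) / (shown_a + shown_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / shown_a + 1 / shown_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)

# Hypothetical experiment: A is the old version, B shows the new recommendations.
z, p_value = two_proportion_z_test(clicks_a=200, shown_a=10_000,
                                   clicks_b=260, shown_b=10_000)
print(f"CTR A={ctr(200, 10_000):.2%}, CTR B={ctr(260, 10_000):.2%}, "
      f"z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests that the change in CTR is unlikely to be a random fluctuation; the threshold to act on (0.05 is a common convention) is a business decision.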
Problems that developers face
1. “Cold” start and incomplete data
There are three main causes of cold start:
- the system has been launched recently and has not yet accumulated enough users and data to form recommendations;
- a new item has been added to the system and users have interacted with it very little;
- a new user has registered in the system and there is not yet enough information about them.
Due to the lack of data, the model either does not create personalized suggestions at all or produces irrelevant ones. To mitigate this, you can, for example, train the system to offer the most popular products or services, based on statistics for all users.
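The popularity fallback can be sketched as follows. The purchase log, the `MIN_HISTORY` threshold and the `personalized_model` callback are hypothetical names introduced for illustration.

```python
from collections import Counter

# Hypothetical purchase log: (user_id, item_id) pairs.
purchases = [
    ("u1", "kettle"), ("u2", "kettle"), ("u3", "kettle"),
    ("u1", "mug"), ("u2", "mug"),
    ("u3", "teapot"),
]

MIN_HISTORY = 2  # assumed threshold below which a user counts as "cold"

def most_popular(purchases):
    """All items ranked by purchase count across all users."""
    counts = Counter(item for _, item in purchases)
    return [item for item, _ in counts.most_common()]

def recommend(user, purchases, personalized_model=None, top_n=2):
    """Use the personalized model when there is enough history,
    otherwise fall back to overall bestsellers."""
    history = [item for u, item in purchases if u == user]
    if personalized_model is not None and len(history) >= MIN_HISTORY:
        return personalized_model(user, history)[:top_n]
    # Cold start: offer popular items the user has not bought yet.
    seen = set(history)
    return [item for item in most_popular(purchases) if item not in seen][:top_n]

print(recommend("new_user", purchases))
```

The same fallback covers a freshly launched system, where no personalized model exists yet. A new item with no interactions needs a different remedy, such as recommending it based on its description.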
2. Increasing the amount of data
As the service becomes more popular, the amount of information that has to be considered when creating recommendations grows as well. This may require different methods of processing data and generating recommendations, as well as a more thorough approach to clustering algorithms. For example, specialists can use frameworks for cluster computing and data processing, such as Dask or Spark.
3. Changing user preferences
Over time, user interests change, and recommendations that used to be appropriate become irrelevant. The system needs to take these changes into account. To do this, it collects data on the user's interaction with the site or app and compares it with the preferences recorded earlier, while also evaluating data such as age and location. Once it has determined that a change has occurred, the system adapts to the new interests.
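One simple way to let recent behavior outweigh old behavior is to weight each interaction by its age. The sketch below uses exponential time decay; this is an assumed technique chosen for illustration, not the only way to detect a shift in preferences, and the half-life value and interaction log are invented.

```python
from collections import defaultdict

HALF_LIFE_DAYS = 30  # assumed: an interaction loses half its weight every 30 days

# Hypothetical log for one user: (category, days_ago).
interactions = [
    ("thrillers", 200), ("thrillers", 180), ("thrillers", 150),
    ("cookbooks", 5), ("cookbooks", 2),
]

def interest_profile(interactions, half_life=HALF_LIFE_DAYS):
    """Sum exponentially decayed weights per category."""
    profile = defaultdict(float)
    for category, days_ago in interactions:
        profile[category] += 0.5 ** (days_ago / half_life)
    return dict(profile)

profile = interest_profile(interactions)
top_interest = max(profile, key=profile.get)
print(top_interest, profile)
```

Although the user has more thriller interactions in total, the two recent cookbook interactions dominate the profile, so the recommendations shift toward the new interest.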