If you are starting a company that sells products online, streams movies or suggests images, chances are very high that you will be advised to deploy a recommendation engine.
So, how does one assess the value of such personalised systems to a company? What factors does one have to consider so that a recommender system serves the business better, and vice versa?
Successes like Amazon and Netflix have surely put AI-powered recommendation engines at the top. However, there have also been reports of significant failures at companies that made the leap from traditional methods to AI.
Awareness of how algorithm-driven solutions such as personalised systems work is necessary to handle the ever-changing preferences of customers.
Academic research mostly focuses on the consumer perspective, with the assumption that improved customer value is indirectly also beneficial for the recommendation provider. Indeed, among other considerations, service providers are usually interested in improving the recommendation experience of consumers. Typically, business value is measured through sales or revenue, click-through rates (CTR), higher user engagement, or customer retention rates.
Choosing The Right Strategy
Way back in 2012, Netflix wrote in a blog post that 75% of what people watch is from some sort of recommendation. A couple of years ago, they estimated the business value of recommendation and personalisation at more than $1 billion per year. In 2019, one can only imagine what kind of popularity Netflix enjoys all over the world, especially with growing markets like India.
Generally, there are various “business models” for recommenders, that is, ways in which they help improve a business. Commonly listed effectiveness measures include increased sales, fee-based sales through more transactions or subscriptions, and increased income from other types of fees.
Nonetheless, it remains difficult to assess how such increases in adoption translate to increased business value.
For example, click-through rates are often used as the measure of choice, partly because they are easy to acquire, but CTR measurements can be misleading and do not actually capture business value well.
Click-through rates are used to measure how many clicks are garnered by the recommendations. The underlying assumption is that more clicks on the recommended items indicate that the recommendations were more relevant for the users.
While the CTR is able to measure user attention or interest, it cannot convey, for example, whether users really liked the recommended news article they clicked on or if they purchased an item whose product details they inspected based on a recommendation.
Therefore, alternative adoption measures, often based on domain-specific considerations, are used that are supposed to be better suited to gauge the usefulness of the recommendations.
For instance, Netflix uses the “take-rate”, a measure that captures how often a video or movie was actually played after being chosen from a recommendation.
According to their experiments, increases of the take-rate due to the deployment of a personalised strategy are substantial when compared to recommendations based on popularity.
In domains where the items cannot be directly consumed (read, viewed, or listened to), other business-related adoption measures are common. Examples include the “purchase-through” or “bid-through” rate on eBay, as well as the “link-through” or “cite-through” rate for research paper recommendations or the number of “click-out” events to external partners in online marketplaces.
When users more often pick items from a recommendation list and later purchase or view them, this is a good indicator that a new algorithm succeeded in identifying items that are relevant to the user.
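The difference between CTR and an adoption measure can be made concrete with a small sketch. The log schema, the strategy names and the numbers below are all made up for illustration, and the "adoption rate" here is a simplified stand-in for measures such as take-rate or purchase-through, not any company's actual definition.

```python
from dataclasses import dataclass


@dataclass
class Impression:
    """One recommendation shown to a user (hypothetical log schema)."""
    strategy: str   # e.g. "popularity" or "personalised"
    clicked: bool   # the user clicked the recommended item
    adopted: bool   # the user later played or purchased the item


def rates(log, strategy):
    """Return (CTR, adoption rate) per impression for one strategy."""
    shown = [e for e in log if e.strategy == strategy]
    if not shown:
        return 0.0, 0.0
    ctr = sum(e.clicked for e in shown) / len(shown)
    adoption = sum(e.adopted for e in shown) / len(shown)
    return ctr, adoption


# Invented toy data: 10 impressions per strategy.
log = (
    [Impression("popularity", True, False)] * 4      # clicked, not adopted
    + [Impression("popularity", True, True)] * 1     # clicked and adopted
    + [Impression("popularity", False, False)] * 5   # ignored
    + [Impression("personalised", True, True)] * 3   # clicked and adopted
    + [Impression("personalised", False, False)] * 7 # ignored
)

for name in ("popularity", "personalised"):
    ctr, adoption = rates(log, name)
    print(f"{name}: CTR={ctr:.2f}, adoption={adoption:.2f}")
```

In this invented log the popularity list wins on CTR (0.50 against 0.30) but loses on adoption (0.10 against 0.30), which is the kind of disagreement that makes CTR alone a risky yardstick.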
According to a survey on recommender systems done by Dietmar Jannach & Michael Jugovac, measuring the value of a recommender system is not trivial. Even when revenue or profit can be captured directly in A/B tests, there might be longitudinal effects that are difficult to assess in advance.
Here are a few takeaways from the survey:
- When nearly everything on a web page is personalised or some form of recommendation, e.g., in the case of Netflix, users are likely to choose whatever is recommended to them due to a mere presence effect.
- Counting only how often users start streaming such an item can therefore also be misleading as this measure would include users who started playing the movie but did not enjoy it in the end.
- In some cases, when customer retention is already high, as in the case of Netflix, significant improvements in customer retention can be difficult to achieve.
- The choice of the recommendation strategy (collaborative vs. content-based vs. non-personalised) does matter, both in terms of sales and in general user behaviour.
- Such studies are, however, different from many offline experiments conducted in academic research, which typically benchmark algorithms of a similar type, e.g., different matrix factorisation variants or sometimes even only different loss functions for the same learning approach.
- Whether the tiny accuracy improvements reported in such offline experiments translate into relevant business value improvements when deployed in real-world environments remains difficult to assess, as published industrial field tests rarely focus on such fine-grained comparisons between similar algorithmic approaches.
- The success of a recommender system can, however, also be dependent on other factors, including the users’ trust in the recommender or the website as a whole, the perceived transparency of the recommendations, and, most importantly, the user interface.
So Should You Get One?
Businesses that rely heavily on recommendation models today, be they in retail or entertainment, more or less start from three rules of thumb:
- Content-based: What a certain customer is clicking (genres, actors, brands etc.)
- Similarity: What customers who are similar in tastes are clicking
- Trending: Can depend on similarities, geographical location etc.
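The three rules of thumb can be sketched on a toy click log. Everything below is a hypothetical illustration: the user names, item ids, genres and function names are invented, Jaccard overlap is just one simple choice of similarity, and the trending rule ignores geography for brevity. Production systems use far richer models.

```python
from collections import Counter

# Hypothetical toy data: items each user clicked, and each item's genre.
clicks = {
    "asha": ["m1", "m2"],
    "ravi": ["m1", "m2", "m3"],
    "meera": ["m4", "m5", "m1"],
}
genres = {
    "m1": "drama", "m2": "drama", "m3": "thriller",
    "m4": "comedy", "m5": "comedy", "m6": "drama",
}


def content_based(user):
    """Unseen items from the genre this user clicks most."""
    seen = set(clicks[user])
    top_genre = Counter(genres[i] for i in seen).most_common(1)[0][0]
    return sorted(i for i, g in genres.items() if g == top_genre and i not in seen)


def similar_users(user):
    """Unseen items clicked by the most similar other user (Jaccard overlap)."""
    seen = set(clicks[user])

    def jaccard(other):
        theirs = set(clicks[other])
        return len(seen & theirs) / len(seen | theirs)

    nearest = max((u for u in clicks if u != user), key=jaccard)
    return sorted(set(clicks[nearest]) - seen)


def trending(user, k=2):
    """The k most-clicked items overall that this user has not seen."""
    seen = set(clicks[user])
    counts = Counter(i for items in clicks.values() for i in items)
    return [i for i, _ in counts.most_common() if i not in seen][:k]


print(content_based("asha"))   # ['m6']
print(similar_users("asha"))   # ['m3']
print(trending("asha"))        # ['m3', 'm4']
```

Even on this tiny example the three rules surface different items for the same user, which is why real systems typically blend several signals rather than pick one.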
Added to these are other factors such as how long a customer watches a certain movie or what product one has added to the cart. If a user watches a certain movie of some genre for a specific amount of time, then that too will be factored into building a personalised system. As the number of items or categories increases, the model gets more complicated, not to mention the number of users and the number of clicks per minute.
Businesses need to be ready to handle both of these big numbers without trading away accuracy. Recommending tricolour products to an Indian user after Independence Day would not be much of a hit. Real-time or near-real-time recommendation engines are what the top companies pursue.
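One way watch time could be factored in is to turn it into a per-genre affinity score. The sketch below is an assumption-laden illustration, not a description of how any particular service does it: the `MIN_FRACTION` cut-off, the `genre_affinity` function and the sample views are all hypothetical.

```python
from collections import defaultdict

# Hypothetical cut-off: ignore views abandoned almost immediately.
MIN_FRACTION = 0.25


def genre_affinity(views):
    """Sum the watched fraction per genre.

    views: iterable of (genre, minutes_watched, runtime_minutes).
    """
    totals = defaultdict(float)
    for genre, watched, runtime in views:
        fraction = min(watched / runtime, 1.0)  # cap rewatches at 1.0
        if fraction >= MIN_FRACTION:
            totals[genre] += fraction
    return dict(totals)


# Invented viewing history for one user.
views = [("thriller", 90, 120), ("thriller", 120, 120), ("comedy", 5, 100)]
print(genre_affinity(views))  # {'thriller': 1.75}
```

The comedy title was abandoned after five minutes, so it contributes nothing, while the two thrillers add up to a strong signal. Other implicit signals, such as add-to-cart events, could be folded in the same way with their own weights.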
“Skilled engineers and cloud resources like AWS play a key role in handling almost real-time recommendations,” says Chaitanya Jadhav, a former product engineer at a startup which builds recommendation engines for media and retail companies.
Skilled engineers are needed not only to build models but also to monitor whether the quality of a certain movie recommendation is consistent with user tastes. Humans are quite underrated when it comes to recommendation. However, given thousands of options, millions of customers and a billion-dollar market value, recommendation ought to be automated.
Picking the right service plays a key role too. “AWS has been the choice of many businesses for all the customisation and services they offer,” says Jadhav, speaking to Analytics India Magazine.
The right strategy differs with the use case. If a YouTube user has a watch history of trending videos only, that user account will be associated with a personalised model that works for that specific user. Not just recommendation but personalised recommendation is what an up-and-coming company should aim at.
As for whether a startup needs an AI-powered recommendation engine, the answer is: why not?
Recommendation models have been the most successful machine learning application so far, because they not only try to know the customer but also expose the product to new customers. The success of recommendation engines can be summarised as follows:
- Introduce products (movies, baby oil etc.) to new users
- Retain the user with accurate recommendations
- Model improves with increased user base
- Recommends better