Modern digital applications rely heavily on personalised user experiences delivered through recommender systems. Whether the user is picking movies, music, retail products or even a partner on a dating site, these systems depend on feedback loops that are reinforced by the choices the user makes.
A key source of bias is ignoring the fact that a choice is made from a cherry-picked and limited subset of options, leading to a self-reinforcing feedback loop. Options that were never presented are unfairly penalised.
Self-reinforcing feedback loops in personalisation systems are typically caused by choosing from a limited set of alternatives presented systematically based on previous choices.
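The penalty on unseen options can be made concrete with a toy example. The sketch below is only an illustration, not code from the paper: the catalogue, the presentation log and the helper names `naive_popularity` and `exposure_counts` are all hypothetical. It scores each option by its share of recorded choices, which is what a naive popularity-based recommender would do, and then counts how often each option was actually shown.

```python
from collections import Counter

def naive_popularity(choices, catalogue):
    """Share of all recorded choices won by each option, ignoring exposure."""
    counts = Counter(choices)
    total = len(choices)
    return {item: counts[item] / total for item in catalogue}

def exposure_counts(presentations, catalogue):
    """How many times each option appeared in a presented subset."""
    shown = Counter(item for subset in presentations for item in subset)
    return {item: shown[item] for item in catalogue}

# Hypothetical log: the system keeps showing A because A keeps winning.
catalogue = ["A", "B", "C", "D"]
presentations = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "B")]
choices = ["A", "B", "A", "A"]

scores = naive_popularity(choices, catalogue)
shown = exposure_counts(presentations, catalogue)

for item in catalogue:
    print(item, "score:", scores[item], "times shown:", shown[item])
```

Option D ends up with the lowest possible score even though the user never had a chance to pick it. A system that ranks by this score will keep hiding D, which is the self-reinforcing loop described above.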
To expose the biases that pervade recommender systems and to offer potential solutions, a team at Bogazici University in Turkey published a paper. The work poses two questions aimed at tackling the feedback loop:
- How does one model user preferences accounting for the bias introduced by systematic and limited presentations, ensuring all alternatives are treated fairly?
- How does a system, built on such a model and aware of its limitations, learn to present the best subset of alternatives?
To address these questions, the authors explore Bayesian and bandit problem approaches.
They introduced Dirichlet-Luce, a Bayesian choice model that is aware of limited exposure and admits efficient learning and inference.
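To give a feel for what "aware of limited exposure" means, here is a minimal sketch of Luce's choice rule, on which the model's name suggests it is built. This is an assumption of the example rather than the paper's implementation, and it leaves out the Bayesian prior entirely. Under Luce's rule, the probability of picking an option is its worth divided by the total worth of the options that were actually presented. The function names and the worth values are hypothetical.

```python
import math

def luce_choice_probs(worth, presented):
    """Luce's rule: P(i | presented) = worth[i] / sum of worths in the presented subset."""
    total = sum(worth[i] for i in presented)
    return {i: worth[i] / total for i in presented}

def log_likelihood(worth, observations):
    """Log-likelihood of (presented subset, chosen option) pairs under Luce's rule."""
    return sum(
        math.log(luce_choice_probs(worth, presented)[chosen])
        for presented, chosen in observations
    )

# Hypothetical worths for three options; only A and B are ever presented.
worth = {"A": 0.5, "B": 0.3, "C": 0.2}
observations = [(["A", "B"], "A"), (["A", "B"], "B"), (["A", "B"], "A")]

probs = luce_choice_probs(worth, ["A", "B"])
ll = log_likelihood(worth, observations)

# Changing the worth of the never-presented option C leaves the likelihood untouched.
ll_other = log_likelihood({"A": 0.5, "B": 0.3, "C": 5.0}, observations)
print(probs, ll, ll_other)
```

Because C never appears in a presented subset, the observed choices say nothing about it, so it is neither rewarded nor penalised by the data. That is the kind of fairness towards unseen options the first question asks for.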
The authors used the TopRank algorithm as a baseline to measure the effectiveness of their presentation mechanism in online learning to rank. According to the paper, TopRank subsumes various choice models, including the ones shown in the paper, and has been shown to outperform previous work.
The Dirichlet-Luce model provides fair and efficient preference estimates. However, in a real-world scenario, the onus is on the system to select a subset of options to present. In other words, the system needs an efficient presentation mechanism that simultaneously explores the options the user might like and exploits the current best alternatives.
Here, the authors frame this active preference learning scenario as a bandit problem because, in a bandit setting, the Bayesian construction of the model serves a dual purpose.
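The explore-and-exploit loop can be sketched in a few lines. Note that this is not the authors' algorithm: the sketch swaps in a plain epsilon-greedy rule for the presentation step and a minorise-maximise style update with a Gamma regulariser for the worth estimates, both chosen here only because they are short. The simulated user, the true worths and all function names are hypothetical.

```python
import random

def map_worths(items, history, a=2.0, b=1.0, iters=50):
    """Regularised worth estimates under Luce's rule (Gamma(a, b) regulariser, an assumption)."""
    wins = {i: 0 for i in items}
    for _, chosen in history:
        wins[chosen] += 1
    worth = {i: 1.0 for i in items}
    for _ in range(iters):
        # Each presentation of item i adds 1 / (total worth of that presented subset).
        denom = {i: b for i in items}
        for presented, _ in history:
            inv_total = 1.0 / sum(worth[j] for j in presented)
            for i in presented:
                denom[i] += inv_total
        worth = {i: (a - 1.0 + wins[i]) / denom[i] for i in items}
    return worth

def choose_subset(items, worth, k, epsilon, rng):
    """Epsilon-greedy: usually the top-k by estimated worth, sometimes a random subset."""
    if rng.random() < epsilon:
        return rng.sample(items, k)  # explore
    return sorted(items, key=lambda i: worth[i], reverse=True)[:k]  # exploit

def simulate_user(presented, true_worth, rng):
    """Hypothetical user who picks from the presented subset according to Luce's rule."""
    weights = [true_worth[i] for i in presented]
    return rng.choices(presented, weights=weights, k=1)[0]

def run(rounds=100, k=2, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    true_worth = {"A": 1.0, "B": 2.0, "C": 4.0, "D": 8.0}  # made-up preferences
    items = list(true_worth)
    history = []
    for _ in range(rounds):
        worth = map_worths(items, history)
        presented = choose_subset(items, worth, k, epsilon, rng)
        chosen = simulate_user(presented, true_worth, rng)
        history.append((presented, chosen))
    return map_worths(items, history), history

estimates, history = run()
print(estimates)
```

One property worth noticing is that an item that has never been presented keeps its prior estimate of `(a - 1) / b` no matter what happens to the other items, so it is not pushed down merely for being hidden. The random exploration step is what eventually gives such items a chance to be shown.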
Key Takeaways
- Introduced a Bayesian choice model, the Dirichlet-Luce model, that accounts for limited exposure to alternatives
- Provided a practical model, inference framework and presentation mechanism that can deal with the inherent self-reinforcing feedback loop present in many interactive systems
- The model ensures independence of unexplored alternatives: the marginal posterior probabilities of choosing options that were never presented are independent of other choices
- Proposed a mechanism for learning to present the best subset of alternatives, casting the Dirichlet-Luce model as the central component of a bandit algorithm
Avoiding Personalised Echo Chambers
The number of songs available exceeds what an individual can listen to in a lifetime. Choosing from millions of songs is tedious, and there is a good chance of missing out on songs that could have become favourites. The same is true of tweets on Twitter or products on an e-commerce website.
A naive recommender system can suddenly become a tool for pushing propaganda.
A user’s interest may degenerate over time due to systematic exposure, leading to echo chambers. The interplay of the user’s choice and the system’s presentation in a feedback loop, reinforcing the system’s own biased belief, results in the so-called filter bubble, an unintentional form of censorship with unexpected economic and societal impact.
Solutions such as the ones discussed above can also be adapted to a multi-user setting as in collaborative filtering where users are assumed to share interests.
Many popular recommendation approaches have been widely accepted and are still being improved. Since deep learning-based recommendation systems are here to stay, it makes sense to evaluate the route they take in building the final model.