The Cricket World Cup is the talk of the town, and cricket fans cannot wait to see which teams will reach the semi-finals. While people are still speculating, Predict22, an AI-based analytics startup, expects that England, India, Australia and West Indies will qualify for the semi-finals, in no particular order.
While the team says that choosing the winner right now would be a fool’s errand owing to the margin of error in predictions, it does not shy away from putting its money on England and naming West Indies as the dark horse. How does it arrive at these predictions?
Analytics India Magazine got in touch with Shubham Bindlish, Founder & CEO of Predict22 Sports Analytics, to understand how the company is using analytics and AI to provide sports fans with insights and predictions. Predict22 works extensively with artificial intelligence and machine learning models and claims to be the only sports analytics company in India that uses these technologies to predict the outcome of sporting events for all the stakeholders in the game: administrators, selectors, teams and fans.
“Our prediction accuracy is unmatched in the market right now and our models are constantly improving,” says Bindlish.
From Idea To Inception
The idea behind Predict22 is as interesting as the startup itself. Bindlish shares that it grew out of a wish to apply AI in a field that offers immediate feedback. “Nothing seemed better than sports where the algorithm is proven right or we learn something at the end of the day,” he says.
The choice of cricket was quite natural due to the easy availability of stats and the modular nature of the game, coupled with his personal passion for the game as a fan. “Since childhood, I have been maintaining home-made databases (notebooks) for Indian players and have been trying to answer the ‘will they, won’t they’ question at every World Cup!” shares Bindlish.
Predict22 started as an attempt to answer an intuitive question and grew from there into its present form. While working at Goldman Sachs after graduating in Computer Science from BITS Pilani, Bindlish experienced the importance of data firsthand and became intrigued by its potential to answer intuitive questions.
From Cricket, His Known Territory To Basketball
Bindlish first tried out his models on cricket, with T20 as the starting point. The format was still young, having started in 2003, yet a lot of data points already existed (thanks to the domestic T20 leagues all over the world). This data was also readily available, thanks to sites like https://cricsheet.org/.
“I narrowed down the 2016 T20 World Cup as the first tournament that Predict22 would predict matches for. After receiving promising results, we kept on expanding our dataset and building the algorithm to incorporate the IPL, other T20 domestic leagues and eventually white ball cricket for the 2017 Champions Trophy,” he said.
Intrigued by the results, he decided to investigate whether a similar model could work for other team sports. Football was the obvious second choice, as he is an ardent Chelsea fan. “With the 2018 World Cup serving as the hotbed for that experiment, we quickly expanded to cover the top four European leagues (EPL, La Liga, Serie A, Bundesliga) and the Indian Super League (ISL),” he added.
While the algorithm produced extremely good results for games that he understood, another experiment was to find out whether it would work for a sport he does not follow and whose salient features he therefore does not necessarily understand. The algorithm was thus extended to basketball, where it showed surprisingly strong results.
For those wondering about the name Predict22, Bindlish has an interesting story to share. The name was conceived when he started predicting cricket matches: a cricket pitch is 22 yards long and a match has 22 players. The 22-player analogy carried over to football as well. While it does not hold for basketball, the team is still quite fond of the name.
Putting AI And Analytics At Work
It is a no-brainer that predictions need data. The Predict22 team has collected historical data from various free datasets and by scraping the open web. The data is cleaned and stored in a database by an in-house ETL process, which enriches it with dimensional data (structured labelling information such as the profiles of players and teams).
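To make the clean-and-enrich step concrete, here is a minimal sketch in Python. Predict22's actual ETL code and schema are not public, so every name in it (`PlayerProfile`, `clean_row`, `enrich`, and the `player_id` and `runs` fields) is a hypothetical stand-in: scraped rows are normalised, unusable ones are dropped, and each surviving row is joined to a player profile.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlayerProfile:
    """Hypothetical dimensional record describing a player."""
    player_id: str
    name: str
    role: str   # e.g. "batter" or "bowler"
    team: str


def clean_row(raw: dict) -> Optional[dict]:
    """Normalise one scraped row; return None if it cannot be used."""
    player_id = str(raw.get("player_id", "")).strip()
    runs = str(raw.get("runs", "")).strip()
    if not player_id or not runs.isdigit():
        return None
    return {"player_id": player_id, "runs": int(runs)}


def enrich(rows: list, profiles: dict) -> list:
    """Attach profile fields to every cleaned row that has a known player."""
    enriched = []
    for raw in rows:
        row = clean_row(raw)
        if row is None:
            continue  # drop malformed rows
        profile = profiles.get(row["player_id"])
        if profile is None:
            continue  # drop rows with no matching dimensional data
        enriched.append({
            **row,
            "name": profile.name,
            "role": profile.role,
            "team": profile.team,
        })
    return enriched
```

In a production pipeline the rejected rows would more likely be logged or quarantined than silently dropped; the sketch skips them only to stay short.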
From the database, their machine learning pipeline processes the data with multiple in-house ML algorithms to derive further features, such as player ratings and match performances. The enriched data, along with the output of these subsidiary algorithms, is then fed into an artificial neural network (ANN), which produces the match predictions.
They use a wide range of models and algorithms to make predictions about the matches. For instance, they apply transfer learning: a section of the ANN, coupled with a rules-based engine, is used to make predictions for Daily Fantasy Sports (presently only cricket).
For the subsidiary algorithms, they use techniques akin to k-Nearest Neighbours, probabilistic programming, Markov chains and Elo ratings to generate intermediate data points about team and player rankings and performances. These are generally used to better understand historical matches.
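Of the techniques listed, Elo ratings are the simplest to illustrate. The sketch below shows the standard Elo update for a single match between two teams. The K-factor of 20 and the function names are assumptions made for illustration; the article does not say how Predict22 tunes or adapts its ratings.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected result for side A (between 0 and 1) under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return the new ratings for A and B after one match.

    score_a is 1.0 if A won, 0.5 for a tie and 0.0 if A lost.
    k (an assumed value here) controls how fast ratings move.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta
```

For two teams both rated 1500, a win moves the winner up by half the K-factor and the loser down by the same amount. An upset win over a much stronger side moves the ratings further, which is what makes the rating useful as an intermediate signal of team strength.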
“We use the context of the match throughout our algorithms. This means that a 100 scored in a 200-run game is more valuable to the team compared to a 100 scored in a 350-run game. Similarly, picking up the wicket of Kohli is more impactful than picking up the wicket of Bumrah.
We are trying to fill in the context gap within the stats that historically exist to enrich the question bank that we can answer,” he explained.
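One simple way to express this kind of context is to scale a raw stat by the situation it occurred in. The sketch below is an illustrative assumption, not Predict22's formula: `innings_value` measures runs as a share of the team total, and `wicket_value` weights a dismissal by a hypothetical batting rating of the dismissed player relative to a baseline.

```python
def innings_value(player_runs: int, team_total: int) -> float:
    """Runs expressed as a share of the team's total (assumed context measure)."""
    if team_total <= 0:
        raise ValueError("team_total must be positive")
    return player_runs / team_total


def wicket_value(batter_rating: float, baseline_rating: float) -> float:
    """Weight of a wicket relative to dismissing a baseline batter.

    Both ratings are hypothetical inputs; a specialist batter would carry
    a higher rating than a tail-ender.
    """
    if baseline_rating <= 0:
        raise ValueError("baseline_rating must be positive")
    return batter_rating / baseline_rating


# A century in a 200-run innings is half the team's runs...
low_scoring = innings_value(100, 200)
# ...while the same century in a 350-run innings is a smaller share.
high_scoring = innings_value(100, 350)
```

With these definitions the same 100 runs counts for more in the 200-run game than in the 350-run game, and dismissing a highly rated batter counts for more than dismissing a tail-ender.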
They use TensorFlow for most of their modelling needs. The subsidiary algorithms are developed and tested in both R and Python. They also use AWS to host their website and database servers.
Which Algorithm Has Been More Useful?
Bindlish shares that for predicting games, a top-down approach works better than a bottom-up model, which involves too many interactions between factors that either lack data or have no quantifiable outcome.
With the top-down approach, team performance outweighs player performance. This allowed them to expand into football and basketball easily, as team data holds more weight than player data. For the Daily Fantasy problem, the opposite had to be done: team performance holds little value compared with player performance.
“However, we quickly realised that the trained ANN above had a lot of implicit knowledge about the game. We used that knowledge and combined it with a rules-based engine we developed to train the algorithm on 1000s of previous matches and help us pick a winning team,” he explained.
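The article does not describe the rules or how the ANN's output feeds them, so the following is only a sketch of how model scores and a rules engine might be combined. It assumes each player already has a predicted `points` value (standing in for the model's output) and applies two hypothetical fantasy rules: a minimum number of players per role and a cap on players from one side.

```python
def pick_team(players, team_size, min_per_role, max_per_side):
    """Greedy pick of the highest-scoring players that satisfies the rules.

    players: list of dicts with "name", "role", "side" and "points" keys.
    Raises ValueError if the rules cannot be met by this greedy pass.
    """
    if sum(min_per_role.values()) > team_size:
        raise ValueError("role minimums exceed team size")
    ranked = sorted(players, key=lambda p: p["points"], reverse=True)
    chosen, side_count = [], {}

    def can_add(p):
        return p not in chosen and side_count.get(p["side"], 0) < max_per_side

    def add(p):
        chosen.append(p)
        side_count[p["side"]] = side_count.get(p["side"], 0) + 1

    # Rule 1: fill each role's minimum with its best available players.
    for role, minimum in min_per_role.items():
        for p in ranked:
            if sum(1 for c in chosen if c["role"] == role) >= minimum:
                break
            if p["role"] == role and can_add(p):
                add(p)
        if sum(1 for c in chosen if c["role"] == role) < minimum:
            raise ValueError(f"not enough eligible players for role {role!r}")

    # Rule 2: fill the remaining slots by predicted points, respecting the cap.
    for p in ranked:
        if len(chosen) >= team_size:
            break
        if can_add(p):
            add(p)
    if len(chosen) < team_size:
        raise ValueError("could not fill the team under the given rules")
    return chosen
```

A greedy pass like this is easy to follow but is not guaranteed to find the best legal team; an exact approach would treat the selection as a constrained optimisation problem.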
The results so far are promising, but the team expects the performance to get better soon.
Predict22 currently has around 8,50,000 unique users, and during IPL 2019 alone it had around 5,00,000 unique users. This indicates that the average Indian cricket fan is ready for a more analytical element in the overall experience. As the sports industry matures, fans are becoming more than just passive consumers. Everybody is looking for something to enhance the experience, whether it’s fantasy sports or fan forums, and Predict22 has something for every sports fan.
For the coming year, they are going to introduce a lot of new features to enhance the sports experience across all three sports, such as player performance, prediction analysis and stats that are easier than ever to navigate.
That said, AI’s involvement in the Indian sports industry is still at a nascent stage, though it is already being used in certain aspects of the game. “The growth is undeniable and that’s exactly where our international competition is heading. If Indian sports is going to stay competitive, we must focus on gaining the crucial insights that AI can provide,” he says on a concluding note.