What forms the core of businesses today? Huge volumes of data flow in and out every day, and though that volume matters, what really comes into play is the ability to use data and models to make better business decisions. UpGrad, a leading edtech startup, has collaborated closely with Uber to develop content for its Data Analytics course.
Sai Alluri, Analytics Lead at Uber India and Industry Expert at UpGrad, talks about how the supply positioning model, segmentation and visualization tools are applied at Uber, and how Uber stays on top of its game by understanding the biggest mismatches between supply and demand.
From predicting the future to preparing for it, here are the top three reasons why this one is for keeps:
- Get a peek into how Uber analyses historical data, uses it as a benchmark and predicts future action
- Get pointers from the best in the industry: Sai Alluri, part of Uber’s PRO team
- Learn how to leverage analytics to stave off competition
Supply Optimization at Uber
The supply positioning model at Uber refers to anticipating demand patterns and placing driver partners across high-demand hubs with the aim of meeting the demand, lowering ETAs and increasing overall efficiency. One of the key focus areas is moving from a passive supply-positioning model to an active one through specific recommendations across the network.
How is supply positioning done at Uber?
In the words of Alluri, supply optimization is one of the biggest focuses at Uber, and the challenge is to position supply efficiently wherever demand is high. One of the methodologies is the search surge, which works in real time, meaning that supply is drawn to the area of highest demand. For example, a search surge multiple of 2x or 3x indicates how much demand there is in that particular area and how much supply is needed to meet it.
Building models based on historical data
Uber analyzes historical data for, say, the last three or four weeks and identifies pockets within the city that witness extremely high demand. Let’s take Gurgaon as a case in point. “Say there is a high search multiple in Connaught Place and our driver partner is in Gurgaon which is X kms from CP. It is very difficult for a driver to move from Gurgaon to CP given the traffic conditions and it might take him longer to reach. How do we know in advance where this demand is going to be based on historical data?” shares Alluri.
Here are some of the key steps on how the model was built:
- Look at historical data for the last three or four weeks
- Look at the time, day and specific areas within the city where the highest demand comes in
- The key metric is the number of requests coming in and how many of them are completed in different pockets of the city
- If a specific pocket has a really low ratio of completed trips to requests, it implies high demand in that hub but not enough supply
- The next step is to proactively tell drivers to move to these areas, not in real time but with a two- or three-hour lead, so that they can position themselves there when the demand arises
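The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the request log, its dates and the 0.5 cut-off are made up, and the article does not describe Uber's actual data model or threshold.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical request log: (pocket, request time, was the trip completed?)
REQUESTS = [
    ("Connaught Place", "2016-05-02 09:15", True),
    ("Connaught Place", "2016-05-02 09:40", False),
    ("Connaught Place", "2016-05-09 09:05", False),
    ("Connaught Place", "2016-05-09 09:50", False),
    ("Gurgaon", "2016-05-02 09:10", True),
    ("Gurgaon", "2016-05-02 09:30", True),
    ("Gurgaon", "2016-05-09 09:20", True),
    ("Gurgaon", "2016-05-09 09:45", False),
]


def completion_ratios(requests):
    """Return {(pocket, weekday, hour): completed trips / requests}."""
    requested = defaultdict(int)
    completed = defaultdict(int)
    for pocket, stamp, done in requests:
        ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        key = (pocket, WEEKDAYS[ts.weekday()], ts.hour)
        requested[key] += 1
        if done:
            completed[key] += 1
    return {key: completed[key] / requested[key] for key in requested}


def undersupplied(ratios, threshold=0.5):
    """Slots whose completion ratio is below the (assumed) threshold."""
    return sorted(key for key, ratio in ratios.items() if ratio < threshold)


ratios = completion_ratios(REQUESTS)
flagged = undersupplied(ratios)
for pocket, weekday, hour in flagged:
    print(f"{pocket}: low completion ratio on {weekday} at {hour}:00")
```

The flagged slots are the ones where drivers would be told in advance to position themselves.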
Supply Positioning in a Nutshell
Uber does supply positioning by a) breaking down the city into multiple pockets, b) identifying the pockets of interest based on the demand parameters that show up, and c) working out how to position supply in those specific pockets.
“Say for example, a specific pocket has a low completed-request ratio or fewer rides completed as compared to other areas; what should be done is to ensure that drivers get to the demand hub in time,” says Alluri.
The analysis is broken up along three key parameters: hour of day, day of week and specific pocket.
Meeting the Demand Supply Gap with Predictive Analytics
So now that you have the information, how do you use it to inform future decisions? In the case of Uber, the real challenge is in filling the demand-supply gap. “The idea is to figure out if the highest area of demand is in one specific pocket but the supply is going to come in from a different pocket. Which means we need to send this message to driver-partners early so that they can get to this specific area and are ready to go when the demand hits,” points out Alluri.
At Uber, this analysis is automated to drive the following results:
- Uber sends out weekly communications to drivers in real time
- Weekly communications inform about high demand areas, with specific recommendations
- These recommendations enable driver-partners to make the best decisions, increase earnings and lower ETAs
Objective of historical analysis – build forecasting model
Alluri explains that the idea behind analyzing three to four weeks of data for a specific city, broken down further by hub or pocket within the city, by hour of day and by day of week, is to find consistent behaviour across that time period for that particular pocket. The motive is to set a benchmark and rule out weekly anomalies. The benchmark is then used to build a forecasting model that predicts where demand will be highest or supply lowest, and the model is revised on a weekly or bi-weekly basis as the data changes.
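One simple way to set such a benchmark is to take the median across weeks for each day-and-hour slot, since a median is barely moved by a single unusual week. The sketch below assumes this; the article does not say which statistic Uber uses, and the counts are invented for illustration.

```python
from statistics import median

# Hypothetical weekly request counts for one pocket, keyed by (weekday, hour).
# Each list holds one value per week, oldest first.
HISTORY = {
    ("Friday", 18): [120, 115, 310, 125],  # week 3 looks like a one-off spike
    ("Friday", 19): [140, 150, 145, 155],
    ("Sunday", 6): [20, 25, 22, 18],
}


def benchmark(history):
    """Median across weeks per slot; a single odd week barely moves it."""
    return {slot: median(counts) for slot, counts in history.items()}


def peak_slot(baseline):
    """Slot with the highest benchmark demand."""
    return max(baseline, key=baseline.get)


baseline = benchmark(HISTORY)
print(baseline)
print("Expected peak:", peak_slot(baseline))
```

With a plain average, the one-off spike on Friday at 18:00 would make that slot look like the peak. Re-running the benchmark each week on the latest three or four weeks gives the rolling update described above.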
A/B Testing & Clustering/Segmentation Analysis
At Uber, the goal is to drive efficiency across all areas of the business. A/B testing was used to find the most effective communications to send to driver-partners in order to address their issues and, through incentives, convert them into loyal Uber partners.
“We want to make the process for a driver-partner signing up on our platform easy and scalable, so that they can reach out to us for specific issues, such as using the app. For example, as soon as the driver becomes active on our system, we want to make sure if he has any questions pertaining to how do you go online or how do you essentially go pick up your customer. So we monitor every aspect of this journey map at different cycles,” says Alluri.
The communication dispatch was targeted at converting drivers into loyal Uber partners. An A/B test was set up for two specific cohorts of drivers who had joined in the same week. Say there are 100 drivers in cohort A and another 100 in cohort B.
- The idea is to find out how many drivers don’t take a trip in the first 3-4 days
- Reach out with specific communications to cohort A drivers who still haven’t been activated
- Check whether the communication improved efficiency and drove conversions vis-à-vis cohort B, which did not receive any messaging
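To judge whether a difference between the two cohorts is more than noise, one common choice is a two-proportion z-test. The sketch below assumes that test and uses made-up conversion counts; the article does not say how Uber evaluated its results.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both cohorts convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_a - p_b, z, p_value


# Hypothetical outcome: drivers who took a first trip after the window.
# Cohort A received the messaging; cohort B did not.
lift, z, p_value = two_proportion_z_test(conv_a=62, n_a=100, conv_b=45, n_b=100)
print(f"lift={lift:.2f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the messaging, rather than chance, explains the lift. With cohorts of only 100 drivers each, modest differences will often fail to reach significance.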
The goal of the A/B test was to use resources, in this case communications and incentives, effectively:
- Lift conversions, urge drivers to become activated and turn from part time to full time
- Find out which communication channel is most effective (text or more personalized calls)
- Find out what the content should be and how to build the iterative process
Clustering Analysis basically means breaking up huge data sets into further subsets to help get better insights into critical decision areas. “What happens with clustering/ segmentation analysis is that, it is an iterative process, you keep building into the model and keep finding data sets so that you gain smarter and stronger insights,” notes Alluri.
In this case, segmentation was based on hours and trips. Alluri shares how the model was further optimized to include trips and how it led to increased revenue for drivers. “When we started this model initially it was meant as a question analysis and we used hours, that driver partners were putting in on a weekly basis or a daily basis as a variable. But as the model became smarter we wanted to include trips also to ensure that drivers that are driving at night or just part time at night are not coming online for just 4-5 hours but are able to get trips, end result being they are engaged on our platform,” he says.
The end result was:
- Helping part time drivers find trips at night (we don’t want a driver coming online at a wrong time)
- Helping them achieve their running targets and thereby their revenue goals
- Boosting loyalty, converting from part time to full time (achieving day-time trips as well)
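A segmentation on hours and trips can be illustrated with simple rules. Note that this is a rule-based sketch rather than a clustering algorithm, and both cut-offs (40 hours a week, one trip per hour online) are assumptions chosen for illustration, not Uber's values.

```python
from dataclasses import dataclass


@dataclass
class DriverWeek:
    driver_id: str
    hours_online: float
    trips: int


# Both cut-offs are illustrative assumptions.
FULL_TIME_HOURS = 40
MIN_TRIPS_PER_HOUR = 1.0


def segment(week):
    """Label a driver-week using hours online and trips per hour online."""
    if week.hours_online == 0:
        return "inactive"
    workload = "full-time" if week.hours_online >= FULL_TIME_HOURS else "part-time"
    trips_per_hour = week.trips / week.hours_online
    engagement = "engaged" if trips_per_hour >= MIN_TRIPS_PER_HOUR else "underused"
    return f"{workload}, {engagement}"


weeks = [
    DriverWeek("d1", 45, 60),
    DriverWeek("d2", 20, 8),   # online for hours but finding few trips
    DriverWeek("d3", 18, 25),
    DriverWeek("d4", 0, 0),
]
segments = {w.driver_id: segment(w) for w in weeks}
for driver_id, label in segments.items():
    print(driver_id, "->", label)
```

Using hours alone, drivers d2 and d3 would land in the same part-time segment. Adding trips separates the driver who is online but not getting rides, which is the group the model was extended to catch.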
SQL still triumphs in Data Analytics
“Data warehousing is set in a way that we can do analysis on it, so it is easy for city teams and analysts to go into this data, get what you need to figure out what the biggest problems/ issues are in those specific areas and how to go about fixing it,” explains Alluri.
Alluri explains why SQL is preferred in analytics:
- There are no manual mistakes
- Write the query you want, find out what information you need and run the logic in that query
- Once you get the file, it is ready to share, and you can keep adding analysis to it
- Automate it using either R or Python and gather information sets that are more useful
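As a rough illustration of writing the logic once in SQL and automating it from Python, the sketch below runs a query against an in-memory SQLite database. The table name, columns and rows are hypothetical; the article does not describe Uber's warehouse schema or tooling.

```python
import sqlite3

# In-memory database standing in for a data warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (pocket TEXT, hour INTEGER, completed INTEGER)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        ("Connaught Place", 9, 1),
        ("Connaught Place", 9, 0),
        ("Connaught Place", 9, 0),
        ("Connaught Place", 9, 0),
        ("Gurgaon", 9, 1),
        ("Gurgaon", 9, 1),
        ("Gurgaon", 9, 0),
        ("Gurgaon", 9, 1),
    ],
)

# The whole analysis lives in one query, so no step is done by hand.
QUERY = """
SELECT pocket,
       hour,
       COUNT(*) AS requests,
       SUM(completed) AS completed_trips,
       ROUND(1.0 * SUM(completed) / COUNT(*), 2) AS completion_ratio
FROM requests
GROUP BY pocket, hour
ORDER BY completion_ratio ASC
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
conn.close()
```

Because the logic sits in the query, the same script can be scheduled to run every week and its output shared or extended with further analysis.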
Visualization at Uber
Visual analytics is used at Uber to make data more actionable and easier to understand. In India, one of the tools used by Uber’s city teams is heat maps, which show exactly where the biggest mismatch between supply and demand is. Our team uses visualization layers on most business insight applications to follow the sequence of data flowing in.
About Sai Alluri:
Sai Alluri holds a degree in Mechanical Engineering from University of Illinois at Urbana-Champaign. He worked in consulting before joining Uber in San Francisco, California. He shifted to India last year to set up a team and focus on operational and analytical challenges in India. Sai is on the industry expert panel for UpGrad & IIIT-B PG Diploma Program in Data Analytics.