The accuracy of state-of-the-art object detection systems is often under scrutiny, and for obvious reasons. From unlocking phones to self-driving cars, object detection is almost everywhere.
As computer vision applications grow in popularity, it has become crucial to keep their flaws in check, or at least to detect them in the first place. These flaws usually stem from poor data collection strategies and from biases, both inductive and engineered. A wrongly identified image can produce results that spiral a business into catastrophe.
Object detection has also been used to build profitable business ventures. One such example is the billion-dollar hospitality venture Airbnb.
Brian Chesky co-founded the peer-to-peer room and home rental company Airbnb with Nathan Blecharczyk and Joe Gebbia in 2008. Now, almost 11 years later, Airbnb has been used by more than 300 million people in 81,000 cities across 191 countries.
The Airbnb platform has millions of listings covering living spaces around the world. The quality of these listings is a top priority, as customers' choices depend greatly on the aesthetics of the interiors. To ensure that quality, Airbnb had to determine whether the amenities advertised online match the actual ones.
For Airbnb, knowing that kitchenware exists in a picture does not say much about the type of room. Likewise, knowing there is a table in the picture doesn’t help either.
The goal here is to understand whether the detected amenities provide convenience for guests and can assist the customer in decision making. For example, a family trip might call for a room with a spacious kitchen, whereas a bachelor’s trip might not.
The search results should allow the customer to draw more insights. The Airbnb platform is a fine example of using machine learning algorithms to improve the user experience.
Achieving High Quality Detection
The picture above is a sample amenity detection result from a third-party API service offered by an industry-leading vendor.
The picture below depicts amenity detection using the Airbnb API:
The detail in the Airbnb results looks more insightful. To achieve this, data scientists at Airbnb had to face several data challenges. From finding suitable annotated data to cleaning it, and from manual classification to developing a robust model, the hurdles were plenty.
Defining a taxonomy that encompasses amenities (kitchenware, furniture, etc.) is an open-ended problem. Because the taxonomy was unclear, the data science team at Airbnb began with something lightweight.
They found the Open Images Dataset V4, which offers a vast amount of image data. It has more than 9 million images annotated with image-level labels, object bounding boxes (BB), and visual relationships.
The team also manually reviewed the dataset's 600 classes and selected around 40 that were relevant to the use case.
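The class selection step can be pictured as a simple filter over bounding-box annotations. The following is a minimal sketch, not Airbnb's actual pipeline: the class names, the annotation layout and the function name are all assumptions made for illustration, since the article does not list the roughly 40 classes that were chosen.

```python
from collections import Counter

# Hypothetical subset of relevant amenity classes (the article does not list them).
RELEVANT_CLASSES = {"Bed", "Sink", "Oven", "Television"}


def filter_annotations(annotations, relevant=RELEVANT_CLASSES):
    """Keep only bounding-box annotations whose label is in the relevant set."""
    return [a for a in annotations if a["label"] in relevant]


# Toy annotations; boxes are (xmin, ymin, xmax, ymax) in relative coordinates.
annotations = [
    {"image_id": "img1", "label": "Bed", "box": (0.1, 0.2, 0.8, 0.9)},
    {"image_id": "img1", "label": "Cat", "box": (0.4, 0.5, 0.6, 0.7)},
    {"image_id": "img2", "label": "Oven", "box": (0.2, 0.3, 0.5, 0.8)},
    {"image_id": "img3", "label": "Bed", "box": (0.0, 0.1, 0.9, 0.95)},
]

kept = filter_annotations(annotations)

# Counting boxes per class gives a quick view of how balanced the classes are.
class_counts = Counter(a["label"] for a in kept)
```

Counting the kept boxes per class is also a cheap way to spot under-represented amenities before adding more data.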
To label the data, the team used Google's data labelling service. To ensure the diversity of the dataset, some in-house data was added to the public data, resulting in evenly distributed amenity classes.
For training, two pre-trained models were chosen:
The accuracy of the pre-trained models was tested on a 10% held-out set containing 7.5k images across 30 object classes.
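A held-out split of this kind can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the seed and the synthetic image ids are hypothetical, and the total of 75,000 images is chosen so that 10% yields the 7.5k held-out images the article mentions.

```python
import random


def split_held_out(image_ids, held_out_fraction=0.1, seed=0):
    """Shuffle image ids reproducibly and split off a held-out evaluation set."""
    ids = sorted(image_ids)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(ids)
    n_held_out = round(len(ids) * held_out_fraction)
    # Return (training ids, held-out ids); the two lists never overlap.
    return ids[n_held_out:], ids[:n_held_out]


# Synthetic ids standing in for real image files (assumption for illustration).
image_ids = [f"img{i:05d}" for i in range(75000)]
train_ids, held_out_ids = split_held_out(image_ids)
```

Fixing the seed keeps the held-out set identical across runs, so different models are compared on the same images.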
Mean Average Precision (mAP) was the metric used. It is computed by taking the average precision for each object class and then averaging those values across all classes.
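To make the metric concrete, here is a minimal sketch of mAP on toy data. It uses the simple non-interpolated form of average precision and assumes each detection has already been marked as a true or false positive (in practice that matching is done by box overlap, and common benchmarks use interpolated variants). The function names and the numbers are illustrative and are not taken from Airbnb's evaluation.

```python
def average_precision(detections, num_ground_truth):
    """Non-interpolated average precision for one class.

    detections: list of (confidence, is_true_positive) pairs
    num_ground_truth: number of ground-truth boxes for the class
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
            # Precision measured at the rank of each correct detection.
            precision_sum += true_positives / rank
    return precision_sum / num_ground_truth


def mean_average_precision(per_class):
    """Average the per-class AP values over all classes."""
    aps = [average_precision(dets, n_gt) for dets, n_gt in per_class.values()]
    return sum(aps) / len(aps)


# Toy input: class name -> (detections, number of ground-truth boxes).
per_class = {
    "Bed": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "Oven": ([(0.95, True), (0.6, False)], 2),
}
map_score = mean_average_precision(per_class)
```

In this toy case the AP is 5/6 for "Bed" and 1/2 for "Oven", so the mAP is 2/3.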
Broadening The Scope Of Object Detection
The team also experimented with Google AutoML Vision and, to their surprise, found the results promising.
They were able to train an object detection model on 75k annotated images within 3 days.
They also found model deployment on AutoML to be easy, as the model was turned into an online service that people could use through a REST API or a few lines of Python code. Beyond amenity detection, object detection with a broader scope is another important area that Airbnb is counting on going forward.
This is due to the promising results of broad-scope object detection, which gave the data scientists the content moderation needed to prevent things like weapons and large human faces from being exposed without protection, and in a way helps Airbnb become a smarter and safer home-sharing platform.
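One way to picture such a moderation rule is as a check over the detector's output for each image. The sketch below is purely illustrative: the class names, the 25% area threshold and the function names are assumptions, and the article does not describe how Airbnb actually implements moderation.

```python
# Hypothetical labels and threshold, chosen only for illustration.
SENSITIVE_CLASSES = {"Weapon"}
FACE_CLASS = "Human face"
MAX_FACE_AREA_FRACTION = 0.25


def box_area(box):
    """Area of an (xmin, ymin, xmax, ymax) box in relative [0, 1] coordinates.

    With relative coordinates the area equals the fraction of the image covered.
    """
    xmin, ymin, xmax, ymax = box
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def needs_review(detections):
    """Flag an image if it shows a sensitive object or a face filling much of the frame."""
    for det in detections:
        if det["label"] in SENSITIVE_CLASSES:
            return True
        if det["label"] == FACE_CLASS and box_area(det["box"]) > MAX_FACE_AREA_FRACTION:
            return True
    return False


# Toy detector outputs for three images.
close_up = [{"label": "Human face", "box": (0.1, 0.1, 0.9, 0.9)}]
background_face = [{"label": "Human face", "box": (0.4, 0.4, 0.5, 0.5)}]
weapon_photo = [{"label": "Weapon", "box": (0.2, 0.2, 0.3, 0.3)}]

flags = [needs_review(d) for d in (close_up, background_face, weapon_photo)]
```

A flagged image would then go to whatever protection step the platform applies, such as blurring or manual review.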