How To Build A Successful Machine Learning Product Like LinkedIn

Illustration: Office Foosball, LinkedIn Hong Kong

Productivity and social networking are two terms that don’t always go hand-in-hand, at least for millennials. But the emergence of LinkedIn has changed that.

Most graduate students and working professionals know the importance of knowing someone in the field they aspire to join. By breaking down the barriers to free-flowing conversation, LinkedIn gave the common person access to the professional sector in ways that were previously out of reach.

With more than 200 million users logging in every month, the platform faces the demanding task of collecting, handling and serving data accurately and without latency. Though many conventional on-demand services and state-of-the-art machine learning models exist, the engineers at LinkedIn were flexible enough to combine these conventional strategies with tools built in-house to drive better results.

Building ML Models To Scale

The blueprint of a machine learning model consists of more or less the same steps: data collection, data processing, model training and testing, and so on.

Firms planning an ML transition can get an idea of how to scale their productivity by taking a look at LinkedIn’s approach:

  • The most crucial data for LinkedIn concerns the jobs a member likes, the jobs they save and the connections they make. Recommending jobs to an individual and calculating the probability of a job posting being viewed are therefore among the most important uses of the dataset.
  • At LinkedIn, the ML team starts by building a domain-specific language (DSL) and then uses Jupyter notebooks to integrate the selected features and tune parameters.
  • Most of the model training occurs offline, with the ML teams retraining models every few hours on Hadoop. LinkedIn’s own Pro-ML training service is updated with newer model types for hyperparameter tuning, and it leverages Azkaban and Spark to ensure that there is no missing input data.
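The last step above can be sketched in code. This is a hedged illustration, not LinkedIn’s actual Azkaban/Spark workflow: before each scheduled offline run, the job checks that every expected input partition exists, and fails fast if anything is missing. The input names and directory layout are hypothetical.

```python
import os

# Hypothetical input datasets a job-recommendation training run depends on.
EXPECTED_INPUTS = ["jobs_liked", "jobs_saved", "connections"]

def missing_inputs(base_dir, date_str):
    """Return the input partitions that are absent for this run's date."""
    return [name for name in EXPECTED_INPUTS
            if not os.path.isdir(os.path.join(base_dir, name, date_str))]

def run_training(base_dir, date_str, train_fn):
    """Run train_fn only when all input data is present."""
    missing = missing_inputs(base_dir, date_str)
    if missing:
        # A real workflow engine would fail the job and alert the team here.
        raise RuntimeError(f"missing input data: {missing}")
    return train_fn(base_dir, date_str)
```

The point of the guard is that a model silently trained on partial data is worse than a loudly failed run, which is what the Azkaban/Spark check described above is guarding against.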

The term “ML model” refers to the model artefact created by the training process. The training data must contain the correct answer, known as the target or target attribute.

The learning algorithm finds patterns in the training data that map the input data attributes to the target (the answer to be predicted), and it outputs an ML model that captures these patterns.
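To make the idea of mapping attributes to a target concrete, here is a toy example, entirely hypothetical and not LinkedIn’s code: a tiny perceptron trained on made-up job-click records, where `clicked` is the target attribute and the other fields are input attributes.

```python
def train_perceptron(rows, target, features, epochs=20, lr=0.1):
    """Learn weights that map the input attributes to the target label."""
    weights = {f: 0.0 for f in features}
    bias = 0.0
    for _ in range(epochs):
        for row in rows:
            score = bias + sum(weights[f] * row[f] for f in features)
            pred = 1 if score > 0 else 0
            err = row[target] - pred  # the target holds the "correct answer"
            bias += lr * err
            for f in features:
                weights[f] += lr * err * row[f]
    return weights, bias

# Hypothetical training data: did the member click the recommended job?
data = [
    {"skill_match": 1, "same_city": 1, "clicked": 1},
    {"skill_match": 1, "same_city": 0, "clicked": 1},
    {"skill_match": 0, "same_city": 1, "clicked": 0},
    {"skill_match": 0, "same_city": 0, "clicked": 0},
]
w, b = train_perceptron(data, target="clicked",
                        features=["skill_match", "same_city"])
```

The learned weights are the “captured patterns”: here the model discovers that `skill_match`, not `same_city`, predicts a click. Real ranking models at this scale are of course far more sophisticated.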

A model can have many dependencies, and to make sure all of its features are available both offline and online at deployment, all of this information is stored in a central repository.

“The deployment service provides orchestration, monitoring, and notification to ensure that the desired code and data artefacts are in sync. The deployment also ties with the experimentation platform to make sure that all active experiments have the required artefacts in the right targets in the overall system,” says the ML developers team at LinkedIn.
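The sync check the team describes can be sketched minimally. This is a hypothetical illustration of the idea, not LinkedIn’s deployment service: given the desired code and data artefact versions, report which target hosts disagree.

```python
def out_of_sync_hosts(desired, reported):
    """desired: {"code": ..., "data": ...} artefact versions.
    reported: {host: {"code": ..., "data": ...}} as seen on each target.
    Returns the hosts whose artefacts differ from the desired versions."""
    return sorted(host for host, seen in reported.items() if seen != desired)
```

A monitoring loop running a check like this is what lets the deployment service notify owners when code and data artefacts drift apart, rather than letting a stale model serve live traffic.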

In addition to the aforementioned training services, LinkedIn built Quasar, a custom execution engine that runs the domain-specific language (DSL), along with a Java API for composing online workflows and running recommendation engines.

To handle large chunks of data, LinkedIn deploys Frame, a system that holds metadata about the features in the centralised repository, making it easy for engineers to search for them.
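A toy version of that idea, with hypothetical feature names and no relation to Frame’s real API, looks like this: each feature is registered once with its metadata in a central catalogue, and engineers search by keyword instead of digging through code.

```python
class FeatureCatalogue:
    """Minimal sketch of a central feature-metadata store."""

    def __init__(self):
        self._features = {}

    def register(self, name, owner, description):
        """Record a feature's metadata in the catalogue."""
        self._features[name] = {"owner": owner, "description": description}

    def search(self, keyword):
        """Return feature names whose name or description matches keyword."""
        kw = keyword.lower()
        return sorted(name for name, meta in self._features.items()
                      if kw in name.lower() or kw in meta["description"].lower())
```

The value of such a catalogue is discoverability: a second team looking for a “connections” feature finds the existing one instead of recomputing it.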

 

Structure Of Pro-ML

Main areas of focus via LinkedIn

LinkedIn took an unconventional approach to organising its teams for maximising ML productivity: its AI teams are closely connected to the product teams. This bridges the gap, letting researchers collaborate and share their findings with fellow experts who might be working on similar problems, reducing redundancy and increasing output.

Key Ideas

  • Leverage and improve best-of-breed components from our existing code base to the maximum extent feasible.
  • Use an agile-inspired strategy, making one product line better at a time.
  • Enable services hosting the models to be upgraded independently without breaking their downstream or upstream services.
  • Enable new technologies to be A/B testable in production.
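The last key idea, making new technologies A/B testable in production, is commonly implemented by hashing each member into a stable traffic bucket. The sketch below shows that general pattern with hypothetical names; it is not LinkedIn’s experimentation platform.

```python
import hashlib

def assign_variant(member_id, experiment, treatment_pct=10):
    """Deterministically assign a member to 'treatment' or 'control'.

    Hashing (experiment, member_id) gives each member a stable bucket in
    0-99, so the same member always sees the same model variant."""
    digest = hashlib.sha256(f"{experiment}:{member_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return "treatment" if bucket < treatment_pct else "control"
```

Because assignment is deterministic, the serving layer can route, say, 10% of requests to a candidate model and compare its metrics against the control before a full rollout.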

The goal of Pro-ML is to double the effectiveness of machine learning engineers while simultaneously opening up AI and modelling tools to engineers from across the LinkedIn stack.

LinkedIn’s homegrown technologies have paved the way to a faster and better ML approach. Currently, enterprises struggle to deploy machine learning models at full scale; common problems include talent search, team building, data collection and model selection, to name a few. To get the most out of AI, it is necessary to build service-specific tools and frameworks in addition to existing models, and LinkedIn’s success verifies the same.

Ram Sagar

I have a master's degree in Robotics and I write about machine learning advancements.