As one of the most popular and easy-to-use languages among data scientists, Python has emerged as one of the most sought-after tools for building a wide range of applications. From Facebook to Dropbox, a number of leading tech companies are known to use the language to advance their product offerings.
Against the backdrop of the recently concluded PyCon conference in the US, Netflix shared a sampling of how the world’s biggest online streaming service is using Python to keep its platform running seamlessly.
Netflix is believed to have as many as 148 million users worldwide, and the company uses Python extensively through the full content lifecycle, from deciding which content to fund all the way to operating its content delivery network (CDN), which is responsible for delivering the final videos to its users.
“Python has long been a popular programming language in the networking space because it’s an intuitive language that allows engineers to quickly solve networking problems. Subsequently, many useful libraries get developed, making the language even more desirable to learn and use,” Netflix said in a blogpost.
As pointed out by the streaming giant, here is how the company uses Python.
Open Connect, Netflix’s content delivery network, forms one of the most crucial elements of the platform. According to the company, content delivery falls into two main segments. Everything that happens before the user clicks the play button, such as deciding what to recommend based on watch history and what the user is likely to watch, takes place in AWS, whereas the video streaming itself happens in the CDN.
Since the CDN requires a number of software systems for its infrastructure, Netflix has written these in Python. The network devices of the CDN are also managed by Python applications, which track the inventory of devices, their models and their hardware components. The collection of health and operational data is likewise managed by Python applications.
The Demand Engineering team is responsible for Regional Failovers, Traffic Distribution, Capacity Operations, and Fleet Efficiency of the Netflix cloud, and its tools are built on Python.
“The service that orchestrates failover uses numpy and scipy to perform numerical analysis, boto3 to make changes to our AWS infrastructure, rq to run asynchronous workloads and we wrap it all up in a thin layer of Flask APIs. The ability to drop into a bpython shell and improvise has saved the day more than once.”
The CORE team handles Netflix’s alerting and statistical analysis, and depends heavily on libraries like numpy, scipy, ruptures and pandas to automate the analysis of thousands of related signals when a problem is raised. Here Python is used to automate data exploration, cleaning and visualisation.
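To give a rough idea of what automating the analysis of many related signals can look like, here is a minimal sketch. It is not Netflix's code: the function name, the sample signals and the threshold are all hypothetical, and it uses only Python's standard `statistics` module in place of the numpy, scipy and ruptures stack named above. It flags any signal whose recent average has moved far from its baseline.

```python
from statistics import mean, stdev


def shifted_signals(signals, baseline_len, threshold=3.0):
    """Return the names of signals whose recent mean departs from the baseline.

    signals      -- dict mapping a signal name to a list of numeric samples
    baseline_len -- number of leading samples treated as "normal" behaviour
    threshold    -- how many baseline standard deviations count as a shift
    (All three parameters are illustrative assumptions.)
    """
    flagged = []
    for name, values in signals.items():
        baseline, recent = values[:baseline_len], values[baseline_len:]
        if len(baseline) < 2 or not recent:
            continue  # not enough data to compare
        spread = stdev(baseline)
        shift = abs(mean(recent) - mean(baseline))
        if spread == 0:
            # A perfectly flat baseline: any movement at all is a shift.
            if shift > 0:
                flagged.append(name)
        elif shift / spread > threshold:
            flagged.append(name)
    return flagged


# Hypothetical sample data: latency jumps, error rate stays steady.
signals = {
    "latency_ms": [100, 102, 98, 101, 99, 100, 180, 175, 190],
    "error_rate": [0.010, 0.012, 0.011, 0.009, 0.010, 0.011, 0.010, 0.011, 0.009],
}
print(shifted_signals(signals, baseline_len=6))
```

With thousands of signals, a check like this narrows attention to the handful that actually changed. A dedicated change-point library such as ruptures would also estimate when the change happened, which this sketch does not attempt.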
The platform’s operational insight, alerting, diagnostics, and auto-remediation are handled by its Insight Engineering team, which relies heavily on Python clients to deliver these services. By building Python libraries, the team enables other Netflix platform services such as Winston and Bolt to interact with each other.
“With the increased popularity of Python, the team now supports Python clients for most of their services. One example is the Spectator Python client library, a library for instrumenting code to record dimensional time series metrics,” it said.
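The phrase "dimensional time series metrics" means that each measurement carries a name plus a set of key-value tags, so the same metric can later be sliced by any tag. The sketch below illustrates only that idea. It is not the Spectator client API: the class, its methods and the metric and tag names are hypothetical, and it uses nothing beyond the standard library.

```python
from collections import Counter


class DimensionalCounters:
    """Toy in-memory store of counters keyed by metric name and tags."""

    def __init__(self):
        self._counts = Counter()

    def increment(self, name, tags=None, amount=1):
        # The tag set is part of the key, so each combination of tag
        # values gets its own independent counter.
        key = (name, frozenset((tags or {}).items()))
        self._counts[key] += amount

    def total(self, name, **match):
        # Sum every counter for `name` whose tags include all of `match`.
        wanted = set(match.items())
        return sum(
            count
            for (metric, tags), count in self._counts.items()
            if metric == name and wanted <= tags
        )


metrics = DimensionalCounters()
metrics.increment("server.requests", {"status": "200", "region": "us-east-1"})
metrics.increment("server.requests", {"status": "200", "region": "eu-west-1"})
metrics.increment("server.requests", {"status": "500", "region": "us-east-1"})

print(metrics.total("server.requests"))                      # all requests
print(metrics.total("server.requests", status="200"))        # successful only
print(metrics.total("server.requests", region="us-east-1"))  # one region only
```

A real client would also attach timestamps and publish the values to a metrics backend at intervals. Here the point is only that tags let one metric name answer several different questions.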
Netflix uses Python extensively to train its personalisation machine learning models, which range from recommendation algorithms and artwork personalisation to marketing algorithms.
“For example, some algorithms use TensorFlow, Keras, and PyTorch to learn Deep Neural Networks, XGBoost and LightGBM to learn Gradient Boosted Decision Trees or the broader scientific stack in Python (e.g. numpy, scipy, scikit-learn, matplotlib, pandas, cvxpy),” the company said.
The platform also relies on Jupyter and has developed several libraries that integrate with its vast ecosystem of data access, fact logging and feature extraction, model evaluation, and publishing.