According to the Stack Overflow Developer Survey, TensorFlow was the most wanted framework or library, garnering 73.5 percent of the votes. It is one of the fastest-growing and most popular frameworks on Stack Overflow. TensorFlow is already the most popular machine learning tool among leading tech giants like eBay, Google and Uber, which have embraced it for a long time now. The latest to join the bandwagon is Twitter, which moved to TensorFlow as its machine learning framework in July this year.
Twitter has been using its signature internal system called DeepBird, which used LuaTorch for its operations before the migration to TensorFlow. But after LuaTorch gave way to PyTorch in December last year, the support available for DeepBird was effectively restricted. So Twitter finally adopted TensorFlow as its machine learning framework.
ML Use Cases @Twitter
Twitter, like any other platform of its kind, uses machine learning in the following areas:
- Ads: To display ads according to the user’s interest.
- Timelines: To give users interesting and relevant content in their timelines.
- Abuse: To keep the information that reaches users safe for work and healthy for the platform.
- Recommendations: To recommend tweets based on the user’s interests.
Twitter deals with petabytes of data, mostly tweets. A huge number of tweets arrive every single day, and each one has to be processed within a few seconds before it is rendered on the timeline. The models therefore need to be extremely fast as well as accurate in their predictions. This is why ML is indispensable on a platform like Twitter.
How TensorFlow Helped Twitter Lend Relevance & Decode Data Records
Python is the language most preferred by data professionals. TensorFlow supports Python, which provides a great deal of flexibility in leveraging machine learning.
- With TensorBoard included, ML engineers at Twitter can now track their machine learning models more closely and make changes.
- The micro-blogging site shows tweets to Twitterati based on their relevance, and TensorFlow integrates very well with different parts of the Twitter platform.
- Also, with the new platform, the process of decoding data records has become seamless.
- With the adoption of TensorFlow, Twitter engineers are using an internal tool called Model Repo, which works towards reducing friction in developing, deploying, maintaining and refining models.
Twitter, like other big tech firms, relies on open source tools and platforms. The social media giant leverages Finagle over Thrift, which has worked better for its use cases, and it has used Hadoop for training. Its data records can store different kinds of machine learning features, for example binary features. Its build_graph function is used in three modes: training mode, evaluation mode and prediction mode.
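To make the idea of one graph-building function serving three modes more concrete, here is a minimal sketch in plain Python. It is not Twitter's DeepBird code and does not use the TensorFlow API. The `DataRecord` class, the `Mode` enum, the feature names and the signature of `build_graph` are all assumptions made for illustration, and a tiny logistic model stands in for a real network.

```python
import math
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Mode(Enum):
    TRAIN = "train"
    EVAL = "eval"
    PREDICT = "predict"


@dataclass
class DataRecord:
    """Hypothetical record holding two kinds of features and an optional label."""
    binary: Dict[str, bool] = field(default_factory=dict)
    continuous: Dict[str, float] = field(default_factory=dict)
    label: Optional[float] = None  # not available at prediction time


def build_graph(record, weights, mode, learning_rate=0.1):
    """Score one record; what is computed and returned depends on the mode."""
    # Flatten both feature kinds into a single name -> float mapping.
    features = {name: 1.0 if on else 0.0 for name, on in record.binary.items()}
    features.update(record.continuous)

    logit = sum(weights.get(name, 0.0) * value for name, value in features.items())
    prob = 1.0 / (1.0 + math.exp(-logit))
    output = {"prediction": prob}

    if mode is Mode.PREDICT:
        return output  # prediction mode needs no label and computes no loss

    if record.label is None:
        raise ValueError("a label is required in training and evaluation modes")
    eps = 1e-12  # guards against log(0)
    output["loss"] = -(record.label * math.log(prob + eps)
                       + (1.0 - record.label) * math.log(1.0 - prob + eps))

    if mode is Mode.TRAIN:
        # One gradient-descent step on the log loss; evaluation mode skips this.
        for name, value in features.items():
            gradient = (prob - record.label) * value
            weights[name] = weights.get(name, 0.0) - learning_rate * gradient
    return output


# Illustrative usage with made-up feature names.
record = DataRecord(binary={"has_image": True},
                    continuous={"follower_score": 2.0},
                    label=1.0)
weights = {}
before = build_graph(record, weights, Mode.EVAL)["loss"]
build_graph(record, weights, Mode.TRAIN)
after = build_graph(record, weights, Mode.EVAL)["loss"]
print("loss before:", before, "loss after:", after)
```

The point of the sketch is that the forward pass is written once, while the loss is computed only in training and evaluation modes and the weights change only in training mode.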
Here’s how adopting TensorFlow has helped Twitter:
- Higher engineer productivity: Using TensorFlow along with Twitter’s internal visualisation tools like Model Repo, convergence of models can be observed and adjusted easily.
- Easier ML access: DeepBird v2 provides simplified trainers and easy integration with Twitter’s technology stack, making it easier for engineers to experiment with ML.
- Better inference performance: v2’s inference performance is better than that of its predecessors.
- Improvement in model metrics: The use of DeepBird v2 on TensorFlow has improved the ML models and made them more robust.
LuaTorch Vs TensorFlow
1. Code debugging: Twitter’s earlier models were a combination of YAML, Lua Torch and C at the backend. TensorFlow is better here because the models can be easily tracked and the Python code can easily be carried over into different ML models. Therefore it’s easier to debug the code.
2. Unitise models: A benefit of using TensorFlow as opposed to YAML is that it is easier to unitise models. In the case of LuaTorch, this cannot be done without going into the backend code, whereas with TensorFlow it is easier to write and unitise models.
3. Access to a specific code: With TensorBoard it is very easy for ML engineers to see how the models are performing over time. In YAML it was difficult to pick out a specific part of the code and reuse it. TensorFlow allows this, so parts of a model can be reused. In LuaTorch the whole program had to be debugged and test cases added for it, which is difficult. TensorFlow can show the output of each layer.
4. Tools: LuaTorch did not have many visualisation tools, so the engineers had to create a lot of them themselves. TensorFlow, on the other hand, has TensorBoard, TensorFlow Model Analysis (TFMA) and TensorFlow Debugger (TFDebugger), which make the work easier.
5. Code modification: It was not easy to modify code in Torch, whereas code in TensorFlow is relatively easy to modify and debug.
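The points above about reusing parts of a model and seeing the output of each layer can be illustrated with a small sketch. It is written in plain Python rather than TensorFlow, and the layer helpers, the `run_with_trace` function and the two example models are hypothetical names chosen for illustration, not Twitter's code.

```python
import math


def dense(weights, bias):
    """Return a fully connected layer as a plain function."""
    def apply(x):
        return [sum(w * v for w, v in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return apply


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(x):
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


def run_with_trace(layers, x):
    """Run named layers in order and record every intermediate output."""
    trace = {}
    for name, layer in layers:
        x = layer(x)
        trace[name] = x
    return x, trace


# A shared block of layers that two different models reuse.
shared_tower = [
    ("hidden", dense([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0])),
    ("hidden_relu", relu),
]
ranking_model = shared_tower + [("score", dense([[1.0, 1.0]], [0.0])),
                                ("probability", sigmoid)]
filter_model = shared_tower + [("score", dense([[-1.0, 2.0]], [0.0])),
                               ("probability", sigmoid)]

output, trace = run_with_trace(ranking_model, [1.0, 3.0])
for name, value in trace.items():
    print(name, value)
```

Because each layer is an ordinary named function, a test case can target a single layer or the shared block on its own instead of the whole program, which is the kind of workflow the comparison describes.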
Twitter engineers believe that DeepBird v2 is the future of ML at Twitter. They are working to support models on the v2 platform and to train on GPU clusters. They are also working towards adding support for online as well as distributed training in v2.
Although Twitter faced several roadblocks before switching, TensorFlow has the power to unlock ML and create models for a better Twitter experience. TensorFlow is undoubtedly a very good home for ML, and the next thing to watch is who moves to this popular platform next.