Open-sourcing has become a tradition among big tech companies. It helps developers in the community understand the technologies the tech giants use, and it helps the companies find bugs and improve their software. In this article, we list five companies that have been open-sourcing their algorithms and software for the developer community.
Last month, the professional social networking site LinkedIn open-sourced a machine learning library known as Isolation Forest. The library is used by the Anti-Abuse AI Team at LinkedIn, which creates, deploys, and maintains models that detect and prevent various types of abuse, including the creation of fake accounts, member profile scraping, automated spam, and account takeovers.
The developers at LinkedIn created a Spark/Scala implementation of the Isolation Forest unsupervised outlier detection algorithm, which is currently available on GitHub. The library supports distributed training and scoring using Spark data structures, as well as model persistence on the Hadoop Distributed File System (HDFS). The Isolation Forest algorithm was chosen for several reasons: it is a top-performing unsupervised outlier detection algorithm, it is scalable, and it has low memory requirements, among others.
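To make the idea behind Isolation Forest concrete, here is a minimal single-machine sketch in plain Python. It is not LinkedIn's Spark/Scala library and does not use its API; the function names (`build_tree`, `path_length`, `anomaly_scores`) and the default parameters are assumptions chosen for illustration. The sketch relies on the core observation of the algorithm: points that can be isolated with few random splits are more likely to be outliers.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup
    over n points; used to normalise depths into a score."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split the points on a random feature and threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split further on this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node, depth=0):
    """Depth at which the point lands in a leaf, plus a correction
    for the points left unsplit in that leaf."""
    if "split" not in node:
        return depth + c_factor(node["size"])
    side = "left" if point[node["dim"]] < node["split"] else "right"
    return path_length(point, node[side], depth + 1)


def anomaly_scores(data, queries, n_trees=100, sample_size=64, seed=0):
    """Return one score in (0, 1] per query; higher means more anomalous."""
    if len(data) < 2:
        raise ValueError("need at least two training points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    scores = []
    for q in queries:
        mean_depth = sum(path_length(q, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / norm))
    return scores
```

Calling `anomaly_scores(data, queries)` with a list of numeric tuples returns one score per query point. Each tree only sees a small random subsample, which is one reason the algorithm tends to have low memory requirements and lends itself to distributed training.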
In March 2019, tech giant Microsoft open-sourced the Project Zipline compression algorithms, hardware design specifications, and Verilog source code for register transfer language (RTL), with initial content made available at the Open Compute Project (OCP) Global Summit 2019. The researchers at Microsoft developed a cutting-edge compression algorithm and optimised the hardware implementation for the types of data found in cloud storage workloads.
The Project Zipline compression algorithm yields compression ratios up to 2X higher than the commonly used Zlib-L4 64KB model. The technology is optimised for a large variety of datasets, and the released RTL allows hardware vendors to use the reference design to produce chips that deliver the highest compression, lowest cost, and lowest power from the algorithm.
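For readers unfamiliar with how a compression ratio is measured, the following sketch computes one with Python's standard `zlib` module. It does not implement Project Zipline. It assumes that "Zlib-L4 64KB" refers to zlib at compression level 4 applied to independent 64 KB blocks, which is an interpretation rather than something stated above, and the function name `compression_ratio` is hypothetical.

```python
import zlib


def compression_ratio(data: bytes, level: int = 4,
                      block_size: int = 64 * 1024) -> float:
    """Uncompressed size divided by compressed size, compressing each
    block independently (assumed reading of the 'Zlib-L4 64KB' baseline)."""
    if not data:
        raise ValueError("data must not be empty")
    compressed_size = 0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        compressed_size += len(zlib.compress(block, level))
    return len(data) / compressed_size
```

Under this definition, a ratio that is 2X higher than the baseline means the same input takes up half as many bytes after compression.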
A few weeks ago, the popular social networking site Facebook open-sourced its image- and video-matching algorithms. The photo-matching algorithm is called PDQ and the video-matching technology is called TMK+PDQF. Facebook announced the two technologies during its child safety hackathon; they help detect graphic abusive content, child exploitation, terrorist propaganda, and other such threats.
These algorithms are available on GitHub and are part of a suite of tools that Facebook uses to detect harmful content. Both technologies store files as short digital hashes and later compare those hashes with the hashes of other files to determine whether two images or videos are identical or nearly identical, which helps fight abuse on internet platforms.
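The hash-and-compare approach can be illustrated with a much simpler stand-in. The sketch below uses a basic average hash and Hamming distance; it is not PDQ or TMK+PDQF, which are considerably more sophisticated, and the function names and the `max_distance` threshold are assumptions made for this example. Images are represented as small 2D lists of grayscale values to keep the code self-contained.

```python
def average_hash(pixels):
    """Build a bit-string hash (as an int): one bit per pixel, set when
    the pixel is brighter than the image's mean brightness."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def is_near_duplicate(hash_a, hash_b, max_distance=2):
    """Treat two images as matching when their hashes differ in at most
    max_distance bits (threshold chosen arbitrarily for illustration)."""
    return hamming_distance(hash_a, hash_b) <= max_distance
```

Because only the short hashes are stored and compared, a slightly altered copy of an image, such as a uniformly brightened one, can still be matched without keeping the original file.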
In July 2019, Big Blue (IBM) open-sourced a deep learning algorithm known as PaccMann, which stands for Prediction of anticancer compound sensitivity with Multi-modal attention-based neural networks. The goal of open-sourcing this algorithm is to deepen the understanding of cancer and to equip industry and academia with knowledge that could help fuel new treatments and therapies. The researchers applied the algorithm to predict the sensitivity of cancer cell lines to known drugs, and it achieved superior predictive power compared to existing algorithms.
Last month, Big Blue took another huge step by open-sourcing the POWER Instruction Set Architecture (ISA). The expectation behind this move is that it will boost the value of IBM’s Power processors by encouraging the creation of innovative hardware components. Besides this, the tech giant also contributed other technologies, including a softcore implementation of the POWER ISA, as well as reference designs for the architecture-agnostic Open Coherent Accelerator Processor Interface (OpenCAPI) and the Open Memory Interface (OMI).
In 2015, Google open-sourced the TensorFlow software library, an end-to-end open-source platform for machine learning. The idea behind open-sourcing it is to let the machine learning community, from academic researchers to engineers to hobbyists, exchange ideas much more quickly through working code rather than just research papers.
Open-sourcing tools and systems is making a huge impact on these companies. Previously, when there were almost no open-source systems or toolkits, companies built closed products, and enhancing them or removing errors and bugs was difficult as well as time-consuming. Today, the potential of a business relies on sharing software source code, which is why tech giants like Microsoft, Google, and IBM, among others, are open-sourcing their intelligent software. Also, AI is still in its infancy, and open-sourcing not only makes it easier to develop but also gives the developer community a chance to learn and understand what is going on behind the emerging technologies.