Sony Knows The Future Of Music Is AI. This Is What It’s Doing

The use of AI to disrupt the music industry has been gaining traction of late. Earlier this year, Google demonstrated how to compress notes into lower-dimensional representations and then apply techniques such as batch normalisation and autoregressive factorisation to create new soundtracks from old ones. Although "disruption" is still hyperbole in domains steered by human creativity, one can’t help but sense the growing use of AI to augment that creativity.

So far, AI-assisted music production has been the pet project of lone researchers or the product of up-and-coming start-ups. In a somewhat unanticipated piece of news, Sony Computer Science Laboratory (CSL) in Paris has introduced DrumNet, a model that can autonomously generate kick drum tracks.

History of DrumNet

DrumNet is based on an artificial neural network which learns rhythmic relationships between different instruments and encodes these relationships in a 16-dimensional style space. 

The style of a kick drum track is determined by a 16-dimensional vector sampled from independent multivariate Gaussians. This style vector defines the relationship between the kick onsets and onsets of bass, snare, beat and downbeats. The model adjusts the tempo and timing of the output according to the input.

For time-series modelling, the common dense gated autoencoder (GAE) architecture was adapted to 1D convolution in time, yielding a Convolutional Gated Autoencoder (CGAE).

As depicted in the figure above, ‘x’ represents 1D signals of length T indicating the onset functions of instrument tracks and the beat and downbeat information of a song, while ‘y’ represents the onset function of a target instrument. The rhythmic interactions (henceforth referred to as mappings) between ‘x’ and ‘y’ are then defined as

m = W∗(U∗x·V∗y) 

Weight matrices W, U and V act as placeholders for several convolutional layers.
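The mapping equation can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not Sony CSL's implementation: each of U, V and W is reduced to a single convolutional layer with random weights, and the channel counts, kernel width and helper names (conv1d, mapping) are made up for the example. Only the 16 output channels follow the article.

```python
import numpy as np

def conv1d(signal, kernels):
    """'Same'-padded 1D convolution over time.
    signal: (C_in, T), kernels: (C_out, C_in, k) -> output: (C_out, T)."""
    c_out, c_in, _ = kernels.shape
    out = np.zeros((c_out, signal.shape[1]))
    for o in range(c_out):
        for i in range(c_in):
            out[o] += np.convolve(signal[i], kernels[o, i], mode="same")
    return out

def mapping(x, y, U, V, W):
    """m = W * (U*x . V*y): convolve both inputs, multiply element-wise, convolve again."""
    factors = conv1d(x, U) * conv1d(y, V)  # multiplicative (gating) interaction
    return conv1d(factors, W)

rng = np.random.default_rng(0)
T = 64                            # number of time steps (assumed)
x = rng.random((4, T))            # e.g. bass, snare, beat, downbeat onsets (assumed layout)
y = rng.random((1, T))            # kick drum onset function
U = rng.normal(size=(32, 4, 5))   # 32 factor channels, kernel width 5 (assumed)
V = rng.normal(size=(32, 1, 5))
W = rng.normal(size=(16, 32, 5))  # 16 mapping dimensions, as in the article

m = mapping(x, y, U, V, W)
print(m.shape)  # (16, 64)
```

Because the two convolved signals are multiplied rather than added, the mapping responds to how ‘x’ and ‘y’ relate to each other, not to either signal alone: a silent ‘y’ yields an all-zero mapping regardless of ‘x’.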

To generate a kick drum track, the researchers sample a single mapping code m_t from a 16-dimensional standard Gaussian, repeat it across the time dimension, and reconstruct y from the resulting m and x. They then performed k-means clustering over all m_t.
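The sampling step is simple enough to show directly. The sketch below assumes a track length T and variable names chosen for illustration; the decoder that reconstructs y from m and x is the trained network itself and is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(42)

STYLE_DIM = 16  # dimensionality of the style space, from the article
T = 128         # number of time steps in the track (assumed)

# One mapping code drawn from a 16-dimensional standard Gaussian.
m_t = rng.standard_normal(STYLE_DIM)

# Repeat the same code at every time step so the style stays constant.
m = np.repeat(m_t[:, None], T, axis=1)

print(m.shape)  # (16, 128)
```

Holding one code fixed over the whole track is what makes the output stylistically consistent; drawing a different code would give a different kick pattern for the same input ‘x’.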

The model was trained for 2500 epochs with a batch size of 100, using 50% dropout on the inputs ‘x’. During training, a regularization method based on data augmentation was used to make the mappings invariant to time shift and tempo change.
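The article does not spell out how the augmentation is done, so the following is only one plausible sketch: a circular shift for the time shift and linear-interpolation resampling for the tempo change. The function names and the shift and factor values are hypothetical. The key point it illustrates is that ‘x’ and ‘y’ must receive the same transformation so their rhythmic relationship is preserved.

```python
import numpy as np

def time_shift(signal, shift):
    """Circularly shift a (C, T) signal along the time axis."""
    return np.roll(signal, shift, axis=-1)

def tempo_change(signal, factor):
    """Resample a (C, T) signal by linear interpolation.
    factor > 1 speeds the track up, giving a shorter output."""
    t_in = signal.shape[-1]
    t_out = max(1, int(round(t_in / factor)))
    src = np.linspace(0, t_in - 1, t_out)  # positions to read in the input
    return np.stack([np.interp(src, np.arange(t_in), ch) for ch in signal])

rng = np.random.default_rng(0)
x = rng.random((4, 100))  # input onset functions (assumed shape)
y = rng.random((1, 100))  # target kick onset function

shift = int(rng.integers(0, 100))  # same random shift for x and y
factor = 1.25                      # same tempo factor for x and y

x_aug = tempo_change(time_shift(x, shift), factor)
y_aug = tempo_change(time_shift(y, shift), factor)

print(x_aug.shape, y_aug.shape)  # (4, 80) (1, 80)
```

A model trained so that the original and augmented pairs produce the same mapping would, in principle, learn codes that describe the rhythmic relationship itself rather than where in the song or at what speed it occurs.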

Regardless of the success of this technology, the researchers at Sony CSL insist that their goal is not to replace musicians but to provide them with better tools to be more efficient in realizing their creative ideas. 

Dawn Of AI Leveraged Creativity

The turn of this century witnessed a new form of intelligence: the augmentation of human ideas with the computational power of machines. These machines have now become massive data-driven engines, and with every innovation in algorithms they have got better.

Companies like Sony CSL are working with musicians and content providers to push the boundaries of creativity and understand the complexity of modern music production processes. By combining cutting-edge AI research with strong musical expertise, they believe they can pave the way for musical experiences yet to be imagined.

The excessive use of autotune aside, the music industry has mostly benefited from advances in technology. From the way instruments are manufactured to the way a sound wave is electronically manipulated, technology has touched many aspects of music. Without digital technology, popular music in the twenty-first century is almost unthinkable.

Interdisciplinary research approaches such as these, which blend theoretical modelling, data science, machine learning, gaming and participation, are mainly aimed at developing a science of the “new”: one that focuses on how the “new” emerges in social and technological systems, and how humans and machines explore the space of possibilities and find new solutions.

Listen to the beats of AI here

PS: The story was written using a keyboard.

Ram Sagar

I have a master's degree in Robotics and I write about machine learning advancements.