Project Proposal

By Ehsan Asdar (easdar3), Ashwini Iyer (aiyer65), Matthew Kaufer (mkaufer3), Nidhi Palwayi (spalwayi3), and Kexin Zhang (kzhang323)

Problem Statement

Generative music is a fascinating field – who doesn’t want to hear a new Mozart composition? We propose two methods of generating music by generating images that encode it, Markov Models and GANs, and will compare their results.

Think of a piece of music: it has notes that vary in pitch, and each note has a quantizable start and end time. Consider embedding a musical piece with three instruments into an image. Each column of the image can represent a moment in time, while each row can represent a pitch. The RGB channels can each represent an individual instrument (you could splurge and do four instruments if you used the alpha channel too).
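To make this concrete, here is a rough sketch of what the embedding might look like in Python. The function name, dimensions, and time resolution are all illustrative, and we assume notes have already been extracted from MIDI (e.g. with pretty_midi); this is a sketch of the idea, not a committed implementation.

```python
import numpy as np

# A minimal sketch of the proposed embedding (names and sizes are illustrative).
# Each note is (instrument_index, pitch, start_beat, end_beat); we assume the
# notes have already been extracted from MIDI (e.g., with pretty_midi).

def notes_to_image(notes, n_pitches=128, ticks_per_beat=4, n_beats=16):
    """Rasterize up to three instruments into an H x W x 3 uint8 image.

    Rows index pitch, columns index time steps, and each RGB channel holds
    one instrument; a nonzero pixel means the note is sounding.
    """
    width = ticks_per_beat * n_beats
    img = np.zeros((n_pitches, width, 3), dtype=np.uint8)
    for instrument, pitch, start, end in notes:
        col_start = int(round(start * ticks_per_beat))
        col_end = max(col_start + 1, int(round(end * ticks_per_beat)))
        img[pitch, col_start:min(col_end, width), instrument] = 255
    return img

# Example: a C major arpeggio on instrument 0 (the red channel).
toy_notes = [(0, 60, 0.0, 1.0), (0, 64, 1.0, 2.0), (0, 67, 2.0, 3.0)]
image = notes_to_image(toy_notes)
```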

Now that we have some means of embedding music as images, we can treat the lines of notes in the images as textures.

Since we know something about texture generation, we can apply those techniques to generating music. This is convenient, since Markov Models tend to produce repetitive output, which could capture some of the repetition that music entails.
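As a rough illustration of what we mean, here is one simple way a first-order Markov Model could operate on these images, treating each column (time slice) as a state. This is only a sketch of the idea; the resolutions and exact state definition are things we will experiment with.

```python
import numpy as np
from collections import defaultdict

# Illustrative first-order Markov chain over time slices. Each column of the
# piano-roll image is a state; transitions are counted from training pieces
# and then sampled to produce a new sequence of columns.

def build_transitions(images):
    """Map each observed column (hashed as bytes) to the columns that follow it."""
    transitions = defaultdict(list)
    for img in images:
        cols = [img[:, t].copy() for t in range(img.shape[1])]
        for prev, nxt in zip(cols, cols[1:]):
            transitions[prev.tobytes()].append(nxt)
    return transitions

def sample_columns(transitions, start_col, length, seed=0):
    """Random-walk the transition table to generate `length` columns."""
    rng = np.random.default_rng(seed)
    out = [start_col]
    for _ in range(length - 1):
        followers = transitions.get(out[-1].tobytes())
        if not followers:                       # dead end: fall back to a known state
            followers = next(iter(transitions.values()))
        out.append(followers[rng.integers(len(followers))])
    return np.stack(out, axis=1)                # shape: (n_pitches, length, 3)
```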

However, Markov Models are so 19xx – anyone who’s anyone uses deep learning. Here, we will use both Markov Models and GANs to generate music embedded in images, and then compare the results.
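For the GAN side, a minimal PyTorch sketch might look like the following. The layer sizes, image dimensions, and training step are placeholders; the actual architecture is something we will design and tune during the project.

```python
import torch
import torch.nn as nn

# Hypothetical sizes: 128 pitches x 64 time steps x 3 instruments, flattened.
IMG_DIM = 128 * 64 * 3
NOISE_DIM = 100

# Generator maps noise to a flattened piano-roll image with values in [0, 1].
generator = nn.Sequential(
    nn.Linear(NOISE_DIM, 512), nn.ReLU(),
    nn.Linear(512, 1024), nn.ReLU(),
    nn.Linear(1024, IMG_DIM), nn.Sigmoid(),
)

# Discriminator scores how likely an image is to come from the real dataset.
discriminator = nn.Sequential(
    nn.Linear(IMG_DIM, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

def train_step(real, g_opt, d_opt, loss=nn.BCELoss()):
    """One adversarial update, given a (batch, IMG_DIM) tensor of real images."""
    noise = torch.randn(real.size(0), NOISE_DIM)
    fake = generator(noise)

    # Discriminator: real images should score 1, generated images should score 0.
    d_opt.zero_grad()
    d_loss = loss(discriminator(real), torch.ones(real.size(0), 1)) + \
             loss(discriminator(fake.detach()), torch.zeros(real.size(0), 1))
    d_loss.backward()
    d_opt.step()

    # Generator: try to make the discriminator score generated images as real.
    g_opt.zero_grad()
    g_loss = loss(discriminator(fake), torch.ones(real.size(0), 1))
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```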

Approach

Experiments and Results

Our experiments involve building multiple Markov Models at different resolutions, as well as designing and training a GAN. We will also vary the length of the generated music to see whether longer musical phrases fall apart more quickly (to probe the curse of dimensionality).
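A rough sketch of this sweep, reusing the illustrative notes_to_image, build_transitions, and sample_columns functions from above; the toy pieces and parameter grids are placeholders for the real dataset and settings.

```python
# Illustrative experiment sweep over Markov Model resolution and output length.
toy_pieces = [
    [(0, 60, 0.0, 1.0), (0, 64, 1.0, 2.0), (0, 67, 2.0, 3.0)],
    [(1, 62, 0.0, 2.0), (2, 65, 2.0, 4.0)],
]
for ticks_per_beat in (2, 4, 8):                  # Markov Model "resolution"
    images = [notes_to_image(piece, ticks_per_beat=ticks_per_beat)
              for piece in toy_pieces]
    transitions = build_transitions(images)
    for length in (32, 64, 128):                  # length of the generated output
        sample = sample_columns(transitions, images[0][:, 0], length)
        # ...evaluate `sample` against the dataset (metrics sketched later)
```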

We’ll use a dataset of classical music written by Antonio Vivaldi in 4/4, since his music has a distinctive style and composition that will allow us to draw parallels between the generated and original music (more on our evaluation criteria later).

The code we will write from scratch:

Some code we’re not going to write from scratch:

Experiments will involve

Our evaluation criteria involve measuring the stylistic similarity of the generated music to the original classical music dataset. We’ll write code that compares pieces on the metrics of tonality, contour, and consistency of time signature. We will also qualitatively analyze the outputs.
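As one example of the kind of metric code we have in mind, here is a rough sketch of a tonality-style comparison based on pitch-class histograms; the exact metrics (and how we measure contour and time-signature consistency) are still to be finalized.

```python
import numpy as np

# Illustrative stylistic metric: compare pitch-class distributions of a
# generated piano-roll image and a reference image as a crude proxy for tonality.

def pitch_class_histogram(img):
    """Fraction of note activity falling on each of the 12 pitch classes."""
    active = img.sum(axis=(1, 2))          # total activity per pitch row
    hist = np.zeros(12)
    for pitch, weight in enumerate(active):
        hist[pitch % 12] += weight
    total = hist.sum()
    return hist / total if total > 0 else hist

def tonality_similarity(generated, reference):
    """Cosine similarity between the two pieces' pitch-class distributions."""
    g, r = pitch_class_histogram(generated), pitch_class_histogram(reference)
    denom = np.linalg.norm(g) * np.linalg.norm(r)
    return float(g @ r / denom) if denom > 0 else 0.0
```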

Datasets:

Inspiration: