Thursday 17:10–17:40 in Track 3

Why does my girlfriend dislike my music? - a look at my music using machine learning and statistics

Juan De Dios Santos Rivera



My girlfriend is not a big fan of my music. She says that it is too boring, too instrumental, and too varied. To test her theory, I analysed our Spotify music. In this talk I will share the results of my experiment, which include a data analysis, visualization and clustering of the audio features, and a machine learning model that predicts whether a song belongs to my playlist or hers.


In my talk "Why does my girlfriend dislike my music?" I will introduce a situation I encountered while listening to my Spotify music with her. A couple of hours into "Spotify and chill", she said: “Your music taste is weird…your playlist has a lot of variety, instrumental songs, and some of them are boring”.

So I ran an experiment, and I will discuss what I found with you.

Using Spotify's API, Python, R, statistics and machine learning, I studied my music to see if it is indeed varied, instrumental, and boring. Furthermore, and more importantly, I compared my music to hers to see how they differ from each other.

The experiment consists of three main parts: an exploratory data analysis, a section dedicated to unsupervised learning and visualization of high dimensional data, and lastly, a supervised learning approach.

In the first section, the exploratory data analysis, I will introduce and describe the data, and use descriptive statistics and graphs to show whether my playlist really is varied, instrumental, and boring.
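As a rough illustration of this kind of comparison (not the code used in the talk), the sketch below summarises two playlists with the mean and standard deviation of each audio feature. The feature names follow Spotify's audio features, but the values are made up for the example, and `summarize` is a hypothetical helper. A higher standard deviation is read here as "more varied", which is only one possible definition.

```python
from statistics import mean, pstdev

# Made-up illustrative values, NOT the speaker's data.
# Each list holds one audio feature across the songs of a playlist.
mine = {
    "instrumentalness": [0.90, 0.10, 0.80, 0.00, 0.70],
    "energy": [0.20, 0.90, 0.40, 0.10, 0.80],
}
hers = {
    "instrumentalness": [0.00, 0.10, 0.00, 0.05, 0.10],
    "energy": [0.70, 0.80, 0.75, 0.65, 0.70],
}


def summarize(playlist):
    """Return the mean and population standard deviation of every feature."""
    return {
        feature: {"mean": mean(values), "std": pstdev(values)}
        for feature, values in playlist.items()
    }


for name, playlist in (("mine", mine), ("hers", hers)):
    for feature, stats in summarize(playlist).items():
        print(f"{name:5} {feature:17} mean={stats['mean']:.2f} std={stats['std']:.2f}")
```

With these toy numbers, the first playlist has both a higher mean instrumentalness and a wider spread in energy, which is the pattern the "instrumental and varied" claim would predict.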

Following this, I will use unsupervised learning to cluster the songs in both datasets and analyse the clusters to see how the songs relate to each other. I will also show the techniques used to determine the number of clusters, along with several plots that serve as low-dimensional representations of the datasets.
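The abstract does not say which clustering algorithm or which technique for choosing the number of clusters was used. As one common possibility, the sketch below runs a small k-means implementation for several values of k and records the inertia (within-cluster sum of squared distances), which is the quantity inspected in the "elbow" heuristic. The songs are made-up (energy, instrumentalness) pairs, and `kmeans` is a minimal illustrative implementation, not a library call.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct songs
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    inertia = sum(min(squared_distance(p, c) for c in centroids) for p in points)
    return centroids, inertia


# Made-up (energy, instrumentalness) pairs forming two loose groups.
songs = [
    (0.20, 0.80), (0.25, 0.85), (0.15, 0.90),
    (0.80, 0.10), (0.85, 0.05), (0.75, 0.15),
]

inertia_by_k = {k: kmeans(songs, k)[1] for k in range(1, 5)}
for k, inertia in inertia_by_k.items():
    print(k, round(inertia, 4))
```

In the elbow heuristic, one looks for the k after which the inertia stops dropping sharply. For data with two clear groups, the large drop is expected between k = 1 and k = 2.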

For the final part of the experiment, I trained a machine learning model to predict whether a song belongs to my playlist or hers. I will describe the model, present its results and performance, and explain how I found an optimal set of parameters.
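The abstract does not name the model or the parameter search that was used, so the sketch below substitutes a deliberately simple stand-in: a k-nearest-neighbours classifier whose single parameter, the number of neighbours, is chosen by leave-one-out cross-validation. The songs and labels are made up, and `knn_predict` and `leave_one_out_accuracy` are hypothetical helpers written for this example.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train, query, k):
    """Predict the majority label among the k training songs nearest to query."""
    neighbours = sorted(train, key=lambda item: squared_distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Hold out each song in turn, train on the rest, and score the prediction."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(data)


# Made-up (energy, instrumentalness) features with playlist labels.
data = [
    ((0.20, 0.80), "mine"), ((0.25, 0.85), "mine"), ((0.15, 0.90), "mine"),
    ((0.80, 0.10), "hers"), ((0.85, 0.05), "hers"), ((0.75, 0.15), "hers"),
]

# Try each candidate parameter value and keep the best-scoring one.
scores = {k: leave_one_out_accuracy(data, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The same pattern (score every candidate setting on held-out data, keep the best) carries over to larger models and parameter grids. On a dataset this small, an overly large k simply lets the other playlist outvote the held-out song's own group.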

In summary, by the end of this talk you will have learned (I hope) how data can be combined with music to learn about a person's musical taste: the audio features of their songs and how similar those songs are to each other.
