SpotifySongAnalysisProject
Project maintained by JoeKell
Summary
The goal of this project is to analyze the data available from Spotify to answer questions about how Spotify Audio Features vary by song year, how Audio Features correlate with country metrics, and how artists’ deaths affect their streams. The technologies used are Python, Jupyter Notebooks, Pandas, Requests, and Matplotlib. Optionally, the Spotify API can be used, but its results will match the Kaggle data. We used the following data sources:
- Spotify Audio Features
- Spotify Charts
- World Metrics
- 2017 Musician Deaths
Our presentation slides can be found here.
Questions
In 2019, do Audio Features of charting songs correlate to a country’s Happiness Score?
- Use the Happiness Score from World Metrics, scrape the 2019 weekly chart data for each region from Spotify Charts, and take the Audio Features from Spotify Audio Features.
- Merge the 3 data sets and aggregate each Audio Feature by the appropriate measure of central tendency.
- Show plots with regression lines and give the r value for each Audio Feature by Country Happiness Score.
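The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's notebook code: the tiny data frames, the column names (`region`, `track_id`, `valence`, `happiness_score`), and the helper `feature_vs_metric` are all assumptions made up for the example, and the median is used only as a placeholder for whichever measure of central tendency is judged appropriate.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the three data sets. Every value and column name
# here is hypothetical and exists only to make the sketch runnable.
charts = pd.DataFrame({
    "region": ["us", "us", "se", "se", "br", "br"],
    "track_id": ["a", "b", "a", "c", "b", "c"],
})
features = pd.DataFrame({
    "track_id": ["a", "b", "c"],
    "valence": [0.2, 0.6, 0.9],
})
metrics = pd.DataFrame({
    "region": ["us", "se", "br"],
    "happiness_score": [6.9, 7.3, 6.3],
})


def feature_vs_metric(charts, features, metrics, feature, metric, agg="median"):
    """Merge the three tables and return one row per region."""
    # Attach audio features to every charting track.
    merged = charts.merge(features, on="track_id", how="inner")
    # Collapse each region's tracks to a single value for the feature.
    per_region = merged.groupby("region", as_index=False)[feature].agg(agg)
    # Attach the country metric to each region.
    return per_region.merge(metrics[["region", metric]], on="region", how="inner")


table = feature_vs_metric(charts, features, metrics, "valence", "happiness_score")

# Pearson r between the aggregated feature and the country metric.
r = table["valence"].corr(table["happiness_score"])

# Least-squares regression line (feature as a function of the metric).
slope, intercept = np.polyfit(table["happiness_score"], table["valence"], 1)
```

The `slope` and `intercept` values are what a Matplotlib scatter plot would use to draw the regression line, with `r` reported in the title or legend.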
In 2019, do Audio Features of charting songs correlate to a country’s Freedom to Make Life Choices Score?
- Use the Freedom to Make Life Choices Score from World Metrics, scrape the 2019 weekly chart data for each region from Spotify Charts, and take the Audio Features from Spotify Audio Features.
- Merge the 3 data sets and aggregate each Audio Feature by the appropriate measure of central tendency.
- Show plots with regression lines and give the r value for each Audio Feature by Freedom to Make Life Choices Score.
In 2019, do Audio Features of charting songs correlate to a country’s GDP per Capita?
- Use the GDP per Capita from World Metrics, scrape the 2019 weekly chart data for each region from Spotify Charts, and take the Audio Features from Spotify Audio Features.
- Merge the 3 data sets and aggregate each Audio Feature by the appropriate measure of central tendency.
- Show plots with regression lines and give the r value for each Audio Feature by GDP per Capita.
How have Audio Features changed over time?
- Use the Audio Features from Spotify Audio Features.
- Determine the appropriate measure of central tendency for each Audio Feature and give evidence.
- Show plots displaying the change in each Audio Feature over time.
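One possible way to approach the last two steps is sketched below. It assumes that skewness is acceptable evidence for choosing between mean and median, which is only one of several reasonable criteria. The cutoff of 1.0, the helper `choose_measure`, the column names, and the sample values are all hypothetical.

```python
import pandas as pd

# Hypothetical sample of songs. Real data would have many rows per year.
songs = pd.DataFrame({
    "year": [1990, 1990, 1990, 2000, 2000, 2000],
    "energy": [0.40, 0.50, 0.60, 0.60, 0.70, 0.80],
    "speechiness": [0.03, 0.04, 0.05, 0.04, 0.05, 0.60],
})


def choose_measure(series, skew_limit=1.0):
    """Pick 'median' for strongly skewed features, otherwise 'mean'.

    The skew_limit of 1.0 is an assumed rule of thumb, not a project setting.
    """
    return "median" if abs(series.skew()) > skew_limit else "mean"


# Decide a measure per feature, keeping the skewness as the evidence.
evidence = {col: songs[col].skew() for col in ["energy", "speechiness"]}
measures = {col: choose_measure(songs[col]) for col in evidence}

# Aggregate each feature by year with its chosen measure.
yearly = songs.groupby("year").agg(measures)
```

Calling `yearly.plot(subplots=True)` would then give one line plot per Audio Feature over time.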
In 2017, what was the impact of artists’ deaths on their streams?
- Scrape the 2017 daily US charts from Spotify Charts and cross-reference them against the 2017 Musician Deaths list.
- Narrow the data frame to artists that hit the charts in 2017 and died in 2017.
- Compare the streams before death, on the day of death, and after death using line plots.
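The narrowing and comparison steps above might look something like this in Pandas. The artists, dates, stream counts, and column names are placeholders invented for the sketch, and `period` is a hypothetical helper.

```python
import pandas as pd

# Placeholder daily chart rows and death dates (not real data).
charts = pd.DataFrame({
    "date": pd.to_datetime(
        ["2017-03-01", "2017-03-02", "2017-03-03", "2017-03-02"]
    ),
    "artist": ["Artist A", "Artist A", "Artist A", "Artist B"],
    "streams": [100, 400, 300, 250],
})
deaths = pd.DataFrame({
    "artist": ["Artist A", "Artist C"],
    "death_date": pd.to_datetime(["2017-03-02", "2017-06-10"]),
})

# An inner merge keeps only artists who both charted and died in 2017.
merged = charts.merge(deaths, on="artist", how="inner")

# Signed number of days between each chart date and the artist's death.
merged["days_from_death"] = (merged["date"] - merged["death_date"]).dt.days


def period(days):
    """Label a chart date relative to the death date."""
    if days < 0:
        return "before"
    if days == 0:
        return "day of death"
    return "after"


merged["period"] = merged["days_from_death"].apply(period)

# Total streams per artist in each period.
totals = merged.groupby(["artist", "period"])["streams"].sum()
```

For the line plots, summing `streams` by `artist` and `days_from_death` instead gives one series per artist, aligned on the day of death.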
How did we do it? (Spoilers)
Scraping the data
How have Audio Features changed over time?
In 2019, do Audio Features of charting songs correlate to a country’s Happiness Score, Freedom to Make Life Choices Score, or GDP per Capita?
- The analysis for this question is in the Audio Features vs Country Metrics notebook.
- We found no strong correlation between the audio features of songs streamed in a country and that country’s happiness score, freedom to make life choices score, or GDP per capita. The strongest correlation we observed was between GDP per capita and song duration, and even that was only moderate (r value of -0.59). Below are the plots of audio features compared to a country’s happiness score.
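As a rough illustration of how the "strongest correlation" can be identified, the sketch below computes Pearson r for every feature and metric pair and picks the pair with the largest absolute value. The per-country table is invented for the example and does not reproduce the project's numbers. The column names are assumptions.

```python
import pandas as pd

# Invented per-country table: one aggregated value per feature and metric.
per_country = pd.DataFrame({
    "duration_ms": [200000, 210000, 190000, 230000, 220000],
    "danceability": [0.70, 0.65, 0.72, 0.60, 0.68],
    "gdp_per_capita": [1.4, 1.2, 1.5, 0.8, 1.0],
    "happiness_score": [7.1, 6.5, 7.3, 5.9, 6.2],
})
features = ["duration_ms", "danceability"]
metrics = ["gdp_per_capita", "happiness_score"]

# Pearson r for every (feature, metric) pair.
r_values = {
    (f, m): per_country[f].corr(per_country[m])
    for f in features
    for m in metrics
}

# The strongest relationship is the one with the largest |r|,
# regardless of whether the correlation is positive or negative.
strongest_pair = max(r_values, key=lambda pair: abs(r_values[pair]))
strongest_r = r_values[strongest_pair]
```

For scale, an r of -0.59 corresponds to an r² of about 0.35, so roughly a third of the variance in one variable is associated with the other.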
In 2017, what was the impact of artists’ deaths on their streams?
- The analysis for this question is in the 2017 Artist Deaths notebook.
- The deaths of Chester Bennington (Linkin Park) and Tom Petty had the most significant initial effect on Spotify streams by a wide margin. Linkin Park and Tom Petty accumulated 10,647,809 and 9,080,227 streams, respectively, in the day following their deaths and were responsible for 14% and 11.5% of the total songs on the chart on those days. However, Linkin Park had a much more prolonged increase in streams, keeping at least one song on the streaming chart for three weeks, while Tom Petty’s final appearance came one week after his death.
- Overall, the data supports our hypothesis that the number of Spotify streams would dramatically increase following an artist’s death. It was interesting, however, to observe the variance in how long deceased artists maintained a position within the top 200 daily charts.
- Spotify only lists the number of streams for songs in the top 200, so our totals reflect only those particular tracks. It would be advantageous to gather data on the entirety of an artist’s streams, which would also provide a more telling look at their pre-death numbers and exactly how long after their deaths an increase was observed.
- Two artists provided unanticipated data that raised an additional question: what song trends appear during certain seasons or particular events? Chuck Berry had just one song reach the chart, on a single day, following his death; however, he appeared 33 times in the holiday season with his song “Run Rudolph Run.” Following the death of guitarist Malcolm Young, there were no appearances for AC/DC, but they made the charts on New Year’s Day with “You Shook Me All Night Long” and on Halloween with “Highway to Hell”.
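The two figures discussed above (an artist's share of the chart on a given day, and how long after a death the artist last appeared) can be derived from daily chart rows along these lines. This is a sketch under assumptions: the toy chart has 4 slots per day instead of 200, the artist names and dates are invented, and `chart_share` and `last_appearance_after` are hypothetical helpers.

```python
import pandas as pd

# Toy daily chart with 4 entries per day (the real charts have 200).
charts = pd.DataFrame({
    "date": pd.to_datetime(
        ["2017-07-21"] * 4 + ["2017-07-22"] * 4 + ["2017-07-23"] * 4
    ),
    "artist": [
        "Band X", "Band X", "Other 1", "Other 2",
        "Band X", "Other 1", "Other 2", "Other 3",
        "Other 1", "Other 2", "Other 3", "Other 4",
    ],
})


def chart_share(charts, artist):
    """Fraction of each day's chart entries that belong to `artist`."""
    is_artist = charts["artist"].eq(artist).astype(float)
    return is_artist.groupby(charts["date"]).mean()


def last_appearance_after(charts, artist, death_date):
    """Days between the death and the artist's last chart date, or None.

    This measures the final appearance only. It does not check that the
    artist stayed on the chart every day in between.
    """
    on_chart = charts["artist"].eq(artist) & (charts["date"] >= death_date)
    dates = charts.loc[on_chart, "date"]
    if dates.empty:
        return None
    return (dates.max() - death_date).days


share = chart_share(charts, "Band X")
days = last_appearance_after(charts, "Band X", pd.Timestamp("2017-07-20"))
```

Grouping the same rows by calendar month instead of by day would be one way to start on the seasonal question raised in the last point.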