DAV Project
COMPUTER SCIENCE 2
Spotify-2023 Analysis
Kaggle
mode 0
danceability_% 0
valence_% 0
energy_% 0
acousticness_% 0
instrumentalness_% 0
liveness_% 0
speechiness_% 0
dtype: int64
Result:
There are no missing values in any column, and there are no duplicate rows in the dataset. The
data appears to be clean and ready for analysis.
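The check described above can be sketched as follows. This is a minimal illustration on a small made-up frame, not the notebook's actual cell; `toy` and the `track_name` column are assumptions, while `mode`, `danceability_%` and `energy_%` are column names taken from the output above.

```python
import pandas as pd

# Toy stand-in for the Spotify data (assumption: the real notebook loads the Kaggle CSV).
toy = pd.DataFrame({
    "track_name": ["A", "B", "C"],        # hypothetical column name
    "mode": ["Major", "Minor", "Major"],
    "danceability_%": [80, 65, 71],
    "energy_%": [89, 74, 53],
})

# Number of missing (NaN) values in each column.
missing_per_column = toy.isnull().sum()

# Number of rows that are exact copies of an earlier row.
duplicate_rows = int(toy.duplicated().sum())

print(missing_per_column)
print("duplicate rows:", duplicate_rows)
```

A column of zeros in `missing_per_column` together with `duplicate_rows == 0` corresponds to the "clean" result reported above.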
plt.figure(figsize=(14, 7))
sns.barplot(x=artist_tracks.values, y=artist_tracks.index,
            hue=artist_tracks.index, palette='muted')
Result:
The visualization shows the 10 artists with the most unique tracks in the dataset. Track count
reflects how prolific an artist is, but it does not by itself measure impact or popularity; streaming
numbers and listener engagement should be considered as well. Even so, the chart makes clear
which artists contribute the most content to the dataset.
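One way the `artist_tracks` series used in the bar plot could be built is sketched below. The column names `artist(s)_name` and `track_name` and the toy rows are assumptions for illustration; the notebook's own cell may differ.

```python
import pandas as pd

# Toy data (assumption): one row per track entry, with a repeated track for artist X.
toy = pd.DataFrame({
    "artist(s)_name": ["X", "X", "X", "X", "Y", "Y", "Z"],  # hypothetical column name
    "track_name":     ["a", "b", "c", "a", "d", "e", "f"],  # hypothetical column name
})

# Count distinct track names per artist, then keep the 10 largest counts.
artist_tracks = (
    toy.groupby("artist(s)_name")["track_name"]
       .nunique()
       .sort_values(ascending=False)
       .head(10)
)

print(artist_tracks)
```

Using `nunique()` rather than `count()` means a track listed twice for the same artist is counted only once.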
Result
This line chart represents the average danceability of tracks per year, providing insights into the
evolution of this musical characteristic over time. The histogram underscores the diverse nature
of danceability across the tracks, providing a quantitative overview of how danceable the music
in the dataset tends to be. This analysis can serve as a foundation for further exploration into the
relationship between danceability and other musical attributes or listener preferences.
Result
The histogram provides a quantitative overview of the energy levels present in the music tracks.
This analysis can offer insights into the overall vibe or intensity of the music dataset, aiding in
further exploration or comparison with other musical attributes.
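The counts behind such a histogram can also be computed directly, which is handy when exact numbers are needed. The sketch below bins made-up `energy_%` values into five equal ranges; the values and the bin edges are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up energy percentages (assumption), standing in for the energy_% column.
energy = pd.Series([15, 35, 55, 58, 62, 75, 78, 81, 90], name="energy_%")

# Assign each value to a 20-point-wide bin between 0 and 100.
bins = [0, 20, 40, 60, 80, 100]
energy_binned = pd.cut(energy, bins=bins, include_lowest=True)

# Count tracks per bin, ordered from the lowest bin to the highest.
energy_hist = energy_binned.value_counts().sort_index()

print(energy_hist)
```

Each entry of `energy_hist` corresponds to the height of one bar in a five-bin histogram of the same data.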
plt.figure(figsize=(10, 6))
avg_danceability_per_year.plot(kind='line', marker='o', color='orange')
plt.title('Average Danceability per Year')
plt.xlabel('Year')
plt.ylabel('Average Danceability')
plt.xticks(rotation=45)
plt.grid(True)
plt.show()
Result
In summary, the visualization offers a chronological perspective on how average danceability has
evolved over the years. This analysis can be instrumental for music analysts, researchers, or
enthusiasts aiming to understand temporal patterns in musical attributes and their potential
correlations with broader cultural or industry shifts.
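The series plotted in the line chart is a per-year mean. A minimal sketch of that aggregation, assuming the frame is called `spotify_data` and has the `released_year` and `danceability_%` columns seen elsewhere in this report (the rows here are made up):

```python
import pandas as pd

# Made-up rows (assumption), using column names that appear in this report.
spotify_data = pd.DataFrame({
    "released_year":  [2021, 2021, 2022, 2023, 2023, 2023],
    "danceability_%": [60,   80,   50,   70,   90,   80],
})

# Mean danceability of all tracks released in each year, indexed by year.
avg_danceability_per_year = (
    spotify_data.groupby("released_year")["danceability_%"].mean()
)

print(avg_danceability_per_year)
```

Years with only a handful of tracks produce noisy averages, so it may be worth checking the per-year track counts before reading much into a single point on the line.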
The line chart above illustrates the total streams per year, providing a visual representation
of the changes in streaming volumes over time.
import pandas as pd
import matplotlib.pyplot as plt

# Plot the total streams per year
# (streams_per_year is assumed to have been computed in an earlier cell)
plt.figure(figsize=(14, 7))
streams_per_year.plot(kind='line', marker='o', color='purple')
plt.title('Total Streams per Year')
plt.xlabel('Year')
plt.ylabel('Total Streams')
plt.xticks(rotation=45)
plt.grid(True)
plt.show()
Result
The visualization offers a comprehensive overview of the total streaming landscape over the
years, reflecting both general trends and specific anomalies. Such insights can be invaluable for
stakeholders in the music industry, helping them understand consumption patterns and make
informed decisions related to content promotion, artist collaborations, and platform strategies.
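A possible way to compute `streams_per_year` is sketched below. It assumes a `streams` column that may be stored as text and may contain entries that are not valid numbers, so the values are converted with `pd.to_numeric(..., errors="coerce")` before summing. The column name `streams` and the rows are assumptions for illustration.

```python
import pandas as pd

# Made-up rows (assumption); one streams entry is deliberately not a number.
spotify_data = pd.DataFrame({
    "released_year": [2022, 2022, 2023, 2023],
    "streams": ["100", "200", "not a number", "50"],  # hypothetical column name
})

# Convert to numbers; entries that cannot be parsed become NaN.
streams_numeric = pd.to_numeric(spotify_data["streams"], errors="coerce")

# Sum streams per release year (NaN values are skipped by sum()).
streams_per_year = streams_numeric.groupby(spotify_data["released_year"]).sum()

print(streams_per_year)
```

Coercing silently drops unparseable entries from the totals, so counting the NaN values first (`streams_numeric.isna().sum()`) shows how much data is affected.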
1.0.4 This visualization helps to understand the distribution of music releases over
the years included in the dataset.
plt.figure(figsize=(10, 6))
spotify_data['released_year'].value_counts().sort_index().plot(kind='bar', color='skyblue')
Result:
This scatter plot helps to understand how these three attributes correlate with each other across different
songs in the dataset. The size and color of the points represent the energy level, providing a
multidimensional view of the music characteristics.
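A scatter plot shows the relationships visually; a correlation matrix puts numbers on them. The sketch below assumes the three attributes are `danceability_%`, `valence_%` and `energy_%` (the text does not name them) and uses made-up, perfectly linear values so the result is easy to verify by eye.

```python
import pandas as pd

# Made-up values (assumption): energy rises with danceability, valence falls.
toy = pd.DataFrame({
    "danceability_%": [40, 50, 60, 70],
    "valence_%":      [80, 70, 60, 50],
    "energy_%":       [45, 55, 65, 75],
})

# Pairwise Pearson correlation between the three attributes.
corr = toy.corr()

print(corr.round(2))
```

On real data the coefficients would fall somewhere between -1 and 1; values near 0 indicate little linear relationship between the two attributes.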