Assignment 2024 (1)
Assignment 2024 (1)
Assignment
Music Sales Data Analysis
Due date: November 28, 2024 (20:00)
Overview:
This task revolves around analyzing a dataset containing information about music sales. The dataset
includes various attributes, such as album details, artist information, sales figures, customer
demographics, and genre classifications. The goal is to explore this dataset, answer a series of questions,
and provide insights and comments based on the findings.
Throughout the analysis, you will be performing data cleaning, calculating summary statistics,
visualizing data distributions, identifying top-selling artists and albums, and analyzing sales trends
over time. Additionally, you will segment customers by demographics, compare sales across different
genres, and provide personalized recommendations for the top 5 customers to increase their purchases.
By working through these tasks, you will gain a deeper understanding of the data and identify trends
and patterns that can inform decision-making in the music industry.
Once you have completed the provided tasks, consider what other data might be useful to enhance your
analysis. For example, you could examine the relationship between promotional activities and sales
spikes, investigate the impact of album release dates on sales performance, or analyze customer
feedback and reviews to understand preferences better. Explain that with additional data, you can
perform further analysis to support your comments and insights, enabling you to make more informed
recommendations to music producers, marketers, and retailers.
P.1
Objective:
Analyze the provided music sales data to gain insights into sales trends, customer demographics, and
genre popularity. Write a comprehensive report summarizing your findings.
Requirements:
1. Data Loading and Cleaning (10%):
o Load the CSV file into a Pandas DataFrame.
o Ensure all data types are correctly assigned.
2. Exploratory Data Analysis (EDA) (15%):
o Provide summary statistics for numerical columns.
o Visualize the distribution of sales across different genres.
o Identify the top-selling artists and albums.
o Analyze sales trends over time.
3. Customer Analysis (15%):
o Segment customers by country and city.
o Determine the average sales per customer.
o Identify the most frequent customers.
4. Genre Analysis (15%):
o Compare sales across different genres.
o Identify the most popular genres in different regions.
5. Sales Analysis (15%):
o Calculate total sales and average sales per album.
oIdentify trends in sales over different time periods (e.g., monthly, yearly).
6. Top Customer Recommendations (15%):
o Identify the top 5 customers based on total sales.
o Provide personalized recommendations to increase their purchases. Consider factors
such as their purchase history, preferred genres, and frequency of purchases.
7. Report Writing (15%):
o Summarize your findings in a well-structured report.
o Include visualizations (at least 3 charts, 3 graphs) to support your analysis.
o Provide insights and recommendations based on your analysis.
Instructions:
1. Data Loading and Cleaning:
o Use pandas to load the CSV file.
o Check for and handle missing values using methods like dropna() or fillna().
o Ensure columns have appropriate data types using astype() if necessary.
2. Exploratory Data Analysis (EDA):
o Use describe() to get summary statistics.
o Use matplotlib or seaborn to create visualizations.
P.2
o Plot sales distribution by genre using bar charts.
o Identify top-selling artists and albums using groupby and aggregation functions.
3. Customer Analysis:
o Group data by Country and City to analyze customer distribution.
o Calculate average sales per customer using groupby and mean functions.
o Identify frequent customers by counting occurrences of CustomerID.
4. Genre Analysis:
o Group data by Genre and calculate total and average sales.
o Use bar charts to compare genre popularity across different regions.
5. Sales Analysis:
o Calculate total sales using sum() and average sales using mean().
o Analyze sales trends over time by converting InvoiceDate to datetime and resampling
data.
6. Top Customer Recommendations:
o Use groupby and aggregation to identify the top 5 customers by total sales.
o Analyze their purchase history to understand their preferences.
o Provide recommendations such as personalized offers, discounts on favorite genres, or
exclusive access to new releases.
7. Report Writing:
o Structure your report with an introduction, methodology, results, and conclusion.
o Include visualizations to illustrate key points.
o Provide actionable insights and recommendations based on your findings.
Deliverables:
• Python script or Jupyter Notebook with your analysis.
• A written report (PDF or Word) summarizing your findings and including visualizations.
• If you use Jupyter Notebook, then you can include your codes through your report (in the same
file). Otherwise, your report should not contain any code blocks, but include the results from
your codes and add proper reference to your code file.
Evaluation Criteria:
• Accuracy and completeness of data analysis.
• Quality and clarity of visualizations.
• Depth of insights and recommendations in the report.
• Code quality and documentation.
P.3