
Amazon Sales Data Analysis

Uploaded by

Irfan Ali
Contents
01 Introduction
02 Dataset Description
03 Data Cleaning
04 Category Distribution Analysis
05 Price Analysis
06 Rating Distribution Analysis
07 Correlation Analysis
08 Top-Rated Products
09 Review Content Analysis
10 User Insights
Introduction

In the highly competitive realm of e-commerce, gaining insights into product performance is essential for making informed
strategic decisions. This presentation offers an in-depth analysis of the Amazon products dataset, aiming to uncover key
insights into pricing strategies, customer ratings, and review patterns. By examining these factors, we can identify trends and
actionable insights that will aid in optimizing product listings, enhancing customer satisfaction, and driving overall business
growth. This analysis not only reflects the current state of the products but also provides recommendations for future
enhancements.
Dataset Description
❑ Product id: Unique identifier for each product
❑ Product name: Name of the product
❑ Category: Product category
❑ Discounted price: Price after discount
❑ Actual price: Original price
❑ Rating: Product rating
❑ Rating count: Number of ratings
❑ About product: Product description
❑ User id: Unique identifier for each user
❑ User name: Name of the user
❑ Review id: Unique identifier for each review
❑ Review title: Title of the review
❑ Review content: Content of the review
❑ Image link: Link to the product image
❑ Product link: Link to the product page
Data Cleaning
❖ Effective data analysis starts with meticulous data cleaning and preparation. In this phase, we addressed missing values by employing appropriate imputation techniques to ensure completeness.

❖ Duplicate entries were identified and removed to maintain data integrity.

❖ Outliers, which can skew analysis and lead to inaccurate conclusions, were detected and managed appropriately.

❖ These steps were crucial to enhancing the quality of the dataset, ensuring that the subsequent analysis is both accurate and reliable. This rigorous process set a solid foundation for extracting meaningful insights from the data.

❖ Tools: Microsoft Excel (Power Query editor), Power BI, MySQL
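The three cleaning steps (imputation, duplicate removal, outlier handling) could look like this in code. This is a minimal sketch in Python, not the Power Query or MySQL steps the team actually used; the sample rows, the choice of median imputation, and the 1.5 × IQR outlier rule are all assumptions made for illustration.

```python
from statistics import median, quantiles

# Hypothetical sample rows; the real dataset has more columns.
rows = [
    {"product_id": "P1", "rating": 4.2, "discounted_price": 399.0},
    {"product_id": "P2", "rating": None, "discounted_price": 199.0},
    {"product_id": "P2", "rating": None, "discounted_price": 199.0},  # exact duplicate
    {"product_id": "P3", "rating": 3.8, "discounted_price": 329.0},
    {"product_id": "P4", "rating": 4.0, "discounted_price": 249.0},
    {"product_id": "P5", "rating": 4.4, "discounted_price": 99999.0},  # suspicious price
]


def drop_duplicates(rows):
    """Keep the first occurrence of each identical row."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def impute_median(rows, column):
    """Replace missing values in `column` with the median of the known values."""
    known = [r[column] for r in rows if r[column] is not None]
    fill = median(known)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def flag_outliers(rows, column, k=1.5):
    """Return rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [r[column] for r in rows]
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in rows if not low <= r[column] <= high]


cleaned = impute_median(drop_duplicates(rows), "rating")
outliers = flag_outliers(cleaned, "discounted_price")
print(len(cleaned), [r["product_id"] for r in outliers])
```

Flagging outliers rather than deleting them keeps the decision (drop, cap, or keep) visible to whoever reviews the data.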
Category Distribution Analysis
Overview
• Overview of the distribution of products across different categories.

Visual Representation
• Visual representation through a bar chart illustrating the distribution of products in various categories.

Significance of Product Distribution
• Analysis of the significance of category distribution in understanding the product landscape.
Price Analysis
Distribution of Prices
• Distribution of discounted prices and actual prices in the dataset.
• Histograms illustrating the distribution of discounted prices and actual prices.

Price Relationship Analysis
• Visual representation through a bar chart illustrating the distribution of products in various categories.

Key Pricing Insights
• Key insights on pricing trends and their implications for product strategy and marketing decisions.
Discount Analysis
Discount Percentage Analysis
• Analysis of discount percentage across products.
• Visual representation: box plot illustrating the distribution of discount percentages.

Impact of Discounts
• Insights into the impact of discounts on product sales.
• Insights into the impact of discounts on customer behaviour.
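As a rough illustration of what sits behind a discount box plot, the sketch below derives a discount percentage from the two price columns and computes the quartiles that form the box. The price pairs are hypothetical, and the formula (discount as a share of the actual price) is the usual definition rather than something the slides state explicitly.

```python
from statistics import quantiles


def discount_percentage(actual_price, discounted_price):
    """Discount as a percentage of the original (actual) price."""
    if actual_price <= 0:
        raise ValueError("actual_price must be positive")
    return (actual_price - discounted_price) / actual_price * 100


# Hypothetical (actual_price, discounted_price) pairs.
prices = [(1000, 500), (800, 600), (400, 300), (2000, 800), (500, 450)]
discounts = sorted(discount_percentage(a, d) for a, d in prices)

# Quartiles are the values a box plot draws as the box edges and the median line.
q1, med, q3 = quantiles(discounts, n=4, method="inclusive")
print(discounts, q1, med, q3)
```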
Rating Distribution Analysis
Overview of the distribution of product ratings in the dataset
• Introduction to the dataset
• Explanation of rating scale
• Summary of overall rating distribution

Analysis of the spread of ratings across different products
• Comparison of ratings among top products
• Identification of products with highest and lowest ratings
• Discussion on factors influencing rating spread

Visualization: Histogram showing the distribution of product ratings
• Description of histogram
• Key observations from the histogram
• Interpretation of data visualized

Insights into the range and frequency of ratings given by customers
• Analysis of most common ratings
• Examination of rating frequency
• Insights into customer satisfaction levels

Understanding the distribution pattern to identify trends in customer satisfaction levels
• Identification of trends in customer satisfaction
• Correlation between rating distribution and product features
• Implications for product improvement
Top-Rated Products
List of Top-Rated Products
• Products with the highest ratings based on the dataset analysis.

Key Characteristics and Features
• Characteristics of the top-rated products identified in the dataset.
• Features that make these products stand out.

Factors Contributing to High Ratings
• Insights into the factors contributing to the high ratings of these products.

Comparison of Top-Rated Products
• Comparison in terms of ratings, price, and user reviews.

Implications on Strategy and Marketing
• Implications of the top-rated products on product strategy.
• Impact on marketing decisions.
Review Content Analysis
Common Themes and Sentiments
• Analysis of common themes and sentiments found in customer reviews.
• Identification of frequently mentioned terms and phrases in the review content.
• Examples: "Looks durable", "Build quality is good".

Visualization of Review Content
• Visualization showcasing a word cloud of the most frequent terms used in the reviews.

Impact on Product Ratings and Engagement
• Insights into the impact of review content on product ratings and customer engagement.
• Examination of the relationship between review sentiment and product success.

Summary of Key Findings
• Summary of key findings from the review content analysis.
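A word cloud is driven by term frequencies, so the counting step can be sketched on its own. The reviews and the stop-word list below are hypothetical, and the tokenisation is deliberately simple; this is an illustration of the idea, not the method used to produce the slides.

```python
import re
from collections import Counter

# A small hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "is", "a", "and", "it", "this", "for", "of", "to", "very"}


def term_frequencies(reviews):
    """Count lower-cased words across all reviews, skipping stop words."""
    counts = Counter()
    for text in reviews:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


# Hypothetical review_content values.
reviews = [
    "Looks durable and the build quality is good",
    "Good cable, build quality is very good",
    "Durable and good value for the price",
]
counts = term_frequencies(reviews)
top_terms = counts.most_common(3)
print(top_terms)
```

The resulting counts can be passed to any plotting tool that sizes words by frequency.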
Market Share
Market share measures a company's sales percentage within an industry, reflecting its competitive standing. Companies with higher
market shares benefit from economies of scale, brand recognition, and customer loyalty. Analyzing market share reveals a company's
market position, strategy effectiveness, and improvement areas, guiding resource allocation and marketing decisions. In dynamic
markets, sustaining or growing market share is crucial for long-term success and competitiveness.
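The definition above reduces to a single ratio: a company's sales divided by total industry sales, expressed as a percentage. A minimal sketch, with hypothetical seller names and sales figures:

```python
def market_share(company_sales, industry_sales):
    """Company sales as a percentage of total industry sales."""
    if industry_sales <= 0:
        raise ValueError("industry_sales must be positive")
    return company_sales / industry_sales * 100


# Hypothetical sales figures per seller, all in the same currency unit.
sales = {"Seller A": 450_000, "Seller B": 300_000, "Seller C": 250_000}
total = sum(sales.values())
shares = {name: market_share(value, total) for name, value in sales.items()}
print(shares)
```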
Select product_id, product_name, category From Amazon;

List all products with their product id, product name, and category.
Select * From Amazon Where rating >= 4;

Display all columns for products that have a rating of 4.0 or higher.

When conducting an Amazon sales analysis, focusing on products with ratings of 4 or higher is crucial because these high ratings indicate strong customer satisfaction and can significantly impact sales performance.
Select product_id, product_name, category From Amazon
Where category Like '%Computers&Accessories%';

List products that are in the Computers & Accessories category.
Select product_id, product_name From Amazon
Where about_product Like '%durable%';

Find all products where the about_product column contains the word durable.

Analyzing product durability is critical in various industries to ensure customer satisfaction, brand reputation, and cost-efficiency.
Select Count(Distinct product_id) As No_of_products From Amazon;

Write a query to count the total number of products in the dataset.

When presenting data about the number of products available on Amazon, it's important to highlight the vast diversity and scale of the platform.
Select Avg(rating) As Avg_rating From Amazon;

Find the average rating of all products.

The average rating of products on Amazon is a crucial metric for both consumers and sellers, influencing purchasing decisions, sales performance, and overall customer satisfaction.
Select product_id, product_name, rating From Amazon
Order By rating Desc Limit 5;

List the top 5 highest-rated products based on the rating, sorted in descending order.

Highlighting the importance of the top 5 products on Amazon involves understanding how these top-rated products impact various stakeholders, including consumers, sellers, and the platform itself.
Select product_id, product_name, Count(review_id) As review_count
From Amazon Group By product_id, product_name;

List all products along with the number of reviews they have. Include columns for product id, product name, and review count.

Understanding the number of reviews a product has on Amazon is crucial for various stakeholders, including consumers, sellers, and the platform itself.
Select Distinct A1.product_id, A1.product_name, A1.category, A1.rating
From Amazon A1 Join Amazon A2
On A1.rating = A2.rating And A1.category = A2.category
And A1.product_id <> A2.product_id;

Find products that have the same rating and belong to the same category. Display product id, product name, category, and rating.

Consumers often look for similar products to compare features, prices, and reviews. Finding products within the same category and rating allows for more informed purchasing decisions.
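One detail of this kind of self-join is easy to miss: every row matches itself on rating and category, so the two sides need to be required to differ if only genuine matches are wanted. The sketch below runs the idea on SQLite through Python's standard library rather than on MySQL; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Amazon (product_id TEXT, product_name TEXT, category TEXT, rating REAL)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO Amazon VALUES (?, ?, ?, ?)",
    [
        ("P1", "Cable A", "Computers&Accessories", 4.2),
        ("P2", "Cable B", "Computers&Accessories", 4.2),
        ("P3", "Mouse", "Computers&Accessories", 4.5),
        ("P4", "Kettle", "Home&Kitchen", 4.2),
    ],
)

query = """
    SELECT DISTINCT A1.product_id, A1.product_name, A1.category, A1.rating
    FROM Amazon A1
    JOIN Amazon A2
      ON A1.rating = A2.rating
     AND A1.category = A2.category
     AND A1.product_id <> A2.product_id  -- skip a row matching itself
    ORDER BY A1.product_id
"""
matches = conn.execute(query).fetchall()
print(matches)
```

Here only P1 and P2 share both rating and category; P3 and P4 have no partner and drop out. `DISTINCT` keeps a product from appearing once per partner.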
Select product_id, product_name, rating,
Case When rating >= 4.5 Then 'Excellent'
When rating >= 4.0 Then 'Good'
Else 'Average' End As Rating_Category
From Amazon;

Categorize products into three categories based on their rating: Excellent for ratings of 4.5 and above, Good for ratings from 4.0 up to (but not including) 4.5, and Average for ratings below 4.0.

Sellers can tailor their marketing strategies based on the rating category. For instance, products in the Excellent category can be marketed as premium or top-rated products, while those in the Good category can be promoted as high-value options.
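Where a rating of exactly 4.5 lands depends on whether the first comparison is `>` or `>=`. The sketch below assumes 4.5 should count as Excellent, as the task statement says, and checks the boundary values on SQLite via Python's standard library; the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Amazon (product_id TEXT, rating REAL)")
# Hypothetical rows chosen to sit on and around the boundaries.
conn.executemany(
    "INSERT INTO Amazon VALUES (?, ?)",
    [("P1", 4.7), ("P2", 4.5), ("P3", 4.0), ("P4", 3.9)],
)

query = """
    SELECT product_id,
           CASE
               WHEN rating >= 4.5 THEN 'Excellent'  -- 4.5 itself counts as Excellent
               WHEN rating >= 4.0 THEN 'Good'
               ELSE 'Average'
           END AS rating_category
    FROM Amazon
    ORDER BY product_id
"""
labels = dict(conn.execute(query).fetchall())
print(labels)
```

`CASE` branches are evaluated in order, so the second branch only sees ratings below 4.5 and needs no upper bound.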
Select *, (actual_price - discounted_price) As Discount_amount
From Amazon;

Add a new column discount_amount to the result set of the products table, calculated as the difference between actual_price and discounted_price.

Understanding the discount amount allows consumers to compare prices effectively. By knowing the discount amount, consumers can assess how much they save relative to the original price and how the offer compares with competing offers.
WITH RankedProducts AS (
SELECT product_name, discount_percentage,
ROW_NUMBER() OVER (ORDER BY discount_percentage DESC) AS row_num
FROM Amazon)
SELECT product_name, discount_percentage
FROM RankedProducts WHERE row_num = 1;

Find the product with the highest discount percentage.

Consumers often engage in competitive shopping, seeking out the best deals and discounts available. Knowing the highest discount percentage enables consumers to stay competitive in their shopping endeavors, ensuring they don't miss out on significant savings offered by competitors.
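`ROW_NUMBER()` always returns exactly one row for position 1, even when several products share the highest discount; `RANK()` returns all of them. The sketch below contrasts the two on SQLite through Python's standard library (window functions need SQLite 3.25 or newer). The sample rows and the helper name `top_discount` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Amazon (product_name TEXT, discount_percentage REAL)")
# Hypothetical rows: two products tie for the highest discount.
conn.executemany(
    "INSERT INTO Amazon VALUES (?, ?)",
    [("Cable", 90), ("Charger", 90), ("Mouse", 64)],
)


def top_discount(window_function):
    """Return the product names placed first by the given window function."""
    query = f"""
        WITH RankedProducts AS (
            SELECT product_name, discount_percentage,
                   {window_function}() OVER (ORDER BY discount_percentage DESC) AS pos
            FROM Amazon
        )
        SELECT product_name FROM RankedProducts WHERE pos = 1 ORDER BY product_name
    """
    return [name for (name,) in conn.execute(query)]


single = top_discount("ROW_NUMBER")  # one row, chosen arbitrarily among ties
tied = top_discount("RANK")          # every product sharing the top discount
print(single, tied)
```

Which behaviour is wanted depends on the question: "one best deal" suits `ROW_NUMBER()`, "all best deals" suits `RANK()`.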
Create View HighRatingProducts As
Select * From Amazon Where rating >= 4.5;
Select product_id, product_name, rating From HighRatingProducts;

Create a view named HighRatingProducts that includes products with a rating of 4.5 and above.

Choosing products with ratings of 4.5 and above reduces the risk of encountering issues such as defects, poor performance, or dissatisfaction. Consumers can feel more confident in their purchases, knowing that they are less likely to experience negative outcomes associated with lower-rated products.
Select product_id, product_name, category, rating,
Rank() Over (Partition By category Order By rating Desc) As Ranking
From Amazon;

Use a window function to rank products based on their rating within each category.

If a customer searches for a product within a specific category, products within that category may be given higher priority in the search results. Therefore, it is important for sellers to accurately categorize their products to increase their visibility and improve their chances of being found by potential customers.
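Ratings often tie, and `RANK()` leaves gaps after a tie (1, 1, 3) while `DENSE_RANK()` does not (1, 1, 2). The sketch below shows both side by side within each category, run on SQLite via Python's standard library (SQLite 3.25 or newer); the rows and the output column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Amazon (product_id TEXT, category TEXT, rating REAL)")
# Hypothetical rows: P1 and P2 tie at the top of Electronics.
conn.executemany(
    "INSERT INTO Amazon VALUES (?, ?, ?)",
    [
        ("P1", "Electronics", 4.5),
        ("P2", "Electronics", 4.5),
        ("P3", "Electronics", 4.1),
        ("P4", "Home", 4.3),
    ],
)

query = """
    SELECT product_id,
           RANK()       OVER (PARTITION BY category ORDER BY rating DESC) AS rank_in_category,
           DENSE_RANK() OVER (PARTITION BY category ORDER BY rating DESC) AS dense_rank_in_category
    FROM Amazon
    ORDER BY product_id
"""
ranks = {pid: (r, d) for pid, r, d in conn.execute(query)}
print(ranks)
```

`PARTITION BY category` restarts the numbering for each category, which is why P4 ranks first in Home despite a lower rating than P1.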
Select product_id, product_name, discounted_price,
Count(product_id) Over (Order By discounted_price
Rows Between Unbounded Preceding And Current Row) As cumulated_count
From Amazon;

Calculate the cumulative count of products, sorted by discounted price.

A cumulative count sorted by discounted price shows how many products fall at or below each price point, which helps identify trends related to pricing strategies.
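A running count over a window frame can be checked on a few rows. The sketch below uses SQLite through Python's standard library (SQLite 3.25 or newer) with hypothetical prices; the frame `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW` counts every row from the cheapest product up to the current one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Amazon (product_id TEXT, discounted_price REAL)")
# Hypothetical rows, inserted out of price order on purpose.
conn.executemany(
    "INSERT INTO Amazon VALUES (?, ?)",
    [("P1", 399), ("P2", 199), ("P3", 329), ("P4", 149)],
)

query = """
    SELECT product_id, discounted_price,
           COUNT(product_id) OVER (
               ORDER BY discounted_price
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS cumulated_count
    FROM Amazon
    ORDER BY discounted_price
"""
running = conn.execute(query).fetchall()
for row in running:
    print(row)
```

With a `ROWS` frame, products that share a price still receive distinct counts; a `RANGE` frame would give tied prices the same count.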
DELIMITER $
Create Procedure Update_rating(In productId Varchar(200), In new_rating Decimal(10,1))
Begin
Update Amazon Set rating = new_rating Where product_id = productId;
End $
DELIMITER ;
Call Update_rating('B07JW9H4J1', 4.4);

Create a stored procedure to update the rating of a product given its product id and new rating.
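For readers working outside MySQL, the same update can be wrapped in an application-side function. The sketch below is an analogue, not a stored procedure: SQLite has no stored procedures, so a Python function with a parameterised `UPDATE` plays that role. The function name `update_rating` and the 0 to 5 range check are assumptions; the slides do not state the rating scale.

```python
import sqlite3


def update_rating(conn, product_id, new_rating):
    """Set the rating of one product; returns the number of rows changed."""
    # Assumed rating scale of 0 to 5; adjust if the dataset differs.
    if not 0 <= new_rating <= 5:
        raise ValueError("rating must be between 0 and 5")
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE Amazon SET rating = ? WHERE product_id = ?",
            (new_rating, product_id),
        )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Amazon (product_id TEXT PRIMARY KEY, rating REAL)")
conn.execute("INSERT INTO Amazon VALUES ('B07JW9H4J1', 4.2)")

changed = update_rating(conn, "B07JW9H4J1", 4.4)
stored = conn.execute(
    "SELECT rating FROM Amazon WHERE product_id = ?", ("B07JW9H4J1",)
).fetchone()[0]
print(changed, stored)
```

Passing values as `?` parameters rather than building the SQL string keeps product ids with quotes or other special characters from breaking the statement.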
Select category From (
Select category, Avg(rating) As Avg_rating
From Amazon Group By category) cr
Order By Avg_rating Desc Limit 1;

Find the category with the highest average rating for products. Use subqueries and aggregate functions to achieve this.

By examining the competitive landscape within a product category, you can identify your main competitors, understand their product offerings, and assess their strengths and weaknesses.
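The subquery already yields one row per category, so the outer query only needs to sort and keep the first row. A minimal sketch on SQLite via Python's standard library, with hypothetical rows and a hypothetical subquery alias `category_ratings`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Amazon (product_id TEXT, category TEXT, rating REAL)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO Amazon VALUES (?, ?, ?)",
    [
        ("P1", "Electronics", 4.0),
        ("P2", "Electronics", 4.5),
        ("P3", "Home", 4.5),
        ("P4", "Home", 4.5),
        ("P5", "Toys", 3.5),
    ],
)

query = """
    SELECT category, avg_rating
    FROM (
        SELECT category, AVG(rating) AS avg_rating
        FROM Amazon
        GROUP BY category
    ) AS category_ratings
    ORDER BY avg_rating DESC
    LIMIT 1
"""
best_category, best_avg = conn.execute(query).fetchone()
print(best_category, best_avg)
```

Returning the average alongside the category makes the result easier to sanity-check; `LIMIT 1` picks one category arbitrarily if two tie for the top average.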
Select A1.product_id As product_id_1, A1.product_name As product_name_1, A1.rating As rating_1,
A2.product_id As product_id_2, A2.product_name As product_name_2, A2.rating As rating_2
From Amazon A1 Join Amazon A2
On A1.category = A2.category And A1.rating > A2.rating
And A1.product_id <> A2.product_id;

Find pairs of products from the same category where one product has a higher rating than the other. Display columns for product_id_1, product_name_1, rating_1, product_id_2, product_name_2, and rating_2.

When customers search for products within a specific category, they often compare different options. By identifying pairs of products with varying ratings, you can guide customers toward higher-rated items. This improves their overall shopping experience and satisfaction.
Team Members

➢ Hemant Choudhary
➢ Prakash Adithya J.
➢ Debanshi Paul
➢ Mohit Kumar

Thanks!
