
Data Mining Final Report

This data mining report investigates the key factors influencing laptop prices, focusing on specifications such as RAM, screen size, and battery life. The analysis, based on a dataset of 11,769 responses, reveals that both numerical and categorical variables significantly impact consumer choices and pricing. The findings indicate a right-skewed price distribution, highlighting a market preference for budget and mid-range laptops, particularly from brands like Apple and operating systems like Windows.


DATA MINING REPORT

Course: Data Mining
Professor: Bouziane Laila
Date: April 2025

This report was prepared collectively by:

• Takouaha Ihsane, 22012348
• Elfettane Maroua, 22004016
• Makaoui Safia, 22003566
• Abdellatif Fatach, 21018528
• Rkiba Nadia, 22011809
Laptop Price

Table of Contents

1. Research Question.
2. General introduction.
3. Literature review.
4. Data description.
5. Methodology.
6. Data treatment and empirical analysis.
7. Data Visualization.
8. Conclusion.
9. Bibliography.
Research Question:

“What are the key factors influencing laptop prices, and which
specifications have the most significant impact?”

• This is the central question that the analyses and calculations
in this report aim to answer.
General introduction:
Laptop pricing is an important topic that affects both consumers
and businesses worldwide. International organizations such as the
World Trade Organization (WTO) focus on market trends,
competition, and fair pricing to ensure technology remains
accessible.

They highlight the role of pricing in economic growth and digital
inclusion. This topic is important for various groups:

• Consumers need to understand what influences laptop prices to
make better purchasing decisions.
• Manufacturers and retailers can use this information to set
competitive prices.
• Governments and policymakers also study laptop pricing to
ensure fairness and affordability, especially in education and
business.
Literature Review:

1. “Analysis of Factors Influencing Laptop Purchase Decisions” by
Nawiyah, Kurniawati, and Endrawati (2023)

- This study investigates the determinants affecting consumers’
decisions when purchasing laptops. Using linear regression
analysis on data collected from 200 students at APP Polytechnic,
the research finds that price, brand image, and quality
significantly and positively influence purchasing decisions. These
findings suggest that consumers prioritize these factors when
selecting laptops.

2. “Factors Influencing Consumers’ Laptop Purchases” by Nasır,
Yoruker, Güneş, and Ozdemir

- This study identifies seven key factors influencing consumers’
laptop purchase decisions:
• Core Technical Features: Specifications like processor speed
and memory.
• Post-Purchase Services: Warranty and customer support.
• Price and Payment Conditions: Cost and financing options.
• Peripheral Specifications: Additional features like ports and
accessories.
• Physical Appearance: Design and aesthetics.
• Value-Added Features: Extras such as bundled software.
• Connectivity and Mobility: Features like Bluetooth and
portability.
Data Description:
Our dataset comprises 11,769 observations collected from
different individuals, with the following 11 variables:

1. Brand: Categorical variable representing the manufacturer
of the laptop (e.g., Apple, Dell, HP).
2. Processor: Categorical variable indicating the CPU model,
such as Intel i5 or AMD Ryzen 7.
3. RAM (GB): Numerical variable specifying the amount of
Random Access Memory in gigabytes (e.g., 8, 16).
4. Storage: Categorical variable detailing the type and
capacity of storage, like 512GB SSD or 1TB HDD.
5. GPU: Categorical variable denoting the Graphics Processing
Unit, such as Nvidia GTX 1650 or integrated graphics.
6. Screen Size: Numerical variable measuring the diagonal
size of the laptop screen in inches (e.g., 15.6).
7. Resolution: Categorical variable describing the display
resolution, like 1920x1080 or 2560x1440.
8. Battery Life: Numerical variable indicating the battery
duration in hours (e.g., 10).
9. Weight (kg): Numerical variable representing the laptop’s
weight in kilograms (e.g., 1.5).
10. Operating System: Categorical variable specifying the OS
installed, such as Windows, macOS, or Linux.
11. Price ($): Numerical variable indicating the laptop’s price
in US dollars (e.g., 1200).
Methodology:

This study analyzes the relationship between laptop specifications
and prices using statistical and data mining techniques:
1. Data Preprocessing:
Clean the dataset, handle missing values, and encode
categorical variables.
2. Descriptive Statistics:
Calculate summary statistics for numerical variables and
frequency distributions for categorical variables.
3. Data Visualization:
Visualize categorical variables with bar and pie charts, and
numerical variables using histograms, box plots, and scatter
plots.
4. Statistical Analysis:
Conduct correlation analysis between numerical variables, and
analyze skewness and kurtosis to characterize the data
distribution.
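The first two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used for this report: the inline sample rows, the helper names `load_clean` and `summarize`, and the choice to drop incomplete rows are all hypothetical; only the column names follow the data description.

```python
import csv
import io
import statistics
from collections import Counter

# Hypothetical sample rows; the real dataset is not reproduced here.
RAW = """Brand,RAM (GB),Price ($)
Apple,16,1800.00
Dell,8,
HP,8,950.50
Dell,32,2400.00
"""

def load_clean(text):
    """Parse CSV text and drop rows that have a missing value in any column."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        if any(value.strip() == "" for value in row.values()):
            continue  # assumed policy: discard incomplete rows
        row["RAM (GB)"] = int(row["RAM (GB)"])
        row["Price ($)"] = float(row["Price ($)"])
        rows.append(row)
    return rows

def summarize(rows, column):
    """Summary statistics for one numerical column."""
    values = [row[column] for row in rows]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

rows = load_clean(RAW)
price_summary = summarize(rows, "Price ($)")
brand_counts = Counter(row["Brand"] for row in rows)  # frequency distribution

print(price_summary)
print(brand_counts.most_common())
```

Dropping incomplete rows is only one way to handle missing values; imputing a median would be an equally plausible choice and would change the counts.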

Data treatment and empirical analysis:

Numerical Variables Analysis:

Metric              RAM (GB)  Screen Size (inch)  Battery Life (hours)  Weight (kg)  Price ($)
Count               11,768    11,768              11,768                11,768       11,768
Mean                24.85     15.21               8.03                  2.34         $2,183.57
Median              16        15.6                8.0                   2.34         $1,840.87
Standard deviation  21.76     1.43                2.31                  0.67         $10,807.88
Maximum             64        17.3                12.0                  3.5          $279.59
Minimum             4         13.3                4.0                   1.2          $279.59
25th percentile     8         14.0                6.0                   1.76         $1,272.05
75th percentile     32        16.0                10.0                  2.91         $2,698.37

Interpretation:

• RAM: At least half of the laptops have 16 GB of RAM or less (the
median), but some high-end models go up to 64 GB, leading to a high
standard deviation.
• Screen Size: The middle 50% of laptops have a screen size between 14
and 16 inches, with the median at 15.6 inches.
• Battery Life: Laptops typically offer 6 to 10 hours of battery life, with
a few lasting up to 12 hours.
• Weight: The middle 50% of laptops weigh between 1.76 kg and 2.91 kg,
with an average of 2.34 kg.
• Price: The median price ($1,840.87) is lower than the mean
($2,183.57), indicating a right-skewed distribution.
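Two quick checks follow directly from the reported price statistics. The sketch below uses only the quartiles, mean, and median from the table; the 1.5 × IQR rule for flagging unusually high prices is a common convention assumed here, not a step described in this report.

```python
# Reported summary statistics for Price ($), copied from the table above.
q1, median, q3, mean = 1272.05, 1840.87, 2698.37, 2183.57

iqr = q3 - q1                      # spread of the middle 50% of prices
upper_fence = q3 + 1.5 * iqr       # assumed Tukey rule for high outliers
mean_minus_median = mean - median  # a positive gap points to right skew

print(f"IQR: {iqr:.2f}")
print(f"Upper fence: {upper_fence:.2f}")
print(f"Mean - median: {mean_minus_median:.2f}")
```

With these figures the interquartile range is about $1,426, so under the assumed rule any laptop priced above roughly $4,838 would count as an unusually expensive observation.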

Categorical Variables Analysis:

Category          Most Frequent Value  Count
Resolution        1366x768             2,932
Storage           1TB SSD              2,313
GPU               Nvidia GTX 1650      1,698
Processor         Intel i3             1,570
Operating System  Windows              2,954
Brand             Apple                1,262

Interpretation:

• Resolution: 1366x768 is the most common resolution, suggesting many
budget or mid-range laptops in the dataset.
• Storage: The 1TB SSD is the most common, indicating a preference
for modern, high-speed storage.
• GPU: The Nvidia GTX 1650 is the most frequent, suggesting many
gaming or creative laptops in the dataset.
• Processor: The Intel i3 is the most frequent, indicating many entry-level
or mid-range laptops.
• Operating System: Windows is the most frequent operating system in
the dataset, in line with global market trends.
• Brand: Apple is the most common brand, indicating a significant
presence of premium devices.
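To put these counts in proportion, each one can be divided by the number of observations. The sketch below copies the counts from the table; using 11,768 (the count reported for the numerical variables) as the denominator is an assumption.

```python
# Most frequent value per categorical variable, with counts copied from the table.
TOTAL = 11_768  # assumed denominator: count reported for the numerical variables

modes = {
    "Resolution": ("1366x768", 2932),
    "Storage": ("1TB SSD", 2313),
    "GPU": ("Nvidia GTX 1650", 1698),
    "Processor": ("Intel i3", 1570),
    "Operating System": ("Windows", 2954),
    "Brand": ("Apple", 1262),
}

# Share of all observations taken by each most frequent value, in percent.
shares = {name: 100 * count / TOTAL for name, (_, count) in modes.items()}

for name, (value, count) in modes.items():
    print(f"{name:<17} {value:<16} {count:>5}  {shares[name]:5.1f}%")
```

On these figures the most frequent operating system and resolution each account for roughly a quarter of the observations, and the most frequent brand for about 11%, so "most frequent" here does not mean "majority".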

Visualization of data:
SCREEN SIZE:

INTERPRETATION:
The chart depicts the distribution of screen sizes (measured in
inches). The x-axis represents different screen size ranges, while
the y-axis shows the frequency. Screen sizes occur with a similar
frequency across the ranges. The distribution appears uniform,
indicating no specific preference or dominance of any screen
size.

Economic Interpretation:
A uniform distribution of screen sizes suggests that consumers
value various screen sizes equally. Manufacturers and retailers
should ensure a balanced inventory across all screen size
categories, as there is no significant deviation in demand.

PRICE:

INTERPRETATION:
The histogram illustrates the distribution of prices. The x-axis
represents price ranges, while the y-axis shows the frequency.
The graph is skewed to the right, with the majority of laptops
priced in the lower range. Higher prices are less frequent.
Economic Interpretation:
The right-skewed distribution indicates that most consumers
prefer lower-priced laptops. This reflects price sensitivity in
the market. Businesses could focus on producing affordable
options to attract a larger customer base while maintaining
premium offerings for niche markets.

BATTERY LIFE:

INTERPRETATION:
The box plot shows the distribution of battery life in hours.
The median battery life lies at around 8 hours, with the
interquartile range (IQR) spanning approximately 6 to 10 hours,
in line with the summary statistics. The whiskers indicate the
minimum and maximum battery life values (4 and 12 hours), with
no apparent outliers in the data.

Economic Interpretation:
A median battery life of 8 hours is competitive in the market, and
the absence of outliers suggests that most devices fall within a
predictable range. Businesses could focus on innovation to extend
battery life beyond 10 hours to differentiate their products and
appeal to consumers seeking long-lasting devices.

Weight:

INTERPRETATION:

The bar chart represents the frequency distribution of weight (kg), and the
orange cumulative line shows the percentage of data points covered as
weight increases. The distribution indicates a higher frequency of laptops in
the lower weight categories, with a steady decline as weight increases. The
cumulative line reaches 100% near the maximum weight range.

Economic Interpretation:
Most laptops are lightweight, suggesting consumer preference for
portability. Businesses can target this demand by designing lightweight and
compact products. The cumulative trend indicates that only a small
percentage of consumers might opt for heavier products, so those could be
positioned as niche offerings.

RAM:
[Chart: distribution of RAM (GB)]

INTERPRETATION: The distribution of RAM sizes suggests that
the majority of laptops cluster around lower RAM values, with
fewer instances at higher RAM sizes (e.g., 64 GB, the maximum in
the dataset). This could indicate that higher RAM configurations
are rare and potentially cater to niche markets, such as
specialized computing tasks or high-end users.

ECONOMIC INTERPRETATION:
Businesses might focus their offerings on the more common configurations
while maintaining some presence in the high-RAM segment for specialized
demand.
PCA:

INTERPRETATION:
Variables such as Screen Size (inch), Weight (kg), and Battery
Life (hours) contribute strongly to the first component (Dim1),
which we name “serviceability”; these factors are crucial in
explaining the largest share of variation in the data.
The second component (Dim2) is dominated by Price ($) and RAM
(GB), suggesting that these variables are associated with the
device’s performance and cost, so we call it “performance-cost”.
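How a first component and its loadings arise can be sketched without any external library. The report does not state which software or settings produced its PCA, so everything below is an assumption made for illustration: the five laptops are hypothetical, the variables are standardized (so the analysis runs on the correlation matrix), and only the first component is extracted, using power iteration rather than a full eigendecomposition.

```python
import math
import statistics

# Hypothetical measurements for five laptops (not taken from the report's dataset).
screen = [13.3, 14.0, 15.6, 16.0, 17.3]   # inches
weight = [1.2, 1.5, 2.1, 2.4, 3.1]        # kg
battery = [12.0, 10.0, 8.0, 7.0, 4.0]     # hours

def standardize(column):
    """Rescale a column to mean 0 and population standard deviation 1."""
    mu = statistics.mean(column)
    sd = statistics.pstdev(column)
    return [(x - mu) / sd for x in column]

def correlation_matrix(columns):
    """Pearson correlations between every pair of columns."""
    z = [standardize(col) for col in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]

def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

def first_component(matrix, iterations=200):
    """Leading eigenvector (loadings) and eigenvalue by power iteration."""
    v = [1.0 + 0.1 * i for i in range(len(matrix))]  # arbitrary start vector
    for _ in range(iterations):
        w = matvec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(matrix, v)))
    return v, eigenvalue

corr = correlation_matrix([screen, weight, battery])
loadings, eigenvalue = first_component(corr)
explained = eigenvalue / len(corr)  # share of total variance carried by Dim1

print("Dim1 loadings (screen, weight, battery):", [round(x, 3) for x in loadings])
print("Share of variance explained by Dim1:", round(explained, 3))
```

In this toy sample, screen size and weight load on the first component with the same sign and battery life with the opposite sign, which is the kind of pattern that leads to grouping these variables under one named dimension.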
Analysis of Distribution:
• Skewness:

- RAM (GB): 0.884 (positively skewed, longer right tail)

- Screen Size (inch): 0.038 (almost symmetric)

- Battery Life (hours): -0.014 (almost symmetric)

- Weight (kg): 0.007 (almost symmetric)

- Price ($): 1.763 (positively skewed, longer right tail)

For the Price ($) column, the skewness of 1.76 means the
distribution is positively skewed: most laptop prices are
concentrated at the lower end, with fewer laptops priced
significantly higher.

• Kurtosis (excess, relative to a normal distribution):

- RAM (GB): -0.670 (platykurtic, flatter peak)

- Screen Size (inch): -1.352 (platykurtic, flatter peak)

- Battery Life (hours): -1.206 (platykurtic, flatter peak)

- Weight (kg): -1.203 (platykurtic, flatter peak)

- Price ($): 4.406 (leptokurtic, heavy tails)

For the Price ($) column, the kurtosis of 4.41 means the
distribution is leptokurtic, with a sharper peak and heavier
tails than a normal distribution. This suggests some extreme
values (outliers) in the price data.
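Both measures are standardized moments and can be computed directly. The sketch below uses the plain population formulas and a hypothetical list of prices; statistical packages often apply small-sample corrections, so the values reported above may have been produced with a slightly different formula.

```python
import statistics

def skewness(values):
    """Population skewness: mean of the cubed standardized deviations."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mu) / sd) ** 3 for x in values) / len(values)

def excess_kurtosis(values):
    """Population kurtosis minus 3, so a normal distribution scores 0."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mu) / sd) ** 4 for x in values) / len(values) - 3

# Hypothetical prices: mostly low values plus one expensive model.
prices = [900, 1000, 1100, 1200, 1300, 5000]
# Evenly spread values, similar in shape to a uniform distribution.
even = [1, 2, 3, 4, 5]

print("skewness(prices):", round(skewness(prices), 3))
print("excess_kurtosis(prices):", round(excess_kurtosis(prices), 3))
print("skewness(even):", round(skewness(even), 3))
print("excess_kurtosis(even):", round(excess_kurtosis(even), 3))
```

The single expensive model is enough to make the skewness of the hypothetical prices positive, while the evenly spread sample is symmetric and has negative excess kurtosis, the same signs as the price and screen-size results above.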
Conclusion:
This report examined the factors influencing laptop prices through
data analysis. The key findings include:

• RAM, screen size, battery life, and weight are the main
numerical variables associated with laptop pricing.
• Brand, processor, GPU, storage, resolution, and operating
system are important categorical variables that affect consumer
choice.
• The price distribution is right-skewed due to the presence of
high-priced models, and extreme values are more frequent than
a normal distribution would predict.
• The dataset shows a focus on budget and mid-range laptops,
with a significant presence of Apple and Windows-based
devices.

Bibliography:

• https://www.researchgate.net/publication/261810646_Factors_Affecting_the_Notebook_Computer_Prices_in_Turkey_A_Hedonic_Analysis

• https://www.researchgate.net/publication/383859999_Determinants_of_Laptop_Prices_An_Empirical_Study_Using_Regression_Analysis
