Data Mining Final Report
Table of Contents
1. Research Question
2. General Introduction
3. Literature Review
4. Data Description
5. Methodology
6. Data Treatment and Empirical Analysis
7. Data Visualization
8. Conclusion
9. Bibliography
Research Question:
Interpretation:
RAM: Most laptops have 16GB RAM or less, but some high-end models
go up to 64GB, leading to a high standard deviation.
Screen Size: Most laptops have a screen size between 14 and 16
inches, with the median at 15.6 inches.
Battery Life: Laptops typically offer 6 to 10 hours of battery life, with
a few lasting up to 12 hours.
Weight: Most laptops weigh between 1.7 kg and 2.9 kg, with an
average of 2.34 kg.
Price: The median price is lower than the mean, indicating a right-
skewed distribution.
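The comparison of mean and median used for the price variable can be reproduced with a few lines of Python. The following is a minimal sketch only: the `prices` list is made-up illustrative data, not the dataset used in this report, and `summarize` is a hypothetical helper name.

```python
import statistics

# Hypothetical laptop prices in dollars (illustrative values, not the report's data).
prices = [450, 520, 600, 650, 700, 780, 850, 990, 1500, 3200]

def summarize(values):
    """Return the mean, median, and sample standard deviation of a list."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

summary = summarize(prices)
# A mean above the median is an informal hint of right skew, not a formal test.
right_skew_hint = summary["mean"] > summary["median"]
print(summary, right_skew_hint)
```

In this toy sample the two expensive laptops pull the mean well above the median, which is the same pattern described for Price.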
Visualization of data:
SCREEN SIZE:
INTERPRETATION:
The chart depicts the distribution of screen sizes (measured in
inches). The x-axis represents screen size ranges, while the
y-axis shows the frequency. Frequencies are roughly constant
across the ranges, so the distribution appears close to uniform,
with no single screen size dominating.
Economic Interpretation:
A near-uniform distribution of screen sizes suggests that consumers
value the various screen sizes about equally. Manufacturers and
retailers should therefore keep a balanced inventory across all
screen size categories, as the data show no significant deviation
in demand.
PRICE:
INTERPRETATION:
The histogram illustrates the distribution of laptop prices. The
x-axis represents price ranges, while the y-axis shows the
frequency. The graph is skewed to the right: the majority of
laptops are priced in the lower range, and higher prices are less
frequent.
Economic Interpretation:
The right-skewed distribution indicates that most laptops in the
sample sit at the lower end of the price range, which is consistent
with a price-sensitive market. Businesses could focus on producing
affordable models to attract a larger customer base while
maintaining premium offerings for niche markets.
BATTERY LIFE:
INTERPRETATION:
The box plot shows the distribution of battery life in hours.
The median battery life lies at around 9 hours, with the
interquartile range (IQR) spanning approximately 7 to 11
hours. The whiskers extend to the minimum and maximum battery
life values, and there are no apparent outliers in the data.
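The quantities read off a box plot (quartiles, IQR, and outliers) can also be computed directly. The sketch below uses made-up battery-life values, not the report's data, and assumes the common 1.5 x IQR rule for flagging outliers; the plotting tool behind the figure may use a different quartile convention. `box_plot_stats` is a hypothetical helper name.

```python
import statistics

# Hypothetical battery-life readings in hours (illustrative, not the report's data).
battery_hours = [6, 7, 7, 8, 9, 9, 9, 10, 11, 11, 12]

def box_plot_stats(values):
    """Compute quartiles, IQR, and outliers under the 1.5 * IQR rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}

stats = box_plot_stats(battery_hours)
print(stats)
```

When the outlier list is empty, as in this toy sample, the whiskers of the box plot reach the minimum and maximum values.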
Economic Interpretation:
A median battery life of 9 hours is competitive in the market, and
the consistency (small IQR) suggests that most devices deliver
similar performance. Businesses could focus on innovation to
extend battery life beyond 11 hours to differentiate their products
and appeal to consumers seeking long-lasting devices.
Weight:
INTERPRETATION:
The bar chart represents the frequency distribution of weight (kg), and the
orange cumulative line shows the percentage of data points covered as
weight increases. The distribution shows a higher frequency of laptops in
the lower weight categories, with a steady decline as weight increases. The
cumulative line reaches 100% at the maximum weight range.
Economic Interpretation:
Most products are lightweight, suggesting consumer preference for
portability. Businesses can target this demand by designing lightweight and
compact products. The cumulative trend indicates that only a small
percentage of consumers might opt for heavier products, so those could be
positioned as niche offerings.
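The cumulative line in a chart of this kind is a running total of the bin frequencies expressed as a percentage. A minimal sketch, assuming made-up weights and arbitrarily chosen bin edges (neither comes from the report's data); `cumulative_percentages` is a hypothetical helper name:

```python
# Hypothetical laptop weights in kg (illustrative, not the report's data).
weights_kg = [1.2, 1.3, 1.4, 1.4, 1.6, 1.8, 1.9, 2.1, 2.4, 3.0]

def cumulative_percentages(values, bin_edges):
    """Count values per [low, high) bin and return running cumulative percentages."""
    counts = [
        sum(1 for v in values if low <= v < high)
        for low, high in zip(bin_edges, bin_edges[1:])
    ]
    total = sum(counts)  # assumes at least one value falls inside the bins
    running, cumulative = 0, []
    for count in counts:
        running += count
        cumulative.append(100 * running / total)
    return counts, cumulative

counts, cumulative = cumulative_percentages(weights_kg, [1.0, 1.5, 2.0, 2.5, 3.5])
print(counts, cumulative)  # [4, 3, 2, 1] [40.0, 70.0, 90.0, 100.0]
```

Here the bars decline from the lightest bin to the heaviest, and the cumulative series climbs to 100% at the last bin, mirroring the shape described in the interpretation.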
RAM:
Figure: frequency distribution of RAM (GB)
INTERPRETATION:
The distribution of RAM sizes shows that most laptops cluster around
lower RAM values, with only a few instances at the highest RAM sizes
(up to 64 GB in this dataset). Higher RAM configurations are therefore
rare and likely cater to niche markets, such as specialized computing
tasks or high-end users.
ECONOMIC INTERPRETATION:
Businesses might focus their offerings on the more common configurations
while maintaining some presence in the high-RAM segment for specialized
demand.
PCA: Steer pca
INTERPRETATION:
Screen Size (inch), Weight (kg), and Battery Life (hours)
contribute strongly to the first component (Dim1), which we name
"serviceability". These variables explain the largest share of
the variation in the data.
The second component (Dim2) is dominated by Price ($) and RAM (GB),
which relate to the device’s performance and cost, so we call it
"performance-cost".
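Loadings of the kind interpreted here can be obtained by eigendecomposing the correlation matrix of the five variables, which is equivalent to running PCA on standardized data. The sketch below is illustrative only: the matrix `X` holds made-up values, not the report's dataset, `pca_loadings` is a hypothetical helper name, and NumPy is assumed to be available. The software actually used for the report's PCA is not stated, so its output may differ in sign or scaling.

```python
import numpy as np

# Hypothetical rows: screen size (inch), weight (kg), battery life (hours),
# price ($), RAM (GB). Illustrative values only, not the report's data.
X = np.array([
    [13.3, 1.2, 12.0,  900.0,  8.0],
    [14.0, 1.5, 10.0, 1200.0, 16.0],
    [15.6, 2.1,  8.0,  700.0,  8.0],
    [15.6, 2.3,  7.0, 1500.0, 32.0],
    [17.3, 2.9,  6.0, 2200.0, 64.0],
    [14.0, 1.4, 11.0,  600.0,  4.0],
])

def pca_loadings(data):
    """Eigendecompose the correlation matrix (PCA on standardized variables)."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)  # returned in ascending order
    order = np.argsort(eigenvalues)[::-1]             # largest variance first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    explained = eigenvalues / eigenvalues.sum()       # share of variance per component
    return eigenvectors, explained

loadings, explained = pca_loadings(X)
print(loadings[:, :2])  # column 0 = Dim1 loadings, column 1 = Dim2 loadings
print(explained[:2])
```

Each row of `loadings` corresponds to one variable, so a variable with a large absolute value in the first column contributes strongly to Dim1. The sign of an eigenvector is arbitrary, so only the magnitudes and relative signs carry meaning.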
Analysis of Distribution:
Skewness: 1.76
The positive skewness indicates a right-skewed distribution, with
a long tail of higher prices.
Kurtosis: 4.41
The distribution is leptokurtic, indicating a sharper peak and
heavier tails compared to a normal distribution. This suggests
some extreme values (outliers) in the price data.
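Skewness and kurtosis are standardized third and fourth central moments. The sketch below computes the population versions on made-up prices (not the report's data); `moment_stats` is a hypothetical helper name. Note that conventions differ: this code returns non-excess kurtosis, which equals 3 for a normal distribution, whereas some tools (for example `scipy.stats.kurtosis` by default) report excess kurtosis, which is 3 lower. The report does not state which convention produced the value 4.41.

```python
# Hypothetical prices in dollars (illustrative, not the report's data).
prices = [450, 520, 600, 650, 700, 780, 850, 990, 1500, 3200]

def moment_stats(values):
    """Population skewness and non-excess kurtosis from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    return skewness, kurtosis

skewness, kurtosis = moment_stats(prices)
print(round(skewness, 2), round(kurtosis, 2))
```

A positive skewness together with a kurtosis above 3 (equivalently, positive excess kurtosis) corresponds to the right-skewed, leptokurtic shape described above.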
Conclusion:
This report examined the factors influencing laptop prices through
data analysis. The key findings include:
Bibliography:
https://www.researchgate.net/publication/261810646_Factors_Affecting_the_Notebook_Computer_Prices_in_Turkey_A_Hedonic_Analysis
https://www.researchgate.net/publication/383859999_Determinants_of_Laptop_Prices_An_Empirical_Study_Using_Regression_Analysis