Regression Analysis Final Project

The document is a group report submitted to Dr. Ayesha Iftikhar by four students at Lahore Business School analyzing data from Consumer Reports on point-and-shoot digital cameras. [1] It provides numerical summaries of the camera data and develops scatter plots comparing overall score to price, megapixels, and weight. [2] A simple linear regression finds that price best predicts overall score with the equation Ŷ = 46.66 + 0.055X. [3] Multiple linear regression including price, megapixels, and weight finds the equation Ŷ = 50.14 + 0.0556X1 - 0.3566X2 + 0.1794X3 best predicts overall score

Uploaded by

Usama Yasin

GROUP REPORT

UNIVERSITY OF LAHORE
Lahore Business School
Submitted To:
Dr. Ayesha Iftikhar

Submitted By:

M. Tayyab Ilyas (70145659)


Usama Ahmad (70145178)
Uzba Mehmood (70152108)
Saman Bibi (70091820)
Zoya Tassawar (70146558)

Subject:

Data Analytics
CASE PROBLEM: SELECTING A POINT-AND-SHOOT DIGITAL CAMERA

Consumer Reports tested 166 different point-and-shoot digital cameras. Based upon factors such as the number
of megapixels, weight (oz), image quality, and ease of use, they developed an overall score for each camera
tested. The overall score ranges from 0 to 100, with higher scores indicating better overall test results. Selecting
a camera with many options can be a difficult process, and price is certainly a key issue for most consumers. By
spending more, will a consumer really get a superior camera? And, do cameras that have more megapixels, a
factor often considered to be a good measure of picture quality, cost more than cameras with fewer megapixels?
Table 14.15 shows the brand, average retail price ($), number of megapixels, weight (oz), and the overall score
for 13 Canon and 15 Nikon subcompact cameras tested by Consumer Reports (Consumer Reports website).

Managerial Report

1. Develop numerical summaries of the data.


2. Using overall score as the dependent variable, develop three scatter diagrams, one using price as the
independent variable, one using the number of megapixels as the independent variable, and one using
weight as the independent variable. Which of the three independent variables appears to be the best
predictor of overall score?
3. Using simple linear regression, develop an estimated regression equation that could be used to predict
the overall score given the price of the camera.
4. Analyze the data using only the observations for the Canon cameras. Discuss the appropriateness of
using simple linear regression and make any recommendations regarding the prediction of overall score
using just the price of the camera.

Note: Use MS Excel for the analysis and MS Word for report writing. No marks will be awarded if only the
Excel file is submitted. This is a group project. No marks will be awarded for plagiarized or similar
content.

The project must contain a proper title page with the names of all group members, the subject, the teacher's
name, the date of submission, and the LBS logo, which is attached herewith.

Font size should be 12, in Times New Roman.

The report must be comprehensive and should include a proper interpretation of each part.
Place a table number and a figure number with each table and plot.
1. Develop numerical summaries of the data.

Statistic            Price ($)    Megapixels     Weight (oz)     Score
Mean                 175.3571     12.85714286    5.821428571     56.35714286
Standard Error       15.64725     0.347760354    0.185831261     1.26534422
Median               160          12             6               56.5
Mode                 200          12             5               66
Standard Deviation   82.79748     1.840174825    0.983326607     6.695572256
Sample Variance      6855.423     3.386243386    0.966931217     44.83068783
Kurtosis             0.663444     -0.63315       -1.190294996    -0.616238347
Skewness             1.056995     0.225731061    -0.119748909    -0.429488071
Range                320          6              3               24
Minimum              80           10             4               42
Maximum              400          16             7               66
Sum                  4910         360            163             1578
Count                28           28             28              28
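
The means and standard errors in these summaries can be cross-checked from the reported sums, standard deviations, and counts. The following is a minimal Python sketch, assuming only the figures printed in this report (the raw camera data from Table 14.15 is not reproduced here). The dictionary and function names are illustrative and are not part of the original Excel analysis.

```python
import math

# Summary values copied from the report's descriptive statistics (28 cameras).
SUMMARIES = {
    "Price ($)": {"sum": 4910, "sd": 82.79748, "count": 28},
    "Megapixels": {"sum": 360, "sd": 1.840174825, "count": 28},
    "Weight (oz)": {"sum": 163, "sd": 0.983326607, "count": 28},
    "Score": {"sum": 1578, "sd": 6.695572256, "count": 28},
}


def mean_and_standard_error(total, sd, count):
    """Return (mean, standard error of the mean) from a sum, a sample SD, and a count."""
    mean = total / count
    standard_error = sd / math.sqrt(count)
    return mean, standard_error


for name, values in SUMMARIES.items():
    mean, se = mean_and_standard_error(values["sum"], values["sd"], values["count"])
    print(f"{name}: mean = {mean:.4f}, standard error = {se:.4f}")
```

The standard error here is the sample standard deviation divided by the square root of the count, which is how Excel's Descriptive Statistics tool reports it.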

2. Using overall score as the dependent variable, develop three scatter diagrams, one using price as the
independent variable, one using the number of megapixels as the independent variable, and one using
weight as the independent variable. Which of the three independent variables appears to be the best
predictor of overall score?
Price as the independent variable

[Scatter diagram "Price": Score on the vertical axis against Price on the horizontal axis]

Megapixels as the independent variable

[Scatter diagram "Megapixels": Score on the vertical axis against Megapixels on the horizontal axis]

Weight as the independent variable

[Scatter diagram "Weight": Score on the vertical axis against Weight on the horizontal axis]

In the scatter diagram with score as the dependent variable and price as the independent variable, the points
lie closest to an upward-sloping line, which indicates a positive relationship. Of the three independent
variables, price therefore appears to be the best predictor of the overall score.

3. Choose an independent variable of your own choice. Using simple linear regression, develop an estimated
regression equation that could be used to predict the overall score of the camera. Interpret your findings
in detail.
We choose price as the independent variable.
Estimated Regression Equation:
Ŷ = b₀ + b₁X
Ŷ = 46.66 + 0.055X
Findings:
Regression Statistics
Multiple R 0.683211844
R Square 0.466778424
Adjusted R Square 0.446269901
Standard Error 4.982379069
Observations 28

Multiple R is the coefficient of correlation, which is 0.683. It is closer to one than to zero, which indicates
a moderately strong positive relationship between price and score.
R Square is the square of Multiple R and is the coefficient of determination. It shows that 46.68% of the
variation in score is explained by price.
The sample size is 28.
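
The link between Multiple R, R Square, and Adjusted R Square can be checked by hand. The sketch below is a Python illustration that uses only the values printed in the regression statistics; the function names are hypothetical, and the adjusted R Square formula is the standard one, 1 - (1 - R²)(n - 1)/(n - k - 1), assumed to be what Excel applied.

```python
def coefficient_of_determination(multiple_r):
    """R Square is the square of the correlation coefficient (Multiple R)."""
    return multiple_r ** 2


def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Adjust R Square for the number of predictors relative to the sample size."""
    degrees_of_freedom = n_observations - n_predictors - 1
    return 1 - (1 - r_squared) * (n_observations - 1) / degrees_of_freedom


# Values reported for the simple regression of score on price.
r_squared = coefficient_of_determination(0.683211844)
adjusted = adjusted_r_squared(r_squared, n_observations=28, n_predictors=1)

print(f"R Square = {r_squared:.6f}")
print(f"Adjusted R Square = {adjusted:.6f}")
```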

             Coefficients   Standard Error   t Stat        P-value
Intercept    46.66880198    2.238439352      20.84881233   9.38682E-18
Price ($)    0.055249194    0.011580778      4.770766964   6.15507E-05

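The P-value for price is 0.0000615, which is less than 0.05. This means that the relationship between price
and score is statistically significant.
The intercept b₀ is 46.6688, the predicted score when the price is zero.
The slope b₁ is 0.0552: each additional dollar of price is associated with an increase of about 0.055 points
in the predicted score.
The standard error of 4.98 is the standard error of the estimate: the typical distance, in score points,
between the observed scores and the regression line. It is not a percentage.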
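
To make the fitted line concrete, the following Python sketch applies the reported coefficients to a camera price and recomputes the t statistic as the coefficient divided by its standard error. The constant and function names, and the $200 example price, are assumptions chosen for illustration; only the coefficient values come from the report.

```python
# Coefficients and standard error taken from the report's regression output.
INTERCEPT = 46.66880198
SLOPE = 0.055249194
SLOPE_STANDARD_ERROR = 0.011580778


def predict_score(price):
    """Predicted overall score for a camera with the given retail price in dollars."""
    return INTERCEPT + SLOPE * price


def t_statistic(coefficient, standard_error):
    """t statistic for testing whether a coefficient differs from zero."""
    return coefficient / standard_error


# Hypothetical example: a $200 camera, inside the observed price range of $80 to $400.
print(f"Predicted score at $200: {predict_score(200):.2f}")
print(f"t statistic for price: {t_statistic(SLOPE, SLOPE_STANDARD_ERROR):.4f}")
```

Predictions are only reasonable for prices inside the observed range of $80 to $400; the intercept is an extrapolation to a price of zero.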
Assumptions:
The normal probability plot indicates linearity and normality.
The residual plot shows equal variance.
The line fit plot indicates independence of errors.

4. Repeat Step 3 and develop an estimated regression equation using multiple linear regression that
could be used to predict the overall score of the camera. Interpret your findings in detail.
We choose price (X₁), megapixels (X₂), and weight (X₃) as the independent variables.
Estimated Regression Equation:
Ŷ = b₀ + b₁X₁ + b₂X₂ + b₃X₃
Ŷ = 50.14 + 0.0556X₁ - 0.3566X₂ + 0.1794X₃
Findings:

Regression Statistics
Multiple R 0.691437086
R Square 0.478085244
Adjusted R Square 0.4128459
Standard Error 5.130547941
Observations 28

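Multiple R is the multiple correlation coefficient, which is 0.69. It is closer to one than to
zero, which indicates a moderately strong positive relationship.
R Square is the square of Multiple R and is the coefficient of determination. It shows that
47.81% of the variation in score is explained by price, megapixels, and weight together.
Adjusted R Square (0.413) is lower than in the simple regression on price alone (0.446), which
suggests that megapixels and weight add little explanatory power.
The sample size is 28.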
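
The overall significance of a regression is usually judged with an F statistic, which can be computed from R Square, the number of observations, and the number of predictors. The Python sketch below is an illustration under that assumption; the report does not print the F statistic, so the computed values are a cross-check rather than quoted results, and the function name is hypothetical.

```python
def f_statistic(r_squared, n_observations, n_predictors):
    """Overall F statistic: explained variation per predictor over unexplained variation per residual degree of freedom."""
    explained = r_squared / n_predictors
    unexplained = (1 - r_squared) / (n_observations - n_predictors - 1)
    return explained / unexplained


# R Square values reported for the two models, each fitted to 28 cameras.
f_simple = f_statistic(0.466778424, n_observations=28, n_predictors=1)
f_multiple = f_statistic(0.478085244, n_observations=28, n_predictors=3)

print(f"F for score on price: {f_simple:.3f}")
print(f"F for score on price, megapixels, and weight: {f_multiple:.3f}")
```

With a single predictor, the F statistic equals the square of that predictor's t statistic, which gives a quick consistency check against the reported t Stat for price in the simple regression.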

              Coefficients    Standard Error   t Stat         P-value
Intercept     50.14687144     10.30599948      4.8657941      5.84283E-05
Price ($)     0.055607068     0.013064659      4.256296896    0.000275221
Megapixels    -0.356611331    0.56213898       -0.634382855   0.531832002
Weight (oz)   0.179367944     1.11159308       0.161361156    0.87315964

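The P-value for the overall model is 0.001184, which is less than 0.05. This means that the model as a
whole is statistically significant.
Among the individual coefficients, only price is significant (P-value 0.000275). Megapixels (0.532) and
weight (0.873) are not significant at the 0.05 level.
The intercept b₀ is 50.14.
The slope b₁ (price) is 0.0556.
The slope b₂ (megapixels) is -0.3566.
The slope b₃ (weight) is 0.1794.
The standard error of 5.13 is the standard error of the estimate: the typical distance, in score points,
between the observed scores and the fitted values. It is not a percentage.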
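
The multiple regression output can be applied in the same way as the simple model. The Python sketch below stores the reported coefficients with their P-values, predicts a score for one camera, and lists the terms that are significant at the 0.05 level. The names and the example camera ($200, 12 megapixels, 5 oz) are assumptions for illustration only.

```python
# (coefficient, P-value) pairs taken from the report's multiple regression output.
COEFFICIENTS = {
    "Intercept": (50.14687144, 5.84283e-05),
    "Price ($)": (0.055607068, 0.000275221),
    "Megapixels": (-0.356611331, 0.531832002),
    "Weight (oz)": (0.179367944, 0.87315964),
}


def predict_score(price, megapixels, weight):
    """Predicted overall score from price ($), megapixels, and weight (oz)."""
    return (
        COEFFICIENTS["Intercept"][0]
        + COEFFICIENTS["Price ($)"][0] * price
        + COEFFICIENTS["Megapixels"][0] * megapixels
        + COEFFICIENTS["Weight (oz)"][0] * weight
    )


def significant_terms(alpha=0.05):
    """Names of the terms whose P-value is below the chosen significance level."""
    return [name for name, (_, p_value) in COEFFICIENTS.items() if p_value < alpha]


# Hypothetical camera: $200, 12 megapixels, 5 oz.
print(f"Predicted score: {predict_score(200, 12, 5):.2f}")
print("Significant at 0.05:", significant_terms())
```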
Assumptions:
The normal probability plot indicates linearity and normality.
The residual plot shows equal variance.
The line fit plot indicates independence of errors.
