Simple Linear Regression Analysis

This document discusses simple linear regression analysis and related statistical tests. It provides the formulas for calculating the slope and intercept of a regression line, as well as the formulas for hypothesis tests of the regression coefficients. It also discusses calculating the correlation coefficient and provides an example of computing the correlation between gestational age and birth weight using sample data. Confidence intervals for means are also discussed for both known and unknown variances, including the formulas for large and small sample sizes.

Uploaded by

Engr.bilal

Simple Linear Regression Analysis:

A linear regression model attempts to explain the relationship between two or more variables
using a straight line. Consider the data obtained from a chemical process where the yield of the
process is thought to be related to the reaction temperature (see the table below).

Regression Formula:
Regression Equation (y) = a + bx

Slope (b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)

Intercept (a) = (ΣY - b(ΣX)) / N

Where,
x and y are the variables.

b = the slope of the regression line

a = the intercept, the point where the regression line crosses the y axis

N = Number of values or elements

X = First Score

Y = Second Score

ΣXY = Sum of the products of the First and Second Scores

ΣX = Sum of First Scores

ΣY = Sum of Second Scores

ΣX² = Sum of the squared First Scores

Regression Example:

To find the simple linear regression of the following data:

X Values    Y Values
60          3.1
61          3.6
62          3.8
63          4.0
65          4.1

To find the regression equation, we first find the slope and intercept, then use them to form the equation.
Step 1:
Count the number of values. N = 5
Step 2:
Find XY and X² for each pair of values. See the table below.

X Value    Y Value    XY                   X²
60         3.1        60 * 3.1 = 186       60 * 60 = 3600
61         3.6        61 * 3.6 = 219.6     61 * 61 = 3721
62         3.8        62 * 3.8 = 235.6     62 * 62 = 3844
63         4.0        63 * 4.0 = 252       63 * 63 = 3969
65         4.1        65 * 4.1 = 266.5     65 * 65 = 4225

Step 3:
Find ΣX, ΣY, ΣXY, ΣX².
ΣX = 311    ΣY = 18.6    ΣXY = 1159.7    ΣX² = 19359

Step 4:
Substitute into the slope formula given above.
Slope (b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)
= ((5)*(1159.7) - (311)*(18.6)) / ((5)*(19359) - (311)²)
= (5798.5 - 5784.6) / (96795 - 96721)
= 13.9/74 = 0.19 (rounded from 0.1878)
Step 5:
Now substitute into the intercept formula given above.

Intercept (a) = (ΣY - b(ΣX)) / N

= (18.6 - 0.19(311))/5
= (18.6 - 59.09)/5

= -40.49/5

= -8.098

Note that the intercept is sensitive to the rounding of b: with the unrounded slope 13.9/74 ≈ 0.18784, the intercept is about -7.964.
Step 6:
Then substitute these values into the regression equation formula:

Regression Equation (y) = a + bx

= -8.098 + 0.19x

Suppose we want to know the approximate y value for x = 64. We substitute that value into the above equation:

Regression Equation (y) = a + bx

= -8.098 + 0.19(64)

= -8.098 + 12.16

= 4.06
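
The six steps above can be checked with a short script. This is a minimal Python sketch, not part of the original notes; the function name fit_line is a hypothetical helper, and the script keeps the slope unrounded, so its intercept differs slightly from the hand calculation.

```python
# Least-squares slope and intercept from the summation formulas,
# using the five (X, Y) pairs of the worked example.
xs = [60, 61, 62, 63, 65]
ys = [3.1, 3.6, 3.8, 4.0, 4.1]


def fit_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope and intercept exactly as in the formulas for b and a.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


a, b = fit_line(xs, ys)
y_at_64 = a + b * 64  # predicted y for x = 64
print(f"slope b = {b:.4f}, intercept a = {a:.4f}, y(64) = {y_at_64:.2f}")
```

Because the slope is not rounded to 0.19 before the intercept is computed, the script should report an intercept near -7.96 rather than -8.098, while the prediction at x = 64 still rounds to 4.06.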

Hypothesis Tests in Simple Linear Regression

The following sections discuss hypothesis tests on the regression coefficients in simple linear
regression. These tests can be carried out if it can be assumed that the random error term, ε, is
normally and independently distributed with a mean of zero and variance of σ².

t Tests

The t tests are used to conduct hypothesis tests on the regression coefficients obtained in simple
linear regression. A statistic based on the t distribution is used to test the two-sided hypothesis
that the true slope, β1, equals some constant value, β1,0. The statements for the hypothesis test
are expressed as:

H0: β1 = β1,0
H1: β1 ≠ β1,0

The test statistic used for this test is:

$$T_0 = \frac{b - \beta_{1,0}}{se(b)}$$

where b is the least squares estimate of β1, and se(b) is its standard error. The value of se(b) can
be calculated as follows:

$$se(b) = \sqrt{\frac{\sum_{i=1}^{n} e_i^2 / (n-2)}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$$

where e_i = y_i - (a + b x_i) is the i-th residual. The test statistic, T0, follows a t distribution with (n - 2) degrees of freedom, where n is the total
number of observations.
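
As an illustration of the t test on the slope, the following Python sketch computes the estimate, its standard error, and the test statistic for H0: slope = 0. The temperature and yield data are not reproduced in these notes, so the sketch reuses the five-point data from the regression example; that choice, and the function name slope_t_statistic, are assumptions made only for illustration.

```python
import math

# Data from the five-point regression example (assumed here for illustration).
xs = [60, 61, 62, 63, 65]
ys = [3.1, 3.6, 3.8, 4.0, 4.1]


def slope_t_statistic(xs, ys, slope_h0=0.0):
    """Return (b, se_b, t0) for testing H0: true slope == slope_h0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                  # least-squares slope
    a = y_bar - b * x_bar          # least-squares intercept
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    mse = sse / (n - 2)            # error mean square on n - 2 degrees of freedom
    se_b = math.sqrt(mse / sxx)    # standard error of the slope
    t0 = (b - slope_h0) / se_b
    return b, se_b, t0


b, se_b, t0 = slope_t_statistic(xs, ys)
print(f"b = {b:.4f}, se(b) = {se_b:.4f}, t0 = {t0:.3f}, df = {len(xs) - 2}")
```

The resulting statistic is compared with a t-table value for n - 2 degrees of freedom (3 for this small data set), or converted to a p value with a statistics library.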

Example

The test for the significance of regression for the data in the preceding table is illustrated in this
example. The test is carried out using the t test on the coefficient β1. The hypothesis to be
tested is H0: β1 = 0. To calculate the statistic to test H0, the estimate, b, and the standard
error, se(b), are needed. The value of b was obtained in this section. The standard error is
calculated from the residuals as described for se(b) above.

The test statistic T0 is then the estimate divided by its standard error, since the hypothesized slope is zero.

The p value corresponding to this statistic is obtained from the t distribution with 23 (n - 2 = 25 - 2 = 23) degrees
of freedom.
Assuming that the desired significance level is 0.1, since the p value < 0.1, H0: β1 = 0 is rejected,
indicating that a relation exists between temperature and yield for the data in the preceding table.

Correlation Analysis

In correlation analysis, we estimate a sample correlation coefficient, more specifically
the Pearson Product Moment correlation coefficient. The sample correlation coefficient,
denoted r, ranges between -1 and +1 and quantifies the direction and strength of the linear
association between the two variables. The correlation between two variables can be positive
(i.e., higher levels of one variable are associated with higher levels of the other) or negative (i.e.,
higher levels of one variable are associated with lower levels of the other).

The sign of the correlation coefficient indicates the direction of the association. The magnitude
of the correlation coefficient indicates the strength of the association.

For example, a correlation of r = 0.9 suggests a strong, positive association between two
variables, whereas a correlation of r = -0.2 suggests a weak, negative association. A correlation
close to zero suggests no linear association between two continuous variables.

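The formula for the sample correlation coefficient is

$$r = \frac{\mathrm{Cov}(x, y)}{\sqrt{s_x^2 \, s_y^2}}$$

where Cov(x,y) is the covariance of x and y, defined as

$$\mathrm{Cov}(x, y) = \frac{\sum (x - \bar{x})(y - \bar{y})}{n - 1}$$

and s_x² and s_y² are the sample variances of x and y, defined as

$$s_x^2 = \frac{\sum (x - \bar{x})^2}{n - 1}, \qquad s_y^2 = \frac{\sum (y - \bar{y})^2}{n - 1}$$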
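
The sample correlation coefficient can be computed directly from the covariance and the two sample variances. The Python sketch below is illustrative only: sample_correlation is a hypothetical helper name, and because the gestational age and birth weight observations are not listed in these notes, it is run on the five pairs from the regression example instead.

```python
import math


def sample_correlation(xs, ys):
    """Pearson r = Cov(x, y) / sqrt(var(x) * var(y)), using n - 1 denominators."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - y_bar) ** 2 for y in ys) / (n - 1)
    return cov_xy / math.sqrt(var_x * var_y)


# Illustrative data only: the five pairs from the regression example.
xs = [60, 61, 62, 63, 65]
ys = [3.1, 3.6, 3.8, 4.0, 4.1]
r = sample_correlation(xs, ys)
print(f"r = {r:.3f}")
```

A perfectly increasing linear relationship gives r = 1 and a perfectly decreasing one gives r = -1, which is a quick way to sanity-check the function.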

Example - Correlation of Gestational Age and Birth Weight

A small study is conducted involving 17 infants to investigate the association between
gestational age at birth, measured in weeks, and birth weight, measured in grams.
The variance of gestational age is:

Next, we summarize the birth weight data. The mean birth weight is:

The variance of birth weight is computed just as we did for gestational age as shown in the table
below.
The variance of birth weight is:

Next we compute the covariance,

To compute the covariance of gestational age and birth weight, we need to multiply the
deviation from the mean gestational age by the deviation from the mean birth weight for each
participant, i.e., (x - x̄)(y - ȳ).

The computations are summarized below. Notice that we simply copy the deviations from the
mean gestational age and birth weight from the two tables above into the table below and
multiply.
The covariance of gestational age and birth weight is:

We now compute the sample correlation coefficient:

Not surprisingly, the sample correlation coefficient indicates a strong positive correlation.

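CONFIDENCE INTERVALS FOR THE MEAN, UNKNOWN VARIANCE:

            Sigma Known              Sigma Unknown

n ≥ 30      x̄ ± z(α/2) · σ/√n        x̄ ± z(α/2) · s/√n

n < 30      x̄ ± z(α/2) · σ/√n        x̄ ± t(α/2) · s/√n

Example:

n = 8, x̄ = 0.2, s = 0.07, α = 0.05, df = n − 1 = 7. From Table 6, t α/2 = t0.025 = 2.365. Therefore, the confidence
interval is

Solution:

$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} = 0.2 \pm 2.365 \cdot \frac{0.07}{\sqrt{8}} = 0.2 \pm 0.059$$

(0.141, 0.259)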
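
The small-sample interval above can be reproduced with a few lines of Python. This is a sketch under the example's own numbers; the function name t_confidence_interval is hypothetical, and the critical value 2.365 is passed in from the t-table rather than computed.

```python
import math


def t_confidence_interval(mean, s, n, t_crit):
    """Return (lower, upper) for mean ± t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin


# Values from the example; t_crit = 2.365 is the table value for df = 7, alpha/2 = 0.025.
lower, upper = t_confidence_interval(mean=0.2, s=0.07, n=8, t_crit=2.365)
print(f"({lower:.3f}, {upper:.3f})")
```

Passing the critical value as a parameter keeps the sketch free of external libraries; the same function works with a z value when the large-sample formula applies.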

Confidence Interval for Two Independent Samples

If n1 > 30 and n2 > 30                          If n1 < 30 or n2 < 30

Use Z table for standard normal distribution    Use t-table with df = n1 + n2 - 2

Large Sample Example

The table below summarizes data for n=3,539 participants attending the 7th examination of the
Offspring cohort in the Framingham Heart Study.

                                    Men                       Women
Characteristic              n       x̄       s         n       x̄       s
Systolic Blood Pressure     1,623   128.2   17.5      1,911   126.5   20.1
Diastolic Blood Pressure    1,622   75.6    9.8       1,910   72.6    9.7
Total Serum Cholesterol     1,544   192.4   35.2      1,766   207.1   36.7
Weight                      1,612   194.0   33.8      1,894   157.7   34.6
Height                      1,545   68.9    2.7       1,781   63.4    2.5
Body Mass Index             1,545   28.8    4.6       1,781   27.6    5.9

Small Sample Example

We previously considered a subsample of n=10 participants attending the 7th examination of
the Offspring cohort in the Framingham Heart Study. The following table contains descriptive
statistics on the same continuous characteristics in the subsample stratified by sex.

                                  Men                            Women
Characteristic              n     Sample Mean   s          n     Sample Mean   s
Systolic Blood Pressure     6     117.5         9.7        4     126.8         12.0
Diastolic Blood Pressure    6     72.5          7.1        4     69.5          8.1
Total Serum Cholesterol     6     193.8         30.2       4     215.0         48.8
Weight                      6     196.9         26.9       4     146.0         7.2
Height                      6     70.2          1.0        4     62.6          2.3
Body Mass Index             6     28.0          3.6        4     26.2          2.0

Confidence Interval for One Sample

Example: During the 7th examination of the Offspring cohort in the Framingham Heart Study
there were 1,219 participants being treated for hypertension and 2,313 who were not on
treatment. If we call treatment a "success", then x=1,219 and n=3,532. The sample proportion
is
p̂ = x/n = 1,219/3,532 = 0.345
This is the point estimate, i.e., our best estimate of the proportion of the population on treatment
for hypertension is 34.5%. The sample is large, so the confidence interval can be computed
using the formula:

$$\hat{p} \pm z \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
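
A large-sample interval for this proportion can be sketched in Python as follows. The notes do not state a confidence level for this example, so z = 1.96 (a 95% interval) is an assumption, as is the function name proportion_confidence_interval.

```python
import math


def proportion_confidence_interval(successes, n, z=1.96):
    """Large-sample interval: p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# 1,219 participants on treatment out of 3,532; z = 1.96 assumes a 95% level.
p_hat, lower, upper = proportion_confidence_interval(1219, 3532)
print(f"p_hat = {p_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Under the 95% assumption, the interval comes out at roughly 33% to 36% of the population on treatment.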

Confidence Interval for μ1 − μ2

Confidence interval

$$(\bar{x}_1 - \bar{x}_2) \pm t^*_{df} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where t*df is the value from the t-table
that corresponds to the confidence level, and

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$$

Question:

home: x̄1 = 68.25, s1 = 21.8, n1 = 8

road: x̄2 = 68.63, s2 = 8.9, n2 = 8

Calculate a 95% CI for μ1 − μ2 where

μ1 = mean points per game allowed by Duke at home.
μ2 = mean points per game allowed by Duke on the road.

• n1 = 8, n2 = 8; s1² = (21.8)² = 475.24; s2² = (8.9)² = 79.21

$$df = \frac{\left(\frac{475.24}{8} + \frac{79.21}{8}\right)^2}{\frac{1}{7}\left(\frac{475.24}{8}\right)^2 + \frac{1}{7}\left(\frac{79.21}{8}\right)^2} \approx 9.27$$
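
The degrees-of-freedom calculation for the two-sample interval is easy to get wrong by hand, so a short Python check may help. This is a sketch only; welch_degrees_of_freedom is a hypothetical helper name, and it takes the standard deviations and sample sizes from the question.

```python
def welch_degrees_of_freedom(s1, n1, s2, n2):
    """Approximate df for the two-sample t interval with unequal variances."""
    v1 = s1 ** 2 / n1  # s1^2 / n1
    v2 = s2 ** 2 / n2  # s2^2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Home and road standard deviations and sample sizes from the question.
df = welch_degrees_of_freedom(s1=21.8, n1=8, s2=8.9, n2=8)
print(f"df = {df:.2f}")
```

The result is generally not a whole number; when reading a printed t-table, a common conservative choice is to round it down to the nearest integer.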

Chi-Square Test for Independence:

This lesson explains how to conduct a chi-square test for independence. The test is applied
when you have two categorical variables from a single population. It is used to determine
whether there is a significant association between the two variables.

When to Use Chi-Square Test for Independence

The test procedure described in this lesson is appropriate when the following conditions are met:

• The sampling method is simple random sampling.
• The variables under study are each categorical.
• If sample data are displayed in a contingency table, the expected frequency count for each cell of
the table is at least 5.

State the Hypotheses

Suppose that Variable A has r levels, and Variable B has c levels. The null hypothesis states that
knowing the level of Variable A does not help you predict the level of Variable B. That is, the
variables are independent.

H0: Variable A and Variable B are independent.

Ha: Variable A and Variable B are not independent.

The alternative hypothesis is that knowing the level of Variable A can help you predict the level
of Variable B.

Formula:

Χ² = Σ [ (Or,c - Er,c)² / Er,c ]

Problem

A public opinion poll surveyed a simple random sample of 1000 voters. Respondents were
classified by gender (male or female) and by voting preference (Republican, Democrat, or
Independent). Results are shown in the contingency table below.

                        Voting Preferences
                Republican   Democrat   Independent   Row total
Male            200          150        50            400
Female          250          300        50            600
Column total    450          450        100           1000

Is there a gender gap? Do the men's voting preferences differ significantly from the women's
preferences? Use a 0.05 level of significance.

Solution

The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis
plan, (3) analyze sample data, and (4) interpret results. We work through those steps below:

• State the hypotheses. The first step is to state the null hypothesis and an alternative
hypothesis.

H0: Gender and voting preferences are independent.

Ha: Gender and voting preferences are not independent.

• Formulate an analysis plan. For this analysis, the significance level is 0.05. Using
sample data, we will conduct a chi-square test for independence.

• Analyze sample data. Applying the chi-square test for independence to sample data,
we compute the degrees of freedom, the expected frequency counts, and the chi-square
test statistic. Based on the chi-square statistic and the degrees of freedom, we
determine the P-value.

DF = (r - 1) * (c - 1) = (2 - 1) * (3 - 1) = 2

Er,c = (nr * nc) / n
E1,1 = (400 * 450) / 1000 = 180000/1000 = 180
E1,2 = (400 * 450) / 1000 = 180000/1000 = 180
E1,3 = (400 * 100) / 1000 = 40000/1000 = 40
E2,1 = (600 * 450) / 1000 = 270000/1000 = 270
E2,2 = (600 * 450) / 1000 = 270000/1000 = 270
E2,3 = (600 * 100) / 1000 = 60000/1000 = 60

Χ² = Σ [ (Or,c - Er,c)² / Er,c ]
Χ² = (200 - 180)²/180 + (150 - 180)²/180 + (50 - 40)²/40
   + (250 - 270)²/270 + (300 - 270)²/270 + (50 - 60)²/60
Χ² = 400/180 + 900/180 + 100/40 + 400/270 + 900/270 + 100/60
Χ² = 2.22 + 5.00 + 2.50 + 1.48 + 3.33 + 1.67 = 16.2

where DF is the degrees of freedom, r is the number of levels of gender, c is the number
of levels of the voting preference, nr is the number of observations from level r of gender,
nc is the number of observations from level c of voting preference, n is the number of
observations in the sample, Er,c is the expected frequency count when gender is
level r and voting preference is level c, and Or,c is the observed frequency count when
gender is level r and voting preference is level c.

The P-value is the probability that a chi-square statistic having 2 degrees of freedom is
more extreme than 16.2.

We use the Chi-Square Distribution Calculator to find P(Χ² > 16.2) = 0.0003.

• Interpret results. Since the P-value (0.0003) is less than the significance level (0.05), we
reject the null hypothesis. Thus, we conclude that there is a relationship between
gender and voting preference.
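
The expected counts and the chi-square statistic for the voting table can be verified with a short Python sketch. The function name chi_square_independence is hypothetical. The P-value line relies on the fact that, for exactly 2 degrees of freedom, the upper tail of the chi-square distribution equals exp(-x/2); for any other number of degrees of freedom a statistics library or table would be needed.

```python
import math


def chi_square_independence(observed):
    """Return (statistic, degrees of freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count E(r,c)
            statistic += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


table = [[200, 150, 50],   # male: Republican, Democrat, Independent
         [250, 300, 50]]   # female
chi2, df = chi_square_independence(table)
# For df = 2 only, the chi-square upper tail has the closed form exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.4f}")
```

Computing the expected counts from the row and column totals inside the loop mirrors the Er,c = (nr * nc) / n step of the hand calculation.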

Chi-Square Goodness of Fit Test:

This lesson explains how to conduct a chi-square goodness of fit test. The test is applied when
you have one categorical variable from a single population. It is used to determine whether
sample data are consistent with a hypothesized distribution.

When to Use the Chi-Square Goodness of Fit Test

The chi-square goodness of fit test is appropriate when the following conditions are met:

• The sampling method is simple random sampling.
• The variable under study is categorical.
• The expected value of the number of sample observations in each level of the variable is
at least 5.

State the Hypotheses

Every hypothesis test requires the analyst to state a null hypothesis (H0) and an alternative
hypothesis (Ha). The hypotheses are stated in such a way that they are mutually exclusive. That
is, if one is true, the other must be false; and vice versa.

For a chi-square goodness of fit test, the hypotheses take the following form.

H0: The data are consistent with a specified distribution.

Ha: The data are not consistent with a specified distribution.

Analyze Sample Data

Using sample data, find the degrees of freedom, expected frequency counts, test statistic, and the
P-value associated with the test statistic.

• Degrees of freedom. The degrees of freedom (DF) is equal to the number of levels (k) of the
categorical variable minus 1: DF = k - 1.

• Expected frequency counts. The expected frequency counts at each level of the categorical
variable are equal to the sample size times the hypothesized proportion from the null hypothesis:

Ei = npi

where Ei is the expected frequency count for the ith level of the categorical variable, n is the total
sample size, and pi is the hypothesized proportion of observations in level i.

• Test statistic. The test statistic is a chi-square random variable (Χ²) defined by the following
equation.

Χ² = Σ [ (Oi - Ei)² / Ei ]

where Oi is the observed frequency count for the ith level of the categorical variable, and Ei is the
expected frequency count for the ith level of the categorical variable.

• P-value. The P-value is the probability of observing a sample statistic as extreme as the test
statistic. Since the test statistic is a chi-square, use the Chi-Square Distribution Calculator to
assess the probability associated with the test statistic. Use the degrees of freedom computed
above.

Problem

• Acme Toy Company prints baseball cards. The company claims that 30% of the cards are rookies,
60% are veterans, and 10% are All-Stars.
• Suppose a random sample of 100 cards has 50 rookies, 45 veterans, and 5 All-Stars. Is this
consistent with Acme's claim? Use a 0.05 level of significance.

Solution

The solution to this problem takes four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3)
analyze sample data, and (4) interpret results. We work through those steps below:

• State the hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.

Null hypothesis: The proportions of rookies, veterans, and All-Stars are 30%, 60% and 10%,
respectively.

Alternative hypothesis: At least one of the proportions in the null hypothesis is false.

• Formulate an analysis plan. For this analysis, the significance level is 0.05. Using sample data, we
will conduct a chi-square goodness of fit test of the null hypothesis.

• Analyze sample data. Applying the chi-square goodness of fit test to sample data, we compute
the degrees of freedom, the expected frequency counts, and the chi-square test statistic. Based
on the chi-square statistic and the degrees of freedom, we determine the P-value.

DF = k - 1 = 3 - 1 = 2

Ei = n * pi
E1 = 100 * 0.30 = 30
E2 = 100 * 0.60 = 60
E3 = 100 * 0.10 = 10

Χ² = Σ [ (Oi - Ei)² / Ei ]
Χ² = [ (50 - 30)² / 30 ] + [ (45 - 60)² / 60 ] + [ (5 - 10)² / 10 ]
Χ² = (400 / 30) + (225 / 60) + (25 / 10) = 13.33 + 3.75 + 2.50 = 19.58

where DF is the degrees of freedom, k is the number of levels of the categorical variable, n is the
number of observations in the sample, Ei is the expected frequency count for level i, Oi is the
observed frequency count for level i, and Χ² is the chi-square test statistic.

The P-value is the probability that a chi-square statistic having 2 degrees of freedom is more
extreme than 19.58.

We use the Chi-Square Distribution Calculator to find P(Χ² > 19.58) = 0.0001.

• Interpret results. Since the P-value (0.0001) is less than the significance level (0.05), we
reject the null hypothesis. The sample is not consistent with Acme's claim.
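
The goodness of fit calculation for the baseball card problem can be sketched in Python as below. The function name chi_square_goodness_of_fit is hypothetical, and the P-value again uses the closed form exp(-x/2), which holds only because this problem has exactly 2 degrees of freedom.

```python
import math


def chi_square_goodness_of_fit(observed, proportions):
    """Return (statistic, df) comparing observed counts with hypothesized proportions."""
    n = sum(observed)
    expected = [n * p for p in proportions]  # Ei = n * pi
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Observed rookies, veterans, All-Stars against the claimed 30% / 60% / 10% split.
chi2, df = chi_square_goodness_of_fit([50, 45, 5], [0.30, 0.60, 0.10])
p_value = math.exp(-chi2 / 2)  # closed form valid for df = 2 only
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.5f}")
```

The computed tail probability is a little below 0.0001, which matches the rounded value reported by the calculator.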
