
Lec 03 - Linear Regression (Optimized)

Linear regression is a statistical technique for modeling the relationship between variables that are related non-deterministically. It uses a straight-line equation to predict an output variable from an input variable. The method is used to analyze the relationship between two or more variables so that the value of the output variable can be predicted from the values of the input variables.

Uploaded by

Asal Review

Linear Regression

Dr. Rusdha Muharar

Department of Electrical and Computer Engineering
Universitas Syiah Kuala
Functions: the deterministic case

§ Graph of the velocity of a cat walking in a straight line (Figure E2.30: $v_x$ in cm/s versus $t$ in s).

• Relationship between two variables
  Ø Time ($t$)
  Ø Velocity ($v_x$)

$v_x = -\frac{4}{3} t + 8$

Check:
• $t = 0 \Rightarrow v_x = 8$
• $t = 6 \Rightarrow v_x = 0$
• $t = 2 \Rightarrow v_x = ?$

The relationship between $t$ and $v_x$ is deterministic.
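To make "deterministic" concrete, here is a minimal Python sketch. It assumes the graph is the straight line through the two check points on the slide, (t, v_x) = (0, 8) and (6, 0); the function name `cat_velocity` is only illustrative. The same input always gives the same output, with no random term.

```python
def cat_velocity(t):
    """Velocity in cm/s at time t in s, assuming the straight line
    through the slide's check points (0, 8) and (6, 0)."""
    slope = (0.0 - 8.0) / (6.0 - 0.0)  # change in velocity per second
    return 8.0 + slope * t


# A deterministic relationship: each t maps to exactly one v_x.
for t in (0, 2, 6):
    print(t, round(cat_velocity(t), 3))
```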
What about this case?

FIGURE 2.1. The Advertising data set. The plot displays sales, in thousands of units, as a function of TV, radio, and newspaper budgets, in thousands of dollars, for 200 different markets. In each plot we show the simple least squares fit of sales to that variable, as described in Chapter 3. In other words, each blue line represents a simple model that can be used to predict sales using TV, radio, and newspaper, respectively.

• The dark blue line represents a model (a line equation) that relates the advertising budget to sales.
• Without a model: what are the sales (in thousands) when the TV advertising budget is 400 million?
• For which medium (TV, Radio, Newspaper) are sales most affected by advertising?
What about this case? (2)

FIGURE 2.2. The Income data set. Left: The red dots are the observed values of income (in tens of thousands of dollars) and years of education for 30 individuals. Right: The blue curve represents the true underlying relationship between income and years of education, which is generally unknown (but is known in this case because the data were simulated). The black lines represent the error associated with each observation. Note that some errors are positive (if an observation lies above the blue curve) and some are negative (if an observation lies below the curve).

§ Which model is better: linear or nonlinear?
Regression Analysis

[Figure 2.1: the Advertising data set, Sales versus TV, Radio, and Newspaper budgets.]

DEFINITION

Regression analysis: a statistical tool used to model the relationship between variables that are related non-deterministically.

More generally, suppose that we observe a quantitative response $Y$ and $p$ different predictors, $X_1, X_2, \ldots, X_p$. We assume that there is some relationship between $Y$ and $X = (X_1, X_2, \ldots, X_p)$, which can be written

$Y = f(X) + \epsilon$

• $Y$: output variable
• $X$: input variable
• $\epsilon$: error (random)
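The role of the random error term can be illustrated with a small simulation. This is only a sketch: the systematic part `f` (an arbitrary straight line) and the normally distributed error are assumptions made for illustration and do not come from the slides. Repeated observations at the same input give different outputs that scatter around f(x).

```python
import random


def f(x):
    """Hypothetical systematic part f(X); the line 2 + 0.5 x is arbitrary."""
    return 2.0 + 0.5 * x


def observe(x, rng, sigma=1.0):
    """One noisy observation Y = f(x) + eps, with eps drawn from a normal
    distribution with mean 0 and standard deviation sigma (an assumption)."""
    return f(x) + rng.gauss(0.0, sigma)


rng = random.Random(0)  # fixed seed so the run is repeatable
x = 10.0
samples = [observe(x, rng) for _ in range(5)]
print("f(x) =", f(x))
print("observed Y values:", [round(y, 2) for y in samples])
```

Unlike the deterministic case, knowing x does not pin down Y exactly; it only pins down the value around which Y varies.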
Regression Analysis

[Figure 2.1: the Advertising data set, Sales versus TV, Radio, and Newspaper budgets.]

$Y = f(X) + \epsilon$

• $Y$: Sales
• $X = (X_1, X_2, X_3)$
  Ø $X_1$: TV advertising cost
  Ø $X_2$: Radio advertising cost
  Ø $X_3$: Newspaper advertising cost
• $\epsilon$: error (random)
Empirical Model

[Figure 2.1: the Advertising data set, Sales versus TV, Radio, and Newspaper budgets.]

$Y = f(X) + \epsilon$

$Y$ is called the:
• Output variable
• Response variable
• Dependent variable

$X = (X_1, X_2, X_3)$ are called the:
• Predictors
• Features
• Independent variables
Simple Linear Regression

[Figure 2.2: the Income data set, Income versus Years of Education.]

Simple?
§ Number of response variables = 1
§ Number of features/predictors = 1

Linear?
$f(X) = \beta_0 + \beta_1 X$ (a linear equation / straight line)

$Y = \beta_0 + \beta_1 X + \epsilon$
Simple Linear Regression (2)

[Figure 2.2: the Income data set, Income versus Years of Education.]

Simple?
§ Number of response variables = 1
§ Number of features/predictors = 1

Linear?
$f(X) = \beta_0 + \beta_1 X$
• $\beta_0$: intercept
• $\beta_1$: slope
Simple Linear Regression (3)

[Figure: Points corresponding to observations from the simple linear regression model. The points $(x_1, y_1)$ and $(x_2, y_2)$ deviate from the true regression line $y = \beta_0 + \beta_1 x$ by $\epsilon_1$ and $\epsilon_2$.]

• Model
$Y = \beta_0 + \beta_1 X + \epsilon$, where $\epsilon$ is a random variable with mean 0 and variance $\sigma^2$.

• Observations
$y_1 = \beta_0 + \beta_1 x_1 + \epsilon_1$
$y_2 = \beta_0 + \beta_1 x_2 + \epsilon_2$
$\vdots$
$y_n = \beta_0 + \beta_1 x_n + \epsilon_n$

• Prediction / estimation model
$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$
Simple Linear Regression (4)

[Figure 2.2: the Income data set, with the observations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ and one residual marked.]

• Prediction / estimation model
$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$

• Observations
$y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + e_i, \quad i = 1, 2, \ldots, n$

• Residual
$e_i = y_i - \hat{y}_i$, where $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$

• Residual sum of squares (RSS)
$\mathrm{RSS} = e_1^2 + e_2^2 + \cdots + e_n^2 = \sum_{i=1}^{n} e_i^2$

The hat symbol, ˆ, denotes the estimated value of an unknown parameter or coefficient, or the predicted value of the response.
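The residual and RSS definitions translate directly into code. The sketch below uses hypothetical toy data and two hypothetical candidate lines; the helper names `predict`, `residuals`, and `rss` are illustrative only. It shows that RSS gives a single number for comparing how well different lines fit the same observations.

```python
def predict(b0, b1, x):
    """Fitted value y_hat = b0 + b1 * x."""
    return b0 + b1 * x


def residuals(b0, b1, xs, ys):
    """Residuals e_i = y_i - y_hat_i for every observation."""
    return [y - predict(b0, b1, x) for x, y in zip(xs, ys)]


def rss(b0, b1, xs, ys):
    """Residual sum of squares: the sum of e_i squared."""
    return sum(e * e for e in residuals(b0, b1, xs, ys))


# Hypothetical toy data and two candidate lines.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
print("RSS for y = 2x:       ", round(rss(0.0, 2.0, xs, ys), 4))
print("RSS for y = 1 + 1.5x: ", round(rss(1.0, 1.5, xs, ys), 4))
```

The line with the smaller RSS lies closer to the data in the least squares sense.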
Simple Linear Regression (5)

The problem in simple linear regression:

§ Given the observations
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$
and the prediction model
$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$,
determine $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize
$\mathrm{RSS} = e_1^2 + e_2^2 + \cdots + e_n^2$

Least-Squares Estimation

[Figure 2.2: the Income data set, with the observations $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ marked.]
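One standard way to carry out this minimization, given here as a supplementary derivation sketch rather than something taken from the slides, is to set the partial derivatives of RSS with respect to $\hat{\beta}_0$ and $\hat{\beta}_1$ to zero:

```latex
\begin{align*}
\mathrm{RSS}(\hat{\beta}_0, \hat{\beta}_1)
  &= \sum_{i=1}^{n} \bigl(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr)^2 \\[4pt]
\frac{\partial\,\mathrm{RSS}}{\partial \hat{\beta}_0}
  &= -2 \sum_{i=1}^{n} \bigl(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr) = 0
  \;\Longrightarrow\; \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \\[4pt]
\frac{\partial\,\mathrm{RSS}}{\partial \hat{\beta}_1}
  &= -2 \sum_{i=1}^{n} x_i \bigl(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr) = 0 \\[4pt]
\text{substituting } \hat{\beta}_0:\quad
  \sum_{i=1}^{n} x_i (y_i - \bar{y})
  &= \hat{\beta}_1 \sum_{i=1}^{n} x_i (x_i - \bar{x}) \\[4pt]
\hat{\beta}_1
  &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
          {\sum_{i=1}^{n} (x_i - \bar{x})^2}
\end{align*}
```

The last step uses $\sum_{i}(y_i - \bar{y}) = 0$ and $\sum_{i}(x_i - \bar{x}) = 0$, so replacing $x_i$ by $x_i - \bar{x}$ in either sum leaves its value unchanged.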
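Simple Linear Regression (6)

RESULT

Least squares estimates: the values of $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize RSS are

$\hat{\beta}_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \dfrac{S_{xy}}{S_{xx}}$

$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$

where

$\bar{y} = \dfrac{1}{n} \sum_{i=1}^{n} y_i$ and $\bar{x} = \dfrac{1}{n} \sum_{i=1}^{n} x_i$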
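These estimates can be computed in a few lines. The following is a minimal pure-Python sketch; the function name `least_squares` and the toy data (chosen to lie exactly on y = 1 + 2x) are hypothetical and only serve to check the formulas.

```python
def least_squares(xs, ys):
    """Return (b0, b1): intercept and slope minimizing RSS,
    using b1 = S_xy / S_xx and b0 = y_bar - b1 * x_bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data lying exactly on y = 1 + 2x, so the fit should recover it.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
b0, b1 = least_squares(xs, ys)
print("intercept:", b0, "slope:", b1)
```

Note that the formula requires at least two distinct x values; otherwise S_xx is zero and the slope is undefined.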
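Simple Linear Regression (7)

RESULT

Alternative calculation:

$\hat{\beta}_1 = \dfrac{S_{xy}}{S_{xx}}$

where

$S_{xy} = \sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x}) = \sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$

$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$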
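The equivalence of the centered and shortcut forms can be checked numerically. The sketch below uses hypothetical data and illustrative function names; it computes each quantity both ways and compares the results.

```python
import math


def s_xy_centered(xs, ys):
    """S_xy as the sum of (x_i - x_bar)(y_i - y_bar)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))


def s_xy_shortcut(xs, ys):
    """S_xy as sum(x_i y_i) - (sum x_i)(sum y_i) / n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n


def s_xx_centered(xs):
    """S_xx as the sum of (x_i - x_bar)^2."""
    x_bar = sum(xs) / len(xs)
    return sum((x - x_bar) ** 2 for x in xs)


def s_xx_shortcut(xs):
    """S_xx as sum(x_i^2) - (sum x_i)^2 / n."""
    return sum(x * x for x in xs) - sum(xs) ** 2 / len(xs)


# Hypothetical data for the comparison.
xs = [1.0, 2.0, 4.0, 7.0]
ys = [3.0, 5.0, 4.0, 10.0]
print(s_xy_centered(xs, ys), s_xy_shortcut(xs, ys))
print(s_xx_centered(xs), s_xx_shortcut(xs))
```

The shortcut form is convenient for hand calculation from column sums. In floating-point arithmetic it subtracts two large, nearly equal numbers, so the centered form is usually the safer choice in code.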
Worked example:

EXAMPLE 12.4. The cetane number is a critical property in specifying the ignition quality of a fuel used in a diesel engine. Determination of this number for a biodiesel fuel is expensive and time-consuming. The article “Relating the Cetane Number of Biodiesel Fuels to Their Fatty Acid Composition: A Critical Study” (J. of Automobile Engr., 2009: 565–583) included the following data on x = iodine value (g) and y = cetane number for a sample of 14 biofuels. The iodine value is the amount of iodine necessary to saturate a sample of 100 g of oil. The article’s authors fit the simple linear regression model to this data, so let’s follow their lead.

§ Relationship between the cetane number of biodiesel fuel and the iodine value (grams):
  $x$ = iodine value
  $y$ = cetane number

x   132.0  129.0  120.0  113.2  105.0  92.0  84.0  83.2  88.4  59.0  80.0  81.5  71.0  69.2
y    46.0   48.0   51.0   52.1   54.0  52.0  59.0  58.7  61.6  64.0  61.4  54.6  58.8  58.0

§ Quantities to compute:
$\sum_{i=1}^{n} x_i, \quad \sum_{i=1}^{n} y_i, \quad \sum_{i=1}^{n} x_i^2, \quad \sum_{i=1}^{n} x_i y_i$

The necessary summary quantities for hand calculation can be obtained by placing the x values in a column and the y values in another column and then creating columns for $x^2$, $xy$, and $y^2$ (these latter values are not needed at the moment but will be used shortly). Calculating the column sums gives $\sum x_i = 1307.5$, $\sum y_i = 779.2$, $\sum x_i^2 = 128{,}913.93$, $\sum x_i y_i = 71{,}347.30$, $\sum y_i^2 = 43{,}745.22$, from which

$S_{xx} = 128{,}913.93 - (1307.5)^2/14 = 6802.7693$
$S_{xy} = 71{,}347.30 - (1307.5)(779.2)/14 = -1424.41429$

The estimated slope of the true regression line (i.e., the slope of the least squares line) is

$\hat{\beta}_1 = \dfrac{S_{xy}}{S_{xx}} = \dfrac{-1424.41429}{6802.7693} = -0.20938742$

The estimated intercept of the true regression line (i.e., the intercept of the least squares line) is

$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 55.657143 - (-0.20938742)(93.392857) = 75.212432$

Observations vs. estimates

The equation of the estimated regression line (least squares line) is $y = 75.212 - 0.2094x$, exactly that reported in the cited article. Figure 12.8 displays a scatterplot of the data with the least squares line superimposed. This line provides a very good summary of the relationship between the two variables.

[Figure 12.8: Scatterplot for Example 12.4 with least squares line superimposed, from Minitab. Axes: cet num versus iod val. Fitted line: cet num = 75.21 – 0.2094 iod val.]
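The hand calculation for the cetane example can be reproduced with a short script. This is a sketch in plain Python using the 14 data pairs listed above and the shortcut formulas for S_xx and S_xy; the variable names are illustrative. The slope and intercept it prints should agree with the values above up to rounding.

```python
# Iodine value (x) and cetane number (y) for the 14 biofuels in the example.
x = [132.0, 129.0, 120.0, 113.2, 105.0, 92.0, 84.0,
     83.2, 88.4, 59.0, 80.0, 81.5, 71.0, 69.2]
y = [46.0, 48.0, 51.0, 52.1, 54.0, 52.0, 59.0,
     58.7, 61.6, 64.0, 61.4, 54.6, 58.8, 58.0]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(v * v for v in x)
sum_xy = sum(a * b for a, b in zip(x, y))

# Shortcut formulas for S_xx and S_xy.
s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

b1 = s_xy / s_xx                   # estimated slope
b0 = sum_y / n - b1 * sum_x / n    # estimated intercept

# Fitted values, residuals, and RSS for the observations vs. estimates view.
y_hat = [b0 + b1 * v for v in x]
residuals = [obs - fit for obs, fit in zip(y, y_hat)]
rss = sum(e * e for e in residuals)

print("sum x =", round(sum_x, 2), " sum y =", round(sum_y, 2))
print("Sxx =", round(s_xx, 4), " Sxy =", round(s_xy, 5))
print("slope =", round(b1, 8), " intercept =", round(b0, 6))
print("RSS =", round(rss, 4))
```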
Linear Regression with MATLAB

§ Uses two functions: polyfit and polyval.
§ The polyfit function is used to determine $\hat{\beta}_0$ and $\hat{\beta}_1$.
§ The polyval function is used to compute the prediction $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$ from the values of $X$, $\hat{\beta}_0$, and $\hat{\beta}_1$.
§ Experiment with the following code:

[Code listing and plot: Linear regression (line)]
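As a rough Python analogue of the polyfit/polyval workflow, and not the MATLAB listing from the slide, the sketch below defines two hypothetical helpers, `fit_line` and `eval_line`. They mimic the convention that MATLAB's `polyfit(x, y, 1)` returns coefficients in descending powers, that is `[slope, intercept]`, and that `polyval(p, x)` evaluates the polynomial at `x`. The data are made up for illustration.

```python
def fit_line(xs, ys):
    """Degree-1 least squares fit.
    Returns [slope, intercept], highest power first, mirroring the
    coefficient order of MATLAB's polyfit(x, y, 1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return [slope, y_bar - slope * x_bar]


def eval_line(p, x):
    """Evaluate p[0] * x + p[1], the degree-1 case of polyval(p, x)."""
    return p[0] * x + p[1]


# Hypothetical measurements.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]

p = fit_line(xs, ys)                      # role of polyfit
fitted = [eval_line(p, x) for x in xs]    # role of polyval on the data
print("slope, intercept:", [round(c, 4) for c in p])
print("prediction at x = 6:", round(eval_line(p, 6.0), 4))
```

Keeping the fit and the evaluation as separate steps mirrors the two-function workflow described on the slide: estimate the coefficients once, then reuse them for any new X.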


Summary

§ The concept of regression analysis

§ Simple linear regression:
• The simple linear regression model
• The concept of RSS
• Estimating the intercept and slope
Simple linear regression assignment

§ Group 1

Individual   Arm Strength, x   Dynamic Lift, y
 1           17.3               71.7
 2           19.3               48.3
 3           19.5               88.3
 4           19.7               75.0
 5           22.9               91.7
 6           23.1              100.0
 7           26.4               73.3
 8           26.8               65.0
 9           27.6               75.0
10           28.1               88.3
11           28.2               68.3
12           28.7               96.7
13           29.0               76.7
14           29.6               78.3
15           29.9               60.0
16           29.9               71.7
17           30.3               85.0
18           31.3               85.0
19           36.0               88.3
20           39.5              100.0
21           40.4              100.0
22           44.3              100.0
23           44.6               91.7
24           50.4              100.0
25           55.9               71.7

1. Create a scatter plot of the data above.
2. Determine the equation of the linear regression line for the data above ($\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$).
3. Draw this line on the scatter plot from task 1.
4. Estimate the dynamic lift when the arm strength is 27.
5. Compute the RSS from the observations and the estimates.
Simple linear regression assignment

§ Group 2

7.2 The grades of a class of 9 students on a midterm report (x) and on the final examination (y) are as follows:

x (midterm grade)   77  50  71  72  81  94  96  99  67
y (final grade)     82  66  78  34  47  85  99  99  68

1. Create a scatter plot of the data above.
2. Determine the equation of the linear regression line for the data above ($\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$).
3. Draw this line on the scatter plot from task 1.
4. Estimate the final exam grade when the midterm grade is 75.
5. Compute the RSS from the observations and the estimates.
Simple linear regression assignment

§ Group 3

Pressure, x (lb/sq in.)   Scale Reading, y
10                        13
10                        18
10                        16
10                        15
10                        20
50                        86
50                        90
50                        88
50                        88
50                        92

1. Create a scatter plot of the data above.
2. Determine the equation of the linear regression line for the data above ($\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$).
3. Draw this line on the scatter plot from task 1.
4. Estimate the scale reading when the pressure is 30.
5. Compute the RSS from the observations and the estimates.
Simple linear regression assignment

§ Group 4

7.5 A study was made on the amount of converted sugar in a certain process at various temperatures. The data were coded and recorded.

Temperature, x   Converted Sugar, y
1.0               8.1
1.1               7.8
1.2               8.5
1.3               9.8
1.4               9.5
1.5               8.9
1.6               8.6
1.7              10.2
1.8               9.3
1.9               9.2
2.0              10.5

1. Create a scatter plot of the data above.
2. Determine the equation of the linear regression line for the data above ($\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$).
3. Draw this line on the scatter plot from task 1.
4. Estimate the converted sugar when the temperature is 2.2.
5. Compute the RSS from the observations and the estimates.
Simple linear regression assignment

§ Group 5

7.6 In a certain type of metal test specimen, the normal stress on a specimen is known to be functionally related to the shear resistance. A data set of coded experimental measurements on the two variables is given here.

Normal Stress, x   Shear Resistance, y
26.8               26.5
25.4               27.3
28.9               24.2
23.6               27.1
27.7               23.6
23.9               25.9
24.7               26.3
28.1               22.5
26.9               21.7
27.4               21.4
22.6               25.8
25.6               24.9

1. Create a scatter plot of the data above.
2. Determine the equation of the linear regression line for the data above ($\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$).
3. Draw this line on the scatter plot from task 1.
4. Estimate the shear resistance when the normal stress is 25.
5. Compute the RSS from the observations and the estimates.
Simple linear regression assignment

§ Group 6

Placement Test, x   Course Grade, y
50                  53
35                  41
35                  61
40                  56
55                  68
65                  36
35                  11
60                  70
90                  79
35                  59
90                  54
80                  91
60                  48
60                  71
60                  71
40                  47
55                  53
50                  68
65                  57
50                  79

1. Create a scatter plot of the data above.
2. Determine the equation of the linear regression line for the data above ($\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$).
3. Draw this line on the scatter plot from task 1.
4. Estimate the course grade when the placement test score is 75.
5. Compute the RSS from the observations and the estimates.
Simple linear regression assignment

§ Group 7

7.9 A study was made by a retail merchant to determine the relation between weekly advertising expenditures and sales.

Advertising Costs ($), x   Sales ($), y
40                         385
20                         400
25                         395
20                         365
30                         475
50                         440
40                         490
20                         420
50                         560
40                         525
25                         480
50                         510

1. Create a scatter plot of the data above.
2. Determine the equation of the linear regression line for the data above ($\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$).
3. Draw this line on the scatter plot from task 1.
4. Estimate the sales when the advertising cost is 55.
5. Compute the RSS from the observations and the estimates.
Tugas regresi linier sederhana
Section 11-2/Simple Linear Regression

§ Kelompok 8
E11-2 House Data 11-8. Table E11-3 presents the highway
gasoline mileage performance and engine displacement for
Sale Taxes Sale Taxes DaimlerChrysler vehicles for model year 2005 (U.S. Environ-
Price/ (local, school), Price/ (local, school), mental Protection Agency).
1000 county)/1000 1000 county)/1000
25.9 4.9176 30.0 5.0500
𝑥
(a) Fit a simple linear model relating highway miles per gallon (y
to engine displacement ( x ) in cubic inches using least squares
𝑦 29.5 5.0208 36.9 8.2464 (b) Find an estimate of the mean highway gasoline mile
27.9 4.5429 41.9 6.6969 age performance for a car with 150 cubic inches engine
displacement.
25.9 4.5573 40.5 7.7841 (c) Obtain the fitted value of y and the corresponding residua
29.9 5.0597 43.9 9.0384 for a car, the Neon, with an engine displacement of 122
29.9 3.8910 37.5 5.9894 cubic inches.
30.9 5.8980 37.9 7.5422 11-9. An article in the Tappi Journal (March 1986) presented
28.9 5.6039 44.5 8.7951 data on green liquor Na2S concentration (in grams per liter)
and paper machine production (in tons per day). The data (read
35.9 5.8282 37.9 6.0831
from a graph) follow:
31.5 5.3003 38.9 8.3607
31.0 6.2712 36.9 8.1400 y 40 42 49 46 44 48
30.9 5.9592 45.8 9.1416 x 825 830 890 895 890 910

y 46 43 53 52 54 57 58
1. Buatlah scatter plot dari data di atas.
(c) Calculate the fitted value of y corresponding to x = 5.8980 .
Find the corresponding residual. x 915 960 990 1010 1012 1030 1050

2. Tentukan persamaan garis regresi linier dari data diatas (a)(ŶFit=a simple ˆ0 +
(d) Calculate the fitted ŷi for each value of xi used to fit the
model. Then construct a graph of ŷi versus the correspond- 1 X ). model with y = green liquor
linearˆregression
Na S concentration and x = production. Find an estimate
3. Gambarkan plot garis tersebut pada scatter plot soal no.1. least
2
ing observed value yi and comment on what this plot would
of σ . Draw a scatter diagram of the data and the resulting
2

look like if the relationship between y and x was a deter-


squares fitted model.
4. Tentukan nilai estimasi sale price ketika taxes bernilai 5.712.
ministic (no random error) straight line. Does the plot
(b) Find the fitted value of y corresponding to x = 910 and the
actually obtained indicate that taxes paid is an effective
associated residual.
5. Tentukan RSS dari observasi dan hasil estimasi.
regressor variable in predicting selling price?
11-7. The number of pounds of steam used per month by a
(c) Find the mean green liquor Na S concentration when the 2
production rate is 950 tons per day.
chemical plant is thought to be related to the average ambient 11-10. An article in the Journal of Sound and Vibration (1991
temperature (in °F) for that month. The past year’s usage and Vol. 151, pp. 383–394) described a study investigating the rela-
temperatures are in the following table: tionship between noise exposure and hypertension. The follow-
ing data are representative of those reported in the article.
Simple linear regression assignment

Group 9

E11-4 Propellant Data

Observation   Strength y   Age x
Number        (psi)        (weeks)
 1            2158.70      15.50
 2            1678.15      23.75
 3            2316.00       8.00
 4            2061.30      17.00
 5            2207.50       5.00
 6            1708.30      19.00
 7            1784.70      24.00
 8            2575.00       2.50
 9            2357.90       7.50
10            2277.70      11.00
11            2165.20      13.00
12            2399.55       3.75
13            1779.80      25.00
14            2336.75       9.75
15            1765.30      22.00
16            2053.50      18.00
17            2414.40       6.00
18            2200.50      12.50
19            2654.20       2.00
20            1753.70      21.50

1. Make a scatter plot of the data above.
2. Determine the equation of the linear regression line for the data above ($\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$).
3. Draw this line on the scatter plot from task 1.
4. Determine the estimated strength when age equals 10.
5. Determine the RSS from the observations and the estimated values.
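For the Group 9 propellant data (strength y in psi, age x in weeks), tasks 2, 4, and 5 can be sketched in plain Python with the closed-form least-squares estimates. The helper names `fit_line` and `rss` are hypothetical and chosen only for this illustration; the scatter plot of tasks 1 and 3 is left out because it needs a plotting library.

```python
# Group 9 propellant data: age x (weeks) and strength y (psi), observations 1 to 20.
age = [15.50, 23.75, 8.00, 17.00, 5.00, 19.00, 24.00, 2.50, 7.50, 11.00,
       13.00, 3.75, 25.00, 9.75, 22.00, 18.00, 6.00, 12.50, 2.00, 21.50]
strength = [2158.70, 1678.15, 2316.00, 2061.30, 2207.50, 1708.30, 1784.70,
            2575.00, 2357.90, 2277.70, 2165.20, 2399.55, 1779.80, 2336.75,
            1765.30, 2053.50, 2414.40, 2200.50, 2654.20, 1753.70]


def fit_line(x, y):
    """Return (b0, b1), the least-squares intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def rss(x, y, b0, b1):
    """Residual sum of squares between observations and fitted values."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


b0, b1 = fit_line(age, strength)          # task 2
strength_at_10 = b0 + b1 * 10.0           # task 4
rss_value = rss(age, strength, b0, b1)    # task 5

print(f"Y_hat = {b0:.2f} + ({b1:.2f}) X")
print(f"Estimated strength at age 10: {strength_at_10:.2f} psi")
print(f"RSS: {rss_value:.2f}")
```

A quick sanity check on any such fit is that the residuals sum to zero (up to rounding) and that the fitted line passes through the point of means.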
Simple linear regression assignment

Group 10

The number of pounds of steam used per month by a chemical plant is thought to be related to the average ambient temperature (in °F) for that month. The past year's usage (y) and temperatures (x) are in the following table:

Month   Temp.   Usage/1000      Month   Temp.   Usage/1000
Jan.    21      185.79          July    68      621.55
Feb.    24      214.47          Aug.    74      675.06
Mar.    32      288.03          Sept.   62      562.03
Apr.    47      424.84          Oct.    50      452.93
May     50      454.58          Nov.    41      369.95
June    59      539.03          Dec.    30      273.98

1. Make a scatter plot of the data above.
2. Determine the equation of the linear regression line for the data above ($\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$).
3. Draw this line on the scatter plot from task 1.
4. Determine the estimated usage when temperature equals 55.
5. Determine the RSS from the observations and the estimated values.
Simple linear regression assignment

Group 11

An article in the Journal of Sound and Vibration (1991, Vol. 151, pp. 383–394) described a study investigating the relationship between noise exposure and hypertension. The following data are representative of those reported in the article.

Blood pressure rise (mm of mercury)   y     1    0    1    2    5    1    4    6    2    3
Noise level (dB)                      x    60   63   65   70   70   70   80   90   80   80

Blood pressure rise (mm of mercury)   y     5    4    6    8    4    5    7    9    7    6
Noise level (dB)                      x    85   89   90   90   90   90   94  100  100  100

1. Make a scatter plot of the data above.
2. Determine the equation of the linear regression line for the data above ($\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$).
3. Draw this line on the scatter plot from task 1.
4. Determine the estimated blood pressure rise when the noise level equals 55.
5. Determine the RSS from the observations and the estimated values.
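In the Group 11 noise data the observed noise levels run from 60 to 100 dB, so an estimate at 55 dB lies outside the range used to fit the line. The sketch below fits the line and flags that situation; the variable names and the range check are illustrative additions, not part of the assignment.

```python
from statistics import mean

# Group 11 data: noise level x (dB) and blood pressure rise y (mm of mercury).
noise_db = [60, 63, 65, 70, 70, 70, 80, 90, 80, 80,
            85, 89, 90, 90, 90, 90, 94, 100, 100, 100]
bp_rise = [1, 0, 1, 2, 5, 1, 4, 6, 2, 3,
           5, 4, 6, 8, 4, 5, 7, 9, 7, 6]

# Closed-form least-squares estimates.
x_bar, y_bar = mean(noise_db), mean(bp_rise)
s_xx = sum((x - x_bar) ** 2 for x in noise_db)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(noise_db, bp_rise))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Task 4: estimate at x0 = 55 dB, with a check on the observed range.
x0 = 55
is_extrapolation = x0 < min(noise_db) or x0 > max(noise_db)
rise_at_x0 = b0 + b1 * x0

print(f"Y_hat = {b0:.4f} + {b1:.4f} X")
print(f"Estimated rise at {x0} dB: {rise_at_x0:.3f} mm of mercury")
if is_extrapolation:
    print(f"Note: {x0} dB is outside the observed range "
          f"[{min(noise_db)}, {max(noise_db)}] dB.")
```

An estimate outside the observed range relies on the straight-line relationship continuing to hold there, which the data alone cannot confirm.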
Simple linear regression assignment

Group 12

The following table shows the weight (x), in pounds, and the systolic blood pressure (y) of 20 samples.

Subject   Weight   Systolic BP      Subject   Weight   Systolic BP
 1        165      130              11        172      153
 2        167      133              12        159      128
 3        180      150              13        168      132
 4        155      128              14        174      149
 5        212      151              15        183      158
 6        175      146              16        215      150
 7        190      150              17        195      163
 8        210      140              18        180      156
 9        200      148              19        143      124
10        149      125              20        240      170

1. Make a scatter plot of the data above.
2. Determine the equation of the linear regression line for the data above ($\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$).
3. Draw this line on the scatter plot from task 1.
4. Determine the estimated systolic blood pressure when the body weight is 230 pounds.
5. Determine the RSS from the observations and the estimated values.
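For the Group 12 weight and systolic blood pressure data, the RSS of task 5 can be computed directly from the residuals or from the shortcut $\mathrm{RSS} = S_{yy} - S_{xy}^2 / S_{xx}$, which is handy when working by hand. The sketch below computes both so that one checks the other; the symbol $S_{yy}$ and the variable names are assumptions made for this illustration.

```python
# Group 12 data: weight x (pounds) and systolic blood pressure y, subjects 1 to 20.
weight = [165, 167, 180, 155, 212, 175, 190, 210, 200, 149,
          172, 159, 168, 174, 183, 215, 195, 180, 143, 240]
systolic = [130, 133, 150, 128, 151, 146, 150, 140, 148, 125,
            153, 128, 132, 149, 158, 150, 163, 156, 124, 170]

n = len(weight)
x_bar = sum(weight) / n
y_bar = sum(systolic) / n

# Corrected sums of squares and cross products.
s_xx = sum((x - x_bar) ** 2 for x in weight)
s_yy = sum((y - y_bar) ** 2 for y in systolic)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weight, systolic))

# Task 2: least-squares slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Task 4: estimated systolic blood pressure at 230 pounds.
bp_at_230 = b0 + b1 * 230

# Task 5: RSS computed two ways.
rss_direct = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(weight, systolic))
rss_shortcut = s_yy - s_xy ** 2 / s_xx

print(f"Y_hat = {b0:.3f} + {b1:.4f} X")
print(f"Estimated systolic BP at 230 lb: {bp_at_230:.2f}")
print(f"RSS (direct): {rss_direct:.3f}, RSS (shortcut): {rss_shortcut:.3f}")
```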
