
Statistics & Probability Class 02

CHAPTER VI
STATISTICS & PROBABILITY
ESTIMATION

Lecturer:
Bhidara Swantika, S.Si., M.Mat
COURSE OUTLINE
1. Introduction to statistics and probability concepts in Civil Engineering
2. Descriptive statistics
3. Basic concepts of probability
4. Random variables and probability distributions
5. Random sampling
6. Estimation
7. Hypothesis testing
8. Simple linear regression
9. Experimental design


WHAT IS ESTIMATION?
ESTIMATION: a method by which we can infer the values of a population from the values of
a sample.

POPULATION — parameters (μ, σ, p): abstract, often unknown.
SAMPLE — statistics (x̄, s, p̂): empirical and concrete, since they can be computed from
the sample.

The main purpose of taking a sample is to obtain information about the population
parameters.
What is estimation?
• In statistics, estimation (or inference) refers to the process by which one
makes inferences (e.g. draws conclusions) about a population, based on
information obtained from a sample.

• A statistic is any measurable quantity calculated from a sample of data (e.g.


the average). This is a stochastic variable as, for a given population, it will in
general vary from sample to sample.

• An estimator is any quantity calculated from the sample data which is used to
give information about an unknown quantity in the population (the estimand).

• An estimate is the particular value of an estimator that is obtained by a


particular sample of data and used to indicate the value of a parameter.
A simple example
• Population: people in this room

• Sample I: people sitting in the middle row

• Sample II: people whose names start with the letter M

• Statistic: average height

• I can use this statistic as an estimator for the average height of the population,
obtaining different results from the two samples
PDF of an estimator
• Ideally one can consider all
possible samples
corresponding to a given
sampling strategy and build
a probability density
function (PDF) for the
different estimates

• We will use the characteristics of this PDF to evaluate the quality of an estimator

[Figure: PDF of the estimates; horizontal axis: value of the estimated statistic]
Bias of an estimator
[Figure: PDF of the estimates, marking the mean estimate E(θ̂) and the population value θ₀]

• The bias of an estimator is the difference between the expectation value over its PDF
(i.e. its mean value) and the population value:

b(θ̂) = E(θ̂) − θ₀

• An estimator is called unbiased if b = 0, and biased otherwise
Examples
• The sample mean is an unbiased estimator of the population mean

x̄ = (1/N) Σᵢ₌₁ᴺ xᵢ ,   E[x̄] = (1/N) Σᵢ₌₁ᴺ E[xᵢ] = μ

• Exercise: Is the sample variance an unbiased estimator of the


population variance?

s² = (1/N) Σᵢ₌₁ᴺ (xᵢ − x̄)² ,   E[s²] = ???
Examples
• Note that functional invariance does not hold.

• If you have an unbiased estimator s² for the population variance σ² and you take its
square root, this will NOT be an unbiased estimator for the population rms value σ!

• This applies to any non-linear transformation, including division.

• Therefore, avoid computing ratios of estimates as much as you can.
Consistent estimators

• We can build a sequence of estimators by progressively increasing


the sample size

• If the probability that the estimates deviate from the population value by more than
some ε > 0 tends to zero as the sample size tends to infinity, we say that the
estimator is consistent
Example
• The sample mean is a consistent estimator of the population mean
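This can be illustrated with a short simulation (plain Python; the exponential population with mean 5 is an arbitrary choice for illustration):

```python
import random

random.seed(0)

MU = 5.0  # population mean of the simulated exponential distribution

# Compute the sample mean for increasing sample sizes; its deviation
# from the population mean typically shrinks as n grows
deviations = []
for n in (10, 1_000, 100_000):
    sample = [random.expovariate(1 / MU) for _ in range(n)]
    deviations.append(abs(sum(sample) / n - MU))

print(deviations)
```

By the law of large numbers the deviation shrinks like 1/√n on average, so the estimate at n = 100,000 lies very close to the population mean.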
Relative efficiency
Suppose there are 2 or more unbiased
estimators of the same quantity, which one
should we use? (e.g. should we use the
sample mean or sample median to estimate
the centre of a Gaussian distribution?)

• Intuition suggests that we should use the estimator that is closer


(in a probabilistic sense) to the population value. One way to do this
is to choose the estimator with the lowest variance.
• We can thus define the relative efficiency of two unbiased estimators as the ratio of
their variances.
• If there is an unbiased estimator that has lower variance than any other for all
possible population values, it is called the minimum-variance unbiased estimator (MVUE)
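The mean-versus-median question above can be answered empirically. A sketch (plain Python; the standard normal population and sample size 25 are illustrative assumptions) comparing the variances of the two estimators of the centre:

```python
import random
import statistics

random.seed(1)

N, TRIALS = 25, 20_000

# Draw many samples and record both estimators of the centre
means, medians = [], []
for _ in range(TRIALS):
    sample = [random.gauss(0.0, 1.0) for _ in range(N)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)

# For Gaussian data the median's variance is roughly pi/2 times the
# mean's, so the sample mean is the more efficient estimator
print(var_median / var_mean)
```

For a Gaussian population the sample mean wins; for heavy-tailed populations the ordering can reverse, which is why relative efficiency depends on the assumed distribution.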
Population parameters are studied via:
1. Estimation
2. Hypothesis testing

Properties of a good estimator:
1. Unbiased
2. Efficient
3. Consistent
Accuracy vs precision
• The bias and the variance
of an estimator are very
different concepts (see the
bullseye analogy on the
right)

• Bias quantifies accuracy

• Variance quantifies
precision
Efficient estimators
• A theorem known as the Cramer-Rao bound (see Alan
Heaven’s lectures) proves that the variance of an unbiased
estimator must be greater than or equal to a specific value
which only depends on the sampling strategy (it corresponds
to the reciprocal of the Fisher information of the sample)

• We can thus define an absolute efficiency of an estimator as


the ratio between the minimum variance and the actual
variance

• An unbiased estimator is called efficient if its variance


coincides with the minimum variance for all values of the
population parameter θ₀
Desirable properties of an
estimator

 Consistency
 Unbiasedness
 Efficiency

• However, unbiased and/or efficient estimators do not always


exist

• Practitioners are not particularly keen on unbiasedness, so they often tend to favor
estimators for which the mean square error, MSE = E[(θ̂ − θ₀)²], is as low as possible
independently of the bias.
Minimum mean-square error
• Note that,

MSE = E[(θ̂ − θ₀)²] = E[(θ̂ − E[θ̂] + E[θ̂] − θ₀)²]
    = E[(θ̂ − E[θ̂])²] + (E[θ̂] − θ₀)²
    = σ²(θ̂) + b²(θ̂)

• A biased estimator with small


variance can then be preferred
to an unbiased one with large
variance

• However, identifying the


minimum mean-square error
estimator from first principles is
often not an easy task. Also the
solution might not be unique
(the bias-variance tradeoff)
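The bias-variance tradeoff can be made concrete with the variance estimators from the earlier slides. A simulation sketch (plain Python; the Gaussian population with σ² = 4 is an illustrative assumption) comparing the MSE of SS/(N−1), SS/N and SS/(N+1), where SS is the sum of squared deviations from the sample mean:

```python
import random

random.seed(7)

SIGMA2 = 4.0          # true population variance (Gaussian, sigma = 2)
N, TRIALS = 5, 100_000

def variance(xs, divisor):
    """Sum of squared deviations from the sample mean, over a chosen divisor."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / divisor

# Divisors: N-1 (unbiased), N (maximum likelihood), N+1 (smaller MSE)
mse = {d: 0.0 for d in (N - 1, N, N + 1)}
for _ in range(TRIALS):
    sample = [random.gauss(0.0, 2.0) for _ in range(N)]
    for d in mse:
        mse[d] += (variance(sample, d) - SIGMA2) ** 2 / TRIALS

# For Gaussian samples the biased divisors trade a little bias for a
# larger reduction in variance, so the MSE ordering is N+1 < N < N-1
print(mse)
```

The unbiased divisor N − 1 actually has the worst MSE of the three here, illustrating why a slightly biased estimator with small variance can be preferable.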
Point vs interval estimates
• A point estimate of a population parameter is a single value of a
statistic (e.g. the average height). This in general changes with the
selected sample.

• In order to quantify the uncertainty of the sampling method it is


convenient to use an interval estimate defined by two numbers
between which a population parameter is said to lie

• An interval estimate is generally associated with a confidence level.


Suppose we collected many different samples (with the same
sampling strategy) and computed confidence intervals for each of
them. Some of the confidence intervals would include the population
parameter, others would not. A 95% confidence level means that
95% of the intervals contain the population parameter.
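The meaning of the 95% confidence level can be verified by simulation. A sketch in plain Python (Gaussian population with known σ; the parameter values are illustrative assumptions) that counts how many intervals contain the population mean:

```python
import math
import random
import statistics

random.seed(3)

MU, SIGMA = 50.0, 10.0   # population parameters (sigma treated as known)
N, TRIALS = 30, 5_000
Z95 = 1.96               # two-sided 95% standard normal quantile

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    mean = statistics.fmean(sample)
    half = Z95 * SIGMA / math.sqrt(N)   # half-width of the interval
    if mean - half <= MU <= mean + half:
        covered += 1

print(covered / TRIALS)   # close to 0.95, as the confidence level promises
```

Note that each individual interval either contains μ or it does not; the 95% refers to the long-run fraction of intervals over repeated sampling.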
POINT ESTIMATION

⇢ MEAN: X̄ = ΣX / n

⇢ VARIANCE: S² = Σ(X − X̄)² / (n − 1)

⇢ PROPORTION: P = X / n

INTERVAL ESTIMATION

The population parameter θ is described by statistics lying in an interval, say
θ̂₁ < θ < θ̂₂.

The confidence level of the interval is 1 − α, with 0 < 1 − α < 1.

If a random sample is drawn, the sampling distribution of the statistic θ̂ is obtained,
so the probability of the interval θ̂₁ < θ < θ̂₂ can be computed:

P(θ̂₁ < θ < θ̂₂) = 1 − α
INTERVAL ESTIMATION
• Large samples

• Small samples
ESTIMATION OF THE MEAN
• Difference of two means, large samples

• Difference of two means, small samples
ESTIMATION OF PROPORTIONS
• One proportion, large sample

• Difference of two proportions
ESTIMATION OF THE VARIANCE
This is all theory but how do
we build an estimator in
practice?
Let’s consider a simple (but common) case.

Suppose we perform an experiment where we measure a real-


valued variable X.

The experiment is repeated n times to generate a random


sample X1, … , Xn of independent, identically distributed
variables (iid).

We also assume that the shape of the population PDF of X is


known (Gaussian, Poisson, binomial, etc.) but has k unknown
parameters θ₁, …, θₖ with k < n.
The old way: method of moments
• The method of moments is a technique for constructing
estimators of the parameters of the population PDF

• It consists of equating sample moments (mean, variance,


skewness, etc.) with population moments

• This gives a number of equations that might (or might not)


admit an acceptable solution
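As a concrete sketch, consider the method of moments for a two-parameter Gamma population (the shape and scale values below are illustrative assumptions): its mean is kθ and its variance is kθ², so equating sample and population moments gives k̂ = x̄²/s² and θ̂ = s²/x̄.

```python
import random
import statistics

random.seed(11)

# Simulated data from a Gamma(shape=3, scale=2) population
# (these parameter values are illustrative assumptions)
SHAPE, SCALE = 3.0, 2.0
data = [random.gammavariate(SHAPE, SCALE) for _ in range(50_000)]

# Population moments: mean = shape * scale, variance = shape * scale^2.
# Equate them with the sample moments and solve the two equations:
m = statistics.fmean(data)
v = statistics.variance(data)

shape_hat = m * m / v
scale_hat = v / m

print(shape_hat, scale_hat)   # close to 3 and 2
```

Here the moment equations have a unique acceptable solution; for other families the system can have no solution in the valid parameter range, which is one of the method's limitations.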
R.A. Fisher (1890-1962)
“Fisher was to statistics what Newton was to Physics” (R. Kass)

“Even scientists need their heroes, and R.A. Fisher was the hero of
20th century statistics” (B. Efron)
Fisher’s concept of likelihood
• “Two radically distinct concepts have been confused under the name of
‘probability’ and only by sharply distinguishing between these can we
state accurately what information a sample does give us respecting the
population from which it was drawn.” (Fisher 1921)

• “We may discuss the probability of occurrence of quantities which can


be observed…in relation to any hypotheses which may be suggested to
explain these observations. We can know nothing of the probability of
the hypotheses…We may ascertain the likelihood of the hypotheses…by
calculation from observations:…to speak of the likelihood…of an
observable quantity has no meaning.” (Fisher 1921)

• “The likelihood that any parameter (or set of parameters) should have
any assigned value (or set of values) is proportional to the probability
that if this were so, the totality of observations should be that
observed.” (Fisher 1922)
The Likelihood function
• In simple words, the likelihood of a model given a dataset is
proportional to the probability of the data given the model

• The likelihood function supplies an order of preference or plausibility of the values
of the θᵢ by how probable they make the observed dataset

• The likelihood ratio between two models can then be used to prefer
one to the other

• Another convenient feature of the likelihood function is that it is functionally
invariant. This means that any quantitative statement about the θᵢ implies a
corresponding statement about any one-to-one function of the θᵢ by direct algebraic
substitution
Maximum Likelihood
• The likelihood function is a statistic (i.e. a function of the data) which gives the
probability of obtaining that particular set of data, given the chosen parameters
θ₁, …, θₖ of the model. It should be understood as a function of the unknown model
parameters (but it is NOT a probability distribution for them)

• The values of these parameters that maximize the sample likelihood are
known as the Maximum Likelihood Estimates or MLE’s.

• Assuming that the likelihood function is differentiable, estimation is done by solving
∂L/∂θᵢ = 0 for i = 1, …, k

• On the other hand, the maximum value may not exist at all.
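A minimal sketch of maximum likelihood for exponentially distributed data (the true rate 0.5 is an illustrative assumption): the log-likelihood is ℓ(λ) = n ln λ − λ Σxᵢ, maximized here both by a brute-force grid scan and by the closed form λ̂ = n/Σxᵢ obtained from dℓ/dλ = 0.

```python
import math
import random

random.seed(5)

# Simulated exponential data; the true rate 0.5 is an illustrative assumption
TRUE_RATE = 0.5
data = [random.expovariate(TRUE_RATE) for _ in range(2_000)]

n, total = len(data), sum(data)

def log_likelihood(lam):
    # log L(lambda) = n * log(lambda) - lambda * sum(x) for an iid exponential sample
    return n * math.log(lam) - lam * total

# Brute-force maximization over a grid of candidate rates
grid = [i / 1000 for i in range(1, 2000)]
lam_mle = max(grid, key=log_likelihood)

# Setting d(log L)/d(lambda) = 0 gives the closed form n / sum(x)
closed_form = n / total

print(lam_mle, closed_form)   # both close to 0.5
```

Maximizing the log-likelihood rather than the likelihood itself is standard practice: the logarithm is monotone, so the maximizer is the same, and sums are numerically better behaved than long products.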
Properties of MLE’s
As the sample size increases to infinity (under weak regularity conditions):

• MLE’s become asymptotically efficient and asymptotically unbiased


• MLE’s asymptotically follow a normal distribution with covariance
matrix equal to the inverse of the Fisher’s information matrix (see Alan
Heaven’s lectures)

However, for small samples,

• MLE’s can be heavily biased and the large-sample optimality does not
apply
The Bayesian way
Bayesian estimation
• In the Bayesian approach to statistics (see Jasper Wall’s lectures),
population parameters are associated with a posterior probability
which quantifies our degree of belief in the different values

• Sometimes it is convenient to introduce estimators obtained by


minimizing the posterior expected value of a loss function

• For instance one might want to minimize the mean square error,
which leads to using the mean value of the posterior distribution as
an estimator

• If, instead, one prefers to keep functional invariance, the median of the posterior
distribution has to be chosen

• Remember, however, that whatever choice you make is somewhat


arbitrary as the relevant information is the entire posterior
probability density.
EXAMPLE 1
From the population of employees of a company, a sample of 100 people was taken and
their monthly salaries were recorded. The mean and standard deviation of their salaries
are Rp 3,000,000 and Rp 600,000. Construct a 95% confidence interval to estimate the
true mean salary of the employees of the company.
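A sketch of the large-sample z-interval for Example 1 (plain Python; this follows the usual x̄ ± z·s/√n construction, treating the sample standard deviation as known since n = 100 is large):

```python
import math
from statistics import NormalDist

n = 100
mean = 3_000_000     # Rp, sample mean monthly salary
sd = 600_000         # Rp, sample standard deviation
conf = 0.95

z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # ~1.96
half = z * sd / math.sqrt(n)                   # margin of error

print(mean - half, mean + half)   # roughly Rp 2,882,400 to Rp 3,117,600
```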
EXAMPLE 2
The marketing manager of a property developer wants to estimate how long it takes to
sell one type of building. A sample of 25 sales had a mean time of 54 days and a sample
standard deviation of 5 days, drawn from an approximately normal population. Determine
the 90% and 99% confidence interval estimates for the population mean.
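A sketch for Example 2 using the small-sample t-interval x̄ ± t·s/√n (the t critical values for 24 degrees of freedom are taken from a standard t table, since Python's standard library has no t distribution):

```python
import math

n, mean, s = 25, 54.0, 5.0
se = s / math.sqrt(n)   # standard error = 1.0 day

# t critical values for n - 1 = 24 degrees of freedom, from a standard t table
t_90 = 1.711   # two-sided 90%
t_99 = 2.797   # two-sided 99%

ci_90 = (mean - t_90 * se, mean + t_90 * se)
ci_99 = (mean - t_99 * se, mean + t_99 * se)

print(ci_90)   # about (52.3, 55.7) days
print(ci_99)   # about (51.2, 56.8) days
```

The 99% interval is wider than the 90% one: demanding higher confidence costs precision.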
EXAMPLE 3
Two groups of students, A and B, took a civil engineering exam. From group A a sample of
75 students was taken, with a mean score of 82; from group B, 50 students, with a mean
score of 76. The standard deviation of group A is 8 and that of group B is 6. Determine
a 96% confidence interval for the difference between the mean scores of the two groups.
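A sketch for Example 3 using the large-sample interval for a difference of means, (x̄A − x̄B) ± z·√(σA²/nA + σB²/nB):

```python
import math
from statistics import NormalDist

nA, meanA, sdA = 75, 82.0, 8.0   # group A
nB, meanB, sdB = 50, 76.0, 6.0   # group B
conf = 0.96

z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # ~2.054
se = math.sqrt(sdA**2 / nA + sdB**2 / nB)      # standard error of the difference
diff = meanA - meanB

print(diff - z * se, diff + z * se)   # about 3.42 to 8.58
```

Since the whole interval lies above zero, group A's mean score is credibly higher than group B's at this confidence level.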
EXAMPLE 4
In a sample of 500 applications to a construction company, 340 were from men. Find a 90%
confidence interval estimate for the proportion of male applicants.
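A sketch for Example 4 using the large-sample interval for a proportion, p̂ ± z·√(p̂(1 − p̂)/n):

```python
import math
from statistics import NormalDist

n, x = 500, 340
p_hat = x / n        # sample proportion of male applicants, 0.68
conf = 0.90

z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # ~1.645
se = math.sqrt(p_hat * (1 - p_hat) / n)

print(p_hat - z * se, p_hat + z * se)   # about 0.646 to 0.714
```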
