CHAPTER 1: Statistics and Data
a. 3 steps to do statistics :
1. Find the right data
2. Use the appropriate tool
3. Interpret the number
b. 2 branches of statistics :
1. Descriptive = summarizes a data set
2. Inferential = uses a sample to extract useful information about the
population.
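Why do we use a sample?
- Observing the whole population is too expensive (e.g. a population census)
- Observing the whole population is impossible (e.g. testing every battery, drawing all of a patient's blood)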
c. 2 types of data :
1. Cross-sectional = - many subjects/categories
- at one point in time
2. Time series = - one subject
- over time
d. Variable = a characteristic that differs across observations
2 types of variable :
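1. Qualitative : described verbally (labels or categories)
2. Quantitative : described numerically (the numbers are meaningful)
2 types of quantitative variable : 1. Discrete = countable values (number of people, number of pages)
2. Continuous = measured values (weight, time)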
e. 4 categories of data measurement :
1. Nominal : OPK group name and OPK group number
2. Ordinal : food rated very tasty, tasty, average, not tasty
(categories can be ranked, but differences are meaningless)
3. Interval : temperature
(differences are equal, no true zero point)
4. Ratio : number of pimples
(differences are equal, has true zero point)
CHAPTER 2: Tabular and Graphic Method
a. Qualitative data : Nominal and ordinal
- Example : Nominal = food orders
Frequency distribution table
Food | Tally | Frequency | Cumulative Frequency | Relative Frequency | Cumulative Relative Frequency
Totals : the Frequency column sums to the number of observations; the Relative Frequency column sums to 1
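The columns of the frequency distribution table can be filled in mechanically. The sketch below is one possible way to do it in Python; the function name `frequency_table` and the list of food orders are invented for illustration and are not part of the notes.

```python
from collections import Counter

def frequency_table(observations):
    """Return rows of (category, frequency, cumulative frequency,
    relative frequency, cumulative relative frequency)."""
    counts = Counter(observations)  # keeps categories in first-seen order
    n = len(observations)
    rows = []
    cumulative = 0
    for category, freq in counts.items():
        cumulative += freq
        rows.append((category, freq, cumulative, freq / n, cumulative / n))
    return rows

# Hypothetical food orders, made up for this example only.
orders = ["rice", "noodles", "rice", "soup", "rice", "noodles"]
table = frequency_table(orders)
for row in table:
    print(row)
```

The last row's cumulative frequency equals the number of observations and its cumulative relative frequency equals 1, which is a quick check that no order was missed.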
b. Quantitative data : Interval and ratio
Guidelines : 1. Classes are mutually exclusive (they do not overlap)
2. Classes are exhaustive (every observation falls into a class)
3. The total number of classes is usually 5-20
Width of each class : (Largest value - Smallest value) / Number of classes
Class | Tally | Frequency | Cumulative Frequency | Relative Frequency | Cumulative Relative Frequency
Totals : the Frequency column sums to the number of observations; the Relative Frequency column sums to 1
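The class-width formula and the "mutually exclusive, exhaustive" guideline can be sketched in code. This is a minimal illustration, not a standard routine: the function `class_frequencies` and the data values are made up, and the width is used exactly as computed (in practice it is often rounded up to a convenient number).

```python
def class_frequencies(data, num_classes):
    """Split data into equal-width classes and count each class.

    Assumes the values are not all identical (otherwise the width is 0).
    """
    smallest, largest = min(data), max(data)
    width = (largest - smallest) / num_classes
    counts = [0] * num_classes
    for x in data:
        # Each value lands in exactly one class; the largest value is
        # placed in the last class so that the classes are exhaustive.
        index = min(int((x - smallest) / width), num_classes - 1)
        counts[index] += 1
    return width, counts

# Hypothetical observations, made up for this example only.
data = [12, 15, 21, 22, 27, 30, 34, 38, 41, 45, 47, 52]
width, counts = class_frequencies(data, 5)
print(width, counts)
```

Because every observation is counted once, the class frequencies add up to the number of observations.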
c. Types of Diagram :
Pie Bar
Histogram Polygon
Ogive (cumulative freq) Scatterplot
Stem and Leaf
CHAPTER 3: Numerical Descriptive Measures
a. Measures of central location = values that describe the center around which
quantitative data cluster
1. Mean
Two types : i. Sample mean : $\bar{x} = \frac{\sum x_i}{n}$
ii. Population mean : $\mu = \frac{\sum x_i}{N}$
Weakness : the mean is influenced by outliers (extremely small or large values)
2. Median = middle value of the data set once it is sorted
Example : 8 3 7 6 6 sorted is 3 6 6 7 8, so the median is 6
3. Mode = the value that occurs most frequently
A data set might have more than one mode, or no mode at all
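The three measures can be checked on the small data set 8 3 7 6 6 with Python's standard `statistics` module. The extra value 100 is a made-up outlier, added only to show how the mean reacts while the median barely moves.

```python
import statistics

scores = [8, 3, 7, 6, 6]

mean_value = statistics.mean(scores)
median_value = statistics.median(scores)
modes = statistics.multimode(scores)  # list, since there may be several modes

# Append a hypothetical extreme value to illustrate the effect of an outlier.
with_outlier = scores + [100]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)

print(mean_value, median_value, modes)
print(mean_with_outlier, median_with_outlier)
```

`multimode` is used instead of `mode` because it returns every most-frequent value, which matches the note that a data set can have more than one mode.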
Percentiles : describe how the data are spread when sorted in ascending order
Example : a GPA of 3.8 is the 80th percentile
Meaning : - about 80% of the population has a GPA below 3.8
- the remaining 20% (100% - 80%) has a GPA above 3.8
Terms in percentile :
25th = 1st quartile (Q1)
50th = 2nd quartile (Q2) = median
75th = 3rd quartile (Q3)
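Calculating the pth percentile :
Location in the sorted data = (n+1) p/100
If the location is an integer, the percentile is the observation at that position
(e.g. for 3 4 5 6 6 the 50th percentile is at 6 (50/100) = 3, the 3rd value, which is 5).
If not, interpolate between the two neighbouring observations.
Example : 3 4 5 6 6 The 25th percentile (Q1)?
6 (25/100) = 1.5 → the value at position 1.5
Value 1 : 3 Value 2 : 4 Value 1.5 : 3 + 0.5 (4-3) = 3.5 Interpretation?
Example : 3 4 5 6 6 The 20th percentile? (6 (20/100) = 1.2 is not an integer, so interpolate here too)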
Exercise : 58 60 63 66 66 68 69 73 Q3? Interpretation?
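The location rule (n+1) p/100 with interpolation can be written as a short function. This is a sketch of the rule as stated in these notes; the function name `percentile` is made up, and clamping to the smallest or largest value when the location falls outside the data is an assumption added here. Library routines such as NumPy's default percentile use a different interpolation rule and may give different answers.

```python
def percentile(data, p):
    """pth percentile using the location (n + 1) * p / 100."""
    values = sorted(data)
    n = len(values)
    location = (n + 1) * p / 100
    lower = int(location)          # position of the observation just below
    fraction = location - lower    # how far to move towards the next one
    # Assumption: clamp when the location falls outside positions 1..n.
    if lower < 1:
        return values[0]
    if lower >= n:
        return values[-1]
    below, above = values[lower - 1], values[lower]
    return below + fraction * (above - below)

print(percentile([3, 4, 5, 6, 6], 25))
print(percentile([58, 60, 63, 66, 66, 68, 69, 73], 75))
```

The first call reproduces the Q1 example (location 1.5, halfway between 3 and 4); the second applies the same rule to the exercise data.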
Boxplot : displays the smallest value, Q1, Q2 (median), Q3, and the largest value; it is used
to detect outliers with the interquartile range, IQR = Q3 - Q1
Detecting outliers
Compute 1.5 × IQR, e.g. with IQR = 43.48 : 1.5 × 43.48 = 65.22. With S = smallest value and L = largest value, there are outliers if :
Q1 - S > 65.22, or if L - Q3 > 65.22
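The outlier rule compares the distance from each quartile to the nearest extreme value against 1.5 × IQR. The sketch below assumes Q1 and Q3 are already known; the function name `outlier_check` and the four input numbers are hypothetical, chosen only to show one side failing the rule.

```python
def outlier_check(smallest, q1, q3, largest):
    """Apply the 1.5 x IQR rule to both ends of the data."""
    iqr = q3 - q1
    limit = 1.5 * iqr
    return {
        "limit": limit,
        "low": q1 - smallest > limit,    # outliers on the small side?
        "high": largest - q3 > limit,    # outliers on the large side?
    }

# Hypothetical five-number summary (median omitted, it is not needed here).
result = outlier_check(smallest=2, q1=10, q3=20, largest=50)
print(result)
```

Here IQR = 10, so the limit is 15: the smallest value is only 8 below Q1 (no low outliers), while the largest value is 30 above Q3 (high outliers exist).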
Geometric Mean Return = average return over multiple periods :
$G_R = \sqrt[n]{(1+R_1)(1+R_2)\cdots(1+R_n)} - 1$
n = number of multiperiod returns
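Example :
Today : 10 million Next week : -10% In 2 weeks : +12%
Arithmetic : (-10% + 12%) / 2 = 1% per week, which implies 10 × 1.01^2 ≈ 10.2 million
Geometric : (0.90 × 1.12)^(1/2) - 1 ≈ 0.4% per week, which implies 10 × 1.004^2 ≈ 10.08 million, the actual ending value (10 × 0.90 × 1.12 = 10.08 million)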
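The difference between the two averages is easiest to see by compounding each one over the two weeks and comparing with the actual ending value. The sketch below uses the figures from the example (10 million, then -10%, then +12%); the function names are made up for illustration.

```python
import math

def arithmetic_mean_return(returns):
    return sum(returns) / len(returns)

def geometric_mean_return(returns):
    # nth root of the product of (1 + R_i), minus 1
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

returns = [-0.10, 0.12]
start = 10_000_000

actual_end = start * math.prod(1 + r for r in returns)
arithmetic = arithmetic_mean_return(returns)
geometric = geometric_mean_return(returns)

# Ending value implied by earning each average return for two periods.
end_if_arithmetic = start * (1 + arithmetic) ** 2
end_if_geometric = start * (1 + geometric) ** 2

print(arithmetic, geometric)
print(actual_end, end_if_arithmetic, end_if_geometric)
```

Compounding the geometric mean return recovers the actual ending value, while compounding the arithmetic mean overstates it; this is why the geometric mean is the one used for multiperiod returns.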
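Average growth rate : $G_g = \sqrt[n]{(1+g_1)(1+g_2)\cdots(1+g_n)} - 1$, where $g_1, \dots, g_n$ are the growth rates of the n periods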
Measures of Dispersion
1. Range
= Maximum Value - Minimum Value
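2. Mean Absolute Deviation (MAD) : average of the absolute differences of each observation from the mean.
Sample : $MAD = \frac{\sum |x_i - \bar{x}|}{n}$, where $\bar{x}$ is the sample mean
3. Variance and Standard Deviation
Sample : $s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$ and $s = \sqrt{s^2}$
Population : $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$ and $\sigma = \sqrt{\sigma^2}$, where $\mu$ is the population mean
4. Coefficient of Variation (CV) = standard deviation divided by the mean ($CV = s / \bar{x}$); CV adjusts for differences in the magnitudes of the means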
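The four dispersion measures can be computed side by side on the small data set 3 4 5 6 6 used earlier in this chapter. The sketch treats the data as a sample (dividing by n - 1 for the variance), which is an assumption; use `statistics.pvariance` and `statistics.pstdev` for a population.

```python
import statistics

data = [3, 4, 5, 6, 6]
n = len(data)
mean = statistics.mean(data)

value_range = max(data) - min(data)
mad = sum(abs(x - mean) for x in data) / n   # mean absolute deviation
variance = statistics.variance(data)         # sample variance (n - 1)
std_dev = statistics.stdev(data)             # sample standard deviation
cv = std_dev / mean                          # coefficient of variation

print(value_range, mad, variance, std_dev, cv)
```

Because CV divides by the mean, it has no unit and can be used to compare the relative spread of data sets whose means differ a lot.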
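Mean-Variance Analysis : the performance of an asset is measured by its rate
of return,
judged on both the reward (mean return) and the risk (variance or standard deviation of the return)
The Sharpe ratio (reward-to-volatility ratio) : measures the extra return per unit of risk
Sharpe ratio = $\frac{\bar{x}_I - \bar{R}_f}{s_I}$, where $\bar{x}_I$ is the mean return of investment I, $\bar{R}_f$ is the mean return of a risk-free asset, and $s_I$ is the standard deviation of investment I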
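"Extra return per unit of risk" translates into one subtraction and one division. The sketch below assumes the usual definition (mean return minus the risk-free return, divided by the standard deviation of the return); the two funds and all of their figures are hypothetical.

```python
def sharpe_ratio(mean_return, risk_free_return, std_dev):
    """Extra return over the risk-free rate per unit of risk."""
    return (mean_return - risk_free_return) / std_dev

# Hypothetical funds, made up for this example only.
risk_free = 0.02
fund_a = sharpe_ratio(mean_return=0.10, risk_free_return=risk_free, std_dev=0.20)
fund_b = sharpe_ratio(mean_return=0.12, risk_free_return=risk_free, std_dev=0.30)

print(fund_a, fund_b)
```

Fund B has the higher mean return, but fund A earns more extra return for each unit of risk taken, so fund A has the higher Sharpe ratio.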
Chebyshev's Theorem
: For any data set, the proportion of observations that lie within k
standard deviations from the mean is at least $1 - 1/k^2$, where k is any number greater than 1.
Number of students in the class = 35
Mean = 84, standard deviation = 7
How many students have scores between 70 and 98?
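One way to work through this question is to express the interval in standard deviations and then apply the bound. The sketch below uses the class figures given above; the function name `chebyshev_lower_bound` is made up, and rounding up to a whole student is an interpretation of "at least".

```python
import math

def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

n, mean, std_dev = 35, 84, 7
lower, upper = 70, 98

# 70 and 98 are both 14 points from the mean, i.e. k = 2 standard deviations.
k = min(mean - lower, upper - mean) / std_dev
proportion = chebyshev_lower_bound(k)
min_students = math.ceil(proportion * n)

print(k, proportion, min_students)
```

With k = 2 the bound is 1 - 1/4 = 0.75, and 0.75 × 35 = 26.25, so at least 27 students score between 70 and 98. The theorem gives only a lower bound; the actual number may be higher.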