
BUSINESS STATISTICS Notes


BUSINESS STATISTICS

- Descriptive statistics (visual)
- Inferential statistics
- Regression: lets us assess how strongly various factors influence a given outcome.
- Cross-sectional data (many subjects, many locations, one point in time, independent observations) and time series data (one subject, one location, many points in time, related observations).
- Union: A or B (A ∪ B); Intersection: A and B (A ∩ B)
- Multiplication rule: P(A ∩ B) = P(A) × P(B | A)
- Addition rule: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
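The multiplication and addition rules can be checked by brute-force counting on a small sample space. The sketch below assumes two fair dice and two events picked purely for illustration (A: the first die is even, B: the sum is 7); neither event comes from the notes.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes of two fair dice (an assumed example).
outcomes = list(product(range(1, 7), repeat=2))


def prob(event, space):
    """Classical probability: favorable outcomes / total outcomes in `space`."""
    return Fraction(sum(1 for o in space if event(o)), len(space))


def event_a(o):
    return o[0] % 2 == 0          # hypothetical event A: first die is even


def event_b(o):
    return o[0] + o[1] == 7       # hypothetical event B: sum of faces is 7


p_a = prob(event_a, outcomes)
p_b = prob(event_b, outcomes)
p_a_and_b = prob(lambda o: event_a(o) and event_b(o), outcomes)
p_a_or_b = prob(lambda o: event_a(o) or event_b(o), outcomes)

# Conditional probability P(B | A): count B only among outcomes where A happened.
space_given_a = [o for o in outcomes if event_a(o)]
p_b_given_a = prob(event_b, space_given_a)

print("P(A and B) =", p_a_and_b, "| P(A) * P(B|A) =", p_a * p_b_given_a)
print("P(A or B)  =", p_a_or_b, "| P(A) + P(B) - P(A and B) =", p_a + p_b - p_a_and_b)
```

Using `Fraction` keeps the probabilities exact, so both sides of each rule can be compared without rounding.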
Relationship between two variables:
- Two quantitative variables: scatter plot + correlation
- Two qualitative variables: compare row or column percentages
- One quantitative + one qualitative variable
- Outlier: an abnormal (extreme) value
- IQR: interquartile range
- 4 IQR: the width of the normal range, from Q1 − 1.5 IQR to Q3 + 1.5 IQR (anything outside it is an outlier)
- Whisker: the lines extending from the box of a boxplot
- Boxplot chart
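A minimal sketch of the boxplot outlier check, assuming the common convention that the normal range runs from Q1 − 1.5 IQR to Q3 + 1.5 IQR (a band that is 4 IQR wide). The sample values are made up for illustration, and the quartile method is one of several in use, so other software may give slightly different fences.

```python
import statistics

# Hypothetical sample, made up for illustration only.
data = [12, 15, 16, 18, 19, 20, 21, 22, 24, 60]

# With n=4, statistics.quantiles returns the three quartile cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: fences sit 1.5 IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("Q1, median, Q3:", q1, q2, q3)
print("IQR:", iqr)
print("Normal range:", lower_fence, "to", upper_fence)
print("Outliers:", outliers)
```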
Numerical methods
- Location: mean, median, mode, percentiles (e.g. the 30th percentile: the value with 30% of the data below it)
- Variation
+ Variance: the sample variance is the sum of squared deviations from the mean divided by n − 1:
$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
→ Standard deviation: $S = \sqrt{S^2}$ (a deviation expressed in the units of the observations)
+ Average deviation of the data around their mean.
+ The larger the scale of a data set, the larger its variation tends to be.
- Coefficient of variation: $CV = \frac{S}{\bar{x}} \times 100\%$
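The sample variance, standard deviation and coefficient of variation can be computed step by step as follows. This is a small sketch on made-up numbers (the list `data` is hypothetical), with the standard library used only as a cross-check.

```python
import math
import statistics

# Hypothetical sample, made up for illustration only.
data = [4, 8, 6, 5, 7]

n = len(data)
mean = sum(data) / n

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
sample_variance = sum((x - mean) ** 2 for x in data) / (n - 1)

# Standard deviation: square root of the variance, back in the data's own units.
sample_sd = math.sqrt(sample_variance)

# Coefficient of variation: standard deviation relative to the mean, in percent.
cv_percent = sample_sd / mean * 100

print("mean:", mean)
print("sample variance:", sample_variance)
print("standard deviation:", round(sample_sd, 4))
print("coefficient of variation (%):", round(cv_percent, 2))

# Cross-check against the standard library's sample variance.
print("statistics.variance:", statistics.variance(data))
```

Because the coefficient of variation is unit-free, it allows comparing the spread of variables measured on different scales.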

Empirical rule relating the mean and the standard deviation:

If the data are symmetric (bell-shaped distribution):
- About 68% of data points lie within mean +/-1s (+/-1 standard deviation)
- About 95% of data points lie within mean +/-2s
- About 99.7% of data points lie within mean +/-3s
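These percentages can be reproduced for an exactly normal distribution, which is a stronger assumption than "symmetric and bell-shaped"; for real data the shares are only approximate. The sketch below uses the standard normal distribution from Python's standard library, and the share within three standard deviations comes out near 99.7%.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1), assumed here
# as the idealized bell-shaped case.
standard_normal = NormalDist(mu=0, sigma=1)

# Share of the distribution lying within k standard deviations of the mean.
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in coverage.items():
    print(f"within mean +/- {k}s: {share:.2%}")
```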

The three-sigma rule (3σ)


Any value lying outside the range mean +/-3s can be considered an outlier.
To make outliers easy to identify, we standardize the data by creating a new value
zi = (xi – mean)/s. The zi then have mean 0 and standard deviation 1.
Three-sigma rule: zi should fall in the range from -3 to 3. Any value outside this range is
treated as an outlier.
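A short sketch of this standardization, using a made-up sample (the list `data` is hypothetical) and the sample standard deviation for s:

```python
import statistics

# Hypothetical sample with one extreme value, made up for illustration only.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 10, 11, 13, 12, 40]

mean = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)

# Standardize: zi = (xi - mean) / s
z_scores = [(x - mean) / s for x in data]

# Three-sigma rule: flag any observation whose z-score falls outside [-3, 3].
outliers = [x for x, z in zip(data, z_scores) if abs(z) > 3]

print("mean of z:", round(statistics.mean(z_scores), 10))
print("standard deviation of z:", round(statistics.stdev(z_scores), 10))
print("outliers:", outliers)
```

In small samples a single extreme value inflates s, so its z-score cannot grow without bound; the rule works best when the sample is reasonably large.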

Data “Health”
1. Compare the means of the three aspects to evaluate which one is the most satisfying
and which one is the least (for the sample as a whole).

Descriptive Statistics

                     N    Minimum  Maximum  Mean   Std. Deviation
Work                 50   63       95       79.80  8.288
Pay                  50   25       90       54.46  14.747
Promotion            50   16       92       58.48  15.999
Valid N (listwise)   50

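→ Comment: Work is the most satisfying aspect (mean 79.80). The other two aspects score much lower, with Pay (54.46) the least satisfying and Promotion (58.48) only slightly higher.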

2. Do the same, but broken down by hospital.

3. Find a measure that identifies the aspects on which nurses' opinions differ the most
and the least (for the sample as a whole and by hospital).
4. Based on the analysis above, which aspect do you suggest the hospitals should improve?
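One possible measure for question 3 is the coefficient of variation, since it expresses spread relative to the mean. Choosing this measure is an assumption, as the notes do not record the answer. The sketch below uses only the whole-sample means and standard deviations reported in the Descriptive Statistics table; the breakdown by hospital is not covered because those figures are not given.

```python
# Mean and standard deviation per aspect, copied from the Descriptive
# Statistics table (whole sample, N = 50).
summary = {
    "Work": (79.80, 8.288),
    "Pay": (54.46, 14.747),
    "Promotion": (58.48, 15.999),
}

# Coefficient of variation in percent: standard deviation / mean * 100.
cv_percent = {
    aspect: sd / mean * 100
    for aspect, (mean, sd) in summary.items()
}

for aspect, cv in cv_percent.items():
    print(f"{aspect}: CV = {cv:.2f}%")

# Aspect with the most and the least relative disagreement.
most_varied = max(cv_percent, key=cv_percent.get)
least_varied = min(cv_percent, key=cv_percent.get)
print("largest difference in opinion:", most_varied)
print("smallest difference in opinion:", least_varied)
```

On these figures, Promotion shows the largest relative spread and Work the smallest; ranking by the raw standard deviations gives the same order.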

PROBABILITY (how likely it is that a given event occurs)
P(A): probability of event A, with 0 <= P(A) <= 1
P(A) = 0: A is an impossible event
P(A) = 1: A is a certain event
0 < P(A) < 1: A is a random event
How to evaluate (find) P(A):
- Method 1: Classical probability:
$P(A) = \frac{\text{number of outcomes in which } A \text{ occurs}}{\text{total number of outcomes (sample space)}}$

• Example 1: Roll two dice at random. Find the probability that:

a. Both faces show an even number
b. The sum of the two faces is 7
c. The sum of the two faces is at least 4
• Solution:
a. Total # of outcomes = 6 x 6 = 36
Outcomes with two even numbers = 3 x 3 = 9
→ P(A) = 9/36 = 1/4
b. Outcomes with a sum of 7 = 6
→ P(A) = 6/36 = 1/6
c. Use the complement: a sum of at most 3 occurs in 3 cases: (1,1), (1,2), (2,1)
→ P(A) = 1 – 3/36 = 11/12
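The counts in Example 1 can be verified by listing all 36 outcomes. This is a small sketch assuming two fair six-sided dice; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes (first die, second die) for two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))
total = len(outcomes)

# a. Both faces even.
both_even = sum(1 for a, b in outcomes if a % 2 == 0 and b % 2 == 0)
p_both_even = Fraction(both_even, total)

# b. Sum of the two faces is 7.
sum_is_7 = sum(1 for a, b in outcomes if a + b == 7)
p_sum_is_7 = Fraction(sum_is_7, total)

# c. Sum is at least 4, via the complement "sum is at most 3".
sum_at_most_3 = sum(1 for a, b in outcomes if a + b <= 3)
p_sum_at_least_4 = 1 - Fraction(sum_at_most_3, total)

print("total outcomes:", total)
print("a.", both_even, "outcomes, P =", p_both_even)
print("b.", sum_is_7, "outcomes, P =", p_sum_is_7)
print("c.", sum_at_most_3, "complement outcomes, P =", p_sum_at_least_4)
```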
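• Example 2: A competition has 12 teams. The first way to organize the
matches is to form two groups of 6 that each play a round robin; the top two teams
of each group then qualify for the semi-finals, followed by the final.
The second way is to form 4 groups of 3 that each play a round robin, followed by
quarter-finals, semi-finals and the final. Compare the number of matches for each way.
→ First way: 33 matches
15 x 2 + 2 + 1 = 33 (15 round-robin matches per group of 6, then 2 semi-finals and 1 final)
→ Second way: 19 matches
3 x 4 + 4 + 2 + 1 = 19 (3 round-robin matches per group of 3, then 4 quarter-finals, 2 semi-finals and 1 final)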
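The match counts follow from the number of pairs in each group. The sketch below assumes equal-sized groups, a single round robin (each pair of teams meets once), and knockout rounds with no third-place match; the helper name `round_robin_matches` is made up for illustration.

```python
from math import comb


def round_robin_matches(teams_in_group):
    """Matches in a single round robin: one match per pair of teams."""
    return comb(teams_in_group, 2)


# First way: 2 groups of 6, then 2 semi-finals and 1 final.
first_way = 2 * round_robin_matches(6) + 2 + 1

# Second way: 4 groups of 3, then 4 quarter-finals, 2 semi-finals and 1 final.
second_way = 4 * round_robin_matches(3) + 4 + 2 + 1

print("matches per group of 6:", round_robin_matches(6))
print("matches per group of 3:", round_robin_matches(3))
print("first way:", first_way)
print("second way:", second_way)
```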
• Example 3:

- Method 2: Frequency method


- Method 3: Theorems of Probability
30/10/2023
