Chapter 1
ANALYSIS
Pan Guangming
OUTLINE
| Syllabus
| The position of this course
| Example 1
| Example 2
| Summary
SYLLABUS
COURSE WORK
| 10% - 2 or 3 assignments
| 20% - Mid-term test
| Prediction.
(x,y)
1.2 2.4
3.5 7
2.4 4.8
4.9 9.8
1.8 3.6
3.1 6.2
THE POSITION OF THIS COURSE
| We use different tools to deal with all kinds of data.
| Regression analysis
| Time series analysis
| Multivariate analysis
| Survival analysis
| Sampling survey
| …
| Regression analysis serves three main purposes:
| 1) Description
| 2) Control
| 3) Prediction
HOW DOES REGRESSION ANALYSIS WORK?
EXAMPLE 1
| The following data set records the plasma total cholesterol levels of
24 patients with hypercholesterolemia admitted to a hospital:
3.5,1.9,4.0,2.6,4.5,3.0,2.9,3.8,2.1,3.8,4.1,3.0,
2.5,4.6,3.2,4.2,2.3,4.0,4.3,3.9,3.3,3.2,2.5,3.3
| The ages of the same 24 patients, in the same order:
46,20,52,30,57,25,28,36,22,43,57,33,
22,63,40,48,28,49,52,58,29,34,24,50
QUESTIONS
| When a new patient of known age (e.g. 45) arrives,
what can we say about his/her plasma level?
SCATTER PLOT
[Figure: scatter plot of the data in Example 1]
SIMPLE LINEAR REGRESSION
| The plasma level appears to depend on age.
| We can use a straight line to describe this
dependence.
| For the new patient aged 45, we can use this
line to get a rough estimate of his/her plasma
level.
| This example highlights the importance, in
data analysis, of collecting data on other
variables (e.g. age) that are relevant to the main
variable of interest (e.g. cholesterol level).
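
The straight-line idea can be sketched in a few lines of Python. This is only an illustration: it assumes the line is chosen by ordinary least squares, a criterion these slides have not introduced yet, and the names fit_line, levels, ages and predicted are made up for this sketch. The data are the 24 plasma levels and ages from Example 1, paired in the order listed.

```python
# Data from Example 1: plasma levels and ages of the 24 patients,
# assumed to be paired in the order they are listed.
levels = [3.5, 1.9, 4.0, 2.6, 4.5, 3.0, 2.9, 3.8, 2.1, 3.8, 4.1, 3.0,
          2.5, 4.6, 3.2, 4.2, 2.3, 4.0, 4.3, 3.9, 3.3, 3.2, 2.5, 3.3]
ages = [46, 20, 52, 30, 57, 25, 28, 36, 22, 43, 57, 33,
        22, 63, 40, 48, 28, 49, 52, 58, 29, 34, 24, 50]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


intercept, slope = fit_line(ages, levels)

# Read the fitted line at age 45 to get a rough idea for the new patient.
predicted = intercept + slope * 45

print(f"fitted line: level = {intercept:.3f} + {slope:.4f} * age")
print(f"rough plasma level at age 45: {predicted:.2f}")
```

The fitted slope is positive, which matches the upward trend seen in the scatter plot. The value read off at age 45 is a point on the line, not an exact forecast for an individual patient.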
RELATIONSHIPS BETWEEN VARIABLES
Functional Relationships – The value of the
dependent variable Y can be computed exactly if
we know the value of the independent variable X.
(e.g., Y=2X)
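
As a small illustration of a functional relationship, the sketch below checks that each (x, y) pair listed earlier in this chapter lies exactly on Y = 2X. Treating that table as the data for this example is an assumption, and the names points and f are hypothetical.

```python
import math

# (x, y) pairs listed earlier in the chapter; assumed here to illustrate Y = 2X.
points = [(1.2, 2.4), (3.5, 7.0), (2.4, 4.8), (4.9, 9.8), (1.8, 3.6), (3.1, 6.2)]


def f(x):
    """The functional relationship Y = 2X."""
    return 2 * x


# In a functional relationship every observed y equals f(x),
# so all deviations from the line are zero (up to rounding).
deviations = [y - f(x) for x, y in points]
is_functional = all(math.isclose(y, f(x)) for x, y in points)

print("all points on the line:", is_functional)
```

By contrast, the plasma levels in Example 1 scatter around a line rather than lying on it, so age alone does not determine the plasma level exactly.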