
Introduction

This document provides information about the STATS 244.3 course Introduction to Statistical Methods. It outlines the evaluation criteria, which includes term tests, computer assignments, and a final exam. No textbook is required, but alternative texts are available in the library. The document also lists the dates for term tests and computer assignment due dates. Details are provided about downloading lectures and printing them. An outline of topics to be covered in the course is provided.

Uploaded by

wiliam

Stats 244.3

Introduction to Statistical Methods


Instructor: W.H.Laverty

Office: 235 McLean Hall

Phone: 966-6096

Evaluation:
Assignments, Labs, Term tests - 40%
Every Thursday – Term Test
Final Examination - 60%
Text:

1. No text will be assigned


Alternative Texts (Available in Library)
Title - Author(s)
1. Statistics: Informed Decisions Using Data - Sullivan
2. Introductory Statistics - Mann
3. Modern Elementary Statistics - Freund
4. Elementary Statistics: A Brief Version - Bluman
5. Elementary Statistics - Hoel
6. Statistics: The Exploration and Analysis of Data - Devore and Peck
7. Statistics: A First Course - Freund
8. Statistics: A First Course - Saunders, Smit, Adatia & Larson
9. Basic Statistical Concepts - Bartz
10. An Introduction to Statistical Methods and Data Analysis - Ott
11. Introductory Statistics - Wonnacott & Wonnacott
Marks will be distributed in this manner:
1. 6 Term Tests – in the lab every two weeks
– the lowest mark out of 6 tests will be dropped
– Term tests will be worth 30%
2. 4 computer assignments
– Worth 10%
3. Final Exam
– Worth 60%
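The marking scheme above can be sketched as a small calculation. This is only an illustration: it assumes every mark is entered as a percentage out of 100, that the five remaining term tests are weighted equally, and that the four assignments are weighted equally. The function name `final_grade` and the example marks are made up.

```python
def final_grade(term_tests, assignments, final_exam):
    """Combine marks (each a percentage from 0 to 100) into a course grade."""
    if len(term_tests) != 6 or len(assignments) != 4:
        raise ValueError("expected 6 term tests and 4 assignments")
    # Drop the lowest of the 6 term tests and average the remaining 5.
    best_five = sorted(term_tests)[1:]
    test_avg = sum(best_five) / len(best_five)
    # Assumption: the 4 assignments count equally.
    assignment_avg = sum(assignments) / len(assignments)
    # Weights from the marking scheme: 30% + 10% + 60% = 100%.
    return 0.30 * test_avg + 0.10 * assignment_avg + 0.60 * final_exam


# Made-up marks for one student.
grade = final_grade([70, 80, 90, 60, 85, 75], [100, 90, 80, 90], 70)
print(round(grade, 1))
```

Here the lowest test (60) is dropped, so the test average is taken over the other five marks.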
Dates for term tests Stats 244

1. Thursday, Sept 24, 2:30-3:20 (in Lab)
2. Thursday, Oct 08, 2:30-3:20 (in Lab)
3. Thursday, Oct 22, 2:30-3:20 (in Lab)
4. Thursday, Nov 05, 2:30-3:20 (in Lab)
5. Thursday, Nov 19, 2:30-3:20 (in Lab)
6. Thursday, Dec 03, 2:30-3:20 (in Lab)
Due Dates computer assignments

1. Thursday, Oct 01
2. Tuesday, Oct 22
3. Thursday, Nov 19
4. Thursday, Dec 03
Comments
1. All tests will be online
– Each test will be:
• Emailed to students prior to the date of the test
• Answered online using the link beside the test date
2. Practice assignments with solutions will be
posted before each test. These are not
compulsory, but they are practice for the test. I will
post the lectures covered by the test.
Text: No text
You do not need to buy a textbook.

Texts (Available in Library)

1. Statistics: Informed Decisions Using Data - Sullivan
2. Introductory Statistics - Mann
3. Modern Elementary Statistics - Freund
4. Elementary Statistics: A Brief Version - Bluman
5. Elementary Statistics - Hoel
6. Statistics: The Exploration and Analysis of Data - Devore and Peck
7. Statistics: A First Course - Freund
8. Statistics: A First Course - Saunders, Smit, Adatia & Larson
9. Basic Statistical Concepts - Bartz
10. An Introduction to Statistical Methods and Data Analysis - Ott
11. Introductory Statistics - Wonnacott & Wonnacott
• The lectures will be given in PowerPoint
To download lectures
1. Go to the Stats 244 web site
– by going to the website of the Department of
Mathematics and Statistics -> people -> faculty ->
W.H. Laverty -> Stats 244 -> Lectures.
2. Then
a) select the lecture
b) right click and choose Save as
To print lectures
1. Open the lecture using MS PowerPoint
2. Select the menu item File -> Print
The Print dialogue box appears.
In the Print what box, select Handouts.
Set Slides per page to 6 or 3.
6 slides per page will result in the least amount
of paper being printed.
3 slides per page leaves room for notes.
Course Outline
Introduction
• Populations, samples
• Variables
• Data Collection
Exploratory Statistics
• Organizing and displaying Data
• Numerical measures of Central Tendency
and Variability
• Describing Bivariate Data
Probability Theory
• Concepts of Probability
• Random variables and their distributions
• Binomial distribution, Normal distribution
Inferential Statistics
• Estimation, Hypothesis testing
• Comparing Samples
• Analyzing count data
• Regression and Correlation
• Non-parametric Statistics
Introduction
The circular process of research:
1. Questions arise about a phenomenon
2. A decision is made to collect data
3. A decision is made as to how to collect the data
4. The data is collected
5. The data is summarized and analyzed
6. Conclusions are drawn from the analysis
(the conclusions lead back to new questions about the phenomenon)

What is Statistics?

It is the major mathematical tool of
scientific inference (research), with
an interest in drawing conclusions from
data.
The data is to some extent corrupted
by some component of random
variation (random noise).
Random variation (or random noise)
can be defined to be the variation in the
data that is not accounted for by factors
considered in the analysis.
Example

Suppose we are collecting data on


• Blood Pressure
• Height
• Weight
• Age
Suppose we are interested in how
• Blood Pressure
is influenced by the following factors
• Height
• Weight
• Age
Blood Pressure will not be perfectly
predictable from:
• Height
• Weight
• Age
There will be departures (random variation)
from a perfect prediction because of other
factors that could affect Blood Pressure
(diet, exercise, hereditary factors).
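The idea of departures from a perfect prediction can be illustrated with a small simulation. Everything numeric here is an assumption made for the sketch: the linear formula, its coefficients, and the size of the noise are invented and are not taken from real blood pressure data.

```python
import random


def simulate_blood_pressure(height_cm, weight_kg, age_years, rng):
    # Hypothetical "perfect prediction" from the three factors
    # (coefficients are invented for illustration only).
    predicted = 70 + 0.05 * height_cm + 0.4 * weight_kg + 0.5 * age_years
    # Random variation: the part not accounted for by height, weight and age
    # (diet, exercise, hereditary factors, ...), modelled here as Gaussian noise.
    noise = rng.gauss(0, 8)
    return predicted, predicted + noise


rng = random.Random(244)  # fixed seed so the run is repeatable
# Five people with identical height, weight and age.
results = [simulate_blood_pressure(170, 70, 40, rng) for _ in range(5)]
for predicted, observed in results:
    departure = observed - predicted
    print(f"predicted {predicted:.1f}  observed {observed:.1f}  departure {departure:+.1f}")
```

All five simulated people share the same predicted value, yet their observed values differ. That spread is the random variation.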
Another Example
In this example we are interested in the use
of:
1. antidepressants,
2. mood stabilizing medication,
3. anxiety medication,
4. stimulants and
5. sleeping pills.

The data were collected for n = 16383 cases


In addition we are interested in how the use
of these medications is affected by:
1. Age
20-29, 30-39, 40-49, 50-59, 60-69, 70+
2. Gender
Male, female
3. Education
– < Secondary,
– Secondary Grad.,
– some Post-Sec.,
– Post-Sec. Grad.
4. Income
– Low, Low Mid, Up Mid, High
5. Role
– parent, partner, worker
– parent, partner
– parent, worker
– partner, worker
– worker only
– parent only
– partner only
– no roles
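The eight Role categories listed above are exactly the possible combinations of three yes/no roles (parent, partner, worker), since 2 × 2 × 2 = 8. A short sketch that enumerates them, with variable names chosen only for this illustration:

```python
from itertools import combinations

roles = ["parent", "partner", "worker"]

# Every subset of the three roles, from all three roles down to none.
role_categories = [
    combo
    for size in range(len(roles), -1, -1)
    for combo in combinations(roles, size)
]

for combo in role_categories:
    print(", ".join(combo) if combo else "no roles")
```

The order printed may differ from the order on the slide, but the same eight categories appear.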
Some questions of interest
1. How are the dependent variables
(antidepressant use, mood stabilizing
medication use, anxiety medication use,
stimulants use, sleeping pill use)
interrelated?
2. How are the dependent variables (drug
use) related to the independent variables
(age, gender, income, education and role)?
• Again the relationships will not be perfect
• This is because of the effects of other factors
(variables) that have not been considered in
the experiment
• If the data is recollected, the patterns
observed at the second collection will not
be exactly the same as those observed at the
first collection
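A small simulation can illustrate why recollected data will usually not reproduce the first pattern exactly. The usage rate of 10% and the sample size of 1000 are invented for this sketch and are not taken from the drug data.

```python
import random


def usage_rate_in_sample(true_rate, n, rng):
    # Each of the n cases uses the medication with probability true_rate
    # (a hypothetical value, not an estimate from the real study).
    users = sum(1 for _ in range(n) if rng.random() < true_rate)
    return users / n


rng = random.Random(1)  # fixed seed so the run is repeatable
first = usage_rate_in_sample(0.10, 1000, rng)   # first collection
second = usage_rate_in_sample(0.10, 1000, rng)  # data "recollected"
print(f"first collection:  {first:.3f}")
print(f"second collection: {second:.3f}")
```

Both proportions come from the same underlying rate, but they will typically differ slightly from each other and from 0.10.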
The data appears in the following Excel file

Drug data
In Statistics
• Questions
– About some scientific, sociological, medical or
economic phenomena
• Data
– The purpose of the data is to find answers to the
questions
• Answers
– Because of the random variation in the data (the
noise), conclusions based on the data will be
subject to error.
The circular process of research:
In what part of this process does statistics play a role?

1. Questions arise about a phenomenon
2. A decision is made to collect data
3. A decision is made as to how to collect the data (Statistics: Experimental Design)
4. The data is collected
5. The data is summarized and analyzed (Statistics)
6. Conclusions are drawn from the analysis


Statistical Theory is interested in
1. The design of the data collection
procedures. (Experimental designs,
Survey designs). The experiment can be
totally lost if it is not designed correctly.
2. The techniques for analyzing the data.
In any statistical analysis it is
important to assess the
magnitude of the error made
by the conclusions of the
analysis.
Consider the following statement:

You can prove anything with Statistics.

In fact:

One is unable to “prove” anything with
Statistics.
At the end of any statistical
analysis there is always a
possibility of an error in any of the
decisions that it makes.
The success of a research project
does not depend on its
conclusions.

The success of a research project
depends on the accuracy of its
conclusions.
If one is testing the effectiveness
of a drug

There are two possible conclusions:

1. The drug is effective
2. The drug is not effective

The success of this project does
not depend on its conclusions.

The success depends on the
accuracy of its conclusions.
For this reason:

It is extremely important in any
study to assess the accuracy of its
conclusions.
Some definitions

important to Statistics
A population:
This is the complete collection of subjects
(objects) that are of interest in the study.
There may be (and frequently is) more
than one population, in which case a major objective
is that of comparison.
A case (elementary sampling
unit):
This is an individual unit (subject) of the
population.
A variable:

A measurement or type of measurement
that is made on each individual case in
the population.
Types of variables
Some variables may be measured on a
numerical scale while others are
measured on a categorical scale.

The nature of the variables has a great
influence on which analysis will be
used.
For variables measured on a numerical scale
the measurements will be numbers.

Ex: Age, Weight, Systolic Blood Pressure

For variables measured on a categorical scale
the measurements will be categories.

Ex: Sex, Religion, Heart Disease

Note
Sometimes variables can be measured on
both a numerical scale and a categorical
scale.

In fact, variables measured on a numerical
scale can always be converted to
measurements on a categorical scale.
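Converting a numerical measurement to a categorical one amounts to choosing cut points and labels. The sketch below does this for age. The cut points, labels, and the function name `age_category` are assumptions chosen for illustration, and any other grouping would work the same way.

```python
from bisect import bisect_right

# Hypothetical cut points and labels, chosen only for illustration.
CUT_POINTS = [20, 30, 40, 50, 60, 70]
LABELS = ["<20", "20-29", "30-39", "40-49", "50-59", "60-69", "70+"]


def age_category(age):
    # bisect_right counts how many cut points are <= age,
    # and that count is the index of the matching label.
    return LABELS[bisect_right(CUT_POINTS, age)]


ages = [18, 20, 29, 45, 69, 70, 83]          # numerical scale
categories = [age_category(a) for a in ages]  # categorical scale
print(categories)
```

The conversion only goes one way: once ages are grouped, the exact values cannot be recovered from the categories.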
Example
The following variables were evaluated
for a study of individuals receiving head
injuries in Saskatchewan.

1. Cause of the injury (categorical)
• Motor vehicle accident
• Fall
• Violence
• Other
2. Time of year (date) (numerical or
categorical)
• summer
• fall
• winter
• spring

3. Sex of injured individual (categorical)
• male
• female
4. Age (numerical or categorical)
• < 10
• 10-19
• 20 - 29
• 30 - 49
• 50 – 65
• 65+

5. Mortality (categorical)
• Died from injury
• alive
Types of variables

In addition some variables are labeled as
dependent variables and some variables
are labeled as independent variables.
This usually depends on the objectives of
the analysis.

Dependent variables are output or
response variables while the
independent variables are the input
variables or factors.
Usually one is interested in determining
equations that describe how the
dependent variables are affected by the
independent variables.
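As a rough sketch of what such an equation can look like, the code below fits a straight line, dependent = intercept + slope × independent, by least squares. It uses a single independent variable for simplicity, and the data values and the function name `fit_line` are made up for the illustration.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data: age is the independent variable,
# systolic blood pressure is the dependent variable.
age = [25, 35, 45, 55, 65]
blood_pressure = [118, 121, 127, 131, 138]

intercept, slope = fit_line(age, blood_pressure)
print(f"blood pressure ≈ {intercept:.1f} + {slope:.2f} × age")
```

The observed points do not fall exactly on the fitted line. The leftover differences are the random variation discussed earlier.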
Example

Suppose we are collecting data on


• Blood Pressure
• Height
• Weight
• Age
Suppose we are interested in how
• Blood Pressure
is influenced by the following factors
• Height
• Weight
• Age
Then
• Blood Pressure
is the dependent variable
and
• Height
• Weight
• Age
are the independent variables
Example – Head Injury study
Suppose we are interested in how
• Mortality
is influenced by the following factors
• Cause of head injury
• Time of year
• Sex
• Age
Then
• Mortality
is the dependent variable
and
• Cause of head injury
• Time of year
• Sex
• Age
are the independent variables
Dependent variable = response variable

Independent variable = predictor variable
A sample:

Is a subset of the population

In statistics:

One draws conclusions about the
population based on data collected
from a sample
Reasons:
Cost

It is less costly to collect data from a
sample than from the entire population.

Accuracy
Data from a sample sometimes leads to
more accurate conclusions than data
from the entire population.

Costs saved from using a sample can
be directed to obtaining more accurate
observations on each case in the
sample.
Types of Samples

Different types of samples are determined
by how the sample is selected.
Convenience Samples
In a convenience sample the subjects that
are most convenient to the researcher are
selected as objects in the sample.
This is not a very good procedure for
inferential Statistical Analysis but is
useful for exploratory preliminary work.
Quota samples
In quota samples subjects are chosen
conveniently until quotas are met for
different subgroups of the population.
This also is useful for exploratory
preliminary work.
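The quota procedure can be sketched as follows: subjects are taken in whatever order is convenient, and each is accepted only while the quota for their subgroup is still open. The function name `quota_sample`, the subgroups, and the quota sizes are hypothetical.

```python
def quota_sample(subjects, quotas):
    """subjects: (subject_id, subgroup) pairs in order of convenience.
    quotas: dict mapping each subgroup to the number of subjects wanted."""
    remaining = dict(quotas)
    chosen = []
    for subject_id, subgroup in subjects:
        # Accept the subject only if their subgroup's quota is not yet met.
        if remaining.get(subgroup, 0) > 0:
            chosen.append(subject_id)
            remaining[subgroup] -= 1
        # Stop as soon as every quota has been filled.
        if all(count == 0 for count in remaining.values()):
            break
    return chosen


# Made-up list of people in the order they happened to be available.
walk_ins = [(1, "male"), (2, "male"), (3, "female"),
            (4, "male"), (5, "female"), (6, "female")]
sample = quota_sample(walk_ins, {"male": 2, "female": 2})
print(sample)
```

Nothing in this procedure is random: who ends up in the sample depends entirely on the order in which subjects were encountered.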
Random Samples
Random samples of a given size are
selected in such a way that all possible samples
of that size have the same probability of
being selected.
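For a very small population the definition can be checked directly by listing every possible sample. The sketch below uses a made-up population of five cases and samples of size 2, and draws one random sample with Python's standard `random` module.

```python
import random
from itertools import combinations
from math import comb

population = ["A", "B", "C", "D", "E"]  # made-up population of 5 cases
n = 2                                   # sample size

# All possible samples of size 2 (order within a sample does not matter).
all_samples = list(combinations(population, n))
print(len(all_samples), "possible samples")

# Under random sampling each possible sample has the same probability.
probability_each = 1 / len(all_samples)
print("probability of each:", probability_each)

rng = random.Random(0)                  # fixed seed so the run is repeatable
sample = rng.sample(population, n)      # one sample drawn without replacement
print("selected sample:", sorted(sample))
```

With realistic population sizes the number of possible samples is far too large to list, but the same principle applies.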
Convenience Samples and Quota samples
are useful for preliminary studies. It is
however difficult to assess the accuracy
of estimates based on this type of
sampling scheme.
Sometimes however one has to be
satisfied with a convenience sample and
assume that it is equivalent to a random
sampling procedure
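[Diagram: a population made up of cases (×), with a sample shown as a subset of the population; variables X, Y, Z are measured on each case.]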
Some other definitions
A population statistic
(parameter):

Any quantity computed from the values
of variables for the entire population.
A sample statistic:
Any quantity computed from the values
of variables for the cases in the sample.
Since only cases from the sample are
observed
– only sample statistics are computed
– These are used to make inferences about
population statistics
– It is important to be able to assess the accuracy
of these inferences
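The distinction can be illustrated with a synthetic population where, unlike in practice, every case is known. All values below are simulated and the variable names are chosen only for this sketch.

```python
import random
from statistics import mean

rng = random.Random(2)  # fixed seed so the run is repeatable

# Synthetic population of 10,000 weights in kg (invented values).
population = [rng.gauss(75, 12) for _ in range(10_000)]
population_mean = mean(population)    # a population statistic (parameter)

sample = rng.sample(population, 100)  # a random sample of 100 cases
sample_mean = mean(sample)            # a sample statistic

# In practice only sample_mean is observed, and the error is unknown.
error = sample_mean - population_mean
print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
print(f"error:           {error:+.2f}")
```

Assessing the accuracy of an inference means judging how large this error is likely to be without ever seeing the population value.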
Organizing Data
the next topic
