1 Chapter 1 Lecture Notes
Topics:
● Definitions of statistics, probability and Key terms
● Data, Sampling and Variation in Data and Sampling
● Frequency Table and Levels of Measurement
● Experimental Design and Ethics
Objectives:
● Recognize and differentiate between key terms
● Apply various types of sampling methods to data collection
● Create and interpret frequency tables
Vocabulary:
Average Blinding Categorical Variable
Treatments Variable
Notes Chapter 1:
Population:
Parameter:
Sample:
Statistics:
●
How Population and Samples Relate:
Example:
We want to know the mean amount of money first-year college students spend at Shenandoah
University on school supplies that do not include books. We randomly survey 100 first-year students
at the university. Three of those students spent $150, $200, and $225, respectively.
_____________________.
Types of Variables:
● Numerical or Quantitative Data:
Quantitative data - are always ___________ and are the result of _____________ or
____________ attributes of a population. Amount of money, pulse rate, weight, number of people
living in your town, and number of students who take statistics are examples of quantitative data.
Quantitative data may be either __________ or ________________.
Discrete data - all data that are the result of ___________are called ____________________
data. These data take on only certain ___________ values. If you count the number of phone calls
you receive for each day of the week, you might get values such as zero, one, two, or three.
Continuous data - all data that are the result of ____________ are
_________________________ data assuming that we can measure accurately. If you and your
friends carry backpacks with books in them to school, the numbers of books in the backpacks are
discrete data and the weights of the backpacks are continuous data.
Example:
You go to the supermarket and purchase three cans of soup (19 ounces tomato bisque, 14.1 ounces
lentil, and 19 ounces Italian wedding), two packages of nuts (walnuts and peanuts), four different
kinds of vegetable (broccoli, cauliflower, spinach, and carrots), and two desserts (16 ounces Cherry
Garcia ice cream and two pounds (32 ounces) chocolate chip cookies).
Example:
Suppose Lisa wants to form a four-person study group from her pre-calculus class, which has 31
members.
● To choose a simple random sample of size three from the other members of her class, Lisa
could put all 30 names in a hat, shake the hat, close her eyes, and pick out three names.
● Lisa could put all of her classmates into an order (maybe alphabetical), number them 01 to 30,
and use a random number generator to select her three members (using two-digit numbers and
ignoring 00, repeats, and numbers greater than 30).
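Both of Lisa's methods can be sketched in a few lines of Python. This is only an illustration: the roster names are placeholders, the seed is arbitrary, and it assumes the 30 classmates are numbered 01 to 30.

```python
import random

# Hypothetical roster: Lisa's 30 classmates (placeholder names, numbered 01-30).
classmates = [f"Classmate {i:02d}" for i in range(1, 31)]

rng = random.Random(2024)  # arbitrary seed so the draw can be repeated

# Method 1: "names in a hat" -> draw three names without replacement.
hat_pick = rng.sample(classmates, 3)

# Method 2: generate two-digit numbers (00-99) and ignore any number that
# does not match a classmate (00 or above 30) or that was already drawn.
def pick_by_random_numbers(roster, k, rng):
    chosen = []
    while len(chosen) < k:
        number = rng.randint(0, 99)
        if 1 <= number <= len(roster) and number not in chosen:
            chosen.append(number)
    return [roster[n - 1] for n in chosen]

number_pick = pick_by_random_numbers(classmates, 3, rng)

print("Hat method:          ", hat_pick)
print("Random-number method:", number_pick)
```

Either way, every classmate has the same chance of ending up in the group of three.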
Stratified Sample:
To choose a stratified sample,
● ___________ the _______________ into groups called strata
● Take a ___________________ number from each stratum.
Example:
You could stratify (group) your college population by department and then choose a proportionate
simple random sample from each stratum (each department) to get a stratified random sample.
Suppose there are 5 departments of equal size, and you want to choose a sample of 50. You choose 10
from each department using simple random sampling.
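The department example can be sketched in Python as follows. The department names and the size of 40 people per department are made up for illustration; equal sizes are assumed so that 10 per department is a proportionate share.

```python
import random

rng = random.Random(1)  # arbitrary seed for a repeatable draw

# Hypothetical strata: 5 departments with 40 people each.
departments = {
    name: [f"{name}-{i}" for i in range(1, 41)]
    for name in ["Biology", "English", "History", "Math", "Music"]
}

total_sample_size = 50
per_stratum = total_sample_size // len(departments)  # 10 from each department

# Take a simple random sample inside every stratum and combine the results.
stratified_sample = []
for name, members in departments.items():
    stratified_sample.extend(rng.sample(members, per_stratum))

print(len(stratified_sample), "people sampled,", per_stratum, "per department")
```

If the departments had different sizes, a proportionate sample would take more people from the larger departments instead of a flat 10 each.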
Cluster Sample:
To choose a cluster sample
● ________the ______________ into clusters (groups)
● ____________ select some of the clusters.
● All the members from these clusters are in the cluster sample.
Example:
If you randomly sample four departments from your college population, the four departments make
up the cluster sample. Divide your college faculty by department. The departments are the clusters.
Number each department, and then choose four different numbers using simple random sampling.
All members of the four departments with those numbers are the cluster sample.
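A minimal Python sketch of the same procedure, assuming a hypothetical college with seven numbered departments of made-up sizes:

```python
import random

rng = random.Random(7)  # arbitrary seed for a repeatable draw

# Hypothetical clusters: department number -> list of faculty in that department.
sizes = [12, 8, 15, 10, 9, 11, 14]
departments = {
    number: [f"Dept{number}-Faculty{j}" for j in range(1, size + 1)]
    for number, size in enumerate(sizes, start=1)
}

# Randomly choose four different department numbers.
chosen_numbers = rng.sample(sorted(departments), 4)

# Every member of each chosen department is in the cluster sample.
cluster_sample = [
    person for number in chosen_numbers for person in departments[number]
]

print("Chosen departments:", sorted(chosen_numbers))
print("Cluster sample size:", len(cluster_sample))
```

Notice that the random step picks whole departments, not individual people, so the sample size depends on which departments are drawn.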
Systematic Sample:
To choose a systematic sample:
● ___________ select a ______________
● Take every ________ piece of data from a listing of the population.
Example:
Suppose you have to do a phone survey. Your phone book contains 20,000 residence listings. You
must choose 400 names for the sample. Number the population 1–20,000 and then use a simple
random sample to pick a number that represents the first name in the sample. Then choose every
fiftieth name thereafter until you have a total of 400 names (you might have to go back to the
beginning of your phone list). Systematic sampling is frequently chosen because it is a simple method.
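The phone survey can be sketched in Python like this. Listings are represented only by their numbers 1 to 20,000, and the seed is arbitrary.

```python
import random

rng = random.Random(42)  # arbitrary seed for a repeatable draw

population_size = 20_000   # residence listings in the phone book
sample_size = 400          # names needed for the survey
step = population_size // sample_size  # 20,000 / 400 = 50, so every fiftieth name

# Randomly choose the listing number of the first name in the sample.
start = rng.randint(1, population_size)

# Take every fiftieth listing after the start, wrapping around to the
# beginning of the list when the end of the phone book is reached.
selected = [
    (start - 1 + i * step) % population_size + 1
    for i in range(sample_size)
]

print("First listing chosen:", start)
print("First five listings in the sample:", selected[:5])
```

Only the starting point is random. Once it is chosen, the rest of the sample is determined by the step of 50.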
Convenience Sampling:
A type of sampling that is _______________ is convenience sampling. Convenience sampling
involves using results that are _____________________________.
Example:
A computer software store conducts a marketing study by interviewing potential customers who
happen to be in the store browsing through the available software. The results of convenience
sampling may be very good in some cases and highly biased (favor certain outcomes) in others.
However, for practical reasons, in most populations, simple random sampling is done _________
replacement. Surveys are typically done without replacement. That is, a member of the population may be
chosen only once.
Most samples are taken from _______ populations and the sample tends to be ______ in comparison to the
population. Since this is the case, sampling without replacement is ___________________ the same as
sampling with replacement because the chance of picking the same individual more than once with
replacement is __________________.
When you analyze data, it is important to be aware of sampling _______ and non-sampling errors.
The actual ________of sampling causes sampling errors.
For example:
In reality, a sample will __________be exactly representative of the population so there will always
be ________ sampling error. As a rule, the _______ the sample, the _________ the sampling
error.
Critical Evaluation:
We need to evaluate the statistical studies we read about critically and analyze them
____________________________ the results of the studies. Common problems to be aware of
include:
Example:
A study is done to determine the average tuition that San Jose State undergraduate students pay per semester.
Each student in the following samples is asked how much tuition he or she paid for the Fall semester. What is
the type of sampling in each case?
Variation:
Example:
16-ounce cans of beverage may contain more or less than 16 ounces of liquid. In one study, six
16-ounce cans were measured and produced the following amounts (in ounces) of beverage:
15.8; 16.1; 15.2; 14.8; 15.8; 15.9
Measurements of the amount of beverage in a 16-ounce can may vary because different people make
the measurements or because the exact amount, 16 ounces of liquid, was not put into the cans.
Manufacturers regularly run tests to determine if the amount of beverage in a 16-ounce can falls
within the desired range.
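One way to see the variation in the six measurements is to summarize them in Python. This is a small sketch; showing two decimal places is an arbitrary choice for the illustration, not the course rounding rule.

```python
from statistics import mean

# Amounts (in ounces) measured in the six 16-ounce cans from the example.
amounts = [15.8, 16.1, 15.2, 14.8, 15.8, 15.9]
labeled_amount = 16.0

sample_mean = mean(amounts)                # average amount per can
spread = max(amounts) - min(amounts)       # largest minus smallest measurement
below_label = [a for a in amounts if a < labeled_amount]

print(f"Mean amount: {sample_mean:.2f} oz")
print(f"Range:       {spread:.2f} oz")
print(f"Cans below 16 oz: {len(below_label)} of {len(amounts)}")
```

No two summaries of different samples would be expected to match exactly, which is the point of the example: measurements of the same kind of item vary.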
In this course, I would like you to round to ______ decimal places, unless the answer automatically
rounds to one place or is not a decimal.
It is ______ necessary to reduce most fractions in this course. Especially in Probability Topics, the
chapter on probability, it is more helpful to leave an answer as an unreduced _____________.
Levels of Measurement:
The way a set of data is __________ is called its level of measurement. Correct statistical
procedures depend on a researcher being familiar with levels of measurement. Not every statistical
operation can be used with every set of data. Data can be classified into ______ levels of
measurement. They are (from lowest to highest level):
○ Example: Trying to order people according to their favorite food does not make any
sense. Putting veggie pizza first and vegan sushi second is not meaningful.
○ Example of ordinal scale data is a list of the top five national parks in the United States.
The top five national parks in the United States can be ranked from one to five but we
cannot measure differences between the data.
Nominal Yes No No No
Frequency:
Twenty students were asked how many hours they worked per day. Their responses, in hours, are as
follows:
5 6 3 3 2 4 7 5 2 3 5 6 5 4 4 3 5 2 5 3
The table below lists the different data values in ascending order and their frequencies.
Relative Frequency:
A relative frequency is the ______ (fraction or proportion) of the number of times a value of the data
_______ in the set of all outcomes to the ______ number of outcomes. To find the relative
frequencies, ________ each frequency by the _______ number of students in the sample (in this
case, 20). Relative frequencies can be written as ___________, ____________, or __________.
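As a check on a hand-built table, the frequencies and relative frequencies for the twenty responses can be computed in Python. This is a sketch using the data listed under Frequency; the fractions are printed over 20 without reducing them.

```python
from collections import Counter

# Hours worked per day reported by the twenty students.
hours = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

total = len(hours)            # number of students in the sample (20)
frequency = Counter(hours)    # how many times each value appears

# Relative frequency of each value, using the sample size of 20.
relative = {value: count / total for value, count in frequency.items()}

print("Value  Frequency  Relative frequency")
for value in sorted(frequency):
    count = frequency[value]
    print(f"{value:>5}  {count:>9}  {count}/{total} = {relative[value]:.2f}")

print("Sum of frequencies:", sum(frequency.values()))
```

The frequencies should add up to the number of students, and the relative frequencies should add up to 1.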
Example:
Researchers want to investigate whether taking aspirin regularly reduces the risk of heart attack. Four
hundred men between the ages of 50 and 84 are recruited as participants. The men are divided
randomly into two groups: one group will take aspirin, and the other group will take a placebo. Each
man takes one pill each day for three years, but he does not know whether he is taking aspirin or the
placebo. At the end of the study, researchers count the number of men in each group who have had
heart attacks.
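The random division into two groups could be carried out as in the Python sketch below. The participant ID numbers are placeholders, and two equal groups of 200 are an assumption; the example only says the men are divided randomly into two groups.

```python
import random

rng = random.Random(3)  # arbitrary seed for a repeatable assignment

# Hypothetical ID numbers for the 400 participants.
participants = list(range(1, 401))

# Shuffle the IDs so that chance alone decides who lands in which group.
shuffled = participants[:]
rng.shuffle(shuffled)

half = len(shuffled) // 2
aspirin_group = shuffled[:half]   # will take aspirin
placebo_group = shuffled[half:]   # will take the placebo

print("Aspirin group size:", len(aspirin_group))
print("Placebo group size:", len(placebo_group))
```

Because the assignment is random, the two groups should be similar in every respect except the pill they receive.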