Gathering and Organizing Data
Objectives: Define data and statistics. Explain the difference between a population and a sample. Describe four basic methods of sampling. Construct a frequency distribution for a data set. Draw a stem and leaf plot for a data set.
B. Sampling Methods
We will study four basic sampling methods:
1. In a random sample, each subject of the population must have an equal chance of being selected.
2. A systematic sample is taken by numbering each member of the population and then selecting every kth member, where k is a natural number. When using systematic sampling, it's important that the starting number is selected at random.
3. When a population is divided into groups where the members of each group have similar characteristics, and members from each group are chosen at random, the result is called a stratified sample.
4. When an existing group of subjects that represents the population is used for a sample, it is called a cluster sample.
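The four methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is a hypothetical list of ID numbers 1 to 100, the strata and clusters are made-up groupings, and the function names are invented for this example. In particular, picking the cluster at random is an assumption of this sketch; the definition above only requires an existing group that represents the population.

```python
import random

def simple_random_sample(population, n, rng):
    # Every member has an equal chance of being selected.
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    # Random starting position among the first k members, then every kth member.
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_group, rng):
    # strata maps a group name to its members; draw at random from every group.
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_group))
    return sample

def cluster_sample(clusters, rng):
    # Use one existing group as the whole sample (chosen at random here,
    # which is an assumption of this sketch).
    name = rng.choice(sorted(clusters))
    return list(clusters[name])

rng = random.Random(2024)            # fixed seed so the run is repeatable
population = list(range(1, 101))     # hypothetical ID numbers 1..100
strata = {"group_1": population[:50], "group_2": population[50:]}   # made up
clusters = {"room_a": population[:25], "room_b": population[25:50],
            "room_c": population[50:75], "room_d": population[75:]}  # made up

print(simple_random_sample(population, 10, rng))
print(systematic_sample(population, 10, rng))
print(stratified_sample(strata, 5, rng))
print(cluster_sample(clusters, rng))
```

Note that the systematic sample always contains members exactly k positions apart; only the starting point is random.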
Another area of inferential statistics is called hypothesis testing. A researcher tries to test a hypothesis to see if there is enough evidence to support it. A third aspect of inferential statistics is determining whether or not a relationship exists between two or more variables. This area of statistics is called correlation and regression.
Frequency Distributions
The data collected for a statistical study are called raw data. In order to describe situations and draw conclusions, the researcher must organize the data in a meaningful way. Two methods that we will use are frequency distributions and stem and leaf plots. The first type of frequency distribution that we will investigate is the categorical frequency distribution. This is used when the data are categorical rather than numerical.
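As a small illustration, a categorical frequency distribution can be tallied with Python's `collections.Counter`. The ten responses below are made up purely for this sketch (they are not data from this lesson); each category is listed with its frequency and its percent of the total.

```python
from collections import Counter

# Hypothetical raw data: ten made-up categorical responses.
responses = ["A", "O", "B", "O", "AB", "A", "O", "B", "A", "O"]

freq = Counter(responses)   # category -> frequency
total = len(responses)

for category, count in freq.most_common():
    percent = 100 * count / total
    print(f"{category:<4}{count:>3}{percent:>7.1f}%")
```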
D. Frequency Distributions
Another type of frequency distribution that can be constructed uses numerical data and is called a grouped frequency distribution. In a grouped frequency distribution, the numerical data are divided into classes. When deciding on classes, here are some useful guidelines:
1. Try to keep the number of classes between 5 and 15.
2. Make sure the classes do not overlap.
3. Don't leave out any numbers between the lowest and highest, even if nothing falls into a particular class.
4. Make sure the range of numbers included in a class is the same for each one.
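The guidelines above can be turned into a short Python sketch for whole-number data. The function name `grouped_frequency` and the choice of 7 classes are assumptions made for this illustration; the data are the 20 Cubs win totals from the stem and leaf exercise in this lesson. The class width is chosen as the smallest whole number that lets the requested number of equal-width classes cover everything from the lowest to the highest value.

```python
def grouped_frequency(data, num_classes):
    """Return (lower limit, upper limit, frequency) for equal-width classes."""
    low, high = min(data), max(data)
    # Smallest whole-number width so num_classes classes cover low..high.
    width = (high - low) // num_classes + 1
    classes = []
    for i in range(num_classes):
        lower = low + i * width
        upper = lower + width - 1          # limits never overlap
        count = sum(1 for x in data if lower <= x <= upper)
        classes.append((lower, upper, count))   # empty classes are kept
    return classes

cubs_wins = [97, 85, 66, 79, 89, 88, 67, 88, 65, 67,
             90, 68, 76, 73, 84, 78, 77, 77, 93, 77]

classes = grouped_frequency(cubs_wins, 7)
for lower, upper, count in classes:
    print(f"{lower}-{upper}: {count}")
# 65-69: 5
# 70-74: 1
# 75-79: 6
# 80-84: 1
# 85-89: 4
# 90-94: 2
# 95-99: 1
```

The frequencies add up to 20, the number of data values, which is a quick check that no value was skipped or counted twice.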
4. The data below show the number of games won by the Chicago Cubs in each of the 21 seasons from 1988 to 2008, with the exception of 1994, which was a short season because of a player strike. Draw a stem and leaf plot for the data.
97 85 66 79 89 88 67 88 65 67 90 68 76 73 84 78 77 77 93 77
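One way to check a hand-drawn plot is with a short Python sketch. It assumes two-digit whole-number data, uses the tens digit as the stem and the ones digit as the leaf, and the function name `stem_and_leaf` is invented for this illustration.

```python
def stem_and_leaf(data):
    """Map each stem (tens digit) to its sorted list of leaves (ones digits)."""
    stems = range(min(data) // 10, max(data) // 10 + 1)
    plot = {stem: [] for stem in stems}    # keep stems that have no leaves
    for value in sorted(data):
        plot[value // 10].append(value % 10)
    return plot

wins = [97, 85, 66, 79, 89, 88, 67, 88, 65, 67,
        90, 68, 76, 73, 84, 78, 77, 77, 93, 77]

plot = stem_and_leaf(wins)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
# 6 | 5 6 7 7 8
# 7 | 3 6 7 7 7 8 9
# 8 | 4 5 8 8 9
# 9 | 0 3 7
```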