Lesson 3 EDA
Objective:
Learn the basics of conducting surveys, including survey creation, distribution and collection, and
how to utilize survey data.
Lesson Presentation:
Designing, Conducting, and Analyzing Surveys
Conducting a Survey
There are various methods for administering a survey. It can be done as a face-to-face interview or a
phone interview in which the researcher questions the subject. Another option is a self-administered
survey, which the subject completes on paper and mails back, or completes online. There are
advantages and disadvantages to each of these methods.
The advantages of face-to-face interviews include fewer misunderstood questions, fewer incomplete
responses, higher response rates, and greater control over the environment in which the survey is
administered; also, the researcher can collect additional information if any of the respondents’
answers need clarifying. The disadvantages of face-to-face interviews are that they can be expensive
and time-consuming and may require a large staff of trained interviewers. In addition, the response
can be biased by the appearance or attitude of the interviewer.
The advantages of self-administered surveys are that they are less expensive than interviews, do not
require a large staff of experienced interviewers and can be administered in large numbers. In
addition, anonymity and privacy encourage more candid and honest responses, and there is less
pressure on respondents. The disadvantages of self-administered surveys are that respondents are
more likely to stop participating midway through the survey and the researcher cannot ask them to
clarify their answers. In addition, response rates are lower than in personal interviews, and often
the respondents who bother to return surveys represent the extremes of the population: those people
who care strongly about the issue, whichever way their opinion leans.
Designing a Survey
Surveys can take different forms. They can be used to ask only one question or they can ask a series
of questions. We can use surveys to test out people’s opinions or to test a hypothesis.
When designing a survey, the following steps are useful:
1. Determine the goal of your survey: What question do you want to answer?
2. Identify the sample population: Whom will you interview?
3. Choose an interviewing method: face-to-face interview, phone interview, self-administered
paper survey, or internet survey.
4. Decide what questions you will ask in what order, and how to phrase them. (This is important if
there is more than one piece of information you are looking for.)
5. Conduct the interview and collect the information.
6. Analyze the results by making graphs and drawing conclusions.
Constructing a Survey
1. Martha wants to construct a survey that shows which sports students at her school like to play the
most.
a) List the goal of the survey.
The goal of the survey is to find the answer to the question: “Which sports do students at Martha’s
school like to play the most?”
b) What population sample should she interview?
A sample of the population would include a random sample of the student population in Martha’s
school. A good strategy would be to randomly select students (using dice or a random number
generator) as they walk into an all-school assembly.
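The random number generator mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the roster of 500 student ID numbers, the sample size of 30, and the helper name pick_random_sample are all hypothetical and do not come from the lesson.

```python
import random

def pick_random_sample(roster, sample_size, seed=None):
    """Return sample_size students chosen at random, without repeats."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(roster, sample_size)

# Hypothetical roster: students identified by ID numbers 1 through 500.
roster = list(range(1, 501))

# Draw 30 students; every student has the same chance of being chosen.
sample = pick_random_sample(roster, 30, seed=1)
print(sorted(sample))
```

Sampling without replacement means no student is interviewed twice, which matches the idea of picking different students as they walk in.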
c) How should she administer the survey?
Face-to-face interviews are a good choice in this case. Interviews will be easy to conduct since the
survey consists of only one question, which can be quickly answered and recorded. Asking the
question face to face will also help eliminate non-response bias.
d) Create a data collection sheet that she can use to record her results.
To collect the data for this simple survey, Martha can design a data collection sheet such as the one
below:
Sport Tally
baseball
basketball
football
soccer
volleyball
swimming
This is a good, simple data collection sheet because:
• Plenty of space is left for the tally marks.
• Only one question is being asked.
• Many possibilities are included, but space is left at the bottom in case students give answers
  that Martha didn’t think of.
• The answer from each interviewee can be quickly collected and then the data collector can
  move on to the next person.
Once the data has been collected, suitable graphs can be made to display the results.
2. Raoul wants to construct a survey that shows how many hours per week the average student at his
school works.
a) List the goal of the survey.
The goal of the survey is to find the answer to the question “How many hours per week do you work?”
b) What population sample will he interview?
Raoul suspects that older students might work more hours per week than younger students. He
decides that a stratified sample of the student population would be appropriate in this case. The strata
are grade levels 9th through 12th. He would need to find out what proportion of the students in his
school are in each grade level, and then include the same proportions in his sample.
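As a rough sketch of this proportional allocation, the snippet below uses the grade proportions and the sample size of 60 that the lesson reports later in its Examples (30%, 26%, 24%, 20%). The function name allocate_stratified is an illustrative assumption.

```python
def allocate_stratified(percent_by_stratum, sample_size):
    """Number of people to interview in each stratum, proportional to its share."""
    return {name: round(pct * sample_size / 100)
            for name, pct in percent_by_stratum.items()}

# Share of the school in each grade level, in percent.
grade_percent = {"9th": 30, "10th": 26, "11th": 24, "12th": 20}

allocation = allocate_stratified(grade_percent, 60)
print(allocation)                # {'9th': 18, '10th': 16, '11th': 14, '12th': 12}
print(sum(allocation.values()))  # 60
```

Rounding happens to add up to exactly 60 here. With other proportions the rounded counts can be off by one or two, so the total should always be checked.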
c) How would he administer the survey?
Face-to-face interviews are a good choice in this case since the survey consists of two short
questions which can be quickly answered and recorded.
d) Create a data collection sheet that Raoul can use to record his results.
In order to collect the data for this survey Raoul designed the data collection sheet shown below:
9th grade
10th grade
11th grade
12th grade
This data collection sheet allows Raoul to write down the actual numbers of hours worked per week
by students as opposed to just collecting tally marks for several categories.
Sport        Number of students
baseball     31
basketball   17
football     14
soccer       28
volleyball   9
swimming     8
gymnastics   3
fencing      2
Total: 112
a) Make a bar graph of the results showing the percentage of students in each category.
To make a bar graph, we list the sport categories on the x−axis and let the percentage of students
be represented by the y−axis.
To find the percentage of students in each category, we divide the number of students in each
category by the total number of students surveyed:
Sport        Percentage
baseball     31/112 ≈ 0.28 = 28%
basketball   17/112 ≈ 0.15 = 15%
football     14/112 = 0.125 = 12.5%
soccer       28/112 = 0.25 = 25%
volleyball   9/112 ≈ 0.08 = 8%
swimming     8/112 ≈ 0.07 = 7%
gymnastics   3/112 ≈ 0.027 = 2.7%
fencing      2/112 ≈ 0.02 = 2%
Now we can make a graph where the height of each bar represents the percentage of students in
each category:
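The percentage calculation can also be checked with a short Python sketch. The counts are the numerators used in the divisions above; drawing the bars with "#" characters is an assumption made here so that the example needs no plotting library.

```python
# Number of students who named each sport (112 students in total).
counts = {
    "baseball": 31, "basketball": 17, "football": 14, "soccer": 28,
    "volleyball": 9, "swimming": 8, "gymnastics": 3, "fencing": 2,
}
total = sum(counts.values())

# Percentage = students in the category / total students surveyed * 100.
percentages = {sport: 100 * n / total for sport, n in counts.items()}

# Simple text bar graph: one "#" for each whole percentage point.
for sport, pct in percentages.items():
    bar = "#" * round(pct)
    print(f"{sport:<11}{pct:5.1f}%  {bar}")
```

Because each percentage is rounded for display, the printed values may add up to slightly more or less than 100%.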
b) Make a pie chart of the collected information, showing the percentage of students in each
category.
To make a pie chart, we find the percentage of the students in each category by dividing the number
of students in each category by the total number of students surveyed, as in part a. The central angle
of each slice of the pie is found by multiplying the percentage of students in each category (written
as a decimal) by 360 degrees (the total number of degrees in a circle). To draw a pie chart by hand,
you can use a protractor to measure the central angles that you find for each category.
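A minimal sketch of the central-angle calculation, using the same counts as in part a. The variable names are illustrative only.

```python
# Number of students who named each sport (112 students in total).
counts = {
    "baseball": 31, "basketball": 17, "football": 14, "soccer": 28,
    "volleyball": 9, "swimming": 8, "gymnastics": 3, "fencing": 2,
}
total = sum(counts.values())

# Central angle = fraction of students in the category * 360 degrees.
angles = {sport: 360 * n / total for sport, n in counts.items()}

for sport, angle in angles.items():
    print(f"{sport:<11}{angle:6.1f} degrees")
```

For example, soccer was chosen by 28 of 112 students, one quarter of the sample, so its slice is a quarter of the circle (90 degrees).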
Examples
In the second example, Raoul found that 30% of the students at his school are in 9th grade, 26%
of the students are in the 10th grade, 24% of the students are in 11th grade and 20% of the students
are in the 12th grade. He surveyed a total of 60 students using these proportions as a guide for the
number of students he interviewed from each grade. Raoul recorded the following data:
Grade Level   Number of hours worked                                  Total number of students
9th grade     0, 5, 4, 0, 0, 10, 5, 6, 0, 0, 2, 4, 0, 8, 0, 5, 7, 0   18
Example 1
Construct a stem-and-leaf plot of the collected data.
The ordered stem-and-leaf plot looks as follows:
0 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 4 4 5 5 5 5 5 6 6 7 7 8 8 8
1 | 0 0 0 0 0 0 0 0 2 2 2 2 2 5 5 5 5 5 5 6 8 8
2 | 0 0 2
We can easily see from the stem-and-leaf plot that the mode of the data is 0. This makes sense
because many students do not work in high school.
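The plot can be rebuilt with a short Python sketch. The list hours holds the 60 values as read from the stem-and-leaf plot in Example 1, and the function name stem_and_leaf is an illustrative choice.

```python
from collections import defaultdict
from statistics import mode

# The 60 survey responses (hours worked per week), as read from the plot.
hours = ([0] * 20 + [2, 4, 4] + [5] * 5 + [6, 6, 7, 7, 8, 8, 8]
         + [10] * 8 + [12] * 5 + [15] * 6 + [16, 18, 18, 20, 20, 22])

def stem_and_leaf(values):
    """Group sorted values by tens digit (stem); keep ones digits (leaves)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(plot)

plot = stem_and_leaf(hours)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))

print("mode:", mode(hours))
```

The longest run of identical leaves is the row of zeros on stem 0, which is why the mode is 0.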
Example 2
Construct a frequency table with bin size of 5.
We construct the frequency table by counting how many students fit in each category.
Hours worked   Frequency
0≤x<5          23
5≤x<10         12
10≤x<15        13
15≤x<20        9
20≤x<25        3
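The counting can be sketched in Python as follows. The list hours holds the 60 values as read from the stem-and-leaf plot in Example 1; the name BIN_SIZE is an illustrative choice.

```python
from collections import Counter

# The 60 survey responses (hours worked per week).
hours = ([0] * 20 + [2, 4, 4] + [5] * 5 + [6, 6, 7, 7, 8, 8, 8]
         + [10] * 8 + [12] * 5 + [15] * 6 + [16, 18, 18, 20, 20, 22])

BIN_SIZE = 5

# Each value falls in the bin that starts at the multiple of 5 at or below it.
bins = Counter((h // BIN_SIZE) * BIN_SIZE for h in hours)

for low in sorted(bins):
    print(f"{low} <= x < {low + BIN_SIZE}: {bins[low]}")
```

The same counts give the bar heights of the histogram, with one bar per bin.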
Example 3
Draw a histogram of the data.
The histogram associated with this frequency table is shown below.
Example 4
Find the five number summary of the data and draw a box-and-whisker plot.
The five number summary is as follows:
smallest number = 0
largest number = 22
Since there are 60 data points, (n+1)/2 = 30.5. The median is the mean of the 30th and
the 31st values:
median = 6.5
Since each half of the ordered list has 30 values in it, the first and third quartiles are the medians
of the lower and upper halves. The first quartile is the mean of the 15th and 16th values:
first quartile = 0
The third quartile is the mean of the 45th and 46th values:
third quartile = 12
The associated box-and-whisker plot is shown below.
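The five-number summary can be verified with the sketch below, which follows the same "median of each half" rule used in this example. The list hours holds the 60 values as read from the stem-and-leaf plot in Example 1, and the function name five_number_summary is an illustrative choice. Statistical software may use a different quartile convention and give slightly different quartiles.

```python
from statistics import median

# The 60 survey responses (hours worked per week).
hours = ([0] * 20 + [2, 4, 4] + [5] * 5 + [6, 6, 7, 7, 8, 8, 8]
         + [10] * 8 + [12] * 5 + [15] * 6 + [16, 18, 18, 20, 20, 22])

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # smallest half of the ordered list
    upper = data[-half:]   # largest half (middle value left out if n is odd)
    return data[0], median(lower), median(data), median(upper), data[-1]

summary = five_number_summary(hours)
print(summary)  # (0, 0.0, 6.5, 12.0, 22)
```

In the box-and-whisker plot, the box runs from the first quartile to the third quartile with a line at the median, and the whiskers extend to the smallest and largest values.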
ACTIVITY:
1.
A method of gathering information from a sample of people, traditionally with the intention of
generalizing the results to a larger population.
Survey
Experiment
Research
2.
It is a pen and paper survey. It is conducted personally to get people’s opinion about a
product, service, or personality.
Online Survey
Traditional Survey
Interview
3.
Also called an Internet survey, this is one of the most popular data-collection sources: a set of
survey questions is sent out to a target sample, and the members of this sample can respond
to the questions over the World Wide Web.
Interview Survey
Traditional Survey
Online Survey
4.
Which of the following is NOT an online survey tool?
Google Forms
SurveyMonkey
Google Slides
5.
Which of the following is an advantage of conducting online surveys?
Some respondents submit fraudulent surveys.
It is easy to analyze and group the responses.
It can be inaccessible to others.
6.
Which of the following is NOT an advantage of conducting online surveys?
It is faster than manual surveys.
It has a limited question type.
It is easy to create a design.
7.
Which of the following is NOT a benefit to a business of conducting online
surveys?
Decreased Profit
Get to know customers better
Know strengths & weaknesses
8.
In conducting online surveys, you should be clear about your survey goal. This will help you
to determine your audience and come up with relevant questions.
TRUE
FALSE
9.
Once you have collected your survey results and have a data analysis plan, it is
time to begin calculating the results from the surveys you got back.
TRUE
FALSE
10.
In analyzing the results of a survey, _________________ is important. It means that your
survey measures what it is intended to measure.
Validity
Reliability
Scalability
ASSIGNMENT:
Review
1. Make a pie chart for Raoul's sample in the Examples above. Specifically, a total of 60 students in
four groups: 18 ninth grade students, 16 tenth grade students, 14 eleventh grade
students, and 12 twelfth grade students.
2. Melissa conducted a survey to answer the question “What sport do high school students like to
watch on TV the most?” She collected the following information on her data collection sheet.
Sport Tally
Total: 147
a) Make a pie-chart of the results showing the percentage of people in each category.
b) Make a bar-graph of the results.
3. Samuel conducted a survey to answer the following question: “What is the favorite kind of pie
of the people living in my town?” By standing in front of his grocery store, he collected the
following information on his data collection sheet:
cherry |||| 4
other |||| || 7
Total: 122
a) Make a pie chart of the results showing the percentage of people in each category.
b) Make a bar graph of the results.
4. “How much allowance money do students in your school get per week?”
1. Design a data collection sheet that can be used to collect this information.
2. Conduct the survey. This activity is best done as a group with each person contributing
at least 20 results.
3. Make a stem-and-leaf plot of the data.
4. Decide on an appropriate bin size and construct a frequency table.
5. Make a histogram of the results.
6. Find the five-number summary of the data and construct a box-and-whisker plot.