Assessment 1 - Getting Started With Your Data
Overview: Throughout this course, you will be working toward your final assessment—an
archival data research project. What does that mean? It means you will be using real data,
collected as part of the General Social Survey (GSS), to answer a real question.
To help you complete this final assessment, these earlier assessments will walk you through
the steps. In fact, to provide added assistance (and maybe even a sense of fun), you will
have three fictional research assistants.
Juanita: She is a junior in college with the intention of being a counselor. She’s more
interested in the results of the study than in the process to get the results.
Duante: Duante is a senior in college who loves algebra but is skeptical about statistics.
However, he wants a job where he can do research and is very interested in what you are
doing.
Amanda: Amanda is a first-year student in college. She’s friendly and likes to talk a lot (and
ask plenty of questions). She hasn’t been through her statistics course yet, but she is great
at staying organized and loves using Excel.
The General Social Survey (GSS) has been studying America, specifically American society, for 50
years. All of their data are available to the public (and to researchers). This makes GSS a great source
for archival data projects. The surveys are long, providing a number of potential variables. For this
project, certain variables have been chosen for you and the data extracted from the GSS site ahead of
time. If you’re curious and want to learn more about GSS, feel free to check out the About the GSS
web page.
Directions: Complete the steps below. These steps will prepare you for later portions of this
assessment as well as for your final project.
Step 1: Choose your variables, one from each column, and fill in the white blocks of the table
below.
List A (Choose ONE variable from this list)
RACLIVE. Have other race living in neighborhood. This is a Yes/No question asking respondents if they live in a neighborhood with people of another race.
NEWS. How often does respondent read newspaper. This question asks respondents to rate how often they read the newspaper (Likert scale).
WWWHR. Internet hours per week. This question asks respondents to share how many hours in a week they use the Internet for non-email activities.
List B (Choose ONE variable from this list)
HAPPY. General happiness. This asks the respondents to rate how happy they are (small Likert scale).
LIFE. If life is exciting or dull. This question asks respondents to rate their life as exciting, routine, or dull (small Likert scale).
MNTLHLTH. Days of poor mental health past 30 days. This question asks respondents how many days of poor mental health they’ve had in the past 30 days.
DEPRESS. Told have depression. This is a Yes/No question asking respondents if they have been told they have depression.
My Variable (from List A): RACLIVE
My Variable (from List B): HAPPY
Note: Once you’ve selected your variables, use them throughout the course. Changing
variables partway through will require you to re-do work you’ve already done.
What’s your preliminary research question? Note: The survey method used to collect
these data was not designed to allow you to determine causation or the effect of one
variable on the other.
Step 2: Download data. Download the "GSS Data 2018" Excel file from the assessment. Make
sure you save this somewhere on your computer (where you can find it again).
Step 3: Clean the data. In this case, Amanda helped you out and, if you look at the Excel file
you downloaded, the data have been cleaned for you. She spent time going through the data
to make sure there were results for both variables you will use and removed those
participants who did not respond to both questions. She also made sure that the data were
in a form that JASP could read accurately (right labels, right file format, et cetera). She’s a bit
of an overachiever, so she cleaned everyone’s data. On the Excel spreadsheet, you will want
to make sure you are on the right tab for your project. To understand what answer the
numbers correspond to, look at the Codes tab.
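The cleaning rule described in Step 3 (keep only participants who answered both questions) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure Amanda actually used: the sample rows are invented, the helper name keep_complete_rows is hypothetical, and a blank cell is assumed to mean "no response" (the real GSS files may use special missing-data codes instead).

```python
import csv
import io

# Hypothetical raw extract: blank cells stand for respondents who skipped a question.
RAW_CSV = """RACLIVE,HAPPY
1,2
,1
2,
2,3
"""

def keep_complete_rows(rows, variables):
    """Return only the rows that have a non-blank answer for every listed variable."""
    return [row for row in rows
            if all((row.get(name) or "").strip() for name in variables)]

rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
cleaned = keep_complete_rows(rows, ["RACLIVE", "HAPPY"])
print(len(rows), "rows before cleaning;", len(cleaned), "rows after")
```

With the four invented rows above, two respondents are missing one answer each, so two rows remain after cleaning.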
Amanda is confused by the measures of central tendency and has come to you asking questions.
You’ve sent Amanda off to think more about measures of central tendency. Working with Duante,
you start to consider your variables. He wants to know what the data type for each available variable
is.
What data type is our data (for JASP)? Note: JASP offers you three options: Nominal,
Ordinal, Continuous. So, if the data is ratio or interval, call it continuous for the purpose of
this table.
NOTES:
Race: Respondents had a choice of: White, Black, or Other (The survey uses limited options,
which is a limitation for modern studies but useful when comparing modern data with
historical data.)
Sex: Respondents had a choice of: Male, Female (Based on biological sex at birth; this is not a
variable looking at gender.)
Racliv: Respondents were given a Yes/No question on whether they lived in a neighborhood
with other races.
Wwwhr: Respondents entered the number of hours they are on the Internet each week.
Depress: Respondents were given a Yes/No question on whether they had been told they had
depression.
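The rule in the note about JASP data types (interval and ratio data are both entered as Continuous) can be written as a small lookup. The sketch below is illustrative only: the function name jasp_type is hypothetical, and the levels assigned to the example variables are assumptions based on how the notes describe each question, not official GSS classifications.

```python
# Mapping from the classic levels of measurement to the three JASP options
# named in the note about data types (interval and ratio both become Continuous).
JASP_TYPE = {
    "nominal": "Nominal",
    "ordinal": "Ordinal",
    "interval": "Continuous",
    "ratio": "Continuous",
}

def jasp_type(level):
    """Return the JASP option for a level of measurement (case-insensitive)."""
    return JASP_TYPE[level.strip().lower()]

# Assumed levels, based on how the notes describe each question.
assumed_levels = {
    "Sex": "nominal",     # two unordered categories
    "Racliv": "nominal",  # Yes/No question
    "Wwwhr": "ratio",     # number of hours per week
}

for variable, level in assumed_levels.items():
    print(variable, "->", jasp_type(level))
```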
In JASP, select the three blue bars, then select Open, then select the location where you saved the
.csv file.
Click on the appropriate variable data icon in the column title to change it to the
correct format.
Continuing to work with Duante, you decide to learn more about your participants.
Duante has done some research into GSS and provides you with the following information:
The General Social Survey uses random sampling to get a representative sample of adults
across the United States (NORC, 2019).
Reference:
National Opinion Research Center (NORC). (2019). Appendix A: Sampling design and
weighting. In General Social Surveys 1972–2018: Cumulative Codebook (pp. 3171–3189).
https://gss.norc.org/documents/codebook/GSS_Codebook_AppendixA.pdf
Before you start looking at your participants, Juanita has a few questions for you.
Scoring Criterion: Explain the use of a mean with different types of variables.
Can you use a mean for the variable Sex? Why or why not?
No. "Sex" is a categorical variable with two groups (male and female), so it is nominal data, and a mean is not appropriate. Calculating a mean requires values with a meaningful numerical order and magnitude, which categories like "Sex" do not have. The mean is an average calculated from numeric values on a continuous scale, so averaging the numeric codes assigned to categories would not be meaningful. Instead, categorical variables like "Sex" are described with percentages or proportions that show how the categories are distributed within the dataset.
Can you use a mean for the variable Race? Why or why not?
No. "Race" is a categorical (nominal) variable that sorts respondents into groups based on their racial or ethnic identity. Nominal variables cannot be used to calculate a mean because their categories have no numerical order or magnitude. Any numbers used to code the categories are labels, not quantities. Categorical variables like "Race" are instead analyzed using frequencies, proportions, or percentages that show how respondents are distributed across the groups.
Can you use a mean for the variable Age? Why or why not?
Yes, you can use a mean for the variable "Age." Age is measured in numeric values on a continuous scale (a person's age in years). These values have meaningful numerical properties, such as magnitude and order, which make calculating a mean appropriate. The mean provides a measure of central tendency, giving the average age of all individuals in the dataset. Calculating an average for "Age" is therefore justified because it summarizes the central tendency of ages within the sample.
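The contrast drawn in the answers above can be illustrated with a short Python sketch. The values are invented for illustration and are not GSS results, and the coding 1 = Male, 2 = Female is an assumption (the actual codes are listed on the Codes tab of the spreadsheet).

```python
from collections import Counter
from statistics import mean

# Hypothetical values for illustration only; these are not GSS results.
ages = [23, 35, 41, 58, 63]
sex_codes = [1, 2, 2, 1, 2]            # assumed coding: 1 = Male, 2 = Female
sex_labels = {1: "Male", 2: "Female"}

# Age is numeric with real magnitude, so the mean is interpretable.
mean_age = mean(ages)
print("Mean age:", mean_age)

# Averaging the codes runs without error, but the result describes no respondent.
mean_code = mean(sex_codes)
print("Mean of sex codes:", mean_code)

# For a nominal variable, report counts and percentages instead.
counts = Counter(sex_labels[code] for code in sex_codes)
percentages = {label: 100 * n / len(sex_codes) for label, n in counts.items()}
print(percentages)
```

The software will happily compute a mean for the coded categories, which is why the data type has to be set deliberately rather than left to whatever the numbers look like.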
Now that Juanita has a better understanding of the mean, the two of you start looking at your
demographic data.
Open JASP.
Then open your .csv data file for your project. You can do this by clicking the three
blue bars, selecting Open, then selecting Computer, then choosing your file from
wherever you saved it on your computer.
Juanita likes to process information visually and asks if you can create graphs or charts of the data.
Directions: Create a bar graph, pie chart, and frequency table by following the directions
below.
Copy and paste or take a screenshot of the Bar Graph for Race. Place it below.
Next, click Basic Plots, then put a check next to Pie Charts.
Copy and paste or take a screenshot of the Pie Chart for Sex. Place it below.
As you look over the participants, Amanda, Juanita, and Duante discuss what careers they want in
the future. They ask you what your thoughts are and whether you’d ever consider a job that involves
data analysis.
Step 1: Statistics and data analysis are marketable job skills. Search the Internet for jobs you
could apply for with a bachelor’s degree that require the use of statistics. Some good key
search terms: psychology research assistant or survey data analysis.
Data analysis work demands close attention to detail and accuracy.
Familiarity with data analysis will remain an important criterion for job
seekers.
Does the job sound interesting to you? Why or why not?
Yes, I would be an enthusiastic candidate for such a position. To me, this job is an exciting opportunity to work with data and make discoveries that can drive business decisions or become the foundation of research. Using statistics, data visualization, and problem solving appeals to my analytical side. The job also matches my skills and background. Another compelling aspect is applying my skills to support development projects and using a data-informed approach to initiate change.
Reference