0% found this document useful (0 votes)

47 views13 pages

Videos and Tutorials On Data Analysis in The Psychometrics Lab

The document provides instructions for analyzing psychometric data in R. It includes steps to export and clean data, install necessary packages, start an analysis script to load packages and data, recode items, calculate scale reliability and scores, and conduct descriptive statistics and regression analysis. Specifically, it discusses exporting data from Qualtrics, installing packages like tidyverse and psyntur, reading data into R from a CSV file, calculating Cronbach's alpha, aggregate scores, and performing regression on the cleaned data.

Uploaded by

Sreesha Chakraborty

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

47 views13 pages

Videos and Tutorials On Data Analysis in The Psychometrics Lab

Uploaded by

Sreesha Chakraborty

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 13

Psychometrics Lab: Data Analysis

Export data and clean it

Install necessary packages
Start a script, load the packages
Read in your data
Recode the negatively key items
Calculate Cronbach’s alpha for each scale
Calculate the aggregate scores
Calculating descriptives
Regression analysis

Export data and clean it

You should do your analysis on cleaned-up data. In the following video, we explain what clean data should look
like, and how to export raw data from Qualtrics and clean it.

Install necessary packages

For this analysis, you need to have the following packages installed.

tidyverse
psyntur
car
lm.beta

Check your package listing to see if these are all installed. If not, install these package using the “Install” button in
the “Packages” tab in RStudio. Alternatively, you can do the following:

install.packages(c('tidyverse', 'psyntur', 'car', 'lm.beta'))

Make sure that you have up to date versions of these packages. In particular, make sure that the version of psyntur
is at least 0.1.0. Once installed, you can check your version number in the package listing in the “Packages” tab
(see column labelled “Version”). Alternatively, in R, you can use the command packageVersion to check package
versions. For example, to check the version of psyntur, do the following:

packageVersion("psyntur")

## [1] '0.1.0'

Start a script, load the packages

You should put all your analysis code in one R script. This script should contain all and only the code for the
analysis. In other words, everything you need to do every step of the analysis, including the reading of the data,
should be there, and there should be no unnecessary code. Keep your code clean and well organized. Use
“sections” in the script to organize your code into regions of similar code. Sections can be inserted using the “Insert
Section” item RStudio’s “Code” menu.

Once installed, you must then load the required packages using the library function as follows:

library(tidyverse)
library(psyntur)
library(car)
library(lm.beta)

Read in your data

With your data, which is should be a .csv file, you can read it into R by using the “Import Dataset” button, and
choosing “From Text (readr)”. However, this is not recommended. You should instead write a read_csv command
in your R script that reads in the data from a file. This is a much better option because you can always come back to
your script at a later point and re-run it, and your data will be read in from file without any manual intervention. In
order to use this read_csv simply and easily, you should move the data .csv file to your working directory, and
your script should be there too, and then use the read_csv command with the filename to read it in. For example, if
your .csv file is called psychometrics_lab_data.csv, first copy or move this file into R’s working directory.
You can find your working directory by typing the getwd() command. On my system, getwd() tells me that my
working directory is a folder called psychometrics in my home directory.

> getwd()
[1] "/home/andrews/psychometrics"

If I had a file named psychometrics_lab_data.csv and moved into this folder, then in my R script, I can do the
following:

lab_data <- read_csv("psychometric_lab_data.csv")

In general, whenever we read in a csv file into R using the command read_csv, the data is returned as an R data
frame. In the above example, I named this data frame lab_data. If I type lab_data, I will then see the data.
Alternatively, I were to type glimpse(lab_data), I will get a more useful view of it (see next example).

For the purposes of this guide, I will read in an example .csv file from a URL web address rather than a file on my
local computer, and I will give the data frame that is returned the name psymetr_df.

psymetr_df <- read_csv("http://data.ntupsychology.net/psychometrics_demo_data.csv")

Look at your data

Let us take a look at psymetr_df with glimpse.

glimpse(psymetr_df)

## Rows: 44
## Columns: 52
## $ gender <dbl> 1, 1, 2, 2, 2, 2, 2, 1, 1, 2, 1, 2, 2, 2, 1, 1, 1, 1, 2…
## $ age <dbl> 19, 22, 20, 20, 19, 21, 19, 18, 22, 23, 19, 18, 19, 21,…
## $ anxiety_1 <dbl> 1, 2, 2, 2, 0, 1, 2, 2, 1, 1, 2, 2, 3, 3, 2, 2, 1, 2, 1…
## $ anxiety_2 <dbl> 3, 3, 1, 2, 1, 2, 2, 2, 2, 2, 3, 1, 2, 2, 2, 1, 2, 2, 3…
## $ anxiety_3 <dbl> 1, 3, 2, 3, 1, 1, 3, 1, 1, 2, 2, 1, 2, 2, 1, 2, 3, 2, 2…
## $ anxiety_4 <dbl> 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 1, 2, 2, 2, 1, 2, 3, 4…
## $ anxiety_5 <dbl> 2, 1, 2, 1, 1, 2, 2, 1, 1, 2, 3, 1, 3, 2, 2, 1, 3, 2, 1…
## $ anxiety_6 <dbl> 2, 3, 2, 2, 3, 3, 1, 2, 2, 3, 2, 2, 1, 2, 3, 3, 4, 2, 2…
## $ anxiety_7 <dbl> 2, 2, 2, 2, 3, 3, 0, 2, 2, 3, 2, 3, 1, 3, 2, 2, 2, 1, 2…
## $ anxiety_8 <dbl> 2, 3, 1, 2, 2, 2, 1, 2, 3, 3, 2, 2, 1, 3, 1, 1, 1, 1, 2…
## $ anxiety_9 <dbl> 2, 3, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 3, 2, 2, 1…
## $ anxiety_10 <dbl> 1, 2, 1, 3, 3, 2, 2, 2, 3, 2, 2, 2, 1, 2, 1, 1, 2, 1, 1…
## $ depression_1 <dbl> 2, 2, 2, 3, 2, 1, 3, 4, 1, 2, 2, 1, 3, 3, 4, 2, 2, 4, 3…
## $ depression_2 <dbl> 1, 2, 2, 2, 2, 1, 3, 2, 2, 1, 1, 2, 1, 2, 1, 1, 2, 1, 2…
## $ depression_3 <dbl> 3, 3, 2, 3, 2, 2, 3, 2, 3, 3, 2, 1, 1, 2, 3, 2, 3, 2, 2…
## $ depression_4 <dbl> 1, 1, 4, 2, 1, 3, 3, 2, 2, 1, 1, 2, 3, 2, 1, 1, 2, 2, 3…
## $ depression_5 <dbl> 1, 3, 2, 3, 2, 1, 4, 2, 2, 2, 2, 1, 2, 2, 2, 2, 4, 2, 3…
## $ depression_6 <dbl> 1, 2, 3, 2, 1, 2, 2, 2, 1, 3, 3, 2, 1, 2, 3, 2, 3, 4, 3…
## $ depression_7 <dbl> 1, 1, 3, 2, 2, 3, 5, 2, 1, 1, 4, 1, 3, 2, 2, 4, 2, 4, 3…
## $ depression_8 <dbl> 3, 5, 2, 4, 4, 2, 2, 4, 3, 4, 2, 4, 2, 4, 4, 4, 3, 3, 4…
## $ depression_9 <dbl> 3, 4, 4, 4, 5, 4, 3, 4, 3, 4, 4, 4, 3, 3, 4, 2, 4, 1, 5…
## $ depression_10 <dbl> 2, 5, 2, 4, 5, 2, 4, 3, 2, 5, 3, 4, 2, 4, 4, 4, 1, 1, 5…
## $ efficacy_1 <dbl> 2, 1, 1, 2, 4, 3, 2, 3, 2, 3, 1, 2, 2, 2, 3, 3, 2, 3, 2…
## $ efficacy_2 <dbl> 2, 1, 3, 3, 4, 2, 2, 1, 3, 3, 2, 1, 2, 2, 2, 2, 2, 2, 3…
## $ efficacy_3 <dbl> 3, 1, 2, 3, 4, 2, 1, 1, 3, 1, 2, 2, 2, 1, 2, 2, 1, 1, 1…
## $ efficacy_4 <dbl> 3, 2, 2, 5, 3, 3, 1, 1, 3, 2, 1, 2, 2, 2, 3, 2, 2, 3, 2…
## $ efficacy_5 <dbl> 1, 1, 3, 4, 3, 3, 2, 1, 3, 3, 3, 2, 2, 1, 2, 2, 4, 3, 2…
## $ efficacy_6 <dbl> 2, 2, 2, 4, 3, 2, 2, 2, 3, 1, 1, 3, 3, 2, 2, 2, 2, 2, 1…
## $ efficacy_7 <dbl> 3, 4, 4, 4, 3, 2, 5, 5, 2, 4, 5, 4, 2, 5, 3, 3, 3, 5, 4…
## $ efficacy_8 <dbl> 3, 4, 5, 4, 1, 2, 3, 5, 4, 4, 5, 4, 2, 4, 3, 3, 3, 4, 3…
## $ efficacy_9 <dbl> 5, 2, 4, 4, 3, 3, 4, 4, 2, 3, 4, 4, 2, 3, 4, 4, 4, 2, 5…
## $ efficacy_10 <dbl> 4, 5, 4, 3, 5, 5, 3, 2, 3, 3, 5, 3, 4, 4, 5, 3, 4, 4, 4…
## $ sociability_1 <dbl> 1, 2, 4, 4, 4, 1, 2, 4, 5, 5, 4, 3, 3, 4, 2, 3, 3, 2, 5…
## $ sociability_2 <dbl> 1, 2, 1, 2, 4, 1, 2, 4, 4, 5, 5, 1, 3, 2, 5, 3, 3, 1, 5…
## $ sociability_3 <dbl> 4, 3, 1, 2, 3, 5, 3, 2, 2, 2, 3, 4, 4, 4, 2, 3, 1, 3, 4…
## $ sociability_4 <dbl> 1, 3, 2, 1, 3, 2, 3, 3, 3, 1, 4, 3, 2, 3, 1, 3, 2, 4, 3…
## $ sociability_5 <dbl> 3, 5, 5, 2, 2, 5, 3, 4, 2, 1, 4, 2, 5, 2, 2, 4, 3, 2, 4…
## $ sociability_6 <dbl> 2, 3, 4, 3, 2, 2, 4, 3, 1, 1, 1, 2, 2, 2, 1, 1, 3, 5, 3…
## $ sociability_7 <dbl> 4, 5, 5, 4, 3, 1, 5, 3, 4, 1, 4, 4, 5, 2, 2, 1, 2, 4, 1…
## $ sociability_8 <dbl> 1, 3, 4, 1, 1, 2, 2, 2, 4, 4, 1, 5, 5, 1, 2, 1, 2, 5, 3…
## $ sociability_9 <dbl> 2, 5, 5, 1, 3, 5, 5, 2, 3, 3, 1, 4, 4, 2, 4, 2, 2, 5, 4…
## $ sociability_10 <dbl> 3, 2, 4, 3, 3, 2, 4, 4, 1, 2, 3, 2, 4, 1, 3, 1, 4, 3, 1…
## $ stress_1 <dbl> 1, 4, 2, 1, 1, 0, 3, 2, 2, 2, 4, 2, 3, 1, 1, 0, 3, 4, 1…
## $ stress_2 <dbl> 2, 4, 1, 0, 1, 2, 3, 3, 2, 1, 3, 3, 2, 3, 3, 1, 1, 2, 4…
## $ stress_3 <dbl> 1, 2, 3, 1, 2, 2, 3, 1, 3, 2, 4, 3, 2, 1, 2, 0, 0, 4, 3…
## $ stress_4 <dbl> 4, 1, 0, 4, 4, 1, 2, 1, 4, 2, 1, 1, 0, 2, 3, 3, 3, 2, 0…
## $ stress_5 <dbl> 0, 2, 0, 3, 3, 4, 1, 0, 3, 3, 1, 0, 0, 0, 1, 4, 0, 0, 3…
## $ stress_6 <dbl> 0, 4, 3, 3, 1, 1, 3, 1, 2, 0, 4, 2, 3, 1, 3, 2, 1, 2, 2…
## $ stress_7 <dbl> 1, 0, 3, 2, 4, 2, 0, 0, 4, 2, 0, 4, 0, 1, 1, 2, 1, 1, 2…
## $ stress_8 <dbl> 0, 2, 3, 1, 2, 4, 2, 3, 3, 2, 0, 1, 1, 0, 1, 3, 0, 0, 3…
## $ stress_9 <dbl> 1, 2, 2, 1, 2, 3, 3, 1, 3, 0, 4, 0, 4, 3, 0, 1, 2, 3, 4…
## $ stress_10 <dbl> 0, 2, 2, 0, 0, 1, 4, 4, 1, 1, 3, 4, 3, 3, 3, 1, 1, 4, 2…

Note: As we can see, in this data, there is a consistent naming pattern for each item on each scale. For example, the
“anxiety” scale items are anxiety_1, anxiety_2, and so on, the “depression” scale items are depression_1,
depression_2, and so on. It is necessary for you to use a consistent naming scheme like this for your data.

Recode the negatively key items

In the case of my data, I have the following information about the range of values of the items and which items are
negatively keys.

anxiety: values 0-4; neg keyed = 6, 7, 8, 9, 10

depression: values 1-5; neg keyed = 8, 9, 10
efficacy: values 1-5; neg keyed = 7, 8, 9, 10
sociability: values 1-5; neg keyed = 4, 5, 6, 7, 8, 9, 10
stress: values 0-4; neg keyed = 4, 5, 7, 8

For your scales, you must get this information too.

In the following code, for each item that needs to be reverse coded, we have one line of code that names the item
and gives their original values and the new values to which they are mapped.

psymetr_df_fix <- mutate(psymetr_df,

anxiety_6 = re_code(anxiety_6, 0:4, 4:0),
anxiety_7 = re_code(anxiety_7, 0:4, 4:0),
anxiety_8 = re_code(anxiety_8, 0:4, 4:0),
anxiety_9 = re_code(anxiety_9, 0:4, 4:0),
anxiety_10 = re_code(anxiety_10, 0:4, 4:0),
depression_8 = re_code(depression_8, 1:5, 5:1),
depression_9 = re_code(depression_9, 1:5, 5:1),
depression_10 = re_code(depression_10, 1:5, 5:1),
efficacy_7 = re_code(efficacy_7, 1:5, 5:1),
efficacy_8 = re_code(efficacy_8, 1:5, 5:1),
efficacy_9 = re_code(efficacy_9, 1:5, 5:1),
efficacy_10 = re_code(efficacy_10, 1:5, 5:1),
sociability_4 = re_code(sociability_4, 1:5, 5:1),
sociability_5 = re_code(sociability_5, 1:5, 5:1),
sociability_6 = re_code(sociability_6, 1:5, 5:1),
sociability_7 = re_code(sociability_7, 1:5, 5:1),
sociability_8 = re_code(sociability_8, 1:5, 5:1),
sociability_9 = re_code(sociability_9, 1:5, 5:1),
sociability_10 = re_code(sociability_10, 1:5, 5:1),
stress_4 = re_code(stress_4, 0:4, 4:0),
stress_5 = re_code(stress_5, 0:4, 4:0),
stress_7 = re_code(stress_7, 0:4, 4:0),
stress_8 = re_code(stress_8, 0:4, 4:0)
)

Be careful with this code. Check every item to make sure the item name is correct and the original and new values
are correct.
Remember to assign results to new data frame. In the code above, after the recoding is done, the new data frame
produced is named psymetr_df_fix. We use this data frame from now on.

Calculate Cronbach’s alpha for each scale

For each scale, we want to calculate Cronbach’s alpha measure of internal consistency. We do this using the
cronbach function in the psyntur package.

Remember to use the data frame where the items have been recoded. For example, in my case, this is
psymetr_df_fix.

In the following calculations, we select the items for each scale using the start_with functions. This assumes that
all items for each scale begin with a common prefix, which they do in my case, as mentioned above. For example,
all the items on the stress scale begin with stress_, and all the items on the depression scale begin with
depression_, and so on. For each set of items that is selected, the cronbach function will return the estimate of the
$\alpha$ coefficient and its 95% confidence interval.

cronbach(psymetr_df_fix,
anxiety = starts_with('anxiety_'),
depression = starts_with('depression_'),
efficacy = starts_with('efficacy_'),
sociability = starts_with('sociability_'),
stress = starts_with('stress_')
)

## # A tibble: 5 × 4
## scale alpha ci_lo ci_hi
## <chr> <dbl> <dbl> <dbl>
## 1 anxiety 0.620 0.452 0.788
## 2 depression 0.734 0.620 0.848
## 3 efficacy 0.706 0.577 0.835
## 4 sociability 0.634 0.473 0.794
## 5 stress 0.834 0.761 0.907

Calculate the aggregate scores

For each scale, we must calculate the mean over all the items (again, we must use the data frame with the reversed
items) to calculate the mean scores. For this, we use the command total_scores from the psyntur package. Its
syntax is very similar to that used above in cronbach. In particular, for each scale, we select all items to be
averaged over using starts_with. For example, using the following code, we get back a new data frame, which
we name psymetr_df_total, that has the the average anxiety, depression, efficacy, sociability, and
stress score for each participant.

psymetr_df_total <- total_scores(psymetr_df_fix,

anxiety = starts_with('anxiety_'),
depression = starts_with('depression_'),
efficacy = starts_with('efficacy_'),
sociability = starts_with('sociability_'),
stress = starts_with('stress_')
)
psymetr_df_total

## # A tibble: 44 × 5
## anxiety depression efficacy sociability stress
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2 2 2.2 3.2 1.6
## 2 1.8 1.8 1.7 2.3 2.9
## 3 2.1 2.8 2 1.9 2.3
## 4 1.8 2.3 3 3.5 1.2
## 5 1.1 1.6 3.3 3.6 1
## 6 1.6 2.3 2.7 3 1.4
## 7 2.5 3.2 1.9 2.3 3
## 8 1.8 2.3 1.7 3.1 2.4
## 9 1.5 2.2 3 3.5 1.5
## 10 1.5 1.8 2.3 4.1 1.3
## # … with 34 more rows

Should we calculate the mean or the sum of all the items’ values? If we have missing values, the sum can be
misleading. For example, if we have 10 items on a 5 point scale, the total score possible is 50. The mean shows
that they have on average the maximum score, but this is not apparent from the sum. However, sometimes people
want to report the sum, though don’t want it affect by any missing values. A way of doing this is to multiply the
mean, calculated after the missing values have been removed, by the number of items. For example, in the example
just mentioned, we could multiply the mean of 5.0 by 10 to get 50. This can be done in the total_scores function,
by saying .method = 'sum_like as follows.

total_scores(psymetr_df_fix,
anxiety = starts_with('anxiety_'),
depression = starts_with('depression_'),
efficacy = starts_with('efficacy_'),
sociability = starts_with('sociability_'),
stress = starts_with('stress_'),
.method = 'sum_like'
)

## # A tibble: 44 × 5
## anxiety depression efficacy sociability stress
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 20 20 22 32 16
## 2 18 18 17 23 29
## 3 21 28 20 19 23
## 4 18 23 30 35 12
## 5 11 16 33 36 10
## 6 16 23 27 30 14
## 7 25 32 19 23 30
## 8 18 23 17 31 24
## 9 15 22 30 35 15
## 10 15 18 23 41 13
## # … with 34 more rows

The total_scores function uses the same aggregation method for all the variables. Sometimes, however, you
might like to calculate the, for example, mean for some variables and the sum (or sum_like) for other variables. To
do this, you must use the total_scores function twice, once of one set of variables, and then a second time for
another set of variables. The resulting two data frames can be bound together using bind_cols. In the following
example, we calculate the mean for the anxiety and depression scores, and the sum (sum_like) for the remaining
three variables, and then we bind them together with bind_cols.

bind_cols(
total_scores(psymetr_df_fix,
anxiety = starts_with('anxiety_'),
depression = starts_with('depression_'),
.method = 'mean'),
total_scores(psymetr_df_fix,
efficacy = starts_with('efficacy_'),
sociability = starts_with('sociability_'),
stress = starts_with('stress_'),
.method = 'sum_like')
)

## # A tibble: 44 × 5
## anxiety depression efficacy sociability stress
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2 2 22 32 16
## 2 1.8 1.8 17 23 29
## 3 2.1 2.8 20 19 23
## 4 1.8 2.3 30 35 12
## 5 1.1 1.6 33 36 10
## 6 1.6 2.3 27 30 14
## 7 2.5 3.2 19 23 30
## 8 1.8 2.3 17 31 24
## 9 1.5 2.2 30 35 15
## 10 1.5 1.8 23 41 13
## # … with 34 more rows

Calculating descriptives
For each variable, we can get back the mean, standard deviation, or any other descriptive statistic, as follows:

describe_across(psymetr_df_total,
variables = c(stress, anxiety, depression, efficacy, sociability),
functions = list(avg = mean, stdev = sd),
pivot = TRUE)

## # A tibble: 5 × 3
## variable avg stdev
## <chr> <dbl> <dbl>
## 1 stress 2.16 0.826
## 2 anxiety 1.85 0.458
## 3 depression 2.21 0.506
## 4 efficacy 2.25 0.488
## 5 sociability 3.08 0.615

We can make the above code a little bit simpler by using the everything() function for the value of the variables
argument. As we can see above, we individually selected each one of all the variables in the data set. Instead, if we
use everything(), we automatically select all the variables.

describe_across(psymetr_df_total,
variables = everything(),
functions = list(avg = mean, stdev = sd),
pivot = TRUE)

## # A tibble: 5 × 3
## variable avg stdev
## <chr> <dbl> <dbl>
## 1 anxiety 1.85 0.458
## 2 depression 2.21 0.506
## 3 efficacy 2.25 0.488
## 4 sociability 3.08 0.615
## 5 stress 2.16 0.826

If there are any missing values in psymetr_df_total, we will get NA values in the table of results from
describe_across. To avoid this, we can use counterparts of mean and sd that remove missing values before they
calculate the results. These are mean_xna and sd_xna, respectively. The following code uses these, but in this case,
because there were no missing values in the data, nothing changes in the table.

describe_across(psymetr_df_total,
variables = everything(),
functions = list(avg = mean_xna, stdev = sd_xna),
pivot = TRUE)
## # A tibble: 5 × 3
## variable avg stdev
## <chr> <dbl> <dbl>
## 1 anxiety 1.85 0.458
## 2 depression 2.21 0.506
## 3 efficacy 2.25 0.488
## 4 sociability 3.08 0.615
## 5 stress 2.16 0.826

Inter correlation matrix

We can also get the pairwise inter-correlation matrix as follows.

cor(psymetr_df_total)

## anxiety depression efficacy sociability stress

## anxiety 1.000000000 0.83274894 -0.56651310 0.004535846 0.8105351
## depression 0.832748939 1.00000000 -0.36339327 -0.085415608 0.6830987
## efficacy -0.566513097 -0.36339327 1.00000000 -0.042575472 -0.6259372
## sociability 0.004535846 -0.08541561 -0.04257547 1.000000000 -0.1448385
## stress 0.810535074 0.68309866 -0.62593715 -0.144838515 1.0000000

Note. If we had NA values in the psymetr_df_total, we would have to remove these first before we calculate the
correlation matrix. We would do this with the following version of the cor command.

cor(psymetr_df_total, use = 'complete.obs')

## anxiety depression efficacy sociability stress

We can make a scatterplot matrix using the command scatterplot_matrix from psyntur.

scatterplot_matrix(psymetr_df_total,
anxiety,
depression,
efficacy,
sociability,
stress)
Regression analysis

Regression
We do the multiple regression by indicating the outcome variable and the predictor variables, which in this case are
stress and anxiety, depression, efficacy, sociability, respectively.

Remember to use the data frame with the total scores.

model <- lm(stress ~ anxiety + depression + efficacy + sociability, data = psymetr_df_total)

The summary will give you the

coefficients table
the $R^2$
the adjusted $R^2$
the F statistic for the overall model null hypothesis

All of these are expected to be reported in your analysis report.

summary(model)

##
## Call:
## lm(formula = stress ~ anxiety + depression + efficacy + sociability,
## data = psymetr_df_total)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.8980 -0.3043 0.0488 0.2591 0.9700
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.5410 0.7505 2.053 0.04678 *
## anxiety 1.0748 0.3203 3.355 0.00178 **
## depression 0.1250 0.2583 0.484 0.63129
## efficacy -0.4512 0.1776 -2.541 0.01513 *
## sociability -0.2045 0.1143 -1.790 0.08127 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.455 on 39 degrees of freedom
## Multiple R-squared: 0.7247, Adjusted R-squared: 0.6965
## F-statistic: 25.67 on 4 and 39 DF, p-value: 1.803e-10

As we can see, the $R^2$ value is 0.725, the adjusted $R^2$ value is 0.696, the F statistic is $F(4, 39) = 25.67$.

Confidence intervals
We can get the confidence intervals for the coefficients as follows.

confint(model)

## 2.5 % 97.5 %
## (Intercept) 0.02304224 3.05905333
## anxiety 0.42688263 1.72274893
## depression -0.39752733 0.64742803
## efficacy -0.81037120 -0.09208644
## sociability -0.43558962 0.02661865

Multicollinearity
We can measure the multicollinearity using the variance inflation factor using the vif function from car.

vif(model)

## anxiety depression efficacy sociability

## 4.476008 3.541564 1.560851 1.026784

Standardized coefficients
The standardized coefficients can be obtained using the lm.beta function from the lm.beta package. We send the
model to lm.beta to get a new standardized model, and then we can use summary etc with this model.

model_standardized <- lm.beta(model)

summary(model_standardized)

##
## Call:
## lm(formula = stress ~ anxiety + depression + efficacy + sociability,
## data = psymetr_df_total)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.8980 -0.3043 0.0488 0.2591 0.9700
##
## Coefficients:
## Estimate Standardized Std. Error t value Pr(>|t|)
## (Intercept) 1.54105 0.00000 0.75049 2.053 0.04678 *
## anxiety 1.07482 0.59642 0.32033 3.355 0.00178 **
## depression 0.12495 0.07648 0.25831 0.484 0.63129
## efficacy -0.45123 -0.26675 0.17756 -2.541 0.01513 *
## sociability -0.20449 -0.15237 0.11426 -1.790 0.08127 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.455 on 39 degrees of freedom
## Multiple R-squared: 0.7247, Adjusted R-squared: 0.6965
## F-statistic: 25.67 on 4 and 39 DF, p-value: 1.803e-10

早年自敲代码
No ratings yet
早年自敲代码
96 pages
(Practical) Programming With R
No ratings yet
(Practical) Programming With R
5 pages
Healthcare Analytics
No ratings yet
Healthcare Analytics
72 pages
R Programming
No ratings yet
R Programming
47 pages
Untitled 6
No ratings yet
Untitled 6
3 pages
Datarium
No ratings yet
Datarium
12 pages
Jay Borda Itr Practical 1 To 9
No ratings yet
Jay Borda Itr Practical 1 To 9
42 pages
Task 1 RR Usa
No ratings yet
Task 1 RR Usa
5 pages
Efa Medstat
No ratings yet
Efa Medstat
20 pages
R Studio Notes
No ratings yet
R Studio Notes
6 pages
Experiment 2
No ratings yet
Experiment 2
7 pages
Aman DA 111
No ratings yet
Aman DA 111
14 pages
Experiment 2
No ratings yet
Experiment 2
7 pages
Introduction To Psych Package
No ratings yet
Introduction To Psych Package
65 pages
2 Project
No ratings yet
2 Project
9 pages
Data - Analysis - With - R - 24
No ratings yet
Data - Analysis - With - R - 24
47 pages
Lab 2
No ratings yet
Lab 2
22 pages
Overview
No ratings yet
Overview
94 pages
Sunil Test
No ratings yet
Sunil Test
15 pages
Psycho
No ratings yet
Psycho
20 pages
Uni T - 2 - R Programming
No ratings yet
Uni T - 2 - R Programming
10 pages
Mock Exam - Appendix
No ratings yet
Mock Exam - Appendix
15 pages
Lecture 5 (Managing and Understanding Data)
No ratings yet
Lecture 5 (Managing and Understanding Data)
9 pages
R Practicals
No ratings yet
R Practicals
32 pages
Useful R Functions-1
No ratings yet
Useful R Functions-1
4 pages
R Code
No ratings yet
R Code
9 pages
D.A Lab Assignment-07: Input
No ratings yet
D.A Lab Assignment-07: Input
15 pages
Scientific Data Visualization: Using Ggplot2
No ratings yet
Scientific Data Visualization: Using Ggplot2
53 pages
IntroR 2
No ratings yet
IntroR 2
18 pages
Da Lab File 2
No ratings yet
Da Lab File 2
13 pages
R Programming-1
No ratings yet
R Programming-1
6 pages
Experiment 2
No ratings yet
Experiment 2
7 pages
BDA MSC It
No ratings yet
BDA MSC It
35 pages
R - Lecture #2
No ratings yet
R - Lecture #2
21 pages
Assignment# 06
No ratings yet
Assignment# 06
16 pages
Mann-Whitney U Test: Theoretical Discussion and Sample Problem
No ratings yet
Mann-Whitney U Test: Theoretical Discussion and Sample Problem
14 pages
Pool
No ratings yet
Pool
13 pages
AMDA Practical - A048
No ratings yet
AMDA Practical - A048
35 pages
R File Code
No ratings yet
R File Code
16 pages
Experiment 5
No ratings yet
Experiment 5
13 pages
R Console
No ratings yet
R Console
6 pages
R Commands
No ratings yet
R Commands
18 pages
Lec 13
No ratings yet
Lec 13
46 pages
R
No ratings yet
R
6 pages
BDA Assignment Aman 19019
No ratings yet
BDA Assignment Aman 19019
38 pages
All Values in The First Column
No ratings yet
All Values in The First Column
7 pages
Simulating Multivariate Structures
No ratings yet
Simulating Multivariate Structures
3 pages
An Introduction To The Psych Package: Part I: Data Entry and Data Description
No ratings yet
An Introduction To The Psych Package: Part I: Data Entry and Data Description
63 pages
R Course
No ratings yet
R Course
7 pages
Simple and Multiple Linear Regression and Correlation
No ratings yet
Simple and Multiple Linear Regression and Correlation
41 pages
Bayesian Tutorial
83% (6)
Bayesian Tutorial
76 pages
Logistic Regression Assignment
No ratings yet
Logistic Regression Assignment
20 pages
Abadie2010 - Synthetic Control Methods
No ratings yet
Abadie2010 - Synthetic Control Methods
14 pages
R Script Module 3
No ratings yet
R Script Module 3
6 pages
Introduction To Quantitative Analysis. Leonardo D. Villamil. HW2 09/26/2016
No ratings yet
Introduction To Quantitative Analysis. Leonardo D. Villamil. HW2 09/26/2016
7 pages
R Syntax Examples 1
No ratings yet
R Syntax Examples 1
6 pages
GPower Manual
No ratings yet
GPower Manual
85 pages
Model Misspecification: Gerda Claeskens
No ratings yet
Model Misspecification: Gerda Claeskens
21 pages
R Commands: Appendix B
No ratings yet
R Commands: Appendix B
5 pages
Chap013 Test Bank
No ratings yet
Chap013 Test Bank
7 pages
Basics: TH TH TH TH TH TH TH
No ratings yet
Basics: TH TH TH TH TH TH TH
3 pages
STAT-2450 Assignment 1: Name:, Student ID: B00
No ratings yet
STAT-2450 Assignment 1: Name:, Student ID: B00
9 pages
（去）政黨標籤化及其影響：以村里長選舉為例
No ratings yet
（去）政黨標籤化及其影響：以村里長選舉為例
58 pages
A Short List of The Most Useful R Commands
No ratings yet
A Short List of The Most Useful R Commands
11 pages
A Short List of The Most Useful R Commands
No ratings yet
A Short List of The Most Useful R Commands
8 pages
Statistical Test Assumptions
No ratings yet
Statistical Test Assumptions
28 pages
RatioandRegressionMethodofEstimation Lecture7 10
No ratings yet
RatioandRegressionMethodofEstimation Lecture7 10
30 pages
52
No ratings yet
52
81 pages
Business Statistics, 4e: by Ken Black
No ratings yet
Business Statistics, 4e: by Ken Black
53 pages
Chapter 03 Correlation and Regression
No ratings yet
Chapter 03 Correlation and Regression
21 pages
Notes - Module 4
No ratings yet
Notes - Module 4
17 pages
A Regression Analysis Investigating The Relationship Between Income and Happiness
No ratings yet
A Regression Analysis Investigating The Relationship Between Income and Happiness
7 pages
STA 114 Test 2 30th April 2022 - Question Paper
No ratings yet
STA 114 Test 2 30th April 2022 - Question Paper
5 pages
126 Saena
No ratings yet
126 Saena
22 pages
Tests For The Ratio of Two Poisson Rates
No ratings yet
Tests For The Ratio of Two Poisson Rates
15 pages
IFS Course Handout
No ratings yet
IFS Course Handout
3 pages
Outsourcing in Banking
No ratings yet
Outsourcing in Banking
10 pages
Quality Kitchens Meat Loaf Mix: Team 8
No ratings yet
Quality Kitchens Meat Loaf Mix: Team 8
7 pages
Hypothesis Testing of OLS Unit 1
No ratings yet
Hypothesis Testing of OLS Unit 1
3 pages
One Way Anova: SUM 175 195 153 Mean
No ratings yet
One Way Anova: SUM 175 195 153 Mean
4 pages
ISEE 760 Syllabus Fall 2021
No ratings yet
ISEE 760 Syllabus Fall 2021
14 pages
Sta 316 Cat2 PDF
No ratings yet
Sta 316 Cat2 PDF
2 pages
ECMT1010 Final Formula
No ratings yet
ECMT1010 Final Formula
2 pages
Review-Validation of QSAR Models-Strategies and Importance
No ratings yet
Review-Validation of QSAR Models-Strategies and Importance
9 pages
Simple Tutorial in R
No ratings yet
Simple Tutorial in R
15 pages
Business Analytics: Team 7
No ratings yet
Business Analytics: Team 7
7 pages
Forecasting For Asian Paints
No ratings yet
Forecasting For Asian Paints
3 pages
Regression Analysis With Cross-Sectional Data
No ratings yet
Regression Analysis With Cross-Sectional Data
0 pages
English Prepositions Test
From Everand
English Prepositions Test
Więckowski Radosław
No ratings yet
Advanced Cue Ball Control Self-Testing Program
From Everand
Advanced Cue Ball Control Self-Testing Program
Allan P. Sand
No ratings yet

Videos and Tutorials On Data Analysis in The Psychometrics Lab

Uploaded by

Videos and Tutorials On Data Analysis in The Psychometrics Lab

Uploaded by

Psychometrics Lab: Data Analysis

Export data and clean it

Export data and clean it

Install necessary packages

install.packages(c('tidyverse', 'psyntur', 'car', 'lm.beta'))

Start a script, load the packages

Read in your data

lab_data <- read_csv("psychometric_lab_data.csv")

psymetr_df <- read_csv("http://data.ntupsychology.net/psychometrics_demo_data.csv")

Look at your data

Recode the negatively key items

anxiety: values 0-4; neg keyed = 6, 7, 8, 9, 10

For your scales, you must get this information too.

psymetr_df_fix <- mutate(psymetr_df,

Calculate Cronbach’s alpha for each scale

Calculate the aggregate scores

psymetr_df_total <- total_scores(psymetr_df_fix,

Inter correlation matrix

## anxiety depression efficacy sociability stress

cor(psymetr_df_total, use = 'complete.obs')

## anxiety depression efficacy sociability stress

Remember to use the data frame with the total scores.

model <- lm(stress ~ anxiety + depression + efficacy + sociability, data = psymetr_df_total)

The summary will give you the

All of these are expected to be reported in your analysis report.

## anxiety depression efficacy sociability

model_standardized <- lm.beta(model)

You might also like