"Cps - TXT" "Education" "South" "SEX" "Experience" "Union" "WAGE" "AGE" "RACE" "Occupation" "Sector" "MARR"

This document demonstrates using R to analyze a dataset called cps. It loads packages, prepares the data, and answers several questions involving exploring and visualizing relationships in the data. For question 1, it selects columns, calculates correlations, filters rows, and calculates means. For question 2, it creates scatter plots and density plots of variables, recodes factors, and makes box plots. Question 3 involves applying functions like mean, sd, and split to randomly generated data and calculating statistics within groups.

Uploaded by

Alper Tamay Arslan


library(magrittr)

library(dplyr)

##
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':

##
## filter, lag

## The following objects are masked from 'package:base':

##
## intersect, setdiff, setequal, union

cps = read.delim(file="cps.txt", header = FALSE)

colnames(cps) = c("EDUCATION","SOUTH","SEX","EXPERIENCE","UNION","WAGE",
                  "AGE","RACE","OCCUPATION","SECTOR","MARR")

#QUESTION 1
#PART A

cps1 = cps %>% select("SEX","EXPERIENCE","WAGE","AGE","MARR")

#PART B
cps1 %>% cor()

## SEX EXPERIENCE WAGE AGE MARR

## SEX 1.00000000 0.07522998 -0.20537055 0.07917859 0.01122521
## EXPERIENCE 0.07522998 1.00000000 0.08705953 0.97796125 0.27089961
## WAGE -0.20537055 0.08705953 1.00000000 0.17696688 0.10057887
## AGE 0.07917859 0.97796125 0.17696688 1.00000000 0.27894727
## MARR 0.01122521 0.27089961 0.10057887 0.27894727 1.00000000

#PART C
cps2 = cps %>% select(-"SOUTH",-"UNION",-"MARR") %>%
  filter(between(AGE, 30, 50), SECTOR==2)

I used the select function to drop the columns and the filter function to subset the
data frame.
#PART D
cps %>% mutate(New_Column = WAGE / AGE) %>% filter(New_Column > 0.25) %>% count()

## n
## 1 216

The mutate function generates a new variable, here by dividing one existing variable by
another. I could not work out how to print the output of count.
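One possible way to get the count as a plain number, rather than as a one-row data frame, is sketched below. The toy data frame and the column name WAGE_PER_AGE are made up for illustration; pull() is a dplyr function and nrow() is base R.

```r
library(dplyr)

# Hypothetical toy data standing in for cps (values are made up).
toy <- data.frame(WAGE = c(5, 10, 20, 8), AGE = c(40, 30, 50, 20))

high <- toy %>%
  mutate(WAGE_PER_AGE = WAGE / AGE) %>%  # new column from two existing ones
  filter(WAGE_PER_AGE > 0.25)            # keep rows above the threshold

n_from_count <- high %>% count() %>% pull(n)  # extract column n as a vector
n_from_nrow  <- nrow(high)                    # same number without count()

print(n_from_count)
```

Both approaches give the same number; nrow() is the shorter route when only the total is needed.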
#PART E
x=cps %>% filter(SEX==0)
tapply(x$WAGE , x$OCCUPATION , mean)

## 1 2 3 4 5 6
## 13.721765 9.495714 7.489048 7.226471 12.773962 9.068175

#or
cps %>%group_by(OCCUPATION)%>% filter(SEX==0) %>% summarise(mean(WAGE))

## `summarise()` ungrouping output (override with `.groups` argument)

## # A tibble: 6 x 2
## OCCUPATION `mean(WAGE)`
## <int> <dbl>
## 1 1 13.7
## 2 2 9.50
## 3 3 7.49
## 4 4 7.23
## 5 5 12.8
## 6 6 9.07

There are two solutions to this question. The first uses the tapply function, which applies
a function to one variable within the groups defined by another variable. The second does
the same with group_by and summarise.
#PART F
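table(cps$MARR, cps$SEX)

My first attempt, cps %>% table(cps$MARR, cps$SEX), worked at the beginning but later failed
with an error saying the lengths must be the same. The pipe passes cps as the first argument,
so the call becomes table(cps, cps$MARR, cps$SEX). The length of a data frame is its number
of columns, which does not match the length of the two column vectors. Calling table directly
on the two columns generates the contingency table for the two variables.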
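A small sketch of a two-way contingency table on made-up data follows. The toy values are hypothetical and are not taken from cps; dnn and addmargins() are part of base R.

```r
# Hypothetical toy data (made-up codes, not the cps values).
toy <- data.frame(MARR = c(0, 1, 1, 0, 1, 1),
                  SEX  = c(0, 0, 1, 1, 1, 0))

# Pass the two columns directly; dnn labels the table dimensions.
tab <- table(toy$MARR, toy$SEX, dnn = c("MARR", "SEX"))
print(tab)

# Row and column totals can be appended with addmargins().
print(addmargins(tab))
```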
#QUESTION 2
#PART A

library(ggplot2)
ggplot(cps, aes(x = AGE, y = WAGE)) +
  geom_point(aes(color = factor(SEX))) +
  scale_color_brewer(palette = "Dark2") +
  labs(title="Distribution of age and wage by sex")
The plot shows wage against age, with the points colored by sex. I used the factor function
so that the numeric SEX codes are treated as discrete categories rather than as a continuous
scale.
#PART B
ggplot(cps, aes(WAGE)) +
  geom_density(aes(color = factor(RACE)), size=2, linetype="dashed") +
  labs(title="The density plot of the wage by race")
The graph shows the density plot of the wage by race.
#PART C
cps$RACE[cps$RACE=="1"]="Other"
cps$RACE[cps$RACE=="2"]="Hispanic"
cps$RACE[cps$RACE=="3"]="White"
cps$OCCUPATION[cps$OCCUPATION=="1"]="Management"
cps$OCCUPATION[cps$OCCUPATION=="2"]="Sales"
cps$OCCUPATION[cps$OCCUPATION=="3"]="Clerical"
cps$OCCUPATION[cps$OCCUPATION=="4"]="Service"
cps$OCCUPATION[cps$OCCUPATION=="5"]="Professional"
cps$OCCUPATION[cps$OCCUPATION=="6"]="Other"
cps$RACE = factor(cps$RACE , levels=c("Other", "Hispanic", "White"))
cps$OCCUPATION= factor(cps$OCCUPATION,
levels=c("Management","Sales","Clerical","Service","Professional","Other"))

ggplot(cps, aes(x = RACE, y = WAGE, fill = RACE)) +
  geom_boxplot() +
  facet_wrap(~ OCCUPATION, ncol = 3) +
  ylab("WAGE ($/hour)")
While working on this question I noticed that I should have assigned the names to the
values. At the beginning the levels were not in the same order as in the homework PDF, so I
listed them again to get the same order.
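The renaming and reordering can also be done in a single call to factor(), as in the sketch below. The vector race_code is made up for illustration; the code-to-name mapping (1 = Other, 2 = Hispanic, 3 = White) follows the assignments above.

```r
# Hypothetical numeric codes, as they might appear in a RACE column.
race_code <- c(3, 1, 2, 3, 3)

# levels lists the codes in the desired order; labels names them one-to-one.
race <- factor(race_code,
               levels = c(1, 2, 3),
               labels = c("Other", "Hispanic", "White"))

print(levels(race))
print(table(race))
```

Because the level order is fixed when the factor is created, plots that use it keep that order on the axis and in the legend.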
#PART D
male_hist=ggplot(cps[cps$SEX==0,], aes(x=AGE ))+
geom_histogram(fill="blue",bins=200) + labs(title="Ages of Males")
male_hist
female_hist=ggplot(cps[cps$SEX==1,], aes(x=AGE ))+
geom_histogram(fill="pink",bins=200) + labs(title="Ages of Females")
female_hist
gridExtra::grid.arrange(male_hist,female_hist,ncol=2)

Ages of males and ages of females in the same window.

#PART E
library(ggmosaic)

## Warning: package 'ggmosaic' was built under R version 4.0.5

ggplot(data = cps) +
geom_mosaic(aes(x = product(RACE), fill = OCCUPATION)) +
labs(title="Relationship between occupation and race")
Mosaic plot of relationship between occupation and race variables.
#QUESTION 3
#PART A
set.seed(124)
X = rnorm(1000)
Y = rnorm(50,10,2)
Z = runif(200,-5,20)
obj1=list(X,Y,Z)
lapply(obj1,mean)

## [[1]]
## [1] -0.0653552
##
## [[2]]
## [1] 10.58706
##
## [[3]]
## [1] 6.639656

rnorm draws from a normal distribution and runif from a uniform distribution.
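As a rough check, the sample means can be set next to the theoretical means of the three distributions: 0 for the standard normal, 10 for the normal with mean 10, and (-5 + 20) / 2 = 7.5 for the uniform. The sketch below uses sapply instead of lapply to get a named vector, with an arbitrary seed and made-up object names, so its numbers will differ from those above.

```r
set.seed(1)  # arbitrary seed for this sketch, not the one used above

samples <- list(
  X = rnorm(1000),        # standard normal: mean 0, sd 1
  Y = rnorm(50, 10, 2),   # normal with mean 10, sd 2
  Z = runif(200, -5, 20)  # uniform on [-5, 20]
)

sample_means <- sapply(samples, mean)  # named numeric vector
theory_means <- c(X = 0, Y = 10, Z = (-5 + 20) / 2)

print(round(cbind(sample_means, theory_means), 3))
```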

#PART B
obj2=matrix(X,nrow=50,ncol=20)
apply(obj2, 2, sd)
## [1] 0.7978615 0.9672100 0.9389446 0.8773311 1.1407254 0.9078230 1.0698188
## [8] 0.9932702 0.9363832 0.9563979 0.8518203 0.8733509 1.0939566 0.9817683
## [15] 1.0541420 1.1166804 0.9214657 1.1430706 0.9734228 0.9550968
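The second argument of apply selects the margin: 1 works row by row and 2 column by column. Since matrix() fills column by column by default, the first column of obj2 holds the first 50 values of X. A minimal sketch on a small made-up matrix, where the results are easy to verify by hand:

```r
m <- matrix(1:6, nrow = 3, ncol = 2)  # filled column by column

col_sd  <- apply(m, 2, sd)  # MARGIN = 2: one value per column
row_sd  <- apply(m, 1, sd)  # MARGIN = 1: one value per row
col_avg <- colMeans(m)      # dedicated shortcut for column means

print(col_sd)
print(row_sd)
print(col_avg)
```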

#PART C
h=LETTERS[1:4]
g=rep(h,50)
obj3=data.frame(Z,g)
mean(obj3$Z[which(obj3$g=="A")])

## [1] 6.839135

mean(obj3$Z[which(obj3$g=="B")])

## [1] 6.986212

mean(obj3$Z[which(obj3$g=="C")])

## [1] 5.560432

mean(obj3$Z[which(obj3$g=="D")])

## [1] 7.172845

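I tried tapply(obj3$Z, obj3$g["A"], mean), but it did not work. The index obj3$g["A"] looks
up an element named "A" rather than the rows where g equals "A", so it does not supply one
group label for each value of Z.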
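The four group means can also be computed in one call by passing the whole grouping column to tapply and then selecting a group by name. This is a sketch with an arbitrary seed and made-up object names (z, g, d), so the values will not match the ones printed above.

```r
set.seed(1)  # arbitrary seed for this sketch

z <- runif(200, -5, 20)
g <- rep(LETTERS[1:4], 50)  # A, B, C, D repeated 50 times
d <- data.frame(z, g)

group_means <- tapply(d$z, d$g, mean)  # one mean per level of g
print(group_means)

# A single group can then be picked out by name.
mean_A <- unname(group_means["A"])
```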

#PART D

j=split(Y,gl(5,10))
matrix(lapply(j,max))

## [,1]
## [1,] 12.63274
## [2,] 13.20395
## [3,] 14.82519
## [4,] 15.30867
## [5,] 14.68656

matrix(lapply(j,min))

## [,1]
## [1,] 5.565061
## [2,] 9.320027
## [3,] 7.280937
## [4,] 4.97121
## [5,] 7.33229
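Wrapping lapply in matrix() gives a list stored in matrix form. If plain numeric vectors are wanted instead, vapply is one alternative. The sketch below uses stand-in data (the numbers 1 to 50, not Y) so that the result is easy to check by eye.

```r
y <- as.numeric(1:50)          # stand-in data, not the Y from above
groups <- split(y, gl(5, 10))  # gl(5, 10): 5 levels, each repeated 10 times

group_max <- vapply(groups, max, numeric(1))  # one number per group
group_min <- vapply(groups, min, numeric(1))

print(cbind(max = group_max, min = group_min))
```

Here gl(5, 10) assigns the first ten values to group 1, the next ten to group 2, and so on, which is why split yields five blocks of ten consecutive values.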
