
Class 3

The document reads employee data from a CSV file, examines the data frame's structure and variables, and performs several aggregations on the data. Key points:
- Employee data is read from a CSV file with variables such as name, gender, salary, and interviewer ratings.
- Summary statistics are calculated for variables such as salary, experience, and ratings to understand their distributions.
- Average salary is aggregated by gender, by designation, and by their combination.
- Median experience is aggregated by gender and by gender and designation to understand differences.
- Selected variables are averaged by gender, designation, and minority status, and the result is written to a new CSV file.

Uploaded by

Jaan Mukherjee

class3.R

apple

2023-09-05
emp <- read.csv("~/Desktop/Cares/RIMS/RDA/emp.csv", stringsAsFactors=TRUE)
head(emp)

##   id               Names gender        DOB educ Designation Level salary
## 1  1     Dr.Liam Johnson   Male 1986-11-24   PG         MLM   III  57000
## 2  2       Mr.Noah Smith   Male 1963-06-23   PG         MLM     I  40200
## 3  3 Mr.William Williams   Male 1991-12-23   PG         ELM    II  32100
## 4  4      Dr.James Brown   Male 1998-04-07   PG         ELM   III  36000
## 5  7     Mr.Oliver Jones   Male 1996-03-28   UG         ELM     I  27300
## 6 11   Dr.Benjamin Davis   Male 1987-01-06   PG         ELM    II  31050
##   Last.drawn.salary PRE..EXP minority RATINGS2.BY.INTERVIEWER
## 1             27000       10       No                       5
## 2             18750       20       No                       5
## 3             13500       10       No                       4
## 4             18750        3       No                       5
## 5             13500        3       No                       8
## 6             12600       10       No                       4
##   RATINGS1.BY.INTERVIEWER RATINGS3.BY.INTERVIEWER RATINGS4.BY.INTERVIEWER
## 1                       5                       7                       9
## 2                       3                       4                       9
## 3                       4                      10                       7
## 4                       5                       9                       1
## 5                       7                       1                       8
## 6                       7                       1                       2

names(emp)

##  [1] "id"                      "Names"
##  [3] "gender"                  "DOB"
##  [5] "educ"                    "Designation"
##  [7] "Level"                   "salary"
##  [9] "Last.drawn.salary"       "PRE..EXP"
## [11] "minority"                "RATINGS2.BY.INTERVIEWER"
## [13] "RATINGS1.BY.INTERVIEWER" "RATINGS3.BY.INTERVIEWER"
## [15] "RATINGS4.BY.INTERVIEWER"
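
The dotted names above (for example `PRE..EXP`) are consistent with `read.csv` running its default `check.names = TRUE`, which passes each header through `make.names` and replaces characters that are not valid in an R name with a dot. The sketch below illustrates that conversion. The raw headers are hypothetical, since the header row of `emp.csv` is not shown in the source.

```r
# Hypothetical raw headers; the real header row of emp.csv is not shown.
raw_headers <- c("Last drawn salary", "PRE. EXP", "RATINGS2 BY INTERVIEWER")

# make.names() is what read.csv() applies when check.names = TRUE (the default).
# Each space becomes a dot, so "PRE. EXP" ends up with two consecutive dots.
clean_headers <- make.names(raw_headers)
print(clean_headers)
```

Passing `check.names = FALSE` to `read.csv` would keep the headers as written, at the cost of needing backticks to refer to such columns.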

summary(emp)

##        id                      Names        gender           DOB       educ
##  Min.   :  1.00   Dr. Abdul Rahman:  1   Female:53   1963-01-13:  1   HS:16
##  1st Qu.: 35.00   Dr. Amir Khan   :  1   Male  :80   1963-02-05:  1   PG:69
##  Median : 68.00   Dr. Anusha Patel:  1               1963-06-23:  1   UG:48
##  Mean   : 67.95   Dr. Arjun Patel :  1               1964-03-02:  1
##  3rd Qu.:101.00   Dr. Cheng Wang  :  1               1964-08-23:  1
##  Max.   :134.00   Dr. Faisal Ahmed:  1               1964-11-20:  1
##                   (Other)         :127               (Other)   :127
##  Designation Level        salary       Last.drawn.salary    PRE..EXP
##  ELAM:18     I  :46   Min.   : 20400   Min.   :10950     Min.   : 2.00
##  ELM :75     II :48   1st Qu.: 25950   1st Qu.:13500     1st Qu.: 4.00
##  MLM :21     III:39   Median : 32550   Median :15750     Median : 8.00
##  TLM :19              Mean   : 41405   Mean   :19582     Mean   :11.01
##                       3rd Qu.: 55750   3rd Qu.:21750     3rd Qu.:18.00
##                       Max.   :135000   Max.   :79980     Max.   :34.50
##
##  minority  RATINGS2.BY.INTERVIEWER RATINGS1.BY.INTERVIEWER
##  No :111   Min.   : 1.00           Min.   : 1.000
##  Yes: 22   1st Qu.: 3.00           1st Qu.: 3.000
##            Median : 5.00           Median : 5.000
##            Mean   : 5.15           Mean   : 5.624
##            3rd Qu.: 7.00           3rd Qu.: 8.000
##            Max.   :10.00           Max.   :10.000
##
##  RATINGS3.BY.INTERVIEWER RATINGS4.BY.INTERVIEWER
##  Min.   : 1.000          Min.   : 1.000
##  1st Qu.: 3.000          1st Qu.: 3.000
##  Median : 6.000          Median : 5.000
##  Mean   : 5.729          Mean   : 5.353
##  3rd Qu.: 8.000          3rd Qu.: 8.000
##  Max.   :10.000          Max.   :10.000
##
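
Because the file was read with `stringsAsFactors=TRUE`, `Names` and `DOB` are summarised as factors with one level per employee, which is why the summary lists individual values with a count of 1. A minimal sketch of one way to get a more useful summary is to convert those columns after reading. It uses a small made-up data frame (`toy` and its values are illustrative, not the real `emp` data) and assumes dates are stored as `YYYY-MM-DD`, as in the `head(emp)` output.

```r
# Made-up stand-in for emp; values are illustrative only.
toy <- data.frame(
  Names  = c("Dr. A", "Mr. B", "Ms. C"),
  DOB    = c("1986-11-24", "1963-06-23", "1991-12-23"),
  gender = c("Male", "Male", "Female"),
  stringsAsFactors = TRUE
)

# Names are identifiers, not categories: store them as character.
toy$Names <- as.character(toy$Names)

# Convert the factor to text first, then parse it as a Date.
toy$DOB <- as.Date(as.character(toy$DOB))

str(toy)
summary(toy$DOB)   # now reports min, median, max dates instead of level counts
```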

### aggregate

aggregate(salary ~ gender, data = emp, FUN = mean)

##   gender   salary
## 1 Female 28141.98
## 2   Male 50191.25
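
To make the formula interface concrete: `salary ~ gender` means "summarise `salary` within each level of `gender`". The sketch below runs the same kind of call on a small made-up data frame (`toy` is illustrative, not the real `emp` data) and cross-checks it against `tapply`, which computes the same group means but returns a named array rather than a data frame.

```r
# Made-up data; values are illustrative only.
toy <- data.frame(
  gender = factor(c("Female", "Female", "Male", "Male", "Male")),
  salary = c(20000, 30000, 40000, 50000, 60000)
)

# Formula interface: one row per gender, with the mean salary.
by_formula <- aggregate(salary ~ gender, data = toy, FUN = mean)

# Same computation with tapply(); the result is a named array.
by_tapply <- tapply(toy$salary, toy$gender, mean)

print(by_formula)
print(by_tapply)
```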

aggregate(salary ~ Designation, data = emp, FUN = mean)

##   Designation   salary
## 1        ELAM 21633.33
## 2         ELM 30310.00
## 3         MLM 56419.05
## 4         TLM 87335.53

aggregate(salary ~ gender + Designation, data = emp, FUN = mean)

##   gender Designation   salary
## 1 Female        ELAM 21460.00
## 2   Male        ELAM 22500.00
## 3 Female         ELM 28542.86
## 4   Male         ELM 31856.25
## 5 Female         MLM 56875.00
## 6   Male         MLM 56343.06
## 7   Male         TLM 87335.53
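
The table above has seven rows rather than eight because there is no Female/TLM row: `aggregate` drops combinations of grouping levels that have no observations. A quick way to see which cells are empty, and how many employees sit behind each mean, is a cross-tabulation. The sketch below shows the idea on a small made-up data frame (`toy` is illustrative, not the real `emp` data).

```r
# Made-up data with no Female/TLM observations; values are illustrative only.
toy <- data.frame(
  gender      = factor(c("Female", "Male", "Male", "Female", "Male")),
  Designation = factor(c("ELM", "ELM", "TLM", "MLM", "MLM")),
  salary      = c(28000, 32000, 87000, 57000, 56000)
)

# Only the combinations that actually occur get a row.
agg <- aggregate(salary ~ gender + Designation, data = toy, FUN = mean)
print(agg)

# The cross-tabulation shows every cell, including the empty one.
counts <- table(toy$gender, toy$Designation)
print(counts)
```

Group means based on very few observations are worth reading with caution, so checking the counts alongside the means is a reasonable habit.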

aggregate(PRE..EXP ~ gender, data = emp, FUN = median)

##   gender PRE..EXP
## 1 Female        4
## 2   Male       10

aggregate(PRE..EXP ~ gender + Designation, data = emp, FUN = median)

##   gender Designation PRE..EXP
## 1 Female        ELAM        3
## 2   Male        ELAM        4
## 3 Female         ELM        4
## 4   Male         ELM        8
## 5 Female         MLM       16
## 6   Male         MLM       17
## 7   Male         TLM       25
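
The `summary(emp)` output reports a mean `PRE..EXP` of 11.01 against a median of 8, so the experience distribution is right-skewed, which is a plausible reason for aggregating it with `median`. When both statistics are of interest, `FUN` can return a named vector so that one call reports them side by side. The sketch below shows this on a small made-up data frame (`toy` is illustrative, not the real `emp` data).

```r
# Made-up data; the Female group contains one large value to create skew.
toy <- data.frame(
  gender   = factor(c("Female", "Female", "Female", "Male", "Male", "Male")),
  PRE..EXP = c(2, 4, 30, 8, 10, 12)
)

# FUN returns two named values, so the PRE..EXP column of the result
# is a two-column matrix with columns "mean" and "median".
agg <- aggregate(
  PRE..EXP ~ gender,
  data = toy,
  FUN = function(x) c(mean = mean(x), median = median(x))
)
print(agg)
```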

# Keep the grouping columns (gender, Designation, minority) and the numeric columns
emp_cont <- emp[, c(3, 6, 8:15)]
names(emp_cont)

##  [1] "gender"                  "Designation"
##  [3] "salary"                  "Last.drawn.salary"
##  [5] "PRE..EXP"                "minority"
##  [7] "RATINGS2.BY.INTERVIEWER" "RATINGS1.BY.INTERVIEWER"
##  [9] "RATINGS3.BY.INTERVIEWER" "RATINGS4.BY.INTERVIEWER"
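
Selecting columns by position works here but silently picks the wrong columns if the CSV layout changes. A hedged alternative is to select by name, using `grep` to collect the ratings columns by their shared prefix. The sketch below shows the idea on a small made-up data frame (`toy` and its column set are illustrative, not the full `emp` data).

```r
# Made-up data with a subset of emp-like columns; values are illustrative only.
toy <- data.frame(
  id                      = 1:2,
  gender                  = c("Male", "Female"),
  salary                  = c(57000, 40200),
  RATINGS2.BY.INTERVIEWER = c(5, 5),
  RATINGS1.BY.INTERVIEWER = c(5, 3)
)

# Collect every column whose name starts with "RATINGS".
rating_cols <- grep("^RATINGS", names(toy), value = TRUE)

# Select by name instead of by position.
keep     <- c("gender", "salary", rating_cols)
by_name  <- toy[, keep]
by_index <- toy[, 2:5]   # positional equivalent for this toy layout

print(names(by_name))
```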

# Average every remaining column within each gender x Designation x minority group
ag1 <- aggregate(. ~ gender + Designation + minority, data = emp_cont, FUN = mean)
write.csv(ag1, "output_ag1.csv")
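
In the call above, the `.` on the left of the formula stands for every column of `emp_cont` that is not named on the right, which is why the data frame was first reduced to the grouping and numeric columns. Note also that `write.csv` writes row names as an unnamed first column by default. The sketch below shows the same pattern on a small made-up data frame (`toy` is illustrative, not the real data), writing to a temporary file with `row.names = FALSE` and reading the result back.

```r
# Made-up data; values are illustrative only.
toy <- data.frame(
  gender   = factor(c("Female", "Female", "Male", "Male")),
  minority = factor(c("No", "Yes", "No", "No")),
  salary   = c(20000, 30000, 40000, 50000),
  PRE..EXP = c(2, 4, 8, 10)
)

# "." expands to salary and PRE..EXP, the columns not used for grouping.
ag <- aggregate(. ~ gender + minority, data = toy, FUN = mean)
print(ag)

# Write without the row-name column, then read back to check the layout.
out_file <- tempfile(fileext = ".csv")
write.csv(ag, out_file, row.names = FALSE)
back <- read.csv(out_file, stringsAsFactors = TRUE)
print(names(back))
```

Whether to drop the row names is a judgment call. The original call keeps the default, so `output_ag1.csv` will contain an extra leading index column.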
