Business Analytics-1: STR (Crew - Data)

The document describes analyzing a dataset containing employee data. It lists the categorical and numeric variables, describes the numeric variable salary using descriptive statistics like mean, median, standard deviation and variance. It also counts the number of groups in the categorical variable "Job code" and enumerates functions used to analyze both the categorical variable "Job code" and numeric variable "salary". Functions like count, group_by, summarise, mean and table are used.

Uploaded by

Nikhil Malhotra

Business Analytics- 1

1) List the categorical and numeric variables of the data set

Ans:
A. Categorical variables:
1. Hire.date
2. Lastname
3. Firstname
4. Location
5. Phone (stored as an integer, but it is an identifier rather than a quantity)
6. EmpId
7. Job.code
B. Numeric variable:
1. Salary
Output:
>str(Crew.data)
'data.frame': 69 obs. of 8 variables:
 $ Hire.date: Factor w/ 69 levels "1-Jul-87","1-Mar-90",..: 35 50 3 16 27 36 62 60 24 17 ...
 $ Lastname : Factor w/ 69 levels "BEAUMONT","BERGAMASCO",..: 21 35 69 19 41 18 42 64 67 9 ...
 $ Firstname: Factor w/ 69 levels "ANITA M.","ANNETTE M.",..: 30 29 24 58 54 26 68 39 59 37 ...
 $ Location : Factor w/ 3 levels "CARY","FRANKFURT",..: 1 2 3 1 3 2 3 2 2 3 ...
 $ Phne     : int 1168 2164 1565 1157 2360 1595 2366 1197 1553 1369 ...
 $ EmpId    : Factor w/ 69 levels "E00034","E00084",..: 53 36 49 46 31 4 25 29 41 18 ...
 $ Job.code : Factor w/ 6 levels "FLTAT1","FLTAT2",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Salary   : int 21000 22000 22000 23000 24000 25000 25000 26000 27000 28000 ...
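
The classification above can also be checked programmatically. The following is a minimal sketch, assuming a small made-up data frame called crew that only borrows a few column names from Crew.data; the rows are invented for illustration and are not taken from the real file.

```r
# Hypothetical miniature of Crew.data: column names follow the assignment,
# but the rows are invented for illustration only
crew <- data.frame(
  Location = factor(c("CARY", "FRANKFURT", "CARY", "FRANKFURT")),
  Job.code = factor(c("FLTAT1", "PILOT1", "FLTAT1", "PILOT2")),
  Phone    = c(1168L, 2164L, 1565L, 1157L),
  Salary   = c(21000L, 69000L, 22000L, 80000L)
)

# Columns that R stores as numbers (this includes Phone)
stored_numeric <- names(crew)[sapply(crew, is.numeric)]
print(stored_numeric)

# Phone is an identifier, so convert it to a factor before classifying
crew$Phone <- factor(crew$Phone)
numeric_vars     <- names(crew)[sapply(crew, is.numeric)]
categorical_vars <- names(crew)[sapply(crew, is.factor)]
print(numeric_vars)
print(categorical_vars)
```

The storage type reported by str() is only a starting point: a column such as a phone extension is held as an integer, yet averaging it would be meaningless.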
2) Describe the numeric variable using descriptive techniques
Ans:

a) Summary
Output:
>summary(Crew.data$Salary)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  21000   33000   42000   52145   73000  112000

b) Mean
Output:
>mean(Crew.data$Salary)
[1] 52144.93

c) Median
Output:
>median(Crew.data$Salary)
[1] 42000

d) Standard Deviation
Output:
>sd(Crew.data$Salary)
[1] 25521.78

e) Variance
Output:
>var(Crew.data$Salary)
[1] 651361040
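
To show what mean(), var() and sd() compute, here is a minimal sketch that rebuilds them by hand. It assumes only the first ten Salary values visible in the str() output above, so its results differ from the figures for all 69 rows; the names salary, manual_mean, manual_var and manual_sd are illustrative.

```r
# First ten Salary values as printed by str(Crew.data); not the full column
salary <- c(21000, 22000, 22000, 23000, 24000,
            25000, 25000, 26000, 27000, 28000)
n <- length(salary)

# Mean: sum of the values divided by their count
manual_mean <- sum(salary) / n

# Sample variance: squared deviations from the mean, divided by n - 1
manual_var <- sum((salary - manual_mean)^2) / (n - 1)

# Standard deviation: square root of the variance
manual_sd <- sqrt(manual_var)

# The built-in functions give the same results
print(c(manual_mean, mean(salary)))
print(c(manual_var, var(salary)))
print(c(manual_sd, sd(salary)))
```

The same relationship holds for the full data: the reported variance is the square of the reported standard deviation, up to rounding of the printed values.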

3) How many groups does the variable “Job code” contain?

Ans: There are 6 groups (categories).

O/p 1: Using the dplyr count function:

>Crew.data%>%count(Job.code)
# A tibble: 6 x 2
  Job.code     n
  <fct>    <int>
1 FLTAT1      14
2 FLTAT2      18
3 FLTAT3      12
4 PILOT1       8
5 PILOT2       9
6 PILOT3       8

O/p 2: Using the group_by function:

>Crew.data%>%group_by(Job.code)%>%summarise(count=n())
# A tibble: 6 x 2
  Job.code count
  <fct>    <int>
1 FLTAT1      14
2 FLTAT2      18
3 FLTAT3      12
4 PILOT1       8
5 PILOT2       9
6 PILOT3       8

4) Enumerate all functions explained in the video for “Job code”

Ans:
a) Count function
Output:
>Crew.data%>%count(Job.code)
# A tibble: 6 x 2
  Job.code     n
  <fct>    <int>
1 FLTAT1      14
2 FLTAT2      18
3 FLTAT3      12
4 PILOT1       8
5 PILOT2       9
6 PILOT3       8

b) Group by function:

O/p

>Crew.data%>%group_by(Job.code)%>%summarise(count=n())
# A tibble: 6 x 2
  Job.code count
  <fct>    <int>
1 FLTAT1      14
2 FLTAT2      18
3 FLTAT3      12
4 PILOT1       8
5 PILOT2       9
6 PILOT3       8

c) Table function:

Output
>table(Crew.data$Job.code)

FLTAT1 FLTAT2 FLTAT3 PILOT1 PILOT2 PILOT3
    14     18     12      8      9      8
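
The three approaches above return the same counts. As a minimal base-R sketch, the vector job_code below is rebuilt from the counts reported in this answer (it is a stand-in, not the original Crew.data column), and the number of groups is read off directly.

```r
# Stand-in for Crew.data$Job.code, rebuilt from the counts reported above
job_code <- factor(rep(
  c("FLTAT1", "FLTAT2", "FLTAT3", "PILOT1", "PILOT2", "PILOT3"),
  times = c(14, 18, 12, 8, 9, 8)
))

# Frequency table, as produced by table(Crew.data$Job.code)
counts <- table(job_code)
print(counts)

# Number of groups: one per factor level
n_groups <- nlevels(job_code)
print(n_groups)

# The same counts as a data frame with a named count column
counts_df <- as.data.frame(counts, responseName = "count")
print(counts_df)
```

The counts add up to 69, which matches the number of observations reported by str(Crew.data).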

5) Enumerate all functions explained in the video for “salary”

Ans:
a) Mean Salary:

Output:

>mean(Crew.data$Salary)
[1] 52144.93

b) Standard Deviation in Salary:

Output:

>sd(Crew.data$Salary)
[1] 25521.78

c) Variance

Output:

>var(Crew.data$Salary)
[1] 651361040

d) Summary

Output:

>summary(Crew.data$Salary)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  21000   33000   42000   52145   73000  112000

e) Median Salary:

Output:

>median(Crew.data$Salary)
[1] 42000

f) Mean salary by job code:

Output:

>Crew.data%>%group_by(Job.code)%>%summarise(mean(Salary))
# A tibble: 6 x 2
  Job.code `mean(Salary)`
  <fct>             <dbl>
1 FLTAT1           25643.
2 FLTAT2           35111.
3 FLTAT3           44250
4 PILOT1           69500
5 PILOT2           80111.
6 PILOT3           99875
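
A grouped summary can return several statistics at once and give each column a readable name. This is a minimal sketch, assuming the dplyr package is installed and using a few invented rows in a data frame called crew; only the column names Job.code and Salary come from the assignment.

```r
library(dplyr)

# Hypothetical rows: only the column names follow Crew.data
crew <- data.frame(
  Job.code = c("FLTAT1", "FLTAT1", "PILOT1", "PILOT1", "PILOT1"),
  Salary   = c(21000, 23000, 60000, 70000, 80000)
)

# One row per job code, with a count, a mean and a standard deviation
salary_by_job <- crew %>%
  group_by(Job.code) %>%
  summarise(
    n           = n(),
    mean_salary = mean(Salary),
    sd_salary   = sd(Salary)
  )
print(salary_by_job)
```

Naming the summary (mean_salary = mean(Salary)) avoids the back-quoted `mean(Salary)` column header seen in the output above.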

Question 2:

1) Enumerate all functions explained in the video for all categorical and
numerical variables of the data set.
Ans:

Although str() reports all 11 variables of mtcars as numeric, five of them take only a few distinct values and are treated here as categorical.

Categorical variables: cyl, vs, am, gear and carb (5)
Numeric variables: mpg, disp, hp, drat, wt and qsec (6)

Output:
str(mtcars)
'data.frame': 32 obs. of 11 variables:
$ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
$ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
$ disp: num 160 160 108 258 360 ...
$ hp : num 110 110 93 110 175 105 245 62 95 123 ...
$ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
$ wt : num 2.62 2.88 2.32 3.21 3.44 ...
$ qsec: num 16.5 17 18.6 19.4 17 ...
$ vs : num 0 0 1 1 0 1 0 1 1 1 ...
$ am : num 1 1 1 0 0 0 0 0 0 0 ...
$ gear: num 4 4 4 3 3 3 3 4 4 4 ...
$ carb: num 4 4 1 1 2 1 4 2 2 4 ...
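
Because mtcars stores every column as a number, the split described above has to be declared by hand. The following is a minimal sketch that converts the five categorical columns to factors in a copy of the data; the names cat_vars, cars and num_vars are illustrative.

```r
# Columns treated as categorical in this answer
cat_vars <- c("cyl", "vs", "am", "gear", "carb")

# Work on a copy so the built-in mtcars data set stays unchanged
cars <- mtcars
cars[cat_vars] <- lapply(cars[cat_vars], factor)

# str() now reports five factors and six numeric columns
str(cars)

# The remaining numeric columns
num_vars <- names(cars)[sapply(cars, is.numeric)]
print(num_vars)
```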

Numeric Variables:
1) mpg:
Mean:
>mean(mtcars$mpg)
[1] 20.09062

Median:
>median(mtcars$mpg)
[1] 19.2

Standard deviation:
>sd(mtcars$mpg)
[1] 6.026948
Variance:
>var(mtcars$mpg)
[1] 36.3241

Summary:
>summary(mtcars$mpg)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  10.40   15.43   19.20   20.09   22.80   33.90

2) disp

Mean:
>mean(mtcars$disp)
[1] 230.7219

Median:
>median(mtcars$disp)
[1] 196.3

Standard deviation:
>sd(mtcars$disp)
[1] 123.9387

Variance:
>var(mtcars$disp)
[1] 15360.8

Summary:
>summary(mtcars$disp)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   71.1   120.8   196.3   230.7   326.0   472.0

3) hp:

Mean:
>mean(mtcars$hp)
[1] 146.6875

Median:
>median(mtcars$hp)
[1] 123

Standard deviation:
>sd(mtcars$hp)
[1] 68.56287

Variance:
>var(mtcars$hp)
[1] 4700.867

Summary:
>summary(mtcars$hp)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   52.0    96.5   123.0   146.7   180.0   335.0

4) drat

Mean:
>mean(mtcars$drat)
[1] 3.596563

Median:
>median(mtcars$drat)
[1] 3.695

Standard deviation:
>sd(mtcars$drat)
[1] 0.5346787

Variance:
>var(mtcars$drat)
[1] 0.285881

Summary:
>summary(mtcars$drat)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  2.760   3.080   3.695   3.597   3.920   4.930

5) wt

Mean:
>mean(mtcars$wt)
[1] 3.21725

Median:
>median(mtcars$wt)
[1] 3.325

Standard deviation:
>sd(mtcars$wt)
[1] 0.9784574

Variance:
>var(mtcars$wt)
[1] 0.957379

Summary:
>summary(mtcars$wt)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.513   2.581   3.325   3.217   3.610   5.424

6) qsec
Mean:
>mean(mtcars$qsec)
[1] 17.84875

Median:
>median(mtcars$qsec)
[1] 17.71

Standard deviation:
>sd(mtcars$qsec)
[1] 1.786943

Variance:
>var(mtcars$qsec)
[1] 3.193166

Summary:
>summary(mtcars$qsec)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  14.50   16.89   17.71   17.85   18.90   22.90
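
The six blocks above repeat the same four calls for each variable. As a minimal sketch, the same statistics can be collected in one pass with sapply(); the helper describe and the names num_vars and stats are illustrative and not part of the original answer.

```r
# Numeric variables of mtcars, as classified in this answer
num_vars <- c("mpg", "disp", "hp", "drat", "wt", "qsec")

# Hypothetical helper: the four statistics computed for every variable
describe <- function(x) {
  c(mean = mean(x), median = median(x), sd = sd(x), var = var(x))
}

# Apply the helper to each column; the result is a 4 x 6 matrix
stats <- sapply(mtcars[num_vars], describe)
print(round(stats, 3))
```

Each column of the resulting matrix corresponds to one variable and each row to one statistic, which makes the values easy to compare side by side.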

Categorical Variables:
1) cyl:
a) using the dplyr count function:
>mtcars%>%count(cyl)
# A tibble: 3 x 2
    cyl     n
  <dbl> <int>
1     4    11
2     6     7
3     8    14

b) using group_by function

>mtcars%>%group_by(cyl)%>%summarise(count=n())
# A tibble: 3 x 2
    cyl count
  <dbl> <int>
1     4    11
2     6     7
3     8    14

2) vs:
a) using the dplyr count function:
>mtcars%>%count(vs)
# A tibble: 2 x 2
     vs     n
  <dbl> <int>
1     0    18
2     1    14

b) using group_by function

>mtcars%>%group_by(vs)%>%summarise(count=n())
# A tibble: 2 x 2
     vs count
  <dbl> <int>
1     0    18
2     1    14

3) am:
a) using the dplyr count function:
>mtcars%>%count(am)
# A tibble: 2 x 2
     am     n
  <dbl> <int>
1     0    19
2     1    13

b) using group_by function

>mtcars%>%group_by(am)%>%summarise(count=n())
# A tibble: 2 x 2
     am count
  <dbl> <int>
1     0    19
2     1    13

4) gear:
a) using the dplyr count function:
>mtcars%>%count(gear)
# A tibble: 3 x 2
   gear     n
  <dbl> <int>
1     3    15
2     4    12
3     5     5

b) using group_by function

>mtcars%>%group_by(gear)%>%summarise(count=n())
# A tibble: 3 x 2
   gear count
  <dbl> <int>
1     3    15
2     4    12
3     5     5

5) carb:
a) using the dplyr count function:
>mtcars%>%count(carb)
# A tibble: 6 x 2
   carb     n
  <dbl> <int>
1     1     7
2     2    10
3     3     3
4     4    10
5     6     1
6     8     1

b) using group_by function

>mtcars%>%group_by(carb)%>%summarise(count=n())
# A tibble: 6 x 2
   carb count
  <dbl> <int>
1     1     7
2     2    10
3     3     3
4     4    10
5     6     1
6     8     1
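
The counts for all five categorical variables can also be produced in a single step. This is a minimal base-R sketch using table(); the names cat_vars, freq and n_groups are illustrative.

```r
# Categorical variables of mtcars, as classified in this answer
cat_vars <- c("cyl", "vs", "am", "gear", "carb")

# One frequency table per variable, collected in a named list
freq <- lapply(mtcars[cat_vars], table)
print(freq)

# Number of groups in each variable
n_groups <- sapply(freq, length)
print(n_groups)
```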

2. Prepare a data frame for at least two categorical variables and find the
mean salary of those groups.
Ans:
Numeric Variables:

I. Finding the mean mpg of cars grouped by number of gears.

a) using count function:
>mtcars%>%count(gear)
# A tibble: 3 x 2
   gear     n
  <dbl> <int>
1     3    15
2     4    12
3     5     5

b) using group by function:

>mtcars%>%group_by(gear)%>%summarise(count=n())
# A tibble: 3 x 2
   gear count
  <dbl> <int>
1     3    15
2     4    12
3     5     5

Mean mpg by number of gears:
>mtcars%>%group_by(gear)%>%summarise(mean(mpg))
# A tibble: 3 x 2
   gear `mean(mpg)`
  <dbl>       <dbl>
1     3        16.1
2     4        24.5
3     5        21.4

II. Finding the mean horsepower of cars grouped by number of gears.

a) using count function:

>mtcars%>%count(gear)
# A tibble: 3 x 2
   gear     n
  <dbl> <int>
1     3    15
2     4    12
3     5     5

b) using group by function:

>mtcars%>%group_by(gear)%>%summarise(count=n())
# A tibble: 3 x 2
   gear count
  <dbl> <int>
1     3    15
2     4    12
3     5     5

Mean hp by number of gears:

>mtcars%>%group_by(gear)%>%summarise(mean(hp))
# A tibble: 3 x 2
   gear `mean(hp)`
  <dbl>      <dbl>
1     3      176.
2     4       89.5
3     5      196.
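
The question asks for groups formed from at least two categorical variables, whereas the outputs above group by gear alone. As a minimal sketch, base R's aggregate() can group by two variables at once; choosing am as the second variable is an assumption made here for illustration, and mpg_by_gear_am is an illustrative name.

```r
# Mean mpg for every combination of gear and am that occurs in mtcars
mpg_by_gear_am <- aggregate(mpg ~ gear + am, data = mtcars, FUN = mean)

# Give the summary column a descriptive name
names(mpg_by_gear_am)[3] <- "mean_mpg"
print(mpg_by_gear_am)
```

Only combinations that actually appear in the data are returned, so the result has fewer rows than the full 3 x 2 grid of gear and am values.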

Categorical Variables:
1) For cyl:
Steps:
table(mtcars$cyl)
mtcarst=table(mtcars$cyl)
class(mtcarst)
mtcarsf=as.data.frame(mtcarst)
mtcarsf

Output:
>mtcarsf
  Var1 Freq
1    4   11
2    6    7
3    8   14

2) For am:
Steps:
table(mtcars$am)
mtcarst1=table(mtcars$am)
class(mtcarst1)
mtcarsf1=as.data.frame(mtcarst1)
mtcarsf1

Output:
>mtcarsf1
  Var1 Freq
1    0   19
2    1   13
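
The default column names Var1 and Freq can be replaced when the table is converted. This is a minimal sketch using the dnn-style named argument of table() and the responseName argument of as.data.frame(); the share column and the name am_counts are additions for illustration.

```r
# Name the table dimension "am" and the count column "count"
am_counts <- as.data.frame(table(am = mtcars$am), responseName = "count")

# Illustrative extra column: each group's share of all cars
am_counts$share <- am_counts$count / sum(am_counts$count)
print(am_counts)
```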
