Data Mining: Slide-1: Title Slide

This document summarizes a presentation on predicting whether MBA students will be placed after their studies and which factors influence placement outcomes the most. It provides an overview of the dataset used, which contains information on 215 students' demographics, academic performance, work experience, and placement outcomes. Exploratory data analysis is conducted to examine the relationships between these factors and placements. Key findings include that female students performed better academically but were placed less often than male students, certain degree specializations and work experience increased placement chances, and academic performance correlated with placement but did not guarantee it. Feature engineering steps are also briefly outlined.

Uploaded by

Parikshit Mahajan


Data Mining

Slide- 1: Title Slide

Job Offer Prediction for MBA Students
(Will you get placed or not?)

Intro:
We have all paid 11 lakhs, and all this for placements?

Problem Statement:
To predict whether a candidate was placed in a role after their MBA studies and, if so, which factors
helped the most (e.g., work experience, degree, school results, gender)

Slide- 2:Dataset Overview

Dataset Name: Campus Recruitment (Academic and Employability Factors influencing placement)

Link: https://www.kaggle.com/benroshan/factors-affecting-campus-placement

We selected the Campus Recruitment dataset from Kaggle which was made available by Ben Roshan

Snapshot of our dataset: df.head()

Shape: 215 x 15
This is a snapshot of the dataset. It contains 15 variables and a total of 215 observations, with the
following details:

attribute        type      description
sl_no            factor    Serial Number
gender           factor    Gender: Male=‘M’, Female=‘F’
ssc_p            numeric   Secondary Education Percentage (grades 9 and 10) - exam at the end of 10th grade
ssc_b            factor    Board of Education - Central/Others
hsc_p            numeric   Higher Secondary Education (grades 11 and 12) - exam at the end of 12th grade
hsc_b            factor    Board of Education - Central/Others
hsc_s            factor    Specialization in Higher Secondary Education
degree_p         numeric   Degree Percentage
degree_t         factor    Undergraduate (Degree type) - Field of degree education
workex           factor    Work Experience - Yes/No
etest_p          numeric   Employability test Percentage (conducted by college)
specialisation   factor    Post Graduation (MBA) - Specialization
mba_p            numeric   MBA percentage
status           factor    Status of placement - Placed/Not placed
salary           numeric   Salary offered by corporate to candidates

We have 7 columns with numeric values (this count includes the serial number sl_no) and 8 columns with object data type (categorical)

So, before moving ahead we checked for Null and NA values in the dataset.
Slide- 3: Dataset Overview
df.info()

Check for NULLs:
[Code screenshot and output: number of NULLs]

Check for NAs:
[Code screenshot and output: number of NAs per column]

column           NAs
sl_no            0
gender           0
ssc_p            0
ssc_b            0
hsc_p            0
hsc_b            0
hsc_s            0
degree_p         0
degree_t         0
workex           0
etest_p          0
specialisation   0
mba_p            0
status           0
salary           67

So, we can see that there are no NULL values, but there are 67 NA values and all of them are in the salary
field. Now, we need to check why we have these 67 NAs in the salary column. Is this missing data, or is
there another reason behind it?

[Code screenshot: count of students by placement status]

Status       n
Not Placed   67
Placed       148

It looks like 67 NAs in the salary column are due to the fact that 67 students did not get a placement.
This makes sense and therefore, no further investigation is needed.
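
The check described above can be sketched in pandas roughly as follows. This is only an illustration: the five-row DataFrame is a made-up stand-in for the real file (its values are hypothetical), and only the column names status, mba_p and salary come from the dataset description.

```python
import numpy as np
import pandas as pd

# Hypothetical five-row stand-in for the placement data; the values are made up.
df = pd.DataFrame({
    "status": ["Placed", "Not Placed", "Placed", "Not Placed", "Placed"],
    "mba_p": [58.8, 66.3, 57.8, 59.4, 55.5],
    "salary": [270000.0, np.nan, 250000.0, np.nan, 425000.0],
})

# Count missing values in every column.
na_counts = df.isna().sum()
print(na_counts)

# Count missing salaries separately for each placement status.
na_by_status = df["salary"].isna().groupby(df["status"]).sum()
print(na_by_status)
```

If every missing salary falls in the "Not Placed" group, as in this toy data, the NAs are explained by the placement outcome rather than by a data-collection gap.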

Explanation:
So, from the dataset overview we can say that, except for hsc_s and degree_t with 3 classes each, all other
categorical variables have 2 classes each. We can also see that the data is slightly imbalanced, as we have
148 placed students and 67 not placed students. This means that around 31% of candidates were not placed,
which is sad, but let's see what the reasons were :) by performing the EDA, which will be explained by Noor
and Parikshit
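
As a quick check of the 31% figure, the class shares can be recomputed from the two counts reported above. This is a small sketch that rebuilds the status column from those counts instead of loading the real file.

```python
import pandas as pd

# Rebuild the status column from the counts reported above (148 placed, 67 not placed).
status = pd.Series(["Placed"] * 148 + ["Not Placed"] * 67, name="status")

# Share of each class as a fraction of all 215 students.
shares = status.value_counts(normalize=True)
print(shares.round(3))
```

The "Not Placed" share is 67 / 215, which is roughly 0.31.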

Exploratory Data Analysis

EDA ....
First, let’s check whether gender affects the candidate placements or not

Slide- 4: Does gender affect placements?

But before that, let’s check whether there are any gender-specific differences in the performance scores.

1. University Scores:

2. MBA:

Interpretation:
From the above two graphs, we can see that female students scored significantly higher
percentages than male students at university and MBA level, while there is no significant
difference in performance at the secondary and higher secondary levels or in the employability
test (we haven’t added those graphs)

3. Gender Vs Placement Stats:

4. Gender Vs Salary Box- Whisker plot:

Interpretation:
1. From the two graphs above, we can see that the dataset contains a sample of 139
male students and 76 female students, which means there are almost twice as many
male students as female students
2. The larger number of outliers in the male box plot tells us that male students are the ones getting the high-CTC jobs
3. Male students are also offered slightly higher salaries than female students on average
Slide- 5: Does the board of education affect placements?

Now, let’s move on to check whether the board of education makes a significant difference to placement
offers.
Interpretation:
1. The count of central board students is very high compared to all other boards in ssc_b, but
it is the reverse in hsc_b
2. There is no significant difference in the number of students who received an offer from
either board at the secondary or higher secondary level

Slide- 6: Which degree and MBA specialization have the highest salary?
The next variable is specialisation. Now, let’s check the impact of degree type and specialisation on the
chances of receiving an offer and a better salary.
Interpretation:
1. It looks like Commerce and Science degree students are preferred by companies, which is
as expected
2. Students who opted for Others have a very low chance of placement
3. Specialisation is a clear indicator in placements. Significantly more Marketing and Finance
students received an offer than students who specialised in Marketing and HR. This
might be because companies have a lower requirement for HR roles
4. The last graph shows that Mkt & Fin students get highly paid jobs and also that Commerce &
Mgmt students occasionally get dream placements with a high salary

So, now let’s check if academic scores influence the chances of placements or not

Slide- 7: Does your academic score influence your chances of placement?

1. Correlation Plot:

In this correlation plot, the darker the colour, the higher the correlation.

Interpretation:
1. As we can observe, there are medium correlations between the academic
scores, which suggests that the students who performed well in secondary school also
performed well in their further education (i.e., higher secondary, university and
MBA)
2. We can also notice that employability test scores have a low correlation with
academic scores, which suggests that these tests were more practical than
theoretical

Next let’s check how the scores for each level of education are distributed

2. What does the distribution of the scores look like for each level of education?
a. Average academic scores Vs Placement Status (How many students were placed?):
b. Secondary:

c. Higher Secondary:

d. University:
e. MBA:

f. Employability:

Interpretation:
We can see that,
1. From the 1st graph, most of the candidates' educational
performances are between 60 - 80%
2. The distribution becomes more concentrated around the median range (62 - 66%) as
the students progress in their education, from secondary (a wide distribution)
to MBA (a narrow distribution)
3. The employability test shows a different trend, with a very wide and almost
equal distribution across the buckets

3. Box- Whiskers plot:

Interpretation:
These box plots tell us that,
1. Good percentages in the MBA do not guarantee placement of the candidate
2. There is only a slight difference between the percentage scores of the two
groups, but the placed candidates still have the upper hand

Slide- 8: Did previous work experience matter?

After academic scores, let’s now check whether work experience helps in getting job offers.

Interpretation:
Significantly more students with work experience received offers than those without any work
experience. Work experience is a clear indicator, and students who have it also tend to get higher-CTC jobs.

Slide- 9: Salary
And the last variable is salary

Interpretation:
1. Looking at the distribution, we can say that most of the students get a package between
200k and 400k, and most salaries above 400,000 are outliers.
2. Male candidates are making more money than female candidates
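
The script does not say how the outliers were identified. A box plot conventionally flags points lying more than 1.5 times the interquartile range (IQR) beyond the quartiles, so the sketch below assumes that rule. The salary values are hypothetical and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical salaries (made-up values, not taken from the dataset).
salary = pd.Series([200000, 220000, 240000, 250000, 260000,
                    270000, 280000, 300000, 320000, 940000])

# Quartiles and interquartile range.
q1, q3 = salary.quantile(0.25), salary.quantile(0.75)
iqr = q3 - q1

# Conventional box-plot fences: 1.5 * IQR beyond each quartile (an assumption here).
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Values outside the fences are drawn as individual outlier points.
outliers = salary[(salary < lower) | (salary > upper)].tolist()
print(outliers)
```

In this toy sample only the single very high package falls outside the fences.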

This was all about EDA.

Next is feature engineering. In feature engineering, we:

Slide- 10: Feature Engineering

1. Create Dummy Variables: A dummy variable is a categorical variable that has been transformed
into a numeric one. For example, the Gender column holds "male" and "female", and we
transform these values into numbers by creating a new column, gender_id, where the male
category is coded as 0 and female as 1. In this manner all 6 variables (Gender, ssc_b, hsc_b,
degree_t, specialisation, placement status) are coded.

2. Create a Correlation Matrix: Correlation is a statistical technique that can show whether and
how strongly pairs of variables are related.

3. Feature Selection: From the correlation matrix we can now select the features for our model
that are highly correlated with the placement status variable. ssc_p, hsc_p, degree_p, workex
and specialisation are the significant features that will help our model identify patterns.
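
The three steps above could look roughly like this in pandas. This is a sketch under stated assumptions, not the code used for the presentation: the eight-row sample is made up, the 0.5 correlation cut-off is an arbitrary choice for illustration (the script does not state a threshold), and the coding of workex and status as 0/1 is assumed by analogy with the gender example.

```python
import pandas as pd

# Hypothetical eight-row sample; the values are made up for illustration.
df = pd.DataFrame({
    "gender": ["M", "F", "M", "F", "M", "F", "M", "M"],
    "workex": ["Yes", "No", "Yes", "No", "No", "Yes", "No", "No"],
    "ssc_p": [67.0, 52.0, 79.3, 56.0, 58.0, 85.8, 73.6, 46.0],
    "status": ["Placed", "Not Placed", "Placed", "Not Placed",
               "Not Placed", "Placed", "Placed", "Not Placed"],
})

# Step 1: code each two-level category as 0/1 (male = 0, female = 1, as in the text).
encoded = df.assign(
    gender=df["gender"].map({"M": 0, "F": 1}),
    workex=df["workex"].map({"No": 0, "Yes": 1}),
    status=df["status"].map({"Not Placed": 0, "Placed": 1}),
)

# Step 2: Pearson correlation matrix of the now fully numeric columns.
corr = encoded.corr()

# Step 3: rank features by absolute correlation with placement status.
ranking = corr["status"].drop("status").abs().sort_values(ascending=False)
selected = ranking[ranking > 0.5].index.tolist()  # 0.5 is an arbitrary cut-off
print(ranking)
print(selected)
```

For variables with more than two classes, such as degree_t, pd.get_dummies is a common alternative to a hand-written mapping.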

Now that we have our variables decided, let’s move on to perform logistic regression as explained by
Ranjani

Slide- 11: Logistic Regression

SPSS Outputs (chi sq, model*, variables, correlation)
4. "When feature engineering is done, we usually tend to decrease the dimensionality by
selecting the "right" number of features that capture the essential." (I/p and O/P)

Slide 12 (Confusion matrix):

Confusion Matrix + Formulae
It is important to define what “performance” means when it comes to choosing a model, because one type of
prediction error can be costlier than the other. For example, incorrectly predicting that someone would be
placed (false positive) is not as bad as incorrectly predicting that someone would not be placed (false
negative). The cost of the former is the time spent interviewing, while the cost of the latter is losing
out on a job that the student would’ve secured.
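
The formulae on the slide are not reproduced in this script. The sketch below uses the standard definitions of accuracy, precision and recall, with "Placed" assumed to be the positive class. The labels are hypothetical and the helper confusion_counts is written only for this illustration.

```python
def confusion_counts(actual, predicted, positive="Placed"):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted placed, actually placed
        elif p == positive:
            fp += 1  # predicted placed, actually not placed
        elif a == positive:
            fn += 1  # predicted not placed, actually placed
        else:
            tn += 1  # predicted not placed, actually not placed
    return tp, fp, fn, tn


# Hypothetical labels, made up for illustration.
actual = ["Placed", "Placed", "Placed", "Not Placed", "Not Placed", "Placed"]
predicted = ["Placed", "Not Placed", "Placed", "Placed", "Not Placed", "Placed"]

tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)  # lowered by false positives
recall = tp / (tp + fn)     # lowered by false negatives
print(tp, fp, fn, tn, accuracy, precision, recall)
```

Since false negatives are described above as the costlier error, recall is the metric that reflects them most directly.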

Slide 13 (Conclusion):
Here are a few things to keep in mind:

Specialisations matter. Choose the right one.

Go for an internship. Work experience helps.
Don't worry about grades for salary (although you need them to get placed).

1. Overall, the grades became more concentrated as the students progressed in their education. It
could be that it is harder for students to differentiate themselves on grades alone, so they focus
on other achievements (work experience, voluntary roles)
2. Successfully placed students performed significantly better than their counterparts during
secondary school, high school and university, but not at the MBA level

Slide 14 (Thank you)
