Code Output

Uploaded by

Rakshit Anand

final-project - Jupyter Notebook 2021-12-09, 2:30 AM

Automated Resume Screening

COMP 4750: Natural Language Processing
Shawon Ibn Kamal, 201761376

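In [1]: from os import path
        from glob import glob

        # PDF text extraction
        from pdfminer.high_level import extract_text

        # Tokenizing, tagging, named-entity chunking, stop words
        import nltk
        from nltk.corpus import stopwords
        import re
        import subprocess

        import pandas as pd
        import numpy as np

        # TF-IDF vectors and pairwise similarity measures
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity
        from sklearn.metrics.pairwise import euclidean_distances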

Part 1: Parsing
Read resume pdf


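In [2]: mypath = "resumes-list"

        def find_ext(dr, ext):
            # Return every file in directory `dr` with the extension `ext`
            return glob(path.join(dr, "*.{}".format(ext)))

        resumepaths = find_ext(mypath, "pdf")

        # One row per resume; `text` holds the raw text extracted from the PDF
        df = pd.DataFrame(resumepaths, columns=['path'])
        df['text'] = df['path'].apply(lambda x: extract_text(x))

        df.head()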

Out[2]:
   path                                               text
0  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...
1  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...
2  resumes-list/resume-example-option-[Link]          Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...
3  resumes-list/resume-example-option-[Link]          Stephen Greet\nSales Associate\n\nWork Experie...
4  resumes-list/data-scientist-resume-[Link]          KANDICE LOUDOR\n\nDATA SCIENTIST\n\nCONTACT\n\...

Retrieve candidate name


In [3]: def extract_names(txt):
            # Collect every chunk that NLTK's named-entity chunker labels PERSON
            person_names = []

            for sent in nltk.sent_tokenize(txt):
                for chunk in nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(sent))):
                    if hasattr(chunk, 'label') and chunk.label() == 'PERSON':
                        person_names.append(
                            ' '.join(chunk_leave[0] for chunk_leave in chunk.leaves())
                        )

            return person_names

        # Keep only the first PERSON entity found in each resume
        df['name'] = df.text.apply(lambda x: extract_names(x)[0])

        df.head()

Out[3]:
   path                                               text                                                name
0  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...      Github SKILLS
1  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...  Stephen
2  resumes-list/resume-example-option-[Link]          Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...          Ashley
3  resumes-list/resume-example-option-[Link]          Stephen Greet\nSales Associate\n\nWork Experie...   Stephen
4  resumes-list/data-scientist-resume-[Link]          KANDICE LOUDOR\n\nDATA SCIENTIST\n\nCONTACT\n\...   Github

Extract phone-number


In [4]: phone_regex = re.compile(r'[\+\(]?[1-9][0-9 .\-\(\)]{8,}[0-9]')

        def extract_phone_number(resume_text):
            phone = re.findall(phone_regex, resume_text)

            if phone:
                # Only the first match in the text is considered
                number = ''.join(phone[0])

                # Reject matches that are 16 characters or longer
                if resume_text.find(number) >= 0 and len(number) < 16:
                    return number
            return None

        df['phone'] = df.text.apply(lambda x: extract_phone_number(x))

        df.head()

Out[4]:
   path                                               text                                                name           phone
0  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...      Github SKILLS  (123) 456-7890
1  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...  Stephen        (123) 456-7890
2  resumes-list/resume-example-[Link]                 Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...          Ashley         (123) 456-7890
3  resumes-list/resume-example-[Link]                 Stephen Greet\nSales Associate\n\nWork Experie...   Stephen        (123) 456-7890
4  resumes-list/data-scientist-[Link]                 KANDICE LOUDOR\n\nDATA SCIENTIST\n\nCONTACT\n\...   Github         (123) 456-7890
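The length check in extract_phone_number is easier to judge on concrete input. The sketch below reuses the pattern from In [4] on two made-up strings. A 14-character number passes, while the same number written with a `+1` country prefix comes to 17 characters and is rejected by the `len(number) < 16` test. That would be consistent with a `None` phone for a resume whose text begins `Mobile: +1 (709) 986-7643`, assuming that number is the first match in the text. The `max_len` parameter is hypothetical and only exists to show the effect of the limit.

```python
import re

# Same pattern as In [4]
phone_regex = re.compile(r'[\+\(]?[1-9][0-9 .\-\(\)]{8,}[0-9]')

def extract_phone_number(resume_text, max_len=16):
    """Return the first phone-like match shorter than max_len, else None.

    max_len is a hypothetical parameter added for this sketch; the
    notebook hard-codes the limit of 16.
    """
    matches = phone_regex.findall(resume_text)
    if matches and len(matches[0]) < max_len:
        return matches[0]
    return None

# Made-up sample texts
short = extract_phone_number("jane@example.com\n(123) 456-7890\nWashington")
prefixed = extract_phone_number("Mobile: +1 (709) 986-7643\nWebsite: example.com")
relaxed = extract_phone_number("Mobile: +1 (709) 986-7643\nWebsite: example.com", max_len=20)

print(short)     # (123) 456-7890
print(prefixed)  # None
print(relaxed)   # +1 (709) 986-7643
```

Raising the limit is only one option; stripping the country prefix before the length test would be another.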

Extract email


In [5]: email_regex = re.compile(r'[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+')

        def extract_emails(resume_text):
            # Returns a list with every address matched in the text
            return re.findall(email_regex, resume_text)

        df['email'] = df.text.apply(lambda x: extract_emails(x))

        df.head()

Out[5]:
   path                                               text                                                name           phone           email
0  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...      Github SKILLS  (123) 456-7890  [justin.green11@[Link]]
1  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...  Stephen        (123) 456-7890  [stephen@[Link]]
2  resumes-list/resume-example-option-[Link]          Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...          Ashley         (123) 456-7890  [[Link]@[Link]]
3  resumes-list/resume-example-option-[Link]          Stephen Greet\nSales Associate\n\nWork Experie...   Stephen        (123) 456-7890  [stephen@[Link]]
4  resumes-list/data-scientist-resume-[Link]          KANDICE LOUDOR\n\nDATA SCIENTIST\n\nCONTACT\n\...   Github         (123) 456-7890  [kloudor@[Link]]
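The character classes in email_regex list lowercase letters only, so an address containing capitals would be missed. A minimal sketch of the difference, using a made-up sample string; the case-insensitive variant is an assumption for illustration and is not part of the notebook:

```python
import re

# Pattern from In [5]: lowercase letters only
email_regex = re.compile(r'[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+')

# Hypothetical variant for this sketch: same pattern, case-insensitive
email_regex_ci = re.compile(r'[a-z0-9\.\-+_]+@[a-z0-9\.\-+_]+\.[a-z]+', re.IGNORECASE)

# Made-up sample text
sample = "Contact: Jane.Doe@Example.com or jdoe@example.org"

original = email_regex.findall(sample)
relaxed = email_regex_ci.findall(sample)

print(original)  # ['jdoe@example.org']
print(relaxed)   # ['Jane.Doe@Example.com', 'jdoe@example.org']
```

Lowercasing the resume text before matching would have a similar effect, at the cost of losing the original capitalization of the address.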

Extract school

In [6]: school_keywords = [
            'school',
            'college',
            'university',
            'academy',
            'faculty',
            'institute',
            'diploma',
        ]

        def extract_education(input_text):
            organizations = []

            # Collect every labelled named-entity chunk (the ORGANIZATION test is disabled)
            for sent in nltk.sent_tokenize(input_text):
                for chunk in nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(sent))):
                    if hasattr(chunk, 'label'):  # and chunk.label() == 'ORGANIZATION':
                        organizations.append(' '.join(c[0] for c in chunk.leaves()))

            # Keep the entities whose text contains one of the school keywords
            education = set()
            for org in organizations:
                for word in school_keywords:
                    if org.lower().find(word) >= 0:
                        education.add(org)

            return education

        df['school'] = df.text.apply(lambda x: extract_education(x))

Out[6]: (columns to the right of phone are cut off at the page edge)
   path                                               text                                                name                                   phone
0  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...      Github SKILLS                          (123) 456-7890  [justin.green11@[Link]
1  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...  Stephen                                (123) 456-7890  [stephen@[Link]
2  resumes-list/resume-example-option-[Link]          Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...          Ashley                                 (123) 456-7890  [[Link]@[Link]
3  resumes-list/resume-example-option-[Link]          Stephen Greet\nSales Associate\n\nWork Experie...   Stephen                                (123) 456-7890  [stephen@[Link]
4  resumes-list/data-scientist-resume-[Link]          KANDICE LOUDOR\n\nDATA SCIENTIST\n\nCONTACT\n\...   Github                                 (123) 456-7890  [kloudor@[Link]
5  resumes-list/full-stack-developer-resume-examp...  ALEKS LUDKEE\nFull-Stack Developer\n\nludkee.a...   ALEKS                                  (123) 456-7890  [[Link]@[Link]
6  resumes-list/shawon_resume.pdf                     Mobile: +1 (709) 986-7643\nWebsite: [Link]          Education                              None            [sikamal@mun
7  resumes-list/entry-level-data-scientist-resume...  Trish Mathers\nEntry-Level Data Scientist\nInn...   Niantic Data Scientist Intern Seattle  (123) 456-7890  [tmathers@[Link]
8  resumes-list/resume-example-option-college-stu...  Stephen\nGreet\nWeb Developer\n\nWork Experien...   Stephen                                (123) 456-7890  [stephen@[Link]
9  resumes-list/resume-example-option-[Link]          ALICE LEWIS, APRN\n\nNurse Practitioner\n\nCON...   San Diego                              (123) 456-7890  [alicelewis409@[Link]
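The second half of extract_education is a plain substring filter over the entity strings, and it can be tried without NLTK. The sketch below isolates that step; `filter_education` is a hypothetical helper name, and the entity strings are made up to stand in for chunker output.

```python
school_keywords = ['school', 'college', 'university', 'academy',
                   'faculty', 'institute', 'diploma']

def filter_education(entities, keywords=school_keywords):
    """Keep the entity strings that contain any keyword, ignoring case.

    Hypothetical helper for this sketch: it mirrors the keyword loop in
    extract_education but takes the entity strings as input.
    """
    return {e for e in entities if any(k in e.lower() for k in keywords)}

# Made-up entity strings standing in for the NLTK chunker output
entities = ["Example State University", "Github",
            "Springfield High School", "San Diego"]

schools = sorted(filter_education(entities))
print(schools)  # ['Example State University', 'Springfield High School']
```

Because the match is on substrings, the result depends entirely on which entities the chunker produces; an institution that the chunker splits or misses never reaches the filter.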

Extract previous job titles

In [7]: df_job_titles = pd.read_csv('job_titles_set.csv')

df_job_titles.[Link]

Out[7]: array(['owner', 'manager', 'president', ...,
               'corporate account executive', 'trade marketing',
               'library director'], dtype=object)

In [8]: JOB_TITLE_DB = df_job_titles.[Link]

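        def extract_job_titles(input_text):
            stop_words = set(stopwords.words('english'))
            word_tokens = nltk.tokenize.word_tokenize(input_text)

            # Preprocessing: drop stop words, then keep alphabetic tokens only
            # (the second filter now reads filtered_tokens, so the first is not discarded)
            filtered_tokens = [w for w in word_tokens if w not in stop_words]
            filtered_tokens = [w for w in filtered_tokens if w.isalpha()]

            # Bigrams and trigrams, so multi-word titles can be matched
            grams = list(map(' '.join, nltk.everygrams(filtered_tokens, 2, 3)))

            found_skills = set()

            # Single-word titles
            for i in filtered_tokens:
                if i.lower() in JOB_TITLE_DB:
                    found_skills.add(i)

            # Two- and three-word titles
            for i in grams:
                if i.lower() in JOB_TITLE_DB:
                    found_skills.add(i)

            return found_skills

        df['job_titles'] = df.text.apply(lambda x: extract_job_titles(x))

        df.head()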

Out[8]: (columns to the right of email are cut off at the page edge)
   path                                               text                                                name           phone           email
0  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...      Github SKILLS  (123) 456-7890  [justin.green11@[Link]]
1  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...  Stephen        (123) 456-7890  [stephen@[Link]]         {Admin
2  resumes-list/resume-example-option-[Link]          Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...          Ashley         (123) 456-7890  [[Link]@[Link]]
3  resumes-list/resume-example-option-[Link]          Stephen Greet\nSales Associate\n\nWork Experie...   Stephen        (123) 456-7890  [stephen@[Link]]         {Johns
4  resumes-list/data-scientist-resume-[Link]          KANDICE LOUDOR\n\nDATA SCIENTIST\n\nCONTACT\n\...   Github         (123) 456-7890  [kloudor@[Link]]
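The matching step in extract_job_titles checks every unigram, bigram and trigram against the title list. A minimal sketch of that idea in plain Python follows. It swaps `nltk.everygrams` for a small generator and the NumPy array for a `set`, which gives constant-time membership tests. The `ngrams` helper, the four titles and the sample tokens are all made up for illustration; the real list comes from job_titles_set.csv.

```python
def ngrams(tokens, min_len, max_len):
    """Yield every run of min_len..max_len consecutive tokens, space-joined."""
    for n in range(min_len, max_len + 1):
        for i in range(len(tokens) - n + 1):
            yield ' '.join(tokens[i:i + n])

# Hypothetical stand-in for the titles loaded from job_titles_set.csv
JOB_TITLE_DB = {'manager', 'project manager', 'sales associate', 'library director'}

def extract_job_titles(tokens, titles=JOB_TITLE_DB):
    """Return the n-grams (n = 1..3) whose lowercase form is a known title."""
    found = set()
    for candidate in ngrams(tokens, 1, 3):
        if candidate.lower() in titles:
            found.add(candidate)
    return found

# Made-up token list
tokens = "Stephen Greet Project Manager PMP certified".split()
titles_found = sorted(extract_job_titles(tokens))
print(titles_found)  # ['Manager', 'Project Manager']
```

Both the short and the long title are reported when one contains the other; keeping only the longest match would need an extra pass.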

Part 2: Evaluation
Calculate similarity between job description and resume


In [9]: # Read the job posting; the with-block closes the file afterwards
        with open("job_description.txt", "r") as f:
            job_description = f.read()

        job_description

Out[9]: "Software Developer\nLocation: St. John's;\n\nEach day our Software
Developers get to work on challenging problems. No two days are the same,
each day you’ll collaborate with other Software Developers to problem solve
and write code that has an impact in the real world. Our product, Verafin,
helps fight crime by stopping fraud and money laundering. Stopping the flow
of this money means stopping crimes such as human trafficking, elder abuse,
and drug trafficking. Our Software Developers get the opportunity to move
around the business as there are new teams and projects developed all the
time to help us towards our mission of stopping crime. Being a Software
Developer at Verafin means getting the opportunity to have an impact on
criminal activity by getting to do what you love – solve cool problems using
code.\n\nEssential Skills & Qualifications\nA university degree or college
diploma in Computer Engineering, Computer Science, or a combination of
education and previous experience would be considered\nStrong analytical
skills for complex and creative problem solving\nExperience in
object-oriented software development \nAutomated testing\nExcellent
interpersonal and organizational skills; able to work closely with team
members\nWould be good to have experience in a few of the following
areas\nJava\nExperience using JavaScript, CSS, REST\nPrevious experience
working with Core Banking Systems\nAmazon Web Services\nIntelligent systems,
artificial intelligence and data science\nDistributed computing\nDatabase
technologies (PostgresSQL)\nBig data technologies\nData extraction,
manipulation/cleansing and integration \n\n\nIndustry and on-the-job
training is provided for all roles at Verafin. \n\n\u200bVerafin places a
high value on building a diverse team, candidates of all backgrounds are
encouraged to apply.\n\nMobile devices are not supported for job
applications currently. Please apply using a desktop device for the best
user experience.\n\nPlease note: we frequently see our jobs posted on job
aggregators, which are essentially search engines for jobs. Generally those
sites ask you to use their sites to apply for the posted job and they do not
send us the application. As a reminder, the the only way to apply for a job
with Verafin is on our site [Link]/careers. We look forward to reviewing
your application."


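In [10]: # Prepend the job description as row 0 so it is vectorized with the resumes.
         # The source line is cut off after `job_description`; the closing
         # `}, index=[0])` is an assumed reconstruction.
         new_row = pd.DataFrame({'path': 'job_description', 'text': job_description}, index=[0])

         df = pd.concat([new_row, df]).reset_index(drop=True)

         df.head()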

Out[10]:
   path                                               text                                                name           phone           email
0  job_description                                    Software Developer\nLocation: St. John's;\n\nE...   NaN            NaN             NaN
1  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...      Github SKILLS  (123) 456-7890  [justin.green11@[Link]]
2  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...  Stephen        (123) 456-7890  [stephen@[Link]]
3  resumes-list/resume-example-option-[Link]          Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...          Ashley         (123) 456-7890  [[Link]@[Link]]
4  resumes-list/resume-example-option-[Link]          Stephen Greet\nSales Associate\n\nWork Experie...   Stephen        (123) 456-7890  [stephen@[Link]]


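In [11]: # Remove stop words and punctuation from the text
         stop_words_l = stopwords.words('english')

         # The source line is cut off after r'[^a-zA-Z]'; the rest of this
         # expression is an assumed reconstruction based on the comment above.
         df['text_cleaned'] = df.text.apply(
             lambda x: " ".join(
                 re.sub(r'[^a-zA-Z]', ' ', w).lower()
                 for w in x.split()
                 if re.sub(r'[^a-zA-Z]', ' ', w).lower() not in stop_words_l
             )
         )

         # TF-IDF vectors for the job description (row 0) and every resume
         tfidfvectoriser = TfidfVectorizer()
         tfidfvectoriser.fit(df.text_cleaned)
         tfidf_vectors = tfidfvectoriser.transform(df.text_cleaned)

         # TfidfVectorizer L2-normalizes rows by default, so this dot product
         # is the cosine similarity between every pair of documents
         similarities = np.dot(tfidf_vectors, tfidf_vectors.T).toarray()

         # Row 0 holds the similarity of each document to the job description
         for i in range(len(similarities[0])):
             df.loc[i, "similarity"] = similarities[0][i]

         df.sort_values(by='similarity', ascending=False, inplace=True)

         # Drop the job description itself (index label 0)
         df = df.drop(0)
         df.reset_index(drop=True, inplace=True)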

Out[11]: (columns to the right of phone and the rows after row 3 are cut off at the page edge)
   path                                               text                                                name           phone
0  resumes-list/full-stack-developer-resume-examp...  ALEKS LUDKEE\nFull-Stack Developer\n\nludkee.a...   ALEKS          (123) 456-7890  [[Link]@[Link]
1  resumes-list/shawon_resume.pdf                     Mobile: +1 (709) 986-7643\nWebsite: [Link]          Education      None            [sikamal@mun
2  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...      Github SKILLS  (123) 456-7890  [justin.green11@[Link]
3  resumes-list/resume-example-option-                Stephen\nGreet\nWeb                                 Stephen        (123) 456-      [stephen@[Link]
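The ranking rests on one property: when TF-IDF rows are L2-normalized, the dot product of two rows is their cosine similarity. The sketch below works through that by hand in plain Python, as an approximation rather than scikit-learn's implementation. It uses raw term counts and the smoothed idf `ln((1 + n) / (1 + df)) + 1`, which is the documented default of TfidfVectorizer; the `split()` tokenizer, the helper names and the three documents are assumptions made for the example.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return L2-normalized TF-IDF vectors as {term: weight} dicts.

    Hand-rolled sketch: raw counts times ln((1 + n) / (1 + df)) + 1,
    then L2 normalization. Tokenization is a plain split().
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {t: c * (math.log((1 + n) / (1 + df[t])) + 1)
                   for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors

def dot(u, v):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

# Made-up documents: index 0 plays the role of the job description
docs = [
    "java developer aws experience",
    "java developer with aws and rest experience",
    "nurse practitioner clinical experience",
]
vecs = tfidf_vectors(docs)
scores = [dot(vecs[0], v) for v in vecs]

# Rank the "resumes" (indices 1 and up) by similarity to document 0
ranking = sorted(range(1, len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)  # [1, 2]
```

Document 0 scores 1.0 against itself, which is why the notebook drops the job description row before presenting the ranking.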

Ranking Output


In [12]: df[['path', 'name', 'email', 'similarity']]

Out[12]:
   path                                               name                                   email                    similarity
0  resumes-list/full-stack-developer-resume-examp...  ALEKS                                  [[Link]@[Link]]          0.143581
1  resumes-list/shawon_resume.pdf                     Education                              [sikamal@[Link]]         0.138904
2  resumes-list/resume-example-option-software-en...  Github SKILLS                          [justin.green11@[Link]]  0.101460
3  resumes-list/resume-example-option-college-stu...  Stephen                                [stephen@[Link]]         0.079581
4  resumes-list/resume-example-option-project-man...  Stephen                                [stephen@[Link]]         0.079037
5  resumes-list/data-scientist-resume-[Link]          Github                                 [kloudor@[Link]]         0.052557
6  resumes-list/entry-level-data-scientist-resume...  Niantic Data Scientist Intern Seattle  [tmathers@[Link]]        0.050098
7  resumes-list/resume-example-[Link]                 San Diego                              [alicelewis409@[Link]]   0.030303
8  resumes-list/resume-example-[Link]                 Stephen                                [stephen@[Link]]         0.028344
9  resumes-list/resume-example-[Link]                 Ashley                                 [[Link]@[Link]]          0.021063
