Code Output
import nltk
from nltk.corpus import stopwords
import re
import subprocess
import pandas as pd
import numpy as np
Part 1: Parsing
Read resume pdf
final-project - Jupyter Notebook 2021-12-09, 2:30 AM
df.head()
Out[2]:
   path                                               text
0  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...
1  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...
2  resumes-list/resume-example-option-[Link]          Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...
3  resumes-list/resume-example-option-[Link]          Stephen Greet\nSales Associate\n\nWork Experie...
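The cell that built this `path`/`text` table is missing from the export. Since `subprocess` is imported at the top, one plausible approach is to call an external converter. The sketch below assumes the `pdftotext` command-line tool (from Poppler) is installed; the function names `pdf_to_text` and `load_resumes` are hypothetical and not taken from the notebook.

```python
import subprocess
from pathlib import Path


def pdf_to_text(pdf_path):
    """Extract text from one PDF by shelling out to `pdftotext` (assumed installed)."""
    # The trailing "-" asks pdftotext to write the text to stdout instead of a file.
    result = subprocess.run(
        ["pdftotext", str(pdf_path), "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def load_resumes(folder, extractor=pdf_to_text):
    """Return one {'path', 'text'} record per PDF found directly in `folder`."""
    rows = []
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        rows.append({"path": str(pdf_path), "text": extractor(pdf_path)})
    return rows
```

Passing the records to `pd.DataFrame(load_resumes("resumes-list"))` would give a frame with the same two columns. The `extractor` argument is there so a different PDF library can be swapped in.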
return person_names
df.head()
Out[3]:
path text name
Extract phone-number
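def extract_phone_number(resume_text):
    # phone_regex is assumed to be a pattern defined earlier in the notebook (not shown in this export)
    phone = re.findall(phone_regex, resume_text)
    if phone:
        # findall returns a tuple per match when the pattern has groups; join the parts
        number = ''.join(phone[0])
        return number
    return None

df.head()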
Out[4]:
path text name phone
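The pattern used for `phone_regex` is not visible in the export. As a minimal sketch, assuming North American numbers written like the `(123) 456-7890` seen in the sample resumes, a regex-based extractor could look like this. `PHONE_PATTERN` and `find_phone_number` are hypothetical names, and the pattern is an assumption rather than the notebook's own.

```python
import re

# Assumed pattern (not from the notebook): optional "(", a 3-digit area code,
# optional ")", then 3 and 4 digits, each optionally separated by space, dot or hyphen.
PHONE_PATTERN = re.compile(r"\(?\b(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})\b")


def find_phone_number(resume_text):
    """Return the first phone-like match formatted as '(AAA) BBB-CCCC', or None."""
    match = PHONE_PATTERN.search(resume_text)
    if match is None:
        return None
    area, prefix, line = match.groups()
    return f"({area}) {prefix}-{line}"
```

Normalising the output to one format makes the `phone` column easier to compare across resumes; international numbers would need a broader pattern.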
Extract email
df.head()
Out[5]:
   path                                               text                                                name           phone           email
0  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...      Github SKILLS  (123) 456-7890  [justin.green11@[Link]]
1  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...  Stephen        (123) 456-7890  [stephen@[Link]]
2  resumes-list/resume-example-option-[Link]          Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...          Ashley         (123) 456-7890  [[Link]@[Link]]
3  resumes-list/resume-example-option-[Link]          Stephen Greet\nSales Associate\n\nWork Experie...   Stephen        (123) 456-7890  [stephen@[Link]]
4  resumes-list/data-scientist-resume-[Link]          KANDICE LOUDOR\n\nDATA SCIENTIST\n\nCONTACT\n\...   Github         (123) 456-7890  [kloudor@[Link]]
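The email-extraction cell itself did not survive the export; only its output did. The `email` column holds a list per resume, which is what `re.findall` returns, so a sketch along these lines would produce the same shape. The pattern is an assumption and deliberately simple, and `EMAIL_PATTERN` and `find_emails` are hypothetical names.

```python
import re

# Assumed, simplified pattern: it covers common addresses, not every valid one.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(resume_text):
    """Return every email-like substring, in order of appearance."""
    return EMAIL_PATTERN.findall(resume_text)
```

Applied per row, for example with `df["text"].apply(find_emails)`, this would fill a list-valued column like the one shown.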
Extract school
school_keywords = [
    'school',
    'college',
    'university',
    'academy',
    'faculty',
    'institute',
    'diploma',
]

def extract_education(input_text):
    # The step that fills `organizations` from input_text is not shown in this export;
    # with an empty list the function always returns an empty set.
    organizations = []
    education = set()
    for org in organizations:
        for word in school_keywords:
            # keep an organization if its name contains a school-related keyword
            if org.lower().find(word) >= 0:
                education.add(org)
    return education
df
Out[6]:
   path                                               text                                                name                                   phone           email
0  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...      Github SKILLS                          (123) 456-7890  [justin.green11@[Link]
1  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...  Stephen                                (123) 456-7890  [stephen@[Link]
2  resumes-list/resume-example-option-[Link]          Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...          Ashley                                 (123) 456-7890  [[Link]@[Link]
3  resumes-list/resume-example-option-[Link]          Stephen Greet\nSales Associate\n\nWork Experie...   Stephen                                (123) 456-7890  [stephen@[Link]
4  resumes-list/data-scientist-resume-[Link]          KANDICE LOUDOR\n\nDATA SCIENTIST\n\nCONTACT\n\...   Github                                 (123) 456-7890  [kloudor@[Link]
5  resumes-list/full-stack-developer-resume-examp...  ALEKS LUDKEE\nFull-Stack Developer\n\nludkee.a...   ALEKS                                  (123) 456-7890  [[Link]@[Link]
7  resumes-list/entry-level-data-scientist-resume...  Trish Mathers\nEntry-Level Data Scientist\nInn...   Niantic Data Scientist Intern Seattle  (123) 456-7890  [tmathers@[Link]
9  resumes-list/resume-example-option-[Link]          ALICE LEWIS, APRN\n\nNurse Practitioner\n\nCON...   San Diego                              (123) 456-7890  [alicelewis409@[Link]
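In the exported `extract_education` cell, `organizations` is an empty list, so the keyword filter never sees any input. The filtering step on its own can be sketched as below, with the candidate organization names passed in by the caller. How those candidates are produced (for example by named-entity recognition) is outside this sketch, and `filter_education` is a hypothetical name.

```python
SCHOOL_KEYWORDS = [
    "school", "college", "university", "academy",
    "faculty", "institute", "diploma",
]


def filter_education(organizations, keywords=SCHOOL_KEYWORDS):
    """Keep the organization names that contain any school-related keyword."""
    education = set()
    for org in organizations:
        lowered = org.lower()
        # Substring match, case-insensitive, as in the original keyword loop.
        if any(word in lowered for word in keywords):
            education.add(org)
    return education
```

With made-up input such as `["State University", "Acme Corp", "City College"]`, only the first and last names pass the filter. Substring matching is permissive, so a company whose name happens to contain "institute" would also be kept.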
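def extract_job_titles(input_text):
    # JOB_TITLE_DB is assumed to be a collection of lower-case job titles defined elsewhere (not shown)
    stop_words = set(stopwords.words('english'))
    word_tokens = nltk.tokenize.word_tokenize(input_text)

    # preprocessing: drop stop words, then keep alphabetic tokens only
    filtered_tokens = [w for w in word_tokens if w not in stop_words]
    filtered_tokens = [w for w in filtered_tokens if w.isalpha()]

    # `grams` is not defined in the exported cell; bigrams and trigrams are assumed here
    grams = [' '.join(g) for g in nltk.everygrams(filtered_tokens, 2, 3)]

    found_skills = set()
    # single-word titles
    for i in filtered_tokens:
        if i.lower() in JOB_TITLE_DB:
            found_skills.add(i)
    # multi-word titles
    for i in grams:
        if i.lower() in JOB_TITLE_DB:
            found_skills.add(i)
    return found_skills

df.head()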
Out[8]:
   path                                               text                                                name           phone           email                    ...
0  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...      Github SKILLS  (123) 456-7890  [justin.green11@[Link]]
1  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...  Stephen        (123) 456-7890  [stephen@[Link]]         {Admin
2  resumes-list/resume-example-option-[Link]          Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...          Ashley         (123) 456-7890  [[Link]@[Link]]
3  resumes-list/resume-example-option-[Link]          Stephen Greet\nSales Associate\n\nWork Experie...   Stephen        (123) 456-7890  [stephen@[Link]]         {Johns
4  resumes-list/data-scientist-resume-[Link]          KANDICE LOUDOR\n\nDATA SCIENTIST\n\nCONTACT\n\...   Github         (123) 456-7890  [kloudor@[Link]]
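The job-title lookup above checks single tokens and multi-word phrases against a title list. The same idea can be sketched without NLTK, which makes the n-gram step explicit. This is a simplified stand-in, not the notebook's method: it tokenizes with a regex, skips stop-word removal, and uses a small hypothetical `JOB_TITLES` set in place of `JOB_TITLE_DB`, whose contents are not shown.

```python
import re

# Hypothetical stand-in for JOB_TITLE_DB; entries must be lower-case.
JOB_TITLES = {"project manager", "sales associate", "data scientist"}


def word_ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def find_job_titles(text, titles=JOB_TITLES, max_len=3):
    """Return the known titles that occur as 1- to max_len-word phrases in text."""
    # Lower-case alphabetic tokens only; punctuation and digits are dropped.
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    found = set()
    for n in range(1, max_len + 1):
        for gram in word_ngrams(tokens, n):
            if gram in titles:
                found.add(gram)
    return found
```

Because matching is done on lower-cased text, the returned titles are lower-case, whereas the notebook's function returns the phrase as it appears in the resume.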
Part 2: Evaluation
Calculate similarity between job description and resume
df.head()
Out[10]:
   path                                               text                                                name           phone           email
1  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...      Github SKILLS  (123) 456-7890  [justin.green11@[Link]]
2  resumes-list/resume-example-option-project-man...  Stephen Greet\nProject Manager\nPMP certified p...  Stephen        (123) 456-7890  [stephen@[Link]]
3  resumes-list/resume-example-option-[Link]          Ashley Doyle, Esq\n\[Link]@[Link]\n\n(1...          Ashley         (123) 456-7890  [[Link]@[Link]]
4  resumes-list/resume-example-option-[Link]          Stephen Greet\nSales Associate\n\nWork Experie...   Stephen        (123) 456-7890  [stephen@[Link]]
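from sklearn.feature_extraction.text import TfidfVectorizer

# Fit TF-IDF on the cleaned text of every row
tfidfvectoriser = TfidfVectorizer()
tfidfvectoriser.fit(df.text_cleaned)
tfidf_vectors = tfidfvectoriser.transform(df.text_cleaned)

# Rows are L2-normalised by default, so the matrix product gives cosine similarities
similarities = (tfidf_vectors @ tfidf_vectors.T).toarray()
for i in range(len(similarities[0])):
    df.loc[i, "similarity"] = similarities[0][i]

# Drop row 0, the row every similarity was measured against, and renumber the rest
df = df.drop(0)
df.reset_index(drop=True, inplace=True)
df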
Out[11]:
   path                                               text                                               name           phone           email
0  resumes-list/full-stack-developer-resume-examp...  ALEKS LUDKEE\nFull-Stack Developer\n\nludkee.a...  ALEKS          (123) 456-7890  [[Link]@[Link]
2  resumes-list/resume-example-option-software-en...  justin.green11@[Link]\n(123) 456-7890\nWash...     Github SKILLS  (123) 456-7890  [justin.green11@[Link]
3  resumes-list/resume-example-option-                Stephen\nGreet\nWeb                                Stephen        (123) 456-      [stephen@[Link]
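To make the scoring step concrete, here is a from-scratch sketch of TF-IDF weighting followed by cosine similarity against the first document, which appears to play the role of the job description in the notebook. It uses the smoothed idf `ln((1 + n) / (1 + df)) + 1` and L2 normalisation, the same scheme as scikit-learn's defaults, but the tokenizer is a simplified assumption, so the scores need not match `TfidfVectorizer` exactly. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one L2-normalised {term: weight} dict per document."""
    token_lists = [tokenize(doc) for doc in documents]
    n_docs = len(token_lists)
    # Document frequency: the number of documents each term occurs in.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        # Raw term count times smoothed idf.
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors


def cosine(u, v):
    """Dot product of two sparse unit vectors, which equals their cosine similarity."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def rank_against_first(documents):
    """Score documents[1:] against documents[0]; return (index, score) pairs, best first."""
    vectors = tfidf_vectors(documents)
    scores = [(i, cosine(vectors[0], vectors[i])) for i in range(1, len(vectors))]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_against_first([job_description, resume_1, resume_2])` returns the resume indices ordered by similarity. A resume that shares no terms with the job description scores 0, and the score rises as the shared terms become rarer across the collection.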
Ranking Output
Out[12]:
   path                                               name           email                    similarity
0  resumes-list/full-stack-developer-resume-examp...  ALEKS          [[Link]@[Link]]          0.143581
2  resumes-list/resume-example-option-software-en...  Github SKILLS  [justin.green11@[Link]]  0.101460
3  resumes-list/resume-example-option-college-stu...  Stephen        [stephen@[Link]]         0.079581
4  resumes-list/resume-example-option-project-man...  Stephen        [stephen@[Link]]         0.079037
5  resumes-list/data-scientist-resume-[Link]          Github         [kloudor@[Link]]         0.052557
7  resumes-list/resume-example-[Link]                 San Diego      [alicelewis409@[Link]]   0.030303
8  resumes-list/resume-example-[Link]                 Stephen        [stephen@[Link]]         0.028344
9  resumes-list/resume-example-[Link]                 Ashley         [[Link]@[Link]]          0.021063