Ongoing FYP Progress
Current module
Resume Parser
What I have done so far
As an alternative to pdfminer, PyPDF2 was studied, and the resume extraction steps were followed with it.
Some CVs that start with an image or contain images are not processed.
Large datasets of experience and qualification terms are required to capture both from resumes.
The regular expression for extracting phone numbers has a problem: numbers in international format are sometimes truncated.
The project so far is being saved in a zip file.
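The dataset-driven capture of qualifications noted above could be sketched roughly as follows. This is only an illustration under stated assumptions: QUALIFICATION_TERMS is a tiny hypothetical stand-in for a real dataset, and find_qualifications is a made-up helper name, not part of the project.

```python
import re

# Hypothetical, tiny stand-in for a real qualifications dataset.
QUALIFICATION_TERMS = {"bsc", "msc", "phd", "mba", "bachelor of science"}


def find_qualifications(text, terms=QUALIFICATION_TERMS):
    """Return the dataset terms that occur in text as whole words or phrases."""
    lowered = text.lower()
    found = []
    for term in sorted(terms):
        # Lookarounds stop a term from matching inside a longer word.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.append(term)
    return found


print(find_qualifications("Completed MSc in Computer Science and an MBA."))
```

The same lookup could be repeated with a separate set of experience terms. The wider the dataset, the more resumes this kind of matching can cover.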
Current workspace
1. Chrome tabs
a. http://www.nltk.org/book/
b. https://www.nltk.org/book/ch07.html
c. http://www.nltk.org/howto/chunk.html
d. https://m-clark.github.io/text-analysis-with-R/img/POS-Tags.png
e. https://medium.com/@divalicious.priya/information-extraction-from-cv-acec216c3f48
f. https://regexr.com/
g. https://help.libreoffice.org/Common/List_of_Regular_Expressions
h. https://www.onlinegdb.com/online_python_interpreter
i. https://www.youtube.com/watch?v=nxhCyeRR75Q&t=18s
j. https://www.youtube.com/watch?v=yGKTphqxR9Q&list=PLQVvvaa0QuDf2JswnfiGkliBInZnIC4HL&index=3
k. https://www.datacamp.com/community/tutorials/stemming-lemmatization-python
l. https://pythonprogramming.net/lemmatizing-nltk-tutorial/
Problems
1. The phone-extraction regular expression seems correct (regexr.com confirms this as well). It captures the
complete phone number in Salman Anjum's CV, but not in Haider's CV or in Resume –Rohini Prakash.
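One possible cause, stated as an assumption because the actual expression is not recorded here: regexr.com highlights the whole match, while Python's re.findall returns only the captured groups when the pattern contains capturing groups, so part of the number can appear to be cut off. The sketch below uses a hypothetical pattern (PHONE_RE) and helper (extract_phone_numbers), with non-capturing groups and finditer so the full match is always returned.

```python
import re

# Hypothetical pattern, not the project's actual expression.
PHONE_RE = re.compile(
    r"(?<![\w+])"              # do not start in the middle of a word or number
    r"\+?"                     # optional leading + of an international number
    r"\d"                      # first digit
    r"(?:[\s().-]{0,2}\d){8,14}"  # 8 to 14 more digits, each after at most 2 separators
    r"(?!\d)"                  # do not stop in the middle of a longer digit run
)


def extract_phone_numbers(text):
    """Return every phone-like substring found in text, in order."""
    # finditer + group(0) always gives the whole match, never just a group.
    return [match.group(0) for match in PHONE_RE.finditer(text)]


print(extract_phone_numbers("Phone: +92 300 1234567, email a@b.com"))
print(extract_phone_numbers("Cell 0300-1234567"))
```

This pattern is deliberately loose and can also match other long digit runs, so it would need tuning against the CVs that currently fail.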
Questions
1. What is the difference between context={} and context=[]?
2. What does 'self' do in Python classes?
3. What does the .as_view() function do?
4. What is the difference between path('') and url('')?
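For questions 1 and 2, a small plain-Python sketch may help. {} creates an empty dict and [] creates an empty list; assuming the context in question 1 is the one passed to Django's render(), that argument is expected to be a dict. self is not a reserved keyword but the conventional name of the first method parameter, which refers to the instance the method was called on. All names below (context_as_dict, ResumeRecord, and so on) are hypothetical.

```python
# {} is an empty dict (named keys -> values); [] is an empty list (ordered items).
context_as_dict = {}
context_as_list = []

context_as_dict["applicant_name"] = "Example Applicant"  # stored under a key
context_as_list.append("Example Applicant")              # stored by position


class ResumeRecord:
    """Hypothetical class, used only to show what self refers to."""

    def __init__(self, name):
        # self is the instance being created; name is kept on that instance.
        self.name = name

    def greeting(self):
        # Python passes the instance as self automatically on record.greeting().
        return "Resume of " + self.name


first = ResumeRecord("Applicant A")
second = ResumeRecord("Applicant B")

print(first.greeting())
# Calling through the class makes the self argument explicit.
print(ResumeRecord.greeting(second))
```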
Other issues in project
How to distinguish an applicant from a company when both use a single login / register form.
Resume database: storing every applicant's resume details.