Tensorflow Implementation For Job Market Classification: Taras Mitran Jeff Waller

This document discusses using a convolutional neural network (CNN) to classify job descriptions by similar roles. Currently, job roles are classified manually which is inefficient as new roles are constantly being added. The document proposes using TensorFlow to build a CNN model that can learn embeddings and classifications from large datasets of job descriptions. It provides examples of CNN concepts like convolutions, pooling, activations and softmax. It also discusses preprocessing job data into fixed-length vectors and tuning hyperparameters like embedding size. Building the model requires GPU hardware, Linux, and libraries like CUDA and CUDNN from Nvidia. The document shares tips for installing TensorFlow and the challenges faced, like crashes, during initial model testing and development.

TensorFlow Implementation for Job Market Classification

Taras Mitran
Jeff Waller
HR Compensation Workflow
Scenario: ABC Corp wants to hire a statistician.

What is the market rate for this job at the 50th percentile?
At the 60th percentile?

Issue: Almost every company’s job title and description for
roughly the same “job” differs from other companies’.
HR Compensation Workflow
1. ABC Corp submits all salaries with job information to salary survey
companies annually.
2. Survey companies aggregate data across companies.
3. They sell it back to ABC Corp in .csv format.
4. ABC Corp has to match several overlapping surveys to the job they
want to price.
HR Compensation Workflow
Instead of searching the entire dataset via keyword search:

Can similar or nearly identical jobs be clustered, so the user only has to
review a small subset of jobs?
HR Compensation Workflow
Luckily, this is being done.

The issue: it’s being done manually, and tens of thousands of different
job descriptions are created each year.

Challenge: augment this work with machine learning

A sample short form job description:
Head of statistical programming co-ordinate and oversee activities of statistical
programming teams develop programming standards. develop utility sas macros for
use in project programs. develop programming technologies to increase efficiencies
in clinical study reporting. generate summary data tables and analyses as part of
clinical study reports and iss/ise documents for fda submission. key programmer for
the production of interim analysis tables needed by data monitoring boards.
collaborate with it department to enhance biostatistics technologies; serve as key
contact for sas issues. program and validate data transfers and data conversions
between sas and other software for clients and regulatory agencies. develop, test,
and validate sas interfaces to non-sas data sources. supervise, instruct, and mentor
junior staff. participate in business development activities. experience: likely to have
had a least 10 years' experience in statistical programming. qualifications: bsc in
computing, life sciences, mathematical or statistical subjects. line management
experience. level responsibility: line management - high - 6+. project - high - 3+
countries. financial - medium. technical - high - expert. alternative titles: director of
statistical programming. survey level 4.
I heard Tensorflow was good?
What is Tensorflow?
• Python-based neural network framework
• Google open source project on Github
• Released November 2015
• Runs highly optimized C++ code for actual calculations
• Higher-level APIs on top of TensorFlow are available, such as skflow, which
fits within the scikit-learn API

TensorFlow is an open source software library for numerical
computation using data flow graphs. Nodes in the graph represent
mathematical operations, while the graph edges represent the
multidimensional data arrays (tensors) that flow between them.
Why Tensorflow?

Because Google?
Text-based Convolutional Neural Nets

• Udacity Course:
https://www.udacity.com/course/deep-learning--ud730
• All examples are in the main TensorFlow repository:
https://www.udacity.com/course/deep-learning--ud730
• ConvNet on Rotten Tomatoes data set guide:
http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
Facebook whitepaper on convolutional neural networks:

#TAGSPACE: Semantic Embeddings from Hashtags

• A convolutional neural network that learns feature representations for
short textual posts using hashtags as a supervised signal. The
proposed approach is trained on up to 5.5 billion words predicting
100,000 possible hashtags.
• http://emnlp2014.org/papers/pdf/EMNLP2014194.pdf
Facebook’s neural network design:
How to get started?

• Step 1: pip install tensorflow

• Step 2: ??
• Step 3: Profit
Nvidia GPUs for NN

Neural networks are computationally
simple, but require massive amounts
of calculations, a problem well
suited to computation on a
GPU.
Nvidia GPUs for NN
• The Nvidia GTX 980 was the most cost-efficient hardware for this until the
GTX 1080 arrived. It is a gaming card, which explains the low price: $520
before the GTX 1080 was introduced, $400 now. The GTX 1080 is $650.
• Neural networks do not need double-precision floats, so expensive compute
GPUs such as the K80 are not needed.
• The GTX 980 uses the Maxwell architecture, has 4 GB of memory (224 GB/sec
bandwidth), and is capable of 4.6 TFLOPS.
• The GTX 1080, introduced in June 2016, uses the Pascal architecture, which
comes with specialized support for neural networks (among other things,
support for half-precision floats). It has 8 GB of memory (320 GB/sec
bandwidth) and is capable of 8.9 TFLOPS.
Install notes
• TensorFlow can make use of the GPU, but requires Linux as the OS.
It’s best to set aside a video card for computation rather than
making it double as compute engine and video display.
• It needs the latest CUDA as well as the latest cuDNN (the CUDA Deep Neural
Network library) from Nvidia.
• These libraries must be obtained from the Nvidia developer site:
• https://developer.nvidia.com/cuda-downloads
• https://developer.nvidia.com/cudnn
• Follow the install instructions given on the Nvidia site.
Install Cont’d
• A straightforward install location is /usr/local

jeffw@chill:/usr/local$ cat /etc/ld.so.conf.d/cuda.conf
/usr/local/cuda/lib64
jeffw@chill:/usr/local$ cat /etc/ld.so.conf.d/cudnn.conf
/usr/local/cudnn/lib64
• Install TensorFlow with pip
• Install the usual numerical packages, NumPy and scikit-learn
Install Cont’d.
• Use Ubuntu 16 to avoid the following problem:
https://github.com/tensorflow/tensorflow/issues/2190
The rig:
Post install
1. Do some ETL
2. Create a model
3. Run “python mymodel.py”
4. Tweak the model
5. Buy some more video cards to make it go faster*

* Has nothing to do with wanting to play games at 120 FPS on a
4K monitor
ETL

* A validation set should also be included; we only ran that in our final tool
ETL Cont’d.

• Map each document to a fixed-length vector of integers (vocabulary
IDs) of length MAX_DOCUMENT_LENGTH
• Each word is assigned a unique integer
• Save the vocabulary (the word-to-ID mapping) so that IDs are consistent
across multiple training runs
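
The mapping described above can be sketched in plain Python. This is a minimal illustration, not the authors' ETL code: the helper names, the value of MAX_DOCUMENT_LENGTH, the whitespace tokenizer, and the choice to reserve ID 0 for padding and unknown words are all assumptions made for the example.

```python
import json

MAX_DOCUMENT_LENGTH = 8  # illustrative value; the slides do not state the real one
PAD_ID = 0  # assumption: ID 0 is reserved for padding and unknown words


def build_vocabulary(documents):
    """Assign each distinct word a unique integer ID, starting at 1."""
    vocabulary = {}
    for document in documents:
        for word in document.lower().split():
            if word not in vocabulary:
                vocabulary[word] = len(vocabulary) + 1
    return vocabulary


def to_fixed_length_ids(document, vocabulary, length=MAX_DOCUMENT_LENGTH):
    """Map a document to exactly `length` IDs, truncating or zero-padding."""
    ids = [vocabulary.get(word, PAD_ID) for word in document.lower().split()]
    ids = ids[:length]
    return ids + [PAD_ID] * (length - len(ids))


documents = ["develop utility sas macros", "supervise and mentor junior staff"]
vocabulary = build_vocabulary(documents)
vectors = [to_fixed_length_ids(d, vocabulary) for d in documents]

# Serialize the mapping so a later training run can reuse the same IDs.
serialized = json.dumps(vocabulary, sort_keys=True)
restored = json.loads(serialized)
```

Writing `serialized` to a file and loading it before the next run keeps word IDs stable, which matters because the learned embeddings are indexed by those IDs.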
We’ll come back to the model, but first an interlude:

Our experience getting the model up and running

First it crashed
And then it crashed again….

(Ubuntu 15 bug from install)

And again…
Then it burned

(our model didn’t perform so well)

What is a Convolutional Neural Network?
What is a Convolution?
• A sliding window function
applied to a matrix.
• The window is called a kernel,
patch, or filter.

Source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
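
As a rough sketch of the sliding-window idea, the NumPy function below moves a kernel over a matrix one position at a time and sums the elementwise products in each window. The function name and the small 5x5 input are chosen for illustration only. Like most CNN libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np


def convolve2d_valid(matrix, kernel):
    """Slide `kernel` over `matrix` with stride 1 and no padding."""
    matrix = np.asarray(matrix, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    k_rows, k_cols = kernel.shape
    out_rows = matrix.shape[0] - k_rows + 1
    out_cols = matrix.shape[1] - k_cols + 1
    output = np.zeros((out_rows, out_cols))
    for i in range(out_rows):
        for j in range(out_cols):
            # The window is the patch of the input currently under the kernel.
            window = matrix[i:i + k_rows, j:j + k_cols]
            output[i, j] = np.sum(window * kernel)
    return output


image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])
feature_map = convolve2d_valid(image, kernel)
print(feature_map)
```

The 5x5 input and 3x3 kernel give a 3x3 feature map, because the kernel only fits in three positions along each axis.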
What is a Convolution?

Source: http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp/
Strides
• Stride = number of pixels/words/characters to shift when looping
over the input matrix

https://www.udacity.com/course/deep-learning--ud730
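
To make the effect of stride concrete, here is a small sketch that lists the windows visited over a sequence of words. The helper name and the sample words (taken from the job description earlier in the deck) are illustrative choices, not part of the original model.

```python
def sliding_windows(tokens, window_size, stride):
    """Return the windows visited when moving `stride` positions at a time."""
    last_start = len(tokens) - window_size
    return [tokens[start:start + window_size]
            for start in range(0, last_start + 1, stride)]


words = ["develop", "test", "and", "validate", "sas", "interfaces"]
stride_1 = sliding_windows(words, window_size=3, stride=1)
stride_2 = sliding_windows(words, window_size=3, stride=2)

print(len(stride_1))  # 4 windows
print(len(stride_2))  # 2 windows
```

A larger stride visits fewer windows, which means less computation but a coarser view of the input.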
Embeddings
“The cat purrs”

“The cat hunts mice”

“The kitten purred”

Do kittens hunt mice?
Word2vec math
• Since embeddings are vectors of floats:

Puppy – Dog + Cat = Kitten
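
The arithmetic can be tried on toy vectors. The three-dimensional vectors below are made up by hand for illustration (roughly "dog-ness", "cat-ness", "youth"); real word2vec embeddings are learned from text and have far more dimensions.

```python
import numpy as np

# Hand-made toy vectors, NOT learned embeddings: (dog-ness, cat-ness, youth).
embeddings = {
    "dog":    np.array([1.0, 0.0, 0.0]),
    "puppy":  np.array([1.0, 0.0, 1.0]),
    "cat":    np.array([0.0, 1.0, 0.0]),
    "kitten": np.array([0.0, 1.0, 1.0]),
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def nearest_word(vector):
    """Return the vocabulary word whose embedding is most similar to `vector`."""
    return max(embeddings, key=lambda w: cosine_similarity(vector, embeddings[w]))


query = embeddings["puppy"] - embeddings["dog"] + embeddings["cat"]
answer = nearest_word(query)
print(answer)  # kitten
```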

Embeddings
• Alternative to one-hot encoding
• Scales better when there are thousands of categories, each of which
appears only sparsely
• We learned embeddings from scratch instead of using word2vec

https://www.udacity.com/course/deep-learning--ud730
Embeddings

We tweaked the hyperparameter to:

EMBEDDING_SIZE = 32
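
As a sketch of what this hyperparameter controls: an embedding layer is a matrix with one row of EMBEDDING_SIZE floats per vocabulary ID, and embedding a document is a row lookup. The vocabulary size, the random initialization range, and the sample IDs below are assumptions for illustration; in the real model the matrix values are learned during training.

```python
import numpy as np

EMBEDDING_SIZE = 32
VOCABULARY_SIZE = 10  # hypothetical; the real vocabulary is much larger

# Random starting values; training would adjust these.
rng = np.random.default_rng(seed=0)
embedding_matrix = rng.uniform(-1.0, 1.0, size=(VOCABULARY_SIZE, EMBEDDING_SIZE))

# A document already mapped to fixed-length vocabulary IDs (0 = padding).
document_ids = np.array([1, 2, 3, 4, 0, 0, 0, 0])

# Indexing with an array of IDs selects one row per word.
embedded_document = embedding_matrix[document_ids]
print(embedded_document.shape)  # (8, 32)
```

The result is a matrix with one row per word position, which is the input the convolution slides over.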
Activation Function: non-linearity
• We’ll skim over these details, but we used a ReLU
• It’s a rectified linear unit
• ReLUs are an alternative to sigmoid-shaped functions, such as tanh
• Linear if x > 0, 0 where x <= 0
• https://youtu.be/Opg63pan_YQ
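
The definition above fits in one line of NumPy. This is a minimal sketch with sample inputs chosen for illustration:

```python
import numpy as np


def relu(x):
    """Rectified linear unit: x where x > 0, otherwise 0."""
    return np.maximum(0.0, x)


inputs = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
outputs = relu(inputs)
print(outputs)
```

Negative inputs are clamped to zero while positive inputs pass through unchanged, which is what makes the function non-linear overall.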
Pooling: reduce complexity while maintaining accuracy

• Aggregate windows of values
in the matrix to reduce the size of the
output
• Average pooling: similar to
“blurring” an image
• Max pooling: use the maximum
of all values in the neighborhood
of each value

https://www.udacity.com/course/deep-learning--ud730
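
Both pooling variants can be sketched on a one-dimensional sequence. The helper below uses non-overlapping windows and drops any leftover values at the end; that windowing choice, the function name, and the sample values are assumptions for illustration.

```python
import numpy as np


def pool1d(values, window_size, mode="max"):
    """Aggregate non-overlapping windows of `values` by max or average."""
    values = np.asarray(values, dtype=float)
    usable = (len(values) // window_size) * window_size  # drop the remainder
    windows = values[:usable].reshape(-1, window_size)
    if mode == "max":
        return windows.max(axis=1)
    if mode == "average":
        return windows.mean(axis=1)
    raise ValueError("mode must be 'max' or 'average'")


activations = [1.0, 3.0, 2.0, 8.0, 4.0, 4.0]
max_pooled = pool1d(activations, window_size=2, mode="max")
avg_pooled = pool1d(activations, window_size=2, mode="average")
print(max_pooled)
print(avg_pooled)
```

Six input values shrink to three in both cases; max pooling keeps the strongest response in each window, while average pooling smooths them.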
Pooling
• Allows you to reduce stride, and increase accuracy
• Lower strides increase computation cost
• Another hyperparameter!

• Note: The window cannot be centered on values at the edges, so either pad
the input with zeros (‘same’ padding) or accept a smaller output (‘valid’ padding).
Source: http://cs231n.github.io/convolutional-networks/#pool
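
The output sizes for the two padding modes can be computed directly. The sketch below uses the formulas TensorFlow documents for its 'SAME' and 'VALID' padding along one dimension; the function name and the sample sizes are illustrative.

```python
import math


def output_length(input_length, window_size, stride, padding):
    """Output size along one dimension for 'valid' or 'same' padding."""
    if padding == "valid":
        # Only positions where the whole window fits inside the input.
        return math.ceil((input_length - window_size + 1) / stride)
    if padding == "same":
        # Zero-padding lets every stride step produce an output.
        return math.ceil(input_length / stride)
    raise ValueError("padding must be 'valid' or 'same'")


for stride in (1, 2):
    valid = output_length(10, window_size=3, stride=stride, padding="valid")
    same = output_length(10, window_size=3, stride=stride, padding="same")
    print(stride, valid, same)
```

With an input of length 10 and a window of 3, stride 1 gives 8 ('valid') or 10 ('same') outputs, and stride 2 gives 4 or 5.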
Softmax: convert scores to probabilities
Returns a NumPy array with the same shape as the input:

import numpy as np

def softmax(x):
    # Exponentiate each score, then normalize so the results sum to 1
    e_x = np.exp(x)
    return e_x / e_x.sum(axis=0)

scores = [1.0, 2.0, 3.0]

print(softmax(scores))
[0.09003057 0.24472847 0.66524096]
What is a Convolutional Neural Network?
Demo!
Further Steps
Most inaccurate predictions involved phrases like “responsible for supervising the collection of,
managing, and reports on <x>”, where x might be “water samples” or “tax records”.

• Add layers by training on sentences instead of paragraphs, and then
training on that output.
• Common structure in job descriptions
• Similar to a NN understanding edges, and then shapes
• Split classification into job function and career level as additional layers
• Pre-process with TF-IDF for job functionality

Questions?

