TensorFlow Implementation for Job Market Classification: Taras Mitran, Jeff Waller
What is the market rate for this job at the 50th percentile?
At the 60th percentile?
Can similar or nearly identical jobs be clustered, so the user only has to
review a small subset of jobs?
HR Compensation Workflow
Luckily, this is being done.
The issue: it's being done manually, and tens of thousands of different
job descriptions are created each year.
Because Google?
Text-based Convolutional Neural Nets
• Udacity course:
https://www.udacity.com/course/deep-learning--ud730
• All examples are in the main TensorFlow repository:
https://www.udacity.com/course/deep-learning--ud730
• ConvNet on Rotten Tomatoes data set guide:
http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/
Facebook whitepaper on convolutional neural
networks:
* A validation set should also be included; we just ran that in our final tool
ETL Cont’d.
Source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
What is a Convolution?
Source: http://www.wildml.com/2015/11/understanding-convolutional-neural-networks-for-nlp/
Strides
• Stride = number of pixels/words/characters to shift at each step when
sliding the filter over the input matrix
https://www.udacity.com/course/deep-learning--ud730
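To make the stride definition concrete, here is a minimal NumPy sketch of a 1-D filter sliding over an input. The function name `conv1d` and the toy signal are illustrative assumptions, not part of the tool described in these slides. As is usual in deep learning, the filter is not flipped, so this is strictly a cross-correlation.

```python
import numpy as np

def conv1d(inputs, kernel, stride=1):
    """Slide `kernel` over `inputs`, shifting `stride` positions per step.

    No padding is applied, so only windows that fit entirely inside
    the input are evaluated ('valid' behaviour).
    """
    inputs = np.asarray(inputs, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    n_steps = (len(inputs) - len(kernel)) // stride + 1
    return np.array([
        np.dot(inputs[i * stride : i * stride + len(kernel)], kernel)
        for i in range(n_steps)
    ])

# Hypothetical toy data: 7 input positions, filter of width 3.
signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

out_stride1 = conv1d(signal, kernel, stride=1)  # 5 window positions
out_stride2 = conv1d(signal, kernel, stride=2)  # 3 window positions
```

Doubling the stride roughly halves the number of window positions, and therefore the computation, at the cost of skipping some of the input.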
Embeddings
“The cat purrs”
EMBEDDING_SIZE = 32
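A rough sketch of what an embedding lookup does for the sentence above, assuming a toy four-word vocabulary and a randomly initialised matrix. Both are hypothetical; in a trained model the matrix values are learned, and the real vocabulary would come from the job descriptions.

```python
import numpy as np

EMBEDDING_SIZE = 32  # value from the slide

# Hypothetical toy vocabulary: word -> row index in the embedding matrix.
vocab = {"<unk>": 0, "the": 1, "cat": 2, "purrs": 3}

# One row of EMBEDDING_SIZE numbers per word. Random here for illustration;
# during training these values would be learned.
rng = np.random.default_rng(0)
embedding_matrix = rng.uniform(-1.0, 1.0, size=(len(vocab), EMBEDDING_SIZE))

def embed(sentence):
    """Map each word to its row of the embedding matrix."""
    token_ids = [vocab.get(word, vocab["<unk>"]) for word in sentence.lower().split()]
    return embedding_matrix[token_ids]

sentence_matrix = embed("The cat purrs")  # one 32-dimensional row per word
```

The resulting 3 x 32 matrix is the kind of input a text convolution slides over, one row per word.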
Activation Function: non-linearity
• We’ll skim over these details, but we used a ReLU
• It’s a rectified linear unit
• ReLUs are an alternative to S-shaped (sigmoid-type) functions, such as tanh
• Linear (f(x) = x) where x > 0; 0 where x <= 0
• https://youtu.be/Opg63pan_YQ
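The ReLU rule above fits in one line of NumPy. This is a minimal sketch, with the input values chosen purely for illustration:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: x where x > 0, otherwise 0."""
    return np.maximum(x, 0)

scores = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
activated = relu(scores)  # negatives are clipped to 0, positives pass through
```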
Pooling: reduce complexity while maintaining accuracy
https://www.udacity.com/course/deep-learning--ud730
Pooling
• Allows you to use a smaller stride and then pool the results, which can increase accuracy
• Lower strides increase computation cost
• Another hyperparameter!
• Note: Output will lose edges, so either pad with zeros for ‘same’
padding, or the output size will be smaller with ‘valid’ padding.
Source: http://cs231n.github.io/convolutional-networks/#pool
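The padding note can be illustrated with a small 1-D max-pooling sketch. The function `max_pool1d`, its defaults, and the sample values are assumptions made for illustration. Zero padding is used for 'same' as described above, which is harmless when the inputs are non-negative (for example, after a ReLU).

```python
import math
import numpy as np

def max_pool1d(x, pool_size=2, stride=2, padding="valid"):
    """Take the max of each window of `pool_size`, shifting by `stride`."""
    x = np.asarray(x, dtype=float)
    if padding == "same":
        # Pad so that the output has ceil(len(x) / stride) entries.
        out_len = math.ceil(len(x) / stride)
        pad_total = max((out_len - 1) * stride + pool_size - len(x), 0)
        pad_left = pad_total // 2
        # Zero padding, as on the slide; assumes non-negative inputs.
        x = np.pad(x, (pad_left, pad_total - pad_left))
    else:
        # 'valid': only windows that fit entirely inside the input.
        out_len = (len(x) - pool_size) // stride + 1
    return np.array([
        x[i * stride : i * stride + pool_size].max() for i in range(out_len)
    ])

features = [1, 3, 2, 5, 4]  # hypothetical post-ReLU activations

pooled_valid = max_pool1d(features, padding="valid")  # last element is dropped
pooled_same = max_pool1d(features, padding="same")    # last element is kept via padding
```

With 'valid' padding the trailing value 4 never appears in a full window, so the output is shorter; with 'same' padding it survives in a padded window.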
Softmax: convert scores to probabilities
Returns a NumPy array with the same shape as the input:
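import numpy as np

def softmax(x):
    # Subtracting the max avoids overflow in exp(); the result is unchanged.
    e_x = np.exp(x - np.max(x, axis=0))
    return e_x / e_x.sum(axis=0)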
• Split classification into job function and career level as additional layers