
Natural Language Processing using PYTHON

(with NLTK, scikit-learn and Stanford NLP APIs)

Instructor: Diptesh Kanojia, Abhijit Mishra


Supervisor: Prof. Pushpak Bhattacharyya
Center for Indian Language Technology
Department of Computer Science and Engineering
Indian Institute of Technology Bombay
email: {diptesh,abhijitmishra,pb}@cse.iitb.ac.in
URL: http://www.cse.iitb.ac.in/~{diptesh,abhijitmishra,pb}

VIVA Institute of Technology, 2016 Diptesh, Abhijit http://www.cfilt.iitb.ac.in


Roadmap
Session 1 (Introduction to NLP, Shallow Parsing and Deep Parsing)
Introduction to python and NLTK
Text Tokenization, POS tagging and chunking using NLTK.
Constituency and Dependency Parsing using NLTK and Stanford Parser
Session 2 (Named Entity Recognition, Coreference Resolution)
NER using NLTK
Coreference Resolution using NLTK and Stanford CoreNLP tool
Session 3 (Meaning Extraction, Deep Learning)
WordNets and WordNet-API
Other Lexical Knowledge Networks – VerbNet and FrameNet



SESSION-1 (INTRODUCTION TO NLP, SHALLOW PARSING
AND DEEP PARSING)

• Introduction to python and NLTK


• Text Tokenization, Morphological Analysis, POS tagging and
chunking using NLTK.
• Constituency and Dependency Parsing using NLTK

Expected duration: 15 mins

Why Python?
Q: Can a Python program, compiled to bytecode, be executed on every machine (e.g. on Windows, or on Linux) without modification?

A: Yes. Python bytecode is cross-platform (see "Is Python bytecode version-dependent? Is it platform-dependent?" on Stack Overflow). However, it is not compatible across versions: Python 2.6 cannot execute files compiled by Python 2.5. So, while cross-platform, it is not generally useful as a distribution format.

Q: But why does Python need both a compiler and an interpreter?

A: Speed. Strict interpretation is slow. Virtually every "interpreted" language actually compiles the source code into some internal representation so that it does not have to repeatedly parse the code. In Python's case, this internal representation is saved to disk so the parsing/compiling step can be skipped the next time the code is needed.
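The compile-then-interpret pipeline described above can be observed with the standard library's `compile` built-in and `dis` module (a minimal illustration):

```python
import dis

# CPython compiles source text into a code object before interpreting it.
code = compile("x + 1", "<demo>", "eval")

# The code object carries the raw bytecode the interpreter executes.
print(type(code).__name__)     # 'code'
print(len(code.co_code) > 0)   # True: non-empty bytecode

# dis renders the bytecode in human-readable form.
dis.dis(code)
```

On import, CPython caches this compiled form on disk (.pyc files), which is the saved internal representation mentioned above.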



Introduction to python
A programming language with strong similarities to Perl and C, with dynamic typing and object-oriented features.
Commonly used for producing dynamic web content (e.g. Instagram, Bitbucket, Mozilla and many more websites are built on the Python/Django framework).
Great for text processing (e.g. powerful RegEx tools).
Useful built-in types (lists, dictionaries, generators, iterators).
Parallel computing (multi-processing and multi-threading APIs).
Map-reduce facilities, lambda functions.
Clean/readable syntax; lots of open-source standard libraries.
Code reusability.
DEMO (basics.py)
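The features listed above can be sketched in a few lines (an illustrative snippet, not the session's basics.py demo):

```python
# Lists and dictionaries are first-class built-ins.
words = ["natural", "language", "processing"]
lengths = {w: len(w) for w in words}

# Lambda functions and map/filter support a map-reduce style.
upper = list(map(lambda w: w.upper(), words))
long_words = [w for w in words if len(w) > 7]

# Generators evaluate lazily, which is handy for large corpora.
def bigrams(tokens):
    """Yield adjacent token pairs without building a full list."""
    for i in range(len(tokens) - 1):
        yield (tokens[i], tokens[i + 1])

print(lengths["language"])    # 8
print(upper[0])               # NATURAL
print(list(bigrams(words)))   # [('natural', 'language'), ('language', 'processing')]
```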
Installing Python
Download and installation instructions at:
https://www.python.org/download/
Windows/Mac systems require installation.
Pre-installed on recent Linux distributions (Ubuntu, Fedora, SUSE etc.).
We will use Python 2.7.x versions.



Python Tutorials
“Dive into Python”
http://diveintopython.org/
The Official Python Tutorial
https://docs.python.org/2/tutorial/
The Python Quick Reference
http://rgruet.free.fr/PQR2.3.html



Useful IDE/Text-Editors
IDLE (Windows)
Vi/Emacs (Linux)
Geany (Windows/Linux/Mac)
Pydev plugin for Eclipse IDE (Windows/Linux/Mac)
Notepad++ (Windows)



Useful Python Libraries
NumPy (Mathematical Computing, Advanced mathematical functionalities)
Matplotlib (Numerical plotting library, useful in data analysis)
Scipy (Library for scientific computation)
Scikit-learn (Machine Learning/Data-mining library)
PIL (Python library for Image Processing)
PySpeech (Library for speech processing and text-to-speech conversion)
XML/LXML (XML Parsing and Processing)
NLTK (Natural Language Processing)
And many more…
https://wiki.python.org/moin/UsefulModules



The Natural Language Toolkit (NLTK)
Developed by Steven Bird and Edward Loper at the University of
Pennsylvania (first released in 2001).
Open source python modules, datasets and tutorials
Papers:
Bird, Steven. "NLTK: the natural language toolkit." Proceedings of the COLING/ACL on Interactive
presentation sessions. Association for Computational Linguistics, 2006.
Loper, Edward, and Steven Bird. "NLTK: The natural language toolkit." Proceedings of the ACL-02
Workshop on Effective tools and methodologies for teaching natural language processing and
computational linguistics-Volume 1. Association for Computational Linguistics, 2002.



Components of NLTK (Bird et al., 2006)
1. Code: corpus readers, tokenizers, stemmers, taggers, chunkers,
parsers, wordnet, ... (50k lines of code)
2. Corpora: >30 annotated data sets widely used in natural
language processing (>300Mb data)
3. Documentation: a 400-page book, articles, reviews, API
documentation



1. Code
Corpus Readers
Tokenizers
Stemmers
Taggers
Parsers
WordNet
Semantic Interpretation
Clusterers
Evaluation Metrics



2. Corpora
Brown Corpus
Carnegie Mellon Pronouncing Dictionary
CoNLL 2000 Chunking Corpus
Project Gutenberg Selections
NIST 1999 Information Extraction: Entity Recognition Corpus
US Presidential Inaugural Address Corpus
Indian Language POS-Tagged Corpus
Floresta Portuguese Treebank
Prepositional Phrase Attachment Corpus
SENSEVAL 2 Corpus
Sinica Treebank Corpus Sample
Universal Declaration of Human Rights Corpus
Stopwords Corpus
TIMIT Corpus Sample
Treebank Corpus Sample



3. Documentation
Books:
Natural Language Processing with Python - Steven Bird, Edward Loper,
Ewan Klein
Python Text Processing with NLTK 2.0 Cookbook – Jacob Perkins
Included in NLTK:
Installation instructions
API Documentation: describes every module, interface, class, and
method



NLTK- How to?
Install NLTK
Follow instructions at http://www.nltk.org/install.html
Installers for Windows, Linux and Mac OS available
Check installation
Execute the python command through a "shell" (Linux/Mac) or the command prompt "cmd"
(Windows):
• $ python
• >>> import nltk
The interpreter should import "nltk" without showing any error.
Download NLTK data (corpora):
>>> nltk.download()



NLTK modules
NLP Task | NLTK Modules | Functionality
Accessing Corpora | nltk.corpus | Standardized interfaces to corpora and lexicons
String Processing | nltk.tokenize, nltk.stem | Tokenizers, sentence tokenizers, stemmers
Collocation Discovery | nltk.collocations | t-test, chi-squared, point-wise mutual information
POS Tagging | nltk.tag | n-gram, backoff, Brill, HMM, TnT
Chunking | nltk.chunk | Regular expression, n-gram, named entity
Parsing | nltk.parse | Chart, feature-based, unification, probabilistic, dependency
Classification | nltk.classify, nltk.cluster | Decision tree, maximum entropy, naive Bayes, EM, k-means
Semantic Interpretation | nltk.sem, nltk.inference | Lambda calculus, first-order logic, model checking
Evaluation Metrics | nltk.metrics | Precision, recall, agreement coefficients
Probability Estimation | nltk.probability | Frequency distributions, smoothed probability distributions
Applications | nltk.app | Graphical concordancer, parsers, WordNet browser
Linguistic Fieldwork | nltk.toolbox | Manipulate data in SIL Toolbox format


Text Tokenization
Process of splitting a string into a list of tokens (words, punctuation marks etc.).
For most languages, whitespace separates two adjacent words.
Exceptions (source: Wikipedia):
In Chinese and Japanese, sentences are delimited but words are not.
In Thai, phrases and sentences are delimited but not words.



Text Tokenization – NLTK Tokenizers
Description: http://www.nltk.org/api/nltk.tokenize.html
Demo:
LineTokenizer – Tokenize string into lines
PunktWordTokenizer (statistical) –
• Tokenization based on an unsupervised ML algorithm.
• Model parameters are learnt by training on a large corpus of abbreviations,
collocations, and words that start sentences.
RegexpTokenizer – Tokenization based on regular expressions
SExprTokenizer – Finds parenthesized expressions in a string
TreebankWordTokenizer – Tokenization as per Penn Treebank standards
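What RegexpTokenizer does can be approximated in pure Python with the standard `re` module (a sketch of the idea, not NLTK's implementation; the default pattern here is an assumption):

```python
import re

def regexp_tokenize(text, pattern=r"\w+|[^\w\s]"):
    """Split text into word tokens and single punctuation marks via a regex."""
    return re.findall(pattern, text)

sentence = "Mr. Brown isn't here, is he?"
print(regexp_tokenize(sentence))
# ['Mr', '.', 'Brown', 'isn', "'", 't', 'here', ',', 'is', 'he', '?']
```

Note how a purely regex-based tokenizer splits contractions like "isn't" naively; the Punkt and Treebank tokenizers handle such cases with trained models or hand-written rules.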



NLTK Morphological Analyzers
DEMO- Stemmers
Lancaster Stemmer
Porter Stemmer
Regexp Stemmer
Snowball Stemmer
DEMO - Lemmatizers
WordNet based Lemmatizers
Script: morphological_analyzer.py
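The rule-based stemmers above work by stripping suffixes; a toy suffix-stripper in the spirit of NLTK's RegexpStemmer (purely illustrative, much cruder than Porter or Snowball):

```python
def regexp_stem(word, suffixes=("ing", "ed", "ly", "es", "s")):
    """Remove the first matching suffix, keeping a minimum stem length of 3."""
    for suf in suffixes:
        if word.endswith(suf) and len(word) - len(suf) >= 3:
            return word[: -len(suf)]
    return word

for w in ["running", "jumped", "quickly", "cats"]:
    print(w, "->", regexp_stem(w))
# running -> runn, jumped -> jump, quickly -> quick, cats -> cat
```

The "runn" output shows why stemming is not lemmatization: a stemmer only truncates by pattern, while a WordNet-based lemmatizer maps the word to a real dictionary form ("run").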



NLTK- Part of Speech Tagging
The process of sequentially labeling words in a sentence with
their corresponding part of speech tags.
Demo – NLTK POS Taggers (pos_tagger.py)
Unigram Tagger (Based on prior probability)
Brill Tagger (Rule Based)
Regexp Tagger (Using regular expressions for tagging)
HMM based Tagger (HMM-Viterbi based)
Stanford tagger (Using Stanford module)
NLTK recommended tagger
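The idea behind the unigram tagger (label each word with its most frequent tag in the training data, i.e. its prior) fits in a few lines; a sketch on a hypothetical toy corpus:

```python
from collections import Counter, defaultdict

def train_unigram_tagger(tagged_sents):
    """Map each word to its most frequent tag in the training data."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, pos in sent:
            counts[word][pos] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(model, tokens, default="NN"):
    """Tag tokens, backing off to a default tag for unseen words."""
    return [(t, model.get(t, default)) for t in tokens]

train = [[("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
         [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")]]
model = train_unigram_tagger(train)
print(tag(model, ["the", "dog", "sleeps"]))
# [('the', 'DT'), ('dog', 'NN'), ('sleeps', 'VBZ')]
```

The fixed default tag is the simplest form of the backoff chaining that NLTK's taggers support.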



Parsers
Writing a grammar
Rule based constituency parsing
RecursiveDescent Parser
ShiftReduce Parser
DEMO- Statistical Parsers
Probabilistic Context Free Grammar (PCFG)
• Stanford parser
Probabilistic Dependency Parsing
• Malt Parser
• Stanford Parser
Script: parser_demo.py
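Writing a grammar and parsing with it can be illustrated by a tiny recursive-descent recognizer over a toy CFG (a sketch only; NLTK's RecursiveDescentParser handles arbitrary grammars and builds full parse trees):

```python
# Toy grammar: S -> NP VP ; NP -> DT N ; VP -> V NP | V
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["DT", "N"]],
    "VP": [["V", "NP"], ["V"]],
}
LEXICON = {"the": "DT", "dog": "N", "cat": "N", "chases": "V", "sleeps": "V"}

def parse(symbol, tokens, pos):
    """Try to expand `symbol` at tokens[pos]; return the end index or None."""
    if symbol in LEXICON.values():                 # terminal (a POS tag)
        if pos < len(tokens) and LEXICON.get(tokens[pos]) == symbol:
            return pos + 1
        return None
    for production in GRAMMAR[symbol]:             # try each production in turn
        cur = pos
        for child in production:
            cur = parse(child, tokens, cur)
            if cur is None:
                break
        else:
            return cur
    return None

def recognize(sentence):
    tokens = sentence.split()
    return parse("S", tokens, 0) == len(tokens)

print(recognize("the dog chases the cat"))  # True
print(recognize("the dog the cat"))         # False
```

This version only recognizes sentences and does limited backtracking; a real parser would also return the constituency tree.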
SESSION 2 (NAMED ENTITY RECOGNITION,
COREFERENCE RESOLUTION)

• NER using NLTK

Expected duration: 15 mins

NER and Coreference Resolution using NLTK
DEMO – Named entity chunking using NLTK
Using Stanford CoreNLP tool for Coreference Resolution
Download and installation instructions at
http://stanfordnlp.github.io/CoreNLP/
Python Wrapper for CoreNLP
https://github.com/dasmith/stanford-corenlp-python
• DEMO – Coreference Resolution using Stanford CoreNLP
• Scripts: coreference_resolution.py
named_entity_chunking.py
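A crude capitalization-based chunker conveys the idea behind named-entity chunking (purely illustrative; NLTK's ne_chunk uses a trained classifier, and this heuristic would mislabel sentence-initial words):

```python
import re

def chunk_named_entities(tokens):
    """Group maximal runs of capitalized tokens as candidate entities."""
    chunks, current = [], []
    for tok in tokens:
        if re.match(r"^[A-Z][a-z]+$", tok):
            current.append(tok)
        else:
            if current:
                chunks.append(" ".join(current))
                current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

tokens = "yesterday Barack Obama met Angela Merkel in Berlin".split()
print(chunk_named_entities(tokens))
# ['Barack Obama', 'Angela Merkel', 'Berlin']
```

Note the heuristic gives untyped chunks; a classifier-based chunker additionally labels each one as PERSON, LOCATION, ORGANIZATION etc.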



SESSION-3 (MEANING EXTRACTION, DEEP LEARNING)

• WordNets and WordNet-API


• Other Lexical Knowledge Networks – VerbNet and FrameNet

Expected duration: 10 mins

WordNet
DEMO - NLTK WordNet (wordnet.py)
Finding all the synonym sets (SynSets) of a word across all possible POS
tags.
Finding all the SynSets when the POS tag is known.
Finding hypernyms and hyponyms of a synset.
Finding similarities between two words.
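WordNet's path similarity scores two senses by the shortest path between them in the hypernym graph, as 1 / (path length + 1); the measure can be sketched over a hypothetical miniature taxonomy (toy data, not WordNet's):

```python
# Toy hypernym links: child -> parent (a hypothetical miniature taxonomy).
HYPERNYMS = {"dog": "canine", "cat": "feline",
             "canine": "carnivore", "feline": "carnivore",
             "carnivore": "mammal", "mammal": "animal"}

def ancestors(word):
    """Return the chain word, parent, grandparent, ... up to the root."""
    chain = [word]
    while chain[-1] in HYPERNYMS:
        chain.append(HYPERNYMS[chain[-1]])
    return chain

def path_similarity(a, b):
    """1 / (shortest hypernym-path length + 1), as WordNet defines it."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = [n for n in up_a if n in up_b]
    if not common:
        return None
    lca = common[0]                       # lowest common ancestor
    dist = up_a.index(lca) + up_b.index(lca)
    return 1.0 / (dist + 1)

print(path_similarity("dog", "cat"))  # 0.2 (dog->canine->carnivore<-feline<-cat)
```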



Other Lexical Networks – ConceptNet, VerbNet and FrameNet

DEMO – ConceptNet (conceptnet.py)

Using the Divisi API (since ConceptNet is not available in NLTK)
Using ConceptNet for finding attributes of a concept.
Using ConceptNet for word similarity computation.
DEMO- VerbNet (framenet_verbnet.py)
Obtaining verb classes from VerbNet
DEMO – FrameNet
Listing all the frames in FrameNet
Obtaining properties of a particular frame.



THANK YOU

9th January, 2016