
Natural Language Processing for CogSci Research

Instructor: Marlene Staib

IMC/CogSci workshop, September 2018

E-mail: [email protected] Course Materials: github

Morning session: 9 am-12 pm; Afternoon session: 1-4 pm


Office hour: 4-5 pm; Room: 1485, CogSci Computer room

Overview
This workshop gives an introduction to some of the Machine Learning techniques used for Natural Language Processing. By understanding and implementing some simple, exemplary models, participants will learn the basics of feature engineering, model selection and hyperparameter tuning. This will provide a foundation for studying and applying other approaches used in modern NLP. Discussions and labs will focus on the role of NLP in Cognitive Science, and on applying these analyses to participants’ own research.

Course Objectives
After the workshop, participants should be able to:

1. Understand the basic modelling approach(es) used in NLP

2. Implement a simple model for solving an NLP task

3. Critically evaluate choices of feature representation, model type, and hyperparameter settings

4. Understand the challenges and limitations specific to modelling language

5. Apply the learned techniques to relevant questions in Cognitive Science research, e.g. to comparatively evaluate texts from different speaker groups

Organization
The workshop is split into two parts: seminars (lecture and discussion) and labs. The lab materials will be provided in the form of Jupyter/iPython Notebooks. See the “Software” section below for how to get started with python and Jupyter.


To get the most out of the workshop, I will suggest materials (readings, blog posts, videos, labs/tutorials, quizzes) for you to engage with before (or in some cases after) the course. Materials are marked
[F] = foundational,
[C] = core, and
[A] = additional.
Please try to complete all of the core materials – there will only be a little bit of preparation, to familiarize yourselves with the topics and to introduce yourselves and your ideas to me. Foundational materials should help you gain a good background for what’s going on in the lectures and labs. Feel free to skip them if they seem easy to you – they are meant as support, not as an additional burden. Additional materials are for those of you who have smelled blood and now want more. They are also meant to keep you busy, in case you happen to be one of the people who finishes each lab in half the allocated time.
I do not expect anyone to read everything, or watch all of the linked videos; but I have noticed that it often helps me solidify my knowledge if I hear or read the same content from at least 2-3 different sources. Also feel free to find your own content – e.g. by googling the keywords for each day. I found some really cool videos and blog posts that way that were way more fun than the textbook.

Software
If you do not already have a working installation of Anaconda or Miniconda, please install Miniconda from the provided link. It doesn’t really matter which version you have (if you do not have an installation yet, I recommend version 3.6), as long as you set up your environment as described below. I can give only very limited support for problems relating to versions, operating systems and conflicting package installations; therefore I would like you to follow these instructions to set up a new conda environment for the course. If you are an expert and know you can make it work no matter what, you may do whatever you want ;)
On Mac/Linux, open your terminal. On Windows, you can use the Anaconda Prompt (hit the start button and type "anaconda prompt"). Run:
conda create -n nlp_workshop python=3
source activate nlp_workshop (Mac/Linux); or: activate nlp_workshop (Windows)
conda install numpy pandas matplotlib seaborn ipykernel nltk scikit-learn
python -m ipykernel install --user --name nlp_workshop --display-name "Python (nlp)"
This creates a separate python environment for the course, which should not conflict with any other versions of python you have on your computer. You can activate and deactivate the environment with:
source activate nlp_workshop
source deactivate
Whenever the environment is activated, you can use that installation of python via the terminal/command prompt. Now it’s time to download the labs to your computer. If you have git installed, just run:
git clone https://github.com/MarleneStaib/NLPworkshop.git
Otherwise, open the workshop repository in your browser and download it as a zip file. Save it in your home directory. In your terminal, navigate to the folder/directory where you saved it,


like this:
cd NLPworkshop/labs
Then type (with the environment still activated):
jupyter notebook
This should automatically open a browser window showing the Jupyter notebook dashboard, with the lab notebooks listed.

You can click on the labs to open them, but that’s for later. You should be all set now. (To close
the notebook, you can close the browser windows and then hit Ctrl+C in your terminal.) If you
cannot manage to get there on your own, come see me in my first office hour on Monday, 17th
of September at 4pm.

Timetable
Day 0: Preparation
Materials:

• Pre-course Survey: Please fill in the pre-course survey [C]

• Short intro video: NLP tasks and applications; overview [C]

• Set up your environment (see Software); if you run into trouble, come see me during the first office hour before the course: Monday, 17th of September at 4 pm [C]

• Optional: NLTK intro and tutorial [A]

• If you want a deeper understanding of the modelling approaches used in most of modern NLP, I recommend freshening up your linear algebra, probability theory and a tiny bit of calculus. [F]

• If you want an overview of NLP tasks and methods, I recommend the standard introduction “Speech and Language Processing” by Jurafsky and Martin (henceforth: J&M). Their 3rd edition draft is freely available online and is the most up-to-date, though still incomplete; for specific topics, you may have to check out edition 2. There is also a whole course (corresponding roughly to the J&M book(s)) available for free on YouTube by Dan Jurafsky and Chris Manning from Stanford University. [A]


Day 1: Modelling in NLP; Naive Bayes


Morning: Lecture and discussion:

• Intro to NLP; reply to your comments from the online survey

• Modelling in NLP:

– CogSci/Stats versus AI/Machine Learning
– Inputs and outputs, feature representations for language
– Modelling approaches: supervised/unsupervised, discriminative/generative, classification/regression
– Hyperparameters
– NLP – just another machine learning problem?

• First example: Naive Bayes for text classification

– Some background in probability theory
– Bayes’ rule
– The Naive Bayes Classifier: a generative model
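
As a taste of the generative modelling approach, here is a minimal Naive Bayes sketch using scikit-learn (one of the packages installed in the course environment). This is not the lab code; the four-sentence corpus and its labels are invented for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny invented corpus with sentiment labels
texts = ["great movie, loved it", "awful film, hated it",
         "loved the acting", "hated the plot, awful"]
labels = ["pos", "neg", "pos", "neg"]

# Bag-of-words counts as the feature representation
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# Generative model: estimates P(class) and P(word | class)
model = MultinomialNB()
model.fit(X, labels)

print(model.predict(vectorizer.transform(["loved the film"])))
```

Because “loved” is frequent in the positive class, the unseen sentence is classified as positive, even though “film” only appeared in a negative example.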

Afternoon: Data Science in Python Lab


• numpy, pandas, matplotlib, seaborn

• Note: If you are very familiar with these libraries, get bored during the lab or just finish everything
early, you may get started on the Naive Bayes Lab linked under Day 2! :)
Materials:
• Chapter 4 in J&M, ed. 3 [F]

• Lessons 6.1 to 6.9 from Jurafsky’s lecture series on YouTube [F]

Day 2: Feature representations for language; Vector Semantics


Morning: Lecture and discussion:
• Possible representations for language; the problem with discrete representations

• Distributional hypothesis and vector semantics

• Measuring collocations and similarity: PPMI, TF-IDF, (PCA/LSA), cosine similarity
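
To make these ideas concrete, here is a small sketch (the three example sentences are made up; scikit-learn is from the course environment) that builds TF-IDF document vectors and compares them with cosine similarity:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["the cat sat on the mat",
        "the cat chased the mouse",
        "stock prices rose sharply today"]

# TF-IDF: down-weights words that occur in many documents
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)

# Pairwise cosine similarities between the document vectors
sims = cosine_similarity(X)
# The two cat sentences share vocabulary, so they should be more
# similar to each other than either is to the stock-market sentence
print(sims.round(2))
```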


Afternoon: Naive Bayes Lab
• Train-validation-test split

• Building and evaluating a basic model

• Bonus: Improving the basic model: Feature engineering
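
The train-validation-test split from the lab can be sketched as follows (synthetic data and invented proportions; scikit-learn’s train_test_split is applied twice because it only produces two-way splits):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 100 examples, 5 features each, binary labels
X = np.arange(500).reshape(100, 5)
y = np.arange(100) % 2

# First carve off a held-out test set (20%)...
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
# ...then split the remainder into train and validation
# (0.25 of the remaining 80% = 20% of the original data)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```

The test set is touched only once, at the very end; hyperparameters are tuned on the validation set.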


Materials:
• Chapter 6 in J&M, ed. 3 [F]


Day 3: Unsupervised Learning; k-means/GMM clustering


Morning: Lecture and discussion

• k-means clustering algorithm

• Choosing the number of clusters

• Gaussian Mixture Models
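
A minimal k-means sketch on synthetic 2-D data (the two blobs are invented for illustration; scikit-learn’s KMeans is from the course environment):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two well-separated synthetic blobs of 50 points each
blob_a = rng.normal(loc=[0, 0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[5, 5], scale=0.5, size=(50, 2))
points = np.vstack([blob_a, blob_b])

# Fit k-means with k=2; n_init restarts guard against bad initializations
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
labels = km.labels_

# With this separation, each blob lands entirely in one cluster
print(km.cluster_centers_.round(1))
```

With real data the right k is unknown, which is where the elbow method or silhouette scores (discussed in the lecture) come in.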

Afternoon: Vector Semantics Lab

• Turn some text into word vectors

• Experiment with different ways of creating vectors: TF-IDF, PPMI

• Measure cosine similarity between different word vectors

• Additive meaning of word vectors?

• Using word vectors for sentiment analysis
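
As a sketch of the PPMI computation from the lab (the co-occurrence counts below are made up; the formula is the standard PPMI = max(0, log2 p(w,c) / (p(w) p(c))) as in J&M chapter 6):

```python
import numpy as np

# Hypothetical word-by-context co-occurrence counts
counts = np.array([[4.0, 0.0, 1.0],
                   [2.0, 3.0, 0.0],
                   [0.0, 1.0, 5.0]])

total = counts.sum()
p_ij = counts / total                  # joint probabilities p(w, c)
p_i = p_ij.sum(axis=1, keepdims=True)  # word marginals p(w)
p_j = p_ij.sum(axis=0, keepdims=True)  # context marginals p(c)

# PMI = log2( p(w,c) / (p(w) p(c)) ); zero counts give -inf,
# which PPMI clips to 0 along with all other negatives
with np.errstate(divide="ignore"):
    pmi = np.log2(p_ij / (p_i * p_j))
ppmi = np.maximum(pmi, 0.0)
print(ppmi.round(2))
```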

Materials:

• Lecture on k-means by Victor Lavrenko on YouTube [F]

• Lecture on GMMs by Victor Lavrenko on YouTube [A]

Day 4: Neural Networks


Morning: Lecture and Discussion

• High-level introduction to Neural Nets and their application in NLP

• Almost all of modern NLP is “deep” learning (using Deep Neural Networks, DNNs). This
has achieved some amazing results, but there are still some challenges specific to language
that are unlikely to be solved with DNNs.

• We can discuss issues such as: What types of networks reflect what features of language?

Afternoon: Unsupervised Learning Lab

• Clustering word vectors with k-means and GMMs

• Defining the “optimal” number of clusters empirically

• Optional: Hierarchical Clustering

• Note: If you are done with all the labs early, you could have a look at this PyTorch tutorial and start
working on Neural Nets :)

Materials:

• Intro to Neural Networks by 3blue1brown. [A]


• Skip-gram: Learning vector semantic representations with a neural network (so-called “word embeddings”). [A]

• This is way beyond what is covered here, but if you are convinced that NLP is awesome, feel free to check out Stanford’s course on Natural Language Processing with Deep Learning, which is fully available on YouTube. [A]

• Once you have a bit of an understanding of Deep Learning and NLP, you may want to
check out this amazing PyTorch tutorial, to implement your own model. [A]

