CCS369

This document outlines the objectives and units of study for a course on text and speech analysis. The course covers natural language processing basics, text classification algorithms, question answering systems, speech recognition and speech synthesis. It includes suggested activities, evaluation methods and expected learning outcomes.

CSE/IT/CCE/AIDS/CSBS- DATA SCIENCE

CCS369 TEXT AND SPEECH ANALYSIS                L T P C
                                               2 0 2 3
COURSE OBJECTIVES:
• Understand natural language processing basics
• Apply classification algorithms to text documents
• Build question-answering and dialogue systems
• Develop a speech recognition system
• Develop a speech synthesizer
UNIT I NATURAL LANGUAGE BASICS 6
Foundations of natural language processing – Language Syntax and Structure – Text Preprocessing and
Wrangling – Text tokenization – Stemming – Lemmatization – Removing stop-words – Feature Engineering
for Text representation – Bag of Words model – Bag of N-Grams model – TF-IDF model
Suggested Activities
• Flipped classroom on NLP
• Implementation of Text Preprocessing using NLTK
• Implementation of TF-IDF models
Suggested Evaluation Methods
• Quiz on NLP Basics
• Demonstration of Programs
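For the TF-IDF activity in this unit, the following is a minimal sketch rather than a reference implementation. It assumes whitespace tokenization and the unsmoothed weight tf × log(N / df); libraries such as scikit-learn apply smoothed variants, so their values will differ. The names tokenize and tf_idf are illustrative only.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split is enough for this toy example.
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Relative term frequency times unsmoothed inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cats and the dogs",
]
weights = tf_idf(docs)
```

A term that occurs in every document (here "the") gets weight zero, while a term unique to one document (here "cat") gets the largest idf factor.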
UNIT II TEXT CLASSIFICATION 6
Vector Semantics and Embeddings – Word Embeddings – Word2Vec model – GloVe model – FastText model
– Overview of Deep Learning models – RNN – Transformers – Overview of Text summarization and Topic
Models
Suggested Activities
• Flipped classroom on Feature extraction of documents
• Implementation of SVM models for text classification
• External learning: Text summarization and Topic models
Suggested Evaluation Methods
• Assignment on above topics
• Quiz on RNN, Transformers
• Implementing NLP with RNN and Transformers
UNIT III QUESTION ANSWERING AND DIALOGUE SYSTEMS 9
Information retrieval – IR-based question answering – knowledge-based question answering – language
models for QA – classic QA models – chatbots – Design of dialogue systems – evaluating dialogue
systems
Suggested Activities:
• Flipped classroom on language models for QA
• Developing a knowledge-based question-answering system
• Classic QA model development
Suggested Evaluation Methods
• Assignment on the above topics
• Quiz on knowledge-based question answering system
• Development of simple chatbots
UNIT IV TEXT-TO-SPEECH SYNTHESIS 6
Overview. Text normalization. Letter-to-sound. Prosody, Evaluation. Signal processing - Concatenative
and parametric approaches, WaveNet and other deep learning-based TTS systems
Suggested Activities:
• Flipped classroom on Speech signal processing
• Exploring Text normalization
• Data collection
• Implementation of TTS systems
Suggested Evaluation Methods
• Assignment on the above topics
• Quiz on WaveNet, deep learning-based TTS systems
• Finding accuracy with different TTS systems
UNIT V AUTOMATIC SPEECH RECOGNITION 6
Speech recognition: Acoustic modelling – Feature Extraction – HMM, HMM-DNN systems
Suggested Activities:
• Flipped classroom on Speech recognition.
• Exploring Feature extraction
Suggested Evaluation Methods
• Assignment on the above topics
• Quiz on acoustic modelling
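Speech recognizers are commonly scored with word error rate (WER): the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below is one way to compute it, assuming whitespace-separated words and equal cost for substitutions, insertions and deletions; the function name word_error_rate is illustrative.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

In this example one word is substituted and one is dropped, so two errors are counted against six reference words. WER can exceed 1.0 when the hypothesis contains many inserted words.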
30 PERIODS
PRACTICAL EXERCISES 30 PERIODS
1. Create Regular expressions in Python for detecting word patterns and tokenizing text
2. Getting started with Python and NLTK - Searching Text, Counting Vocabulary, Frequency
Distribution, Collocations, Bigrams
3. Accessing Text Corpora using NLTK in Python
4. Write a function that finds the 50 most frequently occurring words of a text that are not stop words.
5. Implement the Word2Vec model
6. Use a transformer for implementing classification
7. Design a chatbot with a simple dialog system
8. Convert text to speech and find accuracy
9. Design a speech recognition system and find the error rate
TOTAL: 60 PERIODS
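Exercises 1 and 4 can be combined in a short sketch: a regular expression tokenizes the text, and a counter returns the most frequent words that are not stop words. This version uses only the Python standard library. The small STOP_WORDS set is a placeholder assumption, and in the lab it would normally be replaced by the NLTK stop-word list. The names tokenize and most_common_content_words are illustrative.

```python
import re
from collections import Counter

# Placeholder stop-word list (an assumption for this sketch only).
STOP_WORDS = {"a", "an", "and", "in", "is", "it", "of", "on", "the", "to"}

# A word is a run of letters, optionally followed by an apostrophe
# and more letters (so "didn't" stays one token).
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def tokenize(text):
    # Lowercase every regex match so counts are case-insensitive.
    return [word.lower() for word in WORD_RE.findall(text)]


def most_common_content_words(text, n=50, stop_words=STOP_WORDS):
    # Count tokens that are not stop words and return the n most frequent.
    counts = Counter(w for w in tokenize(text) if w not in stop_words)
    return counts.most_common(n)


sample = "The cat sat on the mat. The cat didn't see the dog, and the dog ran."
top = most_common_content_words(sample, n=3)
```

The result is a list of (word, count) pairs in descending order of frequency.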
COURSE OUTCOMES:
On completion of the course, the students will be able to
CO1: Explain existing and emerging deep learning architectures for text and speech processing
CO2: Apply deep learning techniques for NLP tasks, language modelling and machine translation
CO3: Explain coreference and coherence for text processing
CO4: Build question-answering systems, chatbots and dialogue systems
CO5: Apply deep learning models for building speech recognition and text-to-speech systems

TEXTBOOK
1. Daniel Jurafsky and James H. Martin, “Speech and Language Processing: An Introduction to Natural
Language Processing, Computational Linguistics, and Speech Recognition”, Third Edition, 2022.
REFERENCES:
1. Dipanjan Sarkar, “Text Analytics with Python: A Practical Real-World Approach to Gaining
Actionable Insights from Your Data”, Apress, 2018.
2. Tanveer Siddiqui, Tiwary U S, “Natural Language Processing and Information Retrieval”, Oxford
University Press, 2008.
3. Lawrence Rabiner, Biing-Hwang Juang, B. Yegnanarayana, “Fundamentals of Speech Recognition”,
1st Edition, Pearson, 2009.
4. Steven Bird, Ewan Klein, and Edward Loper, “Natural Language Processing with Python”, O’Reilly.
CO’s-PO’s & PSO’s MAPPING
CO     PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
1        3    2    3    1    3    -    -    -    1    2    1    2    1    1    1
2        3    1    2    1    3    -    -    -    2    2    1    3    3    2    1
3        2    2    1    3    1    -    -    -    3    3    1    2    3    3    1
4        2    1    1    1    2    -    -    -    2    1    2    2    3    1    1
5        1    3    2    2    1    -    -    -    3    2    1    1    2    3    1
Avg.   2.2  1.8  1.8  1.6    2    -    -    -  2.2    2  1.2    2  2.4    2    1
1 - low, 2 - medium, 3 - high, ‘-’ - no correlation
