ChatGPT-NLP Course Summary

The document outlines the syllabus for a Natural Language Processing (NLP) course at Savitribai Phule Pune University, covering six units over 42 hours. Topics include the introduction to NLP, language syntax and semantics, language modeling, information retrieval, NLP tools and techniques, and applications of NLP. Each unit features key concepts, case studies, and mapped course outcomes to ensure comprehensive learning.


NLP Course Summary

Course Contents

Unit I Introduction to Natural Language Processing 07 Hours

Introduction: Natural Language Processing, Why NLP is hard?, Programming Languages vs Natural Languages, Are natural languages regular?, Finite automata for NLP, Stages of NLP, Challenges and Issues (Open Problems) in NLP
Basics of text processing: Tokenization, Stemming, Lemmatization, Part-of-Speech Tagging
#Exemplar/Case Studies: Why English is not a regular language:
http://cs.haifa.ac.il/~shuly/teaching/08/nlp/complexity.pdf#page=20
*Mapping of Course Outcomes for Unit I: CO1

Unit II Language Syntax and Semantics 07 Hours

Morphological Analysis: What is Morphology? Types of Morphemes, Inflectional morphology & Derivational morphology, Morphological parsing with Finite State Transducers (FST)
Syntactic Analysis: Syntactic Representations of Natural Language, Parsing Algorithms, Probabilistic context-free grammars, and Statistical parsing
Semantic Analysis: Lexical Semantics, Relations among lexemes & their senses – Homonymy, Polysemy, Synonymy, Hyponymy, WordNet, Word Sense Disambiguation (WSD), Dictionary-based approach, Latent Semantic Analysis
#Exemplar/Case Studies: Study of Stanford Parser and POS Tagger
https://nlp.stanford.edu/software/lex-parser.html
https://nlp.stanford.edu/software/tagger.html
*Mapping of Course Outcomes for Unit II: CO2

Unit III Language Modelling 07 Hours

Probabilistic language modeling, Markov models, Generative models of language, Log-Linear Models, Graph-based Models
N-gram models: Simple n-gram models, Estimation of parameters and smoothing, Evaluating language models
Word Embeddings/Vector Semantics: Bag-of-words, TF-IDF, word2vec, doc2vec, Contextualized representations (BERT)
Topic Modelling: Latent Dirichlet Allocation (LDA), Latent Semantic Analysis, Non-Negative Matrix Factorization
#Exemplar/Case Studies: Study of language modelling for Indian languages.
*Mapping of Course Outcomes for Unit III: CO3

Unit IV Information Retrieval using NLP 07 Hours

Information Retrieval: Introduction, Vector Space Model
Named Entity Recognition: NER System Building Process, Evaluating NER System
Entity Extraction, Relation Extraction, Reference Resolution, Coreference Resolution, Cross-Lingual Information Retrieval
#Exemplar/Case Studies: Natural Language Processing based Information Extraction & Retrieval:
https://www.cdac.in/index.aspx?id=mc_cli_cross_lingual_info
*Mapping of Course Outcomes for Unit IV: CO4

Unit V NLP Tools and Techniques 07 Hours

Prominent NLP Libraries: Natural Language Toolkit (NLTK), spaCy, TextBlob, Gensim, etc.
Linguistic Resources: Lexical Knowledge Networks, WordNets, Indian Language WordNet (IndoWordNet), VerbNets, PropBank, Treebanks, Universal Dependency Treebanks
Word Sense Disambiguation: Lesk Algorithm, Walker's algorithm, WordNets for Word Sense Disambiguation
#Exemplar/Case Studies:
Hindi WordNet: https://www.cfilt.iitb.ac.in/wordnet/webhwn/
Sanskrit WordNet: https://www.cfilt.iitb.ac.in/wordnet/webswn/
Indic NLP Library: http://anoopkunchukuttan.github.io/indic_nlp_library/
*Mapping of Course Outcomes for Unit V: CO5

Unit VI Applications of NLP 07 Hours

Machine Translation: Rule-based techniques, Statistical Machine Translation (SMT), Cross-Lingual Translation
Sentiment Analysis, Question Answering, Text Entailment, Discourse Processing, Dialog and Conversational Agents, Natural Language Generation
#Exemplar/Case Studies: Study working of Google Translate; Study working of IBM Watson Natural Language Processing

oral qa

Here’s a cleaned and organized version of the course content for your Natural
Language Processing (NLP) syllabus from Savitribai Phule Pune University –
Final Year Computer Engineering. Each unit includes the main topics, suggested
exemplar/case studies, and mapped Course Outcomes (COs):

Unit I: Introduction to Natural Language Processing (07 Hours)

Topics Covered:

Definition and need for NLP

Why NLP is hard?

Programming Languages vs Natural Languages

Are Natural Languages Regular?

Finite Automata for NLP



Stages of NLP

Challenges and Open Issues in NLP

Basics of Text Processing:

Tokenization

Stemming

Lemmatization

Part-of-Speech (POS) Tagging

Exemplar/Case Study:

Why English is not a regular language:


Complexity of English Language (PDF)

Mapped Course Outcome:

CO1

Unit II: Language Syntax and Semantics (07 Hours)


Topics Covered:

Morphological Analysis:

Morphology, Morphemes

Inflectional vs Derivational Morphology

Finite State Transducers for Morphological Parsing

Syntactic Analysis:

Syntactic Representations

Parsing Algorithms



Probabilistic Context-Free Grammars

Statistical Parsing

Semantic Analysis:

Lexical Semantics

Word Relations: Homonymy, Polysemy, Synonymy, Hyponymy

WordNet

Word Sense Disambiguation (WSD)

Dictionary-based and Latent Semantic Analysis

Exemplar/Case Study:

Stanford Parser and POS Tagger:


Lex Parser
Tagger

Mapped Course Outcome:

CO2

Unit III: Language Modelling (07 Hours)


Topics Covered:

Probabilistic Language Modeling:

Markov Models

Generative Models

Log-Linear Models

Graph-based Models



N-gram Models:

Simple N-grams, Parameter Estimation

Smoothing Techniques

Evaluation of Language Models

Vector Semantics:

Bag-of-Words, TF-IDF

word2vec, doc2vec

Contextualized Embeddings (e.g., BERT)

Topic Modelling:

LDA, Latent Semantic Analysis

Non-Negative Matrix Factorization

Exemplar/Case Study:

Language Modeling for Indian Languages

Mapped Course Outcome:

CO3

Unit IV: Information Retrieval using NLP (07 Hours)


Topics Covered:

Information Retrieval Basics:

Introduction, Vector Space Model

Entity Processing:



Named Entity Recognition (NER)

Entity Extraction

Relation and Reference Resolution

Coreference Resolution

Cross-Lingual Information Retrieval

Exemplar/Case Study:

NLP-based Information Extraction and Retrieval:


CDAC Cross-Lingual IR

Mapped Course Outcome:

CO4

Unit V: NLP Tools and Techniques (07 Hours)


Topics Covered:

NLP Libraries:

NLTK, spaCy, TextBlob, Gensim

Linguistic Resources:

WordNets (English & Indian Languages)

IndoWordNet, VerbNets, PropBank

Treebanks, Universal Dependency Treebanks

Word Sense Disambiguation Techniques:

Lesk Algorithm

Walker’s Algorithm



WordNets for WSD

Exemplar/Case Study:

Hindi WordNet: WebHwn

Sanskrit WordNet: WebSwn

Indic NLP Library: Indic Library

Mapped Course Outcome:

CO5

Unit VI: Applications of NLP (07 Hours)


Topics Covered:

Machine Translation:

Rule-Based, Statistical Machine Translation

Cross-Lingual Translation

Sentiment Analysis

Question Answering Systems

Text Entailment

Discourse Processing

Dialog and Conversational Agents

Natural Language Generation

Exemplar/Case Study:



Working of Google Translate

Working of IBM Watson NLP for Oral QA

Let me know if you'd like a summarized PowerPoint presentation, PDF, or mind map version of this content.

give me oral QA

Here's an Oral Question-Answer (QA) set based on the Natural Language Processing (NLP) syllabus you've provided. These questions are suitable for viva, oral exams, or interview preparation. They are organized unit-wise, with a focus on conceptual clarity.

✅ Unit I – Introduction to NLP


Q1. What is Natural Language Processing?
A: NLP is a field of AI that enables computers to understand, interpret, and generate
human language.

Q2. Why is NLP considered difficult?


A: Because natural languages are ambiguous, unstructured, and have complex
grammar rules which vary by context and culture.

Q3. What are the stages of NLP?


A: Lexical analysis, syntactic analysis, semantic analysis, discourse integration, and
pragmatic analysis.



Q4. What is tokenization?
A: It is the process of splitting text into individual words or tokens.

Q5. How is stemming different from lemmatization?


A: Stemming chops off word ends; lemmatization returns the base form using
vocabulary and morphology.
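
To make the contrast concrete, here is a minimal Python sketch using NLTK (one of the libraries covered in Unit V); the sample sentence is invented, and it assumes the punkt and wordnet corpora have already been downloaded with nltk.download():

```python
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer

text = "The flies were flying around while the studies continued"
tokens = word_tokenize(text)            # tokenization: split text into word tokens

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

for tok in tokens:
    # Stemming applies crude suffix-stripping rules, while lemmatization looks
    # the word up in WordNet: "flies" -> "fli" (stem) but "fly" (lemma).
    print(tok, stemmer.stem(tok), lemmatizer.lemmatize(tok))
```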

✅ Unit II – Syntax and Semantics


Q6. What is morphology in NLP?
A: Study of word formation and structure using morphemes.

Q7. What are inflectional and derivational morphemes?


A: Inflectional morphemes change tense or number; derivational morphemes create
new words.

Q8. What is a context-free grammar (CFG)?


A: A formal grammar where every production rule is of the form A → β, with A as a
non-terminal.

Q9. What is Word Sense Disambiguation (WSD)?


A: The process of identifying the correct meaning of a word in context.

Q10. What are homonymy and polysemy?


A: Homonyms share the same form but have unrelated meanings; a polysemous word has several related senses.

✅ Unit III – Language Modelling


Q11. What is an N-gram model?
A: A probabilistic model that predicts the next word based on the previous N-1 words.



Q12. What is smoothing in N-gram models?
A: Techniques used to handle zero probabilities for unseen word combinations.
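
As a sketch of how add-one (Laplace) smoothing redistributes probability mass, here is a toy bigram model in plain Python; the two-sentence corpus and the probabilities it produces are purely illustrative:

```python
from collections import Counter

corpus = [["<s>", "i", "like", "nlp", "</s>"],
          ["<s>", "i", "like", "deep", "learning", "</s>"]]

unigrams = Counter(w for sent in corpus for w in sent)
bigrams = Counter((w1, w2) for sent in corpus for w1, w2 in zip(sent, sent[1:]))
V = len(unigrams)  # vocabulary size

def p_laplace(w2, w1):
    # P(w2 | w1) with add-one smoothing: (count(w1 w2) + 1) / (count(w1) + V)
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

print(p_laplace("like", "i"))    # seen bigram
print(p_laplace("nlp", "deep"))  # unseen bigram still gets non-zero probability
```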

Q13. What is TF-IDF?


A: Term Frequency-Inverse Document Frequency; it reflects the importance of a word
in a document.
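
A short illustrative sketch with scikit-learn's TfidfVectorizer (an assumed tooling choice, not something mandated by the syllabus); the three documents are made up:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["nlp makes machines read text",
        "machines learn from text data",
        "students read books"]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs)   # sparse matrix: documents x terms

# Terms that occur in fewer documents get a higher IDF and hence larger weights.
print(vectorizer.get_feature_names_out())
print(tfidf.toarray().round(2))
```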

Q14. What is word2vec?


A: A model that represents words in a dense vector space capturing semantic
meaning.
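
A minimal gensim sketch (gensim 4.x argument names assumed); the toy corpus is far too small to learn meaningful vectors and is only meant to show the API shape:

```python
from gensim.models import Word2Vec

sentences = [["the", "king", "rules", "the", "kingdom"],
             ["the", "queen", "rules", "the", "kingdom"],
             ["dogs", "bark", "at", "strangers"]]

# sg=1 selects the skip-gram variant; sg=0 would use CBOW.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

vec = model.wv["queen"]                        # 50-dimensional dense vector
print(model.wv.most_similar("king", topn=2))   # nearest neighbours in vector space
```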

Q15. What is topic modeling?


A: A technique to discover abstract topics within a collection of texts, e.g., using LDA.

✅ Unit IV – Information Retrieval


Q16. What is Information Retrieval (IR)?
A: Finding relevant documents or data from a large corpus based on a query.

Q17. What is Named Entity Recognition (NER)?


A: Identifying entities like person names, locations, dates, etc., in text.

Q18. What is coreference resolution?


A: Identifying when different words refer to the same entity in a text.

Q19. What is cross-lingual IR?


A: Retrieving information in one language based on a query in another.

Q20. How does relation extraction work?


A: It identifies semantic relationships between entities in a sentence.

✅ Unit V – NLP Tools and Techniques


Q21. What is NLTK?
A: A Python library for symbolic and statistical NLP.

Q22. What is spaCy used for?


A: Industrial-strength NLP tasks like tokenization, POS tagging, NER.
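
A short spaCy sketch showing those pipeline steps; it assumes the small English model has been installed with `python -m spacy download en_core_web_sm`, and the example sentence is invented:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is opening a new office in Pune next January.")

for token in doc:
    print(token.text, token.pos_)   # tokenization + POS tagging
for ent in doc.ents:
    print(ent.text, ent.label_)     # named entities, e.g. ORG, GPE, DATE
```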

Q23. What is WordNet?


A: A lexical database of English words grouped into synonyms and linked by semantic
relations.

Q24. What is the Lesk algorithm?


A: A WSD algorithm that uses dictionary definitions (overlaps) to determine sense.
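
NLTK ships a simplified Lesk implementation in nltk.wsd; the sketch below attempts to disambiguate "bank" in a river context (the sentence is made up, the WordNet corpus must be downloaded, and simplified Lesk can still pick a wrong sense on short contexts):

```python
from nltk.wsd import lesk
from nltk.tokenize import word_tokenize

sentence = "He sat on the bank of the river and watched the water"
sense = lesk(word_tokenize(sentence), "bank", pos="n")

# Prints the chosen WordNet synset and its gloss, or a note if nothing matched.
print(sense, "-", sense.definition() if sense else "no sense found")
```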

Q25. What is IndoWordNet?


A: A multilingual WordNet for Indian languages developed by IIT Bombay.

✅ Unit VI – Applications of NLP


Q26. What is Machine Translation?
A: Automatically translating text from one language to another.

Q27. What is Statistical Machine Translation (SMT)?


A: A translation technique based on probabilities derived from bilingual corpora.

Q28. What is Sentiment Analysis?


A: Detecting the sentiment (positive, negative, neutral) from text.

Q29. What are dialog systems?


A: AI systems that can converse with humans in natural language (e.g., chatbots).

Q30. What is text entailment?


A: Determining if one text logically follows from another.



Would you like a printable version of this QA sheet or a presentation format with
speaker notes for teaching/revision?

give me more unit wise

Here’s an extended set of Unit-wise Oral Q&A for Natural Language


Processing (NLP) based on your syllabus — ideal for oral exams, interviews, and
concept revision.

✅ Unit I – Introduction to NLP (Extended)


Q1. What makes natural languages ambiguous?
A: Words can have multiple meanings, sentence structures can vary, and context
affects interpretation.

Q2. What are the main components of an NLP pipeline?


A: Lexical analysis, syntactic analysis, semantic analysis, discourse integration,
pragmatic analysis.

Q3. How is a natural language different from a programming language?


A: Natural languages are informal, ambiguous, and evolve over time; programming
languages are formal and have strict syntax.

Q4. What is POS tagging?


A: It assigns parts of speech (noun, verb, adjective, etc.) to each word in a sentence.

Q5. What are finite automata used for in NLP?


A: Modeling regular grammars and recognizing regular languages, though natural
languages are not fully regular.



✅ Unit II – Language Syntax and Semantics (Extended)
Q6. What is derivational morphology?
A: It forms a new word by adding prefixes/suffixes (e.g., happy → happiness).

Q7. What is morphological parsing?


A: Breaking down words into morphemes to understand their structure and meaning.

Q8. What is parsing in NLP?


A: The process of analyzing sentence structure using a grammar.

Q9. What is the difference between lexical and compositional semantics?


A: Lexical semantics deals with word meanings; compositional semantics explains
how meanings combine in phrases/sentences.

Q10. What is WordNet and how is it useful in semantics?


A: A lexical database that links words with their meanings and relationships, used for
WSD and synonym detection.

✅ Unit III – Language Modelling (Extended)


Q11. What is a generative language model?
A: A model that can generate text by learning the joint probability of sequences of
words.

Q12. What is the role of smoothing in language modeling?


A: To handle zero-probability problems by redistributing some probability mass to
unseen events.

Q13. How does word2vec work?


A: It learns word embeddings using neural networks based on surrounding context
(CBOW or Skip-gram).



Q14. What is the difference between BERT and word2vec?
A: BERT provides contextualized embeddings; word2vec gives static embeddings.

Q15. What are the applications of topic modeling?


A: Document classification, content summarization, recommendation systems.

✅ Unit IV – Information Retrieval using NLP (Extended)


Q16. What is the Vector Space Model in IR?
A: A model that represents documents and queries as vectors to calculate similarity
(e.g., cosine similarity).
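
A small sketch of this idea: documents and a query are turned into TF-IDF vectors and ranked by cosine similarity (scikit-learn assumed as the tooling; the documents and query are invented):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["cricket match in mumbai", "stock market news", "weather in mumbai today"]
query = ["mumbai weather"]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(docs)     # one TF-IDF vector per document
query_vector = vectorizer.transform(query)       # query in the same vector space

scores = cosine_similarity(query_vector, doc_vectors)[0]
ranked = sorted(zip(scores, docs), reverse=True)  # highest similarity first
print(ranked)
```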

Q17. What is Entity Extraction?


A: Identifying key entities (people, organizations, locations) in text.

Q18. What is the difference between NER and coreference resolution?


A: NER identifies entities; coreference resolution links different mentions of the same
entity.

Q19. What are evaluation metrics for NER?


A: Precision, Recall, F1-score.

Q20. What is the challenge in Cross-Lingual Information Retrieval?


A: Matching queries in one language with documents in another language.

✅ Unit V – NLP Tools and Techniques (Extended)


Q21. What is Gensim used for?
A: Topic modeling and document similarity analysis using models like LDA and
word2vec.
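
A minimal gensim LDA sketch; the toy corpus, the number of topics, and the number of passes are arbitrary illustrative choices:

```python
from gensim import corpora
from gensim.models import LdaModel

texts = [["cricket", "match", "score", "team"],
         ["election", "vote", "party", "leader"],
         ["team", "player", "score", "win"]]

dictionary = corpora.Dictionary(texts)                # word <-> id mapping
bow_corpus = [dictionary.doc2bow(t) for t in texts]   # bag-of-words vectors

lda = LdaModel(bow_corpus, num_topics=2, id2word=dictionary, passes=10)
for topic_id, words in lda.print_topics():
    print(topic_id, words)                            # top words per latent topic
```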



Q22. What are Treebanks?
A: Annotated text corpora that show the syntactic structure of sentences.

Q23. What is Universal Dependency Treebank?


A: A multilingual treebank standard for syntactic annotation.

Q24. What is the purpose of VerbNet and PropBank?


A: To provide structured information about verb argument structures and semantics.

Q25. How does the Lesk algorithm perform WSD?


A: By finding the sense of a word whose dictionary definition overlaps most with its
context.

✅ Unit VI – Applications of NLP (Extended)


Q26. What are the types of Machine Translation systems?
A: Rule-based, Statistical (SMT), Neural Machine Translation (NMT).

Q27. What is text entailment in NLP?


A: Determining whether one sentence logically follows from another.

Q28. How do question answering systems work?


A: They use information retrieval, NLP, and sometimes reasoning to return accurate
answers to natural questions.

Q29. What is the goal of discourse processing?


A: To understand how sentences relate in context, coherence, and flow.

Q30. What are conversational agents?


A: AI systems like chatbots that interact with users through natural language
dialogue.



Would you like me to generate flashcards, a quiz, or a PDF handout of this oral QA
for easy review?

oral QA unit wise deeply

Here is a comprehensive, deep oral Q&A set for each unit of your NLP syllabus, designed for viva voce, oral exams, or detailed concept interviews. Each question goes beyond definitions to probe understanding and reasoning.

✅ Unit I: Introduction to NLP (Deep Oral QA)


Q1. What makes natural language processing fundamentally different from traditional
programming problems?
A: Natural languages are inherently ambiguous, unstructured, and full of exceptions.
Unlike programming languages that follow strict grammar rules, natural languages
vary by context, region, and user intent, making interpretation by machines extremely
challenging.

Q2. Explain why natural languages are not regular languages.


A: Regular languages can be represented by finite automata and are limited in
expressiveness. Natural languages, however, require memory (e.g., nested structures
in grammar like "The cat that the dog chased ran away"), which can't be captured by
regular expressions or finite automata — they often require context-free or context-
sensitive grammars.

Q3. What are the primary stages of an NLP pipeline?


A:



1. Text preprocessing (tokenization, normalization)

2. Morphological analysis

3. Syntactic analysis (parsing)

4. Semantic analysis

5. Discourse integration

6. Pragmatic analysis

Q4. What challenges are still open problems in NLP?


A:

Word Sense Disambiguation

Handling sarcasm and humor

Cross-lingual NLP

Context understanding and reasoning

Bias and fairness in language models

Q5. Differentiate between stemming and lemmatization with examples.


A: Stemming cuts word suffixes without understanding grammar (e.g., playing → play,
flies → fli). Lemmatization uses vocabulary and morphological analysis to return the
dictionary form (flies → fly).

✅ Unit II: Language Syntax and Semantics (Deep Oral QA)

Q6. How does morphological analysis help in NLP?
A: It breaks down words into morphemes, aiding in understanding meaning and grammatical function, which is crucial for POS tagging, parsing, and translation.

Q7. Explain the role of finite-state transducers in morphological parsing.


A: FSTs model the relationship between surface forms and lexical forms by applying
rules that map input (inflected word) to output (root + features), making them
efficient for real-time parsing.

Q8. What is the significance of probabilistic parsing?


A: It uses probability to choose the most likely syntactic structure among many
possibilities, especially useful in ambiguous sentences (e.g., I saw the man with the
telescope).

Q9. Compare Homonymy and Polysemy with examples.


A: Homonymy: different meanings, same word (e.g., bank of river vs bank account).
Polysemy: related meanings (e.g., paper as material and academic publication).

Q10. What is Word Sense Disambiguation, and why is it hard?


A: It’s the process of identifying the correct sense of a word in context. It’s difficult
due to multiple valid meanings, subtle context shifts, and insufficient training data for
rare senses.

✅ Unit III: Language Modelling (Deep Oral QA)


Q11. What is the purpose of a language model in NLP?
A: To estimate the probability of sequences of words, allowing machines to predict the
next word, correct spelling errors, and understand natural language context.

Q12. Explain the problem of data sparsity in N-gram models and how it’s solved.
A: Data sparsity arises when many word sequences are never seen in training data.
It’s handled using smoothing techniques like Laplace, Good-Turing, and backoff
models.

Q13. How do word embeddings improve NLP tasks over traditional methods like BoW?
A: Word embeddings (e.g., word2vec) capture semantic similarity in vector space, unlike BoW, which ignores context and word order. For instance, king - man + woman ≈ queen in word2vec.

Q14. How does BERT differ from traditional embeddings?


A: BERT provides contextual embeddings where the same word can have different
vectors based on sentence context (e.g., bank in river vs finance), while traditional
models like word2vec provide static vectors.

Q15. Describe Latent Dirichlet Allocation (LDA) in topic modeling.


A: LDA is a generative probabilistic model that assumes documents are mixtures of
topics, and topics are distributions over words. It uses Bayesian inference to uncover
hidden thematic structures.

✅ Unit IV: Information Retrieval Using NLP (Deep Oral QA)

Q16. What are the limitations of Boolean retrieval in IR and how does the Vector
Space Model overcome them?
A: Boolean retrieval is rigid (AND/OR/NOT), returning exact matches. Vector Space
Model allows ranking documents based on similarity scores (like cosine similarity),
enabling partial and relevance-based matching.

Q17. How is NER useful in real-world applications?


A: NER identifies key entities (names, dates, locations), aiding in summarization,
information extraction, knowledge graph construction, and question answering.

Q18. What is relation extraction and how is it performed?


A: It identifies semantic relationships between entities (e.g., Barack Obama – born in –
Hawaii). It can be rule-based, supervised ML-based, or use pre-trained models.

Q19. How does coreference resolution enhance text understanding?


A: It links pronouns and noun phrases to the entities they refer to, helping maintain context across sentences (e.g., Mary loves coffee. She drinks it daily. – She = Mary, it = coffee).

Q20. Explain the concept and need for Cross-Lingual Information Retrieval.
A: CLIR enables users to query in one language and retrieve documents in another. It
is essential for multilingual societies and global access to information, often using
machine translation or bilingual dictionaries.

✅ Unit V: NLP Tools and Techniques (Deep Oral QA)


Q21. What are the advantages of using spaCy over NLTK?
A: spaCy is faster, designed for production use, and supports deep learning. NLTK is
better for teaching and prototyping due to its comprehensive documentation and
datasets.

Q22. What are Treebanks and how are they used in NLP?
A: Treebanks are corpora annotated with syntactic or semantic sentence structure.
They’re used to train parsers and evaluate syntactic analysis algorithms.

Q23. How does WordNet assist in Word Sense Disambiguation?


A: WordNet provides senses, definitions, synonyms, and relations. Algorithms like Lesk
use this information to compare context with sense definitions.

Q24. What is the Lesk algorithm and what is its limitation?


A: It disambiguates word senses by overlapping dictionary definitions with context.
Limitation: highly dependent on the quality and completeness of definitions.

Q25. What role do linguistic resources like IndoWordNet play for Indian languages?
A: IndoWordNet provides structured semantic relations for Indian languages, enabling
translation, WSD, and cross-lingual NLP applications.



✅ Unit VI: Applications of NLP (Deep Oral QA)
Q26. How does rule-based machine translation differ from statistical MT?
A: Rule-based uses grammatical rules and dictionaries; statistical MT learns from
aligned bilingual corpora using probability. SMT handles ambiguity better but requires
large data.

Q27. What are the steps involved in Sentiment Analysis?


A: Preprocessing → Tokenization → POS tagging → Feature extraction → Classification
(positive, negative, neutral).
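
As a rough illustration of the final classification step, here is a TextBlob sketch (TextBlob is listed in Unit V); its polarity score is a simple lexicon-based heuristic, and the two reviews are invented:

```python
from textblob import TextBlob

reviews = ["The service was excellent and the staff were friendly.",
           "Terrible experience, the product broke in a day."]

for text in reviews:
    # Polarity ranges from -1 (negative) to +1 (positive).
    polarity = TextBlob(text).sentiment.polarity
    label = "positive" if polarity > 0 else "negative" if polarity < 0 else "neutral"
    print(label, round(polarity, 2), "-", text)
```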

Q28. How do QA systems differ from traditional search engines?


A: QA systems aim to provide direct answers to questions using NLP and IR
techniques, while search engines return a list of relevant documents.

Q29. What challenges exist in building dialog systems?


A: Handling context, understanding multi-turn conversations, managing user intent,
and generating natural responses.

Q30. How does natural language generation work in applications like chatbots or
summaries?
A: NLG converts structured data into coherent text using rule-based, template-based,
or neural network approaches like GPT or T5.

Would you like these in PDF, presentation, or printable flashcard format?

Course Contents

Unit I Introduction to Decision Support Systems and Business Intelligence 07 Hours

Decision support systems: Definition of system, representation of the decision-making process, evolution of information systems, Decision Support System, development of a decision support system, the four stages of Simon's decision-making process, and common strategies and approaches of decision makers
Business Intelligence: BI, its components & architecture, previewing the future of BI, crafting a better experience for all business users, end-user assumptions, setting up data for BI, data, information and knowledge, the role of mathematical models, Business Intelligence architectures, ethics and business intelligence
#Exemplar/Case Studies: Decision support system in business intelligence:
https://www.riverlogic.com/blog/five-decision-support-system-examples
*Mapping of Course Outcomes for Unit I: CO1

Unit II The Architecture of DW and BI 07 Hours

BI and DW architectures and their types - Relation between BI and DW - OLAP (Online Analytical Processing) definitions - Different OLAP architectures - Data models - Tools in Business Intelligence - Role of DSS, EIS, MIS and digital dashboards - Need for Business Intelligence
Difference between OLAP and OLTP - Dimensional analysis - What are cubes? Drill-down and roll-up - Slice and dice or rotation - OLAP models - ROLAP versus MOLAP - Defining schemas: stars, snowflakes and fact constellations.
#Exemplar/Case Studies: A case study on the retail industry:
https://www.diva-portal.org/smash/get/diva2:831050/FULLTEXT01.pdf
*Mapping of Course Outcomes for Unit II: CO2

Unit III Reporting Authoring 07 Hours

Building reports with relational vs multidimensional data models; Types of reports – list, crosstabs, statistics, chart, map, financial, etc.; Data grouping & sorting, filtering reports, adding calculations to reports, conditional formatting, adding summary lines to reports; Drill-up, drill-down, drill-through capabilities; Run or schedule reports, different output forms – PDF, Excel, CSV, XML, etc.
#Exemplar/Case Studies: Power BI Case Study – How the tool reduced hassles of Heathrow & Edsby:
https://data-flair.training/blogs/power-bi-case-study/
*Mapping of Course Outcomes for Unit III: CO3



Unit IV Data Preparation 07 Hours

Data validation: Incomplete data, data affected by noise. Data transformation: Standardization, feature extraction. Data reduction: Sampling, feature selection, principal component analysis, data discretization. Data exploration:
1. Univariate analysis: Graphical analysis of categorical attributes, graphical analysis of numerical attributes, measures of central tendency for numerical attributes, measures of dispersion for numerical attributes, identification of outliers for numerical attributes
2. Bivariate analysis: Graphical analysis, measures of correlation for numerical attributes, contingency tables for categorical attributes
3. Multivariate analysis: Graphical analysis, measures of correlation for numerical attributes
#Exemplar/Case Studies: Case study on the data preparation phase of a BI system:
https://blog.panoply.io/load-and-transform-how-to-prepare-your-data-for-business-intelligence
*Mapping of Course Outcomes for Unit IV: CO4

Unit V Impact of Machine Learning in the Business Intelligence Process 07 Hours

Classification: Classification problems, evaluation of classification models, Bayesian methods, logistic regression. Clustering: Clustering methods, partition methods, hierarchical methods, evaluation of clustering models. Association Rules: Structure of association rules, Apriori algorithm
#Exemplar/Case Studies: Business applications for comparing the performance of a stock over a period of time:
https://cleartax.in/s/stock-market-analysis
*Mapping of Course Outcomes for Unit V: CO5

Unit VI BI Applications 07 Hours

Tools for Business Intelligence, role of analytical tools in BI, case study of analytical tools: WEKA, KNIME, RapidMiner, R;
Data analytics, business analytics, ERP and Business Intelligence, BI and operations management, BI in inventory management systems, BI and human resource management, BI applications in CRM, BI applications in marketing, BI applications in logistics and production, role of BI in finance, BI applications in banking, BI applications in telecommunications, BI in salesforce management
#Exemplar/Case Studies: Logistics planning in the food industry:
https://www.foodlogistics.com/case-studies
https://www.barrettdistribution.com/food-distribution-case-study

deep oral QA unit wise

Here is a deep, unit-wise oral Q&A guide for your subject "Decision Support
Systems and Business Intelligence" based on the syllabus of Savitribai Phule
Pune University. Each unit includes in-depth, application-focused, and conceptual
questions ideal for oral exams, vivas, or interviews.



✅ Unit I: Introduction to DSS and BI – Deep Oral QA

Q1. What is a Decision Support System (DSS)? How does it differ from MIS?
A: A DSS supports complex decision-making and problem-solving. Unlike MIS, which
focuses on routine operations and reports, DSS is analytical and interactive, helping
managers make strategic decisions.

Q2. Explain Simon’s decision-making model with real-life business examples.


A: Simon proposed four stages:

1. Intelligence (identify problem),

2. Design (generate alternatives),

3. Choice (select best),

4. Implementation.
Example: A retail chain deciding on expansion uses DSS to analyze sales data
(intelligence), compare locations (design), choose the best city (choice), and start
operations (implementation).

Q3. What are the key components of Business Intelligence architecture?


A:

Data sources

ETL (Extract, Transform, Load)

Data warehouse

OLAP engines

Reporting tools

Dashboards/Visualization



Users

Q4. How are data, information, and knowledge different?


A:

Data: Raw facts (e.g., sales numbers).

Information: Processed data with meaning (e.g., total monthly sales).

Knowledge: Insights derived from information (e.g., customers prefer Product A).

Q5. What ethical issues arise in BI?


A:

Data privacy and surveillance

Biased algorithms

Misuse of personal data

Transparency in decision-making

✅ Unit II: Architecture of DW and BI – Deep Oral QA

Q6. Differentiate between OLTP and OLAP with business scenarios.
A:

OLTP: Used in day-to-day operations (e.g., ATM transactions).

OLAP: Used for analysis and decision-making (e.g., analyzing monthly banking
trends).

Q7. Explain star and snowflake schemas with diagrams.


A:



Star schema: Central fact table linked to dimension tables. Simple but
redundant.

Snowflake schema: Normalized dimension tables. More complex but less redundant.

Q8. What is the importance of drill-down and slice-and-dice in OLAP?


A:

Drill-down: View data at a more detailed level.

Slice-and-dice: Analyze data across different dimensions (e.g., by region or product).

Q9. What is a cube in data warehousing?


A: A multidimensional dataset allowing analysis from multiple perspectives like time,
geography, and product.

Q10. Explain ROLAP vs MOLAP. Which is better for large datasets?


A:

ROLAP: Works with relational databases; scalable for large data.

MOLAP: Pre-computed cubes; faster but limited in volume.

✅ Unit III: Reporting & Authoring – Deep Oral QA


Q11. What is the difference between list reports and crosstab reports?
A:

List report: Flat, linear data presentation.

Crosstab: Pivot-style, showing aggregation across two dimensions.



Q12. Why is conditional formatting important in reports?
A: It highlights key values (e.g., low stock, high revenue) for quick insights.

Q13. Explain drill-through capability with an example.


A: Clicking on summarized data (like total sales) to view detailed records (e.g.,
customer-wise sales breakdown).

Q14. How do different output formats (CSV, XML, Excel) serve various stakeholders?
A:

CSV: Lightweight, for developers.

Excel: Interactive, used by analysts.

XML: Structured, used in system integrations.

Q15. What factors must be considered when scheduling reports?


A:

Data refresh frequency

Audience need

Resource availability

Security/access controls

✅ Unit IV: Data Preparation – Deep Oral QA


Q16. Why is data validation crucial before BI analysis?
A: Dirty or incomplete data can lead to false insights and poor decision-making.

Q17. Differentiate between feature extraction and feature selection.


A:



Feature extraction: Derives new features (e.g., PCA).

Feature selection: Chooses the most relevant existing features.

Q18. How does principal component analysis (PCA) reduce dimensionality?


A: PCA converts correlated features into fewer uncorrelated components while
retaining maximum variance.
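
A small scikit-learn sketch of that idea; the feature values are invented, and the data are standardized first because PCA is sensitive to feature scale:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X = np.array([[200, 20, 5, 1.0],
              [220, 22, 6, 1.1],
              [300, 31, 9, 2.0],
              [310, 30, 8, 2.1],
              [150, 16, 4, 0.8]])

X_scaled = StandardScaler().fit_transform(X)   # put all features on the same scale
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)                 # (5, 2): fewer features, same rows
print(pca.explained_variance_ratio_)   # share of variance kept per component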

Q19. What is the role of univariate and bivariate analysis in data exploration?
A:

Univariate: Analyzing single attributes (e.g., age distribution).

Bivariate: Understanding relationships (e.g., sales vs discount).

Q20. How do you detect outliers in numerical data?


A:

Z-score

Boxplot (IQR method)

Visualizations (scatter plots)
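
As a quick illustration of the boxplot/IQR rule from the list above, a short NumPy sketch with invented sales figures:

```python
import numpy as np

sales = np.array([120, 130, 125, 128, 135, 127, 400])   # 400 looks suspicious

q1, q3 = np.percentile(sales, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # standard 1.5*IQR fences

outliers = sales[(sales < lower) | (sales > upper)]
print("bounds:", lower, upper)
print("outliers:", outliers)   # flags the 400 value
```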

✅ Unit V: Machine Learning in BI – Deep Oral QA


Q21. What is the use of logistic regression in business intelligence?
A: Predict binary outcomes (e.g., will a customer buy or not). Useful in churn
prediction, lead scoring.
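
A hedged churn-style sketch with scikit-learn; the tenure/spend features and labels are invented, and a real model would need proper train/test evaluation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns: tenure in months, monthly spend; label 1 = churned, 0 = retained.
X = np.array([[1, 90], [2, 80], [3, 85], [10, 40], [12, 35], [15, 30]])
y = np.array([1, 1, 1, 0, 0, 0])

model = LogisticRegression().fit(X, y)

new_customer = np.array([[2, 88]])
print(model.predict(new_customer))         # predicted class (churn or not)
print(model.predict_proba(new_customer))   # class probabilities for scoring
```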

Q22. Compare hierarchical and partition clustering methods.


A:

Partition (e.g., K-means): Divides into k clusters based on centroids.



Hierarchical: Builds tree of clusters (dendrogram) – top-down or bottom-up.
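
A small K-means sketch of the partition approach above, grouping customers by annual income and spending score; the numbers and the choice of k=2 are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

customers = np.array([[15, 80], [16, 75], [17, 85],    # low income, high spend
                      [70, 20], [75, 15], [80, 25]])   # high income, low spend

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)           # cluster index assigned to each customer
print(kmeans.cluster_centers_)  # centroid of each cluster
```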

Q23. How is the performance of classification models evaluated?


A:

Accuracy

Precision, Recall

F1 Score

ROC Curve

Q24. Explain the Apriori algorithm with a business use-case.


A: Apriori finds frequent itemsets (e.g., {milk, bread} often bought together) and
generates association rules for market basket analysis.
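
A toy plain-Python sketch of the support-counting step at the heart of Apriori (only frequent pairs, not the full level-wise candidate generation); the baskets and support threshold are invented:

```python
from itertools import combinations
from collections import Counter

baskets = [{"milk", "bread", "butter"},
           {"milk", "bread"},
           {"bread", "butter"},
           {"milk", "bread", "eggs"}]
min_support = 0.5   # an itemset must appear in at least 50% of baskets

pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

frequent_pairs = {pair: count / len(baskets)
                  for pair, count in pair_counts.items()
                  if count / len(baskets) >= min_support}
print(frequent_pairs)   # e.g. ('bread', 'milk') with support 0.75
```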

Q25. What is overfitting in ML, and how can it affect BI decisions?


A: Overfitting models perform well on training data but poorly on real-world data,
leading to unreliable decisions.

✅ Unit VI: BI Applications – Deep Oral QA


Q26. How do ERP systems integrate with BI platforms?
A: ERPs manage transactions (HR, Finance), and BI tools analyze ERP data for trends,
performance metrics, and forecasts.

Q27. Discuss the role of BI in supply chain and logistics.


A: BI helps in demand forecasting, route optimization, inventory control, reducing
delivery times, and cost savings.

Q28. What is the role of tools like RapidMiner and KNIME in BI?
A: They are drag-and-drop platforms for data mining, predictive analytics, and model
deployment without much coding.



Q29. How does BI support CRM systems?
A: By analyzing customer behavior, preferences, and feedback to improve sales,
service, and retention.

Q30. Give real-life examples of BI in financial services.


A: Fraud detection, credit scoring, risk modeling, portfolio analysis, and regulatory
compliance.

Would you like this content as a PDF for printing, a PowerPoint presentation, or
quiz flashcards for revision?
