
Module 5: Natural Language Processing

Word Clouds, n-Gram Language Models, Grammars, An Aside: Gibbs Sampling, Topic Modeling,
Word Vectors, Recurrent Neural Networks, Example: Using a Character-Level RNN, Network
Analysis, Betweenness Centrality, Eigenvector Centrality, Directed Graphs and PageRank,
Recommender Systems, Manual Curation, Recommending What’s Popular, User-Based
Collaborative Filtering, Item-Based Collaborative Filtering, Matrix Factorization.
Textbook: Chapters 21, 22, and 23
Natural Language Processing
• Natural language processing (NLP) refers to computational techniques involving
language.
• Natural Language Processing (NLP) is a field of artificial intelligence (AI) focused
on enabling computers to understand, interpret, and generate human language in a
way that is meaningful and useful.
• NLP combines linguistics with computer science and machine learning to process
and analyze large amounts of natural language data, such as text or speech.
Word Clouds
• One approach to visualizing words and counts is word clouds, which artistically depict
the words at sizes proportional to their counts.
• Generally, though, data scientists don’t think much of word clouds, in large part
  because the placement of the words doesn’t mean anything other than “here’s some
  space where I was able to fit a word.”
• If you are ever forced to create a word cloud, think about whether you can make the
  axes convey something.
• For example, imagine that, for each of some collection of data science–related
buzzwords, you have two numbers between 0 and 100—the first representing how
frequently it appears in job postings, and the second how frequently it appears on
résumés:
data = [ ("big data", 100, 15), ("Hadoop", 95, 25), ("Python", 75, 50),
("R", 50, 40), ("machine learning", 80, 20), ("statistics", 20, 60),
("data science", 60, 70), ("analytics", 90, 3),
("team player", 85, 85), ("dynamic", 2, 90), ("synergies", 70, 0),
("actionable insights", 40, 30), ("think out of the box", 45, 10),
("self-starter", 30, 50), ("customer focus", 65, 15),
("thought leadership", 35, 35)]
The word cloud approach is just to arrange the words on a page in a cool-looking font (Figure 1).
A more interesting approach is to scatter the words so that horizontal position indicates
job-posting popularity and vertical position indicates résumé popularity:

from matplotlib import pyplot as plt

def text_size(total: int) -> float:
    """equals 8 if total is 0, 28 if total is 200"""
    return 8 + total / 200 * 20

for word, job_popularity, resume_popularity in data:
    plt.text(job_popularity, resume_popularity, word,
             ha='center', va='center',
             size=text_size(job_popularity + resume_popularity))

plt.xlabel("Popularity on Job Postings")
plt.ylabel("Popularity on Resumes")
plt.axis([0, 100, 0, 100])
plt.xticks([])
plt.yticks([])
plt.show()
n-Gram Language Models
• An n-gram language model is a statistical model used in natural language processing to
  predict the probability of a word based on the previous (n-1) words.
• These models are crucial in various NLP tasks such as text generation, speech
  recognition, machine translation, and more.
• The model operates by estimating the probability of a word given the preceding words,
  typically considering a fixed number (n-1) of previous words to make this prediction.
• An n-gram is a contiguous sequence of n items from a given sample of text or speech.
• n-grams can be classified as:
 Unigram – a single word, e.g., “I”
 Bigram – a sequence of two words, e.g., “I am”
 Trigram – a sequence of three words, e.g., “I am happy”
 N-gram – a sequence of n words
Structure of n-gram models
 Probability Estimation:
• The primary goal of an n-gram model is to estimate the probability of a word given its
  preceding words.
• For an n-gram model, the probability of the next word w_i given the previous (n-1)
  words w_{i-n+1}, …, w_{i-1} is
        P(w_i | w_{i-n+1}, …, w_{i-1})
 Markov Assumption
• The model is based on the Markov assumption, which simplifies the problem by
  assuming that the probability of a word depends only on a fixed number of
  preceding words, not on the entire sentence.
• This is often expressed as:
        P(w_i | w_1, w_2, …, w_{i-1}) ≈ P(w_i | w_{i-n+1}, …, w_{i-1})
 Chain rule
• Using the chain rule, the joint probability of a sequence of words can be decomposed as
        P(w_1, w_2, …, w_N) = P(w_1) · P(w_2 | w_1) · P(w_3 | w_1, w_2) ⋯ P(w_N | w_1, w_2, …, w_{N-1})
• For an n-gram model this is approximated as
        P(w_1, w_2, …, w_N) ≈ P(w_1) · P(w_2 | w_1) ⋯ P(w_N | w_{N-n+1}, …, w_{N-1})
 Training an n-gram model
• Training involves counting the occurrences of n-grams in a text corpus and using these
  counts to estimate probabilities.
• For example, in a bigram model, the probability of a word w_i following a word w_{i-1}
  is estimated as:
        P(w_i | w_{i-1}) = Count(w_{i-1}, w_i) / Count(w_{i-1})
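As a concrete sketch of this counting approach (the toy corpus, token list, and helper names
below are this example's own, not from the textbook):

import random
from collections import defaultdict

corpus = "I am happy . I am learning data science . data science is fun .".split()

# Count bigrams: transitions[w] maps each word to the list of words that follow it.
transitions = defaultdict(list)
for prev, word in zip(corpus, corpus[1:]):
    transitions[prev].append(word)

def bigram_probability(prev: str, word: str) -> float:
    """P(word | prev) = Count(prev, word) / Count(prev)"""
    followers = transitions[prev]
    return followers.count(word) / len(followers) if followers else 0.0

def generate(start: str = "I", length: int = 8) -> str:
    """Generate text by repeatedly sampling the next word from the bigram counts."""
    words = [start]
    for _ in range(length):
        followers = transitions[words[-1]]
        if not followers:
            break
        words.append(random.choice(followers))
    return " ".join(words)

print(bigram_probability("data", "science"))   # 1.0 in this toy corpus
print(generate())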
Grammars
• A different approach to modeling language is with grammars, rules for generating
acceptable sentences.
• For example, a sentence necessarily consists of a noun followed by a verb. If you then
have a list of nouns and verbs, you can generate sentences according to the rule.

from typing import List, Dict

# Type alias to refer to grammars later
Grammar = Dict[str, List[str]]

grammar = {
    "_S"  : ["_NP _VP"],
    "_NP" : ["_N",
             "_A _NP _P _A _N"],
    "_VP" : ["_V",
             "_V _NP"],
    "_N"  : ["data science", "Python", "regression"],
    "_A"  : ["big", "linear", "logistic"],
    "_P"  : ["about", "near"],
    "_V"  : ["learns", "trains", "tests", "is"]
}
• We use the common convention that names starting with underscores refer to rules that need
  further expanding, and that other names are terminals that don’t need further processing.
• So, for example, "_S" is the “sentence” rule, which produces an "_NP" (“noun phrase”)
rule followed by a "_VP" (“verb phrase”) rule.
• Notice that the "_NP" rule contains itself in one of its productions. Grammars can be
recursive, which allows even finite grammars like this to generate infinitely many
different sentences.
• To generate sentences from this grammar, we start with a list containing the sentence
rule ["_S"]. And then we’ll repeatedly expand each rule by replacing it with a randomly
chosen one of its productions.
• We stop when we have a list consisting solely of terminals.
For example, one such progression might look like:
['_S']
['_NP','_VP']
['_N','_VP']
['Python','_VP']
['Python','_V','_NP']
['Python','trains','_NP']
['Python','trains','_A','_NP','_P','_A','_N']
['Python','trains','logistic','_NP','_P','_A','_N']
['Python','trains','logistic','_N','_P','_A','_N']
['Python','trains','logistic','data science','_P','_A','_N']
['Python','trains','logistic','data science','about','_A', '_N']
['Python','trains','logistic','data science','about','logistic','_N']
['Python','trains','logistic','data science','about','logistic','Python']
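Below is a minimal sketch of that expansion procedure in Python, assuming the Grammar alias
and grammar dictionary defined earlier (the helper names is_terminal, expand, and
generate_sentence are this sketch's own):

import random
from typing import List

def is_terminal(token: str) -> bool:
    return token[0] != "_"

def expand(grammar: Grammar, tokens: List[str]) -> List[str]:
    for i, token in enumerate(tokens):
        if is_terminal(token):
            continue
        # Replace the first non-terminal with a randomly chosen production.
        replacement = random.choice(grammar[token])
        if is_terminal(replacement):
            tokens[i] = replacement
        else:
            tokens = tokens[:i] + replacement.split() + tokens[i+1:]
        return expand(grammar, tokens)
    # Every token is a terminal, so we're done.
    return tokens

def generate_sentence(grammar: Grammar) -> List[str]:
    return expand(grammar, ["_S"])

print(" ".join(generate_sentence(grammar)))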
An Aside: Gibbs Sampling
• Generating samples from some distributions is easy.
• We can get uniform random variables with random.random(), and normal random variables
  with inverse_normal_cdf(random.random()).
• But some distributions are harder to sample from.
• Gibbs sampling is a technique for generating samples from multidimensional distributions
when we only know some of the conditional distributions.
• For example, imagine rolling two dice. Let x be the value of the first die and y be the sum of
the dice, and imagine you wanted to generate lots of (x, y) pairs. In this case it’s easy to
generate the samples directly:
from typing import Tuple
import random

def roll_a_die() -> int:
    """Returns a random integer between 1 and 6."""
    return random.choice([1, 2, 3, 4, 5, 6])

def direct_sample() -> Tuple[int, int]:
    d1 = roll_a_die()
    d2 = roll_a_die()
    return d1, d1 + d2

result = direct_sample()
print(result)   # might output something like (3, 8), where 3 is d1 and 8 is d1 + d2
• The distribution of y conditional on x is easy—if you know the value of x, y is equally
  likely to be x + 1, x + 2, x + 3, x + 4, x + 5, or x + 6:

def random_y_given_x(x: int) -> int:
    """Returns an integer equally likely to be x + 1, x + 2, ..., x + 6"""
    return x + roll_a_die()

result = random_y_given_x(10)
print(result)   # might output any value between 11 and 16, each equally likely
• The way Gibbs sampling works is that we start with any (valid) values for x and y and then
repeatedly alternate replacing x with a random value picked conditional on y and replacing
y with a random value picked conditional on x.
• After a number of iterations, the resulting values of x and y will represent a sample from
the unconditional joint distribution:
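A minimal sketch of that alternation for the two-dice example, reusing roll_a_die and
random_y_given_x from above (random_x_given_y and gibbs_sample are this sketch's names):

def random_x_given_y(y: int) -> int:
    """Given the total y, return an equally likely value for the first die."""
    if y <= 7:
        # total of 7 or less: the first die is equally likely to be 1, ..., y - 1
        return random.randrange(1, y)
    else:
        # total of 8 or more: the first die is equally likely to be y - 6, ..., 6
        return random.randrange(y - 6, 7)

def gibbs_sample(num_iters: int = 100) -> Tuple[int, int]:
    x, y = 1, 2   # any valid starting values work
    for _ in range(num_iters):
        x = random_x_given_y(y)
        y = random_y_given_x(x)
    return x, y

print(gibbs_sample())   # approximately a draw from the joint distribution of (x, y)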
Topic Modeling
• Topic modeling is an unsupervised machine learning technique used to automatically
identify topics within a large collection of text documents.
• By grouping words into clusters or "topics," topic modeling helps uncover the hidden
thematic structure in the data, allowing for a high-level view of what the documents are
about.
• Topic modeling provides insights into large text corpora by extracting structured,
interpretable themes that reveal latent information, making it widely valuable in areas
like research, marketing, content recommendation, and social media analysis.
• Some of the techniques used in topic modelling are Latent Dirichlet Allocation (LDA),
Non-Negative Matrix Factorization (NMF), Latent Semantic Analysis (LSA) and
BERTopic.
• Latent Dirichlet Allocation (LDA) is commonly used to identify common topics in a set
of documents.
• LDA has some similarities to the Naive Bayes classifier, in that it assumes a
probabilistic model for documents.
The model assumes the following
• There is some fixed number K of topics.
• There is a random variable that assigns each topic an associated probability distribution
over words. You should think of this distribution as the probability of seeing word w
given topic k.
• There is another random variable that assigns each document a probability distribution
over topics. You should think of this distribution as the mixture of topics in document d.
• Each word in a document was generated by first randomly picking a topic (from the
document’s distribution of topics) and then randomly picking a word (from the topic’s
distribution of words).
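For illustration only, here is a minimal sketch using scikit-learn's LatentDirichletAllocation
on a made-up toy corpus (an assumption: scikit-learn is available; the textbook instead builds
an LDA Gibbs sampler from scratch):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [
    "Hadoop Big Data HBase Java Spark",
    "Python scikit-learn scipy numpy pandas",
    "statistics probability regression R",
    "machine learning neural networks deep learning",
]

# Step 1: turn the documents into word-count vectors.
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(documents)

# Step 2: fit an LDA model with a fixed number K of topics.
K = 2
lda = LatentDirichletAllocation(n_components=K, random_state=0)
doc_topics = lda.fit_transform(counts)   # per-document topic mixtures

# Step 3: inspect each topic's most probable words.
words = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = topic.argsort()[-3:][::-1]
    print(f"topic {k}:", [words[i] for i in top])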
Word vectors
• One important innovation involves representing words as low-dimensional vectors.
These vectors can be compared, added together, fed into machine learning models,
or anything else you want to do with them.
• Word vectors, or word embeddings, are a way to represent words in a continuous
vector space, typically used in natural language processing (NLP) to capture
semantic and syntactic relationships between words.
• Instead of treating words as discrete entities, word vectors encode them in a form
that allows similarity in meaning to be reflected by closeness in vector space.
A typical recipe for learning word vectors:
1. Get a bunch of text.
2. Create a dataset where the goal is to predict a word given nearby words (or,
   alternatively, to predict nearby words given a word).
3. Train a neural net to do well on this task.
4. Take the internal states of the trained neural net as the word vectors.
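For illustration, a minimal sketch of those four steps using the gensim library's Word2Vec
(an assumption: gensim is installed; the tiny corpus below is made up for the example):

from gensim.models import Word2Vec

corpus = [["data", "science", "uses", "python"],
          ["python", "is", "great", "for", "machine", "learning"],
          ["machine", "learning", "needs", "data"]]

# Train a small skip-gram model; its hidden weights become the word vectors.
model = Word2Vec(sentences=corpus, vector_size=50, window=2,
                 min_count=1, sg=1, epochs=100, seed=0)

vector = model.wv["python"]              # the learned 50-dimensional word vector
print(model.wv.most_similar("python"))   # nearest words by cosine similarity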
Recurrent Neural Network
• A Recurrent Neural Network (RNN) is a type of artificial neural network designed for
sequential data processing.
• Unlike feedforward neural networks, RNNs have connections that form directed cycles,
enabling them to maintain an internal "memory" of previous inputs, making them well-
suited for tasks involving time-series data or sequential dependencies.
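A minimal sketch of the recurrence that gives an RNN its memory: at each time step the new
hidden state is computed from the current input and the previous hidden state (the weight
names and toy dimensions below are this sketch's own, assuming a vanilla tanh RNN cell):

import numpy as np

def rnn_step(x: np.ndarray, h_prev: np.ndarray,
             W_xh: np.ndarray, W_hh: np.ndarray, b_h: np.ndarray) -> np.ndarray:
    """One step of a vanilla RNN: h_t = tanh(x_t @ W_xh + h_{t-1} @ W_hh + b_h)."""
    return np.tanh(x @ W_xh + h_prev @ W_hh + b_h)

# Run a toy sequence through the cell, carrying the hidden state forward.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3
W_xh = rng.normal(size=(input_dim, hidden_dim))
W_hh = rng.normal(size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)
for x in rng.normal(size=(5, input_dim)):   # a sequence of 5 input vectors
    h = rnn_step(x, h, W_xh, W_hh, b_h)
print(h)   # the final hidden state summarizes the whole sequence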
• Variants of RNNs:
 Long Short-Term Memory (LSTM): introduces gates (input, forget, and output
   gates) to control the flow of information, enabling the network to retain or discard
   information over long time periods.
 Gated Recurrent Unit (GRU): similar to LSTM but with fewer parameters,
   combining the forget and input gates into a single update gate.
Challenges with Basic RNNs:
• Vanishing and Exploding Gradients: During backpropagation through time (BPTT),
gradients can become very small (vanish) or very large (explode), making it hard to train
RNNs effectively over long sequences.
• Long-Term Dependencies: Standard RNNs struggle to capture long-range dependencies
in sequential data.

Applications of RNNs:
• Natural Language Processing (NLP): sentiment analysis, machine translation, text
  generation, and speech recognition.
• Time-Series Analysis: stock price prediction, weather forecasting, and signal processing.
• Sequential Data Modeling: video frame analysis and music generation.
Betweenness Centrality
• Betweenness Centrality is a metric used in network analysis to measure the importance
of a node (or edge) in a graph based on its role in connecting other nodes.
• It quantifies how often a node appears on the shortest paths between pairs of nodes in a
network.
# User here is a simple (id, name) record from the book's network example
users = [User(0, "Hero"), User(1, "Dunn"), User(2, "Sue"), User(3, "Chi"),
         User(4, "Thor"), User(5, "Clive"), User(6, "Hicks"),
         User(7, "Devin"), User(8, "Kate"), User(9, "Klein")]

and friendships:

friend_pairs = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4),
                (4, 5), (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]
• The betweenness centrality of node i is computed by adding up, for every other pair of
nodes j and k, the proportion of shortest paths between node j and node k that pass
through i.

For this network, users 0 and 9 have betweenness centrality 0 (as neither lies on any shortest
path between other users), whereas users 3, 4, and 5 all have high centralities (as all three
lie on many shortest paths).
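For illustration, a short sketch that computes these values with the networkx library instead
of from scratch (an assumption: networkx is installed):

import networkx as nx

friend_pairs = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4),
                (4, 5), (5, 6), (5, 7), (6, 8), (7, 8), (8, 9)]

G = nx.Graph()
G.add_edges_from(friend_pairs)

betweenness = nx.betweenness_centrality(G)   # node -> normalized betweenness
print(betweenness[0], betweenness[4])        # user 0 scores low, user 4 scores high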
• Closeness Centrality is a metric in network analysis that quantifies how “close” a node is
  to all other nodes in a network.
• It measures the average shortest distance from a given node to every other node,
  emphasizing how quickly information or resources can spread from that node throughout
  the network.
Eigenvector Centrality
• Eigenvector Centrality is a network analysis metric that measures the influence of a
node in a graph based on the importance of its neighbors.
• Unlike simpler measures like degree centrality, which counts the number of direct
connections, eigenvector centrality assigns higher scores to nodes that are connected
to other highly central nodes.
Characteristics
• Eigenvector centrality is a recursive metric, meaning a node's importance depends on
the importance of its neighbors.
• It can handle directed and undirected graphs.
• The computation requires finding the eigenvector of the adjacency matrix, which is
computationally intensive for large graphs.
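A minimal power-iteration sketch of this idea on a small hypothetical adjacency matrix (the
4-node graph below is illustrative, not the users network above):

import numpy as np

def eigenvector_centrality(adjacency: np.ndarray, num_iters: int = 100) -> np.ndarray:
    """Power iteration: repeatedly multiply by the adjacency matrix and renormalize."""
    n = adjacency.shape[0]
    v = np.ones(n) / n
    for _ in range(num_iters):
        v = adjacency @ v
        v = v / np.linalg.norm(v)
    return v

# Hypothetical graph: node 0 is connected to everyone, node 3 only to node 0.
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)
print(eigenvector_centrality(A))   # node 0 gets the highest score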
Comparison of Centrality Measures

| Centrality Measure     | What It Measures                         | Strengths                                   | Limitations                                  |
|------------------------|------------------------------------------|---------------------------------------------|----------------------------------------------|
| Degree Centrality      | Number of direct connections             | Simple and intuitive                        | Ignores indirect influence                   |
| Closeness Centrality   | Proximity to all other nodes             | Captures overall accessibility              | Sensitive to disconnected graphs             |
| Betweenness Centrality | Role in shortest paths                   | Highlights critical intermediaries          | Computationally expensive for large networks |
| Eigenvector Centrality | Influence based on neighbors' importance | Captures global importance                  | Favors nodes in dense areas                  |
| Katz Centrality        | Influence with baseline adjustment       | Combines influence and inherent importance  | Requires fine-tuning parameters              |
| PageRank               | Influence with random walks              | Effective for directed networks             | Sensitive to graph topology and parameters   |
Directed Graphs and PageRank
• PageRank is an algorithm that assigns importance scores to nodes in a directed graph, such
as web pages, based on their connectivity. It was originally developed by Larry Page and
Sergey Brin, the founders of Google, to rank websites in search results.
• The idea is to rank websites based on which other websites link to them, which other
  websites link to those, and so on.

endorsements = [(0, 1), (1, 0), (0, 2), (2, 0), (1, 2),
                (2, 1), (1, 3), (2, 3), (3, 4), (5, 4),
                (5, 6), (7, 5), (6, 8), (8, 7), (8, 9)]
A simplified version of PageRank works as follows:
1. There is a total of 1.0 (or 100%) PageRank in the network.
2. Initially this PageRank is equally distributed among nodes.
3. At each step, a large fraction of each node’s PageRank is distributed evenly among
   its outgoing links.
4. At each step, the remainder of each node’s PageRank is distributed evenly among
   all nodes.
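A minimal sketch of those four steps, assuming the endorsements list above and nodes numbered
0 through 9 (the damping factor of 0.85 is a conventional default, not something fixed by the
slides):

from collections import defaultdict
from typing import Dict, List, Tuple

def page_rank(nodes: List[int],
              edges: List[Tuple[int, int]],
              damping: float = 0.85,
              num_iters: int = 100) -> Dict[int, float]:
    outgoing = defaultdict(list)
    for source, target in edges:
        outgoing[source].append(target)

    n = len(nodes)
    pr = {node: 1 / n for node in nodes}   # step 2: spread the total of 1.0 evenly
    base = (1 - damping) / n               # step 4: the evenly shared remainder

    for _ in range(num_iters):
        next_pr = {node: base for node in nodes}
        for source in nodes:
            links = outgoing[source]
            if links:                      # step 3: share most rank over outgoing links
                for target in links:
                    next_pr[target] += damping * pr[source] / len(links)
            else:                          # node with no out-links: share over everyone
                for node in nodes:
                    next_pr[node] += damping * pr[source] / n
        pr = next_pr
    return pr

ranks = page_rank(list(range(10)), endorsements)
print(sorted(ranks.items(), key=lambda kv: -kv[1])[:3])   # highest-ranked nodes first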
Recommender Systems
• Another common data problem is producing recommendations of some sort.
• Netflix recommends movies you might want to watch.
• Amazon recommends products you might want to buy.
• Twitter recommends users you might want to follow.
users_interests = [
    ["Hadoop", "Big Data", "HBase", "Java", "Spark", "Storm", "Cassandra"],
    ["NoSQL", "MongoDB", "Cassandra", "HBase", "Postgres"],
    ["Python", "scikit-learn", "scipy", "numpy", "statsmodels", "pandas"],
    ["R", "Python", "statistics", "regression", "probability"],
    ["machine learning", "regression", "decision trees", "libsvm"],
    ["Python", "R", "Java", "C++", "Haskell", "programming languages"],
    ["statistics", "probability", "mathematics", "theory"],
    ["machine learning", "scikit-learn", "Mahout", "neural networks"],
    ["neural networks", "deep learning", "Big Data", "artificial intelligence"],
    ["Hadoop", "Java", "MapReduce", "Big Data"],
    ["statistics", "R", "statsmodels"],
    ["C++", "deep learning", "artificial intelligence", "probability"],
    ["pandas", "R", "Python"],
    ["databases", "HBase", "Postgres", "MySQL", "MongoDB"],
]
• A Recommendation System is a machine learning model designed to predict user
preferences and suggest relevant items, such as movies, books, or products.
• These systems are widely used in e-commerce, streaming services, and social
platforms. There are several approaches to building recommendation systems:
• Types of Recommendation Systems
1. Collaborative Filtering
2. Content-Based Filtering
3. Hybrid Systems
4. Matrix Factorization (Latent Factor Models)
5. Deep Learning Models
6. Knowledge-Based Systems
1. Collaborative Filtering
   • User-Based: recommends items based on similarities between users.
   • Item-Based: recommends items based on similarities between items.
   • Advantages: learns from real user behavior without requiring explicit content data.
   • Limitations: struggles with cold-start problems (new users or items).
2. Content-Based Filtering
   • Recommends items based on item features and user preferences.
   • Advantages: works well with specific content features (e.g., genre, price).
   • Limitations: limited diversity and struggles to recommend novel items.
3. Hybrid Systems
   • Combines collaborative and content-based filtering.
   • Advantages: balances the strengths of both approaches and reduces their limitations.
4. Matrix Factorization (Latent Factor Models)
   • Uses techniques like Singular Value Decomposition (SVD) to identify latent features in
     user-item interaction data. Common in collaborative filtering.
5. Deep Learning Models
   • Uses neural networks to capture complex patterns in user-item relationships.
   • Popular architectures: autoencoders, deep learning embeddings, and Transformer-based models.
6. Knowledge-Based Systems
   • Relies on domain-specific knowledge to recommend items.
   • Useful in specialized fields like healthcare and education.

Steps to Build a Recommendation System

• Data Collection: explicit feedback (e.g., ratings) or implicit feedback (e.g., clicks, views).
• Data Preprocessing: clean and preprocess data for missing values, scaling, or encoding.
• Feature Engineering: extract relevant features like user demographics or item attributes.
• Model Selection: choose collaborative, content-based, or hybrid models.
• Evaluation Metrics: precision, recall, F1-score, RMSE, MAE, or diversity metrics.
Item-based collaborative filtering
• Item-based collaborative filtering is a recommendation system technique that uses the
similarity between items to recommend products to users.
• It is particularly useful in systems where the number of users is very high, and the items
are fewer or well-curated.
Working
1. Build the Item-Item Similarity Matrix: Compute the similarity between items based on
user ratings or interactions. For example, in a movie recommendation system, similarity
could be calculated based on how users rated two movies (e.g., cosine similarity,
Pearson correlation).
2. Find Similar Items: For each item in the system, identify other items with high
similarity.
3. Generate Recommendations: For a user, identify items they have interacted with. Look
up similar items to those the user liked. Aggregate the scores of these similar items to
recommend the top-N items the user has not yet interacted with.
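For illustration, a minimal sketch of steps 1–3 on a hypothetical toy ratings matrix (the
matrix, the "rated above 3" threshold, and the variable names are assumptions of this sketch,
not from the slides):

import numpy as np

def cosine_similarity(v: np.ndarray, w: np.ndarray) -> float:
    return float(np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w)))

# Hypothetical toy matrix: rows are users, columns are items, 0 = no rating.
ratings = np.array([[5, 3, 0, 1],
                    [4, 0, 0, 1],
                    [1, 1, 0, 5],
                    [0, 0, 5, 4]])

# Step 1: item-item similarity, comparing columns of the ratings matrix.
n_items = ratings.shape[1]
item_sim = np.array([[cosine_similarity(ratings[:, i], ratings[:, j])
                      for j in range(n_items)] for i in range(n_items)])

# Steps 2-3: score items by their similarity to the items user 0 rated highly.
user0 = ratings[0]
scores = item_sim @ (user0 > 3)   # weight by the items user 0 liked
scores[user0 > 0] = 0             # don't re-recommend already-rated items
print(np.argsort(-scores))        # item indices, best candidates first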
Advantages:
Scalability: Easier to scale when there are many users.
Accuracy: Captures item similarities well, especially when user behavior is stable
over time.

Disadvantages:
Cold Start Problem: It struggles to recommend items with no user interactions (new
items).
Data Sparsity: If users rate only a few items, similarity computations can be less
effective.
User-based collaborative filtering
• User-based collaborative filtering is a recommendation system technique that focuses
on identifying users with similar preferences and using their preferences to
recommend items.
• It is one of the earliest and simplest collaborative filtering methods.
Working
1. Input Data: A user-item interaction matrix 𝑅, where each entry 𝑅𝑖𝑗​ represents the
interaction (e.g., rating or purchase) of user 𝑖 with item 𝑗.
2. Compute User Similarity: Find the similarity between users based on their
interactions with items. Common similarity measures include: Cosine Similarity,
Pearson Correlation
3. Find Similar Users: For a target user, identify the top-K most similar users.
4. Generate Recommendations: For the target user, recommend items that their similar
users have interacted with but they have not.
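A matching sketch of the user-based variant on the same kind of hypothetical toy matrix
(again, the data and names are illustrative assumptions, not the slides' own example):

import numpy as np

def cosine_similarity(v: np.ndarray, w: np.ndarray) -> float:
    return float(np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w)))

# Hypothetical toy matrix: rows are users, columns are items, 0 = no rating.
ratings = np.array([[5, 3, 0, 1],
                    [4, 0, 0, 1],
                    [1, 1, 0, 5],
                    [0, 0, 5, 4]])

# Step 2: user-user similarity, comparing rows of the ratings matrix.
n_users = ratings.shape[0]
user_sim = np.array([[cosine_similarity(ratings[i], ratings[j])
                      for j in range(n_users)] for i in range(n_users)])

# Steps 3-4: weight every other user's ratings by their similarity to the target user.
target = 0
weights = user_sim[target].copy()
weights[target] = 0                  # ignore the user's own ratings
scores = weights @ ratings           # aggregated, similarity-weighted ratings
scores[ratings[target] > 0] = 0      # only keep items the target user hasn't rated
print(np.argsort(-scores))           # item indices, best candidates first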
Advantages
Intuitive: Leverages the idea that “users with similar tastes will like similar items.”
Personalized: Recommendations are tailored to each user based on their preferences.

Disadvantages
Scalability: For a large number of users, finding similar users can be computationally
expensive.
Sparsity Problem: If user interactions are sparse, it may be difficult to find similar users.
Cold Start Problem: Struggles with new users who haven't interacted with any items.

Applications
E-commerce: Recommending products based on similar shoppers.
Streaming Services: Suggesting movies or music based on similar viewers or listeners.
Social Media: Recommending friends or content based on similar users' preferences.
Matrix factorization
• Matrix Factorization is a powerful technique used in recommendation systems to predict
missing values in a user-item interaction matrix.
• It is particularly effective in collaborative filtering, where the goal is to recommend
items to users based on their past behavior.
• Matrix factorization decomposes a large, sparse user-item interaction matrix R (e.g.,
  user ratings for movies) into two smaller matrices:
  1. User Matrix (U): represents user preferences in a latent feature space.
  2. Item Matrix (V): represents item attributes in the same latent feature space.
• Mathematically: R ≈ U × V^T
  where:
  R: original user-item matrix (e.g., ratings)
  U: user matrix (of size m × k, where m is the number of users and k is the number of
     latent features)
  V: item matrix (of size n × k, where n is the number of items)
• Each row of U represents a user's preferences, and each row of V represents an
  item's characteristics.

• The optimization goal is to minimize the error between the actual values R_ij and
  the predicted values R̂_ij = (U × V^T)_ij.

Algorithms for Matrix Factorization

1. Stochastic Gradient Descent (SGD):
   • Iteratively updates U and V by computing gradients of the loss function.
   • Easy to implement and works well for sparse data.
2. Alternating Least Squares (ALS):
   • Alternates between optimizing U while keeping V fixed and optimizing V
     while keeping U fixed.
   • Often faster than SGD for large datasets.
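A minimal SGD sketch of matrix factorization on a hypothetical toy ratings matrix (the data,
hyperparameters, and function name are this example's assumptions, not a definitive
implementation):

import numpy as np

def matrix_factorization_sgd(R, k=2, steps=5000, lr=0.01, reg=0.02):
    """Factor R (0 = missing) into U @ V.T with k latent features via SGD."""
    m, n = R.shape
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    observed = [(i, j) for i in range(m) for j in range(n) if R[i, j] > 0]
    for _ in range(steps):
        i, j = observed[rng.integers(len(observed))]
        err = R[i, j] - U[i] @ V[j]
        # Gradient step on the squared error with L2 regularization.
        U[i] += lr * (err * V[j] - reg * U[i])
        V[j] += lr * (err * U[i] - reg * V[j])
    return U, V

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 0, 5, 4]], dtype=float)
U, V = matrix_factorization_sgd(R)
print(np.round(U @ V.T, 1))   # predicted ratings, including the missing entries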
Some Important Questions

1. Describe the n-gram language model in detail.
2. Explain how grammars are used in modeling languages.
3. Explain Gibbs sampling and topic modeling.
4. Explain the recurrent neural network in detail.
5. Explain eigenvector centrality in detail.
6. Explain item-based and user-based collaborative filtering.
7. What is a word cloud?
8. Explain matrix factorization in detail.
9. Explain betweenness centrality and closeness centrality.
10. What is a recommendation system? Give its types.
