ContentBasedFiltering (2)
Recommendation
Content-based?
• No rating matrix required
• Works from an item-feature matrix instead of user ratings
Item Representation
• Structured
• Unstructured
• Semi-structured
Structured
• Attribute - value
Unstructured
• Full-text
• No attributes formally defined
• Other complications, such as synonymy and polysemy
Semi-structured
• Structured + unstructured
• Well defined attributes/values + free text
Conversion of
Unstructured Data
• Need to convert to structured form
• IR techniques
• VSM, TF-IDF, stemming, etc.
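The conversion step above can be sketched as a minimal TF-IDF weighting in the vector space model. The corpus and tokenizer below are toy assumptions for illustration, not part of the slides:

```python
import math

# Toy corpus standing in for unstructured documents (assumed example data).
docs = [
    "fire in austrian train tunnel",
    "train schedule changes in austria",
    "alps hiking season opens",
]

def tokenize(text):
    # A real pipeline would also apply stemming and stop-word removal.
    return text.lower().split()

# Document frequency for each term.
df = {}
for doc in docs:
    for term in set(tokenize(doc)):
        df[term] = df.get(term, 0) + 1

def tfidf(doc, n_docs=len(docs)):
    """Map free text to a sparse term -> weight vector (VSM)."""
    tokens = tokenize(doc)
    tf = {}
    for t in tokens:
        tf[t] = tf.get(t, 0) + 1
    return {t: (cnt / len(tokens)) * math.log(n_docs / df[t])
            for t, cnt in tf.items()}

vec = tfidf(docs[0])
# "train" occurs in 2 of 3 documents, so it is weighted below "fire".
```

Terms common to many documents get small IDF weights, which is exactly why TF-IDF vectors make a better structured representation than raw term counts.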
User Profile
• A model of the user's preferences
• A history of the user's interactions
• Recently viewed items
• Already viewed items
• Training data — machine learning
User Profile
[Figure: decision-tree user profile on the contact-lens data, branching on Tear Rate (reduced/normal), Age (young/pre-presbyopic/presbyopic), Prescription (myope/hypermetrope), and Astigmatic (yes/no), with yes/no recommendations at the leaves]
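A decision-tree profile like this can be encoded directly as nested rules. The branch outcomes below are illustrative assumptions, not read off the slide's tree:

```python
# Hypothetical rule encoding of a decision-tree user profile.
# Branch outcomes are assumptions for illustration only.
def recommend(tear_rate, astigmatic, prescription, age):
    """Return 'yes'/'no' recommendation for a contact-lens-style profile."""
    if tear_rate == "reduced":
        return "no"                     # reduced tear rate: never recommend
    if not astigmatic:
        return "yes"                    # normal tears, no astigmatism
    if prescription == "myope":
        return "yes"
    # hypermetrope branch: outcome depends on age in this sketch
    return "yes" if age == "young" else "no"
```

Each path from the root to a leaf becomes one chain of conditions, which is why decision trees are popular as interpretable user profiles.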
User Profiling
• Collects information about an individual user
• User modeling side: train/learn a model, then classify/recommend
UM Learning
Feedback
• Implicit feedback
• Indirect interaction
• Opened documents, reading time, etc.
• Large amounts of data, but high uncertainty
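One simple way to turn such indirect interactions into usable preference scores is sketched below; the event fields and the reading-time cap are assumptions for illustration:

```python
# Sketch: deriving implicit-feedback scores from interaction logs.
# Event structure and weighting are illustrative assumptions.
events = [
    {"item": "doc1", "opened": True, "reading_secs": 120},
    {"item": "doc2", "opened": True, "reading_secs": 5},
    {"item": "doc3", "opened": False, "reading_secs": 0},
]

def implicit_score(event, max_secs=300):
    """Combine 'opened' and reading time into a noisy preference score in [0, 1]."""
    if not event["opened"]:
        return 0.0
    # Normalize reading time; long reads saturate at 1.0.
    return min(event["reading_secs"] / max_secs, 1.0)

scores = {e["item"]: implicit_score(e) for e in events}
```

The uncertainty mentioned on the slide shows up here: a short reading time may mean disinterest, or just a short document, so these scores are plentiful but noisy.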
UM Learning
Feedback
• Explicit feedback
• Little noise, but hard to obtain
User Model Learning
Feature Selection
• Problem of high-dimensional input vectors
• Overfitting (especially when the dataset is small)
• Common methods: document frequency thresholding, information gain, mutual information, chi-square statistic, term strength
Overfitting
[Figure: overfitted vs. underfitted decision boundaries]
User Model Learning
Feature Selection
• Mutual Information
• A = number of times t and c co-occur
B = number of times t occurs without c
C = number of times c occurs without t
N = total number of documents
• I(t, c) = log( (A × N) / ((A + C) × (A + B)) )
User Model Learning
Feature Selection
• Austrian train fire accident
• After learning 5 documents

         Fire   Train   Alps   Austria   People
  A         5       5      2         5        5
  B      5873    8092     93       974    34501
  C         0       0      3         0        0
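Since the slide does not give the corpus size N, the sketch below ranks the five terms by I(t, c) − log N = log(A / ((A + C)(A + B))); because log N is the same additive constant for every term, this produces the same ordering as the full mutual-information score for any fixed N:

```python
import math

# Counts from the table above (class c = "Austrian train fire accident").
counts = {
    #          A,     B, C
    "Fire":    (5,  5873, 0),
    "Train":   (5,  8092, 0),
    "Alps":    (2,    93, 3),
    "Austria": (5,   974, 0),
    "People":  (5, 34501, 0),
}

def mi_minus_log_n(A, B, C):
    """I(t, c) - log N = log(A / ((A + C) * (A + B))).
    The unknown corpus size N does not affect the ranking."""
    return math.log(A / ((A + C) * (A + B)))

ranked = sorted(counts, key=lambda t: mi_minus_log_n(*counts[t]), reverse=True)
# ranked -> ["Alps", "Austria", "Fire", "Train", "People"]
```

Note how the rare term "Alps" outranks the frequent "Fire" and "Train" even though it co-occurs with the class only twice, illustrating mutual information's known bias toward rare terms.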
Decision Tree
[Figure: decision tree branching on Age (young/pre-presbyopic/presbyopic), Prescription (myope/hypermetrope), and Astigmatic (yes/no)]
Decision Tree
Example - evaluation
k Nearest Neighbor
[Figure: kNN classification of a query point with k=3 vs. k=5]
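A minimal k-nearest-neighbor classifier can be sketched as below; the 2-D feature vectors and like/dislike labels are toy assumptions, not data from the slides:

```python
import math
from collections import Counter

# Toy 2-D feature vectors with preference labels (assumed example data).
train = [
    ((1.0, 1.0), "like"), ((1.2, 0.8), "like"), ((0.9, 1.1), "like"),
    ((4.0, 4.0), "dislike"), ((4.2, 3.9), "dislike"), ((3.8, 4.1), "dislike"),
]

def knn_predict(query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(train, key=lambda p: math.dist(query, p[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_predict((1.5, 1.5), 3))   # prints "like"
```

Near the boundary between the two clusters, the prediction can flip between k=3 and k=5, which is exactly the sensitivity to k that the figure illustrates.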
Linear Classifier
[Figure: linear decision boundary separating training data]
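One concrete way to learn such a linear boundary is the perceptron; the sketch below uses the same assumed toy data as the kNN example and a fixed number of epochs:

```python
# Perceptron sketch: learns a linear boundary w.x + b = 0 from labeled data.
# Feature vectors and labels are toy assumptions for illustration.
train = [
    ((1.0, 1.0), 1), ((1.2, 0.8), 1), ((0.9, 1.1), 1),
    ((4.0, 4.0), -1), ((4.2, 3.9), -1), ((3.8, 4.1), -1),
]

w = [0.0, 0.0]
b = 0.0
for _ in range(20):                                  # fixed training epochs
    for (x1, x2), y in train:
        if y * (w[0] * x1 + w[1] * x2 + b) <= 0:     # misclassified point
            w[0] += y * x1                           # nudge boundary toward it
            w[1] += y * x2
            b += y

def predict(x1, x2):
    """Classify by which side of the learned hyperplane the point falls on."""
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
```

On linearly separable data like this, the perceptron update rule is guaranteed to converge to a boundary that classifies every training point correctly.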