
A Study of Different Similarity Metrics and Prediction Approaches in Collaborative Filtering for Recommendation

National Institute of Technology, Durgapur


West Bengal - 713209

January 30, 2018


Outline
• Recommender System
• Motivation
• Introduction to Collaborative Filtering
• Types of similarity algorithms in Collaborative Filtering
• Introduction to Similarity Metrics and Prediction Approaches
• Accuracy comparison of different similarity algorithms in Collaborative Filtering
• Conclusion
• Future Work



Recommender System
What is a Recommender System?



Motivation

Why are Recommender Systems important?

• Products Recommendation
• Jobs Recommendation
• Movies Recommendation
• Songs Recommendation
• News Recommendation
• Friends Recommendation


 Collaborative Filtering Based Recommender System



Example
Input

User preferences for books on a scale of 1-5, and the titles of the books

Similarity Algorithms in Collaborative Filtering

• Item Based Collaborative Filtering algorithm


• User Based Collaborative Filtering algorithm



Similarity Metrics
• Cosine Similarity (CS)
• Adjusted Cosine Similarity (ACS)
• Pearson Correlation (PC)
• Spearman Correlation (SC)
• Jaccard Similarity (JS)
• Euclidean Distance (ED)
• Manhattan Distance (MD)
• Mean Squared Distance (MSD)



Item Based Similarity Metrics
• Cosine Similarity (CS)

$$s(i,j) = \frac{\sum_u r_{iu}\, r_{ju}}{\sqrt{\sum_u r_{iu}^2}\; \sqrt{\sum_u r_{ju}^2}}$$

where $r_{iu}$ and $r_{ju}$ denote the rating of user $u$ on items $i$ and $j$.

With the two books' rating vectors $r_1 = (4, 5, 4, 0, 0, 0)$ and $r_2 = (3, 0, 0, 3, 4, 0)$ over users U1-U6:

$$s(1,2) = \frac{(4 \cdot 3) + (5 \cdot 0) + (4 \cdot 0) + (0 \cdot 3) + (0 \cdot 4) + (0 \cdot 0)}{\sqrt{4^2 + 5^2 + 4^2}\; \sqrt{3^2 + 3^2 + 4^2}} = 0.272$$



 Item Based Cosine Similarity Matrix

Cosine Similarity matrix on Books



• Adjusted Cosine Similarity (ACS)

$$s(i,j) = \frac{\sum_u (r_{iu} - \bar{r}_u)(r_{ju} - \bar{r}_u)}{\sqrt{\sum_u (r_{iu} - \bar{r}_u)^2}\; \sqrt{\sum_u (r_{ju} - \bar{r}_u)^2}}$$

where $\bar{r}_u$ is the average rating of user $u$ over all items.

Average rating of each user:

User:     U1    U2     U3    U4    U5    U6
Average:  4     4.33   4     4     4     3.67

$$Sim(1,2) = \frac{(4-4)(3-4) + (5-4.33)(0-4.33) + (4-4)(0-4) + (0-4)(3-4) + (0-4)(4-4) + (0-3.67)(0-3.67)}{\sqrt{(4-4)^2 + (5-4.33)^2 + (4-4)^2 + (0-4)^2 + (0-4)^2 + (0-3.67)^2}\; \sqrt{(3-4)^2 + (0-4.33)^2 + (0-4)^2 + (3-4)^2 + (4-4)^2 + (0-3.67)^2}} = 0.3032$$



 Item Based Adjusted Cosine Similarity Matrix

          B1        B2        B3        B4        B5        B6
B1      1         0.3032    0.7765    0.4997    0.9784   -0.2376
B2      0.3032    1         0.2264    0.5062    0.3700    0.6342
B3      0.7765    0.2264    1         0.8069    0.6963    0.1370
B4      0.4997    0.5062    0.8069    1         0.5020    0.5841
B5      0.9784    0.3700    0.6963    0.5020    1        -0.2292
B6     -0.2376    0.6342    0.1370    0.5841   -0.2292    1

Adjusted Cosine Similarity matrix on Books



• Pearson Correlation (PC)

$$s(i,j) = \frac{\sum_u (R_{u,i} - \bar{r}_i)(R_{u,j} - \bar{r}_j)}{\sqrt{\sum_u (R_{u,i} - \bar{r}_i)^2}\; \sqrt{\sum_u (R_{u,j} - \bar{r}_j)^2}}$$

where $R_{u,i}$ and $R_{u,j}$ denote the rating of user $u$ on items $i$ and $j$, and $\bar{r}_i$ is the average rating of item $i$.

Average rating of each item:

Book:     B1      B2      B3      B4      B5      B6
Average:  4.333   3.333   3.667   3.50    4.333   4.667

$$Sim(1,2) = \frac{(4-4.33)(3-3.33) + (5-4.33)(0-3.33) + (4-4.33)(0-3.33) + (0-4.33)(3-3.33) + (0-4.33)(4-3.33) + (0-4.33)(0-3.33)}{\sqrt{(4-4.33)^2 + (5-4.33)^2 + (4-4.33)^2 + (0-4.33)^2 + (0-4.33)^2 + (0-4.33)^2}\; \sqrt{(3-3.33)^2 + (0-3.33)^2 + (0-3.33)^2 + (3-3.33)^2 + (4-3.33)^2 + (0-3.33)^2}} = 0.2725$$
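A short numpy sketch contrasting ACS and PC on the same two rating vectors; the only difference is which average is subtracted (per-user for ACS, per-item for PC). The averages are copied from the tables above:

```python
import numpy as np

b1 = np.array([4, 5, 4, 0, 0, 0], dtype=float)  # ratings of U1-U6 on B1
b2 = np.array([3, 0, 0, 3, 4, 0], dtype=float)  # ratings of U1-U6 on B2

user_avg = np.array([4, 4.33, 4, 4, 4, 3.67])   # per-user averages (for ACS)
item_avg = {"B1": 4.333, "B2": 3.333}           # per-item averages (for PC)

def centered_cosine(x, y, cx, cy):
    """Cosine similarity after subtracting the given centers."""
    dx, dy = x - cx, y - cy
    return dx @ dy / (np.linalg.norm(dx) * np.linalg.norm(dy))

print(centered_cosine(b1, b2, user_avg, user_avg))              # ~0.303 (ACS)
print(centered_cosine(b1, b2, item_avg["B1"], item_avg["B2"]))  # ~0.2725 (PC)
```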



 Item Based Pearson Correlation Matrix

Pearson Correlation Similarity matrix on Books



• Spearman Correlation (SC)

Rank matrix: each rating is replaced by its rank within the item's rating vector.
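For reference (notation assumed here): Spearman correlation is the Pearson correlation computed on the ranks. With $d_u$ the difference between the ranks of user $u$'s ratings on the two items, over $n$ users, the usual closed form is:

$$\rho = 1 - \frac{6\sum_u d_u^2}{n(n^2 - 1)}$$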



 Item Based Spearman Correlation Matrix

Spearman Correlation Similarity matrix on Books



• Jaccard Similarity (JS)

$$s(i,j) = \frac{|U_i \cap U_j|}{|U_i \cup U_j|}$$

where $U_i$ is the set of users who rated item $i$. Books 1 and 2 share one common rater out of five raters in total, so:

$$Sim(1,2) = \frac{1}{5} = 0.20$$

Jaccard Similarity matrix on Books



• Euclidean Distance (ED)
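A standard form of this distance over two items' rating vectors (assumed here, together with a common distance-to-similarity conversion):

$$ED(i,j) = \sqrt{\sum_u (r_{iu} - r_{ju})^2} \qquad s(i,j) = \frac{1}{1 + ED(i,j)}$$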



• Manhattan Distance (MD)
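Likewise, the standard Manhattan (city-block) form, assumed here:

$$MD(i,j) = \sum_u |r_{iu} - r_{ju}|$$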



• Mean Squared Distance (MSD)
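A common form of the mean squared distance, assumed here, averaged over the set $U_{ij}$ of users who rated both items; the smaller the distance, the more similar the two items:

$$MSD(i,j) = \frac{\sum_{u \in U_{ij}} (r_{iu} - r_{ju})^2}{|U_{ij}|}$$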



Rating Prediction Approaches
• Mean Centering (MC)

$$\hat{r}_{ui} = \bar{r}_i + \frac{\sum_{j \in N_u(i)} sim(i,j)\,(r_{ju} - \bar{r}_j)}{\sum_{j \in N_u(i)} |sim(i,j)|}$$

• Weighted Average (WA)

$$\hat{r}_{ui} = \frac{\sum_{j \in N_u(i)} sim(i,j) \cdot r_{ju}}{\sum_{j \in N_u(i)} |sim(i,j)|}$$

• Z-Score (ZS)

$$\hat{r}_{ui} = \bar{r}_i + \sigma_i \cdot \frac{\sum_{j \in N_u(i)} sim(i,j)\,(r_{ju} - \bar{r}_j)/\sigma_j}{\sum_{j \in N_u(i)} |sim(i,j)|}$$

where $N_u(i)$ is the set of neighbor items of item $i$ rated by user $u$, $\bar{r}_j$ is the average rating of item $j$, and $\sigma_j$ is its standard deviation.



Rating Prediction
• Mean Centering (MC)

$$\hat{r}_{ui} = \bar{r}_i + \frac{\sum_{j \in N_u(i)} sim(i,j)\,(r_{ju} - \bar{r}_j)}{\sum_{j \in N_u(i)} |sim(i,j)|}$$

Average rating of each book:

Book:     B1      B2      B3      B4      B5      B6
Average:  4.333   3.333   3.667   3.50    4.333   4.667

$$\hat{r}_{13} = 3.667 + \frac{0.79(4 - 4.333) + 0 + 1(0 - 3.667) + 0.69(0 - 3.50) + 0.71(5 - 4.33) + 0.18(0 - 4.667)}{0.79 + 0.69 + 0 + 1 + 0.71 + 0.18} = 1.67$$



• Weighted Average (WA)

$$\hat{r}_{ui} = \frac{\sum_{j \in N_u(i)} sim(i,j) \cdot r_{ju}}{\sum_{j \in N_u(i)} |sim(i,j)|}$$

$$\hat{r}_{13} = \frac{0.79(4) + 0 + 1(0) + 0.69(0) + 0.71(5) + 0.18(0)}{0.79 + 0.69 + 0 + 1 + 0.71 + 0.18} = 1.99$$



• Z-Score (ZS)

$$\hat{r}_{ui} = \bar{r}_i + \sigma_i \cdot \frac{\sum_{j \in N_u(i)} sim(i,j)\,(r_{ju} - \bar{r}_j)/\sigma_j}{\sum_{j \in N_u(i)} |sim(i,j)|}$$

Standard deviation of each item:

Book:   B1       B2       B3       B4       B5       B6
σ:      0.5774   0.5774   1.5275   0.7071   0.5774   0.5774

$$\hat{r}_{13} = 1.67$$
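A small Python sketch of the MC and WA predictions for $\hat{r}_{13}$, plugging in the slide's numbers; the neighbor weights are assumed to be read off the deck's item similarity matrix (not fully reproduced in this extract):

```python
import numpy as np

def predict(sims, ratings, means, target_mean, mode="MC"):
    """k-NN item-based prediction for one (user, item) cell.

    sims[j]    -- similarity of the target item to neighbor item j
    ratings[j] -- the active user's rating on neighbor j (0 = unrated)
    means[j]   -- average rating of neighbor item j (used by MC)
    """
    sims, ratings = np.asarray(sims, float), np.asarray(ratings, float)
    norm = np.abs(sims).sum()
    if mode == "WA":                                   # weighted average
        return (sims * ratings).sum() / norm
    deviations = ratings - np.asarray(means, float)    # mean centering
    return target_mean + (sims * deviations).sum() / norm

sims    = [0.79, 0, 1, 0.69, 0.71, 0.18]             # weights used on the slide
ratings = [4, 3, 0, 0, 5, 0]                         # U1's ratings on B1..B6
means   = [4.333, 3.333, 3.667, 3.50, 4.333, 4.667]  # item averages
print(predict(sims, ratings, means, 3.667, "MC"))    # ~1.68 (slide: 1.67)
print(predict(sims, ratings, means, 3.667, "WA"))    # ~1.99
```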



Performance Metric
• Mean Absolute Error (MAE)

• Root Mean Squared Error (RMSE)
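The standard definitions, with $p_k$ the predicted rating, $r_k$ the actual rating, and $n$ test ratings:

$$MAE = \frac{1}{n}\sum_{k=1}^{n} |p_k - r_k| \qquad RMSE = \sqrt{\frac{1}{n}\sum_{k=1}^{n} (p_k - r_k)^2}$$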



Accuracy Comparison of Item Similarity Based Collaborative Filtering
Mean Centering

Weighted Average

Z-Score



 User Based Collaborative filtering approach

Cosine Similarity matrix on Users



 User Based Collaborative filtering approach (Contd...)



Accuracy Comparison of User Similarity Based Collaborative Filtering

Mean Centering

Weighted Average

Z Score



Conclusion & Future Work

Prediction Approaches     Item based Similarity Metrics   User based Similarity Metrics
Mean Centering (MC)       ACS                             ACS & SC
Weighted Average (WA)     JS                              ED & JS
Z-Score (ZS)              PC                              ED & JS



Task and Eval (1): Rating Prediction

Task: predict the rating P(U, T) for each (U, T) pair in the testing set.
Prediction error: e = R(U, T) - P(U, T)

Train:
User   Item   Rating
U1     T1     4
U1     T2     3
U1     T3     3
U2     T2     4
U2     T3     5
U2     T4     5
U3     T4     4

Test:
U1     T4     3
U2     T1     2
U3     T1     3
U3     T2     3
U3     T3     4

Evaluation metrics: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Coverage, and more …
Task and Eval (1): Rating Prediction

1. Build a model, e.g., P(U, T) = Avg(T)
2. Process of Rating Prediction (same train/test split as above):
   P(U1, T4) = Avg(T4) = (5+4)/2 = 4.5
   P(U2, T1) = Avg(T1) = 4/1 = 4
   P(U3, T1) = Avg(T1) = 4/1 = 4
   P(U3, T2) = Avg(T2) = (3+4)/2 = 3.5
   P(U3, T3) = Avg(T3) = (3+5)/2 = 4
3. Evaluation by Metrics:
   e_i = R(U, T) - P(U, T)
   MAE = (|3 - 4.5| + |2 - 4| + |3 - 4| + |3 - 3.5| + |4 - 4|) / 5 = 1
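A short Python sketch reproducing the slide's MAE, with the item-average model built from the training ratings:

```python
# Item-average rating prediction scored on the five held-out test ratings.
train = {"T1": [4], "T2": [3, 4], "T3": [3, 5], "T4": [5, 4]}
test = [("U1", "T4", 3), ("U2", "T1", 2), ("U3", "T1", 3),
        ("U3", "T2", 3), ("U3", "T3", 4)]

def avg(xs):
    return sum(xs) / len(xs)

errors = [abs(r - avg(train[t])) for _, t, r in test]
print(sum(errors) / len(errors))  # 1.0
```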
Task and Eval (2): Top-N Recommendation

Task: recommend the Top-N items to user U3 (same train/test split as above).

Predicted Rank: T3, T1, T4, T2
Real Rank: T3, T2, T1

Then compare the two lists: Precision@N = # of hits / N

Other evaluation metrics: Recall, Mean Average Precision (MAP), Normalized Discounted Cumulative Gain (NDCG), Mean Reciprocal Rank (MRR), and more …
Task and Eval (2): Top-N Recommendation

Task: recommend the Top-N items to user U3

1. Build a model, e.g., P(U, T) = Avg(T)
2. Process of Rating Prediction:
   P(U3, T1) = Avg(T1) = 4/1 = 4
   P(U3, T2) = Avg(T2) = (3+4)/2 = 3.5
   P(U3, T3) = Avg(T3) = (3+5)/2 = 4
   P(U3, T4) = Avg(T4) = (4+5)/2 = 3.5

   Predicted Rank: T3, T1, T4, T2
   Real Rank: T3, T2, T1
3. Evaluation based on the two lists:
   Precision@N = # of hits / N
   Precision@1 = 1/1
   Precision@2 = 2/2
   Precision@3 = 2/3
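A short Python sketch of this evaluation (a hit = a predicted item that appears anywhere in the real list):

```python
# Precision@N for the slide's example.
predicted = ["T3", "T1", "T4", "T2"]
relevant = {"T3", "T2", "T1"}

def precision_at(n, predicted, relevant):
    hits = sum(1 for item in predicted[:n] if item in relevant)
    return hits / n

for n in (1, 2, 3):
    print(n, precision_at(n, predicted, relevant))  # 1.0, 1.0, 0.666...
```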
Context and Context-awareness
Outline

• Context and Context-awareness
 What is context, and examples
 What is context-awareness, and examples
 Context collections
Factors Influencing Holiday Decisions

Example of Contexts

• Search in Google (by time)

Example of Contexts

• Search in Google Map (by location)

Example of Contexts Beyond Time & Locations

• Search in Google Map (by location)

What is Context?

• "Context is any information that can be used to characterize the situation of an entity." (Anind K. Dey, 2001)

What is Context?

There are tons of ways to split contexts into different categories.

What is Context?

The most common contextual variables:
 Time and Location
 User intent or purpose
 User emotional states
 Devices
 Topics of interest, e.g., apple vs. Apple
 Others: companion, weather, budget, etc.

Usually, the selection/definition of contexts is a domain-specific problem.

Outline

• Context and Context-awareness
 What is context, and examples
 What is context-awareness, and examples
 Context collections
Context-Awareness

• Context-Awareness = adapting to the changes of the contextual situations, in order to build smart applications
• It has been successfully applied to:
– Ubiquitous computing
– Mobile computing
– Information Retrieval
– Recommender Systems
– And so forth…
Example: Smart Home with Remote Controls
https://www.youtube.com/watch?v=jB7iuBKcfZw

Example: Smart Home with Context-awareness
https://www.youtube.com/watch?v=UQWYRsXkbAM

Content vs Context

• Content-Based Approaches

• Context-Driven Applications and Approaches

Content-Based Approaches

Context-Driven Applications and Approaches

At Cinema with Friends, At Home with Family, At the Swimming Pool with a Partner

When Contexts Take Effect?

• Contexts could be useful at different points on the timeline:

 Past Context: Historical Data or Knowledge (Context Modeling, Context Mining)
 Current Context: Most Applications (Context Matching, Context Adaptation)
 Future Context: Ubiquitous Computing (Context Prediction, Context Adaptation)

Outline

• Context and Context-awareness
 What is context, and examples
 What is context-awareness, and examples
 Context collections
How to Collect Contexts

• Sensors
e.g., the application of smart homes
• User Inputs
e.g., survey or user interactions
• Inference
e.g., from user reviews

How to Collect Contexts

• Sensors, e.g., the application of smart homes

How to Collect Contexts

• User Inputs, e.g., survey or user interactions

How to Collect Contexts

• Inference, e.g., from user reviews

Family Trip

Early Arrival

Season and Family Trip

Short Summary

• Information Overload
• Solution: Information Retrieval (IR)
e.g., Google Search Engine
• Solution: Recommender Systems (RecSys)
e.g., Movie recommender by Netflix
• Context and Context-awareness
e.g., Mobile computing and smart home devices

Next

• Coffee Break
– Time: 3:00 PM to 3:30 PM
– Location: Outside Merchants
• Context-awareness in IR and RecSys
• Extended Topics: Trends, Challenges and Future

Context-awareness in IR
Context-awareness in IR: Examples

• Search in Google (by time)

Context-awareness in IR: Examples

• Search in Google Map (by location)

Context in IR

• Searches should be processed in the context of the information surrounding them, allowing more accurate search results that better reflect the user's actual intentions. (Finkelstein, 2001)
• Context, in IR, refers to the whole of the data, metadata, applications and cognitive structures embedded in situations of retrieval or information seeking. (Tamine et al., 2010)
• This information usually has an impact on the user's behavior and perception of relevance.
Context in IR

Context-awareness in IR

Gareth J.F. Jones, 2004

Context-awareness in IR

The development of Context-awareness in IR:

 Interactive and Proactive, by Gareth J.F. Jones, 2001
 Other Frameworks or Models
 Temporal Models
 Semantic Models
 Topic Models
 Multimedia as inputs: IR based on voice or audio
 And so forth…

Terminologies in IR

 The Author: the author of the documents; the information provider
 The End User: the one who issues the queries, or whose context information is captured
 Information Recipient: the one who finally receives the retrieved information

We assume the end user and the information recipient are the same person.

Context-awareness in IR: Interactive Applications

Interactive app
• User-driven approach
• Users explicitly issue a request (along with context information) to retrieve relevant documents
• Example: "What are the comfortable hotels near the Omaha Zoo?" (assume there are no automatic location detectors or sensors)
• Contexts are included in the query, or a finer-grained query can be derived from related keywords
Context-awareness in IR: Proactive Applications

Proactive App
• Author-driven approach
• Each document is associated with a trigger context. Documents are retrieved for the user when their trigger context matches the user's current context.
• Example-1: (Location, Time) = trigger contexts for each restaurant; open Yelp and input "Chinese dish", and Yelp will return a list of Chinese restaurants that are nearby and have valid opening hours at the current moment. [search with queries]
• Example-2: (Location, Time) = trigger contexts for each restaurant; open Yelp, and Yelp will deliver a list of Chinese restaurants that are nearby and have valid opening hours at the current moment. [retrieval or recommendation without queries]
Context-awareness in IR

Other Frameworks or Models

 Temporal Models: image retrieval; NYC pictures: old or new? summer or winter?
 Semantic Models: text retrieval; apple vs. Apple?
 Topic Models: academic paper retrieval; AI, ML, DM, RecSys?
 Multimedia as inputs: IR based on voice or audio; bird singing: which birds? real birds? emotional reactions?
 And so forth…

Context-awareness in IR

Interactive
 The user must give a query
 The context information is involved in the user inputs
 It is a process from user contexts to relevant documents
Proactive
 The user may or may not give a query
 Contexts are captured automatically
 It is a process of matching trigger contexts with user contexts
 It is a process from documents to users
Context-awareness in RecSys
Outline

• Context-aware Recommendation
 Intro: Does context matter?
 Definition: What is Context in RecSys?
 Collection: Context Acquisition
 Selection: How to identify the relevant context?
 Context Incorporation
 Context Filtering
 Context Modeling
 Other Challenges and CARSKit
Non-context vs Context

• Decision Making = Rational + Contextual


• Examples:
 Travel destination: in winter vs in summer
 Movie watching: with children vs with partner
 Restaurant: quick lunch vs business dinner
 Music: workout vs study
What is Context?

• "Context is any information that can be used to characterize the situation of an entity." (Anind K. Dey, 2001)

• Representative Context: Fully Observable and Static
• Interactive Context: Non-fully Observable and Dynamic
Interactive Context Adaptation
• Interactive Context: Non-fully observable and Dynamic
List of References:
 M Hosseinzadeh Aghdam, N Hariri, B Mobasher, R Burke. "Adapting Recommendations to Contextual Changes Using Hierarchical Hidden Markov Models", ACM RecSys 2015
 N Hariri, B Mobasher, R Burke. "Adapting to user preference changes in interactive recommendation", IJCAI 2015
 N Hariri, B Mobasher, R Burke. "Context adaptation in interactive recommender systems", ACM RecSys 2014
 N Hariri, B Mobasher, R Burke. "Context-aware music recommendation based on latent topic sequential patterns", ACM RecSys 2012

CARS With Representative Context

• Observed Context: contexts are those variables which may change when the same activity is performed again and again.
• Examples:
 Watching a movie: time, location, companion, etc.
 Listening to music: time, location, emotions, occasions, etc.
 Party or Restaurant: time, location, occasion, etc.
 Travels: time, location, weather, transportation condition, etc.

What is Representative Context?

Activity Structure:
1) Subjects: a group of users
2) Objects: a group of items/users
3) Actions: the interactions within the activities

Which variables could be context?

1) Attributes of the actions
 Watching a movie: time, location, companion
 Listening to music: time, occasions, etc.
2) Dynamic attributes or status from the subjects
 User emotions

Reference: Yong Zheng. "A Revisit to The Identification of Contexts in Recommender Systems", IUI 2015
Context-aware RecSys (CARS)

• Traditional RS: Users × Items → Ratings
• Contextual RS: Users × Items × Contexts → Ratings

Example of Multi-dimensional Context-aware Data set


User Item Rating Time Location Companion
U1 T1 3 Weekend Home Kids
U1 T2 5 Weekday Home Partner
U2 T2 2 Weekend Cinema Partner
U2 T3 3 Weekday Cinema Family
U1 T3 ? Weekend Cinema Kids

Terminology in CARS

• Example of Multi-dimensional Context-aware Data set


User Item Rating Time Location Companion
U1 T1 3 Weekend Home Kids
U1 T2 5 Weekday Home Partner
U2 T2 2 Weekend Cinema Partner
U2 T3 3 Weekday Cinema Family
U1 T3 ? Weekend Cinema Kids

Context Dimension: time, location, companion


Context Condition: Weekend/Weekday, Home/Cinema
Context Situation: {Weekend, Home, Kids}

Context Acquisition
How do we collect contexts, and user preferences within those contexts?
• By User Surveys or Explicitly Asking for User Inputs
Predefine context & ask users to rate items in these situations;
Or directly ask users about their contexts in user interface;
• By Usage data
The log data usually contains time and location (at least);
User behaviors can also infer context signals;
• By User reviews
Text mining or opinion mining could be helpful to infer context
information from user reviews

Examples: Context Acquisition (RealTime)

Examples: Context Acquisition (Explicit)

Examples: Context Acquisition (Explicit)

Examples: Context Acquisition (Explicit)
Mobile App: South Tyrol Suggests

Personality
Collection

Context
Collection

Examples: Context Acquisition (Implicit)

• Inference, e.g., from user reviews

Family Trip

Early Arrival

Season and Family Trip

Examples: Context Acquisition (PreDefined)

Examples: Context Acquisition (PreDefined)

Google Music: Listen Now

Examples: Context Acquisition (User Behavior)

Context Relevance and Context Selection

Apparently, not all contexts are relevant or influential.


• By User Surveys
e.g., which ones are important for you in this domain
• By Feature Selection
e.g., Principal Component Analysis (PCA)
e.g., Linear Discriminant Analysis (LDA)
• By Statistical Analysis or Detection on Contextual Ratings
Statistical test, e.g., Freeman-Halton Test
Other methods: information gain, mutual information, etc
Reference: Odic, Ante, et al. "Relevant context in a movie
recommender system: Users’ opinion vs. statistical detection."
CARS Workshop@ACM RecSys 2012
Context-aware Data Sets

Public Data Set for Research Purpose


• Food: AIST Japan Food, Mexico Tijuana Restaurant Data
• Movies: AdomMovie, DePaulMovie, LDOS-CoMoDa Data
• Music: InCarMusic
• Travel: TripAdvisor, South Tyrol Suggests (STS)
• Mobile: Frappe
Frappe is a large data set; the others are either small or sparse.
Downloads and References:
https://github.com/irecsys/CARSKit/tree/master/context-
aware_data_sets

Context Incorporation

• Once we have collected context information and identified the most influential or relevant contexts, the next step is to incorporate the contexts into the recommender system.
• Traditional RS: Users × Items → Ratings
• Contextual RS: Users × Items × Contexts → Ratings

Context-aware RecSys (CARS)

• There are three ways to build algorithms for CARS

Context-aware RecSys (CARS)

• Next, we focus on the following CARS algorithms:
 Contextual Filtering: use context as a filter
 Contextual Modeling: independent vs. dependent

Contextual Filtering

Contextual Filtering

Reduction-based Approach, 2005


Exact and Generalized PreFiltering, 2009
Item Splitting, 2009
User Splitting, 2011
Dimension as Virtual Items, 2011
Differential Context Relaxation, 2012
Differential Context Weighting, 2013
Semantic Contextual Pre-Filtering, 2013
UI Splitting, 2014
Differential Context Modeling

• Data Sparsity Problem in Contextual Rating


User Movie Time Location Companion Rating

U1 Titanic Weekend Home Girlfriend 4

U2 Titanic Weekday Home Girlfriend 5

U3 Titanic Weekday Cinema Sister 4

U1 Titanic Weekday Home Sister ?

 Context Matching → only use profiles given in <Weekday, Home, Sister>
 Context Relaxation → use a subset of context dimensions to match
 Context Weighting → use all profiles, but weighted by context similarity

Differential Context Modeling

• Context Relaxation
User Movie Time Location Companion Rating

U1 Titanic Weekend Home Girlfriend 4

U2 Titanic Weekday Home Girlfriend 5

U3 Titanic Weekday Cinema Sister 4

U1 Titanic Weekday Home Sister ?

 Use {Time, Location, Companion} → 0 records matched!
 Use {Time, Location} → 1 record matched!
 Use {Time} → 2 records matched!

Note: a balance is required between relaxation and accuracy.

Differential Context Modeling

• Context Weighting
User Movie Time Location Companion Rating

U1 Titanic Weekend Home Girlfriend 4

U2 Titanic Weekday Home Girlfriend 5

U3 Titanic Weekday Cinema Sister 4

U1 Titanic Weekday Home Sister ?

Similarity of contexts is measured by weighted Jaccard similarity:
 c and d are two contexts (the two red regions in the table above).
 σ is the weighting vector <w1, w2, w3> for the three dimensions.
 Assume equal weights: w1 = w2 = w3 = 1.

J(c, d, σ) = (# of matched dimensions) / (# of all dimensions) = 2/3
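A tiny Python sketch of this measure (function name assumed); with unequal weights, each matched dimension contributes its own weight:

```python
# Weighted Jaccard similarity between two context situations; with equal
# weights this reduces to (# matched dimensions) / (# dimensions).
def weighted_jaccard(c, d, sigma):
    matched = sum(w for x, y, w in zip(c, d, sigma) if x == y)
    return matched / sum(sigma)

c = ("Weekday", "Home", "Sister")      # context of the target profile
d = ("Weekday", "Home", "Girlfriend")  # context of a candidate profile
print(weighted_jaccard(c, d, (1, 1, 1)))  # 2/3
```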

Differential Context Modeling

• Notion of "differential"
1. Neighbor Selection
2. Neighbor Contribution
3. User Baseline
4. User Similarity

In short, we apply different context relaxation and context weighting to each component.

Differential Context Modeling
• Workflow
Step 1: We decompose an algorithm into different components;
Step 2: We try to find the optimal context relaxation/weighting:
 In context relaxation, we select the optimal context dimensions
 In context weighting, we find the optimal weights for each dimension
• Optimization Problem
Assume there are 4 components and 3 context dimensions, so the solution is a 12-position vector; positions 1-3 belong to the 1st component, 4-6 to the 2nd, 7-9 to the 3rd, and 10-12 to the 4th:

Position:  1    2    3    4    5    6    7    8    9    10   11   12
DCR:       1    0    0    0    1    1    1    1    0    1    1    1
DCW:       0.2  0.3  0    0.1  0.2  0.3  0.5  0.1  0.2  0.1  0.5  0.2

Differential Context Modeling

• Optimization Approach
 Particle Swarm Optimization (PSO)
 Genetic Algorithms
 Other non-linear approaches

(Examples of swarms in nature: fish, birds, bees)

Differential Context Modeling

• How does PSO work?

 Swarm = a group of birds
 Particle = each bird ≈ a search entity in the algorithm
 Vector = a bird's position in the space ≈ the vectors we need in DCR/DCW
 Goal = the distance to the location of the pizza ≈ prediction error

So, how do the birds find the goal by swarm intelligence?
1. Looking for the pizza: assume a machine can tell each bird its distance
2. Each iteration is an attempt or move
3. Cognitive learning from the particle itself: "Am I closer to the pizza compared with my best locations in previous history?"
4. Social learning from the swarm: "Hey, my distance is 1 mile. It is the closest! Follow me!" Then the other birds move towards that position.

 DCR: feature selection, modeled by binary vectors, so Binary PSO is used
 DCW: feature weighting, modeled by real-number vectors, so standard PSO is used

Differential Context Modeling

• Summary
Pros: alleviates the data sparsity problem in CARS
Cons: computational complexity in optimization
Cons: local optima from the non-linear optimizer

Our suggestion:
 We may simply run these optimizations offline to find the optimal context relaxation or context weighting solutions, and those optimal solutions can be re-obtained periodically.

Contextual Modeling

Contextual Modeling

Tensor Factorization, 2010


Context-aware Matrix Factorization, 2011
Factorization Machines, 2011
Deviation-Based Contextual Modeling, 2014
Similarity-Based Contextual Modeling, 2015

Independent Contextual Modeling
(Tensor Factorization)

Independent Contextual Modeling

• Tensor Factorization
Multi-dimensional space: Users × Items × Contexts → Ratings

 Each context variable is modeled as an individual and independent dimension, in addition to the user and item dimensions.
 Thus we can create a multidimensional space, where the rating is the value stored in the space.
Independent Contextual Modeling

• Tensor Factorization (Optimization)

Multi-dimensional space: Users × Items × Contexts → Ratings

1) By CANDECOMP/PARAFAC (CP) Decomposition
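As a sketch (notation assumed here): CP approximates the rating tensor by a sum of $k$ rank-one terms, with one latent factor matrix per dimension:

$$\hat{r}_{uic} = \sum_{f=1}^{k} U_{uf}\, T_{if}\, C_{cf}$$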

Independent Contextual Modeling

• Tensor Factorization (Optimization)

Multi-dimensional space: Users × Items × Contexts → Ratings

2) By Tucker Decomposition
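Again as an assumed sketch: Tucker additionally learns a core tensor $G$ that mixes the factors of the user, item and context dimensions:

$$\hat{r}_{uic} = \sum_{p}\sum_{q}\sum_{s} G_{pqs}\, U_{up}\, T_{iq}\, C_{cs}$$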

Independent Contextual Modeling

• Tensor Factorization

Pros: straightforward; easy to incorporate contexts into the model

Cons: 1) ignores the dependence between contexts and the user/item dimensions;
2) increased computational cost as more context dimensions are added

There is some research on improving the efficiency of TF, such as reusing GPU computations, and so forth…

Dependent Contextual Modeling
(Deviation-Based vs. Similarity-Based)

Dependent Contextual Modeling

• Dependence between every two contexts
 Deviation-Based: rating deviation between two contexts
 Similarity-Based: similarity of rating behaviors in two contexts

Deviation-Based Contextual Modeling

• Notion: Contextual Rating Deviation (CRD)

CRD: how do users' ratings deviate from context c1 to c2?

Context    D1: Time   D2: Location
c1         Weekend    Home
c2         Weekday    Cinema
CRD(Di)    0.5        -0.1

CRD(D1) = 0.5 → users' ratings on Weekdays are generally higher than users' ratings at Weekends by 0.5
CRD(D2) = -0.1 → users' ratings in the Cinema are generally lower than users' ratings at Home by 0.1

Deviation-Based Contextual Modeling

• Notion: Contextual Rating Deviation (CRD)

CRD: how do users' ratings deviate from context c1 to c2?

Context    D1: Time   D2: Location
c1         Weekend    Home
c2         Weekday    Cinema
CRD(Di)    0.5        -0.1

Assume Rating(U, T, c1) = 4. Then:
Predicted Rating(U, T, c2) = Rating(U, T, c1) + CRDs = 4 + 0.5 - 0.1 = 4.4
Deviation-Based Contextual Modeling

• Build a deviation-based contextual modeling approach

Assume Ø is a special situation: without considering context

Context    D1: Time   D2: Location
Ø          Unknown    Unknown
c2         Weekday    Cinema
CRD(Di)    0.5        -0.1

Assume Rating(U, T, Ø) = Rating(U, T) = 4
Predicted Rating(U, T, c2) = 4 + 0.5 - 0.1 = 4.4

In other words: $F(U, T, C) = P(U, T) + \sum_{i=0}^{N} CRD(i)$

Deviation-Based Contextual Modeling

• Build a deviation-based contextual modeling approach

Simplest model: $F(U, T, C) = P(U, T) + \sum_{i=0}^{N} CRD(i)$

User-personalized model: $F(U, T, C) = P(U, T) + \sum_{i=0}^{N} CRD(i, U)$

Item-personalized model: $F(U, T, C) = P(U, T) + \sum_{i=0}^{N} CRD(i, T)$

Note: P(U, T) could be a rating prediction by any traditional recommender system, such as matrix factorization.
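A minimal Python sketch of the simplest model, plugging in the slide's numbers (the crd lookup table and function name are assumptions):

```python
# Deviation-based contextual modeling: F(U,T,C) = P(U,T) + sum of CRDs.
# crd maps (dimension, condition) to its learned rating deviation from Ø.
crd = {("Time", "Weekday"): 0.5, ("Location", "Cinema"): -0.1}

def predict_in_context(p_ut, context):
    """p_ut: context-free prediction P(U,T); context: dimension -> condition."""
    return p_ut + sum(crd.get(pair, 0.0) for pair in context.items())

print(predict_in_context(4.0, {"Time": "Weekday", "Location": "Cinema"}))  # 4.4
```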

Similarity-Based Contextual Modeling

• Build a similarity-based contextual modeling approach

Assume Ø is a special situation: without considering context

Context    D1: Time   D2: Location
Ø          Unknown    Unknown
c2         Weekday    Cinema
Sim(Di)    0.5        0.1

Assume Rating(U, T, Ø) = Rating(U, T) = 4
Predicted Rating(U, T, c2) = 4 × Sim(Ø, c2)

In other words: $F(U, T, C) = P(U, T) \times Sim(Ø, C)$

Similarity-Based Contextual Modeling

• Challenge: how to model context similarity, Sim(c1,c2)

We propose three representations:
• Independent Context Similarity (ICS)
• Latent Context Similarity (LCS)
• Multidimensional Context Similarity (MCS)

Similarity-Based Contextual Modeling

• Sim(c1, c2): Independent Context Similarity (ICS)

Context    D1: Time   D2: Location
c1         Weekend    Home
c2         Weekday    Cinema
Sim(Di)    0.5        0.1

$$Sim(c_1, c_2) = \prod_{i=1}^{N} sim(D_i) = 0.5 \times 0.1 = 0.05$$

Generally, in ICS: $Sim(c_1, c_2) = \prod_{i=1}^{N} sim(D_i)$

           Weekend   Weekday   Home   Cinema
Weekend    1         b         —      —
Weekday    a         1         —      —
Home       —         —         1      c
Cinema     —         —         d      1
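A minimal Python sketch of ICS (names assumed), using the two pairwise similarities from the slide's example:

```python
import math

# ICS: similarity of two context situations is the product of the
# per-dimension similarities between their conditions.
def ics(c1, c2, sim_table):
    return math.prod(sim_table[(x, y)] for x, y in zip(c1, c2))

sim_table = {("Weekend", "Weekday"): 0.5, ("Home", "Cinema"): 0.1}
print(ics(("Weekend", "Home"), ("Weekday", "Cinema"), sim_table))  # 0.05
```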
Similarity-Based Contextual Modeling

• Sim(c1, c2): Latent Context Similarity (LCS)

 In training, we learned (home, cinema) and (work, cinema)
 In testing, we need (home, work)

Vector representation of each condition:

          f1     f2     …   fN
home      0.1   -0.01   …   0.5
work      0.01   0.2    …   0.01
cinema    0.3    0.25   …   0.05

Generally, in LCS: $Sim(c_1, c_2) = \prod_{i=1}^{N} sim(D_i)$, where $sim(D_i) = dotProduct(V_{i1}, V_{i2})$

Similarity-Based Contextual Modeling

• Sim(c1, c2): Multidimensional Context Similarity (MCS)

 Each context condition is an individual axis in the space.
 For each axis, there are only two values: 0 and 1.
 1 means this condition is selected; otherwise, it is not selected.
 When the value is 1, each condition is associated with a weight.

c1 = <Weekday, Cinema, with Kids>
c2 = <Weekend, Home, with Family>
They can be mapped to two points in the space.

In MCS: $DisSim(c_1, c_2)$ = the distance between the two points

Similarity-Based Contextual Modeling

• Build algorithms based on traditional recommenders

 Similarity-Based CAMF
 Similarity-Based CSLIM

In ICS: $Sim(c_1, c_2) = \prod_{i=1}^{N} sim(D_i)$
In LCS: $Sim(c_1, c_2) = \prod_{i=1}^{N} sim(D_i)$, where $sim(D_i)$ is a dot product
In MCS: dissimilarity is a distance, such as Euclidean distance

CARSKit: Recommendation Library

Recommendation Library

• Motivations to Build a Recommendation Library
1) Standard implementations of popular algorithms
2) A standard platform for benchmarks and evaluations
3) Helpful for both research purposes and industry practice
4) Helpful as a tool in teaching and learning

Recommendation Library
There are many recommendation libraries for traditional recommendation:
Users × Items → Ratings

CARSKit: A Java-based Open-source Context-aware Recommendation Library

CARSKit: https://github.com/irecsys/CARSKit
Users × Items × Contexts → Ratings

User Guide: http://arxiv.org/abs/1511.03780

CARSKit: A Short User Guide

1. Download the JAR library, i.e., CARSKit.jar
2. Prepare your data
3. Settings: setting.conf
4. Run: java -jar CARSKit.jar -c setting.conf

CARSKit: A Short User Guide

Sample of Outputs: Data Statistics

CARSKit: A Short User Guide

Sample of Outputs:
1). Results by Rating Prediction Task
Final Results by CAMF_C, MAE: 0.714544, RMSE: 0.960389, NMAE: 0.178636, rMAE: 0.683435, rRMSE: 1.002392, MPE: 0.000000, numFactors: 10, numIter: 100, lrate: 2.0E-4, maxlrate: -1.0, regB: 0.001, regU: 0.001, regI: 0.001, regC: 0.001, isBoldDriver: true, Time: '00:01','00:00'
2). Results by Top-N Recommendation Task
Final Results by CAMF_C, Pre5: 0.048756, Pre10: 0.050576, Rec5: 0.094997, Rec10: 0.190364, AUC: 0.653558, MAP: 0.054762, NDCG: 0.105859, MRR: 0.107495, numFactors: 10, numIter: 100, lrate: 2.0E-4, maxlrate: -1.0, regB: 0.001, regU: 0.001, regI: 0.001, regC: 0.001, isBoldDriver: true, Time: '00:01','00:00'

Example of Experimental Results

Example of Experimental Results

Extended Topics:
Trends, Challenges & Future
Challenges

• There could be many other challenges for context-awareness in IR and RecSys:
 Numeric vs. Categorical Context Information
 Explanations by Context
 New User Interfaces and Interactions
 User Intent Predictions or Inferences in IR and RecSys
 Cold-start and Data Sparsity Problems in CARS

Challenges: Numeric Context

• List of Categorical Contexts
 Time: morning, evening, weekend, weekday, etc.
 Location: home, cinema, work, party, etc.
 Companion: family, kid, partner, etc.
• What about numeric contexts?
 Time: 2016, 6:30 PM, 2 PM to 6 PM (time-aware recsys)
 Temperature: 12°C, 38°C
 Principal components by PCA: numeric values

Challenges: Explanation

• Recommendation using social networks (by Netflix)
 The improvement is not significant unless we explicitly explain it to the end users.
• IR and RecSys using context (open research)
 Similar things could happen in context-aware IR & RecSys:
 How to use contexts to explain information filtering?
 How to design new user interfaces to explain?
 How to introduce user-centric evaluations?

Challenges: User Interface

• Potential Research Problems in User Interfaces
 New UIs to collect context;
 New UIs to interact with users in a friendly and smooth way;
 New UIs to explain context-aware IR and RecSys;
 New UIs to avoid debates on user privacy;
 User privacy problems in context collection & usage

Challenges: Cold-Start and Data Sparsity

• Cold-start Problems
 Cold-start user: no rating history by this user
 Cold-start item: no rating history on this item
 Cold-start context: no rating history within this context
• Solution: a hybrid method by Matthias Braunhofer, et al.

Challenges: User Intent

• User intent could be the most influential context
 How to better predict it
 How to better design the UI to capture it
 How to balance user intent against limitations in resources

Trends and Future

• Context-awareness enables new applications: context suggestion, or context-driven UIs/applications

Context Suggestion

Context Suggestion

• Task: Suggest a list of contexts to users (on items)

(Figure: Context Rec, compared with Traditional Rec and Contextual Rec)

Context Suggestion: Motivations

• Motivation-1: Maximize user experience

User Experience (UX) refers to a person's emotions and attitudes about using a particular product, system or service.

Context Suggestion: Motivations

• Motivation-1: Maximize user experience

It is not enough to recommend items only.

Zoo Parks in San Diego, USA

• San Diego Zoo • San Diego Zoo Safari Park

Zoo Parks in San Diego, USA

Context Suggestion: Motivations

• Motivation-2: Contribute to Context Collection

Predefine contexts and suggest them to users.

Context Suggestion: Motivations

• Motivation-3: Connect with Context-aware RecSys

A user's action on a suggested context is a context query to the system.

References
 L Baltrunas, M Kaminskas, F Ricci, et al. "Best Usage Context Prediction for Music Tracks", CARS@ACM RecSys, 2010
 Y Zheng, B Mobasher, R Burke. "Context Recommendation Using Multi-label Classification", IEEE/WIC/ACM WI, 2014
 Y Zheng. "Context Suggestion: Solutions and Challenges", ICDM Workshop, 2015
 Y Zheng. "Context-Driven Mobile Apps Management and Recommendation", ACM SAC, 2016
 Y Zheng, B Mobasher, R Burke. "User-Oriented Context Suggestion", ACM UMAP, 2016

Tutorial: Context-Awareness In Information
Retrieval and Recommender Systems
Yong Zheng
School of Applied Technology
Illinois Institute of Technology, Chicago

The 16th IEEE/WIC/ACM Conference on Web Intelligence, Omaha, USA
