13 Recsys 2

The document outlines announcements regarding homework deadlines and class schedules for a course on Recommender Systems and Big Data Analytics. It covers key concepts such as UV decomposition, latent factor models, and the importance of regularization to prevent overfitting in model training. Additionally, it discusses optimization techniques like gradient descent and the use of biases in modeling user-item interactions.


Recommender Systems 2
EE412: Foundation of Big Data Analytics
Fall 2024
Jaemin Yoo

Announcements

• Homeworks:
  • HW2 (due date extended to 11/08)
    • Due to maintenance at Haedong Lounge (11/01 18:00 – 11/04 09:00).
    • Enjoy the Netflix challenge!
  • HW3 (will be posted on 11/06)
• Midterm:
  • Claim finished; thank you for your hard work!
• Classes:
  • No in-person class on 11/04; video will be uploaded.

Recap

• Recommender Systems
• Content-based Recommendation
• Collaborative Filtering
• The Netflix Challenge

[Figure: collaborative-filtering example with the items "Touching the Void" and "Into Thin Air"; users, items, and like/similar/recommend relations, contrasted with recommending by popularity.]

Outline

1. UV Decomposition
2. UV Decomposition: Computation
3. UV Decomposition: Variants


Goal of Recommender Systems

• Recommendation is to fill in the blanks in the utility matrix 𝑅.
• The core operation is how to get user and item representations.
  • The content-based approach creates user/item profiles.
  • Collaborative filtering uses the rows and columns of 𝑅.

[Utility matrix example (mostly blank): movies HP1, HP2, HP3, TW, SW1, SW2, SW3 as columns; user A has ratings 4, 5, 1; user B has 5, 5, 4; user C has 2, 4, 5; user D has 3, 3.]

Latent Factor Models

• We'll learn latent factor models, which assume:
  • There are latent factors that can represent users and items well.
  • Such latent factors can be extracted from the utility matrix.
• Many people consider latent factor models as a part of CF.
  • Since they share the same philosophy.
  • CF uses the rows and columns of 𝑅 without modification.
  • Latent factor models extract (better) latent factors from 𝑅.

Latent Factor Models

• Idea: Consider a utility matrix as the product of factor matrices.
• Latent factors are underlying concepts/topics; same as in SVD.
  • E.g., users react to certain genres, famous actors, or directors.
• UV decomposition decomposes a utility matrix into 𝑈 and 𝑉.
  • Each user and movie is summarized as a low-dimensional vector.

[Figure: the $m \times n$ matrix $R$ is approximated by the product of an $m \times k$ matrix $U$ and a $k \times n$ matrix $V^\top$.]

UV Decomposition

• Given an $m \times n$ utility matrix $R$ (i.e., $m$ users and $n$ items).
• Find an $m \times k$ matrix $U$ and an $n \times k$ matrix $V$ such that:
  • $UV^\top$ closely approximates $R$ for the non-blank entries.
  • The elements of $UV^\top$ estimate the blank entries in $R$.
• Compute $\hat{r}_{xi} = u_x^\top v_i$ to predict $r_{xi}$.
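As a minimal sketch of this prediction step (Python with NumPy; the sizes and random factor matrices below are purely illustrative, not from the slides):

```python
import numpy as np

m, n, k = 4, 7, 2                 # users, items, latent factors (illustrative)
rng = np.random.default_rng(0)
U = rng.normal(size=(m, k))       # one k-dimensional row u_x per user
V = rng.normal(size=(n, k))       # one k-dimensional row v_i per item

# Predicted rating of user x for item i: r_hat_xi = u_x^T v_i
x, i = 0, 3
r_hat = U[x] @ V[i]

# Equivalently, U @ V.T reconstructs the full m x n prediction matrix,
# including estimates for the blank entries of R.
R_hat = U @ V.T
assert np.isclose(r_hat, R_hat[x, i])
```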


Error Function

• Root-mean-square error (RMSE) measures the difference.
• Let $E$ be the set of non-blank entries.
• $\mathrm{RMSE}(R, \hat{R}) = \sqrt{\tfrac{1}{|E|} \sum_{(x,i) \in E} (\hat{r}_{xi} - r_{xi})^2}$.
• Minimizing RMSE is equivalent to minimizing the sum of squared errors (SSE).
  • $\mathrm{SSE}(R, \hat{R}) = |E| \cdot \mathrm{RMSE}^2(R, \hat{R})$.

[Figure: RMSE computed between $R$ and $UV^\top$.]
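A small sketch of computing these errors over the observed entries only (Python with NumPy; the rating dictionary and factor matrices are made-up examples):

```python
import numpy as np

# Observed entries E as {(user, item): rating}; values are illustrative.
E = {(0, 0): 4.0, (0, 3): 5.0, (1, 0): 5.0, (2, 4): 4.0}

def rmse(E, U, V):
    """Root-mean-square error over the observed (non-blank) entries only."""
    errors = [(U[x] @ V[i] - r) ** 2 for (x, i), r in E.items()]
    return np.sqrt(np.mean(errors))

def sse(E, U, V):
    """Sum of squared errors; equals |E| * RMSE^2."""
    return sum((U[x] @ V[i] - r) ** 2 for (x, i), r in E.items())

U = np.random.default_rng(0).normal(size=(3, 2))
V = np.random.default_rng(1).normal(size=(5, 2))
assert np.isclose(sse(E, U, V), len(E) * rmse(E, U, V) ** 2)
```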

Recap: Singular Value Decomposition

• Decomposition of any matrix into a product of three matrices.
• Choose any number $r$ of intermediate concepts (= latent factors).
  • In a way that minimizes the reconstruction error.
  • The error is zero when $r \geq \mathrm{rank}(M)$.

[Figure: $M \approx U \Sigma V^\top$, with $U$ of size $m \times r$, $\Sigma$ of size $r \times r$, and $V^\top$ of size $r \times n$.]

UV Decomposition vs. SVD

• Note: $U$ and $\Sigma V^\top$ from SVD are also factor matrices.
• Differences of UV decomposition from SVD:
  • UV ignores the missing entries, not treating them as zero.
  • UV minimizes RMSE on only the training portion.
    • Larger $k$ is not necessarily better.
    • Larger $k$ does not guarantee decreasing RMSE on the test data.
  • $U$ and $V$ need not be orthonormal matrices.

Outline

1. UV Decomposition
2. UV Decomposition: Computation
3. UV Decomposition: Variants


Objective Function

• Goal: Find matrices $U$ and $V$ such that:

  $U^*, V^* = \operatorname{argmin}_{U,V} J(R, U, V)$

• $J(R, U, V) = \sum_{(x,i) \in E} (r_{xi} - u_x^\top v_i)^2$ measures SSE on the training data $E$.
• We call $U$, $V$ parameters.
  • Other choices are hyperparameters.

[Figure: $R$ ($m \times n$) approximated by $U$ ($m \times k$) times $V^\top$ ($k \times n$).]

Mismatch in the Objective Function

• Our (true) goal is to minimize SSE for unseen test data.
  • However, our objective function only considers training data.
• Increasing $k$ always decreases our objective function.
  • Since the error decreases with more factors.
• However, SSE on test data can begin to rise with large $k$.

Overfitting

• This is a classical example of overfitting:
  • Model starts fitting noise with too many free parameters.
  • Model is not generalizing well to unseen test data.
• We should carefully control the model complexity.
  • E.g., the number of clusters in $k$-means.

[Figure: overfitting illustration. Source: Medium]

Regularization

• Regularization is a possible way to prevent overfitting:
  • Allows a rich model when there is sufficient data.
  • Shrinks the model aggressively where data is scarce.
• The new objective function with regularization is

  $J(\cdot) = \sum_{(x,i) \in E} (r_{xi} - u_x^\top v_i)^2 + \lambda_1 \sum_x \|u_x\|^2 + \lambda_2 \sum_i \|v_i\|^2$

• $\lambda_1$ and $\lambda_2$ are hyperparameters; they limit the lengths of $u_x$ and $v_i$.
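A minimal sketch of evaluating this regularized objective (Python with NumPy; the function name and arguments are my own, not from the slides):

```python
import numpy as np

def objective(E, U, V, lam1, lam2):
    """SSE over the observed entries plus L2 penalties on all factor vectors."""
    sse = sum((r - U[x] @ V[i]) ** 2 for (x, i), r in E.items())
    reg = lam1 * np.sum(U ** 2) + lam2 * np.sum(V ** 2)  # sum_x ||u_x||^2, sum_i ||v_i||^2
    return sse + reg
```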


[Figure: movies placed in a two-dimensional latent factor space. Factor 1 ranges from "geared towards females" to "geared towards males", Factor 2 from "serious" to "funny"; titles shown include The Color Purple, Amadeus, Braveheart, Sense and Sensibility, Lethal Weapon, Ocean's 11, The Lion King, Dumb and Dumber, and Independence Day.]

• Effect of regularization: a movie goes to the center unless the signal is really strong.

Validation Data

• Introducing validation data is also important.
  • Split the training data into (new) training and validation data.
  • Check the performance on validation data during training.
  • Use the validation performance as a proxy of the test performance.

[Figure: the data split into training, validation, and test datasets.]
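A sketch of such a split on the observed ratings (plain Python; the split fraction and function name are illustrative choices, not from the slides):

```python
import random

def split_ratings(E, val_frac=0.1, seed=0):
    """Hold out a fraction of the observed (user, item) ratings for validation."""
    entries = list(E.items())
    random.Random(seed).shuffle(entries)
    n_val = int(len(entries) * val_frac)
    return dict(entries[n_val:]), dict(entries[:n_val])  # (train, validation)

# During training, track RMSE on the validation dictionary as a proxy for
# test performance, e.g., to pick hyperparameters or to stop early.
```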

Gradient Descent

• Gradient descent (GD) aims to find an input $x$ that minimizes $f(x)$.
  • Compute the derivative $\nabla f$.
  • Start at some point $y$ and evaluate $\nabla f(y)$.
  • Make a step in the reverse direction of the gradient: $y \leftarrow y - \nabla f(y)$.
  • Repeat until $f$ is sufficiently small.

[Figure: one step from $y$ along $-\nabla f(y)$ on the curve of $f$.]

Gradient Descent for UV Decomposition

• How to use GD to find $U^*$ and $V^*$ for UV decomposition:
  • Step 1: Initialize $U$ and $V$ using SVD, treating missing entries as 0.
  • Step 2: Update $U$ and $V$ to minimize the objective function $J(\cdot)$:
    • $U \leftarrow U - \eta \cdot \nabla_U J(\cdot)$
    • $V \leftarrow V - \eta \cdot \nabla_V J(\cdot)$
• $\eta$ is a hyperparameter (called a step size or a learning rate).


Gradient Descent for UV Decomposition

• Perform the update step on every entry independently.
  • Since $U$ and $V$ are matrices.
• For each entry at row $x$, column $c$ of matrix $U$, we update:

  $u_{xc} \leftarrow u_{xc} - \eta \cdot \nabla_{u_{xc}} J(U, V)$,

  where $\nabla_{u_{xc}} J(U, V) = \sum_{i:(x,i) \in E} -2 v_{ic} (r_{xi} - u_x^\top v_i) + 2 \lambda_1 u_{xc}$.

Stochastic Gradient Descent

• Observation: The gradient can be decomposed over the ratings:

  $\nabla J(R, U, V) = \sum_{(x,i) \in E} \nabla J(r_{xi}, U, V)$

• Stochastic gradient descent (SGD):
  • Evaluate the gradient on each (not all) rating and make a step.
  • Needs more steps until convergence, but each step is much faster.
  • GD: $U \leftarrow U - \eta \cdot \sum_{(x,i) \in E} \nabla J(r_{xi})$.
  • SGD: $U \leftarrow U - \eta \cdot \nabla J(r_{xi})$.

Convergence of SGD vs. GD

• GD improves the value of the objective function at every step.
• SGD improves the value but in a "noisy" way.
• GD takes fewer steps to converge, but each step takes much longer.

[Figure: objective function vs. iteration/step for GD and SGD.]

SGD for UV Decomposition

• How to use SGD to find $U^*$ and $V^*$ for UV decomposition:
  • Step 1: Initialize $U$ and $V$ using SVD, treating missing ratings as 0.
  • Step 2: Iterate over the ratings and update the factors until convergence:

    for each $r_{xi} \in E$:
      $\epsilon_{xi} \leftarrow 2 (r_{xi} - u_x^\top v_i)$
      $u_x \leftarrow u_x + \mu_1 (\epsilon_{xi} v_i - \lambda_1 u_x)$
      $v_i \leftarrow v_i + \mu_2 (\epsilon_{xi} u_x - \lambda_2 v_i)$

  • $\mu_1$ and $\mu_2$: learning rates.
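A minimal NumPy sketch of this SGD loop, following the update rules above (it uses random initialization rather than the SVD-based initialization from Step 1, and all constants are illustrative):

```python
import numpy as np

def sgd_uv(E, m, n, k=10, mu1=0.01, mu2=0.01, lam1=0.1, lam2=0.1,
           epochs=20, seed=0):
    """SGD for UV decomposition; E maps (user, item) to an observed rating."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.normal(size=(m, k))
    V = 0.1 * rng.normal(size=(n, k))
    entries = list(E.items())
    for _ in range(epochs):
        for idx in rng.permutation(len(entries)):   # visit ratings in random order
            (x, i), r = entries[idx]
            eps = 2 * (r - U[x] @ V[i])             # eps_xi = 2 (r_xi - u_x^T v_i)
            u_old = U[x].copy()                     # use the old u_x when updating v_i
            U[x] += mu1 * (eps * V[i] - lam1 * U[x])
            V[i] += mu2 * (eps * u_old - lam2 * V[i])
    return U, V

# Example: four made-up ratings from 3 users on 5 items.
E = {(0, 0): 4.0, (0, 3): 5.0, (1, 0): 5.0, (2, 4): 4.0}
U, V = sgd_uv(E, m=3, n=5, k=2)
print(U[0] @ V[3])  # prediction for an observed pair
```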
SGD with Mini-batches

• In practice, people do not apply SGD to individual samples.
• Instead, they create (mini-)batches of several samples.
  • GD: 1 step using $N$ samples.
  • (True) SGD: $N$ steps, each using 1 sample.
  • Batch SGD: $N/B$ steps, each using a batch of $B$ samples.
• This makes a better balance between speed and stability in training.
• $B$ is a hyperparameter to choose; a sketch of one batch step follows below.
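A sketch of one mini-batch step (Python with NumPy; it averages the per-rating gradients within the batch, which is one common choice, and the constants are illustrative):

```python
import numpy as np

def minibatch_step(batch, U, V, eta=0.01, lam1=0.1, lam2=0.1):
    """One step of batch SGD: accumulate gradients over B ratings, then update."""
    dU, dV = np.zeros_like(U), np.zeros_like(V)
    for (x, i), r in batch:                      # batch: list of ((user, item), rating)
        err = r - U[x] @ V[i]
        dU[x] += -2 * err * V[i] + 2 * lam1 * U[x]
        dV[i] += -2 * err * U[x] + 2 * lam2 * V[i]
    U -= eta * dU / len(batch)
    V -= eta * dV / len(batch)
```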

Outline

1. UV Decomposition
2. UV Decomposition: Computation
3. UV Decomposition: Variants

Modeling Biases and Interactions

• There are global effects (biases) of users and movies.
• Rating $r_{xi}$ is not only about the interaction between $x$ and $i$.
  • E.g., a critical reviewer $x_1$ vs. a generous person $x_2$.
  • E.g., The Godfather $i_1$ vs. some bad movie $i_2$ on Netflix.

Model with Biases

• Let's predict a rating as $\hat{r}_{xi} = \mu + b_x + b_i + u_x^\top v_i$.
  • $\mu$ is the overall mean rating.
  • $b_x$ and $b_i$ are biases for user $x$ and movie $i$, respectively (user bias and movie bias).
  • $u_x^\top v_i$ is their interaction modeled by factor matrices (user-movie interaction).
• Example:
  • Mean rating of training data is $\mu = 3.7$.
  • You are a critical reviewer: $b_x = -1$.
  • Star Wars is favored by many people: $b_i = +0.5$.
  • Final score is $3.7 - 1 + 0.5 + u_x^\top v_i$.
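A small sketch of this prediction with biases (Python with NumPy; the sizes, indices, and variable names are my own, with the bias values taken from the Star Wars example above):

```python
import numpy as np

m, n, k = 100, 50, 10                      # illustrative sizes
rng = np.random.default_rng(0)
U = 0.1 * rng.normal(size=(m, k))
V = 0.1 * rng.normal(size=(n, k))
mu = 3.7                                   # global mean rating of the training data
b_user = np.zeros(m)                       # per-user bias b_x
b_item = np.zeros(n)                       # per-item bias b_i

def predict(x, i):
    """r_hat_xi = mu + b_x + b_i + u_x^T v_i."""
    return mu + b_user[x] + b_item[i] + U[x] @ V[i]

# A critical reviewer (b_x = -1) rating a well-liked movie (b_i = +0.5):
# the baseline part of the prediction is 3.7 - 1 + 0.5 = 3.2.
b_user[0], b_item[7] = -1.0, 0.5
print(predict(0, 7))
```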


Fitting the New Model

• Update the parameters $\{U, V, B_{\mathrm{user}}, B_{\mathrm{item}}\}$ minimizing $J'$:

  $J'(\cdot) = \sum_{(x,i) \in E} \big( r_{xi} - (\mu + b_x + b_i + u_x^\top v_i) \big)^2 + \text{regularizer}$

• There are 4 regularization hyperparameters: $\lambda_1, \lambda_2, \lambda_3, \lambda_4$.
• $\mu$ is the simple average of ratings; it need not be learned.

Further Improvements

• Any idea for modifying the three components is okay to try:
  • Set $\theta$ of learnable parameters.
  • Objective function $J$ to minimize.
  • (Optional) regularizer on $\theta$ with $\lambda$.
• SGD (or GD) will take care of finding the optimal parameters.
  • We believe in the power of gradient-based optimization!

Hyperparameter Search

• How can we search for the optimal hyperparameters $\lambda_1, \lambda_2, \lambda_3, \lambda_4$?
  • We pick the combination with the best validation performance.
• Grid search: Create a set of values for each and try every combination.
  • E.g., $\lambda_1 \in \{0.001, 0.01, 0.1, 1, 10\}$.
• Random search: Try randomly selected combinations.
  • E.g., $\lambda_1$ drawn from $[0.001, 10]$ (but on a log scale).
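A sketch of grid search over such values, picking the combination with the best validation error (plain Python; the fit and evaluate callables stand in for the SGD routine and validation RMSE and are assumptions, not slide material):

```python
import itertools

# Candidate values for two regularization weights (illustrative grid).
grid = {
    "lam1": [0.001, 0.01, 0.1, 1, 10],
    "lam2": [0.001, 0.01, 0.1, 1, 10],
}

def grid_search(train, val, fit, evaluate):
    """Try every combination; return the one with the lowest validation error.

    fit(train, **params) trains a model and evaluate(val, model) returns its
    validation error; both are supplied by the caller."""
    best_params, best_score = None, float("inf")
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = evaluate(val, fit(train, **params))
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Random search would instead sample each value, e.g., lam1 = 10 ** uniform(-3, 1).
```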

Dealing with Implicit Feedback

• What if the utility matrix $R$ contains only implicit feedback?
  • Each entry is either 0 (not watched) or 1 (watched).
• Using the same model with the RMSE objective $J$ leads to some limitations.
  • Limitation 1: Our prediction can be any number, e.g., $u_x^\top v_i > 1$.
  • Limitation 2: We may have a better gradient curve of $J$.
  • Limitation 3: Our loss $J$ assumes 0 means a dislike, not "not watched."


Idea 1: Sigmoid Function

• We want to limit the predictions of our model to $(0, 1)$.
• A simple solution is to use $\sigma(u_x^\top v_i)$ instead of $u_x^\top v_i$ as the output.
• The sigmoid function $\sigma$ is defined as follows:

  $\sigma(x) = \dfrac{1}{1 + e^{-x}}$

• Maps $(-\infty, \infty)$ to $(0, 1)$ with $\sigma(0) = 0.5$.
• Monotonically increasing over the whole range of $x$.

Implication of the Sigmoid

• Without sigmoid: Signals are mixed regardless of $r_{xi}$.
  • If $r_{xi} = 1$ and $u_x^\top v_i > 1$, the model is updated to decrease $u_x^\top v_i$.
  • If $r_{xi} = 1$ and $u_x^\top v_i < 1$, the model is updated to increase $u_x^\top v_i$.
  • If $r_{xi} = 0$ and $u_x^\top v_i > 0$, the model is updated to decrease $u_x^\top v_i$.
  • If $r_{xi} = 0$ and $u_x^\top v_i < 0$, the model is updated to increase $u_x^\top v_i$.
• With sigmoid: Signals are consistent with $r_{xi}$.
  • If $r_{xi} = 1$, the model is always updated to increase $u_x^\top v_i$.
  • If $r_{xi} = 0$, the model is always updated to decrease $u_x^\top v_i$.

Idea 2: Binary Cross Entropy

• RMSE is mainly designed for continuous, unbounded targets.
• We may use binary cross entropy (BCE) instead of RMSE:

  $J_{\mathrm{BCE}}(u_x, v_i, r_{xi}) = -\big[ r_{xi} \log \sigma(u_x^\top v_i) + (1 - r_{xi}) \log(1 - \sigma(u_x^\top v_i)) \big]$

• If $r_{xi} = 1$, we minimize $-\log \sigma(u_x^\top v_i)$ by pushing $\sigma(u_x^\top v_i)$ toward 1.
• If $r_{xi} = 0$, we minimize $-\log(1 - \sigma(u_x^\top v_i))$ by pushing $\sigma(u_x^\top v_i)$ toward 0.

Gradient Curves

• BCE has a gradient curve different from that of RMSE.
• Suppose that $r_{xi} = 0$ and $a = \sigma(u_x^\top v_i)$ for simplicity.
  • (RMSE) $\nabla_a\, a^2 = 2a$: the gradient increases linearly with $a$.
  • (BCE) $\nabla_a\, (-\log(1 - a)) = 1/(1 - a)$: the gradient increases rapidly.
• BCE makes the model focus more on wrong samples.
  • Not on already accurate ones.
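A sketch of the BCE loss for one implicit-feedback entry, with the gradient comparison written as comments (Python with NumPy; written directly from the formulas above, with a small eps added for numerical safety):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(u_x, v_i, r_xi, eps=1e-12):
    """-[ r log(sigma(u^T v)) + (1 - r) log(1 - sigma(u^T v)) ]."""
    a = sigmoid(u_x @ v_i)
    return -(r_xi * np.log(a + eps) + (1 - r_xi) * np.log(1 - a + eps))

# Gradients w.r.t. a = sigma(u_x^T v_i) when r_xi = 0:
#   RMSE-style: d/da (a^2)         = 2a          (grows linearly with a)
#   BCE:        d/da (-log(1 - a)) = 1 / (1 - a) (grows rapidly as a -> 1)
```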


Idea 3: Ranking Losses

• Both RMSE and BCE assume 0 entries are "dislike," not "unknown."
  • Since they are computed for individual elements $r_{xi}$.
• Idea: Let's consider the task as ranking, not elementwise prediction.
  • Given a user $x$, suppose that $r_{xi} = 1$ while $r_{xj} = 0$.
  • Then, we can safely assume that user $x$ likes $i$ more than $j$.
    • If $x$ really liked $j$, they would have watched it before $i$.
  • Let's update the model by comparing items, so that $u_x^\top v_i > u_x^\top v_j$.

Bayesian Personalized Ranking

• We may use the Bayesian personalized ranking (BPR) loss:

  $J_{\mathrm{BPR}}(U, V) = \sum_{(x,i,j)} -\log \sigma(u_x^\top v_i - u_x^\top v_j)$

• Item $j$ is randomly selected from the negative samples $\{\, j \mid r_{xj} = 0 \,\}$.
• In BPR, we don't have to compare $\sigma(u_x^\top v_i)$ and $\sigma(u_x^\top v_j)$.
  • Since $\sigma$ is a monotonically increasing function.
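A minimal sketch of one SGD step on the BPR loss for a sampled triple (Python with NumPy; the negative item j would be sampled from the items user x has not watched, and all names and constants are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bpr_step(U, V, x, i, j, eta=0.01):
    """One SGD step on -log sigma(u_x^T v_i - u_x^T v_j), ignoring regularization."""
    s = U[x] @ (V[i] - V[j])           # score gap between positive i and negative j
    g = 1.0 - sigmoid(s)               # gradient signal; small when i already ranks above j
    u_old = U[x].copy()
    U[x] += eta * g * (V[i] - V[j])    # move u_x toward v_i and away from v_j
    V[i] += eta * g * u_old            # raise the score of the watched item i
    V[j] -= eta * g * u_old            # lower the score of the unwatched item j
```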

Summary
1. UV Decomposition
• Latent factor models
2. UV Decomposition: Computation
• Overfitting and regularization
• Stochastic gradient descent
3. UV Decomposition: Variants
• Modeling biases
• Dealing with implicit feedback
• BCE and BPR losses

