Recommendation Chapter2
Content-based recommendations
BUILDING RECOMMENDATION ENGINES IN PYTHON
Rob O'Callaghan
Director of Data
What are content-based recommendations?
Introducing the Jaccard similarity
Jaccard similarity:
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
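As a worked instance of the formula, the sketch below computes the Jaccard similarity of two sets of genre tags with plain Python sets. The tag sets and the `jaccard_similarity` helper are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def jaccard_similarity(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    union = a | b
    if not union:
        return 0.0  # convention assumed here for two empty sets
    return len(a & b) / len(union)


# Hypothetical genre tags, not taken from the course data
book_a = {"Adventure", "Fantasy", "Classic"}
book_b = {"Fantasy", "Classic", "Drama"}

# 2 shared tags out of 4 distinct tags
print(jaccard_similarity(book_a, book_b))
```

Identical sets score 1 and disjoint sets score 0, so the value can be read directly as "fraction of attributes shared".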
from sklearn.metrics import jaccard_score
print(jaccard_score(hobbit_row, GOT_row))
0.5
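For a self-contained version of this call, the sketch below builds two binary attribute rows by hand and passes them to scikit-learn's `jaccard_score`. The row values and the genre columns they stand for are assumptions made for illustration; in the course the rows would come from a one-hot encoded genre table.

```python
import numpy as np
from sklearn.metrics import jaccard_score

# Hypothetical one-hot genre rows; the columns might stand for
# [Adventure, Fantasy, Classic, Drama]
hobbit_row = np.array([1, 1, 1, 0])
GOT_row = np.array([0, 1, 1, 1])

# Two genres are shared and four genres appear in at least one row
score = jaccard_score(hobbit_row, GOT_row)
print(score)
```

With binary inputs the score counts only positions where at least one row holds a 1, so attributes that both books lack do not inflate the similarity.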
from scipy.spatial.distance import squareform
square_jaccard_distances = squareform(jaccard_distances)
print(square_jaccard_distances)
[[0. 1. 0.5 1. ]
[1. 0. 1. 0.5]
[0.5 1. 0. 1. ]
[1. 0.5 1. 0. ]]
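The `jaccard_distances` input is not shown here. A minimal sketch of one way it could be produced, assuming SciPy's `pdist` with `metric="jaccard"` on a boolean genre matrix, follows. The `genres` array is hypothetical and was chosen so that the result reproduces the matrix printed above.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Hypothetical boolean genre matrix: one row per book, one column per genre
genres = np.array(
    [
        [1, 0, 0, 0],
        [0, 0, 1, 0],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
    ],
    dtype=bool,
)

# Condensed 1-D vector holding one distance per pair of rows
jaccard_distances = pdist(genres, metric="jaccard")

# Expand the condensed vector into a symmetric square matrix
square_jaccard_distances = squareform(jaccard_distances)
print(square_jaccard_distances)
```

`pdist` returns only the upper triangle as a flat vector, which is why `squareform` is needed before rows and columns can be matched to books.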
jaccard_similarity_array = 1 - square_jaccard_distances
print(jaccard_similarity_array)
[[1. 0. 0.5 0. ]
[0. 1. 0. 0.5]
[0.5 0. 1. 0. ]
[0. 0.5 0. 1. ]]
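A bare array is awkward to query by title. The sketch below wraps the similarity array in a pandas DataFrame so that a pair of books can be looked up by name. The labels `Book A` to `Book D` are placeholders, since the rows are not named here, and `similarity_df` is a name introduced for this example.

```python
import numpy as np
import pandas as pd

# Similarity values copied from the printed array above
jaccard_similarity_array = np.array(
    [
        [1.0, 0.0, 0.5, 0.0],
        [0.0, 1.0, 0.0, 0.5],
        [0.5, 0.0, 1.0, 0.0],
        [0.0, 0.5, 0.0, 1.0],
    ]
)

# Placeholder labels; the real book titles are an assumption left open
titles = ["Book A", "Book B", "Book C", "Book D"]
similarity_df = pd.DataFrame(jaccard_similarity_array, index=titles, columns=titles)

# Look up the similarity between two books by label
print(similarity_df.loc["Book A", "Book C"])
```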
0.75
0.15
title
The Hobbit 1.00
The Two Towers 0.91
A Game of Thrones 0.50
...
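A ranking like the one above can be obtained by sorting one book's similarity scores in descending order. The sketch below assumes the scores are held in a pandas Series. It reuses the three values from the listing, and the variable names are hypothetical.

```python
import pandas as pd

# Similarity of each book to "The Hobbit", echoing the listing above
scores = pd.Series(
    {"A Game of Thrones": 0.50, "The Hobbit": 1.00, "The Two Towers": 0.91},
    name="The Hobbit",
)
scores.index.name = "title"

# Most similar books first
ranked = scores.sort_values(ascending=False)
print(ranked)

# A book is always most similar to itself, so drop it before recommending
recommendations = ranked.drop("The Hobbit")
print(recommendations)
```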
Working without clear attributes
Book                Description
The Hobbit          "Bilbo Baggins lives a simple life with his fellow hobbits in the shire..."
The Great Gatsby    "Set in Jazz Age New York, the novel tells the tragic story of Jay ..."
A Game of Thrones   "15 years have passed since Robert's rebellion, with a nine-year-long ..."
Macbeth             "A brave Scottish general receives a prophecy from a trio of witches ..."
...                 ...
from sklearn.feature_extraction.text import TfidfVectorizer
tfidfvec = TfidfVectorizer(min_df=2, )
print(vectorized_data.toarray())
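The step that produces `vectorized_data` is not shown. A minimal end-to-end sketch, assuming it comes from `fit_transform` on a column of descriptions, is given below. The three descriptions, the `Book A` to `Book C` labels, and the `tfidf_df` name are all hypothetical, and only `min_df=2` is set because the remaining arguments are not given here.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical short descriptions standing in for the real book blurbs
descriptions = pd.Series(
    {
        "Book A": "a hobbit goes on an adventure with a wizard",
        "Book B": "a wizard teaches a young boy magic",
        "Book C": "a young detective solves a crime",
    }
)

# min_df=2 keeps only terms that occur in at least two descriptions
tfidfvec = TfidfVectorizer(min_df=2)
vectorized_data = tfidfvec.fit_transform(descriptions)

# Recover the column order from the fitted vocabulary (term -> column index)
terms = sorted(tfidfvec.vocabulary_, key=tfidfvec.vocabulary_.get)

# toarray() converts the sparse result into a dense array for display
tfidf_df = pd.DataFrame(vectorized_data.toarray(), index=descriptions.index, columns=terms)
print(tfidf_df)
```

On this toy corpus only `wizard` and `young` survive the `min_df` filter, which shows how the threshold removes words that cannot help compare books because they appear in a single description.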
Item to item recommendations
age 0.376667
ancient 0.480000
angry 0.426667
brave 0.256667
...
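The listing above reads like one averaged weight per term. A sketch of how such a profile might be built, assuming it is the column-wise mean of the TF-IDF rows for the books a user has read, follows. The TF-IDF values, the term columns, and the names `tfidf_df`, `list_of_books_read`, and `user_books` are hypothetical.

```python
import pandas as pd

# Hypothetical TF-IDF weights: one row per book, one column per term
tfidf_df = pd.DataFrame(
    {
        "ancient": [0.8, 0.6, 0.7],
        "brave": [0.6, 0.8, 0.7],
        "jazz": [0.0, 0.0, 0.0],
    },
    index=["The Hobbit", "A Game of Thrones", "The Two Towers"],
)

# Books the user is assumed to have read
list_of_books_read = ["The Hobbit", "A Game of Thrones"]
user_books = tfidf_df.loc[list_of_books_read]

# Average each term's weight across the books read
user_prof = user_books.mean()
print(user_prof)
```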
print(user_prof.values.reshape(1,-1))
similarity_score
Title
The Two Towers 0.422488
Dune 0.363540
The Magicians Nephew 0.316075
... ...
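A ranked table of this shape could be produced by comparing the user profile with every book the user has not yet read. The sketch below assumes cosine similarity as the measure, which is not named here, and uses hypothetical TF-IDF values. The names `non_user_books` and `ranked` are introduced for illustration.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical TF-IDF weights: one row per book, one column per term
tfidf_df = pd.DataFrame(
    [
        [0.8, 0.6, 0.0],
        [0.6, 0.8, 0.0],
        [0.7, 0.7, 0.0],
        [0.7, 0.0, 0.7],
        [0.0, 0.0, 1.0],
    ],
    index=["The Hobbit", "A Game of Thrones", "The Two Towers", "Dune", "The Great Gatsby"],
    columns=["ancient", "brave", "jazz"],
)
tfidf_df.index.name = "Title"

# Profile: mean TF-IDF vector of the books assumed to have been read
list_of_books_read = ["The Hobbit", "A Game of Thrones"]
user_prof = tfidf_df.loc[list_of_books_read].mean()

# Only books the user has not read are candidates
non_user_books = tfidf_df.drop(list_of_books_read, axis=0)

# cosine_similarity expects 2-D inputs, hence the reshape to a single row
scores = cosine_similarity(user_prof.values.reshape(1, -1), non_user_books.values)

ranked = pd.DataFrame(
    scores.T, index=non_user_books.index, columns=["similarity_score"]
).sort_values("similarity_score", ascending=False)
print(ranked)
```

Dropping the books already read before scoring keeps the profile from simply recommending its own inputs.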