Joel Project
PROJECT REPORT
ON
“Book Recommendation system”
Prepared By
Joel Louis Heplette
Mohammad Hasnain Wakeel Sheikh
system submitted by Joel Louis Heplette & Mohammad Hasnain Wakeel Sheikh
software project carried out under the supervision and guidance of Prof.
project work.
Place: Nagpur
Date:
We, Joel Louis Heplette & Mohammad Hasnain Wakeel Sheikh, hereby declare that the
project work titled “Book Recommendation system” has been carried out by us as a
We solemnly declare that this project report has not been submitted by us for any other
examination of this university or any other university for any purpose.
Your Name
Place: Nagpur [ B.Com(CA) Sem-VI]
Date:
ACKNOWLEDGEMENT
We feel immense pleasure in expressing our deepest sense of gratitude to our guide, Prof.
Pramod Karambhe, for his encouragement & enlightened comments throughout this project
work.
His appreciative suggestions always motivated us to put our most willing efforts into our
study during the project report.
Our special thanks to our Director, Dr. Lalit Khullar, for providing co-operation.
Last but not least, we express our thanks to all staff members of the Computer Department
for their enthusiasm & constant reminders to complete the work as soon as possible. We are also
thankful to everyone in this college who directly or indirectly helped us in completing our
project.
However, we accept sole responsibility for any possible errors of omission and
commission.
Place: Nagpur
Joel Louis Heplette
&
Mohammad Hasnain Wakeel Sheikh
B.Com(CA) Sem-VI
SR. NO.  PARTICULAR
1. INTRODUCTION
4. DATASET
5. DETAIL SYSTEM ANALYSIS
6. SYSTEM ANALYSIS
7. CONCLUSION
8. BIBLIOGRAPHY
INTRODUCTION
In today’s world, we all use many platforms for entertainment, such as YouTube.
In the initial stages, we find a video or a book that interests us by
searching for it on the search engine. This is where the recommendation
system comes in. The system analyses the videos we have watched or the books
we have read. Based on this analysis, the recommendation system suggests
what we might want to watch or read next.
AIM AND OBJECTIVE
Have you ever wondered how YouTube recommends content, or how
Facebook recommends new friends to you? Perhaps you’ve noticed similar
recommendations with LinkedIn connections, or how Amazon will
recommend similar products while you’re browsing. All of these
recommendations are made possible by the implementation of
recommender systems.
factorization.
In this article, I’ll look at why we need recommender systems and the
different types of users online. Then, I’ll show you how to build your own
PROJECT CONTEXT
Why Do We Need Recommender Systems?
We now live in what some call the “era of abundance”. For any given product,
there are sometimes thousands of options to choose from. Think of the examples
above: streaming videos, social networking, online shopping; the list goes on.
Recommender systems help to personalize a platform and help the user find
something they like.
The easiest and simplest way to do this is to recommend the most popular items.
However, to really enhance the user experience through personalized
recommendations, we need dedicated recommender systems.
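The popularity baseline mentioned above can be sketched in a few lines. This is only an illustration: the `interactions` log and the `most_popular` helper are hypothetical names, not part of the project code.

```python
from collections import Counter

def most_popular(interactions, top_n=3):
    """Return the top_n item names ordered by how often they appear."""
    counts = Counter(item for _user, item in interactions)
    return [item for item, _count in counts.most_common(top_n)]

# Hypothetical (user, book) interaction log; the values are made up.
interactions = [
    ("u1", "Book A"), ("u2", "Book A"), ("u3", "Book A"),
    ("u1", "Book B"), ("u2", "Book B"),
    ("u3", "Book C"),
]

top_books = most_popular(interactions, top_n=2)
print(top_books)
```

Every user receives the same list here, which is why this baseline is not personalized.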
From a business standpoint, the more relevant products a user finds on the
platform, the higher their engagement. This often results in increased revenue
for the platform itself. Various sources say that as much as 35–40% of tech
giants’ revenue comes from recommendations alone.
What are the different filtration strategies?
Content-based Filtering
This filtration strategy is based on the data provided about the items. The
algorithm recommends products that are similar to the ones that a user has liked
in the past. This similarity (generally cosine similarity) is computed from the
data we have about the items as well as the user’s past preferences.
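As a minimal sketch of content-based filtering, the example below ranks items by cosine similarity to a book the user liked. The feature vectors (genre flags) and the helper names are assumptions made for illustration; the report does not specify which item features are used.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical item features: [fantasy, mystery, romance]
item_features = {
    "Book A": [1, 0, 0],
    "Book B": [1, 1, 0],
    "Book C": [0, 0, 1],
}

def similar_items(liked, features):
    """Rank every other item by similarity to the liked item."""
    scores = {
        title: cosine_similarity(features[liked], vector)
        for title, vector in features.items()
        if title != liked
    }
    return sorted(scores, key=scores.get, reverse=True)

ranking = similar_items("Book A", item_features)
print(ranking)
```

Only the item data and the single user's liked book are needed; no other user's behaviour is involved.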
Disadvantages
Collaborative Filtering
This filtration strategy is based on the combination of the user’s behaviour and
comparing and contrasting that with other users’ behaviour in the database. The
history of all users plays an important role in this algorithm. The main
difference between content-based filtering and collaborative filtering is that in the
latter, the interaction of all users with the items influences the
recommendation algorithm, while for content-based filtering only the concerned
user’s data is taken into account.
There are multiple ways to implement collaborative filtering, but the main
concept to grasp is that in collaborative filtering multiple users’ data
influences the outcome of the recommendation; it doesn’t depend on only one
user’s data for modelling.
The basic idea here is to find users whose past preference patterns are similar
to those of user ‘A’, and then to recommend to ‘A’ the items liked by those
similar users which ‘A’ has not encountered yet. This is achieved by building a
matrix of the items each user has rated/viewed/liked/clicked, depending upon the
task at hand, then computing the similarity score between the users, and
finally recommending items that the concerned user isn’t aware of but that
similar users know and liked.
For example, users who like the same set of items are considered to have similar
tastes. This method is not based on the contents of the items, but
rather on the behaviour of the users. Similarity scores are
calculated using measures like Euclidean distance, Pearson correlation, or
cosine similarity.
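The user-based idea described above can be sketched as follows. The ratings dictionary, the `min_rating` threshold, and the helper names are hypothetical; this is a simplified illustration that uses a single nearest user, not the method used in the project code.

```python
import math

# Hypothetical ratings: user -> {book: rating}
ratings = {
    "A": {"Book 1": 5, "Book 2": 4},
    "B": {"Book 1": 5, "Book 2": 4, "Book 3": 5},
    "C": {"Book 1": 1, "Book 2": 5, "Book 4": 4},
}

def user_similarity(u, v):
    """Cosine similarity between two users; unrated books count as 0."""
    books = set(u) | set(v)
    dot = sum(u.get(b, 0) * v.get(b, 0) for b in books)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend_for(user, ratings, min_rating=4):
    """Suggest books the most similar user liked and `user` has not seen."""
    others = [name for name in ratings if name != user]
    nearest = max(
        others,
        key=lambda name: user_similarity(ratings[user], ratings[name]),
    )
    return sorted(
        book
        for book, score in ratings[nearest].items()
        if book not in ratings[user] and score >= min_rating
    )

recommendations = recommend_for("A", ratings)
print(recommendations)
```

User ‘B’ rates the shared books the same way as ‘A’, so the book that only ‘B’ has rated is suggested to ‘A’.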
Disadvantages
1. People are fickle-minded, i.e. their tastes change from time to time. As
this algorithm is based on user similarity, it may pick up initial similarity
patterns between 2 users who after a while may have completely different
preferences.
2. There are many more users than items, so the matrices become very large and
difficult to maintain, and they need to be
recomputed very regularly.
3. This algorithm is very susceptible to shilling attacks, where fake user
profiles consisting of biased preference patterns are used to manipulate
key decisions.
In item-based collaborative filtering, the concept is to find similar books instead of
similar users and then recommend books similar to those that ‘A’ has had in his/her
past preferences. This
is executed by finding every pair of items that were rated/viewed/liked/clicked
by the same user, then measuring the similarity of those
ratings/views/likes/clicks across all users who rated/viewed/liked/clicked both,
and finally recommending them based on similarity scores.
Here, for example, we take 2 books ‘A’ and ‘B’ and check their ratings by all
users who have rated both the books. Based on the similarity of these ratings,
we decide how similar the books are. If most common users have rated ‘A’ and ‘B’
similarly, it is
highly probable that ‘A’ and ‘B’ are similar; therefore, if someone has read and
liked ‘A’ they should be recommended ‘B’ and vice versa.
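The comparison of two books over their common raters can be sketched as below. The ratings are made up, and cosine similarity is chosen here only as one of the possible measures; the helper name `item_similarity` is hypothetical.

```python
import math

# Hypothetical ratings: user -> {book: rating}
ratings = {
    "u1": {"A": 5, "B": 5, "C": 1},
    "u2": {"A": 4, "B": 4, "C": 2},
    "u3": {"A": 1, "B": 2, "C": 5},
    "u4": {"A": 5},  # rated only one book, so never a common rater
}

def item_similarity(book_x, book_y, ratings):
    """Cosine similarity using only users who rated both books."""
    common = [r for r in ratings.values() if book_x in r and book_y in r]
    if not common:
        return 0.0
    dot = sum(r[book_x] * r[book_y] for r in common)
    norm_x = math.sqrt(sum(r[book_x] ** 2 for r in common))
    norm_y = math.sqrt(sum(r[book_y] ** 2 for r in common))
    return dot / (norm_x * norm_y)

sim_ab = item_similarity("A", "B", ratings)
sim_ac = item_similarity("A", "C", ratings)
print(round(sim_ab, 3), round(sim_ac, 3))
```

In this toy data ‘A’ and ‘B’ receive nearly identical ratings from their common raters, so ‘B’ scores higher than ‘C’ as a suggestion for a reader who liked ‘A’.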
Advantages over User-based Collaborative Filtering
2. There are usually far fewer items than people, so the matrices are easier to maintain
and compute.
DATASET
For our own system, we’ll use the open-source Books 5000 dataset from Kaggle.
This dataset contains 10K data points of various books and users.
The source code in this report loads three CSV files: books, users, and ratings.
We will explore these files later. The books file contains columns like
ISBN, Book-Title, Book-Author, Year-Of-Publication, Publisher, Image-URL-S,
Image-URL-M, Image-URL-L.
The ratings file contains columns like User-ID, ISBN, and Book-Rating.
• Python – 3.x
• Pandas – 1.2.4
• Scikit-learn – 0.24.1
Let’s view the books data frame with books.head(5).
Now merge the ratings and books data sets into one data frame and view it.
num_rating.rename(columns={"rating":"num_of_rating"},inplace=True)
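The merge and the rating count can be tried on a small example first. The two data frames below are made-up stand-ins for the ratings and books files, using the renamed column names from the source code in this report.

```python
import pandas as pd

# Toy stand-ins for the ratings and books files (values are made up).
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "ISBN": ["A", "B", "A", "A"],
    "rating": [8, 5, 9, 7],
})
books = pd.DataFrame({
    "ISBN": ["A", "B"],
    "title": ["Book A", "Book B"],
})

# Inner join on ISBN: each rating row gains the title of the rated book.
ratings_with_books = ratings.merge(books, on="ISBN")

# Count how many ratings each title received and name the column clearly.
num_rating = (
    ratings_with_books.groupby("title")["rating"]
    .count()
    .reset_index()
    .rename(columns={"rating": "num_of_rating"})
)
print(num_rating)
```

The resulting `num_of_rating` column is what the backend code later uses to keep only books with enough ratings.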
With the help of the books data set and the ratings data set, the
recommendations can be more personalized. For example, if a book is
searched, the recommendation system should suggest some other books by the
same writer. It should also show some books of the same genre.
Implementation
We need a function to implement the recommendation system that takes the
book we are searching for as input and returns similar book names as output.
The function looks up the index of the searched book and passes it to the
similarity function, which finds similar books.
The books returned by the similarity function are arranged in
descending order of similarity.
From that order, the first four to five are taken and recommended to the user.
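The steps above can be sketched without any library as a brute-force nearest-neighbour search. The `pivot` dictionary is a hypothetical stand-in for the book-by-user pivot table, and Euclidean distance is assumed as the similarity measure; the project code itself uses scikit-learn's NearestNeighbors.

```python
import math

# Hypothetical title -> ratings-by-user vector (0 = not rated),
# standing in for the rows of a book-by-user pivot table.
pivot = {
    "Book A": [5, 4, 0, 0],
    "Book B": [5, 5, 0, 0],
    "Book C": [4, 4, 1, 0],
    "Book D": [0, 0, 5, 4],
}

def recommend(title, pivot, n=2):
    """Return the n titles whose rating vectors are closest to `title`."""
    target = pivot[title]
    distances = [
        (math.dist(target, vector), other)
        for other, vector in pivot.items()
        if other != title
    ]
    distances.sort()  # smallest distance (most similar) first
    return [other for _dist, other in distances[:n]]

suggestions = recommend("Book A", pivot)
print(suggestions)
```

Sorting by ascending distance is the same as sorting by descending similarity, so the first few entries are the books to recommend.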
SYSTEM DESIGN
SOURCE CODE (BACKEND)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
books = pd.read_csv('books.csv')
users = pd.read_csv('users.csv')
ratings = pd.read_csv('ratings.csv')
books.head(5)
books.shape
books.columns
books.head(2)
books.rename(columns={
"Book-Title":"title",
"Book-Author":'author',
"Year-Of-Publication":"year",
"Publisher":"publisher",
"Image-URL-L":"Img_Url",},inplace=True)
books.head(2)
users.head()
ratings.shape
print(books.shape)
print(users.shape)
print(ratings.shape)
ratings.rename(columns={
"User-ID":"user_id",
"Book-Rating":"rating"},inplace=True)
ratings.head()
ratings['user_id'].value_counts()
ratings['user_id'].unique().shape
x=ratings['user_id'].value_counts() > 200
x[x].shape
y=x[x].index
y
ratings=ratings[ratings['user_id'].isin(y)]
ratings.head()
ratings.shape
ratings_with_books = ratings.merge(books,on="ISBN")
ratings_with_books.head(2)
num_rating = ratings_with_books.groupby('title')['rating'].count().reset_index()
num_rating.head()
num_rating.rename(columns={"rating":"num_of_rating"},inplace=True)
num_rating.head()
ratings_with_books.head(2)
final_rating = ratings_with_books.merge(num_rating,on ='title')
final_rating.head(2)
final_rating.shape
final_rating = final_rating[final_rating['num_of_rating']>=50]
final_rating.sample(2)
final_rating.shape
final_rating.drop_duplicates(['user_id','title'],inplace=True)
final_rating.shape
final_rating
book_pivot=final_rating.pivot_table(columns='user_id',index='title',values='rating')
book_pivot
book_pivot.shape
book_pivot.fillna(0,inplace=True)
book_pivot
from scipy.sparse import csr_matrix
book_sparse = csr_matrix(book_pivot)
book_sparse
from sklearn.neighbors import NearestNeighbors
model = NearestNeighbors(algorithm='brute')
model.fit(book_sparse)
distance, suggestion = model.kneighbors(book_pivot.iloc[237,:].values.reshape(1,-1),n_neighbors=6)
distance
suggestion
for i in range(len(suggestion)):
    print(book_pivot.index[suggestion[i]])
book_pivot.index[237]
book_pivot.index
books_name = book_pivot.index
import pickle
pickle.dump(model,open('model.pkl','wb'))
pickle.dump(books_name,open('books_name.pkl','wb'))
pickle.dump(final_rating,open('final_rating.pkl','wb'))
pickle.dump(book_pivot,open('book_pivot.pkl','wb'))
def recommend_book(book_name):
    # Locate the row of the requested title in the pivot table
    book_id = np.where(book_pivot.index == book_name)[0][0]
    # Ask the fitted model for the 6 nearest rows (the first is the book itself)
    distance, suggestion = model.kneighbors(
        book_pivot.iloc[book_id,:].values.reshape(1,-1),n_neighbors=6)
    for i in range(len(suggestion)):
        books = book_pivot.index[suggestion[i]]
        for j in books:
            print(j)

book_name ='Harry Potter and the Chamber of Secrets (Book 2)'
recommend_book(book_name)
OUTPUT:
SOURCE CODE (FRONTEND)
import pickle
import streamlit as st
import numpy as np
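# Load the fitted model as well; recommend_books() below calls model.kneighbors
model = pickle.load(open('model.pkl','rb'))
books_name = pickle.load(open('books_name.pkl','rb'))
final_rating = pickle.load(open('final_rating.pkl','rb'))
book_pivot = pickle.load(open('book_pivot.pkl','rb'))

def fetch_poster(suggestion):
    book_name = []
    ids_index = []
    poster_url = []
    # The loop bodies were missing from the source listing. The lookup below is a
    # reconstruction that assumes final_rating keeps the 'title' and 'Img_Url'
    # columns created in the backend.
    for book_id in suggestion:
        book_name.append(book_pivot.index[book_id])
    for name in book_name[0]:
        ids_index.append(np.where(final_rating['title'] == name)[0][0])
    for idx in ids_index:
        poster_url.append(final_rating.iloc[idx]['Img_Url'])
    return poster_url

def recommend_books(book_name):
    books_list = []
    book_id = np.where(book_pivot.index == book_name)[0][0]
    distance, suggestion = model.kneighbors(
        book_pivot.iloc[book_id,:].values.reshape(1,-1),n_neighbors=6)
    poster_url = fetch_poster(suggestion)
    for i in range(len(suggestion)):
        books = book_pivot.index[suggestion[i]]
        for j in books:
            books_list.append(j)
    return books_list,poster_url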
selected_books = st.selectbox(
"Type or select a book",
books_name
)
if st.button('Show Recommendation'):
    recommendation_books, poster_url = recommend_books(selected_books)
    col1, col2, col3, col4, col5 = st.columns(5)
    with col1:
        st.text(recommendation_books[1])
        st.image(poster_url[1])
    with col2:
        st.text(recommendation_books[2])
        st.image(poster_url[2])
    with col3:
        st.text(recommendation_books[3])
        st.image(poster_url[3])
    with col4:
        st.text(recommendation_books[4])
        st.image(poster_url[4])
    with col5:
        st.text(recommendation_books[5])
        st.image(poster_url[5])
OUTPUT:
CONCLUSION
In today’s world, we can see that most people are using Amazon Kindle and
many other platforms. There, we can see that books are being recommended to
us based on our reading history. You can observe the same thing while
watching Instagram Reels and YouTube Shorts: these videos are also
recommended based on our watch history. This is where the recommendation
system works,
and that is what we have built in this project.
BIBLIOGRAPHY
1. Streamlit documentation - The official documentation for the Streamlit
library provides detailed information on the various functions and features
of the library, including those relevant to building a Books
Recommendation System.
2. https://www.analyticsvidhya.com
Analytics Vidhya is optimized for learning, testing, and training. Examples
might be simplified to improve reading and basic understanding.