https://doi.org/10.22214/ijraset.2022.45736
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 10 Issue VII July 2022- Available at www.ijraset.com

Personalized Book Recommendation System using TF-IDF and KNN Hybrid
Rashika S¹, Namit S Gouranna², Nishanth Nayak T³, Prajwal C R⁴, Mr. Prashanth J⁵
¹ ² ³ ⁴ BNM Institute of Technology, Bangalore, Karnataka
⁵ Assistant Professor, Department of Computer Science and Engineering, BNM Institute of Technology, Bangalore, Karnataka

Abstract: A recommendation system helps an organization build customer loyalty and trust by offering the products and services its
customers want. Modern systems can even serve new customers who are visiting a site for the first time. As the number of available
books grows, many readers prefer e-books, and online businesses dedicated solely to e-books have emerged. They let users purchase
books of interest or read them online, which helps these businesses meet their targets. To keep users engaged, they use machine
learning models that recommend books based on user preferences. Such a system is called a Book Recommendation System. Many book
recommendation systems have been built in the past; most have proved useful to both organizations and users and are in real-world
use. In the proposed system, we build a Book Recommendation System that recommends a set of books to users based on their previous
ratings and readings, using content-based, collaborative, and hybrid filtering models. This saves users time when searching for
books of interest.
Keywords: Hybrid recommendation, content-based filtering, collaborative filtering, personalized recommendation.

I. INTRODUCTION
Every recommender system involves two entities: users and items. A user is any customer or consumer of a product or item who
receives suggestions. The input to a recommendation algorithm is a database of users and items, and the output is the set of
recommendations. In our case, the inputs are a database of customers and a database of books, and the output is a set of book
recommendations.
Recommender systems are commonly built using a content-based approach, a collaborative approach, or a hybrid of the two.
1) Collaborative Filtering Approach: This approach collects and analyses data on users' behaviours, preferences, or activities, and
predicts what a user will like based on their similarity to other users.
2) Content-based Approach: This approach is based on the item description and the user profile. In a content-based recommender
system, keywords are used to describe the items, and a user profile is built to indicate the type of item the user likes.
3) Hybrid Recommender System: This system combines the content-based and collaborative filtering approaches.

Personalized recommendation systems seek to predict preferences based on a user's interests, behaviour, and other information.
Personalized recommendation not only meets users' needs but also helps them explore and discover new interests. Nowadays many
book-selling websites are available on the internet, and many of them have their own recommendation systems for suggesting books
to buyers.
In this project, we build a Personalized Book Recommendation System that recommends a set of books to users based on their
previous ratings and readings, using content-based, collaborative, and hybrid filtering models. This saves users time when
searching for books of interest.

II. DATASET DESCRIPTION

This dataset contains ratings for ten thousand popular books. The ratings were collected from the internet. Generally, there are
hundreds of ratings for each book, although some books have fewer. Ratings range from one to five. Both book IDs and user IDs are
contiguous: book IDs run from one to ten thousand, and user IDs from one to fifty-three thousand. Every user has made at least two
ratings, and the median number of ratings per user is eight. The dataset also includes books that users have marked "to read",
book metadata, and tags.

©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 3872

III. RELATED WORKS

As groundwork, we studied different kinds of existing recommendation engines, their uses, and their limitations. A concise version
of that study is presented here.
[1] YongChang Wang et al. describe PCA (Principal Component Analysis) and SVD (Singular Value Decomposition) for dimensionality
reduction. With the growth of large-scale, complex, high-dimensional data, reducing dimensionality has become essential. PCA
projects the data onto the directions that maximize its variance, and it uses eigenvalues and eigenvectors to find them. The paper
also covers the concept of SVD.
[2] Bin Li et al. use K-Nearest Neighbours (K-NN) and evaluate it with the Root Mean Squared Error (RMSE). In their experiments,
Top-10 recommendation is based on predicted scores: the items with the highest predicted scores are recommended. The model first
extracts keyword information from the items and then calculates keyword weights using methods such as Term Frequency (TF), Document
Frequency (DF), and Term Frequency - Inverse Document Frequency (TF-IDF).
[4] Mohammed Fadhel Aljunid et al. use the Alternating Least Squares (ALS) algorithm. ALS uses least-squares computation to
minimize the estimation error, alternating between solving for the product factors and solving for the user factors. The paper also
presents an improved ALS approach based on Apache Spark, which is reported to be more efficient than the existing ALS algorithm.
Both algorithms work in a similar fashion; the major difference is that ALS splits the dataset into training and test data only
once, whereas the improved ALS model splits the dataset k times to reduce the RMSE.
[5] Muhammad Zuhdi Fikri Johari et al. use an Indonesian online marketplace dataset to recommend relevant items to users, applying
IMDb's weighted rating formula to generate the top-n products. The work also shows that the top-n items of any category can be
found by combining user ratings, the IMDb weighted rating formula, and demographic filtering.
[6] N. Muthurasu et al. use Term Frequency - Inverse Document Frequency to provide recommendations according to the user's
preferences. Each data record is converted into a vector using TF-IDF vectorization, and the similarity between vectors is
computed using cosine similarity.

IV. PROPOSED MODEL

Our proposed system is a hybrid book recommendation system that uses the IMDb weighted rating formula, the cosine similarity
algorithm, and K-Nearest Neighbour (KNN) to recommend books. Its main advantage is that the algorithm is designed to work
efficiently even on a small dataset. The recommendations are based on the genre, the authors, and the likes and dislikes of the
user. The system saves users time when searching for books. Better and more efficient recommendation systems also increase market
reach and create a steady flow of returning customers for the site.

A. IMDb Weighted Rating Formula

The IMDb weighted rating, which IMDb formerly used to calculate its top-n ranking of films, is used here to calculate the top-n
products from the book dataset. Mathematically, IMDb's weighted rating formula is represented as follows:
$$WR = \frac{v}{v + m} \times R + \frac{m}{v + m} \times C$$
Where,
• WR is the weighted rating.
• v is the number of ratings for the book.
• m is the minimum number of ratings required to be listed in the chart.
• R is the average rating of the book.
• C is the mean rating across the whole report.
The next step is to determine an appropriate value for m, the minimum number of ratings required to be listed in the chart. We use
the 95th percentile as the cutoff. In other words, for a book to feature in the charts, it must have more ratings than at least
95% of the books in the list.
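
The weighted rating and the percentile cutoff can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionary keys `title`, `ratings_count`, and `average_rating` are hypothetical, C is taken as the plain mean of the per-book average ratings, the cutoff m is computed with a nearest-rank percentile, and a book qualifies when its rating count is at least m.

```python
import math
from statistics import fmean


def weighted_rating(v, R, m, C):
    """IMDb-style weighted rating: WR = v/(v+m) * R + m/(v+m) * C."""
    return (v / (v + m)) * R + (m / (v + m)) * C


def percentile(values, q):
    """Nearest-rank percentile (q between 0 and 100) of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def top_n(books, n=10, q=95):
    """Return (title, WR) pairs for the n best books that pass the cutoff.

    `books` is a list of dicts with the hypothetical keys 'title',
    'ratings_count' (v) and 'average_rating' (R).
    """
    # Assumption: C is the unweighted mean of the per-book averages.
    C = fmean(b["average_rating"] for b in books)
    # m: the q-th percentile of the rating counts.
    m = percentile([b["ratings_count"] for b in books], q)
    scored = [
        (b["title"], weighted_rating(b["ratings_count"], b["average_rating"], m, C))
        for b in books
        if b["ratings_count"] >= m
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]
```

Because books with few ratings are pulled towards C, a book with a handful of five-star ratings does not outrank a book with thousands of ratings and a slightly lower average.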

B. Cosine Similarity
This algorithm represents a text document as a vector of terms. Under this model, the similarity between two documents can be found
by computing the cosine of the angle between their vectors. The algorithm can be applied to any two texts, such as documents,
sentences, or paragraphs. In a search engine, for example, the similarity between the user query and each document is computed and
the documents are then ranked from highest to lowest similarity. A higher similarity score between the query vector and a document
vector means the document is more relevant to the query.

Ideally, a similarity measure between the user query and a document would take the meaning of the terms into account. Cosine
similarity, however, still cannot handle the semantic meaning of a query very well: it matches terms syntactically, so texts that
express the same meaning in different words are not recognized as similar.
Mathematically, the cosine similarity formula is represented as follows:
$$\cos(a, b) = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2} \, \sqrt{\sum_i b_i^2}}$$
Where,
• a is the term vector used as the key.
• b is the term vector it is compared against.
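
As an illustration of the vector model described above, the following sketch builds TF-IDF term vectors for a few short texts and compares them with cosine similarity. It makes several assumptions that are not taken from the paper: whitespace tokenization, raw term counts as term frequency, and the idf variant log(N / df) + 1. The function names are hypothetical.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one sparse {term: weight} vector per document.

    Assumptions: whitespace tokenization, raw counts as term frequency,
    and idf = log(N / df) + 1 (one common variant among several).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {term: count * (math.log(n_docs / df[term]) + 1.0)
             for term, count in tf.items()}
        )
    return vectors


def cosine_similarity(a, b):
    """cos(a, b) = sum(a_i * b_i) / (||a|| * ||b||) for sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    descriptions = [
        "fantasy magic dragon",
        "fantasy magic wizard",
        "cooking recipes pasta",
    ]
    vecs = tf_idf_vectors(descriptions)
    print(cosine_similarity(vecs[0], vecs[1]))  # shares two terms
    print(cosine_similarity(vecs[0], vecs[2]))  # shares no terms
```

The second comparison also shows the limitation noted above: two texts with no terms in common score zero even if they are related in meaning.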

C. K-Nearest Neighbour
KNN implements item-based collaborative filtering. It is a natural go-to model and a very good baseline for recommender system
development. KNN is a non-parametric, lazy learning method: it uses a dataset in which the data points are separated into several
clusters to make inferences for new samples. KNN makes no assumptions about the underlying data distribution; instead it relies on
item feature similarity. To make an inference about a book, KNN calculates the “distance” between the target book and every other
book in the dataset, ranks the distances, and returns the top K nearest neighbour books as the most similar book recommendations.
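
The neighbour search described above can be sketched as a brute-force loop. This is a minimal sketch under assumptions of our own: each book is represented by a vector of the ratings it received from each user (0 meaning unrated), and the distance is 1 minus cosine similarity. The paper does not state which distance metric or data layout was used, and the function names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # a book with no ratings is treated as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def k_nearest_books(ratings, target, k=5):
    """Return the k books closest to `target` as (title, distance) pairs.

    `ratings` maps each book title to its per-user rating vector
    (hypothetical layout; 0 means the user has not rated the book).
    """
    distances = [
        (title, cosine_distance(ratings[target], vector))
        for title, vector in ratings.items()
        if title != target
    ]
    distances.sort(key=lambda pair: pair[1])  # smallest distance first
    return distances[:k]


if __name__ == "__main__":
    ratings = {
        "A": [5, 4, 0, 0],
        "B": [4, 5, 0, 0],
        "C": [0, 0, 5, 4],
        "D": [5, 5, 1, 0],
    }
    print(k_nearest_books(ratings, "A", k=2))
```

Because KNN is a lazy learner, there is no training step: all the work happens at query time, which is why every query scans the whole catalogue in this sketch.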

V. CONCLUSION
Recommender systems are a powerful tool for making the selection process easier for users. This paper has covered personalized book
recommendation using a hybrid recommendation model. On the basis of this study, a hybrid approach combining the IMDb weighted
rating formula, the cosine similarity algorithm, and the Alternating Least Squares algorithm has been proposed in order to improve
on the efficiency of the basic algorithms.

REFERENCES
[1] YongChangWang, Ligu Zhu, “Research and Implementation of SVD In Machine Learning”, 16th International Conference on Computer and Information
Science (ICIS), Wuhan, China, 2017.
[2] Bin Li, Hua Xia, Sailuo Wan, Fengshou Qian, “The Research for Recommendation System Based on Improved KNN Algorithm”, 2020 IEEE International
Conference on Advances in Electrical Engineering and Computer Applications (AEECA), Dalian, China, 2020.
[3] Y. N. Bhagirathi, P. Kiran, “Book Recommendation System using KNN Algorithm”, International Journal of Research in Engineering, Science and
Management, 2019.
[4] Mohammed Fadhel Aljunid, D. H. Manjaiah, “An Improved ALS Recommendation Model Based on Apache Spark”, ResearchGate, Kollam, India, 2018.
[5] Muhammed Johari, Arif Laksito, “The Hybrid Recommender System of the Indonesian Online Market Products using IMDb weight rating and TF-IDF”,
JURNAL RESTI, 2021.
[6] N Muthurasu, Nandhini Rengaraj, Kavitha Conjeevaram Mohan, “Movie Recommendation System Using Term Frequency-Inverse Document Frequency and
Cosine Similarity Method”, International Journal of Recent Technology and Engineering (IJRTE), 2019.
[7] Suad A. Alasadi and Wesam S. Bhaya, “Review of Data Preprocessing Techniques in Data Mining”, Journal of Engineering and Applied Sciences, Babil, Iraq,
2017.
[8] Sandeep Matharia and C.N.S Murthy, “NOVA: Hybrid Book Recommendation Engine”, Institute of Electrical and Electronics Engineers (IEEE), Indore,
India, 2012.
[9] Sunny Sharma, Vijay Rana, Manisha Malhotra, “Automatic recommendation system based on hybrid filtering algorithm”, Springer, 2021.
[10] Salil Kanetkar, Akshay Nayak, Sridhar Swamy, Gresha Bhatia, “Web-based Personalized Hybrid Book Recommendation System”, IEEE, 2014.
[11] Yassine Afoudi, Mohamed Lazaar, Mohammed Al Achhab, “Hybrid recommendation system combined content-based filtering and collaborative prediction
using artificial neural network”, ScienceDirect, 2021.
[12] Yonghong Tian, Bing Zheng, Yanfang Wang, Yue Zhang, Qi Wu, “College Library Personalized Recommendation System Based on Hybrid Recommendation
Algorithm”, ScienceDirect, 2019.
