Python-based Personalized Recommendation System Development
INTRODUCTION
A recommendation system, or recommendation engine, is an information-filtering model that
tries to predict a user's preferences and provide suggestions based on them. These systems
have become increasingly popular and are widely used in areas such as movies, music, books,
videos, clothing, restaurants, food, places and other utilities. They collect information about a
user's preferences and behavior, and then use this information to improve their suggestions in
the future. Movies are part and parcel of life. There are different types of movies: some are for
entertainment, some for educational purposes, some are animated movies for children, and some
are horror or action films. Movies can be easily differentiated by genre, such as comedy,
thriller, animation or action. They can also be distinguished by release year, language, director
and so on. When watching movies online, there is a large number of titles to search through to
find the ones we like most. Movie recommendation systems help us find our preferred movies
among all of these and hence reduce the time spent searching for them. The movie
recommendation system therefore needs to be very reliable and should recommend the movies
that match our preferences most closely. A large number of companies use recommendation
systems to increase user interaction and enrich the user's shopping experience. Recommendation
systems have several benefits, the most important being customer satisfaction and revenue.
A movie recommendation system is therefore a powerful and important system. However, due to
the problems associated with a pure collaborative approach, movie recommendation systems also
suffer from poor recommendation quality and scalability issues.
The main aim is to develop a hybrid recommender system which incorporates and
enhances properties of existing recommendation systems along with a new approach, in order to
decrease system runtime and to reveal latent user and item relations with great accuracy.
A further aim is to develop a popularity score which will help users judge a movie better, and
success prediction for movies before release, which will provide better feedback to movie makers.
We also seek a general way to make recommendation methods more effective in a broader range of
applications. Although our experiments focus on one specific dataset, we aim to develop a
universal model that can be applied to other problem domains.
Over the years, many recommendation systems have been developed using either collaborative,
content-based or hybrid filtering methods. These systems have been implemented using various
big data and machine learning algorithms.
A recommendation system collects data about the user's preferences either implicitly or
explicitly on different items like movies. Implicit acquisition in the development of a movie
recommendation system uses the user's behavior while watching movies. Explicit acquisition,
on the other hand, uses the user's previous ratings or history. Another supporting technique
used in the development of recommendation systems is clustering. Clustering is a process that
groups a set of objects in such a way that objects in the same cluster are more similar to each
other than to those in other clusters. K-Means clustering along with K-Nearest Neighbor is
implemented on the MovieLens dataset in order to obtain the best-optimized result. In the
existing technique, the data is scattered, which results in a high number of clusters, while in the
proposed technique the data is gathered, which results in a low number of clusters. The process
of recommending a movie is optimized in the proposed scheme. The proposed recommender
system predicts the user's preference for a movie on the basis of different parameters. The
recommender system works on the concept that people have common preferences or choices
and that these users influence each other's opinions. This optimizes the process and yields a
lower RMSE.
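The clustering step described above can be illustrated with a minimal K-Means sketch in plain Python. It is not the implementation used in this project: the function names, the tiny `users` dataset and the choice of the first k points as initial centroids are assumptions made only for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length rating vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns one cluster index per point."""
    # Assumption: the first k points serve as initial centroids (deterministic).
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical users, each described by their ratings of three movies.
users = [[5, 4, 1], [1, 2, 5], [4, 5, 1], [2, 1, 4]]
labels = kmeans(users, k=2)
print(labels)
```

Users with similar rating vectors end up with the same label, so recommendations can then be drawn from within a user's own cluster.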
A system that gives recommendations usually filters the given data using various methodologies
and suggests the most relevant items to the customer. In day-to-day life people use
recommendation systems in many areas such as movies, books, music, news and other items. In
this paper a wide range of work in the field of recommender systems for movies is reviewed;
dataset sources, methods used and accuracy are compared to deduce the best approach, and the
future scope for improvement in this area is analyzed.
Collaborative filtering systems analyze the user's behavior and preferences and predict what they
would like based on similarity with other users. There are two kinds of collaborative filtering
systems: user-based recommenders and item-based recommenders.
1. User-based filtering: User-based preferences are very common in the design of
personalized systems. This approach is based on the user's likings. The process starts with users
giving ratings (1-5) to some movies. These ratings can be implicit or explicit. Explicit ratings are
given when the user explicitly rates the item on some scale or gives it a thumbs-up or
thumbs-down. Explicit ratings are often hard to gather, as not every user is interested in
providing feedback. In these scenarios, we gather implicit ratings based on user behavior.
For instance, if a user buys a product more than once, it indicates a positive preference. In
the context of movie systems, we can infer that if a user watches an entire movie, he or she
likes it to some degree. Note that there are no clear rules for determining implicit ratings. Next,
for each user, we find a defined number of nearest neighbors. We calculate the correlation
between users' ratings using the Pearson correlation coefficient. The assumption is that if
two users' ratings are highly correlated, then these two users must enjoy similar items, and
this correlation is used to recommend items to users.
2. Item-based filtering: Unlike the user-based filtering method, the item-based method focuses on
the similarity between the items users like instead of the users themselves. The most similar items
are computed ahead of time. Then, for recommendation, the items that are most similar to the
target item are recommended to the user.
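The neighbor search used in user-based filtering can be sketched as follows. This is a minimal illustration in plain Python, not the project's code: the helper names `pearson` and `nearest_neighbors` and the three sample users are hypothetical, and the correlation is computed only over the movies both users have rated.

```python
import math

def pearson(ratings_a, ratings_b):
    """Pearson correlation over the movies that both users have rated."""
    common = [m for m in ratings_a if m in ratings_b]
    if len(common) < 2:
        return 0.0  # not enough overlap to measure correlation
    mean_a = sum(ratings_a[m] for m in common) / len(common)
    mean_b = sum(ratings_b[m] for m in common) / len(common)
    cov = sum((ratings_a[m] - mean_a) * (ratings_b[m] - mean_b) for m in common)
    var_a = sum((ratings_a[m] - mean_a) ** 2 for m in common)
    var_b = sum((ratings_b[m] - mean_b) ** 2 for m in common)
    if var_a == 0 or var_b == 0:
        return 0.0  # a user who rates everything the same carries no signal
    return cov / math.sqrt(var_a * var_b)

def nearest_neighbors(target, all_ratings, n=2):
    """Return the n users whose ratings correlate most with the target user."""
    scores = [(pearson(all_ratings[target], r), user)
              for user, r in all_ratings.items() if user != target]
    scores.sort(reverse=True)
    return [user for _, user in scores[:n]]

# Hypothetical explicit ratings on the 1-5 scale.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m1": 4, "m2": 2, "m3": 3},
    "carol": {"m1": 1, "m2": 5, "m3": 2},
}
neighbors = nearest_neighbors("alice", ratings, n=1)
print(neighbors)
```

Movies that the selected neighbors rated highly and the target user has not yet seen are then candidates for recommendation.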
2.4 Products and Movie Recommendation System for Social Networking Sites
Recommendation systems are an integral part of information filtering in data science and
are widely used to identify what a user is likely to choose on the basis of the user's
previous choices as well as the patterns in which others have chosen.
A recommender can never be one hundred percent correct in its
recommendations but can be close enough to please users to a certain extent. Thus, such
systems are widely used in industry these days to increase profit and secure a good hold on the
market. The data scientists of a company design an algorithm that studies the information
from the social network and clusters the data. This can be a single algorithm
such as k-Means clustering or a Hidden Markov model, or it can be done with bagging and boosting
techniques. By displaying movies or products in the profile of a
particular customer, companies not only increase their business but also enhance the customer
experience. However, there are several issues with the standard techniques, such as the cold start
problem and shilling attacks, which widen the scope of research in this field. This work
uses both Collaborative Filtering and Content-Based Filtering to form a product and movie
recommendation system for social networking sites; it shows the effectiveness of
collaborative filtering and portrays the challenges faced by content-based filtering.
2.5 An Intelligent Movie Recommendation System based on user priorities
Today's web and app users demand personalized experiences. They expect the apps, news sites
and social networks they engage with to remember who they are and what they are interested in,
and to make relevant, tailored and accurate recommendations for new content and new goods
based on their earlier actions. This can be done using recommender systems in machine learning.
In this paper a recommender system is used to recommend movies to a user based on his or her
previous ratings of movies he or she has come across.
CHAPTER 3
The main aim is to develop a hybrid recommender system which incorporates and
enhances properties of existing recommendation systems. Further aims are to develop a popularity
score which will help users judge a movie better, to find a general way to make recommendation
methods more effective in a broader range of applications, and to introduce a new approach that
decreases system runtime and reveals latent user and item relations with great accuracy.
Users do not always leave behind enough personalized information along their customer
journey. For instance, new customers can be acquired or existing customers might browse an
e-commerce website without being logged in. Non-personalized recommendation systems, such as
those based on proposals for products frequently purchased together, still offer recommendation
opportunities for companies in this case. However, the more individually these are tailored to the
customer, the better. Therefore, in the following, personalized approaches are presented which
learn the preferences of customers. To understand these methods, it is helpful to consider the
recommendation problem at hand as a sparsely populated matrix. The rows represent the users,
the columns represent the items. Whenever a user performs an action for an item, a
respective entry is recorded in the matrix. Otherwise, values remain absent and do not need to be
explicitly stored.
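The sparsely populated matrix described above can be sketched with nested dictionaries, so that only recorded interactions occupy memory. The class name `SparseInteractions`, its methods and the sample user and item identifiers are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

class SparseInteractions:
    """User-item matrix that stores only the entries that were recorded."""

    def __init__(self):
        # Rows are users; each row maps an item to the recorded value.
        self._rows = defaultdict(dict)

    def record(self, user, item, value):
        """Store an interaction (for example a rating) for a user and item."""
        self._rows[user][item] = value

    def get(self, user, item):
        """Return the stored value, or None when the entry is absent."""
        # dict.get is used on the outer mapping so lookups never create rows.
        return self._rows.get(user, {}).get(item)

    def stored_entries(self):
        """Number of entries that are actually held in memory."""
        return sum(len(row) for row in self._rows.values())

matrix = SparseInteractions()
matrix.record("u1", "movie_a", 4.0)
matrix.record("u2", "movie_c", 2.5)
print(matrix.get("u1", "movie_a"), matrix.get("u1", "movie_c"))
```

With two users and three items the dense matrix would have six cells, but only the two recorded interactions are stored.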
3.2 PURPOSE
3.3 SCOPE
The scope of this SRS document persists for the entire life cycle of the project. This
document defines the final state of the software requirements agreed upon by the customers and
designers. At the end of the project execution, all the functionalities should be traceable from
the SRS to the product. The document describes the functionality, performance, constraints,
interface and reliability for the entire life cycle of the project.
3.4 PROPOSED SYSTEM
Earlier work proposed a movie recommendation system using collaborative filtering that focuses on
the ratings given by users to provide recommendations. That system is built using
a machine learning algorithm to sort the movies according to their ratings. In one paper the authors
propose a fully content-based movie recommendation system. That
system makes use of a neural network with the content information of the movies to
obtain features and learn the similarities between movies. Other authors implement a
recommendation system that combines both user-based and item-based collaborative filtering
approaches. That system is built using machine learning techniques and develops a new algorithm
that unifies user-based and item-based recommendations. Based on the research we conducted,
collaborative filtering was found to be one of the most widely used approaches to building
recommendation systems.
CHAPTER 4
Recommender systems are information filtering systems that help deal with the problem of
information overload by filtering and segregating information and creating fragments out of large
amounts of dynamically generated information according to the user's preferences, interests, or
observed behavior about a particular item or items. A recommender system has the ability to
predict whether a particular user would prefer an item or not based on the user's profile and its
historical information. Recommendation systems have also been shown to improve decision-making
processes and quality. In large e-commerce settings, recommender systems enhance the
revenues for marketing, for the fact that they are effective means of selling more products. In
scientific libraries, recommender systems support and allow users to move beyond the generic
catalogue searches. Therefore, the need to use efficient and accurate recommendation techniques
within a system that provides relevant and dependable recommendations for users cannot be
neglected.
Companies like Netflix use a recommendation engine to present their viewers with
movie and show suggestions. Amazon, on the other hand, uses its recommendation engine to
present customers with product recommendations. While each uses its engine for slightly different
purposes, both have the same general goal: to drive sales, boost engagement and customer
retention, and deliver more personalized customer experiences. Recommendations typically
speed up searches, make it easier for users to access the content they are
interested in, and surprise them with offers they would never have searched for. In doing so,
companies are able to gain new customers by sending out customized emails with links to new
offers that meet the recipients' interests, or suggestions of films and TV shows that suit their
particular profiles.
4.1 COLLABORATIVE FILTERING
Collaborative filtering is a method for recommender systems that is based solely on
the past interactions recorded between users and items in order to produce new
recommendations. Collaborative filtering finds what similar users like, classifies
users into clusters of similar types, and makes recommendations to each user according to the
preferences of its cluster. The main idea that governs
collaborative methods is that past user-item interactions, when processed by
the system, are sufficient to detect similar users or similar items and to make predictions
based on these estimated similarities.
Memory-based approaches work directly with the values of recorded interactions
and are essentially based on nearest-neighbor search: they find the users closest to a user
of interest and suggest the most popular items among these neighbors. Model-based
approaches assume that there is an underlying “generative” model that explains the user-item
interactions and try to discover it in order to make new predictions. Collaborative filtering
recommends an item to user A based on the interests of a similar user B. Furthermore, the
embeddings can be learned automatically, without relying on hand-engineering of features. The
collaborative filtering method does not need the features of the items to be given. Every user and
item is described by a feature vector or embedding.
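How embeddings can be learned from interactions alone may be sketched with a small matrix factorization trained by stochastic gradient descent. This is one common model-based technique chosen here for illustration; the source does not state which model the project uses. The function names, hyperparameters and the six sample ratings are all assumptions.

```python
import math
import random

def train_embeddings(ratings, n_factors=2, epochs=200, lr=0.05, reg=0.02, seed=0):
    """Learn one embedding per user and per item from (user, item, rating) triples."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Small random starting vectors; no item features are supplied.
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, user, item):
    """Predicted rating is the dot product of the two embeddings."""
    return sum(pu * qi for pu, qi in zip(P[user], Q[item]))

def training_rmse(P, Q, ratings):
    """Root mean squared error of the model on the given triples."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))

# Hypothetical interactions: (user, movie, rating on the 1-5 scale).
data = [("u1", "m1", 5), ("u1", "m2", 3), ("u2", "m1", 4),
        ("u2", "m3", 1), ("u3", "m2", 2), ("u3", "m3", 5)]
untrained = train_embeddings(data, epochs=0)
trained = train_embeddings(data)
print(training_rmse(*untrained, data), training_rmse(*trained, data))
```

A prediction for a pair that was never observed, such as `predict(P, Q, "u1", "m3")`, comes from the same dot product, which is how the model fills in missing entries.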
Fig 4.1 :- SYSTEM ARCHITECTURE
4.2 SYSTEM REQUIREMENTS
RAM: 4 GB
4.3.1 Python:
Python is Interactive − You can actually sit at a Python prompt and interact with the
interpreter directly to write your programs.
Python is Object-Oriented − Python supports Object-Oriented style or technique of
programming that encapsulates code within objects.
Python is a Beginner's Language − Python is a great language for beginner-level
programmers and supports the development of a wide range of applications from simple
text processing to WWW browsers to games.
Python was developed by Guido van Rossum in the late eighties and early nineties at the
National Research Institute for Mathematics and Computer Science in the Netherlands.
Python is derived from many other languages, including ABC, Modula-3, C, C++, Algol-68,
Smalltalk, and Unix shell and other scripting languages.
Python is copyrighted. Its source code is open source and is distributed under the Python
Software Foundation License, which is compatible with the GNU General Public License (GPL).
Python is now maintained by a core development team under the Python Software Foundation;
Guido van Rossum, its original author, led its development for many years.
Easy-to-learn − Python has few keywords, simple structure, and a clearly defined
syntax. This allows the student to pick up the language quickly.
Easy-to-read − Python code is more clearly defined and visible to the eyes.
A broad standard library − The bulk of Python's library is very portable and cross-platform
compatible on UNIX, Windows, and Macintosh.
Interactive Mode − Python has support for an interactive mode which allows interactive
testing and debugging of snippets of code.
Portable − Python can run on a wide variety of hardware platforms and has the same
interface on all platforms.
Extendable − You can add low-level modules to the Python interpreter. These modules
enable programmers to add to or customize their tools to be more efficient.
GUI Programming − Python supports GUI applications that can be created and ported to
many system calls, libraries and windows systems, such as Windows MFC, Macintosh,
and the X Window system of Unix.
Scalable − Python provides a better structure and support for large programs than
shell scripting.
Apart from the above-mentioned features, Python has a long list of other good features; a few are
listed below −
It provides very high-level dynamic data types and supports dynamic type checking.
It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.
Python is available on a wide variety of platforms including Linux and Mac OS X. Let's
understand how to set up our Python environment.
Getting Python
The most up-to-date and current source code, binaries, documentation, news, etc., is available on
the official website of Python https://www.python.org.
Windows Installation
Follow the link for the Windows installer python-XYZ.msi file where XYZ is the version
you need to install.
To use this installer python-XYZ.msi, the Windows system must support Microsoft
Installer 2.0. Save the installer file to your local machine and then run it to find out if
your machine supports MSI.
Run the downloaded file. This brings up the Python install wizard, which is really easy to
use. Just accept the default settings, wait until the install is finished, and you are done.
The Python language has many similarities to Perl, C, and Java. However, there are some
definite differences between the languages.
$ python
Python 2.4.3 (#1, Nov 11 2010, 13:34:43)
>>>
Type the following text at the Python prompt and press Enter −
>>> print "Hello, Python!"
If you are running Python 3, print is a function and must be called with
parentheses, as in print("Hello, Python!"). In Python version 2.4.3, the statement above produces the
following result −
Hello, Python!
Invoking the interpreter with a script parameter begins execution of the script and continues until
the script is finished. When the script is finished, the interpreter is no longer active.
Let us write a simple Python program in a script. Python files have the extension .py. Type the
following source code in a test.py file −
print("Hello, Python!")
We assume that you have the Python interpreter set in the PATH variable. Now, try to run this program as
follows −
$ python test.py
Hello, Python!
4.4 MODULES
Evaluation Module
All the required datasets are collected from the Kaggle website. In this project we require
movie.csv, ratings.csv and users.csv.
We preprocess the datasets into a proper format and transform the data frame of ratings into a
suitable format for our model. We want the data to be in an m × n matrix, where m and n give the
number of movies and users.
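The transformation of the ratings table into an m × n matrix can be sketched with the standard `csv` module. The column names `userId`, `movieId` and `rating` are an assumption about the layout of ratings.csv, the inline sample replaces the real file, and using 0.0 for missing ratings is a simplification made for this sketch.

```python
import csv
import io

def build_rating_matrix(csv_text):
    """Pivot rating rows into a movies x users matrix (rows: movies, columns: users)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    movies = sorted({row["movieId"] for row in rows})
    users = sorted({row["userId"] for row in rows})
    movie_index = {m: i for i, m in enumerate(movies)}
    user_index = {u: j for j, u in enumerate(users)}
    # Assumption: 0.0 marks a movie the user has not rated.
    matrix = [[0.0] * len(users) for _ in movies]
    for row in rows:
        i = movie_index[row["movieId"]]
        j = user_index[row["userId"]]
        matrix[i][j] = float(row["rating"])
    return movies, users, matrix

# Hypothetical excerpt in the assumed ratings.csv layout.
sample = "userId,movieId,rating\n1,10,4.0\n1,20,3.5\n2,10,5.0\n"
movies, users, matrix = build_rating_matrix(sample)
print(movies, users, matrix)
```

For a real file, the same function can be fed the text returned by reading ratings.csv from disk.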
In this module, we have modeled different collaborative filters to predict ratings for users. The
distance between the targeted item and the other items is obtained by a similarity measure, which
gives us the top matches, and finally the required list of recommended movies is predicted,
ranked by distance.
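As an illustration of ranking items by a similarity measure, the sketch below uses cosine distance between movie rating vectors and returns the closest movies first. Cosine distance is only one possible measure, and the function names and the three sample movies are hypothetical.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; smaller values mean more similar vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an unrated movie is treated as unrelated
    return 1.0 - dot / (norm_a * norm_b)

def most_similar_movies(target, movie_vectors, top_n=2):
    """Return the top_n movies closest to the target, nearest first."""
    distances = [(cosine_distance(movie_vectors[target], vec), title)
                 for title, vec in movie_vectors.items() if title != target]
    distances.sort()
    return [title for _, title in distances[:top_n]]

# Hypothetical rows of the movies x users matrix (0 = not rated).
movie_vectors = {
    "Movie A": [5, 4, 0, 1],
    "Movie B": [4, 5, 0, 2],
    "Movie C": [0, 1, 5, 4],
}
similar = most_similar_movies("Movie A", movie_vectors, top_n=2)
print(similar)
```

Swapping `cosine_distance` for another measure, such as one based on Pearson correlation, lets the same ranking code be reused when comparing similarity measures.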
Evaluation is done by comparing the actual ratings with the predicted ratings for movies the user
has already seen and which are present in the remaining dataset; a similar process is repeated with
other similarity measures.
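The comparison of actual and predicted ratings can be summarized with the root mean squared error (RMSE), the metric mentioned earlier in this report. The sketch below is a minimal version; the sample values are invented for illustration and are not results from this project.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical held-out ratings and the ratings a model predicted for them.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 2.5]
error = rmse(actual, predicted)
print(round(error, 4))
```

A lower RMSE means the predicted ratings are closer to the ratings users actually gave, so the similarity measure with the lowest value would be preferred.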
Data Flow Diagram (DFD) is a two-dimensional diagram that describes how data is
processed and transmitted in a system. The graphical depiction recognizes each source of data
and how it interacts with other data sources to reach a mutual output. In order to draft a data flow
diagram one must:
Identify external inputs and outputs
Determine how the inputs and outputs relate to each other
Explain with graphics how these connections relate and what they result in.
4.5.1 Role of DFD:
It is a documentation aid that is understood by both programmers and
non-programmers, as a DFD specifies only what processes are accomplished, not how they are
performed.
A physical DFD specifies where the data flows and who processes the data.
It permits analysts to isolate areas of interest in the organization and study them by
examining the data that enter a process and viewing how they are altered when they leave.
Figure 4.1
UML stands for Unified Modeling Language. UML is a standardized general-purpose modeling
language in the field of object-oriented software engineering. The standard is managed, and was
created, by the Object Management Group.
The goal is for UML to become a common language for creating models of
object-oriented computer software. In its current form UML comprises two major
components: a meta-model and a notation. In the future, some form of method or process may
also be added to, or associated with, UML.
The Unified Modeling Language is a standard language for specifying, visualizing,
constructing and documenting the artifacts of software systems, as well as for business modeling
and other non-software systems.
The UML represents a collection of best engineering practices that have proven
successful in the modeling of large and complex systems.
The UML is a very important part of developing object-oriented software and the
software development process. The UML uses mostly graphical notations to express the design
of software projects.
GOALS:
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modeling language so that they can
develop and exchange meaningful models.
2. Provide extendibility and specialization mechanisms to extend the core concepts.
3. Be independent of particular programming languages and development processes.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of the OO tools market.
6. Support higher-level development concepts such as collaborations, frameworks, patterns
and components.
7. Integrate best practices.
A use case diagram in the Unified Modeling Language (UML) is a type of behavioral
diagram defined by and created from a use-case analysis. Its purpose is to present a graphical
overview of the functionality provided by a system in terms of actors, their goals (represented
as use cases), and any dependencies between those
use cases. The main purpose of a use case diagram is to show what system functions are
performed for which actor. The roles of the actors in the system can be depicted.
Fig 4.2
4.8 ACTIVITY DIAGRAM FOR ISSUE BOOK:
Activity diagrams are graphical representations of workflows of stepwise activities and actions
with support for choice, iteration and concurrency. In the Unified Modeling Language, activity
diagrams can be used to describe the business and operational step-by-step workflows of
components in a system. An activity diagram shows the overall flow of control.
Fig 4.
CHAPTER 5
It is a technical specification requirement for the software products. It is the first step in
the requirement analysis process which lists the requirements of particular software systems
including functional, performance and security requirements. The function of the system depends
mainly on the quality hardware used to run the software with given functionality.
Usability
It specifies how easy the system must be to use. It is easy to ask queries in any format, whether
short or long; the Porter stemming algorithm helps produce the desired response for the user.
Robustness
It refers to a program that performs well not only under ordinary conditions but also
under unusual conditions. It is the ability of the system to cope with errors and irrelevant queries
during execution.
Security
Security is the state of providing protected access to resources. The system provides good
security: unauthorized users cannot access the system.
Reliability
It describes how often the software fails. The measurement is often expressed
in MTBF (Mean Time Between Failures). The requirement is needed in order to ensure that the
processes work correctly and completely without being aborted. The system can handle load,
survive, and is even capable of working around failures.
Compatibility
It is supported by current versions of all web browsers. Running it on any web server, such as
localhost, gives the system a real-time experience.
Flexibility
The flexibility of the project is provided in such a way that it has the ability to run on
different environments and be executed by different users.
Safety
Portability
It is the usability of the same software in different environments. The project can be run
on any operating system.
Performance
These requirements determine the resources required, time interval, throughput and
everything that deals with the performance of the system.
Accuracy
The results of queries are very accurate and information is retrieved at high speed.
The degree of security provided by the system is high and effective.
Maintainability
Maintainability defines how easy it is to maintain the system, that is, how
easy it is to analyse, change and test the application. Maintainability of
this project is simple, as further updates can easily be done without affecting its stability.
The input design is the link between the information system and the user. It comprises
developing specifications and procedures for data preparation, that is, the steps necessary to put
transaction data into a usable form for processing. This can be achieved by instructing the computer to
read data from a written or printed document, or by having people key the data
directly into the system. The design of input focuses on controlling the amount of input
required, controlling errors, avoiding delay, avoiding extra steps and keeping the process
simple. The input is designed in such a way that it provides security and ease of use while
retaining privacy. Input design considers the following things:
OBJECTIVES
1. Input design is the process of converting a user-oriented description of the input into a
computer-based system. This design is important to avoid errors in the data input process and
to show the correct direction to the management for getting correct information from the
computerized system.
2. It is achieved by creating user-friendly screens for data entry to handle large volumes of data.
The goal of designing input is to make data entry easier and free from errors. The data
entry screen is designed in such a way that all data manipulations can be performed. It also
provides record viewing facilities.
3. When the data is entered, it is checked for validity. Data can be entered with the help of
screens. Appropriate messages are provided as and when needed so that the user will not be in a
maze at any instant. Thus the objective of input design is to create an input layout that is easy to follow.
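The validity check with appropriate messages described in the objectives above could look like the following sketch for a single rating field. The function name `validate_rating` and the message texts are hypothetical; the 1-5 range follows the rating scale mentioned earlier in this report.

```python
def validate_rating(raw_value, minimum=1.0, maximum=5.0):
    """Return (rating, message); rating is None when the input is rejected."""
    try:
        rating = float(raw_value)
    except (TypeError, ValueError):
        # Non-numeric input such as "abc" or a missing value.
        return None, "Please enter a number."
    if not minimum <= rating <= maximum:
        # Also rejects NaN, because comparisons with NaN are always false.
        return None, f"Rating must be between {minimum:g} and {maximum:g}."
    return rating, "Rating accepted."

for entry in ["4", "abc", "9"]:
    print(entry, validate_rating(entry))
```

Returning a message together with the result lets the data entry screen show the user what went wrong instead of failing silently.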
A quality output is one which meets the requirements of the end user and presents the
information clearly. In any system, results of processing are communicated to the users and to
other systems through outputs. In output design it is determined how the information is to be
displayed for immediate need and also as hard copy output. It is the most important and direct
source of information for the user. Efficient and intelligent output design improves the system's
relationship with the user and helps decision-making.
1. Designing computer output should proceed in an organized, well thought out manner; the right
output must be developed while ensuring that each output element is designed so that people will
find the system easy and effective to use. When analysts design computer output, they
should identify the specific output that is needed to meet the requirements.
1. Create documents, reports, or other formats that contain information produced by the system.
The output form of an information system should accomplish one or more of the following
objectives.
The feasibility of the project is analyzed in this phase, and a business proposal is put
forth with a very general plan for the project and some cost estimates. During system analysis
the feasibility study of the proposed system is to be carried out. This is to ensure that the
proposed system is not a burden to the company. For feasibility analysis, some understanding
of the major requirements for the system is essential.
ECONOMICAL FEASIBILITY
TECHNICAL FEASIBILITY
SOCIAL FEASIBILITY
This study is carried out to check the economic impact that the system will have on the
organization. The amount of fund that the company can pour into the research and development
of the system is limited. The expenditures must be justified. Thus the developed system is well
within the budget and this was achieved because most of the technologies used are freely
available. Only the customized products had to be purchased.
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not have a high demand on the
available technical resources, as this would lead to high demands being placed on the client.
The developed system must
have a modest requirement, as only minimal or null changes are required for implementing this
system.
This aspect of the study is to check the level of acceptance of the system by the user. This
includes the process of training the user to use the system efficiently. The user must not feel
threatened by the system, instead must accept it as a necessity. The level of acceptance by the
users solely depends on the methods that are employed to educate the user about the system and
to make him familiar with it. His level of confidence must be raised so that he is also able to
make some constructive criticism, which is welcomed, as he is the final user of the system.
The purpose of testing is to discover errors. Testing is the process of trying to discover
every conceivable fault or weakness in a work product. It provides a way to check the
functionality of components, sub-assemblies, assemblies and/or a finished product. It is the
process of exercising software with the intent of ensuring that the
software system meets its requirements and user expectations and does not fail in an
unacceptable manner. There are various types of tests. Each test type addresses a specific testing
requirement.
Unit testing
Unit testing involves the design of test cases that validate that the internal program logic
is functioning properly, and that program inputs produce valid outputs. All decision branches
and internal code flow should be validated. It is the testing of individual software units of the
application. It is done after the completion of an individual unit, before integration. This is
structural testing that relies on knowledge of the unit's construction and is invasive. Unit tests
perform basic tests at component level and test a specific business process, application, and/or
system configuration. Unit tests ensure that each unique path of a business process performs
accurately to the documented specifications and contains clearly defined inputs and expected
results.
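As an illustration of a unit test with clearly defined inputs and expected results, the following is a minimal sketch using Python's standard unittest module. The helper find_closest_title and the sample titles are hypothetical and are not part of the project code; the helper only wraps difflib.get_close_matches, which the source code in the appendix imports.

```python
import difflib
import unittest


def find_closest_title(movie_name, titles):
    """Return the title most similar to movie_name, or None if nothing is close.

    Hypothetical helper used only for this illustration.
    """
    matches = difflib.get_close_matches(movie_name, titles, n=1)
    return matches[0] if matches else None


class FindClosestTitleTest(unittest.TestCase):
    # Sample titles chosen for the illustration, not taken from the dataset.
    titles = ["Avatar", "Iron Man", "The Dark Knight"]

    def test_exact_title_is_returned(self):
        self.assertEqual(find_closest_title("Avatar", self.titles), "Avatar")

    def test_misspelled_title_is_matched(self):
        self.assertEqual(find_closest_title("Iron Mann", self.titles), "Iron Man")

    def test_unrelated_input_returns_none(self):
        self.assertIsNone(find_closest_title("zzzzzz", self.titles))
```

Saved in a file, the tests can be run with "python -m unittest" followed by the file name. Each test exercises one decision branch of the helper: an exact match, a near match, and no match.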
Functional tests provide systematic demonstrations that the functions tested are available as
specified by the business and technical requirements, system documentation, and user manuals.
Functions: identified functions must be exercised.
System testing ensures that the entire integrated software system meets requirements. It tests
a configuration to ensure known and predictable results. An example of system testing is the
configuration-oriented system integration test. System testing is based on process descriptions
and flows, emphasizing pre-driven process links and integration points.
Unit Testing:
Unit testing is usually conducted as part of a combined code and unit test phase of the
software lifecycle, although it is not uncommon for coding and unit testing to be conducted as
two distinct phases.
Test objectives
Features to be tested
Integration Testing
Software integration testing is the incremental testing of two or more integrated software
components on a single platform to expose failures caused by interface defects.
The task of the integration test is to check that components or software applications (for
example, components in a software system or, one step up, software applications at the company
level) interact without error.
Test Results: All the test cases mentioned above passed successfully. No defects encountered.
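The kind of interface check described for integration testing can be sketched as follows. This is an illustration under stated assumptions, not the project's actual test suite: the titles, the similarity values, and the functions match_title, rank_similar, and recommend are all hypothetical. Unlike the appendix listing, the ranker here leaves the query movie out of its own results.

```python
import difflib

# Toy data standing in for the movie table and the similarity matrix
# (hypothetical values chosen for the illustration).
TITLES = ["Avatar", "Iron Man", "Iron Man 2"]
SIMILARITY = [
    [1.0, 0.1, 0.2],
    [0.1, 1.0, 0.8],
    [0.2, 0.8, 1.0],
]


def match_title(movie_name, titles):
    """Component 1: map free-text input to a known title."""
    matches = difflib.get_close_matches(movie_name, titles, n=1)
    if not matches:
        raise ValueError(f"no title close to {movie_name!r}")
    return matches[0]


def rank_similar(index, similarity, top_n):
    """Component 2: indices of the top_n most similar rows, excluding the movie itself."""
    scores = [(i, s) for i, s in enumerate(similarity[index]) if i != index]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [i for i, _ in scores[:top_n]]


def recommend(movie_name, titles, similarity, top_n=2):
    """Integration point: the matcher's output feeds the ranker's input."""
    index = titles.index(match_title(movie_name, titles))
    return [titles[i] for i in rank_similar(index, similarity, top_n)]


def test_components_interact_without_error():
    # A misspelled query must pass through both components and come back as titles.
    assert recommend("Iron Mann", TITLES, SIMILARITY) == ["Iron Man 2", "Avatar"]
```

The test targets the interface between the two components: a defect such as the matcher returning a title the ranker cannot locate would surface here even if each component passed its own unit tests.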
Acceptance Testing
User Acceptance Testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional requirements.
Test Results: All the test cases mentioned above passed successfully. No defects encountered.
CHAPTER 6
In this project, we have discussed how movie tweets were collected from microblogging
websites to understand current trends and user response to a movie. Experiments conducted on a
public database produced promising results. The proposed system is also scalable enough to
handle the enormous amount of information available online. It recommends movies through a
less complex process than the existing system and gives more genuine and faster results.
APPENDIX
SOURCE CODE:
import numpy as np
import pandas as pd
import difflib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
# loading the data from the csv file to a pandas dataframe
movies_data = pd.read_csv('/content/movies.csv')
# printing the first 5 rows of the dataframe
movies_data.head()
# number of rows and columns in the data frame
movies_data.shape
# selecting the relevant features for recommendation
selected_features = ['genres','keywords','tagline','cast','director']
print(selected_features)
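# replacing the null values with an empty string
for feature in selected_features:
    movies_data[feature] = movies_data[feature].fillna('')
# combining the selected features into one text string per movie
combined_features = (movies_data['genres'] + ' ' + movies_data['keywords'] + ' ' +
                     movies_data['tagline'] + ' ' + movies_data['cast'] + ' ' +
                     movies_data['director'])
# converting the text data to TF-IDF feature vectors
vectorizer = TfidfVectorizer()
feature_vectors = vectorizer.fit_transform(combined_features)
print(feature_vectors)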
# getting the similarity scores using cosine similarity
similarity = cosine_similarity(feature_vectors)
print(similarity)
print(similarity.shape)
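# getting the movie name from the user (the prompt text is an assumed wording)
movie_name = input('Enter your favourite movie name: ')
# creating a list with all the movie titles in the dataset
list_of_all_titles = movies_data['title'].tolist()
print(list_of_all_titles)
# finding the close match for the movie name given by the user
find_close_match = difflib.get_close_matches(movie_name, list_of_all_titles)
if not find_close_match:
    raise ValueError('no close match found for the given movie name')
close_match = find_close_match[0]
print(close_match)
# finding the index of the movie with the matched title
index_of_the_movie = movies_data[movies_data.title == close_match]['index'].values[0]
print(index_of_the_movie)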
# getting a list of similar movies
similarity_score = list(enumerate(similarity[index_of_the_movie]))
print(similarity_score)
len(similarity_score)
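# sorting the movies based on their similarity score
sorted_similar_movies = sorted(similarity_score, key=lambda x: x[1], reverse=True)
# printing the names of the most similar movies
top_n = 10  # assumed cutoff; the number of titles shown is not stated
print('Movies suggested for you:')
i = 1
for index, score in sorted_similar_movies[:top_n]:
    print(i, '.', movies_data.iloc[index]['title'])
    i += 1

# complete recommendation flow, from the movie name to the suggestions
movie_name = input('Enter your favourite movie name: ')
list_of_all_titles = movies_data['title'].tolist()
find_close_match = difflib.get_close_matches(movie_name, list_of_all_titles)
if not find_close_match:
    raise ValueError('no close match found for the given movie name')
close_match = find_close_match[0]
index_of_the_movie = movies_data[movies_data.title == close_match]['index'].values[0]
similarity_score = list(enumerate(similarity[index_of_the_movie]))
sorted_similar_movies = sorted(similarity_score, key=lambda x: x[1], reverse=True)
print('Movies suggested for you:')
i = 1
for index, score in sorted_similar_movies[:top_n]:
    print(i, '.', movies_data.iloc[index]['title'])
    i += 1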
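To make the similarity step in the listing above easier to follow, here is a small worked example of cosine similarity in plain Python. It is a sketch under stated assumptions: it uses raw term counts instead of the TF-IDF weights produced by TfidfVectorizer, the three feature strings are hypothetical, and the function cosine is written for illustration rather than taken from scikit-learn.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical combined feature strings for three movies.
features = {
    "Movie A": "action space alien",
    "Movie B": "action space robot",
    "Movie C": "romance comedy",
}
vectors = {title: Counter(text.split()) for title, text in features.items()}

# Rank every other movie by its similarity to the query movie.
query = "Movie A"
scores = sorted(
    ((title, cosine(vectors[query], vec)) for title, vec in vectors.items() if title != query),
    key=lambda pair: pair[1],
    reverse=True,
)
for title, score in scores:
    print(title, round(score, 3))
```

Movie B shares two of its three terms with Movie A, so its score is 2/3 and it ranks first. Movie C shares none and scores 0. The appendix listing applies the same idea to TF-IDF vectors for every pair of movies at once.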