Minor Project

This document describes a movie recommendation system called PICK-A-MOVIE that uses machine learning. It uses a supervised machine learning approach and content-based filtering to calculate similarity scores between movie contents and recommend movies to users based on their preferences. The system analyzes similarity between vectors representing movie texts to find similar movies. It aims to improve the customer experience on applications by recommending content that matches their interests.

Uploaded by

harmeetpics1607
Copyright
© All Rights Reserved

PICK-A-MOVIE

(Using Machine Learning)

Name – Sarthak Agrawal


MCA 3rd Semester
Jan-2023 Batch
(Machine Learning and Artificial Intelligence)
Enrollment No. - A9929722000077 (el)
ABSTRACT

PICK-A-MOVIE, a movie recommendation system, is an example of the supervised machine
learning approach. The analysis focuses on computing similarity scores between two items
in content-based filtering. Each item is represented as a vector, and the cosine similarity
formula gives the cosine of the angle between two such vectors, so a higher score means the
two items are more alike.
The model computes similarity scores between two texts in a Jupyter Notebook, using the
distance between their vectors to do so effectively and precisely. Today the Internet
offers a vast amount of content to watch, but not all of it matches our tastes. We are
sometimes shown videos, news, clothing and other items that do not suit our likes and
interests. This lowers the customer's interest in the application, and he/she may not want
to use the same application again.
The need of the hour is code that can, even at a beginner level, identify the pattern in a
customer's behaviour and recommend the items that best match his or her interests. This
makes the customer experience satisfactory and helps the application earn good ratings and
popularity. For example, if we watch a particular genre of movie more frequently, the
system favours that genre over the others when recommending what to watch next. The
application gains popularity through its ratings and at the same time enhances the user
experience.
A recommendation system of this kind helps improve an application's profitability and keeps
the organisation connected with its users. Recommendations are also at work in online food
applications such as Zomato and Swiggy, which offer their customers restaurants that cater
to their taste in food. They learn the user's behaviour from previous orders and try to
impress them with the latest add-ons to their favourite cuisines. The recommendation engine
implements ML algorithms that compute the similarity scores between specified features.
INTRODUCTION

Supervised machine learning is used when the model is trained on a labelled dataset. A
labelled dataset is one that has both input and output parameters. In this type of learning
both the training and validation datasets are labelled. There are three components:
training data, test data and features. The data is usually split in the ratio 80:20, i.e.,
80% as training data and the rest as testing data. The testing data is held back until the
model is ready to be tested. At testing time, input is fed from the remaining 20% of the
data, which the model has never seen before; the model predicts a value, and we compare it
with the actual output to calculate the accuracy.
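The 80:20 split and the accuracy check described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the runtime data, the "long"/"short" labels and the threshold "model" are hypothetical stand-ins, not part of the project.

```python
import random

def train_test_split_80_20(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of `rows` and split it into training and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(predicted, actual):
    """Fraction of predictions that equal the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

def fit_threshold(rows):
    """Toy 'model': the midpoint between the mean runtimes of the two classes."""
    short = [minutes for minutes, label in rows if label == "short"]
    long_ = [minutes for minutes, label in rows if label == "long"]
    return (sum(short) / len(short) + sum(long_) / len(long_)) / 2

# Hypothetical labelled data: (runtime in minutes, label), 20 rows in total.
data = [(m, "long" if m >= 120 else "short") for m in range(60, 180, 6)]

train, test = train_test_split_80_20(data)   # 16 training rows, 4 test rows
threshold = fit_threshold(train)             # learned from training data only

# Predict on rows the model has never seen, then compare with the true labels.
predicted = ["long" if minutes >= threshold else "short" for minutes, _ in test]
actual = [label for _, label in test]
print(len(train), len(test), accuracy(predicted, actual))
```

The fixed seed only makes the shuffle repeatable; any seed gives a valid split.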

The different steps of this ML approach are as follows:


1.) Data Extraction and Cleaning: The data is extracted and cleaned using scripting
languages such as Python or shell scripting. Here we extract and filter the useful data
from the movies dataset according to our need.
2.) Build the ML Model: Once the data is extracted and cleaned, we build the model with
tools such as Scikit-Learn, Difflib and Pandas. Our movie recommendation engine is built
here in the form of a Python script.
3.) Build Software Infrastructure: To integrate this engine into an application or website
for users, we package the ML algorithm as software using JavaScript and other tools.
Users- They are the ones who use these services, i.e. the consumers.
Items- These are the different sets of movies that are recommended, in a sort of zig-zag
manner, according to our previous searches.
There are three ways of performing the filtering. They are as follows-

• Trending based filtering- Here the movies are ranked by their ratings and by what is
liked by the majority of the population.
• Content based filtering- Here similar items are recommended to the user according to
his previous content searches.
• Collaborative based filtering- Here the likings of two similar users act as a
recommendation criterion for each other. For example, if two users both watch comedy
movies and a new comedy is watched by user A, it will also be recommended to user B.
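Of the three approaches, trending based filtering is the simplest to show in code. The sketch below is only an illustration: the records, the `min_votes` cut-off and the function name are assumptions, not taken from the project.

```python
# Hypothetical (title, average rating, vote count) records.
movies = [
    ("Movie A", 8.1, 12000),
    ("Movie B", 9.4, 35),
    ("Movie C", 7.6, 54000),
    ("Movie D", 6.9, 800),
]

def trending(records, min_votes=500, top_n=3):
    """Rank movies with enough votes by average rating, highest first."""
    popular = [record for record in records if record[2] >= min_votes]
    ranked = sorted(popular, key=lambda record: record[1], reverse=True)
    return [title for title, rating, votes in ranked[:top_n]]

print(trending(movies))  # ['Movie A', 'Movie C', 'Movie D']
```

Movie B has the highest rating but too few votes to count as liked by the majority, so the vote threshold removes it before ranking.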
A recommender system, or a recommendation system (sometimes replacing 'system' with a
synonym such as platform or engine), is a subclass of information filtering system that seeks
to predict the "rating" or "preference" a user would give to an item. They are primarily used
in commercial applications. Recommender systems are utilized in a variety of areas and are
most widely recognized as playlist generators for video and music services like Netflix,
YouTube and Spotify, product recommenders for services such as Amazon, or content
recommenders for social media platforms such as Facebook or Twitter. These systems can
operate using a single input, like music, or multiple inputs within and across platforms like
news, books, and search queries. Recommender systems have also been developed to explore
research articles and financial services.

These are both recommendation engines that recommend movies and other related content
based on our previous searches and viewing history.
➢ Classification: It is a Supervised Learning task where the output has defined labels
(discrete values).
➢ Regression: It is a Supervised Learning method where the target feature has
continuous values.
OBJECTIVE

A recommendation system or recommendation engine is a model used for information
filtering that tries to predict the preferences of a user and provides suggestions based on
these preferences. These systems have become increasingly popular and are widely used
today in areas such as movies, music, books, videos, clothing, restaurants, food, places and
other utilities. These systems collect information about a user's preferences and behaviour,
and then use this information to improve their suggestions in the future.
Movies are part and parcel of life. There are different types of movies: some for
entertainment, some for educational purposes, some animated movies for children, and
some horror movies or action films. Movies can easily be differentiated by genre, such as
comedy, thriller, animation or action. Other ways to distinguish among movies are by
release year, language, director etc. When watching movies online, there are many movies
to search through to find the ones we like most.
Movie recommendation systems help us find our preferred movies among all of these
different types of movies and hence reduce the trouble of spending a lot of time searching
for our favourite movies. So the movie recommendation system must be very reliable and
should recommend movies that are exactly the same as, or most closely matched with, our
preferences. A large number of companies are making use of recommendation systems to
increase user interaction and enrich a user's shopping experience. Recommendation systems
have several benefits, the most important being customer satisfaction and revenue. A movie
recommendation system is a very powerful and important system. However, due to the
problems associated with a pure collaborative approach, movie recommendation systems
also suffer from poor recommendation quality and scalability issues.
The goal of the project is to recommend movies to the user, i.e. to provide users of online
services with related content drawn from a collection of relevant and irrelevant items.
LITERATURE REVIEW
(Background Study)

♦ Movie Recommendation System by K-Means Clustering And K-Nearest Neighbor
A recommendation system collects data about the user’s preferences, either implicitly or
explicitly, on different items such as movies. Implicit acquisition in the development of a
movie recommendation system uses the user’s behaviour while watching movies. Explicit
acquisition, on the other hand, uses the user’s previous ratings or history. Another
supporting technique used in the development of recommendation systems is clustering.
Clustering is the process of grouping a set of objects in such a way that objects in the
same cluster are more similar to each other than to those in other clusters. K-Means
Clustering along with K-Nearest Neighbour is implemented on the MovieLens dataset in order
to obtain the best-optimized result. In the existing technique the data is scattered, which
results in a high number of clusters, while in the proposed technique the data is gathered,
which results in a low number of clusters. The process of recommending a movie is optimized
in the proposed scheme. The proposed recommender system predicts the user’s preference for
a movie on the basis of different parameters. The recommender system works on the concept
that people have common preferences or choices, and that such users influence each other’s
opinions. This optimizes the process and yields a lower RMSE.
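RMSE (root mean squared error) is the square root of the average squared difference between predicted and actual ratings, so a lower value means the predictions are closer to what users actually rated. A minimal sketch with made-up ratings, which are not results from the study above:

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical ratings on a 1-5 scale.
actual = [4.0, 3.0, 5.0, 2.0]
predicted = [3.5, 3.0, 4.0, 2.5]
print(round(rmse(predicted, actual), 3))  # 0.612
```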
♦ Movie Recommendation System Using Collaborative Filtering
Collaborative filtering systems analyse the user's behaviour and preferences and predict what
they would like based on similarity with other users. There are two kinds of collaborative
filtering systems: user-based recommenders and item-based recommenders.
1. User-based filtering: User-based preferences are very common in the field of designing
personalized systems. This approach is based on the user's likings. The process starts with
users giving ratings (1-5) to some movies. These ratings can be implicit or explicit.
Explicit ratings are when the user explicitly rates the item on some scale or indicates a
thumbs-up/thumbs-down for the item. Explicit ratings are often hard to gather, as not every
user is interested in providing feedback. In these scenarios, we gather implicit ratings
based on the user's behaviour. For instance, if a user buys a product more than once, it
indicates a positive preference. In the context of movie systems, we can infer that if a
user watches the entire movie, he/she has some liking for it. Note that there are no clear
rules for determining implicit ratings. Next, for each user, we find a defined number of
nearest neighbours. We calculate the correlation between users' ratings using the Pearson
correlation coefficient. The assumption that two users whose ratings are highly correlated
must enjoy similar items and products is used to recommend items to users.
2. Item-based filtering: Unlike the user-based filtering method, the item-based method
focuses on the similarity between the items users like, rather than between the users
themselves. The most similar items are computed ahead of time.
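The user-based step described above, correlating users' ratings and picking the nearest neighbours, can be sketched as follows. The user names, movie titles and ratings are hypothetical, and the helper names are illustrative rather than taken from any cited system.

```python
import math

def pearson(ratings_a, ratings_b):
    """Pearson correlation over the movies that both users have rated."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0
    xs = [ratings_a[movie] for movie in common]
    ys = [ratings_b[movie] for movie in common]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # identical ratings for every movie carry no signal
    return cov / math.sqrt(var_x * var_y)

def nearest_neighbours(user, ratings, k=1):
    """Return the k users whose ratings correlate most strongly with `user`."""
    scores = [(pearson(ratings[user], ratings[other]), other)
              for other in ratings if other != user]
    return [name for score, name in sorted(scores, reverse=True)[:k]]

# Hypothetical explicit ratings (1-5), keyed by user and movie title.
ratings = {
    "alice": {"Heat": 5, "Up": 2, "Alien": 4, "Big": 1},
    "bob":   {"Heat": 4, "Up": 1, "Alien": 5, "Big": 2},
    "carol": {"Heat": 1, "Up": 5, "Alien": 2, "Big": 4},
}
print(nearest_neighbours("alice", ratings))  # ['bob']
```

Movies that a neighbour rated highly and the user has not yet seen would then become the recommendation candidates.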
RESEARCH METHODOLOGY

To achieve the goal of the project, the first step is sufficient background study, so a
literature study is conducted. The whole project is based on a large amount of movie data,
so we choose a quantitative research method. For the philosophical assumption, positivism
is selected because the project is experimental and test-driven in character. The research
approach is deductive, as the improvement from our research is tested by deducing and
testing a theory. Ex post facto research is our research strategy: the movie data is
already collected and we do not change the independent variables. We use experiments to
collect movie data. Computational mathematics is used for data analysis because the result
is based on improvement of the algorithm. For quality assurance, we give a detailed
explanation of the algorithm to ensure test validity. Similar results are generated when we
run the same data multiple times, which establishes reliability. We ensure that the same
data leads to the same result for different researchers.

Methodology Adopted - Agile Methodology


Research Design – The research design that was used in this study is both ‘Descriptive’ and
‘Exploratory’.
Sampling Size – 4803 movie records in the data file.
1. Collecting the data sets: Collect all the required data sets from the Kaggle website. In
this project we require the movies.csv file.
2. Data Analysis: Make sure that the collected data sets are correct and analyse the data
in the csv files, i.e., check whether all the column fields are present in the data sets.
3. Algorithms: In our project only two algorithms, cosine similarity and singular value
decomposition, are used to build the machine learning recommendation model.
4. Training and Testing the model: Once the implementation of the algorithm is completed,
we have to train the model to get the result. We have tested it several times, and the
model recommends different sets of movies to different users.
5. Improvements in the project: At a later stage we can implement different algorithms
and methods for better recommendations.
DATA INTERPRETATION

Content-based Filtering
Content-based recommenders work with data that the user provides, either explicitly (rating)
or implicitly (clicking on a link). Based on that data, a user profile is generated, which is
then used to make suggestions to the user. As the user provides more inputs or takes actions
on the recommendations, the engine becomes more and more accurate.
Algorithm:
1. Collect the user reviews or movie name as the input from the user.
2. Extract the name from the user input.
3. Convert the dataset feature values into feature vectors.
4. Compare the extracted name from the user’s input with the dataset using cosine
similarity.
5. Learn the user’s profile based on the rated items and make recommendations based
on the top-ranked items after comparing cosine values.
Cosine similarity is a measure of similarity between two non-zero vectors of an inner
product space that measures the cosine of the angle between them.
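Formula:

\[
\cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
           = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}} \, \sqrt{\sum_{i=1}^{n} B_i^{2}}}
\]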
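Cosine similarity as defined above, the dot product of two vectors divided by the product of their lengths, can be written directly in plain Python. The function name and sample vectors below are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

print(cosine_similarity([1, 2, 0], [2, 4, 0]))  # same direction: close to 1
print(cosine_similarity([1, 0], [0, 1]))        # orthogonal vectors: 0
```

Because the score depends only on the angle, a vector and any positive multiple of it are treated as identical, which suits text vectors of different lengths.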

A recommendation system is a system that filters information and predicts the rating a
user would give to an item. It identifies recommendations for individual users based on
their past acquisitions and searches, and on other users' behaviour. Recommender systems
are software tools and techniques that suggest items likely to be of use to a user. The
system helps users find the items they are interested in. It supports and improves the
quality of the decisions users make when searching for items online.
Model Implementation :-
1. First, we import all the required modules into the model.
2. After importing the required modules, we upload the data file.
3. After the data file is uploaded, user reviews are collected as the data set. The data
set is pre-processed to remove unwanted text and missing values, and we select the
relevant features to use to predict the movie name.
4. Now, we convert the ‘combined_features’ data to feature vectors.
5. Now, we apply the cosine similarity function to get the cosine values in the shape of a
matrix.
6. Now, we take the movie name as input from the user and find a close match in the list
of all available movie titles.
7. Find the index of the movie that was the closest match and arrange the similarity
scores in descending order.
8. Finally, display the names of the top 30 movies that are similar to the name entered by
the user.
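The steps above can be sketched end to end as follows. This is not the project's notebook: the report names Scikit-Learn, Difflib and Pandas, whereas this sketch uses only the standard library, replaces the feature vectorizer with plain word counts, and uses a tiny in-memory table in place of movies.csv. The column names, titles and the choice to leave the queried movie out of its own results are all assumptions.

```python
import difflib
import math
from collections import Counter

# Hypothetical stand-in for rows of movies.csv; column names are assumptions.
movies = [
    {"title": "Space Quest", "genres": "science fiction adventure",
     "keywords": "space alien ship", "director": "A. Rao"},
    {"title": "Star Voyage", "genres": "science fiction adventure",
     "keywords": "space ship crew", "director": "B. Lee"},
    {"title": "Laugh Riot", "genres": "comedy",
     "keywords": "wedding friends prank", "director": "C. Diaz"},
    {"title": "Wedding Chaos", "genres": "comedy romance",
     "keywords": "wedding family prank", "director": None},
]
FEATURES = ["genres", "keywords", "director"]

def combined_features(row):
    """Step 3: join the selected columns, replacing missing values with ''."""
    return " ".join((row.get(column) or "") for column in FEATURES).lower()

def vectorize(text):
    """Step 4: bag-of-words counts (a simpler substitute for TF-IDF)."""
    return Counter(text.split())

def cosine(u, v):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(u[term] * v[term] for term in u.keys() & v.keys())
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

# Step 5: an n x n matrix of similarity scores between every pair of movies.
vectors = [vectorize(combined_features(row)) for row in movies]
similarity = [[cosine(u, v) for v in vectors] for u in vectors]

def recommend(name, top_n=30):
    """Steps 6-8: match the title, rank by similarity, return the top names."""
    titles = [row["title"] for row in movies]
    matches = difflib.get_close_matches(name, titles, n=1)
    if not matches:
        return []
    index = titles.index(matches[0])
    ranked = sorted(enumerate(similarity[index]),
                    key=lambda pair: pair[1], reverse=True)
    return [titles[i] for i, score in ranked if i != index][:top_n]

print(recommend("Space Qest", top_n=1))  # ['Star Voyage']
```

`difflib.get_close_matches` tolerates the misspelt title, which is why the user's input does not need to match a title exactly.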
Model at a Glance
RECOMMENDATIONS AND CONCLUSION

• Conclusion
In this project we have implemented and learnt the following: building a movie
recommendation system; finding similarity scores and indexes; computing the distance
between two vectors; cosine similarity; finding the Euclidean distance; and many more
ML-related concepts and techniques.
Since our project is a movie recommendation system, one can develop it using content-based
filtering, collaborative filtering, or a combination of both. In our project we have used
the content-based approach. This approach is quite straightforward and easy to implement.
The content-based approach has its own advantages and disadvantages. Because content-based
filtering is based on the user's ratings, only movies of that kind will be recommended to
the user.
Advantages: it is easy to design and it takes less time to compute.
Disadvantages: the model can only make recommendations based on the existing interests of
the user. In other words, the model has limited ability to expand on the users' existing
interests.

• Recommendations
There are plenty of ways to expand on the work done in this project. Firstly, the content-based
method can be expanded to include more criteria to help categorize the movies. The most obvious
ideas are to add features that suggest movies with common actors, directors or writers. In addition,
movies released within the same time period could also receive a boost in likelihood of
recommendation. Similarly, a movie's total gross could be used to identify a user's taste in terms of
whether he/she prefers large-release blockbusters or smaller indie films. However, the above ideas
may lead to overfitting, given that a user's taste can be highly varied, and we only have a guarantee
that 20 movies (less than 0.2%) have been reviewed by the user. In addition, we could try to develop
hybrid methods that combine the advantages of both content-based methods and collaborative
filtering in one recommendation system. A mood-detection system could also be combined with the
model, as a user may tend to watch a particular movie when he/she is in a particular mood. For
example, if one is angry or upset, an action movie may be a better fit, but since our model
recommends based only on the name and does not consider other factors, the prediction may not
satisfy the customer.
