Minor Project
Supervised machine learning is used when the model is trained on a labelled dataset.
A labelled dataset is one that contains both the input and the output parameters. In this type of
learning, both the training and the validation datasets are labelled. There are three components:
training data, test data and features. The data is usually split in the ratio 80:20, i.e., 80% is
used as training data and the rest as testing data. The testing data is held back until the model
is ready to be tested. At testing time, the input is taken from the remaining 20% of the data,
which the model has never seen before; the model predicts a value, and we compare it with
the actual output to calculate the accuracy.
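The 80:20 split and the accuracy calculation described above can be sketched in plain Python. This is a minimal illustration, not the project's code: the dataset, the threshold "model" and the function names `train_test_split` and `accuracy` are all assumptions made for the example.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical labelled dataset: (input value, label) pairs.
data = [(x, "high" if x >= 50 else "low") for x in range(100)]
train, test = train_test_split(data)

# A deliberately simple "model": a threshold halfway between the
# mean input of each class, learned from the training data only.
lows = [x for x, label in train if label == "low"]
highs = [x for x, label in train if label == "high"]
threshold = (sum(lows) / len(lows) + sum(highs) / len(highs)) / 2

# Predict on the unseen 20% and compare with the actual labels.
predictions = ["high" if x >= threshold else "low" for x, _ in test]
test_accuracy = accuracy(predictions, [label for _, label in test])
print(len(train), len(test), test_accuracy)
```

The fixed seed only makes the split repeatable; any seed gives an 80-row training set and a 20-row test set that do not overlap.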
• Trending-based filtering - Here the movies are ranked by their ratings and by what the
majority of users have liked.
• Content-based filtering - Here similar items are recommended to the user
according to the content he or she has previously searched for.
• Collaborative filtering - Here the preferences of two similar users act as
recommendation criteria for each other. For example, if two users both watch comedy
movies and a new comedy appears and is watched by user A, it will also be recommended
to user B.
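The collaborative case can be illustrated with a small sketch. It assumes that similarity between users is measured simply by how many titles they have both watched; the function name `recommend_from_similar_user` and the viewing data are hypothetical.

```python
def recommend_from_similar_user(watched, target, min_shared=1):
    """Recommend titles seen by the most similar other user but not by the target."""
    target_set = watched[target]
    best_user, best_overlap = None, 0
    for user, titles in watched.items():
        if user == target:
            continue
        # Similarity here is just the number of titles both users watched.
        overlap = len(target_set & titles)
        if overlap > best_overlap:
            best_user, best_overlap = user, overlap
    if best_user is None or best_overlap < min_shared:
        return []
    return sorted(watched[best_user] - target_set)

# Hypothetical viewing histories.
watched = {
    "A": {"Comedy One", "Comedy Two", "Comedy Three"},
    "B": {"Comedy One", "Comedy Two"},
    "C": {"Horror One"},
}
print(recommend_from_similar_user(watched, "B"))  # ['Comedy Three']
```

User B shares two comedies with user A, so the comedy that only A has watched is recommended to B. User C shares nothing with anyone and gets no recommendation.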
A recommender system, or a recommendation system (sometimes replacing 'system' with a
synonym such as platform or engine), is a subclass of information filtering system that seeks
to predict the "rating" or "preference" a user would give to an item. They are primarily used
in commercial applications. Recommender systems are utilized in a variety of areas and are
most widely recognized as playlist generators for video and music services like Netflix,
YouTube and Spotify, product recommenders for services such as Amazon, or content
recommenders for social media platforms such as Facebook or Twitter. These systems can
operate using a single input, like music, or multiple inputs within and across platforms like
news, books, and search queries. Recommender systems have also been developed to explore
research articles and financial services.
These are recommendation engines that suggest movies and other related content
based on our previous searches and viewing history.
➢ Classification: A supervised learning task in which the output has defined labels
(discrete values).
➢ Regression: A supervised learning task in which the target feature has
continuous values.
OBJECTIVE
To achieve the goal of the project, the first step is a sufficient background study,
so a literature study will be conducted. The whole project is based on a large amount of
movie data, so we choose a quantitative research method. As the philosophical assumption,
positivism is selected because the project is experimental and test-driven in character. The
research approach is deductive, as the improvement achieved by our research will be tested by
deducing and testing a theory. Ex post facto research is our research strategy: the movie data
is already collected and we do not change the independent variables. We use experiments to
collect movie data. Computational mathematics is used for data analysis because the result is
based on improvement of the algorithm. For quality assurance, we give a detailed explanation
of the algorithm to ensure test validity. Similar results are generated when we run the
same data multiple times, which ensures reliability. We also ensure that the same data leads to
the same result when used by different researchers.
Content-based Filtering
A content-based recommender works with data that the user provides, either explicitly (rating)
or implicitly (clicking on a link). Based on that data, a user profile is generated, which is then
used to make suggestions to the user. As the user provides more inputs or takes actions on the
recommendations, the engine becomes more and more accurate.
Algorithm:
1. Collect the user's review or a movie name as input from the user.
2. Extract the movie name from the user input.
3. Convert the dataset feature values into feature vectors.
4. Compare the name extracted from the user's input with the dataset using cosine
similarity.
5. Learn the user's profile from the rated items and recommend the top-ranked
items after comparing the cosine values.
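The step that converts feature values into feature vectors can be sketched as follows. The report does not state which vectorizer was used, so this example assumes plain word counts (a bag-of-words model); the movie titles, feature strings and helper names are hypothetical.

```python
from collections import Counter

def build_vocabulary(documents):
    """Sorted list of every distinct lowercase word across the documents."""
    return sorted({word for doc in documents for word in doc.lower().split()})

def to_vector(document, vocabulary):
    """Count how often each vocabulary word occurs in the document."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical text features (e.g. genres or keywords) for each movie.
features = {
    "Movie A": "action space adventure",
    "Movie B": "space adventure drama",
    "Movie C": "romantic comedy",
}
vocabulary = build_vocabulary(features.values())
vectors = {title: to_vector(text, vocabulary) for title, text in features.items()}

print(vocabulary)
for title, vector in vectors.items():
    print(title, vector)
```

Every movie is mapped to a vector of the same length, one position per vocabulary word, so any two movies can then be compared numerically.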
Cosine similarity is a measure of similarity between two non-zero vectors of an inner
product space that measures the cosine of the angle between them.
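Formula: cos(θ) = (A · B) / (‖A‖ ‖B‖), where A · B is the dot product of the two vectors
and ‖A‖ and ‖B‖ are their Euclidean lengths.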
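As an illustration, cosine similarity for two equal-length numeric vectors can be computed with the standard library alone. This is a minimal sketch; the function name `cosine_similarity` and the sample vectors are chosen for the example and are not taken from the project code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)

same_direction = cosine_similarity([1, 1, 0], [2, 2, 0])
orthogonal = cosine_similarity([1, 0], [0, 1])
print(same_direction, orthogonal)
```

Vectors pointing in the same direction score close to 1 regardless of their length, and vectors with no features in common score 0, which is why the measure suits comparing movies by their feature vectors.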
A recommendation system filters information in order to predict the rating a user would give
to a given item. It identifies recommendations for individual users based on their past
acquisitions and searches, and on other users' behaviour. Recommender systems are software
tools and techniques that suggest items likely to be of use to a user. The system helps users
find the items they are interested in, and it supports and improves the quality of the decisions
users make when searching for items online.
Model Implementation :-
1. First, we import all the modules the model requires.
6. Next, we take the movie name as input from the user and find a close match in the list of
all available movie titles.
7. Find the index of the closest-matching movie and arrange its similarity
scores in descending order.
8. Finally, display the names of the top 30 movies that are similar to the name entered by the
user.
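The last three steps can be sketched as below. The sketch assumes that the close match is found with Python's standard `difflib.get_close_matches` and that a similarity matrix has already been computed; the report names neither. The function `recommend`, the titles and the scores are hypothetical, and leaving the matched movie out of its own results is a choice made for this example.

```python
import difflib

def recommend(user_input, titles, similarity, top_n=30):
    """Return up to top_n titles most similar to the closest match of user_input."""
    # Find the title that most closely matches what the user typed.
    matches = difflib.get_close_matches(user_input, titles, n=1)
    if not matches:
        return []
    index = titles.index(matches[0])
    # Pair every movie index with its similarity to the matched movie,
    # then sort the pairs by score in descending order.
    scores = list(enumerate(similarity[index]))
    scores.sort(key=lambda pair: pair[1], reverse=True)
    # Skip the matched movie itself (an assumption of this sketch).
    return [titles[i] for i, _ in scores if i != index][:top_n]

# Hypothetical titles and a precomputed similarity matrix.
titles = ["Iron Man", "Iron Man 2", "Avatar", "Titanic"]
similarity = [
    [1.0, 0.9, 0.3, 0.1],
    [0.9, 1.0, 0.2, 0.1],
    [0.3, 0.2, 1.0, 0.4],
    [0.1, 0.1, 0.4, 1.0],
]
print(recommend("Iron Mann", titles, similarity, top_n=2))
```

With the misspelled input "Iron Mann", the closest title is "Iron Man", and the two highest-scoring other movies in its row are returned. An input that resembles no title yields an empty list.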
Model at a Glance
RECOMMENDATIONS AND CONCLUSION
• Conclusion
In this project we implemented and learned the following: building a movie
recommendation system; finding similarity scores and indexes; computing the distance
between two vectors; cosine similarity; Euclidean distance; and many more ML-related
concepts and techniques.
Since our project is a movie recommendation system, it can be developed using
content-based filtering, collaborative filtering, or a combination of both. In our project
we have used the content-based approach. This approach is quite straightforward and
easy to implement. The content-based approach has its own advantages and
disadvantages. Because content-based filtering is based on the user's ratings, only movies
of the kind the user has already rated will be recommended to the user.
Advantages: it is easy to design and it takes less time to compute.
Disadvantages: the model can only make recommendations based on the existing interests of
the user. In other words, the model has limited ability to expand on the user's existing
interests.
• Recommendations
There are plenty of ways to expand on the work done in this project. Firstly, the content-based
method can be expanded to include more criteria to help categorize the movies. The most obvious
ideas are to add features to suggest movies with common actors, directors or writers. In addition,
movies released within the same time period could also receive a boost in likelihood for
recommendation. Similarly, a movie's total gross could be used to identify a user's taste in terms of
whether he/she prefers large-release blockbusters or smaller indie films. However, the above ideas
may lead to overfitting, given that a user's taste can be highly varied, and we only have a guarantee
that 20 movies (less than 0.2%) have been reviewed by the user. In addition, we could try to develop
hybrid methods that combine the advantages of both content-based methods and
collaborative filtering into one recommendation system. A mood-detection system could also be
combined with the model, as a user may sometimes tend to watch a particular movie when he/she is
in a particular mood. For example, if one is angry or upset, an action movie may be a better fit; but
since our model recommends based only on the movie name and does not consider other factors, the
prediction may not satisfy the user.