
MOVIE RECOMMENDATION SYSTEM USING

MACHINE LEARNING ALGORITHMS

A PROJECT REPORT

Submitted by

SANDHIYA G (17BIT013)
KAUSHIKA S (17BIT024)
SUJITHRA R(17BIT005)

in partial fulfilment for the award of the degree


BACHELOR OF TECHNOLOGY
in
INFORMATION TECHNOLOGY

KUMARAGURU COLLEGE OF TECHNOLOGY
COIMBATORE - 641 049
(An Autonomous Institution Affiliated to Anna University, Chennai)

June 2021

BONAFIDE CERTIFICATE

Certified that this project report “MOVIE RECOMMENDATION
SYSTEM USING MACHINE LEARNING ALGORITHMS” is the
bonafide work of SANDHIYA G (17BIT013), KAUSHIKA S
(17BIT024) and SUJITHRA R (17BIT005), who carried out the
project work under my supervision.

SIGNATURE SIGNATURE
Dr. M. Alamelu Ms. S. Sathyavathi
HEAD OF THE DEPARTMENT Supervisor
Associate Professor Assistant Professor
Information Technology Information Technology

Internal Examiner External Examiner

The candidates with University register numbers
(17BIT013, 17BIT024, 17BIT005) were examined in the Project Viva-Voce
examination held on 26.05.2021.

DECLARATION

We, SANDHIYA G (17BIT013), KAUSHIKA S (17BIT024) and
SUJITHRA R (17BIT005), hereby declare that the project
“MOVIE RECOMMENDATION SYSTEM USING MACHINE
LEARNING ALGORITHMS” is done by us and, to the best of
our knowledge, a similar work has not been submitted to any other
institution for the fulfilment of the required course of study. The
report is submitted in partial fulfilment of the requirements for the
award of the Degree of Bachelor of Technology in Information
Technology at Kumaraguru College of Technology, Coimbatore.

SANDHIYA G KAUSHIKA S SUJITHRA R


17BIT013 17BIT024 17BIT005

We certify that the declaration made above by the candidates is true.


ACKNOWLEDGEMENT

We express our profound gratitude to our Chairman,
Dr. B. K. Krishnaraj Vanavarayar, B.Com., B.L., for giving us this
great opportunity to pursue this course.
We express our heartfelt and deep sense of gratitude to our Joint
Correspondent, Mr. K. Shankar Vanavarayar, MBA, PGDIEM, for
giving us this great opportunity to pursue this course.
We extend our gratefulness to Dr. D. Saravanan, Principal, for
providing the necessary facilities to complete our project.
We are deeply obliged to Dr. M. Alamelu, Head of the Department,
Information Technology, for her constant support.
We express our heartfelt and deep sense of gratitude to our
Coordinators, Dr. P. C. Thirumal and Dr. G. S. Nandakumar,
Department of Information Technology, for their continuing support,
patience and words of expertise during the project development work.
We express our heartiest thanks to our guide, Ms. S. Sathyavathi,
Assistant Professor (SRG), Department of Information Technology,
for her incredible support, valuable ideas and suggestions throughout
the project.
TABLE OF CONTENTS

CHAPTER NO    TITLE                                        PAGE NO

              ABSTRACT                                      7
1.            INTRODUCTION                                  8
2.            LITERATURE SURVEY                             9
3.            DATASET                                      11
4.            PROPOSED SYSTEM                              11
5.            SYSTEM DESIGN                                12
6.            ALGORITHMS                                   13
              6.1 K MEANS CLUSTERING ALGORITHM
              6.2 K NEAREST NEIGHBOUR ALGORITHM
              6.3 AFFINITY PROPAGATION CLUSTERING ALGORITHM
7.            SYSTEM REQUIREMENTS                          20
8.            RESULT                                       21
9.            CODING                                       22
10.           SNAPSHOTS                                    40
11.           CONCLUSION                                   47
12.           FUTURE WORK                                  49
13.           REFERENCES                                   51
ABSTRACT

Everyone loves movies, regardless of age, gender, race, colour or
geographical location. In a simple way we are all connected to one
another through this wonderful medium, yet what is most interesting
is how unique our choices and combinations are when it comes to
movie preference. Some people like genre-specific movies, be it a
thriller, romance or sci-fi, whereas others focus on lead actors and
directors. When we take all of this into account, it is astoundingly
difficult to generalize a movie and say that everybody would love it.
Even so, similar movies tend to be liked by a particular segment of
society. This is where we as data scientists come into play and
extract the behavioural patterns not only of the audience but also of
the films themselves. With that in mind, this report proposes a
machine learning approach that uses the K-Means clustering
algorithm, the K Nearest Neighbours algorithm and the Affinity
Propagation clustering algorithm to recommend movies to users.
1. INTRODUCTION

Machine Learning is the field of study that gives computers the
ability to learn without being explicitly programmed. ML is one of
the most exciting technologies one could come across because, as
the name suggests, it gives the computer something that makes it
more human-like: the ability to learn. Machine Learning is actively
used today, perhaps in many more places than one would expect: in
web search engines, in email filters that flag spam, on websites that
make individualized recommendations, in banking software that
detects unusual transactions, and in plenty of apps on our phones,
such as voice recognition.
Recommender systems are systems designed to suggest items to the
user based on many different factors. These systems predict the
products that users are most likely to buy, and companies such as
Netflix and Amazon use recommender systems to help their users
identify the right products or movies. A recommender system deals
with a large volume of data by filtering out the most relevant
information based on a user's preferences and interests. It finds
matches between users and items and imputes similarities between
users and ratings to produce recommendations. Both users and
service providers have benefited from these systems, and the quality
of decision-making has also improved through them.
In our project, by exploring different machine learning algorithms,
namely the K-Means clustering algorithm, the K Nearest Neighbors
algorithm and the Affinity Propagation clustering algorithm, we
recommend the top 20 movies to users based on the ratings users
have given to the movies.

2. LITERATURE SURVEY

1. Movie recommendation system using clustering algorithm and pattern recognition network (2018)
   Aim: Proposed a machine learning approach to recommend movies to users, using the K-Means clustering algorithm to separate similar users and creating a neural network for each cluster.
   Dataset: MovieLens (publicly available), 12,000 users.
   Method: K-Means clustering and a pattern-recognition neural network per cluster.
   Result: The system showed 95% accuracy on average in predicting ratings from new user data, which can be used to analyze which movies should be recommended to new users.
   Conclusion: This proves that the system is a valid one for rating prediction in the field of movies, and ensures that it can deal with different types of users with diverse attitudes towards movies.

2. TV series recommendation using fuzzy inference system, K-Means clustering and adaptive neuro fuzzy inference system (2017)
   Aim: Predict what rating a user might give to a certain TV series by analyzing information about the user and the TV series.
   Dataset: Movie data collected from MovieLens; TV series data from IMDB.
   Method: K-Means clustering and an Adaptive Neuro Fuzzy Inference System (ANFIS).
   Result: The first TV series recommendation system that considers the number of TV series as an input.
   Conclusion: The result is promising as the average rating is significantly lower, but more research can improve the result even further.

3. Analysis of movie recommendation systems, with and without considering the low-rated movies (2020)
   Aim: To show that low-rated movies are not significant in computing movie predictions, so it is advisable to ignore them while calculating predictions.
   Dataset: MovieLens-100k.
   Method: Collaborative filtering with the Pearson correlation coefficient.
   Result: For movie predictions for the user with user-id 254, it is observed that there is no significant difference between the predictions; the negligible difference shows that the effect of removing low-rated movies is negligible, so they can be removed.
   Conclusion: This proves that movies that have never received an above-average rating do not contribute significantly to movie recommendations, and it is suggested to ignore such movies.

4. Group recommendation system for Facebook (2008)
   Aim: To improve quality of service for Facebook users, a group recommendation system (GRS) was developed to find the most suitable group for a user to join by matching user profiles with group identities; Facebook groups can be identified based on their members' profiles.
   Dataset: Collected using the Facebook Platform.
   Method: Hierarchical clustering and decision trees.
   Result: Clustering improved accuracy by 9%.
   Conclusion: The main concept behind the GRS can be used in many different applications, such as an information distribution system based on the profile features of users. As the social networking community expands exponentially, it becomes a challenge to deliver the right information to the right person; if the identity of a user's groups is known, the user can be ensured to receive the information he or she prefers.

3. DATASET

The dataset (ml-latest-small) describes 5-star ratings and free-text
tagging activity from MovieLens (http://movielens.org), a film
recommendation service. It contains 100,004 ratings across 9,125
movies from 671 users. The dataset was generated on October 17,
2016. Users were chosen at random for inclusion, and every chosen
user had rated a minimum of 20 movies. The MovieLens dataset
consists primarily of two files. The first file contains data about the
movies: movie id, movie name and the list of its genres (nineteen
genres in all). The other file consists of user id, movie id and
ratings. These two files are pre-processed and manipulated to build
our system.
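As a minimal sketch of how these two files can be loaded and joined (assuming the local file names Movie.csv and Rate.csv that appear in the coding section), pandas handles both steps directly:

import pandas as pd

# Movie.csv: movieId, title, genres; Rate.csv: userId, movieId, rating
movies = pd.read_csv('Movie.csv')
ratings = pd.read_csv('Rate.csv')

# Inner join on the shared movieId key gives one row per (user, movie) rating
dataset = pd.merge(movies, ratings, how='inner', on='movieId')
print(dataset.head())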
4. PROPOSED SYSTEM

Recommender systems are systems designed to suggest items to the
user based on many different factors. These systems predict the
products that users are most likely to buy, and companies such as
Netflix and Amazon use them. Both users and service providers
have benefited from these systems, and the quality of
decision-making has also improved through them.

In this project, a movie recommendation system using machine
learning algorithms, we explore different algorithms, namely the
K-Means clustering algorithm, the K Nearest Neighbors algorithm
and the Affinity Propagation clustering algorithm, and recommend
the top 20 movies to users based on the ratings users have given to
the movies.

5. SYSTEM DESIGN
In our system, the first and foremost step is data gathering. Once
the dataset is collected it must be pre-processed, since it is
real-world data: there may well be plenty of missing data,
mismatched entries and so on. Using effective functionality, the
dataset is pre-processed and passed on to the following steps. The
data is then ready for the desired algorithms; here we use the
K-Means clustering algorithm, the K Nearest Neighbors algorithm
and the Affinity Propagation clustering algorithm. After applying
each algorithm separately to the dataset, we predict and recommend
20 movies for the users who are yet to watch them, and finally we
conclude by comparing the results of each algorithm.
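A minimal sketch of that cleaning stage, assuming the merged frame from the dataset section and a 'rating' column as in the MovieLens schema:

import pandas as pd

def preprocess(dataset: pd.DataFrame) -> pd.DataFrame:
    # Remove exact duplicate rows left over from the merge
    dataset = dataset.drop_duplicates()
    # Report how many entries are missing in each column
    print(dataset.isnull().sum())
    # Drop rows without a rating rather than imputing them
    return dataset.dropna(subset=['rating'])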
6. ALGORITHMS
6.1 K MEANS CLUSTERING ALGORITHM

K-Means Clustering is an unsupervised learning algorithm used to
solve clustering problems in machine learning and data science. In
this section we cover what the K-Means clustering algorithm is,
how it works, and its Python implementation. It is an iterative
algorithm that divides the unlabeled dataset into k different clusters
in such a way that each data point belongs to only one group of
points with similar properties. It allows us to cluster the data into
different groups and is a convenient way to discover the categories
of groups in an unlabeled dataset on its own, without the need for
any training.

It is a centroid-based algorithm, where each cluster is associated
with a centroid. The main aim of this algorithm is to minimize the
sum of distances between each data point and its corresponding
cluster centroid. The algorithm takes the unlabeled dataset as input,
divides it into k clusters, and repeats the process until it can no
longer improve the clusters. The value of k must be predetermined.

The algorithm works as follows:


1. First, we initialize k points, called means, randomly.
2. We assign each item to its closest mean and update the mean's
coordinates, which are the averages of the items assigned to that
mean so far.
3. We repeat the process for a given number of iterations and, at
the end, we have our clusters.
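As a small illustration of these three steps (the feature matrix below is a made-up example, not the project's data), scikit-learn's KMeans can be applied directly:

import numpy as np
from sklearn.cluster import KMeans

# Toy feature matrix: each row is a user's average rating for two genres
X = np.array([[4.5, 1.0], [4.0, 1.5], [1.0, 4.5], [0.5, 4.0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster assigned to each user
print(kmeans.cluster_centers_)  # the final means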

The goal of this algorithm in our project is to find similarities
within groups of people in order to build a movie recommendation
system for users. We analyze a dataset to explore the characteristics
that people share in their taste in movies, based on how they rate
them. We use a dataset consisting of two files: Rate and Movie.
After clustering, we predict whether a user will like a movie based
on the average of all other users' ratings, and finally we recommend
the highest-rated movies to the users who are yet to watch them.

6.2 K NEAREST NEIGHBORS ALGORITHM

kNN is a machine learning algorithm that finds clusters of similar
users based on common ratings and makes predictions using the
average rating of the top-k nearest neighbours.
We then find the k items that have the most similar user engagement
vectors. In this case, the nearest neighbours of item id 5 = [7, 4, 8, …].
Now let us implement kNN in our movie recommender system. We
use unsupervised algorithms from sklearn.neighbors. The algorithm
we use to compute the nearest neighbours is “brute”, and we specify
“metric=cosine” so that the algorithm will calculate the cosine
similarity between rating vectors. Finally, we fit the model.
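A minimal sketch of that fitting step (the item-user matrix below is a tiny made-up example rather than the real rating matrix):

import numpy as np
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Toy item-user rating matrix: rows are items, columns are users, 0 = unrated
item_user = csr_matrix(np.array([[5, 0, 3],
                                 [4, 0, 4],
                                 [0, 5, 1]]))

# Brute-force search with cosine distance between rating vectors
knn = NearestNeighbors(metric='cosine', algorithm='brute', n_neighbors=2)
knn.fit(item_user)

# Nearest items to item 0 (the first hit is item 0 itself, at distance 0)
distances, indices = knn.kneighbors(item_user[0], n_neighbors=2)
print(indices, distances)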
sklearn.neighbors provides functionality for unsupervised and
supervised neighbors-based learning methods. Unsupervised nearest
neighbors is the foundation of many other learning methods, notably
manifold learning and spectral clustering. Supervised neighbors-based
learning comes in two flavors: classification for data with discrete
labels, and regression for data with continuous labels.

The principle behind nearest neighbor methods is to find a predefined


number of training samples closest in distance to the new point, and
predict the label from these. The number of samples can be a user-
defined constant (k-nearest neighbor learning), or vary based on the
local density of points (radius-based neighbor learning). The distance
can, in general, be any metric measure: standard Euclidean distance is
the most common choice. Neighbors-based methods are known
as non-generalizing machine learning methods, since they simply
“remember” all of their training data (possibly transformed into a fast
indexing structure such as a Ball Tree or KD Tree).

Despite its simplicity, nearest neighbors has been successful in a large


number of classification and regression problems, including
handwritten digits and satellite image scenes. Being a non-parametric
method, it is often successful in classification situations where the
decision boundary is very irregular.

The classes in sklearn.neighbors can handle either NumPy arrays


or scipy.sparse matrices as input. For dense matrices, a large number
of possible distance metrics are supported. For sparse matrices,
arbitrary Minkowski metrics are supported for searches.

There are many learning routines which rely on nearest neighbors at


their core. One example is kernel density estimation, discussed in
the density estimation section.

NearestNeighbors implements unsupervised nearest neighbors


learning. It acts as a uniform interface to three different nearest
neighbors algorithms: BallTree, KDTree, and a brute-force algorithm
based on routines in sklearn.metrics.pairwise. The choice of neighbors
search algorithm is controlled through the keyword 'algorithm', which
must be one of ['auto', 'ball_tree', 'kd_tree', 'brute']. When the default
value 'auto' is passed, the algorithm attempts to determine the best
approach from the training data. For a discussion of the strengths and
weaknesses of each option, see Nearest Neighbor Algorithms.

For the simple task of finding the nearest neighbors between two sets
of data, the unsupervised algorithms within sklearn.neighbors can be
used. Because the query set matches the training set, the nearest
neighbor of each point is the point itself, at a distance of zero.
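A short demonstration of this behaviour on an arbitrary toy array:

import numpy as np
from sklearn.neighbors import NearestNeighbors

X = np.array([[0.0], [1.0], [2.5], [4.0]])
nbrs = NearestNeighbors(n_neighbors=2, algorithm='auto').fit(X)
distances, indices = nbrs.kneighbors(X)  # query set equals training set

# First column: each point's nearest neighbour is itself, at distance 0
print(indices[:, 0])    # [0 1 2 3]
print(distances[:, 0])  # [0. 0. 0. 0.]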

It is also possible to efficiently produce a sparse graph showing the


connections between neighboring points. The dataset is structured
such that points nearby in index order are nearby in parameter space,
leading to an approximately block-diagonal matrix of K-nearest
neighbors. Such a sparse graph is useful in a variety of circumstances
which make use of spatial relationships between points for
unsupervised learning: in particular,
see Isomap, LocallyLinearEmbedding, and SpectralClustering.
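For instance, kneighbors_graph can produce such a connectivity matrix (again on toy data):

import numpy as np
from sklearn.neighbors import kneighbors_graph

# Points nearby in index order are also nearby in parameter space
X = np.array([[0.0], [1.0], [2.0], [9.0], [10.0]])
graph = kneighbors_graph(X, n_neighbors=2, mode='connectivity', include_self=True)
print(graph.toarray())  # an approximately block-diagonal 0/1 matrix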
NEAREST NEIGHBORS ALGORITHMS

BRUTE FORCE

Fast computation of nearest neighbors is an active area of research in


machine learning. The most naive neighbor search implementation
involves the brute-force computation of distances between all pairs of
points in the dataset: for N samples in D dimensions, this approach
scales as O[DN²]. Efficient brute-force neighbors searches can be
very competitive for small data samples. However, as the number of
samples N grows, the brute-force approach quickly becomes
infeasible. In the classes within sklearn.neighbors, brute-force
neighbors searches are specified using the
keyword algorithm = 'brute', and are computed using the routines
available in sklearn.metrics.pairwise.
K-D TREE
To address the computational inefficiencies of the brute-force
approach, a variety of tree-based data structures have been invented.
In general, these structures attempt to reduce the required number of
distance calculations by efficiently encoding aggregate distance
information for the sample. The basic idea is that if point A is very
distant from point B, and point B is very close to point C, then we
know that points A and C are very distant, without having to
explicitly calculate their distance. In this way, the computational cost
of a nearest neighbors search can be reduced to O[D N log(N)] or
better. This is a significant improvement over brute-force for large N.

An early approach to taking advantage of this aggregate information


was the KD tree data structure (short for K-dimensional tree), which
generalizes two-dimensional Quad-trees and 3-dimensional Oct-
trees to an arbitrary number of dimensions. The KD tree is a binary
tree structure which recursively partitions the parameter space along
the data axes, dividing it into nested orthotropic regions into which
data points are filed. The construction of a KD tree is very fast:
because partitioning is performed only along the data axes, no D-
dimensional distances need to be computed. Once constructed, the
nearest neighbor of a query point can be determined with
only O[log(N)] distance computations. Though the KD tree approach
is very fast for low-dimensional (D<20) neighbors searches, it
becomes inefficient as D grows very large: this is one manifestation
of the so-called “curse of dimensionality”. In scikit-learn, KD tree
neighbors searches are specified using the
keyword algorithm = 'kd_tree', and are computed using the
class KDTree.
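As a sketch, the KDTree class can be used directly on toy low-dimensional data, where KD trees work well:

import numpy as np
from sklearn.neighbors import KDTree

rng = np.random.default_rng(0)
X = rng.random((100, 3))            # 100 points in 3 dimensions

tree = KDTree(X, leaf_size=30)
dist, ind = tree.query(X[:1], k=3)  # 3 nearest neighbours of the first point
print(ind)   # indices; the first is the query point itself
print(dist)  # distances; the first is 0.0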
BALL TREE

To address the inefficiencies of KD Trees in higher dimensions,


the ball tree data structure was developed. Where KD trees partition
data along Cartesian axes, ball trees partition data in a series of
nesting hyper-spheres. This makes tree construction more costly than
that of the KD tree, but results in a data structure which can be very
efficient on highly structured data, even in very high dimensions.
A ball tree recursively divides the data into nodes defined by a
centroid C and radius r, such that each point in the node lies within
the hyper-sphere defined by r and C. The number of candidate points
for a neighbor search is reduced through use of the triangle
inequality:

|x + y| ≤ |x| + |y|
With this setup, a single distance calculation between a test point and
the centroid is sufficient to determine a lower and upper bound on the
distance to all points within the node. Because of the spherical
geometry of the ball tree nodes, it can out-perform a KD-tree in high
dimensions, though the actual performance is highly dependent on the
structure of the training data. In scikit-learn, ball-tree-based neighbors
searches are specified using the keyword algorithm = 'ball_tree', and
are computed using the class BallTree. Alternatively, the user can
work with the BallTree class directly.
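And the analogous direct use of BallTree, sketched here on higher-dimensional toy data:

import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.default_rng(0)
X = rng.random((200, 40))           # 200 points in 40 dimensions

tree = BallTree(X, leaf_size=40)
dist, ind = tree.query(X[:1], k=5)  # 5 nearest neighbours of the first point
print(ind[0], dist[0])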

The goal of this algorithm in our project is as follows. We use the
unsupervised learning algorithm known as NearestNeighbors. Since
this algorithm calculates the distance between two points, we pivot
our dataset into an item-user matrix; items are clustered based on
the ratings given to them by users, and items are recommended
based on similar items. We apply the algorithm with the cosine
distance metric, which is very fast and preferable here to the
Pearson coefficient. Finally, the recommendation system works as
follows: a movie name is given as input, the system finds similar
movies, sorts them by their similarity distance, and outputs 20
movies along with their distances.

6.3 AFFINITY PROPAGATION CLUSTERING ALGORITHM

Affinity Propagation creates clusters by sending messages between


data points until convergence. Unlike clustering algorithms such as
k-means or k-medoids, affinity propagation does not require the
number of clusters to be determined or estimated before running the
algorithm. For this purpose, the two important parameters are
the preference, which controls how many exemplars (or prototypes)
are used, and the damping factor, which damps the responsibility
and availability messages to avoid numerical oscillations when
updating these messages.
A dataset is described using a small number of exemplars;
‘exemplars’ are members of the input set that are representative of
clusters. The messages sent between pairs represent the suitability
for one sample to be the exemplar of the other, which is updated in
response to the values from other pairs. This updating happens
iteratively until convergence, at that point the final exemplars are
chosen, and hence we obtain the final clustering.

Algorithm for Affinity Propagation:


Input: a dataset D = {d1, d2, d3, ..., dn} and an N x N matrix s
such that s(i, j) represents the similarity between di and dj. The
negative squared distance between two data points is used as s, i.e.
for points xi and xj, s(i, j) = -||xi - xj||².
The diagonal of s i.e. s(i, i) is particularly important, as it represents
the input preference, meaning how likely a particular input is to
become an exemplar. When it is set to the same value for all inputs,
it controls how many classes the algorithm produces. A value close
to the minimum possible similarity produces fewer classes, while a
value close to or larger than the maximum possible similarity,
produces many classes. It is typically initialized to the median
similarity of all pairs of inputs.
The algorithm proceeds by alternating two message-passing steps
to update two matrices:
 The “responsibility” matrix R has values r(i, k) that quantify
how well-suited xk is to serve as the exemplar for xi, relative
to other candidate exemplars for xi.
 The “availability” matrix A contains values a(i, k) that
represent how “appropriate” it would be for xi to pick xk as
its exemplar, taking into account other points’ preference for
xk as an exemplar.
Both matrices are initialized to all zeroes. The algorithm then
performs the following updates iteratively:
 First, responsibility updates are sent around:
r(i, k) ← s(i, k) − max over k′ ≠ k of { a(i, k′) + s(i, k′) }
 Then, availability is updated per:
a(i, k) ← min(0, r(k, k) + Σ over i′ ∉ {i, k} of max(0, r(i′, k))), for i ≠ k
a(k, k) ← Σ over i′ ≠ k of max(0, r(i′, k))

The iterations are performed until either the cluster boundaries
remain unchanged over a number of iterations, or after some
predetermined number of iterations. The exemplars are extracted
from the final matrices as those whose responsibility plus
availability for themselves is positive, i.e. (r(i, i) + a(i, i)) > 0.
The goal of this algorithm in our project is to recommend the top
20 movies similar to the entered movie. We compute the similarity
matrix using the Pearson correlation and cluster the movies with
affinity propagation; the precomputed similarity matrix and clusters
are then used to recommend the top 20 movies to the users.
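A minimal sketch of this clustering step (the Pearson similarity matrix below is a tiny made-up example; the project computes the real one from co-ratings, as in section 9):

import numpy as np
from sklearn.cluster import AffinityPropagation

# Toy precomputed movie-movie Pearson similarity matrix (symmetric)
sim = np.array([[ 1.0,  0.9, -0.2],
                [ 0.9,  1.0, -0.1],
                [-0.2, -0.1,  1.0]])

# affinity='precomputed' consumes the similarity matrix directly; the
# diagonal acts as the preference that controls how many exemplars emerge
af = AffinityPropagation(affinity='precomputed', damping=0.9, random_state=0).fit(sim)
print(af.labels_)                   # cluster label per movie
print(af.cluster_centers_indices_)  # indices of the exemplar movies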

7. SYSTEM REQUIREMENTS
HARDWARE REQUIREMENTS
 System : i7 Processor
 Ram : 16 GB
SOFTWARE REQUIREMENTS
 Operating system : Windows 10
 Tool: Anaconda Navigator – 64bit
 Scripting Tool: Jupyter Notebook

Jupyter Notebook : The Jupyter Notebook is an open-source web


application that allows you to create and share documents that contain
live code, equations, visualizations and narrative text. Uses include:
data cleaning and transformation, numerical simulation, statistical
modeling, data visualization, machine learning, and much more.

 Language: Python3.8

Python : Python is an interpreted, high-level, general-purpose


programming language. Created by Guido van Rossum and first
released in 1991, Python's design philosophy emphasizes code
readability with its notable use of significant whitespace. Its
language constructs and object-oriented approach aim to help
programmers write clear, logical code for small and large-scale
projects. Python is dynamically typed and garbage-collected. It
supports multiple programming paradigms, including procedural,
object-oriented, and functional programming. Python is often
described as a "batteries included" language due to its comprehensive
standard library.

 Necessary libraries: numpy for numerical computation, pandas
for loading and pre-processing data, matplotlib for visualization of
data, and scikit-learn (sklearn), a machine learning library.

8. RESULT
The datasets used are Rate.csv and Movie.csv, taken from
MovieLens. The data consists of 100,004 ratings given by 671
users for 9,077 movies; each user rated a minimum of about 20
movies. The snapshots below show the movies recommended to the
users. After obtaining the results, we check which system is more
efficient for recommendation by comparing them.
9. CODING

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

from mpl_toolkits.axes_grid1 import make_axes_locatable

from sklearn.cluster import KMeans
from sklearn.metrics import mean_squared_error
import itertools
from sklearn.metrics import silhouette_samples, silhouette_score
from scipy.sparse import csr_matrix

movies = pd.read_csv('Movie.csv')
ratings = pd.read_csv('Rate.csv')
dataset = pd.merge(movies, ratings, how ='inner', on ='movieId')
dataset.head()
print('The dataset contains: ', len(ratings), ' ratings of ', len(movies), ' movies.')
dataset.shape
dataset.nunique()
unique_user = ratings.userId.nunique(dropna=True)
unique_movie = ratings.movieId.nunique(dropna=True)
print("number of unique user:")
print(unique_user)
print("number of unique movies:")
print(unique_movie)
dataset = dataset.drop_duplicates()
print(dataset)
dataset.describe()
dataset.isnull()
dataset.isnull().sum()

x = dataset.genres
a = list()
for i in x:
    abc = i
    a.append(abc.split('|'))
a = pd.DataFrame(a)
b = a[0].unique()
for i in b:
    dataset[i] = 0
dataset.head(2000)

for i in b:
    dataset.loc[dataset['genres'].str.contains(i), i] = 1
dataset.head(2000)

dataset = dataset.drop(['genres', 'title'], axis=1)
dataset.head()
a = dataset
a = a.groupby('movieId')["rating"].mean()
a
sorted_ratings_wise_movie = a.sort_values(ascending=False)
sorted_ratings_wise_movie

def get_genre_ratings(ratings, movies, genres, column_names):
    genre_ratings = pd.DataFrame()
    for genre in genres:
        genre_movies = movies[movies['genres'].str.contains(genre)]
        avg_genre_votes_per_user = ratings[ratings['movieId'].isin(genre_movies['movieId'])].loc[:, ['userId', 'rating']].groupby(['userId'])['rating'].mean().round(2)
        genre_ratings = pd.concat([genre_ratings, avg_genre_votes_per_user], axis=1)
    print(genre_ratings)
    genre_ratings.columns = column_names
    return genre_ratings

genre_ratings = get_genre_ratings(ratings, movies, ['Romance', 'Sci-Fi', 'Comedy'],
                                  ['avg_romance_rating', 'avg_scifi_rating', 'avg_comedy_rating'])
genre_ratings.head()

def bias_genre_rating_dataset(genre_ratings, score_limit_1, score_limit_2):
    biased_dataset = genre_ratings[((genre_ratings['avg_romance_rating'] < score_limit_1 - 0.2) & (genre_ratings['avg_scifi_rating'] > score_limit_2)) | ((genre_ratings['avg_scifi_rating'] < score_limit_1) & (genre_ratings['avg_romance_rating'] > score_limit_2))]
    biased_dataset = pd.concat([biased_dataset[:300], genre_ratings[:2]])
    biased_dataset = pd.DataFrame(biased_dataset.to_records())
    return biased_dataset

biased_dataset = bias_genre_rating_dataset(genre_ratings, 3.2, 2.5)
print("Number of records: ", len(biased_dataset))
biased_dataset.head()

X = biased_dataset[['avg_scifi_rating', 'avg_romance_rating', 'avg_comedy_rating']].values
df = biased_dataset[['avg_scifi_rating', 'avg_romance_rating', 'avg_comedy_rating']]
possible_k_values = range(2, len(X)+1, 5)

def clustering_errors(k, data):
    kmeans = KMeans(n_clusters=k).fit(data)
    predictions = kmeans.predict(data)
    # cluster_centers = kmeans.cluster_centers_
    # errors = [mean_squared_error(row, cluster_centers[cluster]) for row, cluster in zip(data.values, predictions)]
    # return sum(errors)
    silhouette_avg = silhouette_score(data, predictions)
    return silhouette_avg

errors_per_k = [clustering_errors(k, X) for k in possible_k_values]

fig, ax = plt.subplots(figsize=(16, 6))


plt.plot(possible_k_values, errors_per_k)
xticks = np.arange(min(possible_k_values), max(possible_k_values)+1, 5.0)
ax.set_xticks(xticks, minor=False)
ax.set_xticks(xticks, minor=True)
ax.xaxis.grid(True, which='both')
yticks = np.arange(round(min(errors_per_k), 2), max(errors_per_k), .05)
ax.set_yticks(yticks, minor=False)
ax.set_yticks(yticks, minor=True)
ax.yaxis.grid(True, which='both')

ratings_title = pd.merge(ratings, movies[['movieId', 'title']], on='movieId' )


user_movie_ratings = pd.pivot_table(ratings_title, index='userId', columns= 'title', values='rating')

print('dataset dimensions: ', user_movie_ratings.shape, '\n\nSubset example:')


user_movie_ratings.iloc[:6, :10]

def sort_by_rating_density(user_movie_ratings, n_movies, n_users):
    most_rated_movies = get_most_rated_movies(user_movie_ratings, n_movies)
    most_rated_movies = get_users_who_rate_the_most(most_rated_movies, n_users)
    return most_rated_movies

def get_most_rated_movies(user_movie_ratings, max_number_of_movies):
    # 1- Count
    user_movie_ratings = user_movie_ratings.append(user_movie_ratings.count(), ignore_index=True)
    # 2- Sort
    user_movie_ratings_sorted = user_movie_ratings.sort_values(len(user_movie_ratings)-1, axis=1, ascending=False)
    user_movie_ratings_sorted = user_movie_ratings_sorted.drop(user_movie_ratings_sorted.tail(1).index)
    # 3- Slice
    most_rated_movies = user_movie_ratings_sorted.iloc[:, :max_number_of_movies]
    return most_rated_movies

def get_users_who_rate_the_most(most_rated_movies, max_number_of_movies):
    # Get most voting users
    # 1- Count
    most_rated_movies['counts'] = pd.Series(most_rated_movies.count(axis=1))
    # 2- Sort
    most_rated_movies_users = most_rated_movies.sort_values('counts', ascending=False)
    # 3- Slice
    most_rated_movies_users_selection = most_rated_movies_users.iloc[:max_number_of_movies, :]
    most_rated_movies_users_selection = most_rated_movies_users_selection.drop(['counts'], axis=1)
    return most_rated_movies_users_selection

n_movies = 30
n_users = 18
most_rated_movies_users_selection = sort_by_rating_density(user_movie_ratings, n_movies, n_users)

print('dataset dimensions: ', most_rated_movies_users_selection.shape)


most_rated_movies_users_selection.head()

def draw_movies_heatmap(most_rated_movies_users_selection, axis_labels=True):
    # Reverse to match the order of the printed dataframe
    # most_rated_movies_users_selection = most_rated_movies_users_selection.iloc[::-1]
    fig = plt.figure(figsize=(15, 4))
    ax = plt.gca()

    # Draw heatmap
    heatmap = ax.imshow(most_rated_movies_users_selection, interpolation='nearest', vmin=0, vmax=5, aspect='auto')

    if axis_labels:
        ax.set_yticks(np.arange(most_rated_movies_users_selection.shape[0]), minor=False)
        ax.set_xticks(np.arange(most_rated_movies_users_selection.shape[1]), minor=False)
        ax.invert_yaxis()
        ax.xaxis.tick_top()
        labels = most_rated_movies_users_selection.columns.str[:40]
        ax.set_xticklabels(labels, minor=False)
        ax.set_yticklabels(most_rated_movies_users_selection.index, minor=False)
        plt.setp(ax.get_xticklabels(), rotation=90)
    else:
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)

    ax.grid(False)
    ax.set_ylabel('User id')

    # Separate heatmap from color bar
    divider = make_axes_locatable(ax)
    cax = divider.append_axes("right", size="5%", pad=0.05)

    # Color bar
    cbar = fig.colorbar(heatmap, ticks=[5, 4, 3, 2, 1, 0], cax=cax)
    cbar.ax.set_yticklabels(['5 stars', '4 stars', '3 stars', '2 stars', '1 stars', '0 stars'])
    plt.show()

draw_movies_heatmap(most_rated_movies_users_selection)

user_movie_ratings = pd.pivot_table(ratings_title, index='userId', columns= 'title', values='rating')


most_rated_movies_1k = get_most_rated_movies(user_movie_ratings, 1000)

def sparse_clustering_errors(k, data):
    kmeans = KMeans(n_clusters=k).fit(data)
    predictions = kmeans.predict(data)
    cluster_centers = kmeans.cluster_centers_
    errors = [mean_squared_error(row, cluster_centers[cluster]) for row, cluster in zip(data, predictions)]
    return sum(errors)

sparse_ratings = csr_matrix(pd.SparseDataFrame(most_rated_movies_1k).to_coo())

def draw_movie_clusters(clustered, max_users, max_movies):
    c = 1
    for cluster_id in clustered.group.unique():
        # To improve visibility, we're showing at most max_users users and max_movies movies per cluster.
        # You can change these values to see more users & movies per cluster
        d = clustered[clustered.group == cluster_id].drop(['index', 'group'], axis=1)
        n_users_in_cluster = d.shape[0]

        d = sort_by_rating_density(d, max_movies, max_users)
        d = d.reindex_axis(d.mean().sort_values(ascending=False).index, axis=1)
        d = d.reindex_axis(d.count(axis=1).sort_values(ascending=False).index)
        d = d.iloc[:max_users, :max_movies]
        n_users_in_plot = d.shape[0]

        # We're only selecting to show clusters that have more than 9 users, otherwise they're less interesting
        if len(d) > 9:
            print('cluster # {}'.format(cluster_id))
            print('# of users in cluster: {}.'.format(n_users_in_cluster), '# of users in plot: {}'.format(n_users_in_plot))
            fig = plt.figure(figsize=(15, 4))
            ax = plt.gca()

            ax.invert_yaxis()
            ax.xaxis.tick_top()
            labels = d.columns.str[:40]

            ax.set_yticks(np.arange(d.shape[0]), minor=False)
            ax.set_xticks(np.arange(d.shape[1]), minor=False)
            ax.set_xticklabels(labels, minor=False)
            ax.get_yaxis().set_visible(False)

            # Heatmap
            heatmap = plt.imshow(d, vmin=0, vmax=5, aspect='auto')

            ax.set_xlabel('movies')
            ax.set_ylabel('User id')

            divider = make_axes_locatable(ax)
            cax = divider.append_axes("right", size="5%", pad=0.05)

            # Color bar
            cbar = fig.colorbar(heatmap, ticks=[5, 4, 3, 2, 1, 0], cax=cax)
            cbar.ax.set_yticklabels(['5 stars', '4 stars', '3 stars', '2 stars', '1 stars', '0 stars'])

            plt.setp(ax.get_xticklabels(), rotation=90, fontsize=9)
            plt.tick_params(axis='both', which='both', bottom='off', top='off', left='off', labelbottom='off', labelleft='off')
            # print('cluster # {} \n(Showing at most {} users and {} movies)'.format(cluster_id, max_users, max_movies))

            plt.show()


import helper
import importlib
importlib.reload(helper)

predictions = KMeans(n_clusters=20, algorithm='full').fit_predict(sparse_ratings)

max_users = 70
max_movies = 50

clustered = pd.concat([most_rated_movies_1k.reset_index(), pd.DataFrame({'group': predictions})], axis=1)
helper.draw_movie_clusters(clustered, max_users, max_movies)

cluster_number = 4
n_users = 75
n_movies = 300
cluster = clustered[clustered.group == cluster_number].drop(['index', 'group'], axis=1)
cluster = sort_by_rating_density(cluster, n_movies, n_users)
draw_movies_heatmap(cluster, axis_labels=False)
cluster.fillna('').head()

movie_name = "Blues Brothers, The (1980)"
cluster[movie_name].mean()
cluster.mean().head(20)

user_id = 2
user_2_ratings = cluster.loc[user_id, :]
user_2_unrated_movies = user_2_ratings[user_2_ratings.isnull()]
avg_ratings = pd.concat([user_2_unrated_movies, cluster.mean()], axis=1, join='inner').loc[:,0]
avg_ratings.sort_values(ascending=False)[:20]

from sklearn.neighbors import NearestNeighbors


import seaborn as sns
dataset = ratings.pivot(index='movieId',columns='userId',values='rating')
dataset.head()
dataset.fillna(0,inplace=True)
dataset.head()
no_user_voted = ratings.groupby('movieId')['rating'].agg('count')
no_movies_voted = ratings.groupby('userId')['rating'].agg('count')
f,ax = plt.subplots(1,1,figsize=(16,4))
# ratings['rating'].plot(kind='hist')
plt.scatter(no_user_voted.index,no_user_voted,color='mediumseagreen')
plt.axhline(y=10,color='r')
plt.xlabel('MovieId')
plt.ylabel('No. of users voted')
plt.show()
dataset = dataset.loc[no_user_voted[no_user_voted > 10].index, :]
f, ax = plt.subplots(1, 1, figsize=(16, 4))
plt.scatter(no_movies_voted.index, no_movies_voted, color='mediumseagreen')
plt.axhline(y=50, color='r')
plt.xlabel('UserId')
plt.ylabel('No. of votes by user')
plt.show()
dataset=dataset.loc[:,no_movies_voted[no_movies_voted > 50].index]
dataset
sample = np.array([[0,0,3,0,0],[4,0,0,0,2],[0,0,0,0,1]])
sparsity = 1.0 - ( np.count_nonzero(sample) / float(sample.size) )
print(sparsity)
csr_sample = csr_matrix(sample)
print(csr_sample)
csr_data = csr_matrix(dataset.values)
dataset.reset_index(inplace=True)
knn = NearestNeighbors(metric='cosine', algorithm='brute', n_neighbors=20, n_jobs=-1)
knn.fit(csr_data)
def get_movie_recommendation(movie_name):
    n_movies_to_reccomend = 20
    movie_list = movies[movies['title'].str.contains(movie_name)]
    if len(movie_list):
        movie_idx = movie_list.iloc[0]['movieId']
        movie_idx = dataset[dataset['movieId'] == movie_idx].index[0]
        distances, indices = knn.kneighbors(csr_data[movie_idx], n_neighbors=n_movies_to_reccomend+1)
        rec_movie_indices = sorted(list(zip(indices.squeeze().tolist(), distances.squeeze().tolist())), key=lambda x: x[1])[:0:-1]
        recommend_frame = []
        for val in rec_movie_indices:
            movie_idx = dataset.iloc[val[0]]['movieId']
            idx = movies[movies['movieId'] == movie_idx].index
            recommend_frame.append({'Title': movies.iloc[idx]['title'].values[0], 'Distance': val[1]})
        df = pd.DataFrame(recommend_frame, index=range(1, n_movies_to_reccomend+1))
        return df
    else:
        return "No movies found. Please check your input"


get_movie_recommendation('Blues Brothers, The')

import numpy as np
import pickle
from sklearn.cluster import AffinityPropagation
import math
import sys
from time import sleep

train_data = []
test_data = []
users = {}
movies = {}

def get_data():
    train_data = np.genfromtxt("dataset.csv", delimiter=',', skip_header=1)

    user_id = list(set(train_data[:, 0]))
    user_id.sort()
    movie_id = list(set(train_data[:, 1]))
    movie_id.sort()
    users = {}
    movies = {}
    for i, j in enumerate(user_id):
        users[j] = i
    for i, j in enumerate(movie_id):
        # print(i)
        movies[j] = i

    user_item = np.empty((len(set(train_data[:, 0])), len(set(train_data[:, 1]))))

    for row in train_data:
        i = users[int(row[0])]
        j = movies[int(row[1])]
        user_item[i][j] = row[2]
    return (train_data, users, movies, user_item)

def find_similarity():
    movie_sim = np.zeros([len(movies.keys()), len(movies.keys())])
    for st, m1 in enumerate(movies.keys()):
        if st % 1000 == 0:
            print('in movie', st)

        # r1 - Average rating of movie m1
        r1 = np.average(user_item[:, movies[m1]])
        u_m1 = np.where(user_item[:, movies[m1]] != 0)

        for j in range(st, len(movies.keys())):
            m2 = 1
            myArr = list(movies.keys())
            m2 = myArr[j]
            r2 = np.average(user_item[:, movies[m2]])
            u_m2 = np.where(user_item[:, movies[m2]] != 0)
            u = list(set(u_m1[0]).intersection(set(u_m2[0])))

            if len(u) != 0:
                co_ratings = user_item[np.ix_(u, [int(movies[m1]), int(movies[m2])])]
                num = sum((co_ratings[:, 0] - r1) * (co_ratings[:, 1] - r2))
                den = ((sum((co_ratings[:, 0] - r1)**2))**0.5) * ((sum((co_ratings[:, 1] - r2)**2))**0.5)
                corr = num * 1.0 / den
                movie_sim[st][j] = corr
                if j != st:
                    movie_sim[j][st] = corr

    return movie_sim

def compute_reco(act_user, act_mov):
    user = users[act_user]
    movie = movies[act_mov]
    clus = clus_labels[movie]
    clus_movie = np.where(clus_labels == clus)
    user_rated_movies = np.where(user_item[user] != 0)
    rated_movies = list(set(clus_movie[0]).intersection(set(user_rated_movies[0])))
    if movie in rated_movies:
        rated_movies.remove(movie)
    clus_movie = np.delete(clus_movie, np.where(clus_movie[0] == movie))
    ratings = user_item[user, rated_movies]
    # dtype = [('movie_num', int), ('rating', float), ('W', int)]
    reco_ratings = np.zeros([len(clus_movie), 3])
    for j, m in enumerate(clus_movie):
        if m in user_rated_movies[0]:
            pred_rating = user_item[user, m]
            reco_ratings[j] = [0, m, pred_rating]
        else:
            sim = sim_mat[m, rated_movies]
            rated = np.column_stack((ratings, sim))
            pred_rating = np.dot(rated[:, 0], rated[:, 1]) * 1.0 / sum(rated[:, 1])
            pred_rating = round(pred_rating * 2) / 2
            if math.isnan(pred_rating):
                pred_rating = 0
            reco_ratings[j] = [1, m, pred_rating]
    not_watched_ind = np.where(reco_ratings[:, 0] == 1)
    if len(not_watched_ind[0]) > 10:
        not_watched = reco_ratings[not_watched_ind[0]]
        reco_movies = not_watched[np.argsort(not_watched[:, 2])][::-1]
        reco_movies = reco_movies[0:10]
    elif len(reco_ratings) > 10:
        reco_movies = reco_ratings[np.argsort(reco_ratings[:, 2])][::-1]
        reco_movies = reco_movies[0:10]
    else:
        reco_movies = reco_ratings[np.argsort(reco_ratings[:, 2])][::-1]
    final_list = [0] * len(reco_movies)
    for k, i in enumerate(reco_movies):
        final_list[k] = next(key for key, value in movies.items() if value == i[1])

    return final_list

prompt = ("Do you want to compute similarity matrix and cluster?\n"
          " Enter Y - To compute the components\n"
          " Enter N - To use precomputed components\n"
          " Enter ex to stop execution\n"
          "Input - ")
while True:
    inp_choice = input(prompt)
    if inp_choice.lower() == 'y':
        train_data, users, movies, user_item = get_data()
        print("Data read complete\n Computing similarity...")

        sim_mat = find_similarity()
        print("Similarity matrix computed\n Movies being clustered...")

        np.savetxt("sim_mat_Pearson.csv", sim_mat, delimiter=',')

        af = AffinityPropagation(verbose=True, affinity="precomputed").fit(sim_mat)
        clus_labels = af.labels_
        print("Movies clustered")

        break
    elif inp_choice.lower() == 'n':
        print("Loading precomputed components...")
        data2 = []
        with open("Pickle_file", "rb") as f:
            for _ in range(pickle.load(f)):
                data2.append(pickle.load(f))
        sim_mat = np.genfromtxt("sim_mat_Pearson.csv", delimiter=",")
        train_data = data2[0]
        # test_data = data2[1]
        users = data2[1]
        movies = data2[2]
        user_item = data2[3]
        clus_labels = data2[4]
        print("\nPrecomputed components loaded")
        break
    elif inp_choice.lower() == "ex":
        sys.exit("Program stopped as requested")
    else:
        print("Invalid input")
        continue

print("\n1st 100 User ids =", list(users.keys())[0:100])
print("\n1st 100 Movie ids =", list(movies.keys())[0:100])

while True:
    inp = input("\nEnter user id and movie id separated by comma - ")
    if inp == "":
        break
    else:
        act_user, act_mov = inp.split(',')
        act_user = int(act_user)
        act_mov = int(act_mov)
        final_list = compute_reco(act_user, act_mov)
        print("Top %d movies for user %d similar to movie %d \n" % (len(final_list), act_user, act_mov))
        print(final_list)
        inp1 = input("Do you want to continue? Y/N - ")
        if inp1.lower() == 'y':
            continue
        else:
            break

10. SNAPSHOTS
11. CONCLUSION

In this project, we used the K-Means clustering algorithm, the kNN
algorithm and the Affinity Propagation clustering algorithm to make
movie recommendations as good as possible. User ratings and
preferences were considered while building the system. Our system
predicts ratings from user information, which can be used to decide
which movies ought to be suggested to new users, using three
machine learning algorithms. This shows that our system is a valid
one for prediction in the field of movies. Finally, we compared the
results of the three algorithms and found that we got nearly the
same results with different execution times. We therefore conclude
that the Affinity Propagation clustering algorithm is more efficient
than the other two algorithms.

[Chart: comparison of the three algorithms (K-Means Clustering, K Nearest Neighbors (unsupervised) and Affinity Propagation Clustering) on accuracy, efficiency and execution time, on a 0%-100% scale]
12. FUTURE WORK

Neural Networks and Deep Learning have been all the rage the
last couple of years in many different fields, and it appears that
they are also helpful for solving recommendation system
problems.

Ben Allison, a Principal Machine Learning Scientist at Amazon,


gave a great talk earlier this year at Amazon’s re:MARS conference
about building recommender systems using Recurrent Neural
Networks and Deep Learning.

One of the benefits of Deep Learning is similar to matrix


factorization, in that there is an ability to derive latent attributes. Deep
Learning, however, can make up for some of the weaknesses of
matrix factorization such as the inability to include time in the model
— which standard matrix factorization isn’t designed for. Deep
Learning, however, can utilize Recurrent Neural Networks which are
specifically designed for time and sequence data.

Incorporating time into a recommender system is important because
there are often seasonal effects on preferences. For example, it is
likely that in December more people are going to be watching
holiday-themed movies and buying home decorations.

Another point that Ben Allison brought up is the need to see what
would happen if a customer was shown a sub-optimal
recommendation. This is taking a reinforcement learning approach,
since the goal in this case would be to show customers a
recommendation, and then record what the customer does. At times,
customers can be recommended something that does not seem like the
best option, just to see how the customer reacts which will improve
the learning in the long-term.

Recommender systems can be a very powerful tool in a company’s


arsenal, and future developments are going to increase business value
even further. Some of the applications include being able to
anticipate seasonal purchases based on recommendations, determine
important purchases, and give better recommendations to customers
which can increase retention and brand loyalty.

Most businesses will have some use for recommender systems, and I
encourage everyone to learn more about this fascinating area.
13. REFERENCES

[1] Chen, Hung-Chen, and Arbee L. P. Chen. “A music recommendation system based on music data grouping and user interests.” Proceedings of the Tenth International Conference on Information and Knowledge Management - CIKM '01, 2001, doi:10.1145/502585.502625.
[2] Ahmed, Muyeed, et al. “TV Series Recommendation Using Fuzzy Inference System, K-Means Clustering and Adaptive Neuro Fuzzy Inference System.” 2017, pp. 1512-1519.
[3] Park, Moon-Hee, et al. “Location-Based Recommendation System Using Bayesian User's Preference Model in Mobile Devices.” Ubiquitous Intelligence and Computing, Lecture Notes in Computer Science, pp. 1130-1139, doi:10.1007/978-3-540-73549-6_110.
[4] Huang, Yao-Chang, and Shyh-Kang Jenor. “An audio recommendation system based on audio signature description scheme in MPEG-7 Audio.” 2004 IEEE International Conference on Multimedia and Expo (ICME) (IEEE Cat. No.04TH8763), doi:10.1109/icme.2004.1394273.
[5] Baatarjav, Enkh-Amgalan, et al. “Group Recommendation System for Facebook.” On the Move to Meaningful Internet Systems: OTM 2008 Workshops, Lecture Notes in Computer Science, 2008, pp. 211-219, doi:10.1007/978-3-540-88875-8_41.
[6] Kumar, S. Gupta, S. K. Singh and K. K. Shukla, “Comparison of various metrics used in collaborative filtering for recommendation system,” 2015 Eighth International Conference on Contemporary Computing (IC3), Noida, 2015, pp. 150-154. Publisher: IEEE.
[7] M. K. Kharita, A. Kumar and P. Singh, “Item-Based Collaborative Filtering in Movie Recommendation in Real time,” 2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC), Jalandhar, India, 2018, pp. 340-342. Publisher: IEEE.
[8] J. Lund and Y. Ng, “Movie Recommendations Using the Deep Learning Approach,” 2018 IEEE International Conference on Information Reuse and Integration (IRI), Salt Lake City, UT, 2018, pp. 47-54. Publisher: IEEE.
[9] M. Ilhami and Suharjito, “Film recommendation systems using matrix factorization and collaborative filtering,” 2014 International Conference on Information Technology Systems and Innovation (ICITSI), Bandung, 2014, pp. 1-6. Publisher: IEEE.
