Assignment 3 AI

The document describes how to build a movie recommendation system using artificial intelligence techniques. It involves 5 steps: 1) creating a data file of user ratings, 2) computing the Euclidean distance score between users, 3) computing the Pearson correlation score, 4) finding similar users based on their Pearson scores, and 5) generating movie recommendations for a given user based on the ratings of similar users. Code examples are provided for each step to demonstrate how to calculate the scores and recommendations programmatically.

Uploaded by

Imraan Imraan
Copyright
© All Rights Reserved

ARTIFICIAL INTELLIGENCE

Assignment # 3

Submitted by:

M Waqas Arif

70067876

Section T

DECEMBER 15, 2020


Artificial Intelligence

Generating movie recommendations


Let's see how to build the recommendation system step by step.

1- Creating a data file of user ratings named movie_ratings.json:


{
    "John Carson":
    {
        "Inception": 2.5,
        "Pulp Fiction": 3.5,
        "Anger Management": 3.0,
        "Fracture": 3.5,
        "Serendipity": 2.5,
        "Jerry Maguire": 3.0
    },
    "Michelle Peterson":
    {
        "Inception": 3.0,
        "Pulp Fiction": 3.5,
        "Anger Management": 1.5,
        "Fracture": 5.0,
        "Jerry Maguire": 3.0,
        "Serendipity": 3.5
    },
    "William Reynolds":
    {
        "Inception": 2.5,
        "Pulp Fiction": 3.0,
        "Fracture": 3.5,
        "Jerry Maguire": 4.0
    },
    "Jillian Hobart":
    {
        "Pulp Fiction": 3.5,
        "Anger Management": 3.0,
        "Jerry Maguire": 4.5,
        "Fracture": 4.0,
        "Serendipity": 2.5
    },
    "Melissa Jones":
    {
        "Inception": 3.0,
        "Pulp Fiction": 4.0,
        "Anger Management": 2.0,
        "Fracture": 3.0,
        "Jerry Maguire": 3.0,
        "Serendipity": 2.0
    },
    "Alex Roberts":
    {
        "Inception": 3.0,
        "Pulp Fiction": 4.0,
        "Jerry Maguire": 3.0,
        "Fracture": 5.0,
        "Serendipity": 3.5
    },
    "Michael Henry":
    {
        "Pulp Fiction": 4.5,
        "Serendipity": 1.0,
        "Fracture": 4.0
    }
}
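
Before computing any scores it can help to confirm that the file parses and that the ratings look sane. The sketch below is only an illustration: it inlines a two-user excerpt of the data so that it runs on its own, the helper name validate_ratings is hypothetical, and the 0 to 5 rating scale is an assumption inferred from the values in the file.

```python
import json

# Two-user excerpt of movie_ratings.json, inlined so the sketch is self-contained.
RAW = """
{
    "William Reynolds": {"Inception": 2.5, "Pulp Fiction": 3.0,
                         "Fracture": 3.5, "Jerry Maguire": 4.0},
    "Michael Henry": {"Pulp Fiction": 4.5, "Serendipity": 1.0, "Fracture": 4.0}
}
"""

def validate_ratings(dataset, low=0.0, high=5.0):
    """Return (user, movie, rating) entries that are not numbers in [low, high].

    The default bounds are an assumed rating scale, not something the data file states.
    """
    problems = []
    for user, ratings in dataset.items():
        for movie, rating in ratings.items():
            if not isinstance(rating, (int, float)) or not low <= rating <= high:
                problems.append((user, movie, rating))
    return problems

data = json.loads(RAW)

# Movies rated by both users; every similarity score is computed over this overlap.
common = sorted(set(data["William Reynolds"]) & set(data["Michael Henry"]))

print(validate_ratings(data))  # an empty list means every rating is in range
print(common)
```

Only the movies in the overlap contribute to the similarity scores in the following steps, so a pair of users with no movies in common gets a score of 0.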

2- Computing the Euclidean distance score


Code:
import json
import numpy as np

# Returns the Euclidean distance score between user1 and user2
def euclidean_score(dataset, user1, user2):
    if user1 not in dataset:
        raise TypeError('User ' + user1 + ' not present in the dataset')

    if user2 not in dataset:
        raise TypeError('User ' + user2 + ' not present in the dataset')

    # Movies rated by both user1 and user2
    rated_by_both = {}

    for item in dataset[user1]:
        if item in dataset[user2]:
            rated_by_both[item] = 1

    # If there are no common movies, the score is 0
    if len(rated_by_both) == 0:
        return 0

    squared_differences = []

    for item in dataset[user1]:
        if item in dataset[user2]:
            squared_differences.append(np.square(dataset[user1][item] - dataset[user2][item]))

    return 1 / (1 + np.sqrt(np.sum(squared_differences)))

if __name__ == '__main__':
    data_file = 'movie_ratings.json'
    with open(data_file, 'r') as f:
        data = json.loads(f.read())

    user1 = 'John Carson'
    user2 = 'Michelle Peterson'

    print("\nEuclidean score:")
    print(euclidean_score(data, user1, user2))

Output:
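
The score for this pair of users can be cross-checked by hand. The squared rating differences over the six common movies add up to 5.75, so the score is 1 / (1 + sqrt(5.75)), roughly 0.294. The sketch below repeats that calculation with the standard library only; the function name euclidean_similarity is hypothetical and the two rating dictionaries are copied from the data file.

```python
import math

# Ratings copied from movie_ratings.json for the two users compared in this step.
john = {"Inception": 2.5, "Pulp Fiction": 3.5, "Anger Management": 3.0,
        "Fracture": 3.5, "Serendipity": 2.5, "Jerry Maguire": 3.0}
michelle = {"Inception": 3.0, "Pulp Fiction": 3.5, "Anger Management": 1.5,
            "Fracture": 5.0, "Jerry Maguire": 3.0, "Serendipity": 3.5}

def euclidean_similarity(ratings1, ratings2):
    """Map the Euclidean distance over commonly rated movies into (0, 1]."""
    common = ratings1.keys() & ratings2.keys()
    if not common:
        return 0.0  # no overlap, so no evidence of similarity
    distance = math.sqrt(sum((ratings1[m] - ratings2[m]) ** 2 for m in common))
    return 1 / (1 + distance)

score = euclidean_similarity(john, michelle)
print(round(score, 4))  # 0.2943
```

A distance of 0 (identical ratings) maps to a score of 1, and the score approaches 0 as the ratings move further apart.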

3- Computing the Pearson correlation score

Code:
import json
import numpy as np

# Returns the Pearson correlation score between user1 and user2
def pearson_score(dataset, user1, user2):
    if user1 not in dataset:
        raise TypeError('User ' + user1 + ' not present in the dataset')

    if user2 not in dataset:
        raise TypeError('User ' + user2 + ' not present in the dataset')

    # Movies rated by both user1 and user2
    rated_by_both = {}

    for item in dataset[user1]:
        if item in dataset[user2]:
            rated_by_both[item] = 1

    num_ratings = len(rated_by_both)

    # If there are no common movies, the score is 0
    if num_ratings == 0:
        return 0

    # Compute the sum of ratings of all the common preferences
    user1_sum = np.sum([dataset[user1][item] for item in rated_by_both])
    user2_sum = np.sum([dataset[user2][item] for item in rated_by_both])

    # Compute the sum of squared ratings of all the common preferences
    user1_squared_sum = np.sum([np.square(dataset[user1][item]) for item in rated_by_both])
    user2_squared_sum = np.sum([np.square(dataset[user2][item]) for item in rated_by_both])

    # Compute the sum of products of the common ratings
    product_sum = np.sum([dataset[user1][item] * dataset[user2][item]
                          for item in rated_by_both])

    # Compute the Pearson correlation
    Sxy = product_sum - (user1_sum * user2_sum / num_ratings)
    Sxx = user1_squared_sum - np.square(user1_sum) / num_ratings
    Syy = user2_squared_sum - np.square(user2_sum) / num_ratings

    # A user whose common ratings are all identical has no defined correlation
    if Sxx * Syy == 0:
        return 0

    return Sxy / np.sqrt(Sxx * Syy)

if __name__ == '__main__':
    data_file = 'movie_ratings.json'

    with open(data_file, 'r') as f:
        data = json.loads(f.read())

    user1 = 'John Carson'
    user2 = 'Michelle Peterson'

    print("\nPearson score:")
    print(pearson_score(data, user1, user2))

Output:
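
The Sxy, Sxx and Syy terms in the code are the usual shortcut forms of the mean-centred sums, so the same score can be obtained directly from the textbook definition. The sketch below does that for the same two users as a cross-check; the function name pearson_from_definition is hypothetical, and the two lists hold the six common ratings in the same movie order.

```python
import math

def pearson_from_definition(xs, ys):
    """Pearson correlation computed from mean-centred values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # constant ratings: the correlation is undefined, report 0
    return sxy / math.sqrt(sxx * syy)

# Movie order: Inception, Pulp Fiction, Anger Management, Fracture,
# Serendipity, Jerry Maguire
john = [2.5, 3.5, 3.0, 3.5, 2.5, 3.0]
michelle = [3.0, 3.5, 1.5, 5.0, 3.5, 3.0]

score = pearson_from_definition(john, michelle)
print(round(score, 3))  # 0.396
```

For this pair Sxy = 1.0, Sxx = 1.0 and Syy = 6.375, which gives 1 / sqrt(6.375), roughly 0.396. Unlike the Euclidean score, the Pearson score ranges from -1 to 1 and is not affected by one user rating consistently higher or lower than the other.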

4- Finding similar users in the dataset

Code:
import json

# Assumes the code from step 3 is saved as pearson_score.py
from pearson_score import pearson_score

# Finds a specified number of users who are similar to the input user
def find_similar_users(dataset, user, num_users):
    if user not in dataset:
        raise TypeError('User ' + user + ' not present in the dataset')

    # Compute Pearson scores for all the other users.
    # A list of (name, score) tuples keeps the scores numeric; a NumPy array
    # mixing names and scores would store the scores as strings and sort
    # them as text.
    scores = [(x, pearson_score(dataset, user, x)) for x in dataset if user != x]

    # Sort the scores in decreasing order (highest score first)
    scores_sorted = sorted(scores, key=lambda pair: pair[1], reverse=True)

    # Extract the top 'num_users' entries
    return scores_sorted[:num_users]

if __name__ == '__main__':
    data_file = 'movie_ratings.json'

    with open(data_file, 'r') as f:
        data = json.loads(f.read())

    user = 'John Carson'
    print("\nUsers similar to " + user + ":\n")
    similar_users = find_similar_users(data, user, 3)
    print("User\t\t\tSimilarity score\n")
    for name, score in similar_users:
        print(name, '\t\t', round(float(score), 2))
Output:
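
When ranking users it matters that the scores are compared as numbers. If names and scores are placed together in one NumPy array, the scores are stored as strings, and text comparison puts negative values in the wrong order. The sketch below shows one way to pick the top k users with the standard library; the function name top_k_similar and the scores in the example are hypothetical, chosen only for illustration.

```python
import heapq

def top_k_similar(scores, k):
    """Return the k (user, score) pairs with the highest numeric score."""
    return heapq.nlargest(k, scores.items(), key=lambda pair: pair[1])

# Hypothetical similarity scores, not taken from the dataset.
scores = {"A": 0.75, "B": -0.9, "C": -0.1, "D": 0.4}

print(top_k_similar(scores, 3))  # [('A', 0.75), ('D', 0.4), ('C', -0.1)]

# Compared as text, -0.1 sorts before -0.9, the opposite of the numeric order.
print(sorted(["-0.9", "-0.1"]))  # ['-0.1', '-0.9']
```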

5- Generating movie recommendations

Code:
import json

# Assumes the code from step 3 is saved as pearson_score.py
from pearson_score import pearson_score

# Generate recommendations for a given user
def generate_recommendations(dataset, user):
    if user not in dataset:
        raise TypeError('User ' + user + ' not present in the dataset')

    total_scores = {}
    similarity_sums = {}

    for u in [x for x in dataset if x != user]:
        similarity_score = pearson_score(dataset, user, u)

        # Ignore users who are not positively correlated with the input user
        if similarity_score <= 0:
            continue

        # Movies rated by u that the input user has not rated yet
        for item in [x for x in dataset[u] if x not in
                     dataset[user] or dataset[user][x] == 0]:
            # Accumulate across users instead of overwriting earlier contributions
            total_scores[item] = total_scores.get(item, 0) + dataset[u][item] * similarity_score
            similarity_sums[item] = similarity_sums.get(item, 0) + similarity_score

    if len(total_scores) == 0:
        return ['No recommendations possible']

    # Create the normalized list of (weighted average rating, movie) pairs
    movie_ranks = [(total / similarity_sums[item], item)
                   for item, total in total_scores.items()]

    # Sort in decreasing order based on the numeric score
    movie_ranks.sort(key=lambda pair: pair[0], reverse=True)

    # Extract the recommended movies
    recommendations = [movie for _, movie in movie_ranks]
    return recommendations

if __name__ == '__main__':
    data_file = 'movie_ratings.json'
    with open(data_file, 'r') as f:
        data = json.loads(f.read())

    user = 'Michael Henry'
    print("\nRecommendations for " + user + ":")
    movies = generate_recommendations(data, user)
    for i, movie in enumerate(movies):
        print(str(i+1) + '. ' + movie)

    user = 'John Carson'
    print("\nRecommendations for " + user + ":")
    movies = generate_recommendations(data, user)
    for i, movie in enumerate(movies):
        print(str(i+1) + '. ' + movie)
Output:
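
The ranking in this step is a similarity-weighted average: for each movie the user has not rated, the ratings from positively correlated users are multiplied by their similarity, summed, and divided by the sum of those similarities. The sketch below isolates that calculation on a tiny made-up example; the function name weighted_recommendations, the user and movie names, and the similarity values are all hypothetical.

```python
def weighted_recommendations(ratings, target, similarity):
    """Rank movies the target has not rated by similarity-weighted average rating."""
    totals = {}       # movie -> sum of rating * similarity
    weight_sums = {}  # movie -> sum of similarity
    for other, sim in similarity.items():
        if other == target or sim <= 0:
            continue  # skip the target and users who are not positively correlated
        for movie, rating in ratings[other].items():
            if movie in ratings[target]:
                continue  # the target has already rated this movie
            totals[movie] = totals.get(movie, 0.0) + rating * sim
            weight_sums[movie] = weight_sums.get(movie, 0.0) + sim
    predicted = {movie: totals[movie] / weight_sums[movie] for movie in totals}
    return sorted(predicted.items(), key=lambda pair: pair[1], reverse=True)

# Made-up ratings and similarities, for illustration only.
ratings = {
    "T": {"X": 4.0},
    "A": {"X": 4.0, "Y": 5.0, "Z": 2.0},
    "B": {"X": 3.0, "Y": 3.0},
}
similarity = {"A": 0.5, "B": 1.0}

result = weighted_recommendations(ratings, "T", similarity)
print([movie for movie, _ in result])  # ['Y', 'Z']
```

Here movie Y is predicted at (5.0 * 0.5 + 3.0 * 1.0) / 1.5, about 3.67, and movie Z at 2.0, so Y is recommended first. Summing over all contributing users is what makes this an average; keeping only the last user's contribution would make the ranking depend on the order of users in the file.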

Overall output:
