Assignment 1 - Applied Social Network Analysis in Python
September 6, 2020
You are currently looking at version 1.1 of this notebook. To download notebooks and datafiles, as well
as get help on Jupyter notebooks in the Coursera platform, visit the Jupyter Notebook FAQ course resource.
'The Matrix',
'Anaconda',
'The Social Network',
'The Godfather',
'Monty Python and the Holy Grail',
'Snakes on a Plane',
'Kung Fu Panda',
'The Dark Knight',
'Mean Girls'])
plt.figure()
pos = nx.spring_layout(G)
edges = G.edges()
weights = None
if weight_name:
    # Use each edge's weight attribute as its drawn line width
    weights = [int(G[u][v][weight_name]) for u, v in edges]
    labels = nx.get_edge_attributes(G, weight_name)
    nx.draw_networkx_edge_labels(G, pos, edge_labels=labels)
    nx.draw_networkx(G, pos, edgelist=edges, width=weights)
else:
    nx.draw_networkx(G, pos, edgelist=edges)
1.0.1 Question 1
Using NetworkX, load the bipartite graph from Employee_Movie_Choices.txt and return that
graph.
This function should return a networkx graph with 19 nodes and 24 edges.
return G
#answer_one()
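The loading step can be sketched on toy data. This is a minimal illustration, not the assignment's solution: it assumes the file holds one tab-separated employee/movie pair per line with a `#`-prefixed header, and the employee names `emp_a` and `emp_b` and the helper `load_bipartite` are hypothetical.

```python
import networkx as nx

# Hypothetical toy lines in the ASSUMED "employee<TAB>movie" layout.
lines = [
    "#Employee\tMovie",
    "emp_a\tAnaconda",
    "emp_a\tMean Girls",
    "emp_b\tAnaconda",
    "emp_b\tThe Matrix",
]


def load_bipartite(edge_lines):
    """Build an undirected graph from tab-separated edge lines.

    parse_edgelist skips anything after a '#' by default, so the
    header line is ignored.
    """
    return nx.parse_edgelist(edge_lines, delimiter="\t")


G = load_bipartite(lines)
print(G.number_of_nodes(), G.number_of_edges())
```

For a file on disk, `nx.read_edgelist(path, delimiter="\t")` applies the same parsing rules.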
1.0.2 Question 2
Using the graph from the previous question, add a node attribute named 'type', where movies
have the value 'movie' and employees have the value 'employee', and return that graph.
This function should return a networkx graph with node attributes {'type': 'movie'} or {'type':
'employee'}.
return G
#answer_two()
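One way to attach the 'type' attribute is sketched below on a toy graph. The employee names are hypothetical, and the sketch assumes every node that is not an employee is a movie. Calling `add_node` on an existing node only updates its attributes, which avoids the argument-order difference in `nx.set_node_attributes` between NetworkX 1.x and 2.x.

```python
import networkx as nx

# Hypothetical toy bipartite graph: employees on one side, movies on the other.
employees = {"emp_a", "emp_b"}
G = nx.Graph([
    ("emp_a", "Anaconda"),
    ("emp_a", "Mean Girls"),
    ("emp_b", "The Matrix"),
])

# Iterate over a snapshot of the nodes and tag each one.
for node in list(G.nodes()):
    node_type = "employee" if node in employees else "movie"
    G.add_node(node, type=node_type)  # updates the existing node's attributes

types = nx.get_node_attributes(G, "type")
print(types)
```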
1.0.3 Question 3
Find a weighted projection of the graph from answer_two which tells us how many movies
different pairs of employees have in common.
This function should return a weighted projected graph.
B = answer_two()
weighted_projection = bipartite.weighted_projected_graph(B, employees)
return weighted_projection
#answer_three()
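To see what the weights in such a projection mean, here is a small sketch on hypothetical data (the employee names are made up). Each projected edge carries a 'weight' equal to the number of shared neighbours, which here is the number of movies two employees have in common; pairs with nothing in common get no edge at all.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Hypothetical toy bipartite graph.
employees = ["emp_a", "emp_b", "emp_c"]
B = nx.Graph([
    ("emp_a", "Anaconda"),
    ("emp_a", "Mean Girls"),
    ("emp_b", "Anaconda"),
    ("emp_b", "Mean Girls"),
    ("emp_b", "The Matrix"),
    ("emp_c", "The Matrix"),
])

# Project onto the employee side; weight = number of shared movies.
P = bipartite.weighted_projected_graph(B, employees)

for u, v, data in P.edges(data=True):
    print(u, v, data["weight"])
```

Note that emp_a and emp_c share no movie, so the projection has no edge between them. This is why Question 4 asks for missing pairs to be treated as 0.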
1.0.4 Question 4
Suppose you’d like to find out whether people who have a high relationship score also like the
same types of movies.
Find the Pearson correlation (using DataFrame.corr()) between employee relationship scores
and the number of movies they have in common. If two employees have no movies in common, it
should be treated as a 0, not a missing value, and should be included in the correlation calculation.
This function should return a float.
# print (G_df)
G_copy_df = G_df.copy()
# change the edge direction and get a double direction graph
G_copy_df.rename(columns={"From":"From_", "To":"From"}, inplace=True)
G_copy_df.rename(columns={"From_":"To"}, inplace=True)
# print (G_copy_df)
G_final_df = pd.concat([G_df, G_copy_df])
# print (G_final_df)
final_df = pd.merge(G_final_df, Rel_df, on = ['From', 'To'], how='right')
final_df['movies_score'] = final_df['movies_score'].map(lambda x: x['weight'] if ty
final_df['relationship_score'] = final_df['relationship_score'].map(lambda x: x['re
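# assign the result back; in-place fillna on a selected column may not update final_df in recent pandas
final_df['movies_score'] = final_df['movies_score'].fillna(0)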
# print (final_df)
return final_df['movies_score'].corr(final_df['relationship_score'])
answer_four()
Out[13]: 0.78839622217334737
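The zero-filling requirement can be illustrated on toy data. This sketch uses a different technique from the code above: instead of concatenating a reversed copy of the edge list, it keys each pair by a `frozenset` so that edge direction does not matter. All names and scores here are hypothetical, and `common` stands in for the weights of the projected graph.

```python
import pandas as pd

# Hypothetical relationship scores, one row per employee pair.
rel = pd.DataFrame({
    "From": ["emp_a", "emp_a", "emp_b"],
    "To": ["emp_b", "emp_c", "emp_c"],
    "relationship_score": [10, 0, 5],
})

# Hypothetical shared-movie counts; (emp_a, emp_c) is absent on purpose.
common = {
    frozenset(("emp_a", "emp_b")): 2,
    frozenset(("emp_b", "emp_c")): 1,
}

# Look up each pair regardless of order; missing pairs count as 0, not NaN.
rel["movies_score"] = [
    common.get(frozenset(pair), 0) for pair in zip(rel["From"], rel["To"])
]

# Series.corr computes the Pearson correlation by default.
r = rel["movies_score"].corr(rel["relationship_score"])
print(r)
```

Because the lookup defaults to 0, every pair in the relationship table stays in the calculation, as the question requires.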