Social Media and Data Analytics Unit 3 Notes
These study notes provide detailed explanations for each topic under Unit 3: Network Fundamentals, tailored
for your exam preparation in social media analytics and data analytics. Each section includes definitions, key
concepts, applications, and examples, ensuring you have a comprehensive understanding of the material. The
notes are structured for quick review, with a focus on clarity and relevance to your exam.
1. Network Structures
• Definition: Network structures describe how nodes (entities like individuals or organizations) and
edges (relationships like friendships or interactions) are organized in a network, often visualized as
graphs (Social Network Analysis).
• Key Concepts:
o Nodes and Edges: Nodes are entities (e.g., users on X), and edges are relationships (e.g.,
follows or likes).
o Network Density: The ratio of actual to possible connections; an undirected network with n
nodes has n(n-1)/2 possible connections. High density means many connections (e.g., a tightly
knit X community).
o Centralization: Indicates if a few nodes dominate connections. High centralization occurs in
networks with influencers (e.g., X users with many followers).
o Centrality Measures:
▪ Degree Centrality: Number of direct connections (e.g., followers on X).
▪ Betweenness Centrality: How often a node lies on the shortest paths between other
nodes (e.g., users bridging X communities).
▪ Closeness Centrality: How short the average path is from a node to all other nodes
(e.g., users whose X posts reach everyone in few steps).
▪ Eigenvector Centrality: Influence based on connections to influential nodes (e.g., X
users linked to celebrities).
o Network Types: Directed (e.g., X follows), undirected (e.g., mutual friendships), weighted
(e.g., interaction frequency), or unweighted.
• Applications:
o Analyzing X follower networks to identify influencers.
o Mapping organizational communication for efficiency.
o Studying information spread (e.g., viral X posts).
• Example: On X, a network structure might show users as nodes, with edges representing retweets,
revealing clusters of users discussing similar topics.
• Key Takeaway: Network structures provide a framework to analyze relationships, with centrality and
density revealing key patterns.
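The density and degree centrality definitions above can be sketched in plain Python. This is a minimal illustration, assuming an undirected, unweighted network; the edge list and the helper names (build_undirected, density, degree_centrality) are made up for the example and do not come from any library.

```python
from collections import defaultdict


def build_undirected(edges):
    """Return an adjacency map {node: set(neighbors)} for an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def density(adj):
    """Actual edges divided by possible edges, n*(n-1)/2 for an undirected graph."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # each edge is counted twice
    return m / (n * (n - 1) / 2)


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree, n-1."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


# Hypothetical network: "ana" is connected to every other user.
edges = [("ana", "ben"), ("ana", "cy"), ("ana", "dee"), ("ben", "cy")]
adj = build_undirected(edges)
print(density(adj))
print(degree_centrality(adj))
```

Here 4 of the 6 possible connections exist, and "ana" reaches the maximum degree centrality of 1.0 because she is tied to all three other users.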
2. Equivalence
• Definition: Equivalence identifies nodes with similar relational patterns, making them interchangeable
in a network (Social Network Structure).
• Types:
o Structural Equivalence: Nodes with identical connections to the same other nodes (e.g., two
X users following the same accounts).
o Regular Equivalence: Nodes with the same kinds of ties to equivalent, but not necessarily the
same, nodes (e.g., two X influencers with similar follower types).
o Automorphic Equivalence: Nodes that can be swapped, along with a relabeling of the other
nodes, without changing network structure (e.g., X users in similar roles across different
communities).
• Applications:
o Identifying roles (e.g., X moderators with similar interaction patterns).
o Grouping users for targeted marketing based on network roles.
• Example: Two X users who follow and are followed by the same accounts are structurally equivalent,
suggesting similar interests.
• Key Takeaway: Equivalence helps classify nodes by their network roles, aiding in understanding
network dynamics.
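Structural equivalence, the strictest of the types above, is easy to check directly. The sketch below assumes a directed follow network stored as a dictionary {user: set of accounts they follow}; the function name and the data are hypothetical.

```python
def structurally_equivalent(follows, a, b):
    """True if a and b follow the same accounts and are followed by the same accounts.

    Ties between a and b themselves are ignored, which is a common convention.
    """
    ignore = {a, b}
    out_a = follows.get(a, set()) - ignore
    out_b = follows.get(b, set()) - ignore
    in_a = {u for u, targets in follows.items() if a in targets} - ignore
    in_b = {u for u, targets in follows.items() if b in targets} - ignore
    return out_a == out_b and in_a == in_b


# Hypothetical follow data: u1 and u2 have identical ties, u3 does not.
follows = {
    "u1": {"news", "sports"},
    "u2": {"news", "sports"},
    "u3": {"news"},
    "fan": {"u1", "u2"},
}
print(structurally_equivalent(follows, "u1", "u2"))
print(structurally_equivalent(follows, "u1", "u3"))
```

Regular and automorphic equivalence need more involved algorithms and are not covered by this check.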
3. Homophily
• Definition: Homophily is the tendency for individuals to connect with others who share similar
attributes, like interests or demographics (Social Hierarchies).
• Types:
o Status Homophily: Connections based on similar social status, such as age, education, or
occupation (e.g., X influencers linking with other influencers).
o Value Homophily: Shared beliefs or interests (e.g., X users in a fandom).
o Geographic Homophily: Proximity-based connections (e.g., X users in the same city).
• Applications:
o Predicting X community formation based on shared interests.
o Analyzing social media echo chambers where similar views dominate.
• Example: X users who share political views are more likely to follow each other, forming dense
clusters.
• Key Takeaway: Homophily drives network clustering, shaping how communities form and interact.
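One simple way to look for homophily is to compare the share of edges that join same-attribute nodes with the share expected if ties ignored the attribute. The sketch below is an illustration under that assumption; the attribute labels, edges, and function names are invented, and more formal measures (such as assortativity coefficients) exist.

```python
from itertools import combinations


def same_attribute_share(edges, attr):
    """Fraction of observed edges whose two endpoints share the attribute."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if attr[u] == attr[v])
    return same / len(edges)


def baseline_share(attr):
    """Fraction of all possible node pairs that share the attribute."""
    pairs = list(combinations(attr, 2))
    if not pairs:
        return 0.0
    same = sum(1 for u, v in pairs if attr[u] == attr[v])
    return same / len(pairs)


# Hypothetical users labeled by political leaning.
attr = {"a": "left", "b": "left", "c": "left", "d": "right", "e": "right", "f": "right"}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("c", "d")]

print(same_attribute_share(edges, attr))  # observed share of same-view ties
print(baseline_share(attr))               # share expected if ties ignored views
```

In this toy data, 5 of 6 edges join like-minded users, while only 6 of the 15 possible pairs are like-minded, which is the pattern homophily predicts.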
4. Clustering
• Definition: Clustering occurs when nodes form tightly connected groups with more internal than
external connections (Social Network Analysis).
• Key Concepts:
o Clustering Coefficient: The fraction of a node’s neighbor pairs that are themselves connected
(e.g., whether the X followers of a user also follow each other).
o Community Detection: Algorithms (e.g., Louvain) identify clusters (e.g., X hashtag
communities).
o Modularity: Scores a division into communities by comparing the edges inside communities
with the number expected at random; higher values mean a cleaner split.
• Applications:
o Identifying X user groups for targeted advertising.
o Detecting communities in social media for trend analysis.
• Example: X users discussing a trending topic form a cluster, with many mutual follows within the
group.
• Key Takeaway: Clustering reveals community structures, essential for understanding network
organization.
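The local clustering coefficient mentioned above can be computed by counting how many pairs of a node's neighbors are themselves linked. This is a minimal sketch for an undirected network stored as {node: set of neighbors}; the data and the function name are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of the node's neighbors that are directly connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # fewer than two neighbors means there are no pairs to check
    linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return linked / (k * (k - 1) / 2)


# Hypothetical mutual-follow network.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
for user in adj:
    print(user, local_clustering(adj, user))
```

User "a" has three neighbor pairs but only one of them (b and c) is connected, so the coefficient is 1/3; for "b" the single neighbor pair is connected, giving 1.0.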
5. Snowball Sampling
• Definition: Snowball sampling is a non-probability method where initial participants recruit others,
used for hard-to-reach populations (Snowball Sampling).
• Process:
o Start with “seed” participants who refer others.
o Each new participant refers further contacts, so the sample grows wave by wave like a
snowball.
• Applications:
o Mapping X networks of niche communities (e.g., cryptocurrency enthusiasts).
o Studying hidden populations (e.g., influencers not publicly listed).
• Advantages: Accesses hard-to-reach groups.
• Limitations: Potential bias from non-representative seeds.
• Example: To study X crypto traders, start with a few known traders who refer others, building a
network map.
• Key Takeaway: Snowball sampling is effective for network studies of elusive groups but requires
careful seed selection.
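The seed-and-referral process above can be simulated on a known contact list. The sketch below is a simplification: it assumes each participant refers at most max_referrals contacts and, to keep the result reproducible, picks them in alphabetical order rather than at random. The names and data are hypothetical.

```python
def snowball_sample(contacts, seeds, waves, max_referrals):
    """Return participants in the order they were recruited."""
    sampled = list(seeds)
    seen = set(seeds)
    frontier = list(seeds)
    for _ in range(waves):
        next_frontier = []
        for person in frontier:
            # Contacts not yet in the sample, in a fixed (alphabetical) order.
            candidates = [c for c in sorted(contacts.get(person, ())) if c not in seen]
            for referred in candidates[:max_referrals]:
                seen.add(referred)
                sampled.append(referred)
                next_frontier.append(referred)
        frontier = next_frontier
    return sampled


# Hypothetical contact lists among crypto traders.
contacts = {
    "s": ["a", "b", "c"],
    "a": ["s", "d"],
    "b": ["s", "e"],
    "c": ["s"],
    "d": ["a", "f"],
    "e": ["b"],
    "f": ["d"],
}
print(snowball_sample(contacts, ["s"], waves=2, max_referrals=2))
```

With two referrals per person, trader "c" is never recruited even though the seed knows them, a small illustration of how the referral rule and the choice of seeds shape who ends up in the sample.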
6. Contact Tracing and Random Walks
• Contact Tracing:
o Definition: Identifies contacts of infected individuals to control disease spread, using network
analysis to map interactions (Contact Tracing).
o Process: Nodes are individuals, edges are contacts (e.g., physical meetings or X interactions).
o Applications: Tracking COVID-19 spread via contact networks.
• Random Walks:
o Definition: A process where a “walker” moves randomly between nodes, used to explore
network properties (Random Walks).
o Uses:
▪ Community detection (walkers stay in dense clusters).
▪ Modeling information spread (e.g., X post virality).
▪ Ranking nodes (e.g., PageRank for influential X users).
• Applications:
o Simulating X post spread to predict viral content.
o Identifying key nodes in disease or information networks.
• Example: Contact tracing maps the people who had contact with an infected person, while random
walks simulate how an X post spreads through followers.
• Key Takeaway: Contact tracing maps disease networks, and random walks analyze network dynamics.
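PageRank, listed above as a ranking use of random walks, can be approximated by repeatedly passing rank along outgoing links. The sketch below is a simplified power-iteration version, not a production implementation; the damping value of 0.85 is the commonly quoted default, the graph is invented, and every link target is assumed to appear as a key in the dictionary.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Approximate PageRank scores for {node: list of nodes it links to}."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node gets a small base share from the walker's random jumps.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = out_links[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank evenly.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical follow graph: three users follow "hub", and "hub" follows "a".
out_links = {"a": ["hub"], "b": ["hub"], "c": ["hub"], "hub": ["a"]}
scores = pagerank(out_links)
for user in sorted(scores, key=scores.get, reverse=True):
    print(user, round(scores[user], 3))
```

The scores always sum to 1, and "hub" comes out on top because the walker is sent there from three different users.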
7. Ego-centered Networks
• Definition: Ego-centered networks focus on one individual (ego), their direct connections (alters),
and the ties among those alters (Social Network Structure).
• Key Concepts:
o Size: Often described as limited by cognitive capacity (roughly 150 stable relationships,
Dunbar’s number).
o Structure: Hierarchical, with close contacts (e.g., family) and distant ones (e.g., acquaintances).
• Applications:
o Analyzing X user influence based on their follower network.
o Studying personal social support systems.
• Example: An X user’s ego network includes their followers and those they follow, showing their social
reach.
• Key Takeaway: Ego-centered networks reveal an individual’s social environment and influence.
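Extracting an ego network from a larger network amounts to keeping the ego, its alters, and any ties among them. The sketch below assumes an undirected network stored as {node: set of neighbors}; the function name and data are hypothetical.

```python
def ego_network(adj, ego):
    """Return (nodes, edges) for the ego, its alters, and ties among them."""
    alters = set(adj[ego])
    nodes = alters | {ego}
    # frozenset makes the edge (u, v) identical to (v, u), so each tie appears once.
    edges = {
        frozenset((u, v))
        for u in nodes
        for v in adj.get(u, ())
        if v in nodes and u != v
    }
    return nodes, edges


# Hypothetical mutual-follow network; "w" is two steps away from "me".
adj = {
    "me": {"x", "y", "z"},
    "x": {"me", "y"},
    "y": {"me", "x"},
    "z": {"me", "w"},
    "w": {"z"},
}
nodes, edges = ego_network(adj, "me")
print(sorted(nodes))
print(len(edges))
```

User "w" is excluded because they are connected only to an alter, not to the ego.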
8. Dominance Hierarchies
• Definition: Dominance hierarchies rank individuals based on asymmetric dominance relationships,
often from competition (Social Hierarchies).
• Key Concepts:
o Linear Hierarchies: Clear ranking (e.g., A dominates B, B dominates C).
o Centrality: High-ranking nodes tend to have higher centrality (e.g., degree or betweenness).
• Applications:
o Studying X influencer hierarchies based on follower counts.
o Analyzing organizational leadership structures.
• Example: On X, follower counts can serve as a rough proxy for dominance, with users ranked from
most to fewest followers to form a hierarchy.
• Key Takeaway: Dominance hierarchies show power structures, with central nodes wielding influence.
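A linear hierarchy like the A, B, C example above can be recovered from pairwise outcomes by counting how many others each individual dominates. The sketch below assumes exactly one observed outcome for every pair of individuals; the function names and data are hypothetical.

```python
def _dominated_sets(outcomes):
    """Map each individual to the set of individuals it dominates."""
    dominated = {}
    for winner, loser in outcomes:
        dominated.setdefault(winner, set()).add(loser)
        dominated.setdefault(loser, set())
    return dominated


def dominance_ranking(outcomes):
    """Rank individuals from most to fewest others dominated (ties by name)."""
    dominated = _dominated_sets(outcomes)
    return sorted(dominated, key=lambda ind: (-len(dominated[ind]), ind))


def is_linear(outcomes):
    """True if individuals dominate n-1, n-2, ..., 0 others.

    Valid only under the assumption that every pair has exactly one outcome.
    """
    dominated = _dominated_sets(outcomes)
    counts = sorted((len(s) for s in dominated.values()), reverse=True)
    return counts == list(range(len(counts) - 1, -1, -1))


linear = [("A", "B"), ("B", "C"), ("A", "C")]
cycle = [("A", "B"), ("B", "C"), ("C", "A")]
print(dominance_ranking(linear), is_linear(linear))
print(is_linear(cycle))
```

The second data set is circular (A over B, B over C, C over A), so no clear ranking exists and the check fails.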
9. Third Party Records
• Definition: Third party records are external data sources (e.g., telecom records, social media APIs)
used to construct networks (Telecom Data).
• Key Concepts:
o Data Types: Call detail records, business reports, or X API data.
o Use: Build networks when direct data is unavailable.
• Applications:
o Mapping X user interactions via API data.
o Analyzing customer networks from telecom data.
• Example: X API data showing who follows whom creates a network for analysis.
• Key Takeaway: Third party records enable network construction from external sources, broadening
analysis scope.
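Building a network from third party records usually means turning rows of a log into edges. The sketch below assumes call detail records arrive as (caller, callee, duration in seconds) tuples, which is an invented format for illustration, and weights each undirected edge by the number of calls.

```python
from collections import Counter


def build_weighted_edges(records):
    """Return {(person_a, person_b): number of calls}, ignoring call direction."""
    weights = Counter()
    for caller, callee, _duration in records:
        if caller != callee:  # skip malformed self-calls
            edge = tuple(sorted((caller, callee)))
            weights[edge] += 1
    return dict(weights)


# Hypothetical call detail records.
records = [
    ("alice", "bob", 120),
    ("bob", "alice", 30),
    ("alice", "carol", 45),
]
print(build_weighted_edges(records))
```

Sorting each pair means a call from bob to alice strengthens the same edge as a call from alice to bob; a directed analysis would keep the original order instead.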
10. Affiliation Networks
• Definition: Two-mode networks with actors (e.g., individuals) and events/groups (e.g., X groups),
where edges show participation (Affiliation Networks).
• Key Concepts:
o Duality: Actors are linked via shared affiliations (e.g., X users in the same group).
o Transformation: Can be converted to one-mode networks (e.g., users connected by shared
groups).
• Applications:
o Studying X community membership patterns.
o Analyzing organizational affiliations.
• Example: X users joining the same hashtag campaign form an affiliation network.
• Key Takeaway: Affiliation networks reveal indirect connections through shared memberships.
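The transformation from a two-mode to a one-mode network described above can be sketched by linking every pair of actors who share a group. The data layout ({group: set of members}) and the function name are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def project_actors(memberships):
    """Return {(actor_a, actor_b): number of groups they share}."""
    shared = Counter()
    for members in memberships.values():
        # Every pair of members of the same group gains one shared affiliation.
        for a, b in combinations(sorted(members), 2):
            shared[(a, b)] += 1
    return dict(shared)


# Hypothetical hashtag campaigns and the users who joined them.
memberships = {
    "#climate": {"u1", "u2", "u3"},
    "#ai": {"u2", "u3"},
}
print(project_actors(memberships))
```

Users u2 and u3 end up with an edge of weight 2 because they share both campaigns, even if they never interacted directly.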
11. Citation Networks
• Definition: Networks where nodes are documents (e.g., papers), and edges are citations (Citation
Networks).
• Key Concepts:
o Co-citation: Two documents are co-cited when a later document cites both; frequent
co-citation suggests they are related.
o Bibliographic Coupling: Two documents are coupled when they cite the same sources.
• Applications:
o Mapping research trends on X analytics via cited papers.
o Identifying influential studies in analytics.
• Example: Papers on X analytics that cite one another form a citation network, showing research
evolution.
• Key Takeaway: Citation networks map knowledge structures, highlighting influential works.
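Co-citation and bibliographic coupling are easy to confuse, so a small worked example may help. The sketch below assumes citations are stored as {citing paper: set of cited documents}; the paper labels and function names are hypothetical.

```python
def cocitation_count(cites, doc_a, doc_b):
    """Number of papers that cite both doc_a and doc_b."""
    return sum(1 for refs in cites.values() if doc_a in refs and doc_b in refs)


def coupling_strength(cites, paper_a, paper_b):
    """Number of references that paper_a and paper_b have in common."""
    return len(cites.get(paper_a, set()) & cites.get(paper_b, set()))


# Hypothetical citation data: P1..P3 are citing papers, A..C are cited works.
cites = {
    "P1": {"A", "B"},
    "P2": {"A", "B", "C"},
    "P3": {"C"},
}
print(cocitation_count(cites, "A", "B"))     # A and B are cited together by P1 and P2
print(coupling_strength(cites, "P1", "P2"))  # P1 and P2 share references A and B
```

Co-citation looks at the cited documents (A and B), while bibliographic coupling looks at the citing papers (P1 and P2).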
12. Peer-to-Peer Networks
• Definition: Most likely refers to decentralized social networks where users interact directly without
a central server, using P2P technology (P2P Social Networks).
• Key Concepts:
o Decentralization: No central authority; users share data directly.
o Alternative Interpretation: May refer to peer influence networks, where peers affect each
other’s behavior.
• Applications:
o Analyzing decentralized platforms like Mastodon (which is federated across many servers
rather than purely peer-to-peer).
o Studying X peer influence on content sharing.
• Example: A P2P social network allows users to share posts directly with one another, bypassing
central servers.
• Key Takeaway: P2P networks emphasize user control and resilience, distinct from centralized
platforms.
13. Recommender Networks
• Definition: Systems using social network data to recommend content, connections, or products
(Recommender Systems).
• Key Concepts:
o Social Regularization: Incorporates friendships into recommendations.
o Trust and Influence: Weights recommendations by social connections.
• Applications:
o Suggesting X users to follow based on friends’ follows.
o Recommending products on e-commerce platforms.
• Example: X recommends accounts based on mutual followers, using network data.
• Key Takeaway: Recommender networks enhance personalization using social relationships.
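The "friends' follows" idea above can be sketched as a friends-of-friends count: accounts followed by many of the people a user already follows rank highest. This is a toy illustration, not how X or any real platform computes recommendations; the data and function name are hypothetical.

```python
from collections import Counter


def recommend_accounts(follows, user, top_n=3):
    """Suggest accounts followed by the user's follows, ranked by how many do so."""
    already = follows.get(user, set())
    scores = Counter()
    for friend in already:
        for candidate in follows.get(friend, set()):
            if candidate != user and candidate not in already:
                scores[candidate] += 1
    # Highest score first; ties are broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _count in ranked[:top_n]]


# Hypothetical follow data.
follows = {
    "me": {"a", "b"},
    "a": {"x", "y"},
    "b": {"x", "me"},
}
print(recommend_accounts(follows, "me"))
```

Account "x" ranks first because both of the user's follows also follow it, while "y" is backed by only one.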
14. Biological Networks
• Definition: Networks representing biological systems, like gene or protein interactions, analyzed with
SNA methods (Biological Networks).
• Types:
o Gene Regulatory Networks: Genes as nodes, regulatory interactions as edges.
o Protein-Protein Interaction Networks: Proteins as nodes, interactions as edges.
o Ecological Networks: Species as nodes, interactions (e.g., predation) as edges.
• Applications:
o Identifying key genes in disease pathways.
o Analyzing X discussions on biology using network methods.
• Example: A protein interaction network shows clusters of proteins involved in a biological process.
• Key Takeaway: Biological networks use SNA to uncover functional insights in complex systems.
Summary Table
Topic | Key Idea | Example Application
Homophily | Similarity-based connections | Predicting X community formation
Third Party Records | External data sources | Using X API for network analysis