Big Data

Mining social network graphs involves analyzing relationships within networks like Facebook and Twitter to detect communities, identify influencers, and predict trends. Key techniques include centrality analysis, community detection, link prediction, and fraud detection. Applications range from friend recommendations to trend analysis and cybersecurity.

Uploaded by

a29241191

Unit-5

Mining Social Network Graphs


Mining social network graphs involves analyzing relationships and interactions within a network (e.g.,
Facebook, Twitter, LinkedIn). It helps in detecting communities, identifying influencers,
recommending connections, and predicting trends.

1. What is a Social Network Graph? 🌍🔗


A social network graph represents people (nodes) and their connections (edges). The edges can
represent friendships, follows, messages, likes, or collaborations.

💡 Example:
●​ Facebook → Nodes = Users, Edges = Friendships​

● Twitter → Nodes = Users, Edges = Follow relationships (directed)

●​ LinkedIn → Nodes = Professionals, Edges = Connections​

2. Key Graph Mining Techniques


A. Centrality Analysis (Finding Influencers) 🌟
Measures how important a node (user) is in the network.

🔹 Types of Centrality:​
✅ Degree Centrality → Counts direct connections (e.g., most followed person on Twitter).​
✅ Betweenness Centrality → Measures influence in connecting others (e.g., news aggregators).​
✅ Closeness Centrality → Identifies people who can quickly reach everyone.​
✅ PageRank → Scores a node by the importance of the nodes linking to it; developed at Google to rank web pages and also applied to rank influential users.
💡 Example:
●​ Twitter ranks influential users based on followers & retweets.​
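
The first and third measures above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a made-up five-person friendship graph stored as an adjacency dict; the names and helper functions are hypothetical, and libraries such as NetworkX offer ready-made equivalents.

```python
from collections import deque

# Hypothetical toy friendship graph (undirected), stored as an adjacency dict.
graph = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob"},
    "dave": {"alice", "erin"},
    "erin": {"dave"},
}

def degree_centrality(g):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(g)
    return {node: len(nbrs) / (n - 1) for node, nbrs in g.items()}

def closeness_centrality(g):
    """(Number of other reachable nodes) / (sum of their shortest-path distances)."""
    result = {}
    for source in g:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives shortest paths in unweighted graphs
            u = queue.popleft()
            for v in g[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result

degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
most_connected = max(degree, key=degree.get)
```

In this toy graph "alice" scores highest on both measures, because she has the most friends and sits at most two hops from everyone.
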
B. Community Detection (Finding Groups) 👥
Identifies clusters of tightly connected users.

🔹 Popular Algorithms:​
✅ Modularity Maximization → Finds a partition whose groups have more internal edges than expected by chance.
✅ Louvain Algorithm → Fast, greedy modularity-maximization heuristic that scales to large graphs.
✅ Label Propagation → Spreads labels to detect communities.
💡 Example:
●​ Facebook suggests Groups based on detected communities.​
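
Label propagation is simple enough to sketch directly. The version below is a simplified illustration on a made-up graph of two triangles joined by one edge. It visits nodes in a fixed order and breaks ties by taking the largest label so that the run is reproducible; the usual algorithm visits nodes in random order and breaks ties at random.

```python
from collections import Counter

# Hypothetical graph: triangle 0-1-2 and triangle 3-4-5, bridged by edge 2-3.
graph = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4, 5},
    4: {3, 5},
    5: {3, 4},
}

def label_propagation(g, max_rounds=20):
    labels = {node: node for node in g}  # every node starts in its own community
    for _ in range(max_rounds):
        changed = False
        for node in sorted(g):  # fixed order; labels are updated in place
            counts = Counter(labels[nbr] for nbr in g[node])
            if not counts:
                continue
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            if labels[node] in candidates:
                continue  # current label is already among the most common
            labels[node] = max(candidates)  # deterministic tie-break (simplification)
            changed = True
        if not changed:
            break
    return labels

labels = label_propagation(graph)
groups = {}
for node, lab in labels.items():
    groups.setdefault(lab, []).append(node)
communities = sorted(sorted(members) for members in groups.values())
```

On this graph the labels settle after two passes and each triangle ends up as its own community.
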

C. Link Prediction (Recommending Friends) 🔮


Predicts future connections based on existing relationships.

🔹 Methods:​
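✅ Jaccard Coefficient → Ratio of shared neighbors to all neighbors of two users.
✅ Adamic/Adar Score → Sums over common neighbors, weighting each by 1/log(degree), so rare shared contacts count more than hubs.
✅ Graph Embedding (Node2Vec, DeepWalk) → Learns node vectors from random walks; pairs with similar vectors are candidate links.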
💡 Example:
●​ LinkedIn’s "People You May Know" feature uses link prediction.​
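
The two neighborhood-based scores can be computed as follows. This is a sketch on a hypothetical friendship graph, not a description of how LinkedIn implements its feature; the `friends` data and function names are assumptions made for illustration.

```python
import math
from itertools import combinations

# Hypothetical undirected friendship graph.
friends = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol", "erin"},
    "carol": {"alice", "bob", "dave", "erin"},
    "dave": {"alice", "carol"},
    "erin": {"bob", "carol"},
}

def jaccard(u, v):
    """Shared neighbors divided by all distinct neighbors of u and v."""
    union = friends[u] | friends[v]
    return len(friends[u] & friends[v]) / len(union) if union else 0.0

def adamic_adar(u, v):
    """Sum of 1/log(degree) over common neighbors; low-degree neighbors weigh more.

    A common neighbor is linked to both u and v, so its degree is at least 2
    and the logarithm is never zero.
    """
    return sum(1 / math.log(len(friends[w])) for w in friends[u] & friends[v])

# Score every pair that is not yet connected and rank the candidates.
candidates = [
    (u, v) for u, v in combinations(sorted(friends), 2) if v not in friends[u]
]
ranked = sorted(candidates, key=lambda pair: adamic_adar(*pair), reverse=True)
```

Here "dave" and "erin" share only the well-connected "carol", so they rank below the pairs that share two friends.
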

D. Influence Propagation (Viral Trends) 🚀


Analyzes how information spreads across a network.

🔹 Models:​
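✅ Independent Cascade Model → Each newly activated user gets one chance to activate each inactive neighbor, succeeding with a given probability.
✅ Linear Threshold Model → A user adopts a behavior once the weighted share of active neighbors reaches that user's threshold.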
💡 Example:
●​ Twitter’s trending hashtags spread based on retweets & mentions.​
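
A minimal simulation of the Independent Cascade model might look like this. The follower graph, the single shared activation probability, and the function names are all illustrative assumptions; real studies usually assign a separate probability to each edge.

```python
import random

# Hypothetical directed graph: user -> accounts that see that user's posts.
followers = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["e"],
    "e": [],
}

def independent_cascade(graph, seeds, prob, rng):
    """Return the set of users reached when each exposure succeeds with `prob`."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for user in frontier:  # each newly active user tries each neighbor once
            for follower in graph.get(user, ()):
                if follower not in active and rng.random() < prob:
                    active.add(follower)
                    next_frontier.append(follower)
        frontier = next_frontier
    return active

def average_spread(graph, seeds, prob, runs=200, seed=42):
    """Monte Carlo estimate of the expected number of users reached."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(graph, seeds, prob, rng)) for _ in range(runs))
    return total / runs

everyone = independent_cascade(followers, {"a"}, 1.0, random.Random(0))
nobody_else = independent_cascade(followers, {"a"}, 0.0, random.Random(0))
```

Because a single run is random, the spread of a seed set is normally reported as an average over many runs, as `average_spread` does.
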

E. Fraud & Anomaly Detection (Spotting Fake Accounts) 🚨


Detects fake profiles, spam bots, and cyber threats.

🔹 Techniques:​
✅ Graph-based anomaly detection → Identifies outlier nodes.​
✅ Behavioral analysis → Detects unusual activity patterns.​
✅ Bot detection algorithms → Finds automated accounts.
💡 Example:
●​ Facebook & Twitter use AI to detect fake accounts & spam bots.​
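
One simple graph-based signal, shown here only as an illustration, is follow reciprocity: an account that follows many users while almost nobody follows back may deserve a closer look. The data and both thresholds below are hypothetical, and production systems combine many such signals rather than relying on one.

```python
# Hypothetical directed follow graph: account -> set of accounts it follows.
follows = {
    "ann": {"ben", "cat"},
    "ben": {"ann", "cat"},
    "cat": {"ann", "ben"},
    "dan": {"ann"},
    "bot": {"ann", "ben", "cat", "dan"},
}

def reciprocity(account):
    """Share of followed accounts that follow this account back."""
    following = follows[account]
    if not following:
        return 0.0
    mutual = sum(1 for other in following if account in follows.get(other, set()))
    return mutual / len(following)

def suspicious_accounts(min_following=3, max_reciprocity=0.1):
    """Accounts that follow many users but are rarely followed back."""
    return sorted(
        account
        for account, following in follows.items()
        if len(following) >= min_following and reciprocity(account) <= max_reciprocity
    )

flagged = suspicious_accounts()
```

Only "bot" is flagged here: it follows four accounts and none of them follow it back.
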

3. Tools & Technologies for Social Graph Mining


✅ Graph Databases: Neo4j, ArangoDB, Amazon Neptune​
✅ Big Data Tools: Apache Spark GraphX, GraphFrames​
✅ Python Libraries: NetworkX, SNAP, igraph
✅ Machine Learning: DeepWalk, Node2Vec, GNNs (Graph Neural Networks)

4. Applications of Social Network Mining


💡 Real-World Examples:​
✅ Facebook & LinkedIn → Friend & job recommendations​
✅ Twitter & Reddit → Trend & influencer detection​
✅ E-commerce (Amazon, eBay) → Product recommendations​
✅ Healthcare → Disease spread modeling​
✅ Cybersecurity → Fraud & fake account detection

Introduction to Social Network Mining


What is Social Network Mining?

Social Network Mining is the process of analyzing relationships, interactions, and patterns in social
networks using graph theory, machine learning, and data analytics. It helps in understanding how people
or entities are connected and how information spreads across a network.
💡 Example:
●​ Facebook → Mining user friendships to suggest new connections.​

●​ Twitter → Analyzing retweets to identify trending topics.​

Applications of Social Network Mining


1️⃣ Influencer Identification & Marketing 🌟
●​ Detects key influencers who impact user behavior.​

●​ Helps in targeted advertising and social media campaigns.​

💡 Example:
●​ Instagram & YouTube → Brands collaborate with influencers for promotions.​

●​ Twitter → Ranks influential users based on retweets and mentions.​

2️⃣ Friend & Connection Recommendations 👥


●​ Uses link prediction algorithms to suggest new friends or connections.​

💡 Example:
●​ LinkedIn’s "People You May Know" → Suggests professional connections.​

●​ Facebook’s friend suggestions based on mutual friends.​

3️⃣ Trend & Sentiment Analysis 📊


●​ Identifies viral trends and public sentiment on topics.​
●​ Helps in political analysis, brand reputation monitoring, and crisis management.​

💡 Example:
●​ Twitter Trending Topics → Analyzing popular discussions.​

●​ Sentiment analysis in elections → Predicting voter opinions.​

4️⃣ Fraud Detection & Cybersecurity 🚨


●​ Detects fake profiles, bots, and suspicious activities using anomaly detection.

💡 Example:
●​ Facebook & Twitter → AI models detect and block fake accounts.​

●​ Banks → Fraud detection in financial transactions.​

5️⃣ Social Graph-Based Recommendations 🛒


●​ Uses social connections to suggest products, movies, or content.

💡 Example:
●​ Netflix & Spotify → Suggest movies & music based on friends' preferences.​

●​ Amazon → Recommends products based on user interactions.​

6️⃣ Disease Spread & Healthcare Analytics 🏥


●​ Studies how diseases spread through social networks.

💡 Example:
●​ COVID-19 tracking → Contact tracing using social graphs.​

●​ Epidemic modeling → Predicting disease outbreaks.​


7️⃣ Political & Social Analysis 🏛️
●​ Examines political campaigns, protests, and movements using social media data.​

💡 Example:
●​ Twitter analysis in elections → Predicting candidate popularity.​

●​ Social activism → Tracking global movements (e.g., #MeToo, Black Lives Matter).​

8️⃣ Crime & Terrorism Detection 🔍


●​ Law enforcement uses network mining to track criminals & terrorist networks.

💡 Example:
●​ FBI & Interpol → Monitor social media for suspicious activities.​

●​ Dark Web analysis → Detecting illegal transactions.

Social Networks as a Graph


1. What is a Social Network Graph?
A social network graph represents relationships between individuals, groups, or entities using nodes
(vertices) and edges (links).

🔹 Nodes (Vertices): Represent users, profiles, or entities.​


🔹 Edges (Links): Represent connections like friendships, follows, messages, or interactions.
💡 Examples:
●​ Facebook → Nodes = Users, Edges = Friendships​

● Twitter → Nodes = Users, Edges = Follow relationships (directed)


●​ LinkedIn → Nodes = Professionals, Edges = Connections​

2. Types of Social Network Graphs


1️⃣ Undirected Graph (Mutual Connections) 🔄
●​ Edges have no direction (A ↔ B).​

●​ Represents friendship networks (e.g., Facebook, LinkedIn).​

💡 Example:
●​ If Alice is friends with Bob, the connection is mutual.​

2️⃣ Directed Graph (One-Way Interactions) ➡️


●​ Edges have a direction (A → B).​

●​ Used in Twitter (followers), Instagram (likes), and email networks.​

💡 Example:
●​ If Alice follows Bob on Twitter, Bob may not follow Alice back.​

3️⃣ Weighted Graph (Strength of Connections) ⚖️


●​ Edges have weights indicating the strength of relationships.​

●​ Used in social influence & recommendation systems.​

💡 Example:
●​ Strong ties (family, close friends) vs. Weak ties (acquaintances, followers).​
4️⃣ Bipartite Graph (Two Different Node Types) 🔗
●​ Two types of nodes with edges only between different types.​

●​ Used in user-content interaction networks (e.g., YouTube, Netflix, Amazon).​

💡 Example:
●​ Users ↔ Movies (Netflix) → User A watches Movie X.​
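
The four graph types above map onto simple data structures. The sketch below uses plain dicts and sets with made-up names; it is one possible representation, not the only one.

```python
# Hypothetical toy data; all names are made up for illustration.

# 1) Undirected graph: a friendship is stored in both directions.
friends = {"alice": {"bob"}, "bob": {"alice"}}

# 2) Directed graph: alice follows bob, but bob does not follow back.
follows = {"alice": {"bob"}, "bob": set()}

# 3) Weighted graph: the edge weight is the tie strength between 0 and 1.
tie_strength = {("alice", "bob"): 0.9, ("alice", "carol"): 0.2}

# 4) Bipartite graph: edges only run between users and movies.
watched = {"user_a": {"movie_x", "movie_y"}, "user_b": {"movie_x"}}

def is_mutual(graph, a, b):
    """True when the edge exists in both directions."""
    return b in graph.get(a, set()) and a in graph.get(b, set())

def strong_ties(weights, cutoff=0.5):
    """Pairs whose tie strength is at least `cutoff`."""
    return [pair for pair, weight in weights.items() if weight >= cutoff]

def viewers_by_movie(user_to_movies):
    """Flip the bipartite graph: movie -> set of users who watched it."""
    movies = {}
    for user, titles in user_to_movies.items():
        for title in titles:
            movies.setdefault(title, set()).add(user)
    return movies
```

Flipping the bipartite graph, as `viewers_by_movie` does, is a common first step when looking for users who watched the same titles.
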

3. Key Graph Metrics in Social Networks


📌 Centrality (Influence Detection)
●​ Degree Centrality → Measures number of direct connections.​

●​ Betweenness Centrality → Identifies nodes connecting different groups.​

●​ Closeness Centrality → Finds users who can quickly reach others.​

●​ PageRank → Measures influence based on link structure (used by Google).​

💡 Example:
●​ Most followed user on Twitter has high degree centrality.​
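
PageRank can be approximated by repeated averaging (power iteration). The sketch below assumes a tiny made-up follow graph and the commonly used damping factor of 0.85; it is a teaching version, not the implementation of any particular platform.

```python
# Hypothetical follow graph: links[u] = accounts that u follows.
# Rank flows from the follower to the account being followed.
links = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

def pagerank(graph, damping=0.85, iterations=100):
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1 / n for node in nodes}  # start from a uniform distribution
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread over everyone.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for u, targets in graph.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        rank = new_rank
    return rank

ranks = pagerank(links)
top_account = max(ranks, key=ranks.get)
```

Account "c" comes out on top because three accounts follow it, and "a" ranks second because the highly ranked "c" follows it, which shows how PageRank differs from a plain follower count.
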

📌 Community Detection (Group Formation)


●​ Detects clusters or communities within a network.​

●​ Used in recommendation systems, political analysis, and fraud detection.​

💡 Example:
●​ Facebook suggests groups based on detected communities.​
📌 Link Prediction (Friend & Connection Recommendations)
●​ Predicts future relationships based on current network structure.​

●​ Used in LinkedIn ("People You May Know") and Facebook friend suggestions.​

💡 Example:
●​ If Alice & Bob share many mutual friends, they may connect soon.​

📌 Information Spread (Viral Trends & Influence)


●​ Models how information, memes, or viruses spread in networks.​

●​ Used in marketing campaigns, epidemiology, and trend analysis.​

💡 Example:
●​ Twitter trends spread via retweets (Viral Marketing).​
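
Information spread can also be modeled with thresholds instead of probabilities. The following sketch of a Linear Threshold process assumes a made-up friendship graph, equal weight for every neighbor, and a threshold of 0.5 for each user; all of these are illustrative choices.

```python
# Hypothetical undirected friendship graph.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
# Assumed thresholds: each user adopts once half of their friends have adopted.
thresholds = {user: 0.5 for user in graph}

def linear_threshold(g, limits, seeds):
    """Spread adoption in rounds until no further user crosses their threshold."""
    active = set(seeds)
    while True:
        newly_active = {
            user
            for user, nbrs in g.items()
            if user not in active
            and nbrs
            and len(nbrs & active) / len(nbrs) >= limits[user]
        }
        if not newly_active:
            return active
        active |= newly_active

single_seed = linear_threshold(graph, thresholds, {"a"})
two_seeds = linear_threshold(graph, thresholds, {"a", "b"})
```

With one seed the trend stalls immediately, while two adjacent seeds are enough to tip the whole network, which is the kind of effect viral marketing campaigns try to exploit.
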

4. Real-World Applications of Social Network Graphs


✅ Social Media Analysis → Influencer detection, trend prediction​
✅ Recommendation Systems → Netflix, Amazon, Spotify recommendations​
✅ Fraud & Anomaly Detection → Fake accounts, bot networks​
✅ Epidemiology → Predicting disease outbreaks (COVID-19 contact tracing)​
✅ Cybersecurity → Detecting criminal and terrorist networks

5. Tools for Social Network Graph Analysis


🔹 Graph Databases: Neo4j, Amazon Neptune, ArangoDB​
🔹 Big Data Tools: Apache Spark GraphX, GraphFrames​
🔹 Python Libraries: NetworkX, igraph, SNAP

Types of Social Networks

Social networks can be categorized in several ways, but one common approach is based on their primary
purpose or the type of content shared. Here are some key types of social networks:

1. Social Networking Sites:

●​ Focus: Connecting with friends, family, and other individuals. These platforms are often used for
sharing updates, photos, videos, and engaging in conversations.
●​ Examples: Facebook, X (formerly Twitter), LinkedIn, Instagram, Threads, Bluesky, Mastodon.

2. Media Sharing Networks:

●​ Focus: Sharing specific types of media, such as photos and videos.


●​ Examples: Instagram, YouTube, TikTok, Snapchat, Pinterest, Flickr, Vimeo.

3. Discussion Forums:

●​ Focus: Facilitating conversations and discussions around specific topics or interests. Users ask
questions, share opinions, and engage in debates.
●​ Examples: Reddit, Quora, Discord, Digg, Clubhouse.

4. Social Blogging Networks:

●​ Focus: Publishing and sharing blog posts and articles. These platforms allow individuals and
businesses to share their thoughts, expertise, and stories.
●​ Examples: Medium, Tumblr, WordPress.com, Blogger, LiveJournal.

5. Professional Networking Sites:

●​ Focus: Connecting with professionals for career development, networking, and business
opportunities.
●​ Examples: LinkedIn, Xing.

6. Review Networks:

●​ Focus: Sharing reviews and ratings of businesses, products, and services.


●​ Examples: Yelp, TripAdvisor, Google Business Profile.

7. Interest-Based Networks:

●​ Focus: Connecting people with shared hobbies, interests, or passions.


●​ Examples: Pinterest (for visual interests), Goodreads (for books), Last.fm (for music).

8. Dating Networks:

●​ Focus: Connecting individuals for romantic relationships.


●​ Examples: Tinder, Bumble, Hinge.

9. Social Gaming Networks:


●​ Focus: Connecting gamers and facilitating social interaction around video games.
●​ Examples: Twitch, Discord (also used for other communities), Steam.

10. Educational Networks:

● Focus: Connecting students, educators, and institutions for learning and knowledge sharing.
● Examples: Academia.edu, ResearchGate.

11. Mobile Messaging Apps (with social features):

● Focus: Primarily for private messaging, but often include features for group chats, sharing media, and social interaction.
● Examples: WhatsApp, Telegram, Facebook Messenger, WeChat.

12. Niche Social Networks:

● Focus: Catering to specific communities or demographics, such as networks for parents (CafeMom), travelers (CouchSurfing), or specific cultural groups.

Or

Types of Social Networks


Social networks can be categorized based on purpose, structure, and interaction type. Each type
serves a unique function in communication, business, entertainment, and professional growth.

1️⃣ Personal & Social Networking Sites 👥


These platforms focus on connecting individuals for social interaction, communication, and content
sharing.

💡 Examples:
●​ Facebook → Friendships, groups, and personal connections.​

●​ Instagram → Photo and video sharing.​

●​ Snapchat → Temporary media sharing.​

🔹 Key Features: Friend requests, messaging, content feeds, and live updates.

2️⃣ Professional & Career Networks 💼


Designed for business networking, career growth, and professional interactions.
💡 Examples:
●​ LinkedIn → Professional connections, job postings, and industry networking.​

●​ Xing → European professional networking platform.​

●​ AngelList → Startups and investor networking.​

🔹 Key Features: Resume sharing, job searches, endorsements, and business networking.

3️⃣ Microblogging & Real-Time News Networks 📝


Platforms that allow short-form content sharing, discussions, and real-time updates.

💡 Examples:
●​ Twitter (X) → Text-based updates, retweets, and hashtags.​

●​ Tumblr → Microblogging with multimedia content.​

●​ Threads → Instagram-based short-text sharing.​

🔹 Key Features: Short posts, trending hashtags, and real-time discussions.

4️⃣ Media Sharing Networks 📸🎥


Focused on photo, video, and content sharing.

💡 Examples:
●​ YouTube → Video sharing and streaming.​

●​ TikTok → Short-form video content.​

●​ Pinterest → Image and idea curation.

🔹 Key Features: Content creation, likes, comments, and recommendations.

5️⃣ Interest-Based & Community Networks 🎭


Connect people with shared interests, hobbies, and discussions.

💡 Examples:
●​ Reddit → Communities (subreddits) for discussions.​

●​ Quora → Q&A-based social networking.​

●​ Discord → Interest-based chat servers.

🔹 Key Features: Forums, topic-based discussions, and anonymous engagement.

6️⃣ Dating & Relationship Networks ❤️


Designed for connecting individuals for relationships or friendships.

💡 Examples:
●​ Tinder → Swipe-based dating.​

●​ Bumble → Women-first messaging.​

●​ Hinge → Relationship-oriented matching.

🔹 Key Features: Matching algorithms, chat options, and profile browsing.

7️⃣ Enterprise & Workplace Collaboration Networks 🏢


Used for team communication and business collaboration.

💡 Examples:
●​ Slack → Business communication via channels.​

●​ Microsoft Teams → Corporate messaging and video calls.​

●​ Yammer → Internal social networking for organizations.

🔹 Key Features: Workspaces, document sharing, and project collaboration.

8️⃣ Academic & Research Networks 📚


Focused on education, research collaboration, and knowledge sharing.

💡 Examples:
●​ ResearchGate → Academic networking for researchers.​

●​ Academia.edu → Sharing research papers and publications.​

●​ Google Scholar → Citation-based networking.

🔹 Key Features: Paper sharing, research collaboration, and academic discussions.

9️⃣ Gaming & Virtual Worlds 🎮


Connects gamers for multiplayer experiences, streaming, and esports.

💡 Examples:
●​ Twitch → Live game streaming.​

●​ Steam Community → Gaming discussions and networking.​

●​ Roblox & Fortnite → Virtual social interaction within games.​

🔹 Key Features: Live streaming, voice chat, and in-game socializing.

🔟 Blockchain & Decentralized Social Networks 🔗


Social platforms built on blockchain technology or on decentralized (federated) infrastructure rather than a single central operator.

💡 Examples:
●​ Mastodon → Open-source and decentralized social networking.​

●​ Steemit → Blockchain-based blogging and rewards.​

●​ Lens Protocol → Web3 social networking.​

🔹 Key Features: No central control, user-owned content, and (on blockchain platforms) cryptocurrency integration.


Clustering of Social Graphs & Community Discovery

Clustering in social graphs helps identify communities, interest groups, and influence zones. It is used
in social media analysis, recommendation systems, and fraud detection.

1. What is Social Graph Clustering?


Clustering is the process of grouping highly connected nodes (users) together based on relationships
or interactions. These groups are called communities.

🔹 Nodes (Users/Entities) → Represent people or items.​


🔹 Edges (Connections) → Represent friendships, follows, or interactions.
💡 Example:
●​ Facebook Groups → Friends with similar interests form clusters.​

●​ LinkedIn Network → Professional circles like colleagues, alumni, and industry groups.​

2. Direct Discovery of Communities in a Social Graph


🔹 A. Community Detection Methods
1️⃣ Modularity-Based Clustering 📊
● Modularity compares the number of edges inside each group with the number expected if edges were placed at random while keeping node degrees fixed.

●​ Algorithm: Louvain Algorithm (widely used).​

●​ Example: Facebook groups with high internal interactions.​
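
To make the modularity score concrete, the sketch below computes it for a given partition using the standard formula Q = Σ over communities of [L_c / m − (d_c / 2m)²], where m is the total number of edges, L_c the number of edges inside community c, and d_c the sum of degrees in c. The graph and the two partitions are made up for illustration.

```python
# Hypothetical graph: triangle 0-1-2 and triangle 3-4-5, bridged by edge 2-3.
graph = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4, 5},
    4: {3, 5},
    5: {3, 4},
}

def modularity(g, community_of):
    """Q = sum over communities of L_c/m - (d_c / (2m))**2."""
    m = sum(len(nbrs) for nbrs in g.values()) / 2  # each edge is stored twice
    internal_edges = {}
    degree_sum = {}
    for node, nbrs in g.items():
        c = community_of[node]
        degree_sum[c] = degree_sum.get(c, 0) + len(nbrs)
        for nbr in nbrs:
            if community_of[nbr] == c:
                # Each internal edge is visited from both endpoints.
                internal_edges[c] = internal_edges.get(c, 0) + 0.5
    return sum(
        internal_edges.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )

two_groups = {0: "left", 1: "left", 2: "left", 3: "right", 4: "right", 5: "right"}
one_group = {node: "all" for node in graph}

q_split = modularity(graph, two_groups)
q_single = modularity(graph, one_group)
```

Splitting the graph into its two triangles scores about 0.357, while lumping everything into one community scores 0, so a modularity-based method such as Louvain would prefer the split.
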

2️⃣ Label Propagation Algorithm (LPA) 🔄


●​ Nodes adopt the most common label of their neighbors iteratively.​

●​ Fast and scalable for large social networks.​


●​ Example: Twitter hashtag communities.​

3️⃣ Hierarchical Clustering 🌳


●​ Builds a tree-like structure where closely connected nodes form subgroups.​

●​ Example: LinkedIn suggests nested connections (company → department → team).​

4️⃣ Spectral Clustering 🎭


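● Uses the eigenvectors of a graph matrix (typically the graph Laplacian) to embed nodes in a low-dimensional space, then groups them with a standard method such as k-means.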

●​ Best for dense networks with strong community structures.​

5️⃣ Graph Neural Networks (GNNs) 🤖


● Deep learning approach that learns node embeddings from graph structure and node features; communities are then found by clustering or classifying the embeddings.

●​ Example: Facebook’s AI-driven friend suggestions.​

🔹 B. Types of Communities in Social Graphs


✅ Tightly Knit Communities → Friends & Family groups.​
✅ Interest-Based Communities → Reddit subreddits, Facebook Groups.​
✅ Influencer-Follower Groups → Twitter users around celebrities.​
✅ Workplace & Professional Circles → LinkedIn networks.

3. Real-World Applications of Social Graph Clustering


📌 1. Social Media & Marketing
🔹 Identify target audience for personalized ads.​
🔹 Find influencers & brand advocates.
💡 Example: Instagram detects fashion, tech, or sports groups for ad targeting.

📌 2. Recommendation Systems
🔹 Suggests friends, content, and products based on social clusters.​
🔹 Improves Netflix, Amazon, Spotify recommendations.
💡 Example: Netflix recommends movies liked by your community.

📌 3. Fraud Detection & Cybersecurity


🔹 Detects anomalous clusters of fake accounts or bots.​
🔹 Uncovers cyber threats & phishing networks.
💡 Example: Facebook detects bot farms spreading misinformation.

📌 4. Political & Trend Analysis


🔹 Clusters show political group affiliations.​
🔹 Predicts trending topics & movements.
💡 Example: Twitter analyzes hashtag clusters during elections.

4. Tools for Social Graph Clustering


🔹 Graph Databases: Neo4j, Amazon Neptune​
🔹 Big Data Tools: Apache Spark GraphX, GraphFrames​
🔹 Python Libraries: NetworkX, igraph, SNAP

Introduction to Recommender Systems


1. What is a Recommender System?
A Recommender System is a machine learning-based system that suggests relevant content,
products, or services to users based on their preferences, behavior, or interactions.

💡 Examples:
●​ Netflix → Suggests movies & shows based on watch history.​
●​ Amazon → Recommends products based on past purchases.​

●​ Spotify → Suggests songs based on listening habits.​

2. Types of Recommender Systems


🔹 1️⃣ Content-Based Filtering 🏷️
●​ Suggests items similar to those a user liked before.​

●​ Uses item attributes (genre, tags, description, etc.).​

💡 Example:
●​ If you watch action movies, Netflix will suggest more action films.​

🔹 Algorithms: TF-IDF, Cosine Similarity
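
A content-based recommender built from these two pieces can be sketched as follows. The item descriptions are made up, and the TF-IDF weighting used here (term frequency times log(N / document frequency)) is one common variant among several.

```python
import math
from collections import Counter

# Hypothetical item descriptions (tags) for three movies.
items = {
    "Movie A": "action car chase explosion",
    "Movie B": "action spy chase gadget",
    "Movie C": "romance wedding comedy",
}

def tfidf_vectors(docs):
    """Map each item to a sparse vector {term: tf * idf}."""
    tokenized = {name: text.split() for name, text in docs.items()}
    n = len(tokenized)
    doc_freq = Counter(term for words in tokenized.values() for term in set(words))
    vectors = {}
    for name, words in tokenized.items():
        counts = Counter(words)
        vectors[name] = {
            term: (count / len(words)) * math.log(n / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(name, vectors):
    """The other item whose vector is closest to `name`."""
    others = (other for other in vectors if other != name)
    return max(others, key=lambda other: cosine(vectors[name], vectors[other]))

vectors = tfidf_vectors(items)
suggestion = most_similar("Movie A", vectors)
```

A user who liked "Movie A" is pointed to "Movie B" because the two share the tags "action" and "chase", while "Movie C" shares none.
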

🔹 2️⃣ Collaborative Filtering 🤝


●​ Suggests items based on other users with similar preferences.​

●​ Works on user-item interaction matrix.​

💡 Example:
●​ If User A & User B like similar products, A gets recommendations from B’s choices.​

🔹 Types:​
✅ User-Based Collaborative Filtering → Finds similar users.​
✅ Item-Based Collaborative Filtering → Finds similar items.
🔹 Algorithms: Pearson Correlation, Matrix Factorization (SVD)
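
User-based collaborative filtering with Pearson correlation can be sketched like this. The ratings are invented, and the prediction rule (the user's mean plus a similarity-weighted average of neighbors' deviations from their own means, using only positively correlated neighbors) is one common textbook choice.

```python
import math

# Hypothetical user-item rating matrix (1-5 stars); missing keys mean "not rated".
ratings = {
    "ana": {"m1": 5, "m2": 3, "m3": 4},
    "ben": {"m1": 4, "m2": 2, "m3": 5, "m4": 5},
    "cy": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
}

def pearson(a, b):
    """Pearson correlation over the items both users have rated."""
    common = sorted(set(ratings[a]) & set(ratings[b]))
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings[a][i] for i in common) / len(common)
    mean_b = sum(ratings[b][i] for i in common) / len(common)
    cov = sum((ratings[a][i] - mean_a) * (ratings[b][i] - mean_b) for i in common)
    var_a = sum((ratings[a][i] - mean_a) ** 2 for i in common)
    var_b = sum((ratings[b][i] - mean_b) ** 2 for i in common)
    return cov / math.sqrt(var_a * var_b) if var_a and var_b else 0.0

def predict(user, item):
    """Predict a rating from positively correlated users who rated the item."""
    user_mean = sum(ratings[user].values()) / len(ratings[user])
    numerator = denominator = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        sim = pearson(user, other)
        if sim <= 0:
            continue  # ignore users with opposite or unrelated taste
        other_mean = sum(ratings[other].values()) / len(ratings[other])
        numerator += sim * (ratings[other][item] - other_mean)
        denominator += sim
    return user_mean + numerator / denominator if denominator else user_mean

predicted = predict("ana", "m4")
```

Here "ana" correlates positively with "ben" and negatively with "cy", so the prediction for "m4" follows "ben", who rated it one star above his own average.
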

🔹 3️⃣ Hybrid Recommender Systems 🔄


●​ Combines Content-Based + Collaborative Filtering for better accuracy.​

💡 Example:
●​ Netflix uses both user preferences & similar user behavior to recommend shows.​

🔹 Algorithms: Weighted Hybrid, Stacking Models
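
A weighted hybrid simply blends the scores of the two approaches. In the sketch below the per-item scores and the weight `alpha` are hypothetical; in practice the scores come from trained models and the weight is tuned on held-out data.

```python
# Hypothetical scores in [0, 1] for one user, produced by two separate models.
content_scores = {"m1": 0.9, "m2": 0.2, "m3": 0.5}
collab_scores = {"m1": 0.3, "m2": 0.8, "m3": 0.7}

def hybrid_ranking(content, collab, alpha):
    """Rank items by alpha * content score + (1 - alpha) * collaborative score."""
    items = set(content) | set(collab)
    blended = {
        item: alpha * content.get(item, 0.0) + (1 - alpha) * collab.get(item, 0.0)
        for item in items
    }
    return sorted(blended, key=blended.get, reverse=True)

content_heavy = hybrid_ranking(content_scores, collab_scores, alpha=0.6)
collab_heavy = hybrid_ranking(content_scores, collab_scores, alpha=0.2)
```

Shifting the weight changes the order of the recommendations: a content-heavy blend puts "m1" first, while a collaborative-heavy blend puts "m2" first.
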

🔹 4️⃣ Knowledge-Based Recommender Systems 🧠


●​ Uses domain-specific rules & expert knowledge.​

●​ Used when user history is unavailable (e.g., cold start problem).​

💡 Example:
●​ Travel booking sites recommend places based on user preferences (beach, adventure, budget).​

3. Challenges in Recommender Systems


🚧 Cold Start Problem → New users/items have no interaction history.​
🚧 Scalability Issues → Processing millions of users & items efficiently.​
🚧 Sparsity → Most users interact with only a few items, making predictions difficult.​
🚧 Diversity vs. Accuracy → Balance between relevant & diverse recommendations.

4. Real-World Applications of Recommender Systems


✅ E-commerce → Amazon, eBay product recommendations​
✅ Streaming Services → Netflix, Spotify, YouTube content suggestions​
✅ Social Media → Facebook, Twitter personalized feeds​
✅ Online Learning → Coursera, Udemy course recommendations​
✅ Healthcare → Personalized treatment recommendations

5. Tools & Technologies for Building Recommender Systems

🔹 Libraries: Surprise, LightFM, Scikit-Learn, TensorFlow Recommenders​
🔹 Big Data Technologies: Apache Spark MLlib, Hadoop​
🔹 Graph-Based Approaches: Neo4j, NetworkX
