Social Media Analytics - Unit I

Uploaded by

SIVATHMIKA C

SOCIAL MEDIA ANALYTICS - UNIT I

DATA IDENTIFICATION:
Data identification involves locating the right datasets required for analysis. This
step is crucial as the quality and relevance of the data directly impact the
insights and conclusions drawn from it. In the context of social media analytics,
data can come from various sources such as:
 Social media platforms (e.g., Twitter, Facebook, Instagram)
 Forums and discussion boards (e.g., Reddit)
 Blogs and news sites
 Public datasets and APIs
DATA MEANING:
Data Meaning refers to interpreting and understanding the significance and
implications of the data. This process involves contextualizing the data,
assessing its relevance, analyzing sentiments, and identifying trends. Here’s a
detailed explanation of each aspect:
1. Contextual Analysis
Contextual Analysis involves understanding the circumstances and conditions
in which the data was generated. This helps in interpreting the data accurately
and deriving meaningful insights.
Key Components:
 Source: Understanding where the data comes from (e.g., social media
platform, survey, website).
 Content: Analyzing the actual content, such as text, images, or videos.
 Metadata: Evaluating additional information like timestamps, user
demographics, and location.
 Historical Context: Considering past events or trends that might
influence the data.
Example:
Analyzing tweets about a political event requires understanding the event itself,
the timing of the tweets, and the users' locations to accurately interpret the
sentiments and opinions expressed.
2. Relevance
Relevance refers to the importance and applicability of the data to the specific
objectives of the analysis. Relevant data directly impacts the accuracy and
usefulness of the insights derived.
Key Components:
 Objective Alignment: Ensuring the data aligns with the goals of the
analysis.
 Scope: Ensuring the data covers the necessary aspects and dimensions
required.
 Completeness: Ensuring the data is comprehensive and not missing
critical information.
 Quality: Ensuring the data is accurate, reliable, and free from errors.
Example:
For a marketing campaign targeting young adults, data on social media usage
patterns and preferences of users aged 18-25 is highly relevant, while data on
older age groups may be less relevant.
3. Sentiment Analysis
Sentiment Analysis involves determining the emotional tone behind a series of
words to gain an understanding of the attitudes, opinions, and emotions
expressed within an online mention.
Key Components:
 Polarity: Classifying the sentiment as positive, negative, or neutral.
 Intensity: Measuring the strength of the sentiment expressed.
 Context: Considering the context in which sentiments are expressed to
avoid misinterpretation.
 Language Processing: Using natural language processing (NLP)
techniques to analyze text data.
Example:
Analyzing product reviews on an e-commerce website can reveal customers'
positive or negative sentiments towards specific product features, helping
businesses make informed decisions.
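The polarity, intensity, and context components above can be sketched with a very small lexicon-based scorer. This is only an illustration: the word scores in LEXICON and the one-word negation rule are assumptions made for this example, and practical sentiment analysis normally relies on much larger lexicons or trained NLP models.

```python
import re

# Hypothetical mini-lexicon (word -> score); values are illustrative only.
LEXICON = {"great": 0.8, "love": 0.9, "good": 0.5,
           "poor": -0.6, "terrible": -0.9, "slow": -0.4}
NEGATIONS = {"not", "never", "no"}

def score_sentiment(text):
    """Return (polarity_label, intensity) for a piece of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    for i, tok in enumerate(tokens):
        if tok in LEXICON:
            value = LEXICON[tok]
            # Context: flip the score when the previous word is a negation.
            if i > 0 and tokens[i - 1] in NEGATIONS:
                value = -value
            total += value
    if total > 0:
        label = "positive"
    elif total < 0:
        label = "negative"
    else:
        label = "neutral"
    # Intensity is taken as the magnitude of the summed score.
    return label, abs(total)

print(score_sentiment("The battery life is great"))
print(score_sentiment("The screen is not good"))
```

The negation check is the simplest possible form of context handling; sarcasm and longer-range negation would still be misread by this sketch.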
4. Trend Analysis
Trend Analysis involves examining data over time to identify patterns, trends,
and changes. This helps in understanding how variables of interest evolve and
predicting future movements.
Key Components:
 Temporal Patterns: Identifying trends over specific time periods (daily,
monthly, yearly).
 Cyclical Patterns: Recognizing repetitive patterns or cycles in the data.
 Anomalies: Detecting unexpected changes or outliers in the data.
 Comparative Analysis: Comparing current trends with historical data to
gauge changes.
Example:
Analyzing social media mentions of a brand over several months can reveal
trends in customer engagement, identify peak times of interest, and highlight
any significant changes in public perception.
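A minimal sketch of two of the components above, temporal patterns and anomalies, is given here. The daily mention counts are invented for illustration, and the choice of a trailing moving average and a two-standard-deviation rule is an assumption, not a prescribed method.

```python
from statistics import mean, pstdev

def moving_average(values, window):
    """Trailing moving average; one output value per full window."""
    return [mean(values[i - window + 1 : i + 1])
            for i in range(window - 1, len(values))]

def find_anomalies(values, threshold=2.0):
    """Indices whose value is more than `threshold` std devs from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Illustrative daily brand-mention counts (made-up numbers).
daily_mentions = [120, 130, 125, 128, 122, 900, 127, 131]

print(moving_average(daily_mentions, 3))  # smoothed trend
print(find_anomalies(daily_mentions))     # positions of unusual days
```

The moving average smooths day-to-day noise so the underlying trend is easier to see, while the anomaly check points to days that deserve a closer look.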

VALUE PYRAMID:
The value pyramid is a conceptual model used to illustrate the progression from
raw data to wisdom. This progression represents increasing value and utility at
each level. Here's a breakdown of each stage in the pyramid:
1. Noisy Data:
o Description: This is raw, unprocessed data that contains errors,
redundancies, and irrelevant information.
o Value: Low. It's difficult to derive meaningful insights from noisy
data due to its unrefined state.
2. Filtered Data:
o Description: Noisy data that has been cleaned and organized,
removing errors, redundancies, and irrelevant information.
o Value: Moderate. While still basic, filtered data is more reliable and
easier to work with than noisy data.
3. Information:
o Description: Filtered data that has been processed, organized, and
structured in a way that gives it context and meaning. This often
involves summarizing or categorizing data.
o Value: Higher. Information is useful for understanding specific
aspects of a situation or phenomenon.
4. Knowledge:
o Description: Information that has been synthesized and analyzed
to provide insights, patterns, and understanding. Knowledge is often
gained through experience or study.
o Value: High. Knowledge allows for informed decision-making and
problem-solving.
5. Wisdom:
o Description: The application of knowledge in practical, judicious,
and ethical ways. Wisdom involves deep understanding and the
ability to make sound judgments and decisions.
o Value: Highest. Wisdom enables individuals and organizations to
navigate complex situations and achieve long-term success.
Visual Representation
Each level of the pyramid builds upon the previous one, adding value and
utility. The progression from noisy data to wisdom represents the
transformation of raw inputs into valuable outputs that can guide actions and
decisions.
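The first three levels of the pyramid (noisy data, filtered data, information) can be illustrated with a short sketch. The sample posts, the rule that treats any post containing a link as irrelevant, and the keyword-to-topic table are all assumptions made for this example.

```python
from collections import Counter

# Noisy data: made-up posts with duplicates, an empty record, and spam.
raw_posts = [
    "Love the new watch!",
    "love the new watch!",
    "",
    "Battery drains too fast",
    "   Battery drains too fast  ",
    "BUY FOLLOWERS NOW http://spam.example",
]

def filter_posts(posts):
    """Noisy -> filtered: drop empty, link-only spam, and duplicate posts."""
    seen, cleaned = set(), []
    for post in posts:
        text = post.strip()
        key = text.lower()
        # Assumption: a post containing a link is treated as irrelevant.
        if not text or "http://" in key or key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned

def summarize(posts):
    """Filtered -> information: count posts per hypothetical topic."""
    topics = {"battery": "battery", "watch": "design"}
    counts = Counter()
    for post in posts:
        for keyword, topic in topics.items():
            if keyword in post.lower():
                counts[topic] += 1
    return dict(counts)

filtered = filter_posts(raw_posts)
print(filtered)
print(summarize(filtered))
```

Knowledge and wisdom are not shown because they depend on human interpretation of such summaries rather than on a mechanical step.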

DATA ATTRIBUTES:
The 5 V's of big data are defined as follows:
1. Velocity is the speed at which the data is created and how fast it moves.
2. Volume is the amount of data; very large volumes are what qualify data as
big data.
3. Value is the benefit the data provides.
4. Variety is the diversity that exists in the types of data.
5. Veracity is the data's quality and accuracy.

Velocity
Velocity refers to the speed at which data is generated and processed. This is
crucial for organizations needing real-time data to make timely decisions.
Example: In healthcare, medical devices like in-hospital monitors and wearable
devices continuously collect patient data. This data needs to be transmitted and
analyzed rapidly to ensure timely medical interventions.
Volume
Volume pertains to the sheer amount of data collected. Big data is characterized
by its large volume, which requires significant storage and processing
capabilities.
Example: A retail company with hundreds of stores generates millions of
transactions daily. The total number of these transactions represents the volume
of data, qualifying it as big data.
Value
Value is the benefit derived from analyzing big data. It's essential for
organizations to extract meaningful insights to enhance their operations and
decision-making processes.
Example: By gathering and analyzing individual customer data, a company can
create detailed customer profiles. This allows for personalized marketing and
sales strategies, improving customer satisfaction and operational efficiency.
Variety
Variety refers to the different types of data collected from various sources. This
includes structured, semi-structured, and unstructured data from both internal
and external sources.
Example: An organization collects data from multiple sources like social media,
transactional databases, and customer feedback forms. The challenge lies in
standardizing and integrating this diverse data for comprehensive analysis.
Veracity
Veracity concerns the accuracy, quality, and trustworthiness of data. Ensuring
high data veracity is crucial for deriving reliable and valuable insights.
Example: Collected data may have missing values, inaccuracies, or
inconsistencies. Ensuring high veracity means implementing rigorous data
cleaning and validation processes to maintain the integrity and credibility of the
data.
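A validation step of the kind described under Veracity might look like the sketch below. The required field names and the plausible age range are assumptions chosen for illustration; a real pipeline would define its own rules.

```python
def validate_record(record, required=("user_id", "text", "timestamp")):
    """Return a list of problems in one record (an empty list means it passed)."""
    problems = []
    for field in required:
        value = record.get(field)
        # Treat None and blank strings as missing values.
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(f"missing {field}")
    age = record.get("age")
    # Assumed plausibility bounds; adjust to the platform being analyzed.
    if age is not None and not (13 <= age <= 120):
        problems.append("implausible age")
    return problems

good = {"user_id": "u1", "text": "hi", "timestamp": "2024-01-01"}
bad = {"user_id": "u2", "text": " ", "timestamp": None, "age": 300}

print(validate_record(good))
print(validate_record(bad))
```

Records that return a non-empty list can be repaired, flagged, or dropped before analysis, depending on how critical the missing field is.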
CASTING A NET:
Casting a net in social media analytics is a comprehensive strategy to collect and
analyze data from various sources to gain a thorough understanding of public
sentiment, trends, and interactions. This approach involves using different
methods and tools to ensure no relevant information is missed.
Explanation of Casting a Net:
Casting a net involves gathering a broad and varied set of data from multiple
sources to ensure comprehensive insights. Here’s how each component
contributes to this strategy, with real-time examples and sources of information.
1. Broad Search Query
Definition:
A broad search query uses general and inclusive terms to capture a wide range
of data related to a topic. It aims to gather all possible mentions and discussions.
How It Works:
You create search queries that include broad and general terms. This helps
capture various ways people might discuss the topic, even if they use different
language or terminology.
Real-Time Example:
Scenario: A company is launching a new smartwatch and wants to understand
overall consumer interest.
Implementation: Instead of searching for “SmartWatch XYZ reviews,” they use
broader terms like “best smartwatches 2024” or “latest wearable technology.”
This approach gathers a wide array of related discussions and reviews.
Where We Get This Information:
Search Engines: Google, Bing (for articles, blog posts, and news).
Social Media Platforms: Twitter, Facebook, Instagram (for user-generated content
and comments).
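The difference between a narrow and a broad query can be seen in a small sketch that applies both to the same set of posts. The post texts are invented, and simple case-insensitive substring matching stands in for whatever query syntax a real search engine or platform offers.

```python
def matches_query(text, terms):
    """True if the text contains any query term (case-insensitive)."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)

# Made-up posts for illustration.
posts = [
    "SmartWatch XYZ reviews are in and they look solid",
    "Which are the best smartwatches 2024 has to offer?",
    "The latest wearable technology is getting cheaper",
    "My old analog watch still works fine",
]

narrow = ["SmartWatch XYZ reviews"]
broad = ["smartwatch", "wearable technology"]

narrow_hits = [p for p in posts if matches_query(p, narrow)]
broad_hits = [p for p in posts if matches_query(p, broad)]

print(len(narrow_hits), len(broad_hits))
```

The broad terms pick up related discussions that never mention the product by name, at the cost of more results to filter afterwards.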
2. Automation Tools
Definition:
Automation tools are software applications that automate the process of data
collection, analysis, and reporting, providing real-time insights and reducing
manual effort.

How It Works:
These tools continuously track relevant data, analyze trends, and generate
reports automatically. This allows for efficient and timely monitoring of large
volumes of information.
Real-Time Example:
Scenario: A company wants to track social media feedback on its new smart
watch line.
Implementation: Using tools like Hootsuite or Sprout Social, the company sets up
automated tracking for mentions, sentiment analysis, and engagement metrics
related to the smart watches. These tools provide real-time updates and alerts.
Where We Get This Information:
Automation Tools: Hootsuite, Sprout Social, Buffer (for tracking and reporting).
3. Multiple Platforms of Gathering Data
Definition:
Collecting data from various social media platforms and other sources to ensure
a comprehensive view of user interactions and sentiments.
How It Works:
Data is gathered from different platforms to capture a wide range of discussions
and interactions. This provides a holistic view of the topic or brand.
Real-Time Example:
Scenario: A smart watch company wants to understand international reactions to
a new product launch.
Implementation: Using tools like Brandwatch, the company collects data from
Twitter, Facebook, Instagram, LinkedIn, and tech forums. This multi-platform
approach ensures they capture diverse discussions from different regions and
audiences. For example, the Twitter API can be used to collect tweets
mentioning smart watch related keywords.
Where We Get This Information:
Social Media Platforms: Twitter, Facebook, Instagram, LinkedIn (for varied user
interactions).
Data Aggregation Tools: Brandwatch, BuzzSumo (for comprehensive data
collection).
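When data comes from several platforms, each source usually has its own record layout, so a common step is to map everything onto one shared schema before analysis. The sketch below assumes two hypothetical sources; the field names (tweet_text, handle, body, username) are invented for illustration and are not the real field names of any platform's API.

```python
# Hypothetical per-platform field names mapped onto a shared schema.
FIELD_MAP = {
    "twitter": {"text": "tweet_text", "author": "handle"},
    "forum": {"text": "body", "author": "username"},
}

def normalize(platform, raw):
    """Convert a platform-specific record into the shared schema."""
    mapping = FIELD_MAP[platform]
    return {
        "platform": platform,
        "text": raw[mapping["text"]],
        "author": raw[mapping["author"]],
    }

# Made-up records standing in for collected data.
collected = [
    ("twitter", {"tweet_text": "New smart watch looks great", "handle": "@ana"}),
    ("forum", {"body": "Is the strap replaceable?", "username": "techfan"}),
]

combined = [normalize(platform, raw) for platform, raw in collected]
for row in combined:
    print(row)
```

Once every record has the same fields, the same sentiment or trend analysis can run across all platforms at once.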
4. Continuous Monitoring
Definition:
Ongoing tracking and analysis of data to capture real-time trends, feedback, and
changes. It involves regularly reviewing data to stay updated.
How It Works:
Continuous monitoring involves setting up alerts and frequently checking data to
track shifts in trends, sentiment, and user engagement.
Real-Time Example:
Scenario: A company is running a promotional campaign and wants to monitor
customer reactions in real time.
Implementation: Using tools like Google Alerts and Mention, the company
continuously tracks mentions and feedback related to the campaign. This
enables quick responses to emerging issues or opportunities.
Where We Get This Information:
Monitoring Tools: Google Alerts, Mention, Sprout Social (for real-time tracking
and notifications).
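The alerting idea behind continuous monitoring can be sketched as a rolling window over incoming mentions. The MentionMonitor class, the window size, and the alert threshold are assumptions made for this example; commercial monitoring tools implement their own, more elaborate rules.

```python
from collections import deque

class MentionMonitor:
    """Keep a rolling window of recent mentions and flag a high negative share."""

    def __init__(self, window_size=5, alert_share=0.5):
        self.window = deque(maxlen=window_size)
        self.alert_share = alert_share

    def add(self, sentiment_label):
        """Record one mention; return True when an alert should be raised."""
        self.window.append(sentiment_label)
        if len(self.window) < self.window.maxlen:
            return False  # not enough mentions collected yet
        negative = sum(1 for label in self.window if label == "negative")
        return negative / len(self.window) >= self.alert_share

monitor = MentionMonitor(window_size=5, alert_share=0.5)
# Made-up stream of sentiment labels arriving over time.
stream = ["positive", "negative", "negative", "neutral", "negative", "negative"]
alerts = [monitor.add(label) for label in stream]
print(alerts)
```

Because the window only keeps the most recent mentions, old feedback drops out automatically and the alert reflects the current mood.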
5. Inclusive Criteria
Definition:
Using broad and varied criteria to ensure all relevant data is captured. This
approach avoids missing significant information by employing wide-ranging
search terms and filters.
How It Works:
Inclusive criteria involve setting up diverse search terms and parameters to
cover all potential mentions and interactions related to the topic.
Real-Time Example:
Scenario: A smart watch company is assessing the impact of a recent health
tracker watch campaign.
Implementation: The organization monitors variations of the campaign’s name,
related hashtags, and general terms like “smart watch” or “health tracker
watch.” This ensures comprehensive data collection.
Where We Get This Information:
Search Engines: Google, Bing (for broad search terms).
Social Media Platforms: Twitter, Facebook, Instagram (for diverse content).
Analytics Tools: Mention, Brandwatch (for broad analysis).
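Inclusive criteria often come down to matching many spellings of the same term. The sketch below uses one regular expression to catch spacing, hyphen, hashtag, and plural variations of "smart watch" and "health tracker watch"; the exact set of variations covered is an assumption and would need tuning for a real campaign.

```python
import re

# Optional leading '#', optional space or hyphen between words, optional plural.
PATTERN = re.compile(
    r"#?\b(?:smart[\s-]?watch(?:es)?"
    r"|health[\s-]?tracker(?:[\s-]?watch(?:es)?)?)\b",
    re.IGNORECASE,
)

def find_mentions(text):
    """Return every variation of the tracked terms found in the text."""
    return PATTERN.findall(text)

print(find_mentions("Loving my new #SmartWatch and its health-tracker watch mode"))
print(find_mentions("Are smart watches worth it?"))
print(find_mentions("I bought a stopwatch"))
```

A single inclusive pattern is easier to maintain than a long list of exact phrases, though it can also match unintended text, so results should be spot-checked.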

REGULAR EXPRESSIONS:
A regular expression is a sequence of characters that defines a search pattern,
used to match or find strings within existing data. In Python, regular
expressions are provided by the re module.
1. Match function - this function attempts to match a regular expression
pattern at the beginning of an existing string. The syntax for this function is:
re.match(pattern, string, flags=0)

The parameters to the match function are pattern, string, and flags. The
re.match function returns a match object on success and None if there is no
match.

2. Match object methods

 group(num=0) – this method returns the entire match (num=0) or the
specific subgroup with the given number.
 groups() – this method returns all matching subgroups in a tuple.
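A short example of re.match together with the match object methods is given here. The sample line and the pattern are made up for illustration.

```python
import re

line = "user42 posted at 18:30"  # illustrative input

# Three subgroups: the username, the hour, and the minute.
m = re.match(r"(\w+) posted at (\d{2}):(\d{2})", line)
if m:
    print(m.group(0))  # the entire match
    print(m.group(1))  # the first subgroup
    print(m.groups())  # all subgroups as a tuple

# re.match only tries the start of the string, so this returns None.
no_match = re.match(r"posted", line)
print(no_match)
```

Checking the result before calling group() matters because calling a method on None raises an AttributeError.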

3. Search function – this function scans the string and returns the first
occurrence of the regular expression pattern, wherever it appears. The syntax:
re.search(pattern, string, flags=0)
The parameters are pattern, string, and flags. The re.search function returns a
match object on success and None if there is no match.
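The following sketch contrasts re.search with re.match on the same text and shows the flags parameter in use. The post text is invented for illustration.

```python
import re

text = "Loved the launch event! #SmartWatch2024 #wearables"  # illustrative post

# re.search scans the whole string and stops at the first hashtag.
first_tag = re.search(r"#(\w+)", text)
if first_tag:
    print(first_tag.group(0))  # hashtag including '#'
    print(first_tag.group(1))  # hashtag text only

# re.match fails here because the string does not start with '#'.
at_start = re.match(r"#(\w+)", text)
print(at_start)

# The flags parameter changes matching behaviour, e.g. ignoring case.
ci = re.search(r"#smartwatch\d+", text, flags=re.IGNORECASE)
print(ci.group(0) if ci else None)
```

In social media text the term of interest is rarely at the very start of a post, so re.search is usually the more useful of the two functions.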

Looking for the Right Subset of People


Objective: Identify specific groups of people based on various attributes for
targeted analysis.
Attributes and Techniques:
Employment
Technique: Filter users based on job titles or company names. Example:
Targeting marketing professionals by searching for job titles like "Marketing
Manager" or company names like "Google" in user profiles.
Sentiment
Technique: Analyze sentiment scores to identify users with specific opinions.
Example: Using sentiment analysis tools to find users expressing positive or
negative sentiments about a product, and then targeting them for marketing or
feedback purposes.
Location/Geography
Technique: Use location data to target users from specific regions. Example:
Filtering social media posts by geotags to focus on users from New York City for
a local event promotion.
Language
Technique: Filter users based on the language of their posts. Example:
Identifying Spanish-speaking users by detecting the language of their posts for a
marketing campaign aimed at Spanish-speaking regions.
Age
Technique: Use self-reported age or inferred age based on content and
behavior. Example: Targeting users in the 18-25 age group by analyzing self-
reported age in profiles or inferred age from content topics and engagement
patterns.
Gender
Technique: Use profile information or NLP techniques to infer gender.
Example: Inferring gender from profile information or using NLP techniques to
analyze names and pronouns in posts to segment users for gender-specific
products.
Profession/Expertise
Technique: Identify users based on their professional background or expertise
mentioned in profiles. Example: Finding medical professionals by searching for
terms like "doctor," "nurse," or "healthcare professional" in user bios for a
medical survey.
Eminence/Popularity
Technique: Filter based on follower count, engagement metrics, or verified
status. Example: Identifying influencers by their high follower count and
engagement rates for a brand partnership.
Role
Technique: Identify users based on their roles (e.g., influencers, customers,
employees). Example: Segmenting users into groups like influencers,
customers, or employees based on their interactions and content, such as
identifying brand ambassadors for a promotional campaign.
Specific People or Groups
Technique: Use specific usernames or group identifiers to target particular
individuals or communities. Example: Targeting a specific community by
identifying and following key group identifiers or hashtags, such as targeting
members of a specific online forum or social media group.
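Several of the attribute filters above (profession, location, language, popularity) can be combined in one selection function. This is a sketch only: the UserProfile fields and the sample profiles are hypothetical, and real platforms expose different and often more limited profile data.

```python
from dataclasses import dataclass

@dataclass
class UserProfile:
    # Hypothetical profile fields chosen for this example.
    username: str
    bio: str
    location: str
    language: str
    followers: int

def select_users(profiles, *, bio_terms=(), location=None,
                 language=None, min_followers=0):
    """Return profiles that satisfy every filter that was supplied."""
    selected = []
    for p in profiles:
        bio = p.bio.lower()
        if bio_terms and not any(term.lower() in bio for term in bio_terms):
            continue
        if location is not None and p.location != location:
            continue
        if language is not None and p.language != language:
            continue
        if p.followers < min_followers:
            continue
        selected.append(p)
    return selected

# Made-up profiles for illustration.
profiles = [
    UserProfile("ana", "Marketing Manager at a startup", "New York City", "en", 1200),
    UserProfile("luis", "Enfermero y corredor", "Madrid", "es", 300),
    UserProfile("dr_kim", "Doctor and healthcare professional", "New York City", "en", 54000),
]

medical = select_users(profiles, bio_terms=["doctor", "nurse"])
local_popular = select_users(profiles, location="New York City", min_followers=1000)
print([p.username for p in medical])
print([p.username for p in local_popular])
```

Note that the English keyword list misses the Spanish-language bio of a nurse, which shows why language filtering and inclusive search terms need to be considered together.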
