IM08


Big Data Components

Learning Objectives
• Big data definition
• The key drivers for big data solutions
• Building blocks for big data solutions
• Critical success factors for big data
• Tools for big data

Big Data
• Our world is becoming more interconnected, with vast
amounts of data being generated from various sources.
• Unstructured data, like social media content and sensor
readings, makes up a significant portion, around 80%, of
global data.
• Examples such as 2.9 million emails sent every second, along with
30 billion Facebook posts, highlight the scale of data creation.
• Enterprises are shifting focus from structured data to
analyzing diverse data types for valuable insights, leading to
the rise of big data solutions.

Big Data
• Big data challenges have been present in various industries
before big data technologies emerged.
• Industries like oil and gas exploration and stock exchanges
have dealt with large volumes of data requiring rapid
processing.
• Oil and gas companies process sensor data, seismic images,
and well log data for real-time insights during rig operations.
• Stock exchanges handle real-time stock indexes, necessitating
large data volume processing.

Big Data Definition
• Transitioning from traditional data warehouses to big data
environments isn't always clear-cut due to differing
definitions across organizations and industries.
• A practical definition of big data includes processing, storing,
and analyzing large and diverse data sets, including
structured, semi-structured, and unstructured data.
• Big data solutions are necessary when traditional information
management technologies struggle with the scale and
complexity of these data sets.

Big Data Definition
• The definition of big data emphasizes its ability to handle
heterogeneous data types that traditional systems cannot
manage effectively.
• Enterprises adopt big data technologies when they encounter
data sets beyond the capabilities of traditional information
management tools, signaling a shift in their data processing
and analytics strategies.

Big Data Definition
• Gartner's definition of big data is based on the three Vs:
volume, velocity, and variety.
– Volume refers to the enormous amounts of data generated from
various sources such as the Internet and sensors in mechanical
devices.
– Velocity relates to the high speed at which streaming data from
sensors, RFID tags, and smart meters is generated, requiring near
real-time processing.
– Variety encompasses the diverse forms of data, including structured
data from enterprise applications, semi-structured data from weblogs,
and unstructured data from text documents, emails, and social media.
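
As a rough illustration of variety, the sketch below reads one record of each form using only the Python standard library. The sample records and field names are invented for this example and do not come from any real system.

```python
import csv
import io
import json
import re

# Hypothetical sample records, one per data form (all values are made up).
structured = "order_id,customer,amount\n1001,alice,25.50\n"
semi_structured = '{"ts": "2024-01-01T10:00:00", "path": "/cart", "status": 200}'
unstructured = "Loved the fast delivery, but the box arrived damaged."

# Structured: a fixed schema, so every field is addressable by column name.
orders = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing keys, but no schema enforced up front.
weblog = json.loads(semi_structured)

# Unstructured: free text; any structure has to be extracted, here by tokenizing.
tokens = re.findall(r"[a-z']+", unstructured.lower())

print(orders[0]["amount"], weblog["status"], len(tokens))
```

The point of the sketch is that each form needs a different access path: column lookup, key lookup, and text extraction.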

Common data types of big data

Determining if an enterprise is ready
for big data
• Involves assessing specific characteristics within its
information landscape:
– Large Data Volumes: The enterprise deals with substantial data
volumes from various sources like enterprise applications, social
media, machine data, weblogs, and weather data.
– Diverse Data Types: The data includes a mix of structured,
unstructured, and semi-structured types, reflecting a broad range of
information formats.
– Longer Data Retention: Data is stored for extended periods due to
regulatory and compliance requirements, leading to a search for cost-
effective storage solutions like Hadoop.

Determining if an enterprise is ready
for big data
• Involves assessing specific characteristics within its
information landscape:
– Wide Application Usage: Data serves multiple applications such as
customer retention, loyalty analysis, weather impact on sales, etc.,
necessitating integration of structured and unstructured data for
comprehensive insights.
– Time Sensitivity and Decision Making: There are pressures to reduce
time to market and enable faster decision-making. This requires
technologies that can handle diverse data types efficiently, produce
actionable insights, and support decision-making processes across
product design and customer relationship management.

Real-world examples showcase how
enterprises have embraced the realm
of big data:
• Wind Power Companies: By analyzing petabytes of weather
data, wind power companies can swiftly determine optimal
location sites, reducing analysis time from weeks to minutes,
thanks to big data technologies.
• Logistics Companies like UPS: Utilize big data analytics from
truck sensors to improve route planning, leading to cost
savings through reduced driver time and fuel consumption.

Real-world examples showcase how
enterprises have embraced the realm
of big data:
• Social Media Platforms: Twitter and Facebook process
terabytes of data daily, while over 200 million smart meters
contribute to the data explosion, highlighting the widespread
adoption of big data technologies.
• Utility Companies: Use big data for forecasting energy
production, leveraging insights from weather data analysis to
meet potential demand efficiently.

Key Drivers for Big Data Solutions

Data Monetization Opportunities

• Big data offers ways for companies to make money from
data they already have.
• For instance, retailers can use weather data to offer
promotions not affected by weather, or analyze store data to
improve their promotions.
• Utility companies can use real-time data from smart meters
to offer energy discounts during low-demand times.
• Telecom companies can improve their services by analyzing
call data to prevent customers from leaving. These methods
help create new business opportunities and increase
revenue.
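
The smart-meter example above can be sketched in a few lines. This is a minimal illustration, assuming meter readings have already been aggregated into kWh per hour; the function name `low_demand_hours`, the 80% threshold, and the sample readings are all hypothetical.

```python
from statistics import mean

def low_demand_hours(hourly_kwh, threshold_ratio=0.8):
    """Return the hours whose demand is below threshold_ratio times the average."""
    average = mean(hourly_kwh.values())
    cutoff = threshold_ratio * average
    return sorted(hour for hour, kwh in hourly_kwh.items() if kwh < cutoff)

# Hypothetical aggregated smart-meter readings (hour of day -> kWh).
readings = {0: 40, 3: 30, 6: 70, 9: 110, 12: 120, 15: 100, 18: 150, 21: 100}

# Hours in which a utility might consider offering a discount.
discount_hours = low_demand_hours(readings)
print(discount_hours)
```

A production system would work on streaming data at far larger scale, but the decision rule (compare current demand with a baseline) has the same shape.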

New Product Innovations

• Big data solutions help companies analyze social media
trends, customer feedback, and transaction data to
understand product usage and customer preferences.
• By combining this information, enterprises can gain insights
into new product ideas that align with customer demand or
even create entirely new markets.
• This process of analyzing data for product innovation is a
key driver for businesses using big data solutions.

Deeper Customer Insights

• Big data solutions offer businesses a comprehensive view of
customer behavior by combining structured and
unstructured data from various sources like CRM systems,
ERP systems, social media, and more.
• This deeper understanding is achieved through analyses
such as promotion analysis, clickstream analysis, and churn
prediction.
• By quickly processing large datasets, businesses can make
faster decisions and uncover new opportunities to engage
with customers effectively, improving overall customer
insights and relationship management.

Operational Process Efficiencies

• Big data solutions facilitate real-time tracking of events in
operational systems, leading to process optimizations and
prompt corrective actions based on triggered events/alerts.
• This capability enhances operational process efficiencies
across various industries.
• Examples include patient monitoring in ICUs, optimizing
transportation networks like roads and shipping routes,
inventory management through sales linkage, monitoring
drill rig operations in the oil and gas sector, and
environmental analysis for wind farm companies using
weather patterns.
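
Event-triggered alerting of the kind described above can be sketched as a rolling average over a stream of sensor readings. This is an illustrative assumption about how such a trigger might work, not a description of any specific product; the `monitor` function, the window size, and the readings are hypothetical.

```python
from collections import deque

def monitor(readings, limit, window=3):
    """Yield (index, rolling_average) whenever the rolling average exceeds limit."""
    recent = deque(maxlen=window)  # keeps only the latest `window` readings
    for i, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window:
            avg = sum(recent) / window
            if avg > limit:
                yield i, avg

# Hypothetical pressure readings from a drill-rig sensor (arbitrary units).
stream = [100, 102, 101, 110, 125, 130, 104, 99]
alerts = list(monitor(stream, limit=110))
for index, avg in alerts:
    print(f"alert at reading {index}: rolling average {avg:.1f}")
```

Averaging over a window rather than testing single readings is one simple way to avoid raising an alert on an isolated spike.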

Fraud Detection and Reduction of
Risk
• Big data solutions play a crucial role in proactively
managing fraud detection and reducing compliance risks in
businesses.
• By combining structured and unstructured data sets and
integrating historical data with fraud modeling, these
solutions can detect patterns and enhance fraud detection
capabilities.
• Additionally, integrating identity data with surveillance
further strengthens fraud prevention measures, helping
businesses mitigate risks associated with fraudulent
activities and regulatory reporting.
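
One very simple way to combine historical data with a detection rule is a z-score test on transaction amounts. The sketch below uses that technique purely as an illustration; real fraud models are considerably more sophisticated, and the function name, cutoff, and amounts here are hypothetical.

```python
from statistics import mean, stdev

def flag_outliers(history, new_amounts, z_cutoff=3.0):
    """Flag amounts more than z_cutoff standard deviations from the historical mean."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation of past transactions
    return [amt for amt in new_amounts if abs(amt - mu) > z_cutoff * sigma]

# Hypothetical past transaction amounts for one customer.
history = [20, 25, 22, 30, 28, 24, 26, 21, 27, 23]

# New transactions to screen against that history.
suspicious = flag_outliers(history, [29, 400, 18])
print(suspicious)
```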

Cost Optimization

• Big data solutions offer cost optimization benefits by
leveraging technologies like Hadoop for long-term storage
of structured and unstructured data. This reduces storage
costs without compromising performance.
• Additionally, big data solutions facilitate the integration of
new and complex data types more efficiently and affordably
than traditional methods.

Cost Optimization

• Moreover, the rise of the Internet of Things (IoT) is another
significant driver for big data solutions. IoT networks,
comprising interconnected sensors and devices, collect and
exchange vast amounts of data.
• This data enables remote sensing and control, leading to
process efficiencies and cost reductions. Big data solutions
play a crucial role in handling the massive volumes of IoT
data, offering capabilities for data ingest, analytics, and
visualization to make a substantial impact on operational
efficiencies and cost management.

The seven phases of a big data
strategy

Building Blocks and Enablers for
Big Data Solutions

The three key factors in the
enablement of big data solutions
• Big Data Vision and Strategy encompass the initial
assessment of a company's capabilities and future roadmap
for big data adoption. The vision aligns with the enterprise's
mission, focusing on how big data can support its goals. The
strategy outlines specific use cases and initiatives to realize
this vision, with a roadmap and business case detailing the
sequence of actions and potential benefits. Determining ROI
for big data initiatives, especially those involving data
monetization, can be complex, often relying on industry or
process-specific benchmarks for estimation.

The three key factors in the
enablement of big data solutions
• Big Data Pilot and the Next Steps
• Big Data Pilot projects are crucial for enterprises to test and
understand the benefits of big data solutions. These pilots
help in refining use cases that can drive revenue, cost
optimization, enhanced customer service, and product
innovations.
• For instance, integrating social media data into customer
management can provide insights into customer sentiments,
improving marketing campaigns and customer service
levels.

The three key factors in the
enablement of big data solutions
• Big Data Pilot and the Next Steps
• The steps involved in a Proof of Concept (PoC) for such
projects include identifying potential use cases, importing
relevant data, analyzing data using visualization tools, and
deriving actionable insights.
• This approach ensures that the technology is validated
before full implementation, leading to a deeper
understanding of business benefits like improved campaign
effectiveness and customer retention. This understanding
helps in defining the next steps in the roadmap for big data
adoption.
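
The PoC steps above (import data, analyze it, derive insights) can be sketched for the social media sentiment use case. This is a toy example under stated assumptions: the posts are invented, and the keyword lists `POSITIVE` and `NEGATIVE` are a hypothetical stand-in for a real sentiment model.

```python
import csv
import io
from collections import Counter

# Hypothetical keyword lists standing in for a real sentiment model.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"slow", "broken", "refund"}

# Step 1: import relevant data (an inline CSV of invented posts).
raw = """post_id,text
1,Love the new app and fast checkout
2,Delivery was slow and the item arrived broken
3,Great support team
4,Still waiting for my refund
"""

def label(text):
    """Label a post by counting positive and negative keywords."""
    words = set(text.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# Step 2: analyze the data.
posts = list(csv.DictReader(io.StringIO(raw)))
summary = Counter(label(p["text"]) for p in posts)

# Step 3: derive an insight, here the overall sentiment mix.
print(dict(summary))
```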

The three key factors in the
enablement of big data solutions
• Big Data Solution Architecture and Tools
• The key drivers in any big data solution are defining the
solution architecture and selecting the supporting tools and
technologies. This involves identifying the necessary solution
components and the tools to build the big data solution.
• Hadoop-Based Repository: A central repository is crucial for
storing various types of data (structured, semi-structured,
unstructured) such as transaction data, enterprise data, machine
data, and social media data. The repository is typically based on
Hadoop Distributed File System (HDFS), with multiple
distributions available like Cloudera, Hortonworks, and
BigInsights.

The three key factors in the
enablement of big data solutions
• Big Data Solution Architecture and Tools
• Hadoop Components: Hadoop comprises modules such as
storage based on HDFS, resource management for task
execution, a distributed processing model like MapReduce
(now with alternatives like Apache Spark for interactive
processing and Apache Storm for real-time processing), and
utilities and software libraries supporting the Hadoop platform.
• Wider Applications: While initially adopted by e-commerce
and internet companies like Yahoo and Google, Hadoop and big
data solutions are now used across various industries, with
common use cases including data storage, processing, analytics,
and real-time data processing.
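
To make the MapReduce processing model concrete, the sketch below runs its three stages (map, shuffle, reduce) on a word count, in plain single-process Python. It does not use Hadoop's actual API; the function names and sample documents are assumptions made for illustration only.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

# Hypothetical input documents; on a cluster these would be HDFS blocks.
documents = ["big data needs big storage", "data drives decisions"]

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

On a real cluster the map and reduce calls run in parallel on different nodes, and the framework performs the shuffle between them.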
Common Use Cases for Hadoop

Key architecture principles
concerning big data solutions
• The big data solution should provide the organization with a
trusted, unified, and consistent view of diverse data types
integrated from a variety of structured and unstructured data
sources.
• The big data solution should provide batch, interactive, and
near real-time data integration and analytics capabilities. It
must have the capability to handle mixed workloads and
query patterns.
• The big data solution needs to scale well (to petabytes) to
handle large volumes as well as to ensure that storage costs
are kept low through HDFS-based storage.

Key architecture principles
concerning big data solutions
• In a big data enabled world where enterprises work with
both structured and unstructured data, the big data solution
requires that metadata be associated with both structured
and unstructured data.
• The big data solution must provide governance and security
for the data elements stored in its data lakes and ensure
there are suitable audit mechanisms. Compliance with
local and global laws concerning data privacy and customer
data should be maintained.
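
The metadata and governance principles above could be supported by a simple catalog that records the same fields for every data set, whatever its form. The sketch below is one possible shape for such a catalog; the `CatalogEntry` fields, the paths, and the owners are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    """Metadata recorded for one data set in the data lake (hypothetical fields)."""
    path: str
    data_form: str  # "structured", "semi-structured", or "unstructured"
    owner: str
    contains_personal_data: bool
    tags: list = field(default_factory=list)

# Invented entries: the same metadata applies to every data form.
catalog = [
    CatalogEntry("/lake/sales/orders.csv", "structured", "finance", False, ["sales"]),
    CatalogEntry("/lake/social/posts.json", "semi-structured", "marketing", True, ["social"]),
    CatalogEntry("/lake/support/emails/", "unstructured", "support", True, ["customer"]),
]

def needing_privacy_review(entries):
    """Return the paths of data sets flagged as containing personal data."""
    return [e.path for e in entries if e.contains_personal_data]

print(needing_privacy_review(catalog))
```

Keeping ownership and privacy flags alongside each data set is what makes audit and compliance checks queryable rather than manual.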

Critical Success Factors in Big Data

• Build a Clear Roadmap: Start by creating a roadmap with
specific big data use cases aligned with your business goals.
This roadmap guides how big data can transform your
business, from revenue opportunities to cost savings.
• Align with Business Needs: Focus on use cases that align
with your business stakeholders' priorities. This ensures that
big data initiatives deliver tangible business benefits and
support the overall strategy.

Critical Success Factors in Big Data

• Start Small with Pilots: Begin with one or two pilot projects
to test the technology and gauge its impact. This approach
helps validate the technology, quantify actual benefits, and
learn valuable lessons for future initiatives.
• Enhance Skills: Invest in training and hiring to build
expertise in big data technologies and data science.
Enhancing skills is crucial for executing big data initiatives
successfully.

Critical Success Factors in Big Data

• Build Incrementally: Design and implement big data
solutions incrementally based on business priorities and
lessons learned from pilots. This reduces risks and ensures
business support for ongoing initiatives.
• Ensure Information Governance: Establish strong
information governance practices to manage data quality,
security, ownership, and compliance challenges. Involving
Chief Data Officers (CDOs) is vital for effective governance
and data management throughout the big data journey.

Tools for Big Data

Assignment

• Describe five important signs that show a company is
prepared to adopt big data technology. Also, discuss the
influence of growing data volume, variety, and speed on
storage methods and data processing technologies.
• How does the surge in data volume, diversity, and speed
affect storage solutions and data processing technologies?
• How does the integration of structured and unstructured data
in big data initiatives pose challenges for information
governance, and what strategies can organizations employ to
address these challenges effectively?
