ETB 1 (Big Data)

The document provides an introduction to Big Data, covering its fundamental concepts, technologies, and the data lifecycle. It highlights the importance of Big Data in decision-making, ethical considerations, and various business applications, while also addressing challenges and use cases across different industries. Key technologies and analytics techniques are discussed, emphasizing the role of Big Data in enhancing operational efficiency and driving innovation.

Uploaded by

Fitness Guide
Copyright
© All Rights Reserved

Introduction to Big Data

Learning Objectives
1. Understand Big Data Concepts: Gain a comprehensive understanding of the
fundamental principles of Big Data, including the 7 Vs and the significance of
Big Data in today’s digital landscape.
2. Explore Big Data Technologies: Familiarize yourself with essential Big Data
technologies and tools such as Hadoop, Spark, NoSQL databases, and data
warehouses used for data storage, processing, and analysis.
3. Analyze Data Lifecycle: Learn the complete data lifecycle, from data ingestion
and processing to storage and analysis, understanding how each phase
contributes to effective decision-making.
4. Apply Data Analytics Techniques: Understand various data analysis
techniques, including descriptive, diagnostic, predictive, and prescriptive
analytics, and how these techniques can be applied to extract actionable
insights.
5. Evaluate Ethical and Privacy Considerations: Discuss the ethical implications
of Big Data analysis, including issues of data privacy, security, and
governance, and how to ensure responsible data handling.
Agenda
► Introduction to Big data
► What is big data
► Characteristics of Big data
► Importance and Benefits
► Working of big data
► Big data challenges
► Business applications
INTRODUCTION
► Big Data refers to large and complex datasets that traditional data processing
applications cannot handle efficiently.
Introduction to Big data
What is Data?
► The quantities, characters, or symbols on which operations are
performed by a computer, which may be stored and
transmitted in the form of electrical signals and recorded on
magnetic, optical, or mechanical recording media.

What is Big Data?


► Big Data is a collection of data that is huge in volume and still
growing exponentially with time. Its size and complexity are so great
that no traditional data management tool can store or process it
efficiently. In short, big data is still data, only at a much larger
scale.
What is big data?
“BIG DATA IS HIGH-VOLUME, HIGH-VELOCITY, AND HIGH-VARIETY INFORMATION
ASSETS THAT DEMAND COST-EFFECTIVE AND INNOVATIVE FORMS OF
INFORMATION PROCESSING FOR ENHANCED INSIGHT AND DECISION
MAKING.”
► Big data is a combination of structured, semi-structured, and
unstructured data collected by organizations that can be mined for
information and used in machine learning projects, predictive
modeling, and other advanced analytics applications.
Structured
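► Any data that can be stored, accessed, and processed in a fixed
format is termed ‘structured’ data, for example rows in a relational table.
Unstructured
► Any data whose form or structure is unknown is classified as unstructured
data, for example free text, images, or video.
Semi-structured
► Semi-structured data combines both forms: it has no fixed schema, but it
carries tags or keys that organize it, as in XML or JSON files.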
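The three forms described above can be illustrated with a minimal Python sketch. The sample records, names, and field names are hypothetical and chosen only for illustration; the point is how much structure the program can rely on in each case.

```python
import csv
import io
import json

# Structured: fixed columns, and every row has the same fields.
structured = io.StringIO("id,name,amount\n1,Asha,250\n2,Ravi,120\n")
rows = list(csv.DictReader(structured))

# Semi-structured: self-describing keys, but records may differ in shape.
semi_structured = json.loads(
    '[{"id": 1, "name": "Asha", "tags": ["vip"]}, {"id": 2, "name": "Ravi"}]'
)

# Unstructured: free text with no schema; any structure must be inferred.
unstructured = "Asha called to say the delivery was late but support was helpful."

# Structured data can be queried directly by column name.
total = sum(int(row["amount"]) for row in rows)

# Semi-structured data needs a check for fields that may be missing.
records_with_tags = [rec["name"] for rec in semi_structured if "tags" in rec]

# Unstructured data only offers raw content, here split into words.
word_count = len(unstructured.split())

print(total, records_with_tags, word_count)
```

The structured rows can be summed by column name, the JSON records need a presence check before a field is used, and the free text yields nothing beyond its raw words until further analysis is applied.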
SOURCES OF BIG DATA

• Social Media
• Internet of Things (IoT)
• Transactional Data
• Web and Clickstream Data
• Multimedia Data
• Machine-Generated Data
• Public and Government Data
• Enterprise Data
• Research and Scientific Data
• Customer Feedback and Surveys
• Email and Communication Data
• Health and Medical Data
• Geospatial Data
• Public and Proprietary APIs
• Legacy Systems and Historical Data
SOURCES
► Main sources of data:
► People-generated data
► Machine-generated data
► Business-generated data
► Examples:
► New York Stock Exchange
► Social network data
► Jet Engines
► IoT devices
CHARACTERISTICS OF BIG DATA (7 Vs)
• Volume: The amount of data generated and collected.
• Velocity: The speed at which data is generated and processed.
• Variety: The different types of data (structured, semi-structured,
and unstructured).
• Veracity: The quality and accuracy of data.
• Value: The potential insights and benefits that can be derived from
analyzing the data.
• Variability: The variations in data flow rates and the
inconsistency in data formats and sources.
• Visualization: Representing large datasets in a graphical format,
which is critical for making sense of them.
Benefits Of Big Data Processing
• Businesses can draw on outside intelligence when making decisions
► Access to social data from search engines and sites like Facebook and Twitter
enables organizations to fine-tune their business strategies.
• Improved customer service
► Traditional customer feedback systems are being replaced by new systems
designed with Big Data technologies. These new systems use Big Data and natural
language processing technologies to read and evaluate consumer
responses.
• Early identification of risk to products or services, if any
• Better operational efficiency
► Big Data technologies can be used to create a staging area or landing zone for
new data before deciding which data should be moved to the data warehouse. In
addition, integrating Big Data technologies with the data warehouse helps an
organization offload infrequently accessed data.
Importance of Big data
1. Cost Savings
► Big Data tools like Apache Hadoop and Spark bring cost-saving benefits to businesses
that have to store large amounts of data.
2. Time-Saving
► Real-time in-memory analytics helps companies collect data from various sources
and analyze it immediately.
3. Understand the market conditions
► Big Data analysis helps businesses get a better understanding of market situations.
4. Social Media Listening
► Companies can perform sentiment analysis using Big Data tools.
5. Boost Customer Acquisition and Retention
► Big data analytics helps businesses identify customer-related trends and patterns.
Customer behavior analysis leads to a more profitable business.
6. Solve Advertisers' Problems and Offer Marketing Insights
► Big data analytics shapes all business operations. It enables companies to fulfill customer
expectations.
7. The Driver of Innovation and Product Development
► Big data enables companies to innovate and redevelop their products.
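Item 4 above mentions sentiment analysis. As a rough illustration of the idea, the sketch below scores short posts against two small word lists. The word lists and the sample posts are hypothetical; production tools typically rely on much larger lexicons or trained models, so this is a toy version of the technique, not a description of any particular product.

```python
import re
from collections import Counter

# Hypothetical word lists, for illustration only.
POSITIVE = {"good", "great", "love", "fast", "helpful"}
NEGATIVE = {"bad", "slow", "late", "broken", "hate"}


def sentiment(text: str) -> str:
    """Label a post by counting positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


posts = [
    "Love the new app, support was fast and helpful",
    "Delivery was late and the box arrived broken",
    "Ordered yesterday",
]

# Aggregate labels across all posts to get an overall picture.
summary = Counter(sentiment(post) for post in posts)
print(summary)
```

Counting labels over many posts is what turns individual opinions into a trend a business can act on.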
Importance of Big data
► Companies use big data in their systems to improve operations,
provide better customer service, create personalized marketing
campaigns and take other actions that, ultimately, can increase
revenue and profits.
► For example:
✔ Big data provides valuable insights into customers that companies
can use to refine their marketing, advertising, and promotions in
order to increase customer engagement and conversion rates.
✔ Big data is also used by medical researchers to identify disease
signs and risk factors and by doctors to help diagnose illnesses and
medical conditions in patients.
✔ In the energy industry, big data helps oil and gas
companies identify potential drilling locations and
monitor pipeline operations; likewise, utilities use it to
track electrical grids.
✔ Financial services firms use big data systems for risk
management and real-time analysis of market data.
✔ Manufacturers and transportation companies rely on big
data to manage their supply chains and optimize
delivery routes.
✔ Government uses include emergency response,
crime prevention, and smart city initiatives.
Challenges with Big Data
1. Sharing and Accessing Data:
► Perhaps the most frequent challenge in big data efforts is the
inaccessibility of data sets from external sources.
► Sharing data can cause substantial challenges.
2. Privacy and Security:
► This challenge has sensitive, conceptual, technical, and legal
dimensions.
3. Analytical Challenges:
► Big data raises major analytical questions, such as: how do you
deal with a problem if the data volume gets too large?
► How do you find the important data points?
► How do you use the data to the best advantage?


4. Technical challenges:
► Quality of data:
• Collecting and storing a large amount of data comes at a cost.
Big companies, business leaders, and IT leaders always want large
data storage.
• For better results and conclusions, big data efforts should focus on
storing quality data rather than irrelevant data.
► Fault tolerance:
• Fault-tolerant computing is extremely hard and involves intricate
algorithms.
• Technologies such as cloud computing and big data are designed so
that whenever a failure occurs, the damage done stays within an
acceptable threshold.
► Scalability: Big data projects can grow and evolve rapidly. The
scalability issue of Big Data has led toward cloud computing.
Use cases
• Product development:
Companies like Netflix and Procter & Gamble use big data to anticipate customer
demand. They build predictive models for new products and services by classifying key
attributes of past and current products or services and modeling the relationship between
those attributes and the commercial success of the offerings.
• Predictive maintenance:
Structured data, such as the year, make, and model of equipment, as well as
unstructured data covering millions of log entries, sensor data, error messages, and
engine temperature, can be analyzed to identify potential issues before problems happen.
• Customer experience:
The race for customers is on. A clearer view of customer experience is more possible now
than ever before. Big data lets you start delivering personalized offers, reduce customer
churn, and handle issues proactively.
• Fraud and compliance:
Big data helps you identify patterns in data that indicate fraud and aggregate large
volumes of information to make regulatory reporting much faster.
• Machine learning:
The availability of big data makes it possible to train machine learning models.
• Operational efficiency:
With big data, you can analyze and assess production, customer feedback and
returns, and other factors to reduce outages and anticipate future demands.
• Drive innovation:
Big data can help you innovate by studying interdependencies among humans,
institutions, entities, and processes and then determining new ways to use those
insights.
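The predictive maintenance use case above can be sketched in a few lines. The example below flags an engine temperature reading when it deviates sharply from the readings just before it. The function name, the window size, the threshold, and the sample readings are all assumptions made for illustration; real systems combine many signals and usually use trained models.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indexes of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical engine temperature readings (degrees Celsius).
engine_temps = [90.1, 90.4, 89.8, 90.2, 90.0, 90.3, 97.5, 90.1, 90.2]

for index in flag_anomalies(engine_temps):
    print(f"Reading {index} looks abnormal: {engine_temps[index]}")
```

Comparing each reading with its own recent history, rather than with a fixed limit, lets the same check work for machines that normally run at different temperatures.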
How big data works
• Collect and integrate the data
• Clean the data
• Process the data
• Analyze the data
• Visualize the results
• tables vs. graphical displays
• e.g. a map of temperatures
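The five steps above can be traced end to end on a tiny dataset. The sketch below is a single-machine illustration only: the two sources, the city names, and the temperature values are hypothetical, and a text bar chart stands in for a real visualization tool.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical readings from two sources: (city, temperature in Celsius or None).
source_a = [("Pune", 31.0), ("Delhi", 38.5), ("Pune", None)]
source_b = [("Delhi", 39.5), ("Mumbai", 33.0), ("Pune", 29.0)]

# 1. Collect and integrate: merge the sources into one list.
records = source_a + source_b

# 2. Clean: drop records with missing values.
clean = [(city, temp) for city, temp in records if temp is not None]

# 3. Process: group readings by city.
by_city = defaultdict(list)
for city, temp in clean:
    by_city[city].append(temp)

# 4. Analyze: compute the average temperature per city.
averages = {city: mean(temps) for city, temps in by_city.items()}


# 5. Visualize: render a simple text bar chart, one row per city.
def bar_chart(values, scale=2):
    return [
        f"{name:<8}{'#' * int(value / scale)} {value:.1f}"
        for name, value in sorted(values.items())
    ]


print("\n".join(bar_chart(averages)))
```

At big data scale each step is handled by a dedicated tool, but the order of the steps stays the same.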
BIG DATA TECHNOLOGIES

► Technologies
▪ Data Storage
▪ Data Processing
▪ Data Ingestion
▪ Data Analysis and Machine Learning
▪ Data Visualization
► Frameworks and Ecosystems
▪ Apache Hadoop Ecosystem
▪ Apache Spark Ecosystem
▪ Lambda Architecture
BIG DATA TECHNOLOGIES
► Cloud-Based Big Data Technologies
▪ Amazon Web Services (AWS)
▪ Google Cloud Platform (GCP)
▪ Microsoft Azure

► Trends and Innovations


▪ Edge Computing
▪ AI and Machine Learning Integration
▪ Data Governance and Privacy
ANALYTICS USING BIG DATA
Components of Big Data Analytics
▪ Data Collection
▪ Data Storage
▪ Data Processing
▪ Data Analysis
▪ Data Visualization
Key Techniques and Technologies
► Data Processing Frameworks
▪ Hadoop
▪ Apache Spark
► Analytical Methods
▪ Descriptive Analytics
▪ Diagnostic Analytics
▪ Predictive Analytics
▪ Prescriptive Analytics
► Machine Learning and AI
▪ Supervised Learning
▪ Unsupervised Learning
▪ Deep Learning
► Real-time Analytics
▪ Stream Processing
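The four analytical methods named above answer different questions: what happened, why it happened, what is likely to happen, and what to do about it. The sketch below walks through all four on a tiny dataset. The sales figures, ad spend figures, capacity value, and decision rule are hypothetical, and the helper functions are simple textbook formulas rather than any specific tool's API.

```python
from math import sqrt
from statistics import mean

# Hypothetical monthly figures.
sales = [100, 110, 120, 130, 140, 150]
ad_spend = [10, 12, 11, 14, 13, 16]
months = list(range(len(sales)))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


# Descriptive: what happened?
average_sales = mean(sales)

# Diagnostic: why did it happen? Check how closely sales track ad spend.
correlation = pearson(ad_spend, sales)

# Predictive: what is likely to happen next month?
slope, intercept = fit_line(months, sales)
forecast = slope * len(sales) + intercept

# Prescriptive: what should be done? A simple rule against a hypothetical capacity.
capacity = 155
action = "increase stock" if forecast > capacity else "hold stock"

print(average_sales, round(correlation, 2), forecast, action)
```

Each stage builds on the previous one, which is why the methods are usually presented in this order.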
APPLICATIONS OF BIG DATA
► Healthcare
▪ Patient Monitoring
▪ Disease Outbreak Prediction
▪ Drug Discovery
▪ Electronic Health Records
▪ Resource Management
► Finance
▪ Fraud Detection
▪ Risk Management
▪ Personalized Banking
► Retail
▪ Customer Insights
▪ Inventory Management
▪ Product Recommendations
▪ Dynamic Pricing
► Manufacturing
▪ Predictive Maintenance
▪ Quality Control
▪ Supply Chain Management
► Telecommunications
▪ Network Optimization
▪ Customer Churn Analysis
▪ Service Personalization
► Energy and Utilities
▪ Smart Grids
▪ Predictive Maintenance
▪ Renewable Energy
► Transportation and Logistics
▪ Fleet Management
▪ Predictive Analytics
▪ Autonomous Vehicles
Applications of BIG data
► 1. Tracking Customer Spending Habits and Shopping Behavior: In
big retail stores (like Amazon, Walmart, Big Bazaar, etc.), the management
team has to keep data on customers' spending habits, shopping
behavior, and most-liked products.
► The banking sector uses data on its customers' spending behavior so
that it can offer a particular customer a discount or cashback for
buying a favored product with the bank's credit or debit
card.
► 2. Recommendation: By tracking customer spending habits and
shopping behavior, big retail stores and e-commerce sites like Amazon,
Walmart, and Flipkart provide recommendations to the customer.
► 3. Smart Traffic System: Data about traffic conditions on
different roads is collected through cameras placed beside the road.
► 4. Secure Air Traffic System: Sensors are present at various places
on an aircraft (such as the propeller). These sensors capture data like
flight speed, moisture, temperature, and other environmental
conditions. Based on analysis of such data, environmental parameters
within the aircraft are set and varied.
► 5. Auto Driving Car:
► 6. Virtual Personal Assistant Tool: such as Siri on Apple devices,
Cortana on Windows, and Google Assistant on Android.
► 7. IoT: Manufacturing companies install IoT sensors in machines to
collect operational data.
► Using big data tools, data regarding patient experience is collected and
used by doctors to give better treatment.
Applications of BIG data
► 8. Education Sector: Organizations that conduct online educational
courses use big data to find candidates interested in those
courses.
► 9. Energy Sector: Smart electric meters read consumed power every
15 minutes and send the readings to a server, where the data is
analyzed to estimate the time of day when the
power load is lowest across the city.
► 10. Media and Entertainment Sector: Media and entertainment
service providers like Netflix, Amazon Prime, and Spotify
analyze data collected from their users.
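The energy sector example in item 9 can be sketched as a small aggregation. The code below takes meter readings at 15-minute intervals and finds the hour of day with the lowest average consumption. The function name and the generated one-day series are hypothetical; a real deployment would aggregate readings from many meters over many days.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def lowest_load_hour(readings):
    """Given (timestamp, kWh) pairs, return the hour of day (0-23)
    with the lowest average consumption."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, kwh in readings:
        totals[timestamp.hour] += kwh
        counts[timestamp.hour] += 1
    return min(totals, key=lambda hour: totals[hour] / counts[hour])


# Hypothetical one-day series: 96 readings, one every 15 minutes,
# with an artificial dip in consumption at 03:00.
start = datetime(2024, 1, 1)
readings = []
for i in range(96):
    timestamp = start + timedelta(minutes=15 * i)
    kwh = 0.2 if timestamp.hour == 3 else 0.5 + 0.01 * timestamp.hour
    readings.append((timestamp, kwh))

print(f"Lowest load at hour {lowest_load_hour(readings)}")
```

Grouping by hour of day, rather than by individual reading, smooths out single noisy readings and gives a result that is useful for planning.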
