
INTRODUCTION TO BIG DATA

Presented By: Man Singh
Enrolment No.: 0198EX211025
Branch: CSE VII Sem
Subject: CS702(B)-Big Data
Introduction to Big Data
Big data refers to massive, complex datasets that traditional data processing applications cannot handle adequately. It encompasses a wide range of data types, including structured, unstructured, and semi-structured data, generated from various sources at high velocity and volume.
Key Characteristics of Big Data
1. Volume: The sheer amount of data being generated and collected, often in the petabytes or exabytes range.
2. Variety: The diverse data types, including structured, unstructured, and semi-structured data, from various sources.
3. Velocity: The speed at which data is being created, collected, and processed, often in real time or near real time.
4. Veracity: The trustworthiness and reliability of the data, ensuring its accuracy and consistency.
Benefits of Big Data
Improved Decision-Making: Leveraging data-driven insights to make more informed and strategic decisions.
Increased Operational Efficiency: Optimizing processes and workflows through data-based process improvements.
Enhanced Customer Experience: Personalized and targeted products/services based on customer behavior and preferences.
Challenges in Big Data Management
1. Data Storage: Managing the vast amount of data and ensuring scalable storage solutions.
2. Data Security: Protecting sensitive data and ensuring compliance with data privacy regulations.
3. Data Integration: Integrating diverse data sources and formats into a cohesive system.
Big Data Technologies and Tools
Hadoop: A framework for distributed processing of large datasets.
Apache Spark: A unified analytics engine for large-scale data processing (a minimal PySpark sketch follows this list).
Apache Kafka: A distributed streaming platform for building real-time data pipelines.
Tableau: Data visualization and business intelligence software.
Unit 2

Introduction to Hadoop
Hadoop is an open-source software framework designed to handle large-scale data processing and storage across a distributed computing environment. It provides a reliable, scalable, and fault-tolerant infrastructure for data-intensive applications.
What is Hadoop?
Distributed File System: Hadoop's Distributed File System (HDFS) stores and manages data across multiple servers, providing high availability and fault tolerance.
MapReduce: Hadoop's MapReduce framework processes large datasets in parallel, distributing the workload across a cluster of computers (see the word-count sketch after this list).
Ecosystem: Hadoop has a rich ecosystem of tools and technologies that extend its capabilities, such as Hive, Spark, and Kafka.
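To make the MapReduce model above concrete, the following word-count sketch is written for Hadoop Streaming, which lets ordinary Python scripts act as the mapper and reducer over lines read from standard input. The script name and invocation style are assumptions for illustration; on a real cluster the job would typically be submitted through the Hadoop Streaming JAR with HDFS input and output paths.

    #!/usr/bin/env python3
    # wordcount_streaming.py: an illustrative mapper/reducer pair for Hadoop Streaming.
    # Run as mapper:  python3 wordcount_streaming.py map
    # Run as reducer: python3 wordcount_streaming.py reduce
    # Hadoop Streaming feeds input on stdin, collects output from stdout,
    # and delivers keys to the reducer already sorted and grouped.
    import sys

    def mapper():
        # Emit "word<TAB>1" for every word on every input line.
        for line in sys.stdin:
            for word in line.strip().split():
                print(f"{word}\t1")

    def reducer():
        # Sum the counts for each word; input arrives sorted by key.
        current_word, current_count = None, 0
        for line in sys.stdin:
            word, count = line.rstrip("\n").split("\t")
            if word == current_word:
                current_count += int(count)
            else:
                if current_word is not None:
                    print(f"{current_word}\t{current_count}")
                current_word, current_count = word, int(count)
        if current_word is not None:
            print(f"{current_word}\t{current_count}")

    if __name__ == "__main__":
        mapper() if sys.argv[1] == "map" else reducer()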
Key Components of Hadoop
1. NameNode: The NameNode manages the file system namespace and coordinates access to files by clients.
2. DataNode: DataNodes store and manage the actual data blocks, providing redundancy and fault tolerance (a small HDFS client sketch follows this list).
3. ResourceManager: The ResourceManager allocates and manages the computational resources in the Hadoop cluster.
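To make the split between the NameNode (metadata) and the DataNodes (block storage) more tangible, the sketch below talks to HDFS through its WebHDFS interface using the third-party Python package hdfs (HdfsCLI). The host name, port, user, and paths are hypothetical and would need to match a real cluster; this is an assumption-laden illustration, not the only way to access HDFS.

    # Minimal HDFS client sketch using the third-party "hdfs" (HdfsCLI) package.
    # Host, port, user, and paths are hypothetical; adjust them to your cluster.
    from hdfs import InsecureClient

    # The client sends metadata operations to the NameNode's WebHDFS endpoint;
    # the actual block data is streamed to and from DataNodes.
    client = InsecureClient("http://namenode-host:9870", user="hadoop")

    # Write a small file into HDFS, overwriting it if it already exists.
    client.write("/user/hadoop/hello.txt", data=b"hello, hdfs\n", overwrite=True)

    # List a directory and read the file back.
    print(client.list("/user/hadoop"))
    with client.read("/user/hadoop/hello.txt") as reader:
        print(reader.read().decode("utf-8"))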
Hadoop Ecosystem and Applications
Big Data Analytics: Hadoop is widely used for analyzing large datasets, including web logs, sensor data, and social media information.
Data Warehousing: Hadoop's distributed storage and processing capabilities make it a popular choice for data warehousing and business intelligence applications.
Stream Processing: Hadoop's ecosystem includes tools like Spark Streaming and Kafka for real-time data processing and streaming analytics (see the Kafka producer sketch after this list).
Machine Learning: Hadoop's scalability and parallel processing features enable advanced machine learning and deep learning algorithms to be applied to big data.
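As a small illustration of the stream-processing point above, the sketch below publishes JSON events to a Kafka topic using the third-party kafka-python package; a Spark Streaming job or a Kafka consumer could then process these events in near real time. The broker address, topic name, and event fields are hypothetical placeholders.

    # Minimal Kafka producer sketch using the third-party kafka-python package.
    # Broker address, topic name, and event contents are hypothetical placeholders.
    import json
    import time

    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        # Serialize Python dicts to JSON bytes before sending.
        value_serializer=lambda event: json.dumps(event).encode("utf-8"),
    )

    # Publish a few example sensor readings to the "sensor-events" topic.
    for reading in range(5):
        event = {"sensor_id": "s-01", "value": reading, "ts": time.time()}
        producer.send("sensor-events", value=event)

    # Make sure all buffered messages are delivered before exiting.
    producer.flush()
    producer.close()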
Benefits and Challenges of Hadoop

Scalability: Hadoop's distributed architecture allows it to scale up or down to handle growing data volumes and processing needs.
Cost-Effectiveness: Hadoop runs on commodity hardware, making it a cost-effective solution for big data processing and storage.
Fault Tolerance: Hadoop's replication and failover mechanisms provide a high degree of fault tolerance and data reliability.
Complexity: The distributed nature of Hadoop and its ecosystem can make it challenging to set up, configure, and manage.
Thank you
