Big Data Processing Concepts

S.Kavitha
Head & Assistant Professor
Department of Computer Science
Sri Sarada Niketan College of Science for Women, Karur.
Parallel Data Processing
• Parallel data processing involves the simultaneous execution of multiple sub-tasks that collectively comprise a larger task. The goal is to reduce execution time by dividing a single larger task into multiple smaller tasks that run concurrently.
• Although parallel data processing can be achieved through multiple networked machines, it is more typically achieved within the confines of a single machine with multiple processors.
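
As a minimal sketch of this idea (the prime-counting task, the range bounds and the function names are illustrative assumptions, not from the slides), the following Python program divides one larger task into sub-tasks that run concurrently on the processors of a single machine:

# Minimal sketch of parallel data processing on a single multi-processor
# machine, using only the Python standard library.
from multiprocessing import Pool

def is_prime(n):
    """Illustrative CPU-bound sub-task: primality test by trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def count_primes(bounds):
    """One sub-task: count primes in a half-open range [lo, hi)."""
    lo, hi = bounds
    return sum(1 for n in range(lo, hi) if is_prime(n))

if __name__ == "__main__":
    # Divide the single larger task (0..200000) into 4 smaller sub-tasks.
    chunks = [(i * 50_000, (i + 1) * 50_000) for i in range(4)]
    with Pool(processes=4) as pool:
        # The sub-tasks execute concurrently; their results are combined.
        total = sum(pool.map(count_primes, chunks))
    print("primes found:", total)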
Distributed Data Processing
• Distributed data processing is closely related to parallel data processing in that the same principle of “divide-and-conquer” is applied. However, distributed data processing is always achieved through physically separate machines that are networked together as a cluster.
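
A genuinely distributed job requires networked machines, so the sketch below only simulates the divide-and-conquer principle on one machine: each worker process stands in for a separate cluster node (the node count and the data are assumptions made for the example):

# Hedged sketch of the distributed "divide-and-conquer" principle.
# Each worker process stands in for a physically separate cluster node;
# a real deployment would ship partitions to networked machines instead.
from concurrent.futures import ProcessPoolExecutor

def node_sum(partition):
    """Work performed on one simulated node: sum its share of the data."""
    return sum(partition)

if __name__ == "__main__":
    data = list(range(1_000_000))          # the larger task
    n_nodes = 4
    # Divide: one partition per simulated node.
    partitions = [data[i::n_nodes] for i in range(n_nodes)]
    with ProcessPoolExecutor(max_workers=n_nodes) as executor:
        partials = list(executor.map(node_sum, partitions))
    # Conquer: merge the partial results into the final answer.
    print("total:", sum(partials))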
Hadoop
• Hadoop is an open-source framework for large-scale data storage and data processing that is compatible with commodity hardware. The Hadoop framework has established itself as a de facto industry platform for contemporary Big Data solutions. It can be used as an ETL engine or as an analytics engine for processing large amounts of structured, semi-structured and unstructured data. From an analysis perspective, Hadoop implements the MapReduce processing framework.
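
Hadoop jobs are usually written against its Java API; the plain-Python sketch below is not that API but a hedged, single-machine illustration of the map, shuffle and reduce phases of the MapReduce model, applied to an illustrative word-count task:

# Single-machine sketch of the MapReduce processing model (illustration
# only; it mimics the map -> shuffle -> reduce phases, not Hadoop's API).
from collections import defaultdict

def map_phase(record):
    """Map: emit (key, value) pairs; here, (word, 1) for each word."""
    return [(word, 1) for word in record.split()]

def shuffle_phase(mapped_pairs):
    """Shuffle: group all values by key across every mapper's output."""
    groups = defaultdict(list)
    for pairs in mapped_pairs:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the grouped values for one key."""
    return key, sum(values)

if __name__ == "__main__":
    records = ["hadoop stores big data", "mapreduce processes big data"]
    mapped = [map_phase(r) for r in records]                       # map
    grouped = shuffle_phase(mapped)                                # shuffle
    counts = dict(reduce_phase(k, v) for k, v in grouped.items())  # reduce
    print(counts)   # e.g. {'hadoop': 1, 'stores': 1, 'big': 2, 'data': 2, ...}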
Processing Workloads
• A processing workload in Big Data is defined as the amount and nature of data that is processed within a certain amount of time.
• Workloads are usually divided into two types: batch and transactional.
Batch
• Batch processing, also known as offline processing, involves processing data in batches and usually imposes delays, which in turn results in high-latency responses.
• Batch workloads typically involve large quantities of data with sequential reads/writes and comprise groups of read or write queries.
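
A hedged sketch of such a workload follows: a batch of records is first written to disk, then aggregated in one sequential pass (the file name and record layout are assumptions made for the example):

# Hedged sketch of a batch workload: one sequential pass over a file of
# records (file name and schema are illustrative, not from the slides).
import csv, tempfile, os

# Prepare an on-disk batch of records: (customer_id, amount).
path = os.path.join(tempfile.gettempdir(), "sales_batch.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows([("c1", 10.0), ("c2", 5.5), ("c1", 2.5)])

# Batch job: sequential reads over the whole dataset, one aggregate answer.
totals = {}
with open(path, newline="") as f:
    for customer, amount in csv.reader(f):
        totals[customer] = totals.get(customer, 0.0) + float(amount)

print(totals)   # e.g. {'c1': 12.5, 'c2': 5.5}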
Transactional
• Transactional processing is also known as online processing. Transactional workload processing follows an approach whereby data is processed interactively without delay, resulting in low-latency responses. Transactional workloads involve small amounts of data with random reads and writes.
• OLTP and operational systems, which are generally write-intensive, fall within this category. Although these workloads contain a mix of read/write queries, they are generally more write-intensive than read-intensive.
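
The sketch below hedges an OLTP-style workload using SQLite from the standard library (standing in for an operational database; the accounts table and the amounts are illustrative): many small transactions perform random single-row writes, each followed by an immediate, low-latency read:

# Hedged sketch of a transactional (OLTP-style) workload: many small,
# interactive, write-intensive operations touching single rows at random.
import sqlite3, random

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [(i, 100.0) for i in range(10)])

for _ in range(20):
    acct = random.randrange(10)            # random row, small amount of data
    with conn:                             # each iteration is one transaction
        conn.execute("UPDATE accounts SET balance = balance + 1 WHERE id = ?",
                     (acct,))
    # Interactive, low-latency read of the row just written.
    row = conn.execute("SELECT balance FROM accounts WHERE id = ?",
                       (acct,)).fetchone()
    print(f"account {acct}: {row[0]}")

conn.close()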
Cluster
• In the same manner that clusters provide the necessary support to create horizontally scalable storage solutions, clusters also provide the mechanism to enable distributed data processing with linear scalability. Since clusters are highly scalable, they provide an ideal environment for Big Data processing, as large datasets can be divided into smaller datasets and then processed in parallel in a distributed manner.
Processing in Batch Mode
• In batch mode, data is processed offline in batches, and the response time can vary from minutes to hours. In addition, data must be persisted to disk before it can be processed.
• Batch mode generally involves processing a range of large datasets, either on their own or joined together, essentially addressing the volume and variety characteristics of Big Data datasets.
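
To illustrate both points on this slide, persistence before processing and datasets joined together, here is a hedged batch-mode sketch (the file names, the JSON-lines layout and the schemas are assumptions made for the example):

# Hedged sketch of batch-mode processing: both datasets are persisted to
# disk first, then joined offline in a single pass.
import json, tempfile, os

tmp = tempfile.gettempdir()
orders_path = os.path.join(tmp, "orders.jsonl")
customers_path = os.path.join(tmp, "customers.jsonl")

# Persist the datasets to disk before any processing takes place.
with open(customers_path, "w") as f:
    for c in [{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}]:
        f.write(json.dumps(c) + "\n")
with open(orders_path, "w") as f:
    for o in [{"customer_id": 1, "total": 20.0},
              {"customer_id": 1, "total": 5.0},
              {"customer_id": 2, "total": 12.5}]:
        f.write(json.dumps(o) + "\n")

# Offline batch job: load one side, then join the other against it.
customers = {}
with open(customers_path) as f:
    for line in f:
        c = json.loads(line)
        customers[c["id"]] = c["name"]

spend = {}
with open(orders_path) as f:
    for line in f:
        o = json.loads(line)
        name = customers[o["customer_id"]]       # the join step
        spend[name] = spend.get(name, 0.0) + o["total"]

print(spend)   # e.g. {'Asha': 25.0, 'Ravi': 12.5}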
