02 Unit-BDA - Big Data Analytics

The document outlines the curriculum for a Big Data Analytics course, detailing its content, including the introduction to big data, analytics, technology landscape, and tools like Hadoop and MongoDB. It discusses the importance of big data analytics, challenges businesses face, and classifications of analytics. Additionally, it highlights the need for technologies to address big data challenges and provides references for further reading.

B.TECH CSE III Year I Semester

2020 – 2021
VCE-R18 (Integrated Course)
BIG DATA ANALYTICS (A4513)

UNIT-2
BIG DATA ANALYTICS

A. BHANU PRASAD
Associate Professor, Dept. of CSE

VARDHAMAN COLLEGE OF ENGINEERING

(AUTONOMOUS)
Shamshabad – 501218, Hyderabad, AP
Course Theory Contents
INTRODUCTION TO BIG DATA: Classification of Digital Data, Characteristics
of Data, Evolution of Big Data, Definition of Big Data, Challenges with Big
Data, What is Big Data?, Other Characteristics of Data Which are not
Definitional Traits of Big Data, Why Big Data?
Are We Just an Information Consumer or Do we also Produce Information?,
Traditional Business Intelligence (BI) versus Big Data, A Typical Data
Warehouse Environment, A Typical Hadoop Environment, What is New
Today?, What is changing in the Realms of Big Data?

BIG DATA ANALYTICS: Where do we Begin?, What is Big Data
Analytics?, What Big Data Analytics Isn’t?, Why this Sudden Hype
Around Big Data Analytics?, Classification of Analytics, Greatest
Challenges that Prevent Businesses from Capitalizing on Big Data, Top
Challenges Facing Big Data,
Why is Big Data Analytics Important?, What Kind of Technologies are
we looking Toward to Help Meet the Challenges Posed by Big Data?,
Terminologies Used in Big Data Environments, Basically Available Soft
State Eventual Consistency (BASE), Few Top Analytics Tools.

Theory Contents Contd..
THE BIG DATA TECHNOLOGY LANDSCAPE: NoSQL (Not Only SQL),
Hadoop, Introduction to Hadoop, Introducing Hadoop, Why Hadoop?, Why
not RDBMS?, RDBMS versus Hadoop, Distributed Computing Challenges,
History of Hadoop, Hadoop Overview, Use Case of Hadoop, Hadoop
Distributors, HDFS (Hadoop Distributed File System), Processing Data with
Hadoop, Managing Resources and Applications with Hadoop YARN (Yet
another Resource Negotiator), Interacting with Hadoop Ecosystem.
INTRODUCTION TO MONGODB: What is MongoDB?, Why MongoDB?,
Terms Used in RDBMS and MongoDB, Data Types in MongoDB, MongoDB
Query Language.
INTRODUCTION TO MAPREDUCE PROGRAMMING: Introduction, Mapper,
Reducer, Combiner, Partitioner, Searching, Sorting, Compression
INTRODUCTION TO HIVE: What is Hive?, Hive Architecture, Hive Data
Types, Hive File Format, Hive Query Language (HQL)
INTRODUCTION TO PIG: What is Pig?, The Anatomy of Pig, Pig on Hadoop,
Pig Philosophy, Use Case for Pig: ETL Processing, Pig Latin Overview, Data
Types in Pig, Running Pig, Execution Modes of Pig, HDFS Commands.

UNIT – 2 CONTENTS
2. BIG DATA ANALYTICS
2.1 Where do we Begin?
2.2 What is Big Data Analytics?
2.3 What Big Data Analytics Isn’t?
2.4 Why this Sudden Hype Around Big Data Analytics?
2.5 Classification of Analytics
2.6 Greatest Challenges that Prevent Businesses from Capitalizing
on Big Data
2.7 Top Challenges Facing Big Data
2.8 Why is Big Data Analytics Important?
2.9 What Kind of Technologies are we looking Toward to Help
Meet the Challenges Posed by Big Data?
2.10 Terminologies Used in Big Data Environments
2.11 Basically Available Soft State Eventual Consistency (BASE)
2.12 Few Top Analytics Tools.
BOOKS
TEXT BOOKS:
1. Big Data and Analytics
Seema Acharya, Subhashini Chellappan
2nd Edition, Wiley India.

REFERENCE BOOKS:
2. Big Data Now
O'Reilly Media, 2nd Edition, 2012

3. Big Data: A Revolution That Will Transform
How We Live, Work, and Think
Viktor Mayer-Schönberger, Kenneth Cukier,
Mariner Books, 2014
2. BIG DATA ANALYTICS
2.1 Where do we Begin?
• Raw data is collected, classified, and organized.
• Associating it with adequate metadata and laying bare the context
converts this data into meaningful information.
• It is then aggregated and summarized so that it is easy to consume
for analysis.
• Gradual accumulation of such meaningful information builds a
knowledge repository. This, in turn, yields actionable insights
that prove useful for decision making. Refer to Figure 2.1.

Fig 2.1: Transformation of data to yield actionable insights.
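
The progression from raw data to actionable insight can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sales records, the store-to-city metadata, and the function names are all hypothetical and do not come from the course material.

```python
from collections import defaultdict

# Hypothetical raw data: (store_id, amount) pairs with no context attached.
raw_sales = [("S1", 120.0), ("S2", 80.0), ("S1", 200.0), ("S2", 40.0), ("S3", 10.0)]

# Hypothetical metadata that gives each store ID a meaning.
store_metadata = {"S1": "Hyderabad", "S2": "Chennai", "S3": "Pune"}


def to_information(records, metadata):
    """Attach context (the city) to each raw record."""
    return [{"city": metadata[store], "amount": amount} for store, amount in records]


def summarize(information):
    """Aggregate the contextualized records into total sales per city."""
    totals = defaultdict(float)
    for row in information:
        totals[row["city"]] += row["amount"]
    return dict(totals)


def insight(summary):
    """Derive one simple actionable finding: the weakest-selling city."""
    return min(summary, key=summary.get)


information = to_information(raw_sales, store_metadata)
summary = summarize(information)
weakest_city = insight(summary)
print(summary, weakest_city)
```

Each function corresponds to one arrow in the figure: metadata turns data into information, aggregation makes it consumable, and a simple rule over the summary stands in for an insight.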

Where do we Begin? Contd..
• Organizations have realized that they cannot ignore big data if they
want to stay competitive and make timely decisions that make the most
of fleeting opportunities.
• They will have to analyze on a large scale and account for the big
data that reaches the organization at unprecedented levels of
volume, velocity, and variety.
• Big data analytics is the process of examining big data to uncover
patterns, unearth trends, and find unknown correlations and other
useful information to make faster and better decisions.
• Analytics begins with analyzing all available data. Refer to Figure 2.2.

Fig 2.2: Types of unstructured data available for analysis.

2.2 What is Big Data Analytics?
Big Data Analytics is:
1) Technology-enabled analytics: Quite a few data analytics and
visualization tools are available in the market today from leading
vendors such as IBM, Tableau, SAS, R Analytics, Statistica, etc. to help
process and analyze your big data.
2) About gaining a meaningful, deeper, and richer insight into your
business to steer it in the right direction, understanding the
customer’s demographics to cross-sell and up-sell to them, better
leveraging the services of your vendors and suppliers, etc.
3) About gaining a competitive edge by providing findings that allow
quicker and better decision-making.
4) A tight handshake between three communities: IT, business users, and
data scientists.
5) Working with datasets whose volume and variety exceed the current
storage, processing capabilities and infrastructure of your enterprise.
6) About moving code to data. This makes perfect sense as the program
for distributed processing is tiny (a few KBs) compared to the data
(TBs/PBs/ZBs).
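
The idea of moving code to data can be illustrated with a toy sketch. It assumes three hypothetical partitions of log lines, each standing in for data held on a different node. A small function is applied where each partition lives, and only the tiny per-node results travel back. Real platforms such as Hadoop do this across machines. Here everything runs in one process purely for illustration.

```python
# Hypothetical data partitions, each standing in for a block stored on a different node.
partitions = {
    "node-1": ["INFO start", "ERROR disk full", "INFO done"],
    "node-2": ["ERROR timeout", "ERROR retry failed"],
    "node-3": ["INFO idle"],
}


def count_errors(lines):
    """The 'code' that travels to the data: a tiny function run where the lines live."""
    return sum(1 for line in lines if line.startswith("ERROR"))


def run_on_each_node(task, data_by_node):
    """Apply the task locally to each partition and collect only the small results."""
    return {node: task(lines) for node, lines in data_by_node.items()}


partial_counts = run_on_each_node(count_errors, partitions)
total_errors = sum(partial_counts.values())
print(partial_counts, total_errors)
```

Only one integer per node is returned, which is the point of the technique: the shipped function and its result are small, while the data stays put.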
2.3 What Big Data Analytics Isn’t?
• Big data isn’t only about volume; variety and velocity are very
important factors too.
• Big data isn’t just about technology. It is about understanding what
the data is saying to us. It is about understanding relationships that we
never thought existed between datasets. It is about patterns and trends
waiting to be unveiled.
• And of course, big data analytics is not here to replace our now very
robust and powerful Relational Database Management System
(RDBMS) or our traditional Data Warehouse. It is here to coexist with
both RDBMS and Data Warehouse, leveraging the power of each to
yield business value.
• Big data analytics is not a “one-size-fits-all” traditional RDBMS built on
shared disk and memory.
• It is not meant only for huge online companies like Google or Amazon,
but for any business and any industry that needs actionable insights
out of its data (both internal and external).
2.4 Why this Sudden Hype Around Big Data
Analytics?
• Why this sudden hype? Let us put it down to three foremost reasons:
1) Data is growing at a 40% compound annual rate, reaching nearly 45
ZB by 2020. In 2010, about 1.2 trillion gigabytes of data were
generated. This amount doubled to 2.4 trillion gigabytes in 2012 and to
about 5 trillion gigabytes in 2014. The volume of business
data worldwide is expected to double every 1.2 years. Every day 2.5
quintillion bytes of data are created, with 90% of the world’s data
created in the past 2 years alone.
2) Cost per gigabyte of storage has dropped hugely.
3) There is an overwhelming number of user-friendly analytics tools
available in the market today.

Fig 2.3: What big data entails?
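
As a quick check on the growth figures quoted in reason 1, the compound annual growth rate implied by the yearly volumes can be computed directly. The sketch below uses only the numbers stated above (1.2, 2.4, and about 5 trillion gigabytes). The helper name `cagr` is just an illustrative choice.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


# Figures quoted above, in trillions of gigabytes.
rate_2010_2012 = cagr(1.2, 2.4, 2)  # doubling over two years
rate_2010_2014 = cagr(1.2, 5.0, 4)

print(f"2010-2012: {rate_2010_2012:.1%}")
print(f"2010-2014: {rate_2010_2014:.1%}")
```

Doubling every two years corresponds to roughly 41% per year, and the 2010 to 2014 figures give roughly 43%, so the yearly volumes are broadly in line with the quoted 40% compound annual rate.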

2.5 Classification of Analytics
• There are basically two schools of thought:
1) Those that classify analytics into basic, operationalized, advanced, and
monetized.
2) Those that classify analytics into analytics 1.0, analytics 2.0, and
analytics 3.0.
First School of Thought:
1) Basic analytics: This primarily is slicing and dicing of data to help with
basic business insights. This is about reporting on historical data,
basic visualization, etc.
2) Operationalized analytics: It is operationalized analytics if it gets
woven into the enterprise’s business processes.
3) Advanced analytics: This largely is about forecasting for the future by
way of predictive and prescriptive modeling.
4) Monetized analytics: This is analytics in use to derive direct business
revenue.
Classification of Analytics Contd..
Second School of Thought:
Let us take a closer look at analytics 1.0, analytics 2.0, and analytics
3.0. Refer to Table 2.1.

Table 2.1: Analytics 1.0, 2.0, and 3.0
Analytics 1.0 | Analytics 2.0 | Analytics 3.0
Mid 1950s to 2009 | 2005 to 2012 | 2012 to present
Descriptive statistics (report on events, occurrences, etc. of the past) | Descriptive statistics + predictive statistics (use data from the past to make predictions for the future) | Descriptive + predictive + prescriptive statistics (use data from the past to make prophecies for the future and make recommendations)
Key questions asked: What happened? Why did it happen? | Key questions asked: What will happen? Why will it happen? | Key questions asked: What will happen? When will it happen? Why will it happen? What should be the action taken to take advantage of what will happen?
Data from legacy systems, ERP, CRM, and 3rd party applications. | Big data | A blend of big data and data from legacy systems, ERP, CRM, and 3rd party applications.
Small and structured data sources. Data stored in enterprise data warehouses or data marts. | Big data is being taken up seriously. Data is mainly unstructured, arriving at a much higher pace. | A blend of big data and traditional analytics to yield insights and offerings with speed and impact.
Data was internally sourced. | Data was often externally sourced. | Data is being sourced both internally and externally.
Relational databases | Database appliances, Hadoop clusters, SQL to Hadoop environments, etc. | In-memory analytics, in-database processing, agile analytical methods, machine learning techniques, etc.
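
The difference between descriptive, predictive, and prescriptive statistics can be made concrete with a small sketch. The monthly sales series, the least-squares trend line, and the reorder rule below are all assumptions chosen for illustration. They are not methods prescribed by the text.

```python
from statistics import mean

# Hypothetical monthly sales figures (units sold), oldest first.
monthly_sales = [100, 110, 120, 130, 140, 150]


def describe(history):
    """Descriptive: what happened? Summarize the past."""
    return {"mean": mean(history), "min": min(history), "max": max(history)}


def predict_next(history):
    """Predictive: what will happen? Fit a least-squares line and extrapolate one step."""
    n = len(history)
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(history)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


def recommend(forecast, stock_on_hand):
    """Prescriptive: what should we do? A toy rule that orders the shortfall."""
    shortfall = max(0, forecast - stock_on_hand)
    return f"order {shortfall:.0f} units" if shortfall else "no order needed"


summary = describe(monthly_sales)
forecast = predict_next(monthly_sales)
action = recommend(forecast, stock_on_hand=120)
print(summary, forecast, action)
```

The three functions line up with the three columns of the table: the first only reports on the past, the second extrapolates from it, and the third turns the forecast into a recommended action.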
2.6 Greatest Challenges that Prevent
Businesses from Capitalizing on Big Data
1) Obtaining executive sponsorships for investments in big data and its
related activities (such as training, etc.).
2) Getting the business units to share information across organizational
silos.
3) Finding the right skills (business analysts and data scientists) that can
manage large amounts of structured, semi-structured, and
unstructured data and create insights from it.
4) Determining the approach to scale rapidly and elastically. In other
words, the need to address the storage and processing of large volume,
velocity, and variety of big data.
5) Deciding whether to use structured or unstructured, internal or
external data to make business decisions.
6) Choosing the optimal way to report findings and analysis of big data
(visual presentation and analytics) for the presentations to make the
most sense.
7) Determining what to do with the insights created from big data.

2.7 Top Challenges Facing Big Data
• Following are the top challenges of big data:
1) Scale: Storage (RDBMS or NoSQL) is one major concern that needs to be
addressed to handle the need for scaling rapidly and elastically. The need of
the hour is storage that can best withstand the onslaught of the large volume,
velocity, and variety of big data. Should you scale vertically or should you
scale horizontally?
2) Security: Most of the NoSQL big data platforms have poor security
mechanisms (lack of proper authentication and authorization mechanisms)
when it comes to safeguarding big data.
3) Schema: Rigid schemas have no place. We want the technology to be able to
fit our big data and not the other way around. The need of the hour is
dynamic schema. Static (pre-defined) schemas are outdated.
4) Continuous availability: The big question here is how to provide 24/7
support because almost all RDBMS and NoSQL big data platforms have a
certain amount of downtime built in.
5) Consistency: Should one opt for consistency or eventual consistency?
6) Partition tolerance: How to build partition-tolerant systems that can take care
of both hardware and software failures?
7) Data quality: How to maintain data quality (data accuracy, completeness,
timeliness, etc.)? Do we have appropriate metadata in place?
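
One of the data quality dimensions mentioned in point 7, completeness, lends itself to a simple measurement. The sketch below is a minimal example under assumptions: the customer records, the set of required fields, and the definition of completeness as the share of required values that are present are all hypothetical choices.

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "email": "a@example.com", "city": "Hyderabad"},
    {"id": 2, "email": None, "city": "Chennai"},
    {"id": 3, "email": "c@example.com", "city": None},
    {"id": 4, "email": "d@example.com", "city": "Pune"},
]
REQUIRED_FIELDS = ("id", "email", "city")


def completeness(rows, fields):
    """Fraction of required field values that are present across all rows."""
    total = len(rows) * len(fields)
    present = sum(1 for row in rows for f in fields if row.get(f) is not None)
    return present / total if total else 1.0


def incomplete_ids(rows, fields):
    """IDs of rows missing at least one required field."""
    return [row["id"] for row in rows if any(row.get(f) is None for f in fields)]


score = completeness(records, REQUIRED_FIELDS)
flagged = incomplete_ids(records, REQUIRED_FIELDS)
print(f"completeness = {score:.2f}, incomplete rows = {flagged}")
```

Accuracy and timeliness would need their own checks, for example comparison against a trusted source or against record timestamps.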
2.8 Why is Big Data Analytics Important?
• Following are the various approaches to the analysis of data and what each
leads to.
1) Reactive – Business Intelligence: Business Intelligence (BI) allows
businesses to make faster and better decisions by providing the
right information to the right person at the right time in the right
format. It is about analyzing past or historical data and then
displaying the findings of the analysis, or reports, in the form of
enterprise dashboards, alerts, notifications, etc.
2) Reactive – Big Data Analytics: Here the analysis is done on huge
datasets, but the approach is still reactive as it is still based on static
data.
3) Proactive – Analytics: This supports futuristic decision making through
the use of data mining, predictive modeling, text mining, and
statistical analysis. This kind of analysis, however, is severely limited by
storage capacity and processing capability.
4) Proactive – Big Data Analytics: This is sifting through terabytes of
information to pick out the relevant data to analyze. It also
includes high-performance analytics to gain rapid insights from big
data and the ability to solve complex problems using more data.
2.9 What Kind of Technologies are we looking Toward to
Help Meet the Challenges Posed by Big Data?
1) The first requirement is cheap and abundant storage.
2) We need faster processors to help with quicker processing of big data.
3) Affordable, open-source, distributed big data platforms, such as
Hadoop.
4) Parallel processing, clustering, virtualization, large grid
environments (to distribute processing to a number of machines),
high connectivity, and high throughputs rather than low latency.
5) Cloud computing and other flexible resource allocation
arrangements.
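
The divide, process, and combine pattern behind parallel processing in point 4 can be sketched with Python's standard `concurrent.futures` module. A thread pool is used here only to keep the example portable. That is an assumption of the sketch, and CPU-bound work would normally use a process pool or a cluster of machines. The function names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide data into `parts` roughly equal contiguous chunks."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The work done independently on each chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Each worker handles its own chunk; the partial sums are then combined."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


numbers = list(range(1, 101))
result = parallel_sum_of_squares(numbers)
print(result)
```

The combined result equals what a single loop over all the numbers would produce. The workers never need to see each other's chunks, which is what makes the job easy to distribute.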

2.10 Terminologies Used in Big Data Environments
1) In-Memory Analytics: Data access from non-volatile storage such as a
hard disk is a slow process. With in-memory analytics, all the relevant
data is stored in Random Access Memory (RAM), or primary storage, thus
eliminating the need to access the data from hard disk. The advantages
are faster access, rapid deployment, better insights, and minimal IT
involvement.
2) In-Database Processing (analytics): This works by blending data
warehouses with analytical systems. With in-database processing, the
database program itself runs the computations, eliminating the
need for extraction, transformation, and loading (ETL) of data into a
data warehouse and thereby saving time.
3) Symmetric Multiprocessor System (SMP): In SMP, there is a single
common main memory that is shared by two or more identical
processors. The processors have full access to all I/O devices and are
controlled by a single operating system instance. SMPs are tightly
coupled multiprocessor systems. Each processor has its own high-speed
memory, called cache memory, and the processors are connected using a
system bus.
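
The contrast behind in-database processing (item 2 above) can be shown with Python's built-in `sqlite3` module. SQLite is only a stand-in here, not a data warehouse, and the `sales` table and its rows are hypothetical. The first approach pulls every row out and computes in the application. The second lets the database run the computation and return only the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 100.0), ("south", 50.0), ("north", 25.0), ("south", 75.0)],
)

# Approach 1: extract every row, then compute in the application.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
totals_in_app = {}
for region, amount in rows:
    totals_in_app[region] = totals_in_app.get(region, 0.0) + amount

# Approach 2: let the database run the computation and return only the result.
totals_in_db = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
)
conn.close()

print(totals_in_app, totals_in_db)
```

Both approaches give the same totals. With four rows the difference is invisible, but with very large tables the second approach avoids moving the raw rows out of the database at all.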

Terminologies in Big Data Contd..
4) Massive Parallel Processing (MPP): This refers to the coordinated
processing of programs by a number of processors working in parallel.
Each processor has its own operating system and dedicated
memory. The processors work on different parts of the same program, and all
the executing segments can communicate with each other.
5) Difference Between Parallel and Distributed Systems: A parallel
database system is a tightly coupled system in which the processors
co-operate for query processing. The user is unaware of the
parallelism since he/she has no access to a specific processor of the
system. The processors either have access to a common memory or
make use of message passing for communication. Distributed
database systems are loosely coupled and are composed
of individual machines that can each run their own applications and
serve their own users. The data is usually distributed across
several machines, so a number of machines may need
to be accessed to answer a user query.
Terminologies in Big Data Contd..
6) Shared Nothing Architecture: The three most common types of
architecture for multiprocessor high transaction rate systems are:
1. Shared Memory (SM) architecture: a common central memory is
shared by multiple processors.
2. Shared Disk (SD) architecture: multiple processors share a
common collection of disks while having their own private
memory.
3. Shared Nothing (SN) architecture: neither memory nor disk is
shared among multiple processors.
Advantages of a “Shared Nothing Architecture”
1. Fault Isolation: A fault in a single node is contained and
confined to that node exclusively and is exposed only through
messages (or the lack of them).
2. Scalability: Assume that the disk is a shared resource, so that
different nodes have to take turns to access the critical
data. This imposes a limit on how many nodes can be added to
the distributed shared disk system, thus compromising on
scalability. A shared nothing architecture has no such shared
resource, so nodes can be added without this contention.
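
A shared nothing arrangement and its fault isolation property can be sketched as follows. The `Node` and `Cluster` classes, the use of a CRC32 hash to pick the owning node, and the `alive` flag that simulates a failure are all assumptions made for this illustration. They do not describe any particular product.

```python
import zlib


class Node:
    """A node with its own private storage; nothing is shared with other nodes."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # stand-in for this node's private memory and disk
        self.alive = True

    def handle(self, op, key, value=None):
        """The only way in is a message. A failed node simply does not answer."""
        if not self.alive:
            raise ConnectionError(f"{self.name} is not responding")
        if op == "put":
            self.store[key] = value
            return "ok"
        return self.store.get(key)


class Cluster:
    """Routes each key to exactly one node by hashing the key."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]

    def owner(self, key):
        return self.nodes[zlib.crc32(key.encode()) % len(self.nodes)]

    def put(self, key, value):
        return self.owner(key).handle("put", key, value)

    def get(self, key):
        return self.owner(key).handle("get", key)


cluster = Cluster(["node-0", "node-1", "node-2"])
for i in range(9):
    cluster.put(f"user{i}", i)
```

If one node is marked as failed, only the keys it owns become unreachable, and the failure shows up solely as an unanswered message. Keys owned by the other nodes are unaffected, which is the fault isolation described above.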
Terminologies in Big Data Contd..
7) CAP Theorem (Brewer’s Theorem): This states that in a distributed
computing environment (a collection of interconnected nodes that
share data), it is impossible to provide all three of the following
guarantees at the same time. At least one must be sacrificed.
1. Consistency implies that every read fetches the last write.
2. Availability implies that reads and writes always succeed.
3. Partition tolerance implies that the system will continue to
function when a network partition occurs.
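
The trade-off stated by the CAP theorem can be illustrated with a toy two-replica store. Everything in this sketch is hypothetical, including the class names, the `mode` switch, and the `partitioned` flag that simulates a network partition. It only shows which guarantee is given up once a partition occurs.

```python
class Replica:
    """Holds one copy of a single value."""

    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Two replicas of one value; `mode` picks what to give up during a partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, value):
        """Write via replica A, replicating to B when the link is up."""
        if self.partitioned:
            if self.mode == "CP":
                # Stay consistent by refusing the write: availability is sacrificed.
                raise RuntimeError("unavailable during partition")
            # AP: accept the write locally; B is now stale, so consistency is sacrificed.
            self.a.value = value
            return
        self.a.value = self.b.value = value

    def read_from_b(self):
        """Read the copy held by replica B."""
        return self.b.value


cp_store, ap_store = TwoNodeStore("CP"), TwoNodeStore("AP")
```

After a partition, a write to the CP store fails while both replicas still agree. The same write to the AP store succeeds, but a read from replica B then returns the old value.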

2.11 Basically Available Soft State Eventual
Consistency (BASE)
• A few basic questions to start with:
1) Where is it used? In distributed computing.
2) Why is it used? To achieve high availability.
3) How is it achieved? If no new updates are made to a given data item
for a stipulated period of time, eventually all accesses to this data item
will return the updated value.
4) What is replica convergence? A system that has achieved eventual
consistency is said to have converged or achieved replica convergence.
5) Conflict resolution: How is the conflict resolved?
(a) Read repair: If a read reveals a discrepancy or inconsistency, a
correction is initiated. This slows down the read operation.
(b) Write repair: If a write reveals a discrepancy or inconsistency, a
correction is initiated. This slows down the write operation.
(c) Asynchronous repair: The correction is not part of a read or write
operation.
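
Read repair and replica convergence can be sketched with three in-memory replicas. The version counter used to decide which copy is newest is an assumption of this sketch. Real systems use mechanisms such as timestamps or vector clocks, and the names below are illustrative only.

```python
class Replica:
    """One copy of a data item, tagged with a version number."""

    def __init__(self):
        self.version, self.value = 0, None


def write(replica, value, version):
    """Apply a write to a single replica only; the others are temporarily stale."""
    replica.version, replica.value = version, value


def read_with_repair(replicas):
    """Read all replicas, return the newest value, and fix any stale replica."""
    newest = max(replicas, key=lambda r: r.version)
    for r in replicas:
        if r.version < newest.version:
            r.version, r.value = newest.version, newest.value  # the repair step
    return newest.value


def converged(replicas):
    """Replica convergence: every replica holds the same version and value."""
    return len({(r.version, r.value) for r in replicas}) == 1


replicas = [Replica() for _ in range(3)]
write(replicas[0], "x", version=1)
before = converged(replicas)
value = read_with_repair(replicas)
after = converged(replicas)
print(before, value, after)
```

The extra loop inside the read is why read repair slows reads down: the correction happens as part of the read operation itself.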
2.12 Few Top Analytics Tools.
Below is a list of a few top analytics tools.
1. MS Excel
2. SAS
3. IBM SPSS Modeler
4. Statistica
5. Salford Systems
6. World Programming Systems (WPS)
Open Source Analytics Tools
1. R analytics
2. Weka
