Cascade
K.JanakiRam,
Classification of
DATA
Types of Data
4 Vs OF BIG DATA
COLLECTING DATA
Data is collected at sensors and sent to the big data system via events or
flat files.
Event streams: we name events by their content or originator.
STORING DATA
Historically we used databases
Scale is a challenge: replication, sharding
Scalable options
NoSQL (Cassandra, HBase) [if data is structured]
Distributed file systems (e.g. HDFS) [if data is unstructured]
NewSQL
In-memory computing, VoltDB
Specialized data structures
Graph Databases, Data structure servers
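Sharding, named above as one answer to the scale challenge, can be illustrated with a minimal sketch. It assumes simple hash-based partitioning; the helper `shard_for`, the shard count, and the sensor keys are hypothetical and not taken from any particular database.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash (illustrative only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 4  # assumed cluster size
shards = {i: [] for i in range(NUM_SHARDS)}

# Hypothetical record keys, e.g. one per sensor.
for key in ["sensor-1", "sensor-2", "sensor-3", "sensor-4", "sensor-5"]:
    shards[shard_for(key, NUM_SHARDS)].append(key)

for index, keys in shards.items():
    print(index, keys)
```

The same key always lands on the same shard, so reads can be routed without a lookup table. Real systems usually add replication and a rebalancing scheme on top of this.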
How to move?
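Assuming a 10 Gb/s network at full line rate, it takes about 13 minutes to copy
1 TB, or about 9 days to copy 1 PB. At an effective throughput of about
1.1 Gb/s, this becomes 2 hours for 1 TB and 83 days for 1 PB.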
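Copy times like these can be checked with a short back-of-the-envelope calculation. The sketch below assumes decimal units (1 TB = 10^12 bytes), an ideal link with no protocol overhead, and a hypothetical helper `copy_time_seconds`; the two link rates are illustrative.

```python
def copy_time_seconds(size_bytes: float, link_bits_per_sec: float) -> float:
    """Ideal time to push size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_sec


TB = 10**12  # decimal terabyte, in bytes
PB = 10**15  # decimal petabyte, in bytes

for label, rate in [("10 Gb/s", 10e9), ("1 Gb/s", 1e9)]:
    tb_hours = copy_time_seconds(TB, rate) / 3600
    pb_days = copy_time_seconds(PB, rate) / 86400
    print(f"{label}: 1 TB in {tb_hours:.2f} h, 1 PB in {pb_days:.1f} days")
```

Real transfers are typically slower than the ideal figure because of protocol overhead, disk speed, and contention on the link.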
How to Search?
Assuming each record is 1 KB and one machine can
process 1000 records per second, it needs about 278 CPU hours (roughly 11.6 days) to
process 1 TB and about 32 CPU years to process 1 PB.
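Under the stated assumptions (1 KB records, 1000 records per second on one machine), the processing time can be worked out in a few lines. This is a rough sketch with decimal units and a hypothetical helper `cpu_seconds`; it ignores I/O and parallelism.

```python
def cpu_seconds(total_bytes: float,
                record_bytes: int = 1000,
                records_per_sec: int = 1000) -> float:
    """Single-machine time to scan total_bytes of fixed-size records."""
    records = total_bytes / record_bytes
    return records / records_per_sec


TB = 10**12  # decimal terabyte, in bytes
PB = 10**15  # decimal petabyte, in bytes

tb_seconds = cpu_seconds(TB)
pb_seconds = cpu_seconds(PB)

print(f"1 TB: {tb_seconds / 3600:.0f} CPU hours ({tb_seconds / 86400:.1f} days)")
print(f"1 PB: {pb_seconds / (365 * 86400):.1f} CPU years")
```

Spreading the scan across N machines divides the wall-clock time by roughly N, which is the motivation for distributed processing.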
How to process?
How to adapt existing algorithms to work on large data sets
How to create new algorithms
CHALLENGES
Systems built from many computers
that handle lots of data
and run complex logic.
This pushes us to the frontier of distributed systems
and databases.
More data does not mean there is a simple
model.
Some models can be as complex as the system.
WHAT IS HADOOP?
FUTURE SCOPE
ROBOTICS
BIG DATA
THANK YOU
Mail me at,
[email protected]
Mobile No :9247661152.
K. JANAKI RAM
12481A0456