Big Data Storage Concepts
S.Kavitha
Head & Assistant Professor
Department of Computer Science
Sri Sarada Niketan College of Science for Women, Karur.
• clusters
• file systems and distributed file systems
• NoSQL
• sharding
Clusters
• In computing, a cluster is a tightly coupled collection of servers, or nodes.
• These servers usually have the same hardware specifications and are connected together via a network to work as a single unit.
File Systems and Distributed File Systems
• A file system is the method of storing and organizing data on a storage device, such as a flash drive, DVD or hard drive.
• A file is an atomic unit of storage used by the file system to store data.
NoSQL
• A Not-only SQL (NoSQL) database is a non-relational database that is highly scalable, fault-tolerant and specifically designed to house semi-structured and unstructured data.
• A NoSQL database often provides an API-based query interface that can be called from within an application.
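The idea of an API-based query interface can be illustrated with a minimal sketch. The class `DocumentCollection` and its methods `insert` and `find` are hypothetical names made up for this example; they do not belong to any real NoSQL product. The sketch only shows how an application might store records with differing fields and query them through method calls rather than SQL statements.

```python
class DocumentCollection:
    """Toy in-memory document store (hypothetical, for illustration only)."""

    def __init__(self):
        # Documents are plain dicts; no fixed schema is enforced.
        self._docs = []

    def insert(self, doc):
        # Store a copy so later changes by the caller do not affect the store.
        self._docs.append(dict(doc))

    def find(self, **criteria):
        # Return every document whose fields match all given criteria.
        return [
            d for d in self._docs
            if all(d.get(field) == value for field, value in criteria.items())
        ]


# The application calls the API directly instead of sending SQL text.
readings = DocumentCollection()
readings.insert({"device": "d1", "type": "temperature", "value": 21.5})
readings.insert({"device": "d2", "type": "log", "message": "restart"})
temperature_docs = readings.find(type="temperature")
```

Note that the two inserted documents carry different fields, which is the kind of semi-structured data the slide refers to.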
Sharding
• Sharding is the process of horizontally partitioning a large dataset into a collection of smaller, more manageable datasets called shards.
1. Each shard can independently service reads
and writes for the specific subset of data that
it is responsible for.
2. Depending on the query, data may need to be
fetched from more than one shard.
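The two points above can be sketched in a few lines. This is a toy illustration under stated assumptions: the class `ShardedStore`, the choice of hash-based key routing, and the use of in-memory dictionaries in place of separate servers are all assumptions of this example, not something the slides prescribe. Real systems may use range-based or directory-based partitioning instead.

```python
import hashlib


class ShardedStore:
    """Toy store that horizontally partitions records by key (illustrative)."""

    def __init__(self, num_shards=2):
        # Each dict stands in for an independent shard (normally a separate node).
        self.shards = [dict() for _ in range(num_shards)]

    def _shard_for(self, key):
        # A stable hash ensures the same key always maps to the same shard.
        digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def put(self, key, record):
        # A write touches only the shard responsible for this key.
        self.shards[self._shard_for(key)][key] = record

    def get(self, key):
        # A read by key is serviced by a single shard.
        return self.shards[self._shard_for(key)].get(key)

    def find(self, predicate):
        # A query that is not keyed must visit every shard and merge results.
        results = []
        for shard in self.shards:
            results.extend(r for r in shard.values() if predicate(r))
        return results


store = ShardedStore(num_shards=2)
for user_id, city in [(1, "Karur"), (2, "Chennai"), (3, "Karur"), (4, "Salem")]:
    store.put(user_id, {"id": user_id, "city": city})

one_user = store.get(3)                                   # single-shard read
karur_users = store.find(lambda r: r["city"] == "Karur")  # may span shards
```

A lookup by key goes to exactly one shard, while the `find` call has to consult every shard because the city is not the partitioning key.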