
DISTRIBUTED FILE SYSTEMS
BY:
T.GUNA SEKHAR – 18BCI0002
U. CHAITANYA – 18BCE0292
M.SAI KRISHNA – 18BCE0783
S.SUSANTH – 18BCE0552
ABSTRACT
Distributed file systems provide a fundamental abstraction for
location-transparent, permanent storage. They allow distributed
processes to cooperate on hierarchically organized data beyond
the lifetime of each individual process. The great power of the
file system interface lies in the fact that applications do not need
to be modified in order to use distributed storage. On the other
hand, the general and simple file system interface makes it
notoriously difficult for a distributed file system to perform well
under a variety of different workloads. This has led to today's
landscape with a number of popular distributed file systems,
each tailored to a specific use case.
ABSTRACT
Early distributed file systems merely execute file system calls on a
remote server, which limits scalability and resilience to failures. Such
limitations have been greatly reduced by modern techniques such
as distributed hash tables, content-addressable storage, distributed
consensus algorithms, and erasure codes. In light of upcoming
scientific data volumes at the exabyte scale, two trends are
emerging. First, the previously monolithic design of distributed file
systems is decomposed into services that independently provide a
hierarchical namespace, data access, and distributed
coordination. Second, the segregation of storage and computing
resources gives way to a storage architecture in which every
compute node also participates in providing persistent storage.
INTRODUCTION
A distributed file system is a client/server-based application that
allows clients to access and process data stored on the server as if it
were on their own computer. When a user accesses a file on the
server, the server sends the user a copy of the file, which is cached
on the user's computer while the data is being processed and is
then returned to the server. Ideally, a distributed file system
organizes file and directory services of individual servers into a
global directory in such a way that remote data access is not
location-specific but is identical from any client. All files are
accessible to all users of the global file system, and the organization is
hierarchical and directory-based.
INTRODUCTION:

Since more than one client may access the same data
simultaneously, the server must have a mechanism in place (such as
maintaining information about the times of access) to organize
updates so that the client always receives the most current version
of the data and data conflicts do not arise. Distributed file systems
typically use file or database replication (distributing copies of data
on multiple servers) to protect against data access failures. Sun
Microsystems' Network File System (NFS), Novell NetWare, Microsoft's
Distributed File System, and IBM/Transarc's DFS are some examples
of distributed file systems.
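To make this concrete, the following is a minimal sketch, not taken from any particular system, of how a server could keep a version number per file so that clients always receive the most current copy and conflicting updates are detected rather than silently overwriting newer data. The class and method names are hypothetical.

```python
# Minimal sketch (hypothetical names): a server that tracks one version
# number per file. Reads return the latest copy; writes are accepted only
# if they were based on that latest version, so stale updates are rejected.

class SimpleFileServer:
    def __init__(self):
        self._files = {}  # path -> (version, content)

    def read(self, path):
        """Return the current version number and content of a file."""
        version, content = self._files.get(path, (0, b""))
        return version, content

    def write(self, path, content, base_version):
        """Accept the write only if it is based on the latest version."""
        current_version, _ = self._files.get(path, (0, b""))
        if base_version != current_version:
            raise ValueError("conflict: file changed since it was read")
        self._files[path] = (current_version + 1, content)
        return current_version + 1


server = SimpleFileServer()
version, data = server.read("/home/alice/notes.txt")
server.write("/home/alice/notes.txt", b"updated text", base_version=version)
```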
LITERATURE SURVEY:
HOW ARE THESE FILE SYSTEMS USED?
Even though the file system interface is general and fits a broad spectrum of applications,
most distributed file system implementations are optimized for a particular class of
applications. For instance, the Andrew File System (AFS) is optimized for users' home
directories, XrootD is optimized for high-throughput access to high-energy physics data sets,
and the Hadoop File System (HDFS) is designed as a storage layer for the MapReduce framework.
These use cases differ both quantitatively and qualitatively. Consider a multi-dimensional
vector describing different levels of properties or requirements for a particular class of data,
with dimensions for data value, data confidentiality, redundancy, volume, median file size,
change frequency, and request rate. Every single use case above poses high requirements in
only some of the dimensions. All of the use cases combined, however, would require a
distributed file system with outstanding performance in every dimension. Moreover, some
requirements contradict each other: a high level of redundancy (e.g. for recorded
experiment data) inevitably reduces the write throughput in cases where redundancy is not
needed (e.g. for a scratch area). The file system interface provides no standard way to
specify quality of service properties for particular files or directories. Instead, we have to resort
to using a number of distributed file systems, each with implicit quality of service guarantees
and mounted at a well-known location (/afs, /eos, /cvmfs, /data, /scratch, ...). Quantitative
file system studies, which are unfortunately rare, provide precise workload characterizations
to guide file system implementers.
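As an illustration of such a requirement vector, the sketch below rates three use cases along the dimensions listed above. The numeric ratings are invented purely for illustration, not measurements from the source.

```python
# Illustrative only: each use case is rated 1 (low) to 5 (high) along the
# requirement dimensions named above. The ratings are invented examples.

DIMENSIONS = ["data value", "confidentiality", "redundancy", "volume",
              "median file size", "change frequency", "request rate"]

home_directories = dict(zip(DIMENSIONS, [4, 5, 4, 2, 1, 4, 3]))
physics_datasets = dict(zip(DIMENSIONS, [5, 1, 3, 5, 5, 1, 4]))
scratch_area     = dict(zip(DIMENSIONS, [1, 1, 1, 3, 3, 5, 5]))

# No single profile is demanding in every dimension, but a file system
# serving the combination would need to score high everywhere at once.
combined = {d: max(home_directories[d], physics_datasets[d], scratch_area[d])
            for d in DIMENSIONS}
print(combined)
```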
LITERATURE SURVEY:
ARCHITECTURE EVOLUTION:
The simplest architecture for a distributed file system is a single server that exports a local
directory tree to a number of clients (e.g. NFSv3). This architecture is obviously limited
by the capabilities of the exporting server. An approach to overcome some of these
limitations is to delegate ownership and responsibility of certain file system subtrees to
different servers, as done by AFS. In order to provide access to remote servers, AFS
allows for loose coupling of multiple file system trees (“cells”). Across cells, this
architecture is not network-transparent: moving a file from one cell to another requires a
change of path. It also involves a copy through the node that triggers the move, i.e.
a move is not a namespace-only operation. Furthermore, the partitioning of a file system
tree is static and changing it requires administrative intervention. In object-based file
systems, data management and meta-data management are separated (e.g. GFS). Files
are spread over a number of servers that handle read and write operations. A meta-
data server maintains the directory tree and takes care of data placement. As long as
meta-data load is much smaller than data operations (i.e. files are large), this
architecture allows for incremental scaling. As the load increases, data servers can be
added one by one with minimal administrative overhead. The architecture is refined by
parallel file systems (e.g. Lustre) that cut every file into small blocks and distribute the
blocks over many nodes. Thus read and write operations are executed in parallel on
multiple servers for better maximum throughput.
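The block distribution used by parallel file systems can be sketched as simple round-robin striping. The block size and server names below are assumptions chosen for illustration, not taken from any particular system.

```python
# Minimal sketch of round-robin striping: a file is cut into fixed-size
# blocks and each block is assigned to one of several data servers, so
# reads and writes of a large file can proceed in parallel.

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB stripes (assumed)
SERVERS = ["data01", "data02", "data03", "data04"]  # hypothetical servers

def place_blocks(file_size, block_size=BLOCK_SIZE, servers=SERVERS):
    """Map each block index of a file to the server that stores it."""
    num_blocks = (file_size + block_size - 1) // block_size
    return {block: servers[block % len(servers)] for block in range(num_blocks)}

# A 10 MiB file is split into three blocks spread over three different
# servers, so the client can read them concurrently.
print(place_blocks(10 * 1024 * 1024))
```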
LITERATURE SURVEY:
FILE SYSTEM INTEGRITY:
Global file systems often need to transfer data via untrusted connections and
still ensure integrity and authenticity of the data. Cryptographic hashes of the
content of files are often used to ensure data integrity. Cryptographic hashes
provide a short, constant-length, unique identifier for data of any size. Collisions
are virtually impossible, whether by chance or by deliberate crafting,
which makes cryptographic hashes a means to protect against data
tampering. Many globally distributed file systems use cryptographic hashes in
the form of content-addressable storage, where the name of a file is derived
from its cryptographic content hash. This allows for verification of the data
independently of the meta-data. It also results in immutable data, which
eliminates the problem of detecting stale cache entries and keeping caches
consistent. Furthermore, redundant data and duplicated files are
automatically de-duplicated, which in some use cases (backups, scientific
software binaries) reduces the actual storage space utilization by a large
factor.
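A minimal sketch of content-addressable storage, assuming SHA-256 as the content hash: the address of an object is the hash of its bytes, so integrity can be verified on every read and identical content is stored only once. The class and method names are hypothetical.

```python
import hashlib

# Minimal sketch of content-addressable storage: the "name" of an object is
# the SHA-256 hash of its content, so data can be verified independently of
# any meta-data and duplicate content collapses onto a single stored copy.

class ContentAddressedStore:
    def __init__(self):
        self._objects = {}  # hex digest -> content

    def put(self, content: bytes) -> str:
        """Store the content and return its address (the content hash)."""
        digest = hashlib.sha256(content).hexdigest()
        self._objects[digest] = content  # duplicates map to the same key
        return digest

    def get(self, digest: str) -> bytes:
        """Fetch content by address and verify it before returning it."""
        content = self._objects[digest]
        if hashlib.sha256(content).hexdigest() != digest:
            raise IOError("content does not match its address (corruption)")
        return content


store = ContentAddressedStore()
address = store.put(b"experiment results")
assert store.get(address) == b"experiment results"
```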
LITERATURE SURVEY:
FILE SYSTEM INTEGRITY:
Cryptographic hashes are also used to protect the integrity of the file
system tree when combined with a Merkle tree. In a Merkle tree, nodes
recursively hash their children's cryptographic hashes, so that the root hash
uniquely identifies the state of the entire file system. Copies of this root hash
created at various points in time provide access to previous snapshots of the
file system, which effectively allows for backups and for versioned file
systems. The hashes in the tree can also be cryptographically signed in
order to ensure data authenticity of a file system or a subtree (who
created the content). An elegant way to solve the problem of key
distribution inherent to digital signatures is to encode the public key
as part of the path name. To protect against silent corruption (the
probabilistic decay of physical storage media over time), simple
checksums such as CRC32 provide an easy means. Checksums can be
verified faster than cryptographic hashes, fast enough to be computed
on the fly on every read access.
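A minimal sketch of a Merkle tree over a directory hierarchy, assuming SHA-256; the helper names and the example tree are hypothetical. Any change to any file changes the root hash, which is why signing only the root hash is enough to authenticate the whole tree.

```python
import hashlib

# Minimal sketch of a Merkle tree: a file node is hashed by its content,
# a directory node by the concatenation of its (sorted) children's names
# and hashes. The root hash therefore identifies the state of the whole tree.

def file_hash(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()

def merkle_root(node) -> str:
    """node is either bytes (a file) or a dict name -> node (a directory)."""
    if isinstance(node, bytes):
        return file_hash(node)
    child_hashes = "".join(name + merkle_root(child)
                           for name, child in sorted(node.items()))
    return hashlib.sha256(child_hashes.encode()).hexdigest()

# Hypothetical example tree: two directories with a few small files.
tree = {"etc": {"motd": b"hello"},
        "data": {"run1.dat": b"0101", "run2.dat": b"0110"}}
root_before = merkle_root(tree)
tree["data"]["run1.dat"] = b"1111"        # any change anywhere in the tree...
assert merkle_root(tree) != root_before   # ...changes the root hash
```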
