
Wireless Personal Communications

https://doi.org/10.1007/s11277-022-10108-2

A Novel Distributed File System Using Blockchain Metadata

Deepa S. Kumar1 · S. Dija2 · M. D. Sumithra3 · M. Abdul Rahman4 · Praseeda B. Nair5

Accepted: 15 October 2022


© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022

Abstract
Cluster computing has become an inevitable part of data processing, as the volume of data produced by sources such as online social media, IoT devices, mobile data, sensor data and black box data grows exponentially. A Distributed File System defines the methods used to distribute, read and delete files across the nodes of a computing cluster. Popular distributed file systems such as the Google File System and the Hadoop Distributed File System store their metadata centrally. This creates a single point of failure and gives rise to the need for backup and recovery mechanisms when the metadata server fails. In addition, the name node server is built on expensive, highly reliable hardware; for small and medium clusters it is not cost effective to maintain such a server, and although cheap commodity hardware may substitute for the name node, it is prone to hardware failure. This paper proposes a novel distributed file system that distributes files over a cluster of machines connected in a Peer-to-Peer network. The most significant feature of the file system is its capability to distribute the metadata through distributed consensus, using hash values. Although the distributed metadata is visible to all peers, the methodology ensures that it is immutable and irrefutable. As part of the in-depth research, the proposed file system has been successfully tested on the Google Cloud Platform. The basic operations of read, write and delete on the Distributed File System with distributed metadata are also compared with those of the Hadoop Distributed File System in terms of distribution time on the same cluster setup. The novel distributed file system provides better results compared to the existing methodologies.

Keywords  Cluster computing · Google file system · Peer-to-Peer network · HDFS · Single point of failure · Metadata

* Deepa S. Kumar
[email protected]
1 College of Engineering Munnar, Idukki, India
2 Centre for Development of Advanced Computing, Thiruvananthapuram, India
3 LBS Institute of Technology for Women, Thiruvananthapuram, India
4 LBS Centre for Science and Technology, Thiruvananthapuram, India
5 Amal Jyothi College of Engineering, Kanjirappally, Kottayam, India


1 Introduction

A Distributed File System (DFS) is a set of services on a collection of nodes with the capability to distribute file contents while providing location transparency and availability. Other features include replica management, fault tolerance, data rebuild, and error detection and correction. The core of a DFS lies in its metadata and metadata management.
The first-generation distributed file systems, such as the Network File System (1974–95), were only network storage file systems. Later, with the development of distributed object systems, persistent stored datasets with a visible namespace were introduced, along with sharing of data between users through access mechanisms such as read-only access, concurrent access and access control, and the mounting of files [1].
Later, new architectures were proposed and implemented with high-speed networks and data distributed among different servers, and then the P2P architecture was introduced, in which the service is distributed at the level of individual files. The initial file system designs satisfied transparency, heterogeneity, efficiency and fault tolerance, while only limited concurrency, replication, consistency and security were achieved.
The rise of Big Data initially led to the Hadoop ecosystem with two main components: a storage component, known as the Hadoop Distributed File System (HDFS), in which the data are distributed, and a processing component known as MapReduce, the most popular processing framework. The HDFS architecture follows a client-server design with the name node as the server and the data nodes as slaves [2, 3]. The name node, which keeps all the metadata about the huge collection of data stored on cheap commodity data nodes, is designed with expensive and reliable hardware because it is a single point of failure.
The main motivation for decoupling metadata and data was better performance, especially on very large clusters such as those at Yahoo. When the entire metadata is kept in the RAM of a central server, its processing becomes fast, and the data nodes can simultaneously handle multiple read and write operations. But horizontal scalability is limited by the RAM capacity of the server [4]. Hence, several proposals and implementations of distributed metadata and partitioned metadata servers have evolved and are still being developed. The Giraffa File System is one of the latest projects, based on dynamically partitioned metadata on a cluster of metadata servers [5].
For relatively small clusters, the need to maintain an expensive component for keeping metadata is another major issue pointed out in this paper. The HDFS design mainly focused on overall system throughput rather than individual operation latency [4].
This paper proposes a new file system without a central name node for metadata storage and processing. It outlines the implementation details of the proposed DFS basic operations, such as read, write, delete, metadata creation and metadata distribution. The read and write performance of the proposed DFS is compared with that of the traditional HDFS. The main feature of the proposed system is the distribution of the metadata in an immutable manner instead of keeping it centrally as in HDFS. The software architectural components are also explored in detail.


2 Review on Existing Distributed File Systems

This section reviews existing data-parallel storage architectures: the Google File System, the Hadoop Distributed File System, the Cassandra File System, the Ignite File System and the Giraffa File System.

2.1 Google File System (GFS)

GFS, developed by Google, was implemented for cluster storage with the capability to store very large files, and it is optimized for system-to-system rather than user-to-system communication. GFS was designed as a scalable distributed file system for large, distributed, data-intensive applications [6]. It provides fault tolerance, runs on commodity hardware, and is optimized for a write-once-read-many access pattern. GFS exhibits scalability, reliability, availability and good performance. Handling component failures and huge volumes of data are the main challenges addressed by GFS [7]. It provides a file system interface for all operations on files.
Figure 1 indicates the various architectural components of GFS. GFS was mainly intended for Google's core data storage in its search engine. A GFS cluster is organized as a master node, which keeps track of the metadata, and a collection of chunk servers. Files are partitioned into chunks with a default size of 64 MB, and each chunk is assigned a unique 64-bit chunk handle by the master node for mapping the data chunks. Reliability is ensured by replicating each chunk on multiple chunk servers. The master maintains all file system metadata. All metadata are kept in the master's memory, and hence master operations are fast. As a backup mechanism in case of a master node failure, the first two types of metadata are also kept in an operation log stored on the master's local disk and replicated on remote machines [6]. The chunk location information is known to the chunk servers and is requested by the master during startup and whenever a chunk server joins the cluster. The permissible operations are read and append. Data replication is handled automatically by the system, which maintains at least three copies of each chunk. GFS achieves comparable read performance but is relatively slow when writing data to files due to the verification procedure involving the modifying chunk master (shadow master).

Fig. 1  GFS Architecture


2.2 Hadoop Distributed File System (HDFS)

HDFS, an advanced implementation of the GFS design, addresses the main issues of data locality and data replication [8]. HDFS is the most commonly used storage layer for MapReduce programming (HDFS/MapReduce). HDFS is organized as a master/slave architecture with a central server, the name node, which keeps the metadata; a secondary name node, which acts as the checkpointing node; and a cluster of commodity hardware nodes, known as data nodes, which store the data. In the HDFS architecture, the name node and secondary name node are expensive to maintain. In earlier versions of Hadoop, there were two daemons, the job tracker on the name node and the task tracker on the data nodes, responsible for resource allocation and task processing. Hadoop 2.x introduces a component known as YARN, which acts as a resource manager and takes over the role of the job tracker daemon on the name node and the task tracker on the data nodes. The metadata, with details such as HDFS location, filename, replicas and path, are kept in the name node. The primary issues of large-scale data management, such as fault tolerance, scalability and reliability, are handled well. Figure 2 depicts the HDFS architecture.
The major drawback of the HDFS architecture is that the existence of a central name node leads to a single point of failure, and for extremely large clusters the single name node limits the scalability of the cluster size. Another issue is that the entire workflow in the HDFS architecture is based on the metadata, which is not sufficiently secured. Privacy and anonymity are not addressed in the existing HDFS architecture. HDFS mainly focuses on overall performance for large clusters; the latency and the expense of maintaining the main cluster servers, the name node and the secondary name node, appear high for relatively small clusters.

2.3 Ignite File System (IGFS)

Apache Ignite, a component of the Hadoop ecosystem, provides a unique in-memory computing distributed file system for improving the processing performance of big data [9]. It is an Application Programming Interface placed on top of HDFS, which can be plugged into Apache Hadoop or Spark as shown in Fig. 3. IGFS does not need a name node; it automatically determines file data locality using a hashing function. The IGFS architecture eliminates the overhead associated with the job tracker and task trackers in the HDFS architecture, thereby providing low-latency, high-performance distributed processing. However, the Ignite nodes in IGFS are relatively expensive, and IGFS is not suitable for data-intensive processing.

Fig. 2  HDFS Architecture (*https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html)

Fig. 3  IGFS Architecture [9]

2.4 Cassandra File System (CFS)

CFS offers better storage than HDFS by eliminating the name node, the secondary name node and the job tracker. CFS is represented as a keyspace with two column families in Cassandra. Like HDFS, it provides replica management, and the corresponding settings are made on the keyspace. The two column families are the inode column family, which tracks each file's metadata and block locations, and the sblock column family, in which the file blocks are stored. The CFS architecture is both fault tolerant and scalable. The metadata is stored in the inodes.
CFS is organized as a serverless network instead of the master/slave configuration of HDFS. The Gossip protocol is used for communication among peers. When a write request arrives at an analytic node, as depicted in Fig. 4, CFS writes the metadata information into a table called inode. Blocks are created with ID numbers for each sub-block and written to Cassandra. For handling a read request, the metadata table information is used to select blocks, which introduces delays due to disk I/O operations. Also, searching the metadata in the NoSQL database incurs additional execution time [10].

Fig. 4  Cassandra File System Architecture (*http://www.datastax.com/dev/blog/cassandra-file-system-design)


2.5 Giraffa File System (GiFS)

HDFS keeps all the metadata and the entire namespace in the RAM of a single name node, and hence the growth of the file system is limited to 20 PB [12]. Adding more nodes by dynamically partitioning the namespace led to the development of a new distributed file system called the Giraffa File System. It is a highly available distributed file system that utilizes the features of HDFS and HBase. Ceph [13], Lustre [14], CassandraFS [10] etc. are other distributed file systems with a distributed namespace. GiFS is designed with minimal changes to the existing components of HDFS and HBase, and the project is in the experimental stage. The Giraffa requirements include metadata scalability, load balancing and speed. GiFS is implemented with Giraffa clients that read the metadata from the HBase database and exchange file data with the data nodes (Fig. 5).
The distributed block management module in GiraffaFS supports block managers, which maintain a flat namespace of blocks and manage block allocation, replication, removal, data node management and the storage of namespace metadata in HBase.
The architectural diagram shows the various functionalities of the application as follows.

• The Giraffa client gets blocks and files from HBase.
• The client directly queries the block manager for file and block access.
• The application can get data directly from the data nodes.

The Giraffa system is in an experimental stage, and it suits large clusters comprising thousands of nodes. Instead of keeping the metadata centrally, the metadata is partitioned and stored dynamically across multiple metadata servers, and the file system handles load balancing as well.

Fig. 5  Giraffa File System Architecture (*https://www.slideshare.net/KonstantinVShvachko/hdfs-design-principles)

3 Proposed Software of DFS

The software offers various file services, such as the storage, extraction and deletion of data across the cluster. The most important distributed file systems are the Google File System (GFS) [6], the Cassandra File System (CFS) [10], the Ignite File System (IGFS) [9], the Hadoop Distributed File System (HDFS) [11] and the Giraffa File System (GiFS) [5]. This paper proposes a Distributed File System with Distributed Metadata (DFS-DM) that exhibits features such as data recovery through replica management, error detection through Cyclic Redundancy Check (CRC) and block rebuild using parity block addition.

3.1 Software Architecture of DFS‑DM

This software manages the storage of several nodes, which are connected by the P2P net-
work. The software offers a file system interface for the DFSClients.

3.1.1 Components

The major software components of DFS-DM are DFSClient, DFSAdmin and the DFS
module as shown in Fig. 6. DFSClient issues read/write/delete request to the DFSAdmin.
DFSAdmin responds with the rObj/wObj to the DFSClient. Then the DFSClient interacts
with the DFS module using the read/write object given by the DFSAdmin. If the DFSCli-
ent issues a delete request, the DFSAdmin directly interacts with the DFS module. Delete-
Handler of the DFS module offers a delObj with the details of the file to be deleted and acts
as an interface to the cluster, where the actual data resides.
The DFS module includes the following important functional blocks.

3.1.1.1  BlockGenerator  This software module reads the input file submitted by the DFS-
Client and splits it into blocks of 128 MB. The blocks are stored as separate files of size
128 MB, except for the last block.
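As a rough illustration only (the paper does not specify its implementation language, so Python is used here, and the function and file names are hypothetical), the block-splitting step could look like this:

import os

BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, the default block size used by DFS-DM

def split_into_blocks(input_path, output_dir):
    # Read the input file and write it out as numbered block files of up to
    # 128 MB each; only the last block may be smaller.
    os.makedirs(output_dir, exist_ok=True)
    block_paths = []
    with open(input_path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(BLOCK_SIZE)
            if not chunk:
                break
            block_path = os.path.join(output_dir, "B%d.blk" % (index + 1))
            with open(block_path, "wb") as dst:
                dst.write(chunk)
            block_paths.append(block_path)
            index += 1
    return block_paths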

3.1.1.2  ParityGenerator  ParityGenerator module creates parity for the consecutive 10 data
blocks and is stored as a separate file, along with data blocks. Parity generation is based on
XOR operation on a bunch of 10 consecutive blocks of size 128 MB and hence the length of
parity will also hold the same size.
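A minimal sketch of the XOR parity computation described above is given below, assuming that a shorter final block is treated as zero-padded (an assumption; the paper does not show its parity code):

def xor_parity(blocks):
    # Byte-wise XOR of a group of up to 10 data blocks; the parity block is
    # as long as the largest block in the group, with shorter blocks treated
    # as zero-padded. Any single lost block in the group can later be rebuilt
    # by XOR-ing the parity block with the remaining data blocks.
    length = max(len(b) for b in blocks)
    parity = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)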

Fig. 6  Major DFS-DM software components


Table 1  Distribution of data blocks [B1-B12] and replicas [R1-R12]

Node 1    Node 2    Node 3    Node 4
B1        B4        B7        B10
B2        B5        B8        B11
B3        B6        B9        B12
R10       R1        R4        R7
R11       R2        R5        R8
R12       R3        R6        R9

Table 2  Distribution of data blocks [B1-B12] and replicas [R1-R12]

Node 1    Node 2    Node 3    Node 4
B1        B4        B7        B10
B2        B5        B8        B11
B3        B6        B9        B12
R4        R1        R2        R3
R7        R8        R5        R6
R10       R11       R12       R9

3.1.1.3  IPPieceMapper & Replica Manager  This software module prepares the list of IP addresses of the available data nodes. It incorporates a scheduling policy as a mapping function from the list of available IPs to the blocks generated by the BlockGenerator.
The IPPieceMapper maps the data blocks (primary and secondary copies) to the available list of IPs. The Replica Manager manages the number of block replicas to be maintained; by default, two copies of each block are kept, known as the primary copy and the secondary copy.

3.1.1.4  BlockDistributer  The functionality of the BlockDistributer is to send the data blocks to the assigned list of data node IPs according to the IPPieceMapper's mapping function. The distribution strategy is chosen based on the time needed to process the blocks on each node. Two alternative methods are depicted in Tables 1 and 2.
It is assumed that a file is split into 12 primary blocks [B1-B12] and 12 block replicas [R1-R12] as the secondary copy. Table 1 shows the distribution of blocks in units of three consecutive blocks: [B1-B3] are written to the first node, their replicas [R1-R3] are written to the next node in the cluster, and the procedure is repeated for all 12 data blocks on a cluster of 4 data nodes. If the time to process one block is one minute, then the total processing time for each node is three minutes. So, if node 1 fails, node 2 has to process all the blocks allocated to node 1, and the total execution time becomes six minutes.
Table 2 depicts the distribution of the 12 blocks [B1-B12] and replicas [R1-R12] using a different approach from that of Table 1. The first three consecutive blocks [B1-B3] are stored on the first node of the cluster, and their replicas [R1-R3] are distributed over the remaining three nodes, with one replica on each subsequent node in a circular fashion. Here, if a node fails, the replica recovery process completes in four minutes on the remaining three nodes, so two minutes are saved, but at the expense of involving three nodes. In the schedule of Table 1, by contrast, a single node bears the load of recovering all the replicas of the failed node.
3.1.1.5  MetadataCreater  After the BlockDistributer completes successfully, the MetadataCreater creates the metadata, which contains information such as the filename, the block size and the list of IPs storing the replicas. Metadata is maintained as blocks using a data dictionary, a data structure in which the data are represented as (key, value) pairs. Each metadata block contains two hashes, as shown in Fig. 7, in addition to the actual metadata. The current hash [H1] is computed using the SHA256 algorithm over the current metadata block. The previous hash [H2] is also attached to each block. For the first metadata block, the previous hash is generated from a Genesis block [15].
Figure 7 illustrates the blocks [Mblk1-Mblk5] created for files [F1-F5]; each block stores the metadata corresponding to one file, together with two hash values: H1, the current hash, and H2, the previous hash. The genesis block is a randomly generated hash value that indicates the start of the chain. The blocks are linked through the H2 in each block, which is received from the previous block. If the metadata of any file is corrupted, the subsequent hash values will not match, and hence alterations to the metadata can easily be detected.
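A minimal sketch of building such a hash-chained metadata block is shown below; the field names, the JSON serialization and the use of wall-clock time are illustrative assumptions, since the paper only specifies that SHA256 links each block to its predecessor:

import hashlib
import json
import time

def create_metadata_block(previous_hash, filename, block_size, replica_ips):
    # Assemble the metadata payload, attach H2 (previous hash), and compute
    # H1 (SHA-256 over the serialized block) so that tampering with any block
    # invalidates the hashes of all subsequent blocks.
    payload = {
        "filename": filename,
        "block_size": block_size,
        "replica_ips": replica_ips,
        "timestamp": time.time(),
        "previous_hash": previous_hash,   # H2
    }
    serialized = json.dumps(payload, sort_keys=True).encode()
    payload["current_hash"] = hashlib.sha256(serialized).hexdigest()  # H1
    return payload

# The first block is chained to a hash derived from a Genesis block, e.g.:
genesis_hash = hashlib.sha256(b"genesis").hexdigest()
mblk1 = create_metadata_block(genesis_hash, "F1", 128 * 1024 * 1024,
                              ["10.0.0.1", "10.0.0.2"])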

3.1.1.6  MetadataDistributer  The proposed software implements the metadata as a chain of blocks distributed across all machines in the P2P network. All the data nodes on the network can view the entire blockchain contents, but the contents cannot be altered by anyone on the network. Figure 8 shows a P2P network of nodes with storage (S) and compute (C) functionality. The metadata blocks created are distributed to all the nodes of the P2P network.

Fig. 7  Metadata blocks [Mblk1-Mblk5] for files [F1-F5], each carrying a current hash (H1) and a previous hash (H2) and chained back to a Genesis block


Fig. 8  Blockchain-based distributed file system architecture

3.1.1.7  FileWriter  FileWriter acts as a file controller that accepts the wObj (issued by the DFSAdmin) from the DFSClient. FileWriter accepts data from the DFSClient, the BlockGenerator computes the required number of block files, and the data are kept in a block queue until the last block of the file has been read. For better performance, data are written to n + 1 disks as separate files, with n disks storing the data simultaneously and 1 disk storing the parity. The parity blocks are generated for every 10 consecutive data blocks.
The parity blocks are distributed among the n + 1 disks. For each block of data received, a checksum is computed by the CRC module. Hence, the FileWriter converts the file into n block files + 1 parity file + 1 checksum file. Any errors can be detected during the read process using the checksum file, and the affected block can be rebuilt from the parity file using an XOR calculation. FileWriter then fetches the list of IPs of the data nodes available on the cluster from the IPPieceMapper, and finally all three sets of files corresponding to the original file are written to the data nodes. The files are replicated with a factor of 2, as a primary copy and a secondary copy. The distribution of the two copies follows either the schedule in Table 1 or that in Table 2.
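The per-block checksum step could be sketched as follows; the one-CRC32-value-per-line file format is an assumption, as the paper only states that a CRC checksum file is written alongside the block and parity files:

import zlib

def write_checksum_file(block_paths, checksum_path):
    # Compute a CRC32 checksum for every block file and store the values,
    # one "path <tab> crc" entry per line, in a separate checksum file that
    # the FileReader can later use to detect corrupted blocks.
    with open(checksum_path, "w") as out:
        for path in block_paths:
            with open(path, "rb") as f:
                crc = zlib.crc32(f.read())
            out.write("%s\t%08x\n" % (path, crc))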

3.1.1.8  FileReader  FileReader is another file controller that assists the DFSClient in performing an error-free read operation. When the DFSClient receives the rObj from the DFSAdmin, it passes the rObj to the FileReader. The FileReader then fetches the metadata from the respective data node. After getting the metadata from the blockchain, the block is validated using the timestamp field, and the FileReader reads the data blocks, places them on the data queue, and verifies them using the checksum file. In case of errors, the corrupted block can be rebuilt using the information in the parity file. If the data nodes on the cluster fail to return the data blocks, the FileReader issues requests to read the data blocks from the secondary copy.

3.1.1.9  DeleteHandler  As soon as the DFSClient issues the delReq to the DFSAdmin, the admin passes the request to the DFS Module. The DeleteHandler of the DFS Module creates a delObj and passes this object to the cluster nodes. The data nodes on the cluster verify availability using the metadata. If the blocks are found, they are marked for deletion by setting an invalid timestamp, and this information is propagated on the P2P network. The DFS Module communicates with the DFSAdmin and acknowledges that the deletion was successful, and the DFSAdmin finally returns a deletion-successful message to the DFSClient. If the blocks are missing, the DFS Module returns an unsuccessful message, which the DFSAdmin passes back to the DFSClient.
The block diagram of the DFS module illustrating its different functionalities is shown in Fig. 9.

Fig. 9  Functional components of DFS-Module

Fig. 10  Read operation workflow (DFS-DM)

3.2 Basic Operations of DFS‑DM

The basic operations of DFS-DM are reading, writing and deleting files from/to the Distributed File System; the detailed workflows for these operations are depicted in Figs. 10, 11 and 12, respectively.


Fig. 11  Write operation workflow (DFS-DM)

Fig. 12  Delete operation workflow (DFS-DM)


3.2.1 Read Operation Dataflow

Step 1: DFSClient issues a read request to DFSAdmin.


Step 2: Admin returns the read object.
Step 3: DFSClient connects to the DFS Module using read object.
Step 4: DFS Module interacts with any of the cluster nodes to get metadata from the
chain.
Step 5: DFS module receives the metadata block from the cluster and validates the
timestamp field.
Step 6: If metadata with valid time stamp is found, the DFS module then issues a block
read request to the cluster nodes.
Step 6.1: FileReader reads the blocks, stores them in a temporary storage called the DataQueue, and also maintains an AckQueue.
Step 6.2: FileReader checks the received blocks using the Cyclic Redundancy Check (CRC) error detection mechanism. If a block is found to be corrupted, the DFS Module fetches the secondary copy of that block.
Step 6.3: Otherwise, if there is no response from the cluster nodes, FileReader reissues the block request for the secondary copy of the file after a random amount of time.
Step 6.4: The secondary copy is read from the cluster and the blocks are merged to recover the original data.
Step 6.5: If the secondary copy is also found to be corrupted, a single lost block can be rebuilt using the parity block attached to each file.
Step 7: FileReader merges the blocks and checks for errors using CRC. If no errors are encountered, the original file is returned to the DFSClient. Otherwise, the affected block is rebuilt from the parity data.
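The error-handling path of steps 6.2-6.5 can be sketched as below, reusing the XOR parity idea from Sect. 3.1.1.2; read_secondary is a hypothetical callback, and the length bookkeeping for a zero-padded final block is omitted:

import zlib

def recover_block(index, blocks, parity, expected_crc, read_secondary):
    # Return a verified copy of block `index`: check the primary copy's CRC,
    # fall back to the secondary copy, and as a last resort rebuild the block
    # by XOR-ing the parity block with the surviving blocks of its group.
    candidate = blocks[index]
    if zlib.crc32(candidate) == expected_crc:
        return candidate
    secondary = read_secondary(index)
    if secondary is not None and zlib.crc32(secondary) == expected_crc:
        return secondary
    rebuilt = bytearray(parity)
    for i, block in enumerate(blocks):
        if i == index:
            continue
        for j, byte in enumerate(block):
            rebuilt[j] ^= byte
    return bytes(rebuilt)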

3.2.2 Write Operation Dataflow

Step 1: DFSClient submits the file write request to DFSAdmin.


Step 2: Admin returns wObj.
Step 3: DFSClient submits a file to FileWriter.
Step 4: BlockGenerator () accepts the file and divides it into different blocks.
Step 5: BlockGenerator () adds the generated block into a BlockQue.
Step 6: ParityGenerator () generates parity for the blocks. After all data writes are com-
mitted, the parity is stored on the parity file.
Step 7: FileStreamWriter reads the block one by one from the BlockQue and with the
help of IPPieceMapper () writes blocks on the data nodes.
Step 8: The acknowledgement is stored in AckQue for each block read operation.
Step 9: Metadata is created for each file as a block; the chain grows by one block for each file. The metadata blocks are linked using the computed hashes attached to every block.
Step 10: The metadata chain is distributed on the cluster of data nodes.

3.2.3 Delete Operation Dataflow

Step 1: DFSClient issues delete request.


Step 2: DFSAdmin produces delObj to DeleteHandler.


Step 3: The metadata block that corresponds to the data to be deleted is located.
Step 3.1: If the metadata block is found, the DeleteHandler sets an artificially invalid timestamp, propagates the invalid timestamp to all the data nodes on the cluster, and acknowledges the admin with a deletion-successful message, which the admin in turn returns to the DFSClient.
Step 3.2: Otherwise, the DeleteHandler returns a deletion-unsuccessful message, which the admin in turn returns to the DFSClient.
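A toy sketch of the deletion marking in step 3.1 is shown below; the sentinel value used for the invalid timestamp is an assumption, since the paper does not specify how it is encoded:

INVALID_TIMESTAMP = -1  # hypothetical sentinel marking a deleted file

def mark_deleted(metadata_chain, filename):
    # Locate the metadata block of `filename` and mark it as deleted by
    # setting an invalid timestamp; the updated state is then propagated to
    # all peers on the P2P network. Returns True if a live block was found.
    for block in metadata_chain:
        if block["filename"] == filename and block["timestamp"] != INVALID_TIMESTAMP:
            block["timestamp"] = INVALID_TIMESTAMP
            return True
    return False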

4 Characteristics of DFS‑DM vs HDFS

• The DFS-DM file system provides high availability by maintaining a single secondary copy as a replica and a parity file. If a single node on the network fails, the secondary copy is fetched automatically. If the secondary copy also fails, the corresponding blocks can be rebuilt using the parity file stored along with the data file. In HDFS, this feature is achieved by maintaining two replicas, so the storage wastage is high compared with that of the proposed DFS-DM.
• DFS-DM provides an error detection procedure by adding the CRC checksum file. In
HDFS, the integrity check feature is not included.
• DFS-DM follows a P2P network configuration with equal priority to all the nodes
on the network, implemented using cheap commodity hardware. HDFS follows a
client server architecture, where there is a single central authority, called name node,
which is maintained with expensive hardware. Unlike DFS-DM, HDFS architecture
suffers from a single point of failure.
• DFS-DM distributes the metadata to all the peers on the P2P network using block-
chain; whereas, in HDFS, metadata is stored on a central name node.
• DFS-DM provides only one replica and addresses two node failures simultaneously.
When the primary and secondary copy fails, then rebuild operation is initiated using
the distributed parity file stored in the cluster.

HDFS, by default, maintains two replicas and addresses two node failures. The storage space needed by HDFS is three times the original data, whereas in DFS-DM the space required is only two times the original data, plus a small amount of space for the parity and checksum files.
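As a rough worked example using the figures above (128 MB blocks, one parity block per 10 data blocks, one secondary copy versus three copies in HDFS), a 12.8 GB file of 100 blocks occupies about 100 × 128 MB × 3 = 38.4 GB in HDFS, but only about 100 × 128 MB × 2 + 10 × 128 MB, roughly 26.9 GB, in DFS-DM, i.e. close to 30% less raw storage.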

5 Results and Discussion

All the software components of DFS-DM are implemented and tested as described in the previous sections. Performance testing is also done by setting up a local cluster of four PCs, each with an i5 core, 4 GB RAM and a 500 GB HDD, networked through a D-LINK switch (10/100 Mbps). The time taken to distribute different datasets to the cluster (PUT time) and to collect the datasets from the cluster (GET time) is then measured. The PUT/GET times are also measured on the big data Hadoop cluster file system, HDFS, with one name node and three data nodes.


6 Performance Comparison of DFS‑DM with HDFS

The performance of the proposed system is evaluated by measuring the time to AddFile to the cluster and GetFile from the cluster, and the results are compared with HDFS. DFS-DM shows PUT/GET times almost identical to those measured in HDFS, as shown in Fig. 13.
The results show that the existing HDFS and the proposed file system take approximately the same time to complete writes and reads of data on the same cluster setup. The existing HDFS has to maintain expensive components such as the name node, secondary name node, stand-by passive name node and name node clusters in different configurations. The proposed system shows similar performance with a P2P network configuration, and hence expensive servers are not needed for metadata management. The expensive hardware in the existing systems can therefore be replaced with software-based blockchain metadata creation and distribution. The metadata maintained on the blockchain ensures immutability and anonymity by design, and the validity of the metadata can be checked through the timestamp available on each block.

6.1 Integration of DFS‑DM to MPI

DFS-DM is optimized in different ways.

• Copying of replicas according to an appropriate scheduling policy.
• Automatic fault tolerance by keeping a secondary copy.
• A rebuild option for lost blocks, using parity files computed as the XOR of the available data blocks.
• Synthesis of a file system library from DFS-DM that can be included in any parallel computing environment.

6.2 Performance Comparison of Sentiment Analysis on Different Datasets in HDFS/Apache Spark vs DFS-DM/MPI

Sentiment Analysis on Spark vs MPI


The performance of the distributed and cluster computing frameworks, i.e., Spark and MPI, is compared using the test case application, sentiment analysis on Twitter data.

Fig. 13  Comparison of distribute/collect time in HDFS vs DFS-DM


Table 3  Comparison of execution time on HDFS/Spark vs DFS-DM/MPI for the sentiment analysis application

Dataset size (GB)   Execution time HDFS/Spark (min)   Execution time DFS-DM/MPI (min)
32                  54.09                             56.42
64                  147.38                            96.41
92                  192.19                            147.55
116                 292.30                            172.7
200                 394.50                            302.47
500                 891.18                            833.17
1 TB                1793.29                           1667.42

Fig. 14  Comparison of execution time on HDFS/Spark vs DFS-DM/MPI for sentiment analysis

Sentiment analysis is the process of recognizing opinions in sentences; the sentiments can be positive, negative or neutral [16]. The sentiment analysis application is written in Scala and tested on different file sizes, and the execution time is evaluated in both the Spark and MPI frameworks, as shown in Table 3. The performance comparison is made between HDFS/Spark and the proposed DFS-DM in the MPI framework, and the results show that DFS-DM/MPI outperforms HDFS/Spark on all but the smallest dataset, as shown in Fig. 14.

7 Conclusion

The newly proposed blockchain-based distributed file system is successfully implemented in a P2P network using only commodity hardware. The main advantages achieved by this novel methodology include fault tolerance, availability, a rebuild facility, an error-check mechanism, and immutable distributed metadata creation and management, as the proposed architecture adopts these features of the Hadoop Distributed File System. Moreover, it is a cost-effective solution for setting up a cluster. Thus, the expensive server hardware is replaced by blockchain technology without sacrificing the read/write performance. A detailed study of Spark processing is conducted with Spark 2.0.2 on Hadoop 2.7.3 and YARN on a Google Cloud cluster setup, and the experimental results are analysed based on execution time. A sentiment analysis application is chosen as the test case; it is developed in Scala for Spark processing and in C++ for the MPI implementation, and the execution time is analysed on both Spark and MPI. The detailed quantitative analysis of Apache Spark vs MPI is carried out on datasets of up to 1 TB [17], and the results show that the execution speed of MPI is roughly 1.5 times faster than Spark processing. Hence, the proposed blockchain-based Distributed File System library can be integrated into any big data or HPC application on a low-cost hardware setup without compromising read and write performance.

8 Future Works and Challenges

The proposed decentralized infrastructure can be implemented using the InterPlanetary File System (IPFS) [18] as future work. IPFS is a protocol and peer-to-peer network for storing and sharing data in a distributed file system. It uses content addressing to uniquely identify each file in a global namespace connecting all computing devices. However, IPFS does not store large files as single objects: files are stored inside IPFS objects of up to 256 KB in size, and IPFS objects can also contain links to other IPFS objects. The major challenge in an IPFS implementation is that files larger than 256 KB, such as images or videos, are split into multiple IPFS objects, and the system creates an empty IPFS object that links to all the other pieces of the file. Each object is hashed and given a unique content identifier, which serves as a fingerprint. This makes it faster and easier to store the small pieces of data on the network. But if the empty linking object goes offline, the file becomes difficult to access. Hence, replication of the metadata in IPFS is needed, which increases the cost of operating the blockchain.

Author contribution  All authors contributed to this work.

Funding  No funding.

Data Availability  Data sharing not applicable—no new data generated.

Code Availability  We used our own data and coding.

Declarations 
Conflict of interest  No conflicts of interest to disclose.

Human and Animal Rights  Humans and animals are not involved in this research work.

References
1. Li, X. S., et al. (2011). Analysis and simplification of three-dimensional space vector PWM for three-phase four-leg inverters. IEEE Transactions on Industrial Electronics, 58, 450–464.
2. https://www.slideshare.net/wahabtl/chapter-8-distributed-file-systems
3. Shvachko, K., et al. (2010). The Hadoop distributed file system. Yahoo!, Sunnyvale, California, USA.
4. White, T. (2009). Hadoop: The definitive guide. O'Reilly Media, Yahoo! Press.
5. Shvachko, K. V. (2010). HDFS scalability: The limits to growth. ;login:.
6. Shvachko, K. V., et al. (2017). File systems and storage: Scaling namespace operations with Giraffa file system. ;login: 42(2), 27–30, Summer 2017. www.usenix.org
7. Ghemawat, S., et al. (2003). The Google file system. In: Proceedings of the ACM Symposium on Operating Systems Principles, Lake George, NY, pp. 29–43.
8. McKusick, M. K., et al. (2009). GFS: Evolution on fast-forward. ACM Queue, 7(7). ACM, New York.
9. http://hadoop.apache.org/releases.pdf
10. https://ignite.apache.org/features/igfs.html
11. https://www.datastax.com/wp-content/uploads/2012/09/WP-DataStax-HDFSvsCFS.pdf
12. Shvachko, K. V. (2006). The Hadoop distributed file system requirements. Hadoop Wiki. http://wiki.apache.org/hadoop/DFS_requirements
13. https://www.slideshare.net/KonstantinVShvachko/hdfs-design-principles
14. Weil, S., et al. (2006). Ceph: A scalable, high-performance distributed file system. In: Proceedings of OSDI '06: 7th Conference on Operating Systems Design and Implementation. USENIX Association.
15. Lustre: http://www.lustre.org
16. Bhaskar, et al. (2016). Bitcoin mining technology. In: Handbook of digital currency: Bitcoin, innovation, financial instruments, and big data. Academic Press, pp. 47–51. Retrieved 2 December 2016 via ScienceDirect.
17. Kumar, D. S., et al. (2017). Performance evaluation of Apache Spark vs MPI: A practical case study on Twitter sentiment analysis. Journal of Computer Sciences, 13(12), 781–794. https://doi.org/10.3844/jcssp.2017.781.794
18. Steichen, M., et al. (2018). Blockchain-based, decentralized access control for IPFS. In: 2018 IEEE Confs on Internet of Things, Green Computing and Communications, Cyber, Physical and Social Computing, Smart Data, Blockchain, Computer and Information Technology, Congress on Cybermatics.

Publisher’s Note  Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under
a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted
manuscript version of this article is solely governed by the terms of such publishing agreement and applicable
law.

Deepa S. Kumar  Associate Professor and HOD, Department of CSE, College of Engineering Munnar, received the M.Tech Degree from
Cochin University of Science & Technology. She pursued her PhD
degree in Big Data from Karpagam Academy of Higher Education,
Coimbatore. She has presented papers in various National and Interna-
tional conferences, authored two journal papers in the area of Big Data
& High-Performance Computing, and authored & published one book.
Her area of specialization includes blockchain, cryptography and big
data processing.


S. Dija  is a Scientist F at the Centre for Development of Advanced Computing (C-DAC), Thiruvananthapuram, India. C-DAC is the premier
R&D organization of the Ministry of Electronics and Information
Technology (MeitY), Govt. of India for carrying out R&D in IT, Elec-
tronics and associated areas. She joined CDAC in the year 1999 and
she has more than 22 years of experience in Research and Develop-
ment, particularly in the Cyber Forensics Area. She is currently lead-
ing various Cyber Forensics Projects in the area of Live Forensics and
Internet Forensics. She developed various indigenous Cyber Forensics
Tools and deployed them to various Law Enforcement Agencies
(LEAs) in India including IB, NIA, Defense, CBI, FSL, State Polices
etc. She has trained more than 10,000 officials from various LEAs on
Cyber Forensics. She has extended technical support in cyber crime
analysis to different LEAs in India. Her areas of expertise include Disk
Forensics, Memory Forensics, EMail Forensics and Malware Foren-
sics. She has authored 24 International Research Publications in the
Cyber Forensics area. Dija S obtained her Master of Technology from
BITS, Pilani in 2019, and Bachelor of Technology from Kerala University in 1998.

M. D. Sumithra  is an Assistant Professor in the LBS Institute of Technology for Women, Thiruvananthapuram, affiliated with APJ Abdul
Kalam Kerala Technological University. She has 20 years of teaching
experience, and her research and teaching interests include data analyt-
ics, image processing and blockchain technology. She received her
PhD degree from the Computer Science and Engineering Department,
Karpagam Academy of Higher Education in 2019. She obtained her
Master of Technology in 2012 and Bachelor of Technology in 1998
from Kerala University.

M. Abdul Rahman  is currently the Pro-Vice Chancellor of APJ Abdul Kalam Technological University, Kerala. He was the former Director of
All India Council for Technical Education (AICTE), the apex body of
technical education under Ministry of Human Resource Development,
Government of India which regulates Engineering, Management,
Architecture, Hotel Management, Pharmacy institutions of the country.
He received the Doctor of Philosophy (Ph.D.) degree in Computer Sci-
ence & Engineering from Karpagam University. He obtained his Mas-
ter of Technology from Kerala University in 2004, and Bachelor of
Technology from Calicut University in 1998.


Praseeda B. Nair  is presently working as an Assistant Professor at Amal Jyothi College of Engineering, Kanjirapally, Kerala, affiliated to
APJ Abdul Kalam Technological University, Kerala. Praseeda has 12
years of teaching experience, and her research interests include digital
image processing, signal processing, cryptography, and big data analy-
sis. She earned her M.Tech degree from Mahatma Gandhi University
in 2013 with specialization in communication engineering.
