A Compendium On Distributed Systems
Abstract— Computer systems have evolved over the years, from sizable, single-user, slow, and expensive machines to multi-user, fast, cheaper, and small-sized machines. The use of multi-user computer networks has given rise to a new paradigm of computing known as Distributed Systems. A distributed system is regarded as software consisting of a collection of interdependent network communication and computational nodes. This paradigm yields high performance while also maintaining high efficiency, because computing tasks are decentralized across several interconnected computer nodes. Although distributed systems have proven beneficial over the years, they also have design flaws, security concerns, and challenges. The main objective of this paper is to define these issues, challenges, and security concerns while also examining the various solutions developed over the years to resolve them. This paper also briefly covers the components and the working of Distributed Systems.

Keywords— Heterogeneity, Concurrency, Scalability, HTCondor, Entropia, Encryption.

A distributed system is a computing environment in which numerous components are located across multiple computing devices on a network. These devices divide the work, communicating and coordinating their efforts so as to appear to the end-user as a single cohesive system. A distributed system can be an arrangement of different configurations, such as mainframes, computers, workstations, and minicomputers.

This definition highlights two principal features of distributed systems. The first is that a distributed system is made up of various components, which can be either software or hardware. These computing elements, referred to as nodes, are autonomous of each other. The second is that users feel as though they are dealing with a single system.

1.2 Why do we need Distributed Systems?
The secondary controller acts as a process or communication controller. It is responsible for regulating the flow of server processing requests and managing the system's translation load. It also governs the communication between systems and VANs or trading partners.

2.1.3 User-Interface client

The user interface client is an additional element in the system whose task is to provide important system information to the users. The user interface controller is not part of a clustered environment and does not operate on the same machines as the controller. Its functions include monitoring and controlling the system.

2.1.4 System datastore

The system's data store normally resides on the disk vault, whether the system is clustered or not, and each system has only one data store for all shared data. In non-clustered systems, the data store can be present on a single machine or across several machines, but all the computers should have access to this data store.

2.1.5 Database

In a distributed system, all the data is stored in a relational database. After locating the data in the database, the data store shares it among multiple users. All the data systems have relational databases, which allow multiple users to use the same information at the same time.

4. Peer-to-peer - A peer-to-peer (P2P) network works on the concept of a decentralized system. There are no additional machines used to provide services or manage resources. The machines in the system, called peers, have uniformly distributed responsibilities. Peers can act as both clients and servers.

2.3 Working of Distributed Systems

Distributed systems have evolved, and today's use cases are designed to operate over the internet and the cloud. The working of a distributed system can be explained through an example. The process begins with a task such as rendering a video. The video editor manages this task and splits the work into pieces. Multiple computers (referred to as nodes) are used, with each node receiving one frame of the video. Each node works independently on a single frame, and when it completes the rendering of that frame, the video editor gives it a new frame to work on. This process continues until all frames have been rendered. Because the system is distributed, there is no inherent limit on how many nodes can be used. Ideally, the time required to complete the rendering is inversely proportional to the number of nodes, so using more nodes means the process finishes faster. It is described in the diagram below.
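
The frame-by-frame distribution in the rendering example can be sketched as a shared work queue. This is a minimal illustration under stated assumptions, not the implementation of any particular renderer: threads stand in for networked nodes, and the names render_frame and render_video are hypothetical.

```python
import queue
import threading


def render_frame(frame_id):
    # Hypothetical stand-in for the real rendering work done on a node.
    return f"frame-{frame_id}-rendered"


def render_video(num_frames, num_nodes):
    """Hand frames to worker threads ("nodes") one frame at a time."""
    pending = queue.Queue()
    for frame_id in range(num_frames):
        pending.put(frame_id)

    results = {}
    results_lock = threading.Lock()

    def node():
        # Each node keeps asking for the next frame until none remain.
        while True:
            try:
                frame_id = pending.get_nowait()
            except queue.Empty:
                return
            rendered = render_frame(frame_id)
            with results_lock:
                results[frame_id] = rendered

    workers = [threading.Thread(target=node) for _ in range(num_nodes)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    # Reassemble the frames in their original order.
    return [results[i] for i in range(num_frames)]
```

Here the queue plays the role of the video editor: a node that finishes a frame immediately receives the next one, so faster nodes simply process more frames.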
All the nodes are connected through the network, and they communicate with each other while the process is running. These nodes work in tandem to complete the process while also being able to perform operations independently. Several models and architectures are in use today, such as client-server and peer-to-peer systems. Both of these architectures produce the same output, but their methods of achieving it differ. The working of a distributed system is explained by the above example.

III. SECURITY

Security in distributed systems can roughly be divided into two parts. One part concerns the communication between users or processes, possibly residing on different machines. The principal mechanism for ensuring secure communication is the secure channel. The second part concerns authorization, which deals with ensuring that a process receives access rights only to the resources of the distributed system to which it is entitled. Secure channels and access control require mechanisms to distribute cryptographic keys, as well as mechanisms to add users to and remove them from a system. These subjects are covered by what is called security management.

Security in a computer system is closely related to the idea of dependability. Informally, a dependable computer system is one that we can justifiably trust to deliver its services. Dependability encompasses elements such as availability, reliability, safety, and maintainability. However, to ensure that the system can be fully trusted, it is also crucial to consider the protection of sensitive information and the preservation of data integrity. Improper alterations in a secure computer system must be detectable and recoverable. The most important assets of any computer system are its hardware, software, and data.

3.1 Security Threats

1. Interception

Interception refers to the situation in which an unauthorized entity has gained access to a service or some information.

2. Interruption

Interruption is the condition in which services or data become hard to get, unusable, corrupt, etc. A denial-of-service attack, in which a third party maliciously attempts to make a service inaccessible to other entities, is a security threat classified as an interruption.

3. Modification

4. Fabrication

Fabrication is the condition in which extra data or activity is created that would normally be absent.

3.2 Security Requirements

1. Confidentiality

Confidentiality means that data is inaccessible to unauthorized individuals. Data is made accessible to the concerned authorities only after the authentication process. Confidentiality is maintained by encrypting the data.

2. Integrity

Integrity prevents unauthorized changes to the data and detects any changes that are made. Many authentication algorithms are used for such validation processes. This ensures that only authorized parties can modify any piece of data.

3. Availability

Availability ensures that data is available to the concerned authorities, and only to them. Data is not available for arbitrary persons to modify or change, which is why availability is a security requirement.

3.3 Security Mechanisms

1. Encryption

Encryption is fundamental to computer security. It transforms data into a form that an attacker cannot easily interpret. Encryption provides a means to enforce data confidentiality and allows us to check whether data has been modified. The basic approaches are:

i. Conventional Encryption: Here the same key is used by the sender to encrypt a message and by the receiver to decrypt it. It is also known as Symmetric Encryption.
ii. Public-key Encryption: Here a pair of keys is used. One key is made public for anyone to use, while the other is kept private. It is also known as Asymmetric Encryption.

2. Authentication
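
As a small illustration of checking whether data has been modified, the sketch below uses a keyed hash (HMAC) from Python's standard library. It assumes the same shared-key setting as conventional (symmetric) encryption, in which sender and receiver hold the same secret key. Note that HMAC authenticates a message but does not encrypt it, so it provides integrity, not confidentiality. The key, the message, and the function names sign and verify are hypothetical.

```python
import hashlib
import hmac
import os


def sign(key, message):
    # Compute an authentication tag over the message with the shared key.
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical shared secret known only to sender and receiver.
shared_key = os.urandom(32)
message = b"transfer 100 units to node B"
tag = sign(shared_key, message)
```

The receiver recomputes the tag over the message it received; if the message was altered in transit, or the sender did not know the shared key, verification fails.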
Figure 5. Challenges Of Distributed Systems

4.1 Heterogeneity

Heterogeneity concerns providing access to various services and executing applications across a diverse set of computers. Services and applications span a widely varied collection of computers and networks that can be used, run, and accessed by clients and end-users over the internet. This variety includes:

• Hardware devices: computers, tablets, mobile phones, embedded devices, etc.
• Operating systems: Windows, Linux, macOS, and Unix.
• Networks: local networks, the internet, wireless networks, and satellite connections.
• Programming languages: Java, C/C++, Python, and PHP.

For communication to be implemented, standardized internet protocols must be agreed upon and adopted. The term middleware pertains to a tier of software that supplies a programming abstraction in addition to masking the

4.2 Transparency

4.4 Concurrency

Concurrency control manages simultaneous access to resources. It stops multiple users from making changes to the same record at the same time and also orders transactions for backup and recovery purposes. For example, during an auction multiple participants bid on the same object. Similarly, in a distributed environment, several clients may try to access a shared resource or service at the same time. For an object to be safe in a concurrent environment, its operations must be synchronized so that its data remains consistent. This can be achieved with techniques such as semaphores, which are widely used in many applications. Concurrency issues arise when multiple users try to access the same set of data or shared resources. This can significantly change the final output of the process because, without synchronization,
the processes get executed in the order in which they are inputted.

4.5 Security

Many of the information resources that are made available and maintained in distributed systems have a high intrinsic value to their users. Their security is therefore of considerable importance. When public networks are used, security is the biggest issue concerning the distributed environment. Security for information resources has three components:

1. Confidentiality (protection against disclosure to unauthorized persons)
2. Integrity (protection against alteration or corruption)
3. Availability (protection against interference with the means of accessing the resources)

All three must be provided in a distributed system. Encryption is one of the methods used to address security concerns, ensuring that only authorized and legitimate users can access the resources, modify them, and perform operations on them. Presently, many institutions and organizations in the world have designed and developed systems with distributed environments that possess all of the security features mentioned above.

4.6 Scalability

A system encounters scalability issues when it cannot handle an abrupt increase in the number of resources or users. In such situations, the architecture and algorithms must be used efficiently. A system may need to scale along dimensions such as size, geography, or administration.

• Size: Size is the number of users and resources to be processed. Problems that may arise due to size include overloading.
• Geography: Geography is the distance between users and resources. Communication reliability is one issue that arises due to geographic limitations.
• Administration: Nodes of distributed systems need to be controlled as the dimensions steadily increase. Administrative chaos and its related difficulties arise due to scalability problems associated with administration.

diminution of dependability. In other words, in a distributed system, there may be some elements that are inoperable while others remain functional. This phenomenon is referred to as partial failure. Because the time a message takes to travel to its destination is non-deterministic, partial failures are unpredictable: we may get no acknowledgment telling us whether a request has succeeded or failed. Partial failures include node crashes and communication connectivity issues in distributed systems.

V. SOLUTIONS TO CHALLENGES IN DISTRIBUTED SYSTEM

5.1 Heterogeneity solution

To address issues related to heterogeneity, we show how selected approaches in the field of distributed computing handle different kinds of heterogeneity. While there are numerous systems for sharing computation, this overview presents selected frameworks that represent the full spectrum. We introduce a batch workload management system (HTCondor), a project-based volunteer and grid computing system (BOINC), a service-oriented desktop grid approach (Aneka), an enterprise grid (Entropia), and a library-based programming model for distributed computing (libWater).

The development of distributed systems is the main focus of this discussion. The following properties of tasks are not considered: their accessibility and their nature.

The frameworks presented take different approaches in general. Each system has a different focus, for instance high throughput, security, or application integration.

The table below shows how they handle different dimensions of heterogeneity.

Hardware | Operating System | Programming Language | Accessibility | Nature of Tasks
There should be no apparent difference between local and remote access methods. In other words, explicit communication may be hidden. For example, from a user's point of view, access to a remote service such as a printer should be the same as access to a local printer. From the programmer's point of view, access to a remote object may be the same as access to a local object of the same class. This transparency has two parts: maintaining syntactical or mechanical consistency between distributed and non-distributed access, and maintaining the same semantics. Because the semantics of remote access are more complex, especially its failure modes, local access should be a subset of remote access. Remote access will not always look like local access, because certain services may not make sense to support (for example, an exhaustive global search of a distributed system for a single object may not make sense in terms of network traffic).

5.2.2 Location Transparency

The details of the topology of the system should not concern the user. The location of an object in the system may not be visible to the user or programmer. This differs from access transparency in that both the naming and the access methods can be the same. Names may give no hint as to location.

Users and applications should be able to access shared data or objects concurrently without interfering with one another. This requires much more sophisticated mechanisms in a distributed system, because there is true concurrency rather than the simulated concurrency of a central system. For example, a distributed printing service should offer the same per-file access guarantees as a central system, avoiding unpredictable interleaving during printing.

The replication of the system for availability or performance reasons should not concern the user, including the application programmer.

During migration of processes or data for improved performance, reliability, or to conceal differences between hosts, the user should not be aware of the changes. This is known as migration transparency.

5.2.6 Performance Transparency

Performance transparency dictates that the system's configuration should not affect the user's perception of performance. This may require sophisticated resource management systems. In cases where resources are only accessible through low-performance networks, this transparency may not be achievable.

5.2.7 Scaling Transparency

Scaling transparency requires that the system be able to expand without impacting the application algorithms. The capacity to grow and evolve is crucial for many businesses. The system should also be able to scale down when necessary and to use space and/or time efficiently as required.

5.3 Openness Solution

To make a distributed system open, it is necessary to publish clear and well-defined interfaces between components. The interfaces must be standardized, and new components should be easily incorporated into the existing system.

5.4 Concurrency Solution

Take it case-by-case: - The simple solution here is to add a conditional statement for this event ordering. If the next
message is a link failure notification, store the notification in memory in case you become a master later.

Replicate the computation: - In a system comprising a single node, there exists a single, global sequence of events without any conflict.

Make your event handlers transactional: - Enhance your event handlers by making them transactional. Transactions make a group of operations appear to execute atomically, that is, either all of them or none at all, which is a robust and powerful capability.

Reorder events that no one will notice: - It turns out that we can achieve even better results if we use a replication model called virtual synchrony. In short, virtual synchrony provides a library with three functions: join() a process group, register() an event handler, and send() an atomic broadcast message to your entire process group.

Make yourself stateless: - In a database, the "ground truth" is stored in the database itself. In a network, the ground truth is stored in the routing tables of the switches themselves. This implies that the controllers' view of the network is just soft state, i.e., we can always recover it simply by querying the switches for their current configuration.

Guarantee self-stabilization: - The previous solutions were designed to always guarantee correct behavior despite failures of the other nodes. This final solution, my personal favorite, is much more optimistic.

1. Enforce isolation among transactions.
2. Ensure database consistency by executing transactions in a manner that maintains consistency.
3. Address read-write and write-read conflicts.

5.5 Security solution

Computer and network security address three requirements:

• Confidentiality: Requires data to be accessible only to authorized persons.
• Integrity: Requires that only authorized parties can modify data.
• Availability: Requires data to be available to authorized parties.

5.6 Scalability solution

recovery action. A common approach to detecting failures is end-to-end timeouts, but using timeouts brings problems.
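
The timeout-based failure detection mentioned above can be sketched as a heartbeat monitor. This is a minimal illustration under stated assumptions rather than a production detector: the class name TimeoutFailureDetector, its methods, and the injectable clock parameter are all hypothetical.

```python
import time


class TimeoutFailureDetector:
    """Suspect a node when no heartbeat has arrived within timeout_s seconds."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock      # injectable so the behaviour can be tested
        self.last_seen = {}     # node name -> time of its last heartbeat

    def heartbeat(self, node):
        # Record that a message from this node has just been received.
        self.last_seen[node] = self.clock()

    def suspected(self):
        # Nodes whose last heartbeat is older than the timeout.
        now = self.clock()
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )
```

The sketch also shows why timeouts bring problems: a node that is merely slow, or whose heartbeat was delayed in the network, is indistinguishable from one that has crashed, so a short timeout yields false suspicions while a long one delays recovery.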