A Case Study On Different Applications and Security Issues in Distributed Systems
Abhinav Agarwal
19ucs254
Abstract
This case study will look into the background of the different applications of
distributed systems and the security around them.
In the later part of the case study, we will look into web caching and the various
methods used to achieve it: the pull-, push-, and lease-based approaches, along
with their respective merits and demerits.
Then, we will look into the solution proposed by Haobo Yu and team in [7] to
achieve scalability in web caching while ensuring consistency among proxies, which
is a major drawback of the other approaches.
Background
A standard web application consists of a web browser client and a web server. It
can also contain a proxy server, which acts as an intermediary between the client
and the server. This proxy server can have multiple use cases, ranging from load
balancing to caching. There is also a database server, along with a file server, with
which the web server communicates to fetch the data requested by the user.
Web Caching
Web caching is used when we want performance while scaling the system. In this
approach, we replicate some functionality or data on multiple nodes, and requests
for that functionality or data are served from these nodes instead of the main
server. This helps in reducing the load on the main server. Web caching works best
when employed closest to the clients; this is where technologies like edge
computing and CDNs (Content Delivery Networks) come into play, computing
results at the “edge” of the network.
The problem with this method is that we observe consistency issues: although the
data has been updated at the main server, it has not yet been updated at these
nodes, so stale data gets served to the users. To solve this problem, we can use
either a pull-based or a push-based approach. In the first, these nodes pull data
from the central server at regular intervals; in the second, the main server pushes
the new data onto these nodes whenever it receives it.
Distributed File Systems
A DFS is yet another application of distributed systems, where your files reside on
some remote servers and can be accessed using Remote Procedure Calls (RPCs).
Sun’s Network File System (NFS) is a widely used DFS. It uses a virtual file system
layer to handle local and remote files uniformly. NFS uses the mount protocol to
access remote files: the mount protocol establishes a local name for the remote
files, users access remote files using these local names, and the OS takes care of
the mapping. NFS also allows client-side caching, where cached data can be stale
for up to 30 seconds. NFS implements security using user ID and group ID
authentication only [2].
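To make the RPC-based access concrete, here is a minimal sketch in Python using
the standard library’s XML-RPC modules. This is not NFS’s actual protocol; the
server, path, and file contents are hypothetical, chosen only to show how a client
reads a remote file through a local proxy object.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Toy "file server" state; the path and contents are hypothetical.
FILES = {"/export/readme.txt": "hello from the remote file server"}

def read_file(path):
    # Server-side handler invoked remotely by the client.
    return FILES[path]

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(read_file)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client accesses the remote file through a local proxy object;
# the RPC layer hides the network communication, much as a DFS does.
client = ServerProxy("http://localhost:8000/")
print(client.read_file("/export/readme.txt"))
```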
CODA was developed to make file system disconnection transparent, which is
especially needed for mobile clients. In CODA, each file belongs to exactly one
volume, and each volume may be replicated across several servers. CODA works on
the principle of read-one, write-all, where write conflicts are resolved manually by
the user, much like in Git [3].
Let’s talk about xFS a little. It is basically a serverless file system designed for
high-speed LAN environments. It distributes data storage across disks using
software RAID and log-based network striping. It also eliminates central server
caching using cooperative caching. Since xFS uses RAID, the overhead of parity
management hurts performance for small writes; RAID hardware is also very
expensive.
Some other file systems include LFS, a log-structured file system, which provides
fast writes, simple recovery, and flexible file locations; another is the Hadoop
Distributed File System (HDFS), which is optimized for large data sets accessed
using Hadoop [6].
Security in Distributed Systems
There have been a lot of developments in answering the question of how to
authenticate users. A lot of the answers are in the direction of encryption, but even
if we make this possible using public-private key cryptography, it is only as “secure”
as the public key distribution, and for that, algorithms like Diffie-Hellman key
exchange have been introduced.
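As an illustration, here is a textbook-sized Diffie-Hellman exchange in Python. The
parameters are toy values (real deployments use standardized groups of 2048 bits
or more), but the sketch shows how two parties derive a shared secret without
ever transmitting it.

```python
import secrets

# Toy public parameters (textbook-sized); real deployments use
# standardized groups of 2048 bits or more.
p, g = 23, 5

# Each party picks a private exponent and publishes g^x mod p.
a = secrets.randbelow(p - 2) + 1   # Alice's private key
b = secrets.randbelow(p - 2) + 1   # Bob's private key
A = pow(g, a, p)                   # Alice's public value, sent to Bob
B = pow(g, b, p)                   # Bob's public value, sent to Alice

# Both sides derive the same shared secret without ever sending it.
shared_alice = pow(B, a, p)
shared_bob = pow(A, b, p)
assert shared_alice == shared_bob
print("shared secret:", shared_alice)
```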
To protect against intruders, one can use a firewall, which is a network component
sitting between the inside and the outside of a network; it drops packets on the
basis of source and destination addresses.
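A firewall’s filtering logic can be sketched as a first-match rule table; the addresses
and rules below are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical rule table: (source prefix, destination prefix, action).
# The final catch-all rule drops anything not explicitly allowed.
RULES = [
    ("10.0.0.", "192.168.1.", "ALLOW"),
    ("", "", "DROP"),
]

def filter_packet(src, dst):
    # First matching rule wins, as in typical packet filters.
    for src_prefix, dst_prefix, action in RULES:
        if src.startswith(src_prefix) and dst.startswith(dst_prefix):
            return action

print(filter_packet("10.0.0.7", "192.168.1.5"))     # ALLOW (inside traffic)
print(filter_packet("203.0.113.9", "192.168.1.5"))  # DROP (outside source)
```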
To provide encryption and authentication between a web server and client, SSL
(Secure Sockets Layer) was developed by Netscape. To begin an SSL session, the
server’s public key is needed: the server presents a certificate signed with the CA’s
private key, and the browser verifies it using the CA’s public keys, which are stored
in the browser [4].
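Below is a small sketch using Python’s ssl module (which speaks TLS, SSL’s modern
successor); the default context verifies the server’s certificate chain against the CA
certificates trusted by the local system, which is exactly the CA-based check
described above. The hostname is just an example.

```python
import socket
import ssl

# The default context loads the system's trusted CA certificates and
# verifies the server's certificate chain and hostname.
context = ssl.create_default_context()

with socket.create_connection(("www.cloudflare.com", 443)) as sock:
    with context.wrap_socket(sock, server_hostname="www.cloudflare.com") as tls:
        print("negotiated:", tls.version())
        print("server subject:", tls.getpeercert()["subject"])
```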
Case Evaluation - Web Caching
Web caching is traditionally done using three methods: pull-based caching,
push-based caching, and a hybrid approach.
Pull-Based Caching
This approach is based on the concept of time-to-live (TTL). When a request
arrives at the cache after the TTL has expired, the cache pulls the latest data from
the server. If the TTL is fixed, then cache staleness is bounded by this TTL.
However, setting a very small value for the TTL negates the benefits of web
caching.
The proxy can also dynamically determine the refresh interval (TTL) based on past
observations; this is known as intelligent polling. For example: increase the interval
if the object has not changed in the two previous polls, and decrease it if it has.
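A minimal sketch of this heuristic follows; the doubling/halving factors and bounds
are assumptions made for illustration, not values from any particular system.

```python
import time

class PullCache:
    """Toy pull-based cache with 'intelligent polling' of the TTL."""

    def __init__(self, fetch, ttl=10.0):
        self.fetch = fetch        # callable that pulls the object from the origin
        self.ttl = ttl            # current refresh interval; adapted over time
        self.value = None
        self.expires = 0.0
        self.unchanged_polls = 0

    def get(self):
        if time.time() >= self.expires:          # TTL expired: pull fresh data
            new_value = self.fetch()
            if new_value == self.value:
                self.unchanged_polls += 1
                if self.unchanged_polls >= 2:    # unchanged twice: poll less often
                    self.ttl *= 2
            else:
                self.unchanged_polls = 0
                self.ttl = max(1.0, self.ttl / 2)  # changed: poll more often
            self.value = new_value
            self.expires = time.time() + self.ttl
        return self.value                        # within TTL: serve cached copy

cache = PullCache(fetch=lambda: "page-v1")
print(cache.get())  # first call misses and pulls from the origin
print(cache.get())  # served from the cache until the TTL expires
```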
Generally, the pull-based approach is not preferred for dynamic content due to the
high overhead of pulling unchanged data. There can also be consistency issues: if
the data changes very frequently, the user can see stale data because the new data
has not yet been pulled to the node closest to them. For static content, on the
other hand, it is the best approach.
Push-Based Caching
In this type of caching, the server keeps track of which proxies cache a particular
page, and then, whenever that page changes, it notifies the proxies and floods the
network with the updated data. While this approach eliminates staleness, it incurs
the cost of requiring the server to keep track of all proxies. Also, flooding the
entire network has its own overhead; thus, this approach does not scale.
Ensuring consistency is a very big issue when working with dynamic content. If we
make our dynamic content static and store it in the cache to be served to the user,
then under the push-based approach, even for a very small change we have to
flush the cached copy, regenerate the content, and store it in the cache again. Also,
this approach is not resilient to server crashes.
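The bookkeeping burden described above can be seen in a small sketch: the origin
must hold a reference to every proxy and push each update to all of them. The
class and page names below are hypothetical.

```python
class Proxy:
    """Toy proxy cache that passively receives pushed updates."""

    def __init__(self):
        self.cache = {}

    def store(self, url, content):
        self.cache[url] = content

class OriginServer:
    """Toy push-based origin: tracks every proxy and pushes every update."""

    def __init__(self):
        self.proxies = []   # per-proxy state the server must maintain
        self.pages = {}

    def register(self, proxy):
        self.proxies.append(proxy)

    def update(self, url, content):
        self.pages[url] = content
        for proxy in self.proxies:     # push the new content to all proxies;
            proxy.store(url, content)  # this is the flooding overhead

server = OriginServer()
p1, p2 = Proxy(), Proxy()
server.register(p1)
server.register(p2)
server.update("/index.html", "v2")
print(p1.cache["/index.html"], p2.cache["/index.html"])  # both fresh at once
```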
Hybrid Approach - Leases
A lease is a duration for which the server agrees to notify the proxy of
modifications. A lease is issued to a proxy on its first request, and the server sends
notifications until the lease expires; once it expires, the proxy has to renew it. In
the limit, a lease of zero duration degenerates to pure pull-based polling, while an
infinite lease degenerates to the pure push-based approach.
There are different policies for choosing the lease duration. One is an age-based
lease, where the larger the expected lifetime of the object, the longer the lease.
Another is renewal-frequency based, where proxies at which an object is popular
get longer leases. A third is server-load based, where shorter leases are issued
during heavy load.
The efficiency of the whole system depends upon the lease duration; short leases,
in particular, incur the overhead of frequent renewals.
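The three policies can be combined into a single duration function, sketched
below; the base duration, weights, and cap are made-up values purely for
illustration, not taken from any of the cited work.

```python
def lease_duration(expected_lifetime, renewals, server_load,
                   base=60.0, cap=600.0):
    """Toy combination of the three lease-duration policies (seconds)."""
    d = base
    d += 0.1 * expected_lifetime   # age-based: long-lived object -> longer lease
    d += 5.0 * renewals            # renewal-frequency: popular at proxy -> longer
    d /= 1.0 + server_load         # server-load: heavy load -> shorter lease
    return min(d, cap)

# A long-lived, popular object on a lightly loaded server gets a long lease;
# the same object under heavy load gets a much shorter one.
print(lease_duration(expected_lifetime=3600, renewals=4, server_load=0.0))
print(lease_duration(expected_lifetime=3600, renewals=4, server_load=4.0))
```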
Proposed Solution & Implementation
A scalable web cache consistency architecture has been proposed in [7], which
utilizes invalidations, hierarchy, and leases.
Each group in the hierarchy is associated with a set of caches, and caches send
heartbeats to each other that are equivalent to cache-to-cache leases. Each cache
maintains a server table in order to locate where a web server sits in the
hierarchy. A client request is forwarded to the first cache in the hierarchy that
holds a valid copy of the requested page.
The caching hierarchy is maintained in the form of groups, where each cache joins
the group owned by its parent. Thus, a parent need not know who its children are,
and children can choose their parent freely as long as cycles are prevented.
Hierarchy establishment and maintenance are discussed further in [8].
The hierarchy is kept alive with the help of heartbeats: each group owner sends
periodic heartbeats to its associated group. If each lease has length T and t is the
time between subsequent heartbeats, then T/t = 5 in their case. This ensures that
if some heartbeats are lost, they will not cause many problems, since several more
arrive before the lease expires.
With each heartbeat, we piggyback knowledge of invalidated pages. We only need
to include invalid pages that have been requested after they were last rendered
invalid. Each heartbeat thus carries the set of pages that have been rendered
invalid at the parent cache, and this knowledge is propagated to its child caches.
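A rough sketch of this mechanism follows; the class layout and field names are my
own, not from [7], but the idea is the same: each heartbeat carries the set of
invalidated pages and is re-propagated to the child caches.

```python
class Cache:
    """Toy hierarchical cache node; pages maps url -> (content, is_valid)."""

    def __init__(self, name):
        self.name = name
        self.pages = {}
        self.children = []

    def on_heartbeat(self, invalid_pages):
        # Mark the piggybacked pages invalid locally ...
        for url in invalid_pages:
            if url in self.pages:
                content, _ = self.pages[url]
                self.pages[url] = (content, False)
        # ... then propagate the same knowledge down to the child caches.
        for child in self.children:
            child.on_heartbeat(invalid_pages)

parent = Cache("parent")
child = Cache("child")
parent.children.append(child)
child.pages["/news.html"] = ("old copy", True)

parent.on_heartbeat(["/news.html"])  # heartbeat piggybacks the invalidation
print(child.pages["/news.html"])     # ('old copy', False): marked stale
```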
Along with traveling down, heartbeats also travel up, from the server to the
top-level cache. The cache to which a web server is attached is called its primary
cache. Each server sends a JOIN request up the hierarchy, and every cache, on
receiving this request, makes an entry in its server routing table. The thing to note
is that the top-level cache knows all the servers attached to the hierarchy. Servers
communicate with their primary cache with the help of heartbeats. A cache can
also send a LEAVE signal to its parent and children if it does not receive a
heartbeat within T seconds.
A client can attach to any cache in the hierarchy; let’s call it the client’s primary
cache. When a client requests a page, it sends the request to its primary cache.
This cache checks whether it contains the requested page; if not, the request is
forwarded to the next cache. When the request is fulfilled, either by the
originating server or some intermediate cache, the response takes the reverse
path, updating all the caches on the way and serving the user at the end.
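Here is a minimal sketch of that request path under the same kind of toy model
(the names are again my own, not from [7]): the request walks up the hierarchy
until a valid copy is found, falling through to the originating server, and the
response fills every cache on the way back.

```python
class Cache:
    """Minimal cache node; pages maps url -> (content, is_valid)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.pages = {}

def lookup(cache, url, origin):
    path, node, content = [], cache, None
    while node is not None:
        page = node.pages.get(url)
        if page is not None and page[1]:
            content = page[0]          # first cache holding a valid copy
            break
        path.append(node)
        node = node.parent
    if content is None:
        content = origin[url]          # fulfilled by the originating server
    for visited in path:               # response takes the reverse path,
        visited.pages[url] = (content, True)  # updating caches on the way
    return content

root = Cache()
primary = Cache(parent=root)           # the client's primary cache
print(lookup(primary, "/index.html", {"/index.html": "v1"}))
print(primary.pages, root.pages)       # both caches now hold a valid copy
```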
Conclusion
In this term paper, we first discussed the various applications of distributed
systems and the role of security in them. We saw that a lot of advancements have
been made in the fields of web applications, caching, and distributed file systems.
Then we saw the various approaches used to perform web caching for static and
dynamic content, and the merits and demerits of each approach.
Finally, we saw a new kind of approach to web caching that is both scalable and
keeps caches consistent. The approach combines the lessons of the pull-, push-,
and lease-based approaches. The authors’ performance evaluation suggests that
when the heartbeat rate is higher than the write rate, this approach is very
effective in keeping pages fresh. When pages are write-dominated, the approach
still ensures freshness, because if a page is invalid, the request is served from the
cache in the hierarchy that contains the correct copy. And when pages are
read-dominated, the invalidation approach offers significant reductions in server
hit counts and client response time.
References
1. Course on Distributed Systems, 2022-23
2. https://www.ibm.com/docs/en/aix/7.1?topic=management-network-file-system
3. http://www.coda.cs.cmu.edu/
4. https://www.cloudflare.com/learning/ssl/how-does-ssl-work/
5. Course on Blockchain Foundation and Smart Contract, 2021-22
6. Distributed Systems: Principles and Paradigms, Tanenbaum and van Steen
7. Haobo Yu, Lee Breslau, and Scott Shenker. A Scalable Web Cache Consistency
Architecture. In Proceedings of ACM SIGCOMM, 1999.
8. Rosenstein, A., Li, J., and Tong, S. Y. MASH: The multicasting archive server
hierarchy. SIGCOMM Computer Communication Review 27, 3 (July 1997).