Unit 1 Notes
separated but are connected by a centralized computer network that is equipped with
distributed system software. The autonomous computers communicate with each other,
sharing resources and files and performing the tasks assigned to them.
Example of Distributed System:
A social media service can have the computer network at its headquarters as its
centralized computer network, while the computer systems that any user can access to use
its services are the autonomous systems in the distributed system architecture.
While distributed systems offer many advantages, they also present some challenges that
must be addressed. These challenges include:
● Network latency: The communication network in a distributed system can
introduce latency, which can affect the performance of the system.
● Distributed coordination: Nodes must coordinate their actions, which is hard because
no single node has a complete, up-to-date view of the whole system.
● Security: Distributed systems are more vulnerable to security threats than
centralized systems because data crosses the network and there are more nodes to attack.
● Data consistency: Maintaining data consistency across multiple nodes in a
distributed system can be challenging.
1. Boosting Performance
A distributed system speeds up work by dividing a large task into small chunks and
processing them simultaneously on different computers, much like a group of people
working together on a project. For example, when we search for something on the internet,
the search engine distributes the work among several servers, then gathers the results
and displays the webpage within a few seconds.
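The divide-and-process-in-parallel idea can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine works: worker threads in one process stand in for separate computers, and the names split_into_chunks, process_chunk and distributed_sum_of_squares are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous slices of nearly equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    """Stand-in for the work one machine would do: a sum of squares."""
    return sum(x * x for x in chunk)


def distributed_sum_of_squares(items, n_workers=4):
    """Process the chunks concurrently, then combine the partial results."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)
```

The final sum is the "retrieve the result" step: each worker returns a partial answer and a coordinator combines them.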
2. Enhancing Reliability
Distributed systems are good at handling increased demand. They cope by adding more
computers to the system, so everything keeps running smoothly as the number of users
grows.
3. Resourceful Utilization
Resource utilization is one of the most prominent features of a distributed system.
Instead of putting the whole load on one computer, it distributes tasks among the other
available resources, so that every resource is put to work.
Further Subtopics and Different Approaches:
Distributed systems also provide a seamless experience to the user. When you interact
with the system, it feels like working with a single entity, even though many computers
are working behind the scenes.
Distributed systems come with backup plans. If any computer fails, its tasks are
redirected to another computer, keeping delays short and the experience smooth.
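Redirecting a task away from a failed computer can be sketched as follows. This is a simplified simulation under stated assumptions: the Node class, the NodeDown exception and run_with_failover are hypothetical names, and a real system would detect failures through timeouts or health checks rather than a flag.

```python
class NodeDown(Exception):
    """Raised by a simulated node that cannot serve requests."""


class Node:
    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive

    def run(self, task):
        if not self.alive:
            raise NodeDown(self.name)
        return f"{task} done on {self.name}"


def run_with_failover(task, nodes):
    """Try each node in order and return the first successful result."""
    for node in nodes:
        try:
            return node.run(task)
        except NodeDown:
            continue  # this node failed, so redirect the task to the next one
    raise RuntimeError("all nodes failed")
```

For example, with nodes [Node("a", alive=False), Node("b")], the task is attempted on a, fails, and completes on b without the caller noticing.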
Distributed systems use special codes and locks to protect data from others. They rely on
well-established encryption and authentication techniques to keep information safe from
unauthorized access. Distributed systems prioritize data security, much as you would keep
your own secrets safe.
4. Load Balancing
As noted above, distributed systems make good use of resources and can handle a high
volume of data without slowing down. This is achieved by load balancing, which
distributes the load evenly across all available computers, preventing any single
machine from being overloaded and avoiding bottlenecks.
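One simple load-balancing policy is "send each new task to the least-loaded computer". The sketch below is one possible policy among many (round-robin is another). The class LeastLoadedBalancer and the cost parameter are assumptions made for illustration.

```python
import heapq


class LeastLoadedBalancer:
    def __init__(self, node_names):
        # Min-heap of (current_load, node_name); ties break by name.
        self._heap = [(0, name) for name in node_names]
        heapq.heapify(self._heap)
        self.assignments = {name: [] for name in node_names}

    def assign(self, task, cost=1):
        """Give the task to the node with the smallest load so far."""
        load, name = heapq.heappop(self._heap)
        self.assignments[name].append(task)
        heapq.heappush(self._heap, (load + cost, name))
        return name

    def loads(self):
        """Return the current load of every node."""
        return {name: load for load, name in self._heap}
```

With equal-cost tasks this spreads work evenly. With unequal costs, a node that received one expensive task is skipped until the others catch up.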
Motivation
● Economics - 10,000 CPUs each executing 50 MIPS yield a system executing 500,000 MIPS,
which is not possible for a single CPU to achieve.
● Some problems need distributed solutions - e.g., CSCW (computer-supported cooperative
work)
● Reliability of Distributed Systems, Load Sharing
● Incremental Growth
● Software
● Networking - overloading, data loss
● Security
Hardware Concepts
● Although all Distributed Systems consist of multiple CPUs, there are different ways
of interconnecting them and of having them communicate
● Flynn (1972) identified two essential characteristics to classify multiple CPU
computer systems: the number of instruction streams and the number of data
streams
● Uniprocessors are SISD
● Array processors are SIMD - processors cooperate on a single problem
● MISD - No known computer fits this model
● Distributed Systems are MIMD - a group of independent computers, each with its
own program counter, program and data
● MIMD can be split into two classifications
● Multiprocessors - CPUs share a common memory
● Multicomputers - CPUs have separate memories
● Can be further subclassified as
● Bus - All machines connected by a single medium (e.g., LAN, bus, backplane, cable)
● Switched - Single wire from machine to machine, with possibly different wiring
patterns (e.g., Internet)
● Further classification is
● Tightly-coupled - Short delay in communication between computers, high data rate
(e.g., parallel computers working on related computations)
● Loosely-coupled - Large delay in communication, low data rate (e.g., Distributed
Systems working on unrelated computations)
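The classification above can be written as a small lookup. This is only an illustrative sketch: the function names flynn_class and mimd_kind are hypothetical, and the stream counts are treated simply as "one" versus "more than one".

```python
def flynn_class(instruction_streams, data_streams):
    """Return Flynn's category for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    i = "S" if instruction_streams == 1 else "M"  # single or multiple instruction
    d = "S" if data_streams == 1 else "M"         # single or multiple data
    return f"{i}I{d}D"


def mimd_kind(shared_memory):
    """Split MIMD machines by whether the CPUs share a common memory."""
    return "multiprocessor" if shared_memory else "multicomputer"
```

A uniprocessor is flynn_class(1, 1), an array processor is flynn_class(1, n) for n > 1, and a distributed system is flynn_class(n, n) with mimd_kind(False).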
Bus-Based Multiprocessors
Switched Multiprocessors
● Omega network
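These notes only name the omega network, so the following is a sketch based on the standard textbook description rather than on anything stated here. It assumes n = 2^k lines, k stages of 2x2 switches, and a perfect shuffle before each stage. A message is routed by reading the destination address one bit per stage, most significant bit first (0 selects the upper switch output, 1 the lower). The function name omega_route is hypothetical.

```python
def omega_route(source, dest, n_bits):
    """Return the line number the message occupies after each stage."""
    size = 1 << n_bits
    if not (0 <= source < size and 0 <= dest < size):
        raise ValueError("source and dest must fit in n_bits bits")
    mask = size - 1
    pos = source
    path = []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the line number left by one bit.
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & mask
        # The 2x2 switch sets the lowest bit to the next destination bit.
        bit = (dest >> (n_bits - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        path.append(pos)
    return path
```

After k stages every source bit has been shifted out and replaced by a destination bit, so the message always arrives at dest, whatever the source.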
Bus-Based Multicomputers
● Each CPU has its own memory, and CPUs are connected over a network
But how do the CPUs communicate?
● Software problem
Switched Multicomputers
● Each CPU has direct and exclusive access to its own private memory
Software
Design Issues
Transparency
● Giving the impression that the processor pool is acting as a uniprocessor (WWW is
a good-ish example)
● Location Transparency - don't need to know where resources are
● Migration Transparency - Resources can be moved without their names being
changed
● Replication Transparency - System free to make multiple copies of files without
users being affected (e.g., caching)
● Concurrency Transparency - Multiple users can share resources automatically
● Parallelism Transparency - Activities can happen in parallel without users knowing
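Location and migration transparency both come down to clients using names rather than addresses. A minimal sketch, assuming a hypothetical NameService that maps resource names to hosts (the host names below are made up):

```python
class NameService:
    """Maps a stable resource name to the host that currently holds it."""

    def __init__(self):
        self._locations = {}

    def register(self, name, host):
        self._locations[name] = host

    def migrate(self, name, new_host):
        """Move a resource; its name stays the same."""
        if name not in self._locations:
            raise KeyError(name)
        self._locations[name] = new_host

    def resolve(self, name):
        return self._locations[name]


def read_resource(names, name):
    """Client code refers to the resource only by name, never by host."""
    return f"read {name} from {names.resolve(name)}"
```

The client call read_resource(ns, "/docs/report") is identical before and after ns.migrate("/docs/report", "host-b"); only the name service knows the resource moved.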
Flexibility
● Kernel should do as little as possible. User-level servers provide bulk of operating
systems services
● In this case, the kernel is called the microkernel
● Microkernel is responsible for:
- Interprocess communication
- Some memory management
- Some low-level process management and scheduling
- Low-level input and output
● All other services are obtained by sending messages to the appropriate server
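The message-passing structure can be sketched as a tiny dispatcher. This is a toy model, not a real kernel: the Microkernel and FileServer classes and the message format (a dict with an "op" key) are assumptions made for illustration.

```python
class Microkernel:
    """Only routes messages; it knows nothing about files or other services."""

    def __init__(self):
        self._servers = {}

    def register_server(self, service, handler):
        self._servers[service] = handler

    def send(self, service, message):
        """Deliver a message to the named user-level server, return its reply."""
        if service not in self._servers:
            raise LookupError(f"no server for {service!r}")
        return self._servers[service](message)


class FileServer:
    """A user-level server that provides a simple file service."""

    def __init__(self):
        self._files = {}

    def handle(self, message):
        op = message["op"]
        if op == "write":
            self._files[message["path"]] = message["data"]
            return {"ok": True}
        if op == "read":
            return {"ok": True, "data": self._files[message["path"]]}
        return {"ok": False, "error": f"unknown op {op}"}
```

A new service is added by registering another handler, for example kernel.register_server("file", FileServer().handle), without changing the kernel itself.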
Advantages
● Modular
● Every client has an equal opportunity to access a server, regardless of location
Reliability
● Availability
● Protection against unauthorised access and data loss
● Data consistency if maintaining multiple copies of data
● Fault tolerance
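A rough way to see why multiple copies help availability: if each copy is up with probability a, and copies fail independently, the service is down only when every copy is down. Both the independence assumption and the "any one copy is enough" assumption are simplifications, and the function name replicated_availability is hypothetical.

```python
def replicated_availability(single, replicas):
    """Probability that at least one of `replicas` independent copies is up."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a probability between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # All copies are down with probability (1 - single) ** replicas.
    return 1.0 - (1.0 - single) ** replicas
```

The gain is not free: every extra copy is another copy that must be kept consistent, which is the data consistency concern listed above.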
Performance
● A Distributed system should not run an application slower than if it were run on an
independent computer
● How is performance measured? e.g., response time, throughput, system utilization,
network capacity consumed, etc.
● But performance is affected by communications...
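Two of the metrics above, response time and throughput, can be measured with a simple timing harness. This is a sketch only: measure is a hypothetical helper, handler stands for whatever function serves a request, and requests are processed one at a time, which a real benchmark would not assume.

```python
import time


def measure(handler, requests):
    """Return (mean response time in seconds, throughput in requests/second)."""
    if not requests:
        raise ValueError("need at least one request")
    latencies = []
    start = time.perf_counter()
    for request in requests:
        t0 = time.perf_counter()
        handler(request)
        latencies.append(time.perf_counter() - t0)  # time for this one request
    elapsed = time.perf_counter() - start           # time for the whole batch
    return sum(latencies) / len(latencies), len(requests) / elapsed
```

If handler makes a network call, the communication delay shows up directly in both numbers, which is the point of the last bullet.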
Scalability
● Current distributed systems are designed to work with a few hundred CPUs. Will
they still work with thousands or hundreds of thousands?
- Don't centralize components
- Don't centralize tables
- Don't centralize algorithms
- Don't assume all machines are synchronised
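As an illustration of "don't centralize tables", a table can be split across nodes by hashing each key, so no single machine holds or serves the whole table. This is a minimal sketch: owner_of and PartitionedTable are hypothetical names, and the shards are plain dictionaries in one process standing in for separate machines.

```python
import hashlib


def owner_of(key, nodes):
    """Pick the node responsible for a key using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


class PartitionedTable:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.shards = {node: {} for node in self.nodes}  # one shard per node

    def put(self, key, value):
        self.shards[owner_of(key, self.nodes)][key] = value

    def get(self, key):
        return self.shards[owner_of(key, self.nodes)][key]
```

Any node can compute owner_of locally, so lookups need no central directory. One known weakness of this simple modulo scheme is that changing the number of nodes reassigns most keys.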