
Unit 2

Distributed Computing

External Data Representation

A distributed system consists of numerous components located on different machines that
communicate and coordinate operations so as to appear as a single system to the end user.

External Data Representation:

Running applications hold information in data structures, whereas messages moving between the
components of a distributed system carry information as a sequence of bytes. A data structure
must therefore be converted to a sequence of bytes before transmission, and on arrival the bytes
must be converted back into the original data structure.

Computers handle many types of data, and these types are not represented in the same way on every
machine to which data may be sent. Not all computers store primitive values such as integers in
the same byte order, and different architectures may also represent floating-point numbers
differently. There are two common byte orderings for integers: big-endian order, in which the most
significant byte is placed first, and little-endian order, in which the least significant byte is
placed first (so the most significant byte comes last). A further issue is the set of codes used
to represent characters. Many applications on UNIX systems have traditionally used ASCII character
coding, which uses one byte per character, whereas the Unicode standard allows for the
representation of text in many different languages and needs more than one byte for most
characters (for example, two or four bytes per character in UTF-16).
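The difference between the two byte orderings can be made concrete with a short Java sketch. It uses the standard `java.nio.ByteBuffer` class; the class name `EndianDemo`, the helper names, and the sample value are assumptions chosen only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianDemo {
    // Encode a 32-bit int into 4 bytes using the given byte order.
    static byte[] encode(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Render bytes as two-digit hex values separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int value = 0x0A0B0C0D; // sample value, most significant byte is 0x0A
        System.out.println("big-endian:    " + hex(encode(value, ByteOrder.BIG_ENDIAN)));
        System.out.println("little-endian: " + hex(encode(value, ByteOrder.LITTLE_ENDIAN)));
    }
}
```

The same integer yields the bytes `0A 0B 0C 0D` in big-endian order and `0D 0C 0B 0A` in little-endian order, which is why the two ends of a connection must agree on (or communicate) the ordering.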

There must be a way to convert all of this data so that it can be exchanged successfully between
computers. Two methods are used. In the first, values are converted to an agreed-upon external
format before transmission and converted to the local format on receipt; if the two computers are
known to be of the same type, this conversion can be skipped. In the second, values are sent in
the sender’s format, along with a description of that format, and the recipient converts them if
necessary. Note that the bytes themselves are never altered during transmission. To support
Remote Procedure Call (RPC) or Remote Method Invocation (RMI) mechanisms, any data type that can
be passed as a parameter or returned as a result must be convertible in this way, with its
individual primitive data values expressed in an agreed format. An external data representation
is thus an agreed standard for representing data structures and primitive values.
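The approach in which values travel in the sender's own format together with a description of that format can be sketched as follows. This is a minimal illustration, not a real protocol: the one-byte tag values and the class name `TaggedMessage` are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class TaggedMessage {
    // Hypothetical one-byte tags describing the sender's byte order.
    static final byte BIG = 0, LITTLE = 1;

    // Sender: write the value in the sender's own order, prefixed by a tag.
    static byte[] send(int value, ByteOrder senderOrder) {
        ByteBuffer buf = ByteBuffer.allocate(5);
        buf.put(senderOrder == ByteOrder.BIG_ENDIAN ? BIG : LITTLE);
        buf.order(senderOrder).putInt(value);
        return buf.array();
    }

    // Receiver: read the tag, then interpret the payload in that order.
    static int receive(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message);
        ByteOrder order = buf.get() == BIG ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
        return buf.order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] fromLittle = send(1984, ByteOrder.LITTLE_ENDIAN);
        byte[] fromBig = send(1984, ByteOrder.BIG_ENDIAN);
        // Both messages decode to the same value although their bytes differ.
        System.out.println(receive(fromLittle) + " " + receive(fromBig));
    }
}
```

The sender never converts anything; only a receiver whose native order differs from the tag has work to do, which is the main attraction of this method when most communicating machines are of the same type.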

 Marshalling: Marshalling is the process of taking a collection of data structures and
assembling them into an external data representation suitable for transmission in a
message.

 Unmarshalling: The converse of this process is unmarshalling, which involves disassembling
the transmitted data upon arrival to recreate the original data structures at the destination.
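As a minimal sketch of the two operations, the following Java code marshals a small record into a byte array and unmarshals it again using the standard `DataOutputStream` and `DataInputStream` classes. The `Reading` type and its fields are hypothetical and exist only for this illustration.

```java
import java.io.*;

class MarshalDemo {
    // Hypothetical record type used only for this illustration.
    static final class Reading {
        final String sensor;
        final int value;
        final double scale;
        Reading(String sensor, int value, double scale) {
            this.sensor = sensor; this.value = value; this.scale = scale;
        }
    }

    // Marshalling: flatten the fields, in a fixed order, into bytes.
    static byte[] marshal(Reading r) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(r.sensor);   // 2-byte length, then encoded characters
            out.writeInt(r.value);    // 4 bytes, big-endian
            out.writeDouble(r.scale); // 8 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Unmarshalling: read the fields back in exactly the same order.
    static Reading unmarshal(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return new Reading(in.readUTF(), in.readInt(), in.readDouble());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Reading back = unmarshal(marshal(new Reading("t1", 42, 0.5)));
        System.out.println(back.sensor + " " + back.value + " " + back.scale);
    }
}
```

Both sides must agree on the order and encoding of the fields; that agreement is exactly what an external data representation standardizes.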

Approaches:

Three approaches to external data representation and marshalling are described here.

1. Common Object Request Broker Architecture (CORBA):


CORBA is a specification defined by the Object Management Group (OMG) and a widely used
middleware standard for creating and using distributed objects. It allows software applications
and their objects to communicate with one another across diverse architectures, operating
systems, programming languages, and computer hardware. It is made up of five major components,
whose functions are given below:

 Object Request Broker (ORB): It provides a communication infrastructure for the objects to
communicate across a network.

 Interface Definition Language (IDL): It is a specification language used to describe the
interface of a software component independently of its implementation language. For example,
it allows communication between software components written in C++ and Java.

 Dynamic Invocation Interface (DII): Using DII, client applications can use server objects
without knowing their types at compile time. The client obtains a reference to a CORBA object
and can then make invocation requests on that object dynamically.

 Interface Repository (IR): As the name implies, interface definitions can be added to the
interface repository. Its purpose is to let a client find an object that was not known at
compile time, obtain information about its interface, and then build a request to be sent
through the ORB.

 Object Adapter (OA): It is used to access ORB services like object reference generation.
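The idea behind dynamic invocation, calling an operation whose name and type are only known at run time, can be illustrated by analogy with plain Java reflection. This sketch does not use the CORBA DII API; the `Greeter` class and the `invoke` helper are hypothetical stand-ins, and a real DII client would build a request through the ORB instead.

```java
import java.lang.reflect.Method;

class DynamicInvocationAnalogy {
    // Hypothetical "server object" whose operations are discovered at run time.
    public static class Greeter {
        public String greet(String who) { return "Hello, " + who; }
    }

    // Look up an operation by name at run time and invoke it with the given arguments.
    static Object invoke(Object target, String methodName, Object... args) {
        try {
            Class<?>[] types = new Class<?>[args.length];
            for (int i = 0; i < args.length; i++) types[i] = args[i].getClass();
            Method m = target.getClass().getMethod(methodName, types);
            return m.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("dynamic invocation failed", e);
        }
    }

    public static void main(String[] args) {
        Object target = new Greeter();      // static type is just Object
        System.out.println(invoke(target, "greet", "world"));
    }
}
```

Here the run-time class metadata plays a role loosely comparable to the Interface Repository: it tells the caller which operations exist and what argument types they take.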

Data Representation in CORBA:

Common Data Representation (CDR) is the external data representation for the structured and
primitive data types that are passed as arguments or results in remote invocations on CORBA
distributed objects. It allows clients and servers written in different programming languages to
communicate with one another. For example, it accommodates both little-endian and big-endian
byte orderings.
CDR defines 15 primitive types, including short (16-bit), long (32-bit), unsigned short, unsigned
long, float (32-bit), double (64-bit), char, boolean (TRUE, FALSE), octet (8-bit), and any (which
can represent any basic or constructed type), as well as a variety of composite types.

CORBA CDR Constructed Types:

The constructed types and their representations are:

 sequence: length (unsigned long) followed by the elements in order

 string: length (unsigned long) followed by the characters in order (wide characters are also
possible)

 array: the elements in order; the length is fixed, so it is not specified

 struct: the components in the order of declaration

 enumerated: unsigned long; the values are specified by the order declared

 union: type tag followed by the selected member

Example:

struct Person {
    string name;
    string place;
    long year;
};

Marshalling CORBA:

Marshalling operations can be generated automatically from the specification of the types of the
data items to be transmitted in a message. CORBA IDL provides a notation for describing the types
of data structures and primitive data items, and for specifying the types of the arguments and
results of remote methods.
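To show what such generated marshalling code might do for the Person struct, here is a hand-written, simplified CDR-style sketch in Java. It follows the rules listed earlier (a string is an unsigned long length followed by its characters, a struct is its components in declaration order) and pads each string to a 4-byte boundary. Real CDR handles more detail than this, such as string termination and alignment rules for every primitive type; the class and method names are assumptions for illustration, and big-endian order and ASCII text are assumed throughout.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class CdrSketch {
    // Hypothetical holder mirroring the IDL struct Person.
    static final class Person {
        final String name; final String place; final int year;
        Person(String name, String place, int year) {
            this.name = name; this.place = place; this.year = year;
        }
    }

    // Round n up to the next multiple of 4.
    static int pad4(int n) { return (n + 3) & ~3; }

    // string: 4-byte length, the characters, then zero padding to a 4-byte boundary.
    static void putString(ByteBuffer buf, byte[] chars) {
        buf.putInt(chars.length);
        buf.put(chars);
        for (int i = chars.length; i < pad4(chars.length); i++) buf.put((byte) 0);
    }

    static String getString(ByteBuffer buf) {
        int length = buf.getInt();
        byte[] chars = new byte[length];
        buf.get(chars);
        buf.position(buf.position() + pad4(length) - length); // skip the padding
        return new String(chars, StandardCharsets.US_ASCII);
    }

    // struct: the components in the order of declaration.
    static byte[] marshal(Person p) {
        byte[] name = p.name.getBytes(StandardCharsets.US_ASCII);
        byte[] place = p.place.getBytes(StandardCharsets.US_ASCII);
        ByteBuffer buf = ByteBuffer.allocate(4 + pad4(name.length) + 4 + pad4(place.length) + 4);
        putString(buf, name);
        putString(buf, place);
        buf.putInt(p.year); // IDL long is a 32-bit value
        return buf.array();
    }

    static Person unmarshal(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        String name = getString(buf);
        String place = getString(buf);
        return new Person(name, place, buf.getInt());
    }

    public static void main(String[] args) {
        byte[] wire = marshal(new Person("Smith", "London", 1984));
        Person back = unmarshal(wire);
        System.out.println(wire.length + " bytes: " + back.name + ", " + back.place + ", " + back.year);
    }
}
```

With the sample values, the message occupies seven 4-byte words: a length and two words of characters for each string, followed by the year. Note that the message carries no type information; both ends are assumed to know the struct's layout from the shared IDL definition.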

2. Java’s Object Serialization:

Java Remote Method Invocation (RMI) allows both objects and primitive data values to be passed as
arguments and results of method calls. In Java, the term serialization refers to the activity of
flattening an object (an instance of a class) or a set of related objects into a serial form
suitable for saving to disk or sending in a message.

Java provides a mechanism called object serialization. It represents an object as a sequence of
bytes that contains the object’s data together with information about the type of the object and
the types of the data stored in it. After a serialized object has been written to a file, it can
be read back from the file and deserialized; that is, the type information and the bytes that
represent the object and its data are used to recreate the object in memory.
Moreover, an object can be serialized on one platform and deserialized on a completely different
platform, because the serialized form is independent of the underlying machine.

For example, the Java class equivalent to the Person struct defined in CORBA IDL might be:

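import java.io.Serializable;

public class Person implements Serializable {
    public String name;
    public String place;
    public int year;   // corresponds to "long year" (32-bit) in the IDL struct

    // Prints a one-line note addressed to this person.
    public void letter() {
        System.out.println("Issue a letter to " + name + " " + place);
    }
}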
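A round trip through serialization and deserialization can be sketched with the standard `ObjectOutputStream` and `ObjectInputStream` classes. The sketch below is self-contained, so it declares its own small stand-in `Person` class; the wrapper class name and the helper methods are assumptions for illustration, and an in-memory byte array stands in for a file or a message.

```java
import java.io.*;

class SerializationDemo {
    // Small stand-in Person class, nested so the sketch is self-contained.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        String place;
        int year;
        Person(String name, String place, int year) {
            this.name = name; this.place = place; this.year = year;
        }
    }

    // Serialize: the stream records class information along with the field values.
    static byte[] serialize(Person p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Deserialize: rebuild an equivalent object from the bytes alone.
    static Person deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Person) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person copy = deserialize(serialize(new Person("Smith", "London", 1984)));
        System.out.println(copy.name + " " + copy.place + " " + copy.year);
    }
}
```

Unlike a CDR message, the serialized form carries the class name and field types, so the receiver does not need prior knowledge of the layout, only access to the class itself.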

3. Extensible Markup Language (XML):

XML is a markup language that was defined by the World Wide Web Consortium (W3C) for general use
on the web. It was initially developed for writing structured documents for the web. Clients use
XML to communicate with web services, and XML is also used to define the interfaces and other
properties of those services. XML is used in a variety of other applications as well, including
archiving and retrieval systems: although an XML archive is larger than a binary archive, it has
the advantage of being readable on any machine. Other XML applications include the specification
of user interfaces and the encoding of operating system configuration files.

In contrast to HTML, which employs a fixed set of tags, XML is extensible in the sense that users
can define their own tags. If an XML document is meant to be used by several applications, the
tag names must be agreed upon among them.

Example:

XML definition of the Person struct:


<person id="9865">
<name>John</name>
<place>England</place>
<year>1876</year>
<!-- comment -->
</person>
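On the receiving side, such a document has to be parsed to recover the values. The following sketch uses the DOM parser that ships with the Java platform (`javax.xml.parsers`); the class name `XmlPersonDemo` and the `child` helper are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class XmlPersonDemo {
    // The Person document from the example above, as a single string.
    static final String XML =
        "<person id=\"9865\">"
      + "<name>John</name><place>England</place><year>1876</year>"
      + "<!-- comment -->"
      + "</person>";

    // Parse the text and return the root element.
    static Element parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Text content of the first child element with the given tag name.
    static String child(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent();
    }

    public static void main(String[] args) {
        Element person = parse(XML);
        int year = Integer.parseInt(child(person, "year")); // text must be converted explicitly
        System.out.println(person.getAttribute("id") + ": " + child(person, "name")
            + ", " + child(person, "place") + ", " + year);
    }
}
```

All values arrive as text, so the application converts them to the types it needs; the tag names serve as the self-describing part of the representation.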

Virtualization in Distributed System

Virtualization in distributed systems enhances flexibility and resource efficiency by abstracting
hardware and software layers. This technology enables the creation of virtual environments,
optimizing resource use, improving scalability, and simplifying management in complex, distributed
infrastructures. Understanding its role is crucial for modern IT environments.

What is Virtualization in Distributed Systems?

Virtualization in Distributed Systems refers to the technology that abstracts and pools physical
resources (such as servers, storage, and network devices) to create virtual resources that can be
dynamically allocated and managed across a distributed network of physical machines.

Key Aspects of Virtualization in Distributed Systems:

 Abstraction: Virtualization abstracts the underlying physical hardware, allowing multiple
virtual instances (such as virtual machines or containers) to run on a single physical host.
This abstraction hides the complexities of the physical hardware and provides a unified
interface for managing resources.

 Resource Pooling: Physical resources are pooled together and allocated to virtual instances
as needed. This enables efficient utilization of hardware by distributing resources among
multiple virtual environments.

 Dynamic Allocation: Resources can be allocated, reallocated, or deallocated dynamically
based on the needs of the virtual instances. This allows for flexible scaling and efficient
management of resources in response to changing demands.
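Pooling and dynamic allocation can be pictured with a toy model. The sketch below tracks a single resource, a hypothetical count of CPUs shared by named virtual instances; it is only an illustration of the bookkeeping involved and does not reflect how any particular hypervisor or orchestrator works.

```java
import java.util.HashMap;
import java.util.Map;

class ResourcePool {
    private final int totalCpus;                                  // pooled physical capacity
    private final Map<String, Integer> allocated = new HashMap<>(); // per-instance share

    ResourcePool(int totalCpus) { this.totalCpus = totalCpus; }

    // Capacity not currently assigned to any virtual instance.
    int free() {
        int used = 0;
        for (int n : allocated.values()) used += n;
        return totalCpus - used;
    }

    // Allocate, grow, shrink, or (with 0) release an instance's share.
    // Returns false when the pool cannot satisfy the request.
    boolean resize(String instance, int cpus) {
        int current = allocated.getOrDefault(instance, 0);
        if (cpus < 0 || cpus - current > free()) return false;
        if (cpus == 0) allocated.remove(instance); else allocated.put(instance, cpus);
        return true;
    }

    public static void main(String[] args) {
        ResourcePool pool = new ResourcePool(8);
        System.out.println(pool.resize("vm-a", 4)); // granted
        System.out.println(pool.resize("vm-b", 6)); // refused: only 4 CPUs are free
        System.out.println(pool.resize("vm-a", 2)); // vm-a shrinks, freeing capacity
        System.out.println(pool.resize("vm-b", 6)); // now granted
        System.out.println(pool.free());
    }
}
```

The instances never see the physical host directly; they only hold a share of the pool, and those shares can change while the instances keep running.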

Importance of Virtualization in Distributed Systems


Virtualization is crucial in distributed systems for several reasons:

 Resource Optimization:

o Virtualization allows multiple virtual machines (VMs) or containers to run on a single
physical server, improving resource utilization and reducing hardware costs.

o This leads to better use of CPU, memory, and storage resources.

 Scalability:

o It simplifies scaling by enabling the rapid deployment and management of virtual
instances.

o Distributed systems can quickly scale up or down by adding or removing virtual
resources as needed, without requiring physical hardware changes.

 Isolation and Security:

o Virtualization provides isolation between different virtual environments, which
enhances security.

o If one VM or container is compromised, the isolation boundary helps keep the others
unaffected, thus containing potential security breaches.

 Flexibility and Agility:

o Virtualized environments allow for flexible and dynamic resource allocation.

o They support the rapid deployment of new applications and services and enable
easier testing and development of software in isolated environments.

 Simplified Management:

o Virtualization tools and platforms offer centralized management interfaces for
overseeing virtual resources.

o This simplifies monitoring, configuration, and maintenance tasks, and enables
automation of routine processes.

Types of Virtualization in Distributed Systems

In distributed systems, virtualization can take various forms, each addressing different aspects of
resource management and deployment. Here are the primary types of virtualization used in
distributed systems:

1. Server Virtualization

 Definition: Server virtualization involves creating multiple virtual servers on a single physical
server using a hypervisor.

2. Storage Virtualization

 Definition: Storage virtualization abstracts physical storage resources into a single logical
storage pool, making it easier to manage and allocate storage.

3. Network Virtualization
 Definition: Network virtualization abstracts network resources to create virtual networks
that are independent of physical hardware.

 Types:

o Virtual LANs (VLANs): Segregates network traffic into different virtual networks
within a physical network, improving security and efficiency.

o Software-Defined Networking (SDN): Separates the control plane from the data
plane in networking, allowing for centralized network management and dynamic
resource allocation.

o Network Function Virtualization (NFV): Virtualizes network functions (e.g., firewalls,
load balancers) into software instances rather than relying on dedicated hardware.

Use Cases of Virtualization in Distributed Systems

Virtualization in distributed systems supports a range of use cases and applications that
enhance flexibility, efficiency, and scalability. Here are some key use cases and
applications:

 Cloud Computing

o Public Clouds: Virtualization enables cloud providers to offer scalable, on-demand
resources (compute, storage, networking) to users through virtual machines and
containers.

o Private Clouds: Organizations use virtualization to build private clouds, providing
internal resources and services with similar benefits of scalability and resource
optimization.

 Data Center Optimization

o Server Consolidation: Virtualization allows multiple virtual servers to run on a single
physical server, reducing the number of physical machines needed and optimizing
data center space.

o Resource Pooling: Data centers use virtualization to pool and allocate resources
dynamically based on demand, improving efficiency and utilization.

 Development and Testing

o Isolated Environments: Developers create isolated virtual environments to test
applications across different configurations and OS versions without affecting
production systems.

o Rapid Provisioning: Virtual machines and containers can be quickly provisioned for
development and testing, accelerating development cycles and improving agility.
