Question Bank 1

IT1701- DISTRIBUTED SYSTEMS AND CLOUD COMPUTING

QUESTION BANK

UNIT I

Part- A

1. What is meant by a distributed system?
1. We define a distributed system as a collection of autonomous computers linked
by a network, with software designed to produce an integrated computing facility.
2. A system in which hardware or software components located at networked
computers communicate and coordinate their actions only by message passing.
3. A collection of two or more independent computers which coordinate their
processing through the exchange of synchronous or asynchronous message passing.
4. A collection of independent computers that appear to the users of the system as a
single computer.
2. What is the significance of a distributed system?
1. Concurrency of components.
2. No global clock.
3. Independent failures.
3. Why do we need distributed systems?
a. Functional distribution: Computers have different functional capabilities
(i.e., sharing of resources with specific functionalities).
b. Load distribution/balancing: Assign tasks to processors such that the
overall system performance is optimized.
c. Replication of processing power: Independent processors working on the
same task.
d. A distributed system consisting of a collection of microcomputers may
have processing power that no supercomputer will ever achieve.
e. Physical separation: Systems that rely on the fact that computers are
physically separated (e.g., to satisfy reliability requirements).
f. Economics: Collections of microprocessors offer a better
price/performance ratio than large mainframes. Mainframes: 10 times
faster, 1000 times as expensive.
4. Give examples of distributed systems.
a. Internet
b. Intranet
c. Mobile and ubiquitous computing.
5. What is meant by location-aware computing?
Mobile computing is the performance of computing tasks while users are on the
move and away from their home intranet, but still provided with access to resources via
the devices they carry with them. They can continue to access the Internet and the
resources in their home intranet, and there is increasing provision for users to utilize
resources, such as printers, that are conveniently nearby as they move around. This last
capability is known as location-aware computing.
6. What are the two types of resource sharing?
a. Hardware sharing: Printers, plotters, large disks and other peripherals are
shared to reduce costs.
b. Data sharing is important in many applications:
1. Software developers working in a team need to access each other’s code
and share the same development tools.
2. Many commercial applications enable users to access shared data
objects in a single active database.
3. The rapidly growing area of groupware tools enables users to
cooperate within a network.
7. List the importance of data sharing.
• Software developers working in a team need to access each other’s code
and share the same development tools.
• Many commercial applications enable users to access shared data objects
in a single active database.
• The rapidly growing area of groupware tools enables users to cooperate
within a network.
8. List the technological components of the Web.
• HTML
• HTTP (a request-reply protocol)
• URLs
9. List the challenges of distributed systems.
a. Heterogeneity: standards and protocols; middleware; virtual machines.
b. Openness: publication of services; notification of interfaces.
c. Security: firewalls; encryption.
d. Scalability: replication; caching; multiple servers.
e. Failure handling: failure tolerance; recovery/roll-back; redundancy.
f. Concurrency: concurrency control to ensure data consistency.
g. Transparency: middleware; location-transparent naming; anonymity.

10. What are the three components of security?
Security for information resources has three components:
• Confidentiality: protection against disclosure to unauthorized individuals.
• Integrity: protection against alteration or corruption.
• Availability: protection against interference with the means to access the
resources.
11. What is the use of firewall?
A firewall can be used to form a barrier around an intranet to protect it from outside
users but does not deal with ensuring the appropriate use of resources by users within the
intranet.
12. What are the security challenges? List them.
a. Denial of service attacks: A user may wish to disrupt a service for some
reason. This can be achieved by bombarding the service with such a large
number of pointless requests that the serious users are unable to use it.
This is called a denial of service attack, and there have been many such
attacks on well-known web services.
b. Security of mobile code: Mobile code needs to be handled with care. PC
users sometimes send executable files as email attachments to be run by
the recipient, but the recipient cannot predict the effects of running them.
13. List the challenges to be considered when designing a scalable distributed system.
• Controlling the cost of physical resources
• Controlling the performance loss
• Preventing software resources from running out
• Avoiding performance bottlenecks

14. What are the failures detected in DS?
Masking failures: Some detected failures can be hidden or made less severe.
Examples of hiding failures:
1. Messages can be retransmitted when they fail to arrive.
2. File data can be written to a pair of disks so that if one is corrupted, the other may
still be correct.
Tolerating failures: Most of the services in the Internet do exhibit failures. It
would not be practical for them to detect and hide all the failures that occur in such a
large network. Their clients are designed to tolerate failures, which generally involves
the users tolerating them as well.
Recovery from failures: involves the design of software so that the state of
permanent data can be rolled back after a server has crashed.
15. List the key design goals of DS?
a. Performance
b. Reliability
c. Scalability
d. Consistency
e. Security
16. List the technical design goals of DS?
a. Naming
b. Communication
c. Software structure
d. Workload allocation
17. Define RMI.
Each process contains a collection of objects, some of which can
receive both local and remote invocations whereas the other objects can receive only
local invocations.
Method invocations between objects in different processes, whether in the same
computer or not, are known as remote method invocations. Method invocations between
objects in the same process are local method invocations. We refer to objects that can
receive remote invocations as remote objects.
18. What are the main choices to be considered in the design of RMI?
RMI invocation semantics
a. In request-reply protocols, doOperation can be implemented in different ways to
provide different delivery guarantees.
b. The main choices are:
i. Retry request message: Controls whether to retransmit the request
message until either a reply is received or the server is assumed to have failed.
ii. Duplicate filtering: Controls, when retransmissions are used, whether to
filter out duplicate requests at the server.
iii. Retransmission of results: Controls whether to keep a history of result
messages to enable lost results to be retransmitted without re-executing the operations at
the server.
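The three choices above (retrying the request, duplicate filtering and retransmission of results) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the names `Server`, `handle` and `do_operation`, and the `lost_replies` parameter that simulates a lossy network, are hypothetical and do not come from any real RMI library.

```python
class Server:
    """Server that filters duplicate requests and keeps a history of results."""

    def __init__(self):
        self.counter = 0      # state changed by a non-idempotent operation
        self.history = {}     # request id -> saved reply (history of results)
        self.executions = 0   # how many times the operation really ran

    def handle(self, request_id, amount):
        # Duplicate filtering: a retransmitted request is recognised by its id
        # and answered from the history instead of being executed again.
        if request_id in self.history:
            return self.history[request_id]
        self.counter += amount
        self.executions += 1
        self.history[request_id] = self.counter
        return self.counter


def do_operation(server, request_id, amount, lost_replies, max_retries=5):
    """Retry the request until a reply arrives or the server is assumed failed.

    lost_replies is the number of initial replies that the simulated
    network drops on the way back to the client (an assumption of this sketch).
    """
    for attempt in range(max_retries):
        reply = server.handle(request_id, amount)  # request reaches the server
        if attempt < lost_replies:
            continue                               # reply lost: time out, retry
        return reply
    raise TimeoutError("server assumed to have failed")


server = Server()
result = do_operation(server, request_id=1, amount=10, lost_replies=2)
print(result, server.executions)
```

Although the request is sent three times, the operation runs only once, because the second and third copies are answered from the saved history.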
19. Sketch the RMI request-reply message structure.
Message Type
Request Id
Object Reference
Method Id
arguments
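20. Define RPC.
A remote procedure call (RPC) allows a client program to call a procedure in a server
program running in a separate process, generally on a different computer.
The software components required to implement RPC are as follows.
The client that accesses a service includes one stub procedure for each procedure
in the service interface. The stub procedure behaves like a local procedure to the client but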
instead of executing the call it marshals the procedure identifier and the arguments into
a request message which it sends via its communication module to the server. When the
reply message arrives it unmarshals the results.
The server process contains a dispatcher together with one server stub procedure
and one service procedure for each procedure in the service interface. The dispatcher
selects one of the server stub procedures according to the procedure identifier in the
request message.
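The roles of the client stub, communication module, dispatcher, server stub and service procedure can be sketched as follows. This is a minimal single-process illustration, not a real RPC framework: JSON is assumed as the marshalling format, the "network" is a direct function call, and all names (`add`, `DISPATCH_TABLE`, `communication_module` and so on) are hypothetical.

```python
import json

# --- server side ---------------------------------------------------------
def add(a, b):
    """Service procedure: the code that actually does the work."""
    return a + b

def add_server_stub(args):
    """Server stub: unpacks the arguments and calls the service procedure."""
    return add(*args)

# The dispatcher uses this table to map procedure identifiers to server stubs.
DISPATCH_TABLE = {1: add_server_stub}

def dispatcher(request_bytes):
    """Select a server stub according to the procedure identifier."""
    request = json.loads(request_bytes.decode("utf-8"))
    stub = DISPATCH_TABLE[request["procedure_id"]]
    result = stub(request["args"])
    return json.dumps({"result": result}).encode("utf-8")  # marshal the reply

# --- client side ---------------------------------------------------------
def communication_module(request_bytes):
    """Stand-in for the network: hands the request straight to the server."""
    return dispatcher(request_bytes)

def add_client_stub(a, b):
    """Client stub: looks like a local procedure but sends a request message."""
    request = json.dumps({"procedure_id": 1, "args": [a, b]}).encode("utf-8")
    reply = communication_module(request)
    return json.loads(reply.decode("utf-8"))["result"]      # unmarshal result

print(add_client_stub(2, 3))
```

The caller of `add_client_stub` never sees the request or reply messages, which is the point of the stub.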
21. What is meant by election?
Election: choosing a unique process for a particular role is called an election.
– All the processes agree on the unique choice.
– For example, the server in distributed mutual exclusion.
22. List the well-known mutual exclusion algorithms.
• Central server algorithm
• Ring-based algorithm
• Mutual exclusion using multicast and logical clocks
• Maekawa’s voting algorithm
• Mutual exclusion algorithms comparison

23. What is meant by hardware and software clock?
Clock devices can be programmed to generate interrupts at regular intervals in order
that, for example, time slicing can be implemented. The operating system reads the
node’s hardware clock value, Hi(t), scales it and adds an offset so as to produce a
software clock Ci(t) = αHi(t) + β that approximately measures real, physical time t for
process pi.
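The scaling and offset in the software clock formula can be shown with a short calculation. The values below are assumptions chosen only for illustration: a hardware counter that ticks 1000 times per second (so α = 0.001 seconds per tick) and an arbitrary offset β.

```python
def software_clock(hardware_ticks, alpha, beta):
    """Return alpha * H + beta, the software clock reading in seconds."""
    return alpha * hardware_ticks + beta

ALPHA = 0.001            # assumed: seconds per hardware tick (1 kHz counter)
BETA = 1_700_000_000.0   # assumed: offset in seconds added by the OS

# After 5000 ticks the hardware clock has counted about 5 seconds,
# so the software clock reads about 5 seconds past the offset.
reading = software_clock(5000, ALPHA, BETA)
print(reading - BETA)
```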
24. What is clock resolution?
Clock resolution is the period between updates of the clock value. Successive events
will correspond to different timestamps only if the clock resolution is smaller than the
time interval between successive events. The rate at which events occur depends on such
factors as the length of the processor instruction cycle.
25. What is clock drift?
Computer clocks are subject to clock drift, which means that they count time at
different rates and so diverge. The underlying oscillators are subject to physical
variations, with the consequence that their frequencies of oscillation differ. Moreover,
even the same clock’s frequency varies with temperature. Designs exist that attempt to
compensate for this variation, but they cannot eliminate it. A clock’s drift rate is the
change in the offset (difference in reading) between the clock and a nominal perfect
reference clock per unit of time measured by the reference clock.
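Because the drift rate is an offset per unit of reference time, the accumulated offset is simply the rate multiplied by the elapsed time. The sketch below assumes a constant drift rate of 10^-6 seconds per second, a value picked for illustration rather than taken from this question bank.

```python
def accumulated_offset(drift_rate, elapsed_seconds):
    """Offset (in seconds) built up after elapsed_seconds at a constant drift rate."""
    return drift_rate * elapsed_seconds

DRIFT_RATE = 1e-6                  # assumed: seconds of drift per second
SECONDS_PER_DAY = 24 * 60 * 60

offset_after_one_day = accumulated_offset(DRIFT_RATE, SECONDS_PER_DAY)
print(offset_after_one_day)        # roughly 0.0864 seconds per day
```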
26. What is distributed deadlock? Explain with example.
With deadlock detection schemes, a transaction is aborted only when it is involved in a
deadlock. Most deadlock detection schemes operate by finding cycles in the
transaction wait-for graph. In a distributed system involving multiple servers
being accessed by multiple transactions, a global wait-for graph can in
theory be constructed from the local ones. There can be a cycle in the global
wait-for graph that is not in any single local one; that is, there can be a
distributed deadlock.
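As an example, the sketch below merges three local wait-for graphs into a global one and searches it for a cycle. The transactions U, V and W, the three servers and the helper names `merge` and `find_cycle` are hypothetical; a depth-first search is used here as one simple way of finding a cycle, not as the method of any particular detection scheme.

```python
def merge(*local_graphs):
    """Build a global wait-for graph from the local graphs of each server."""
    merged = {}
    for graph in local_graphs:
        for transaction, waits_for in graph.items():
            merged.setdefault(transaction, set()).update(waits_for)
    return merged

def find_cycle(wait_for):
    """Return the transactions on a cycle, or None if the graph has no cycle."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {}

    def visit(node, path):
        colour[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:                  # back edge: cycle found
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt, path)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(wait_for):
        if colour.get(node, WHITE) == WHITE:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None

# Each server sees only one edge, so no local graph contains a cycle.
server_x = {"U": {"V"}}   # at server X, U waits for V
server_y = {"V": {"W"}}   # at server Y, V waits for W
server_z = {"W": {"U"}}   # at server Z, W waits for U

global_graph = merge(server_x, server_y, server_z)
print(find_cycle(server_x), find_cycle(global_graph))
```

The cycle U, V, W appears only in the merged graph, which is exactly the situation described as a distributed deadlock.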
27. Explain phantom deadlocks.
A deadlock that is 'detected' but is not really a deadlock is called a phantom deadlock. In
distributed deadlock detection, information about wait-for relationships between
transactions is transmitted from one server to another. If there is a deadlock, the necessary
information will eventually be collected in one place and a cycle will be detected. As this
procedure will take some time, there is a chance that one of the transactions
that holds a lock will meanwhile have released it, in which case the deadlock will no longer
exist.

Part- B

1. Explain the need for distributed systems and their characteristics with examples.
2. Explain the issues to be considered in the design of Distributed Systems
3. Explain in detail about the architectural model of distributed system.
4. Explain the RPC mechanism with various functional components.
5. Describe CORBA RMI and its services.
6. Explain about Distributed Deadlock Detection Algorithms.
7. Discuss the following: a) UDP datagram communication b) TCP stream
communication
8. Explain ring based election algorithms
UNIT II

Part – A

1. Define thread.
A thread is the operating system abstraction of an activity (the term derives from the
phrase ‘thread of execution’). An execution environment is the unit of resource
management: a collection of local kernel managed resources to which its threads have
access.
2. List the architectures of a multi-threaded server.
• Worker pool architecture
• Thread-per-request architecture
• Thread-per-connection architecture
• Thread-per-object architecture
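As an illustration of the first architecture in the list, the sketch below uses a fixed pool of worker threads to serve a queue of requests. It relies on Python's standard `concurrent.futures.ThreadPoolExecutor`; the `handle_request` function and the request strings are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def handle_request(request):
    """Hypothetical request handler: here it just upper-cases the request."""
    return request.upper()

requests = ["read a", "write b", "read c"]

# A fixed pool of two worker threads takes requests from a shared queue.
# pool.map returns the replies in the same order as the requests.
with ThreadPoolExecutor(max_workers=2) as pool:
    replies = list(pool.map(handle_request, requests))

print(replies)
```

In a thread-per-request design, by contrast, a new thread would be created for every request and destroyed once the reply had been sent.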
3. Compare processes and threads.
a. Creating a new thread within an existing process is cheaper than creating a process.
b. More importantly, switching to a different thread within the same process is cheaper
than switching between threads belonging to different processes.
c. Threads within a process may share data and other resources conveniently and
efficiently compared with separate processes.
d. But, by the same token, threads within a process are not protected from one another.
4. Explain thread lifetime.
A new thread is created on the same Java Virtual Machine (JVM) as its creator, in the
SUSPENDED state. After it is made RUNNABLE with the start() method, it executes
the run() method of an object designated in its constructor. The JVM and the threads on
top of it all execute in a process on top of the underlying operating system. Threads can
be assigned a priority, so that a Java implementation that supports priorities will run a
particular thread in preference to any thread with lower priority.
5. Enumerate the properties of storage systems.

                              Sharing  Persistence  Distributed     Consistency  Example
                                                    cache/replicas  maintenance
Main memory                   No       No           No              1            RAM
File system                   No       Yes          No              1            UNIX file system
Distributed file system       Yes      Yes          Yes             Yes          Sun NFS
Web                           Yes      Yes          Yes             No           Web server
Distributed shared memory     Yes      No           Yes             Yes          Ivy (DSM)
Remote objects (RMI/ORB)      Yes      No           No              1            CORBA
Persistent object store       Yes      Yes          No              1            CORBA persistent state service
Peer-to-peer storage system   Yes      Yes          Yes             2            OceanStore

Consistency maintenance: 1 = strict one-copy consistency; Yes = slightly weaker
guarantees; 2 = considerably weaker guarantees.
6. List the file system modules.
Directory module: Relates file names to file IDs
File module: Relates file IDs to particular files
Access control module: Checks permission for operation requested
File access module: Reads or writes file data or attributes
Block module: Accesses and allocates disk blocks
Device module: Disk I/O and buffering
7. Sketch the file attribute record structure.

File length
Creation timestamp
Read timestamp
Write timestamp
Attribute timestamp
Reference count
Owner
File type
Access control list

8. What is a process?
A process is a program in execution. Process execution must progress in sequential
order.
9. What is process migration?
Shifting a process from one machine to another is called process migration.
10. What is load?
Load may be defined in terms of the number of tasks waiting in the run queue, CPU
utilization, load average, I/O utilization, the amount of free CPU time or memory, etc.
11. List the desirable features of a good process migration mechanism.
• Transparency
• Efficiency
• Minimal interference
• Minimal freezing time
• Minimal residual dependencies
12. List any three challenges of process migration.
• Process state capturing and transfer
• Scheduling
• System calls
13. What are the strategies for the migration of files?
• If the file is locked by the migrating process and resides on the same system,
then transfer the file with the process.
• If the process is moved temporarily, transfer the file only after an access request
has been made by the migrated process.

14. Explain the benefits of process migration.
• Better response time and execution speed-up
• Reduced network traffic
• Improved system reliability
• Higher throughput and effective resource utilization
15. List the types of process scheduling techniques.
• Task assignment approach
• Load-balancing approach
• Load-sharing approach
16. What is a kernel-level thread?
With kernel-level threads, thread management is done by the kernel, and the operating
system supports the threads directly. Since the kernel manages the threads, it can
schedule another thread if a given thread blocks, rather than blocking the entire process.

17. What is a user-level thread?
User-level threads use user space for thread scheduling. These threads are transparent to
the operating system. User-level threads are created by runtime libraries that cannot
execute privileged instructions.

18. What is preemptive process migration?
Preemptive process transfer involves the transfer of a process that is partially executed.
This transfer is an expensive operation, as the collection of a process’s state can be
difficult.

19. What is non-preemptive process migration?
Non-preemptive process transfers involve the transfer of processes that have not begun
execution and hence do not require the transfer of the process’s state. In both types of
transfer, information about the environment in which the process will execute must be
transferred to the receiving node.
20. What are the desirable features of a global scheduling algorithm?
• No a priori knowledge about the processes
• Ability to make dynamic scheduling decisions
• Flexible
• Stable
• Scalable
• Unaffected by system failures

21. What is the task assignment approach?
In this approach, each process is considered to be composed of multiple tasks, and the
goal is to find an optimal assignment policy for the tasks of an individual process. The
tasks are scheduled to suitable processors to improve performance. This is not a widely
used approach because:
• It requires the characteristics of all the processes to be known in advance.
• It does not take into consideration the dynamically changing state of the system.

22. List the task assignment approach algorithms.
• Graph-theoretic deterministic algorithm
• Centralized heuristic algorithm
• Hierarchical algorithm

23. What is the goal of load balancing algorithms?
The goal of load balancing algorithms is to maintain the load on each processing element
so that no processing element becomes either overloaded or idle. Ideally, each processing
element has an equal load at any moment during execution, so that the system achieves
maximum performance (minimum execution time).

24. What is load balancing?
Load balancing is the way of distributing load units (jobs or tasks) across a set of
processors which are connected by a network and which may be distributed across the globe.
• The excess load or remaining unexecuted load from a processor is migrated to other
processors whose load is below the threshold load.
• The threshold load is the amount of load up to which a processor can still accept
further load.
• With a load balancing strategy it is possible to make every processor equally busy and to
finish the work at approximately the same time.
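The threshold rule described above can be sketched as a simple loop that moves excess load units from overloaded processors to processors that are still below the threshold. The function name `balance`, the processor names and the load values are hypothetical, and real load balancers also account for migration cost, which this sketch ignores.

```python
def balance(loads, threshold):
    """Migrate excess load from overloaded processors to underloaded ones.

    loads maps a processor name to its number of load units.
    Returns the new loads and the list of (sender, receiver, units) migrations.
    """
    loads = dict(loads)                       # work on a copy
    migrations = []
    senders = [p for p in loads if loads[p] > threshold]
    for sender in senders:
        for receiver in loads:
            if loads[sender] <= threshold:    # sender no longer overloaded
                break
            room = threshold - loads[receiver]
            if receiver == sender or room <= 0:
                continue                      # receiver cannot accept more load
            moved = min(room, loads[sender] - threshold)
            loads[sender] -= moved
            loads[receiver] += moved
            migrations.append((sender, receiver, moved))
    return loads, migrations

loads = {"p1": 9, "p2": 1, "p3": 2}
new_loads, migrations = balance(loads, threshold=4)
print(new_loads)    # {'p1': 4, 'p2': 4, 'p3': 4}
print(migrations)   # [('p1', 'p2', 3), ('p1', 'p3', 2)]
```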

Part- B

1. Explain threads in detail.
2. Illustrate the thread model.
3. Explain in detail about resource management
4. Explain Process Migration with suitable example?
5. Write notes on desirable features of a good process migration mechanism
6. Discuss Load-balancing Approach.
7. Give the techniques and methodologies for scheduling processes in a distributed
system.
8. Give a brief account on desired features of scheduling algorithms.
UNIT III
Part-A
1) Define community cloud.
A community cloud in computing is a collaborative effort in which infrastructure is
shared between several organizations from a specific community with common
concerns (security, compliance, jurisdiction, etc.), whether managed internally or by a
third-party and hosted internally or externally. This is controlled and used by a group
of organizations that have shared interest. The costs are spread over fewer users than a
public cloud (but more than a private cloud), so only some of the cost savings
potential of cloud computing are realized.
2) Difference between public and private cloud.
A public cloud is one based on the standard cloud computing model, in which a
service provider makes resources, such as applications and storage, available to the
general public over the Internet. Public cloud services may be free or offered on a
pay-per-usage model.

Private cloud is a type of cloud computing that delivers similar advantages to public
cloud, including scalability and self-service, but through a proprietary architecture.
Unlike public clouds, which deliver services to multiple organizations, a private cloud
is dedicated to a single organization.
3) What are the categories of cloud computing?
IaaS allows users to rent the infrastructure itself: servers, data center space, and
software. The biggest advantage of renting, as opposed to owning, infrastructure is
that users can scale up the amount of space needed at any time.
PaaS allows developers to create applications, collaborate on projects, and test
application functionality without having to purchase or maintain infrastructure.
Software as a service (SaaS; pronounced /sæs/) is a software licensing and delivery
model in which software is licensed on a subscription basis and is centrally hosted. It
is sometimes referred to as "on-demand software".
4) Define IaaS.
IaaS allows users to rent the infrastructure itself: servers, data center space, and
software. The biggest advantage of renting, as opposed to owning, infrastructure is
that users can scale up the amount of space needed at any time.
5) Define PaaS.
PaaS allows developers to create applications, collaborate on projects, and test
application functionality without having to purchase or maintain infrastructure.
6) Define SaaS.
Software as a service (SaaS; pronounced /sæs/) is a software licensing and delivery
model in which software is licensed on a subscription basis and is centrally hosted. It
is sometimes referred to as "on-demand software".
7) Mention the advantages of cloud computing.
• Flexibility
• Disaster recovery
• Automatic software updates
• Capital-expenditure free
• Increased collaboration
• Work from anywhere
• Document control
8) Mention the disadvantages of cloud computing.
• Downtime
• Security
• Privacy
• Vulnerability to attack
• Limited control and flexibility
9) Explain the different levels of virtualization.
• Hardware virtualization
• Virtual machine
• Storage virtualization
• Desktop virtualization
• Network virtualization
10) What is Virtualization?
Virtualization is the creation of a virtual (rather than actual) version of something,
such as an operating system, a server, a storage device or network resources.
11) What are virtual clusters?
Virtual clusters are built from virtual machines (VMs) installed on servers distributed
across one or more physical clusters. The VMs in a virtual cluster are logically
interconnected by a virtual network across several physical networks.
12) Explain about management of resources.
Resource management is the efficient and effective deployment and allocation of an
organization's resources when and where they are needed. Such resources may include
financial resources, inventory, human skills, production resources, or information
technology.
13) How does virtualization happen in a data center?
A Virtual Datacenter is a pool of cloud infrastructure resources designed specifically
for enterprise business needs. Those resources include compute, memory, storage and
bandwidth. Bluelock Virtual Datacenters are hosted in the public cloud and are based
on VMware vCloud technology, which provides full compatibility with any VMware
environment.
Part-B
1) Explain the various cloud deployment models.
2) Explain the different categories of cloud computing.
3) Explain the advantages and disadvantages of cloud computing.
4) Explain the implementation levels of virtualization.
5) Explain Virtualization Structure in detail.
6) Explain Virtualization of CPU, Memory and I/O devices.
7) Explain about virtual clusters and resource management
8) Write in detail about Virtualization for data centre automation
UNIT IV
Part-A
1) List some of the open source grid middleware packages.

• GridGain
• Hadoop
• Rio
• JPPF

2) Define Globus Toolkit.

The Globus Toolkit is an open source toolkit for grid computing developed and provided by
the Globus Alliance.

The Globus Toolkit adheres to or provides implementations of the following standards:

• Open Grid Services Architecture (OGSA)
• Open Grid Services Infrastructure (OGSI), which was originally intended to form the
basic “plumbing” layer for OGSA but has been superseded by WSRF and WS-Management.

3) What is a programming model?

A programming model refers to the style of programming where execution is invoked by
making what appear to be library calls. Examples include the POSIX Threads library and
Hadoop's MapReduce.

4) Define Big Data.

Extremely large data sets that may be analysed computationally to reveal patterns, trends, and
associations, especially relating to human behaviour and interactions.

5) What is Hadoop?

Hadoop is an open-source framework that allows users to store and process big data in a
distributed environment across clusters of computers using simple programming models. It
is designed to scale up from single servers to thousands of machines, each offering local
computation and storage.

6) What are map and reduce functions?

A MapReduce program is composed of a Map() procedure (method) that performs filtering
and sorting (such as sorting students by first name into queues, one queue for each name) and
a Reduce() method that performs a summary operation (such as counting the number of
students in each queue, yielding name frequencies).
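The student example above can be imitated in a few lines of plain Python. This sketch does not use the Hadoop API; the function names `map_fn`, `reduce_fn` and `map_reduce`, and the student names, are made up to show the shape of the computation on a single machine.

```python
from collections import defaultdict

def map_fn(student):
    """Map step: emit (first name, 1) for one student record."""
    first_name = student.split()[0]
    yield first_name, 1

def reduce_fn(name, values):
    """Reduce step: summarise one queue by counting its entries."""
    return name, sum(values)

def map_reduce(records):
    """Run the map step, group values by key, then run the reduce step."""
    groups = defaultdict(list)          # one "queue" per first name
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))

students = ["Asha Rao", "Ravi Kumar", "Asha Nair", "Ravi Iyer", "Meena Das"]
name_frequencies = map_reduce(students)
print(name_frequencies)   # {'Asha': 2, 'Meena': 1, 'Ravi': 2}
```

In a real cluster the map calls and the reduce calls would run in parallel on different machines, with the framework doing the grouping step between them.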

7) How to run a job in Hadoop?

• Navigate to the command line.
• Type ./bdutil shell to SSH into the master node of the Hadoop cluster.
• Copy one or more text files into the input directory. ...
• Type cd /hadoop-install/share/hadoop/mapreduce to navigate to the Hadoop install
directory.

8) What is HDFS?

The Hadoop Distributed File System (HDFS) is a sub-project of the Apache Hadoop
project. This Apache Software Foundation project is designed to provide a fault-tolerant file
system designed to run on commodity hardware.

9) Explain the commands used in HDFS.

• namenode -format
• secondarynamenode
• namenode
• datanode
• dfsadmin
• mradmin
• fsck
• fs

10) What is the Java interface used in HDFS?

$ java -version

If everything works fine it will give you the following output.

java version "1.7.0_71"

Java(TM) SE Runtime Environment (build 1.7.0_71-b13)

Java HotSpot(TM) Client VM (build 25.0-b02, mixed mode)

11) Explain the dataflow of read and write in a file system.

An application adds data to HDFS by creating a new file and writing the data to it. After the
file is closed, the bytes written cannot be altered or removed except that new data can be
added to the file by reopening the file for append. HDFS implements a single-writer,
multiple-reader model.

PART-B
1) Explain in detail about GT4 Architecture.
2) What are the uses of Globus?
3) Explain in detail about Hadoop.
4) What is HDFS? Explain in detail.
5) What are the open source grid middleware packages?
UNIT-V

Part- A

1) What is Grid Computing?

Grid computing is the collection of computer resources from multiple locations to reach a
common goal. The grid can be thought of as a distributed system with non-interactive
workloads that involve a large number of files.

2) Explain why grid security is required.

The Grid Security Infrastructure (GSI), formerly called the Globus Security Infrastructure, is
a specification for secret, tamper-proof, delegable communication between software in
a grid computing environment. Secure, authenticable communication is enabled using
asymmetric encryption.

3) Explain about host level cloud security.

Majority of cloud service providers store customers’ data on large data centres. Although
cloud service providers say that data stored is secure and safe in the cloud, customers’ data
may be damaged during transition operations from or to the cloud storage provider.

4) What is network-level cloud security?

All data on the network needs to be secured. Strong network traffic encryption techniques
such as Secure Sockets Layer (SSL) and Transport Layer Security (TLS) can be used to
prevent leakage of sensitive information. Several key security elements such as data security,
data integrity, authentication and authorization, data confidentiality, web application security,
virtualization vulnerability, availability, backup, and data breaches should be carefully
considered to keep the cloud up and running continuously.

5) Explain about application level security.

Studies indicate that most websites are secured at the network level while there may be
security loopholes at the application level which may allow information access to
unauthorized users. Software and hardware resources can be used to provide security to
applications.

6) Why is data security required in cloud?

Majority of cloud service providers store customers’ data on large data centres. Although
cloud service providers say that data stored is secure and safe in the cloud, customers’ data
may be damaged during transition operations from or to the cloud storage provider.

7) What is IAM?

Identity and access management (IAM) is the security and business discipline that "enables
the right individuals to access the right resources at the right times and for the right reasons."
It addresses the need to ensure appropriate access to resources across increasingly
heterogeneous technology environments and to meet increasingly rigorous compliance
requirements.
8) How is IaaS availability in cloud measured?

Infrastructure as a Service (IaaS) is a form of cloud computing that provides virtualized
computing resources over the Internet. IaaS is one of three main categories of cloud
computing services, alongside Software as a Service (SaaS) and Platform as a Service (PaaS).

9) What are PaaS and SaaS?

Platform as a service (PaaS) is a category of cloud computing services that provides a
platform allowing customers to develop, run, and manage applications without the
complexity of building and maintaining the infrastructure typically associated with
developing and launching an app.

10) What are the disadvantages of cloud computing?

• Downtime
• Security
• Privacy
• Vulnerability to attack

11) Explain about grid security.

The Grid Security Infrastructure (GSI), formerly called the Globus Security Infrastructure, is
a specification for secret, tamper-proof, delegable communication between software in
a grid computing environment. Secure, authenticable communication is enabled using
asymmetric encryption

PART-B

1) Explain the Authentication and Authorization methods in grid security environment.

2) What is Grid Security Infrastructure? Explain in detail.

3) What is Cloud Infrastructure Security? Explain in detail.

4) Explain the different aspects of Data Security in cloud.

5) Write about Identity and Access Management Architecture.

6) Explain the IAM practices in cloud and their availability.
