Answers

Write short notes on global exchange of cloud resources

The global exchange of cloud resources refers to the ability of cloud computing platforms
to share and deliver computing resources, such as storage, processing power, and
software, across the world via the Internet.

Key Points:

1. Worldwide Access

• Cloud services can be accessed from any location using the Internet.

• This enables global collaboration and remote computing without needing physical infrastructure at every site.

2. Interconnected Data Centers

• Cloud providers such as Amazon, Google, and Microsoft have data centers in multiple countries.

• These centers are interconnected over the Internet to provide scalable and
efficient services.

3. Legal & Security Concerns

• Users in different countries may hesitate to use foreign clouds due to data privacy
laws, trust, or regulatory issues.

• For example, European users may not feel comfortable storing sensitive data on
servers located in the U.S.

4. SLAs (Service Level Agreements)

• To support global exchange, clear agreements between users and providers are
necessary.

• SLAs define service quality, availability, security, and privacy standards.

5. Hybrid Cloud Use

• Many enterprises use hybrid clouds, combining local (private) infrastructure with
global (public) cloud services.

• This allows them to meet international demand while keeping sensitive data
secure.
Benefits of Global Exchange:

• High availability of resources

• Load balancing across time zones

• Disaster recovery through geographic distribution

• Cost savings by using cheaper global resources

Discuss a set of cloud services provided by Microsoft Azure.

1. Compute Services

• Virtual Machines (VMs): Run Windows or Linux machines in the cloud.

• Azure Functions: Serverless compute for running code without managing servers.

• Azure Kubernetes Service (AKS): Manage containerized apps easily.

2. Storage Services

• Blob Storage: Store large unstructured data like images and videos.

• Disk & File Storage: Persistent and shared storage for VMs and apps.

• Archive Storage: Low-cost storage for rarely used data.
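
To make the storage services above concrete, here is a minimal sketch (not an official example) of uploading an image to Blob Storage with the azure-storage-blob Python SDK (v12-style client); the environment variable, container name, and file paths are illustrative assumptions.

```python
# Minimal sketch: upload a file to Azure Blob Storage (azure-storage-blob v12).
# Connection string, container name, and file names are placeholders.
import os
from azure.storage.blob import BlobServiceClient

# Assumes the storage account connection string is exported as an env variable.
conn_str = os.environ["AZURE_STORAGE_CONNECTION_STRING"]

service = BlobServiceClient.from_connection_string(conn_str)
container = service.get_container_client("media")   # hypothetical container

# Upload an image as a block blob; Blob Storage targets unstructured data.
with open("photo.jpg", "rb") as data:
    container.upload_blob(name="photos/photo.jpg", data=data, overwrite=True)

print("Uploaded photo.jpg to the 'media' container")
```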

3. Networking Services

• Virtual Network (VNet): Secure cloud networking.

• Load Balancer & VPN Gateway: Distribute traffic and connect securely to on-premises systems.

• CDN: Delivers content faster to users worldwide.

4. Database & Analytics

• Azure SQL Database: Fully managed relational database.


• Cosmos DB: Global, scalable NoSQL database.

• Data Factory: For data integration and movement.

5. Security & Identity

• Azure Active Directory: Manage user identity and access.

• Security Center & Key Vault: Monitor security and store sensitive data securely.

6. DevOps & Developer Tools

• Azure DevOps Services: Tools for Continuous Integration / Continuous Deployment (CI/CD).

• DevTest Labs: Create and manage test environments for development.

• Azure Container Instances (ACI): Run containers without needing to manage servers.

7. AI & Machine Learning Services

• Azure Machine Learning: Build, train, and deploy machine learning models.

• Cognitive Services: Pre-built AI APIs for vision, speech, and language tasks.

• Azure Bot Services: Build and deploy intelligent chatbots.

8. Hybrid Cloud Solutions

• Azure Arc: Manage on-prem, multi-cloud, and edge resources with Azure tools.

• Azure Stack: Run Azure services in on-premises environments.

• Site Recovery: Disaster recovery to keep applications running during failures.

9. IoT (Internet of Things) Services

• Azure IoT Hub: Connect and manage IoT devices securely.

• IoT Central: Fully managed IoT app platform with simplified setup.
• Digital Twins: Create digital replicas of real-world systems for simulation and
monitoring.

10. Migration & Modernization Services

• Azure Migrate: Assess and move workloads to Azure easily.

• Database Migration Service: Move databases with minimal downtime.

• App Service Migration Assistant: Shift web and .NET apps to Azure smoothly.

Explain in detail about Implementation Levels of virtualization.

Implementation Levels of Virtualization

Virtualization can be implemented at different levels in a computing system, each with its own role, advantages, and challenges. As per the textbook, there are five main levels of virtualization:

1. Instruction Set Architecture (ISA) Level

• This level allows virtualization of CPU instruction sets.

• It enables applications compiled for one hardware architecture to run on different hardware.

• Achieved using binary translation.

• For example, Intel binaries can run on PowerPC using binary translation techniques.

Advantage: Increases application portability.


Limitation: Can be slow due to translation overhead.
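
The binary translation idea can be illustrated with a toy sketch: the short Python program below "translates" a made-up guest instruction set into host operations at run time. The opcodes and registers are invented for illustration and do not correspond to any real ISA.

```python
# Toy illustration of ISA-level emulation via translation: each guest
# instruction (invented for this sketch) is mapped to a host-side operation.
guest_program = [
    ("LOAD", "r1", 10),    # r1 <- 10
    ("LOAD", "r2", 32),    # r2 <- 32
    ("ADD",  "r1", "r2"),  # r1 <- r1 + r2
    ("PRINT", "r1"),       # output r1
]

registers = {"r1": 0, "r2": 0}

# "Translation" table: guest opcodes -> host (Python) actions.
translators = {
    "LOAD":  lambda reg, val: registers.__setitem__(reg, val),
    "ADD":   lambda dst, src: registers.__setitem__(dst, registers[dst] + registers[src]),
    "PRINT": lambda reg: print(f"{reg} = {registers[reg]}"),
}

for opcode, *operands in guest_program:
    translators[opcode](*operands)   # translate-and-execute one instruction
```

The per-instruction lookup in the loop is also a simple way to see where the translation overhead mentioned above comes from.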

2. Hardware Level

• Virtualization is done using a Virtual Machine Monitor (VMM) or Hypervisor.

• The hypervisor directly manages the hardware like CPU, memory, and I/O devices.

• This is also called bare-metal virtualization.


Example: VMware ESX Server, Xen.
Advantage: Best performance and resource isolation.
Limitation: Requires complex hardware support.

3. Operating System (OS) Level

• The OS kernel is modified to support multiple user spaces.

• Each user space behaves like a separate virtual machine.

• No need for a separate guest OS—this reduces overhead.

Example: Containers like Docker, LXC.


Advantage: Fast and lightweight.
Limitation: All containers must use the same OS kernel.

4. Library Support Level

• Uses API translation to make an application believe it’s running in a different environment.

• For example, Windows applications can run on Linux using a library that mimics
Windows APIs.

Example: WINE (for running Windows apps on Linux).


Advantage: No need to install full OS.
Limitation: Limited compatibility and performance.

5. Application Level Virtualization

• This level allows individual applications to run in isolated environments, regardless of the underlying system.

• The application is encapsulated along with its dependencies in a virtual environment.

• It avoids software conflicts and simplifies deployment.


Example: Java Virtual Machine (JVM), .NET CLR.
Advantage: Platform independence and ease of deployment.
Limitation: Only the specific application is virtualized—not the whole system.

Explain how Migration of Memory, Files, and Network Resources happen in cloud
computing.

Migration of Memory, Files, and Network Resources in Cloud Computing

In cloud computing, migration means moving a running virtual machine (VM) or its
components like memory, files, or network state from one physical machine to another.
This is essential for load balancing, fault tolerance, and maintenance.

There are three major types of resource migration:

1. Migration of Memory

• Memory migration involves transferring the active memory pages of a VM.

• A live migration technique is used, where memory is copied while the VM is still
running.

• Dirty pages (pages that change during copying) are tracked and re-copied.

• Eventually, the VM is paused briefly to transfer the remaining pages.

Goal: Ensure minimal downtime and service disruption.


Use Case: Moving VMs during hardware maintenance.
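
The pre-copy technique described above can be sketched as a short simulation; the page count, dirtying rate, and stop threshold below are assumed values chosen only to show the shrinking rounds.

```python
import random

# Simplified simulation of pre-copy live memory migration: copy all pages,
# then repeatedly re-copy the pages dirtied while copying, and finally pause
# the VM briefly to transfer the small remaining set (stop-and-copy phase).
TOTAL_PAGES = 10_000
DIRTY_RATE = 0.05        # pages dirtied per page copied in a round (assumed)
STOP_THRESHOLD = 100     # pause the VM once this few pages remain (assumed)

pending = list(range(TOTAL_PAGES))   # pages still to be transferred
round_no = 0

while len(pending) > STOP_THRESHOLD:
    round_no += 1
    copied = len(pending)
    # The longer a round runs, the more pages the still-running VM dirties;
    # model that as a fraction of the pages copied this round.
    pending = random.sample(range(TOTAL_PAGES), k=int(copied * DIRTY_RATE))
    print(f"Round {round_no}: copied {copied} pages, {len(pending)} dirtied")

# Brief downtime while the last dirty pages are moved and the VM is resumed
# on the destination host.
print(f"Pausing VM to transfer the final {len(pending)} pages (downtime)")
```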

2. Migration of Files

• VM files like disk images, system libraries, and configurations are stored in shared
storage systems.

• Network File System (NFS) or distributed file systems (like Google File System)
are used.

• Instead of physically copying files, the new machine accesses the same shared
storage.
Advantage: Fast and efficient file access.
Challenge: Requires file consistency and synchronization.

3. Migration of Network Resources

• Migrating VMs also requires preserving their network identity (like IP addresses).

• Virtual networking technologies allow the VM to retain its address even when
moved.

• Dynamic routing and network reconfiguration help maintain ongoing sessions.

Goal: Maintain ongoing network sessions without interruption.


Challenge: Handling IP bindings and latency during rerouting.

Explain VM based intrusion detection system.

What is Intrusion Detection?

Intrusion Detection Systems (IDS) are used to detect unauthorized access, misuse, or
attacks on computing systems.

Why Use VM-Based IDS in Cloud Computing?

In a traditional physical system, it’s hard to monitor all activities without interfering with
the system itself. But with virtual machines (VMs), it's possible to detect attacks from
outside the VM, without affecting its internal processes.

Definition from the Textbook

"With VM-based intrusion detection, one can build a secure VM to run the intrusion
detection facility (IDS) outside all guest VMs."

How It Works

1. A secure VM (called IDS VM) is created on the same host system.

2. This VM runs an IDS engine (e.g., Snort, Suricata).


3. It monitors:

o Guest OS activities

o Network traffic

o System calls

o Disk operations

4. All other VMs (guest VMs) continue their operations unaware of the monitoring.
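
The flow above can be sketched as a monitoring loop. The `hypervisor_introspect` module and its functions below are hypothetical placeholders for whatever VM-introspection interface the hypervisor exposes; they are not a real library.

```python
# Sketch of an IDS VM polling guest VMs from outside, via a hypothetical
# hypervisor introspection interface (hypervisor_introspect is NOT a real
# library; its functions stand in for whatever the platform provides).
import time
from hypervisor_introspect import list_guest_vms, read_syscall_trace  # hypothetical

SUSPICIOUS_SYSCALLS = {"ptrace", "init_module", "mount"}   # example policy

def scan_guest(vm):
    """Inspect one guest VM's recent system calls without touching the guest."""
    alerts = []
    for event in read_syscall_trace(vm):          # hypothetical call
        if event.name in SUSPICIOUS_SYSCALLS:
            alerts.append(f"[{vm.name}] suspicious syscall: {event.name}")
    return alerts

def ids_loop(poll_interval=5):
    """Main loop of the IDS VM: monitor every guest on the same host."""
    while True:
        for vm in list_guest_vms():               # hypothetical call
            for alert in scan_guest(vm):
                print("ALERT:", alert)
        time.sleep(poll_interval)

if __name__ == "__main__":
    ids_loop()
```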

Key Advantages

• No modification required in guest VMs.

• Can detect:

o Virus activities

o Worm propagation

o Security policy violations

o Suspicious system calls

• High-level isolation enhances security and stealth.

• IDS VM can monitor multiple guest VMs simultaneously.

Illustration (from textbook)

• The IDS engine resides in a separate monitoring VM.

• It has visibility into:

o VM memory states

o I/O operations

o Application behavior
Explain reputation system design options.

Reputation System Design Options

In cloud computing, trust is very important since users do not control the infrastructure.
One way to manage trust is by using a reputation system.

A reputation system helps in measuring and managing the trustworthiness of users,


services, and providers based on their past behavior and interactions.

1. What is Reputation?

Reputation is the quality assigned to an entity (user/service/provider) based on:

• Past interactions.

• Feedback from other users.

• Observations over time.

2. Ways to Determine Trust

There are three key approaches:

a. Policies

• A set of rules or conditions to define trust.

• Based on credentials, e.g., certificates or digital signatures.

• Example: A CSP is trusted only if it has ISO certification.

b. Reputation

• Based on history of behavior.

• More trust is given to entities with positive feedback over time.

• Example: A cloud provider with good uptime and secure service earns a high
reputation score.

c. Recommendations

• Trust is based on opinions of others.

• Can be direct (personal experience) or indirect (third-party feedback).


• Example: One CSP may trust another if a mutual user has had a good experience.

3. How Reputation is Used

• In service selection: Choose the provider with the highest score.

• In access control: Only allow users with good reputation to access critical data.

• In trust-based transactions: Reputation helps avoid malicious or unreliable entities.

4. Technical Definition (from notes)

“Trust of a party A to a party B for a service X is the measurable belief of A that B behaves
dependably for a specified period within a specified context.”

This means trust is not random; it is measured, specific, and time-bound.
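
As a small illustration of these ideas, the sketch below combines time-decayed feedback into a single reputation score and then picks the most trusted provider (service selection); the provider names, ratings, and decay factor are arbitrary assumptions.

```python
from dataclasses import dataclass

# Minimal reputation-score sketch: newer feedback counts more than older
# feedback (exponential decay). All values below are illustrative only.
DECAY = 0.9   # weight multiplier per time step into the past (assumed)

@dataclass
class Feedback:
    rating: float   # 0.0 (bad) .. 1.0 (good)
    age: int        # how many time steps ago the interaction happened

def reputation(history: list) -> float:
    """Weighted average of past ratings, favouring recent behaviour."""
    if not history:
        return 0.0
    weights = [DECAY ** fb.age for fb in history]
    return sum(w * fb.rating for w, fb in zip(weights, history)) / sum(weights)

providers = {
    "CSP-A": [Feedback(1.0, 0), Feedback(0.9, 1), Feedback(0.2, 8)],
    "CSP-B": [Feedback(0.4, 0), Feedback(1.0, 9)],
}

scores = {name: reputation(h) for name, h in providers.items()}
print(scores)
print("Select provider:", max(scores, key=scores.get))  # highest score wins
```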

5. Design Goals of a Good Reputation System

• Should be accurate and tamper-proof.

• Should adapt to changing behavior over time.

• Should support decentralized evaluation, especially in distributed cloud systems.

Conclusion

Reputation systems are essential tools in cloud environments to build trust and ensure
secure interactions between unknown entities. They combine past performance, policies,
and recommendations to help in decision making.
What are the various system issues for running a typical parallel program in either
parallel or distributed manner?

To run a parallel program effectively on a parallel or distributed system, several important system issues must be addressed. These are necessary to ensure smooth execution, coordination, and proper utilization of computing resources.

The key issues are:

1. Partitioning

a) Computation Partitioning

• The given program is split into smaller tasks.

• These tasks are distributed to run simultaneously on different workers.

• It requires identifying parallel parts of the program that can run independently.

b) Data Partitioning

• The input or intermediate data is divided into smaller parts.

• Each part is processed by a different worker.

• This allows parallel data processing and improves speed.

2. Mapping

• Assigns tasks or data to specific computing resources.

• The aim is to distribute load evenly and make efficient use of resources.

• Usually handled by resource allocators in the system.

3. Synchronization

• Ensures that workers coordinate properly.

• Prevents race conditions (when two workers access the same data simultaneously).

• Maintains data dependency so that a worker waits for data from another if needed.
4. Communication

• Workers often need to exchange data during execution.

• Communication is mainly required when there is data dependency.

• Efficient communication methods are essential for better performance.

5. Scheduling

• Decides which task runs when and on which worker.

• If there are more tasks than available resources, the scheduler prioritizes them.

• Follows specific scheduling policies to improve system performance.
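
To make these issues concrete, here is a minimal sketch using Python's multiprocessing module: the input is partitioned, the partitions are mapped onto a pool of workers, and the partial results are communicated back and combined. The data set and chunk count are illustrative values only.

```python
from multiprocessing import Pool

# Minimal illustration of partitioning, mapping, and communication for a
# parallel sum. The data set and number of chunks are arbitrary.
def partition(data, chunks):
    """Data partitioning: split the input into roughly equal parts."""
    size = (len(data) + chunks - 1) // chunks
    return [data[i:i + size] for i in range(0, len(data), size)]

def worker(part):
    """Computation on one partition (runs in a separate worker process)."""
    return sum(part)

if __name__ == "__main__":
    data = list(range(1_000_000))
    parts = partition(data, chunks=4)           # partitioning

    with Pool(processes=4) as pool:             # mapping: parts -> workers
        partial_sums = pool.map(worker, parts)  # communication of results

    # The final combination step acts as a synchronization point.
    print("Total:", sum(partial_sums))
```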

Conclusion

In summary, the main system issues for running a parallel program are:

• Partitioning

• Mapping

• Synchronization

• Communication

• Scheduling

All these ensure the program runs efficiently across multiple computing resources, either
in a parallel or distributed environment.
 Instead of moving data around, cloud sends programs to the data.
 This saves time and improves internet speed.
 Virtualization helps use resources better and cuts costs.
 Companies don’t need to set up or manage servers themselves.
 Cloud provides hardware, software, and data only when needed.
 The goal is to replace desktop computing with online services.
 Cloud can run many different apps at the same time easily.

4.1.1.1 Centralized versus Distributed Computing

 Cloud computing is distributed using virtual machines in big data centers.
 Public and private clouds work over the Internet.
 Big companies like Amazon, Google, and Microsoft build distributed cloud systems for speed, reliability, and legal reasons.
 Private clouds (within companies) can connect to public clouds to get more resources.
 People may worry about using clouds in other countries unless strong agreements (SLAs) are made.

4.1.1.2 Public Clouds

 A public cloud is available to anyone who pays for it.
 It is run by cloud providers (like Google, Amazon, Microsoft, IBM, Salesforce).
 Users subscribe to use services like storage or computing power.
 Public clouds let users create and manage virtual machines online.
 Services are charged based on usage (pay-as-you-go).

4.1.1.3 Private Clouds

 A private cloud is built and used within one organization (not public).
 It is owned and managed by the company itself.
 Only the organization and its partners can access it — not the general public.
 It does not sell services over the Internet like public clouds do.
 Private clouds give flexible, secure, and customized services to internal users.
 They allow the company to keep more control over data and systems.
 Private clouds may affect cloud standard rules, but offer better customization for the company.

4.1.1.4 Hybrid Clouds

 A hybrid cloud combines both public and private clouds.
 It allows a company to use its private cloud but also get extra power from a public cloud when needed.
 Example: IBM’s RC2 connects private cloud systems across different countries.
 Hybrid clouds give access to the company, partners, and some third parties.
 Public clouds offer flexibility, low cost, and standard services.
 Private clouds give more security, control, and customization.
 Hybrid clouds balance the two, making compromises between sharing and privacy.

4.1.1.5 Data-Center Networking Structure

 The core of a cloud is a server cluster made of many virtual machines (VMs).
 Compute nodes do the work; control nodes manage and monitor cloud tasks.
 Gateway nodes connect users to the cloud and handle security.
 Clouds create virtual clusters for users and assign jobs to them.
 Unlike old systems, clouds handle changing workloads by adding or removing resources as needed.
 Private clouds can support this flexibility if well designed.
 Private clouds balance workloads within the company’s network for better efficiency.
 Private clouds offer better security, privacy, and testing environments.
 Public clouds help avoid big upfront costs in hardware, software, and staff.
 Companies often start by virtualizing their systems to reduce operating costs.
 Big companies (like Microsoft, Oracle, SAP) use policy-based IT management to improve services.
 IT as a service boosts flexibility and avoids replacing servers often.
 This leads to better IT efficiency and agility for companies.

4.1.3 Infrastructure-as-a-Service (IaaS)

 IaaS means renting IT infrastructure like servers, storage, and networks over the Internet.
 It provides virtual machines, storage, networks, and firewalls.
 Users can choose their own operating system and software.
 Users don’t manage physical hardware, only virtual resources.
 It’s a pay-as-you-go model – no need to buy expensive equipment.
 IaaS is scalable – add or remove resources anytime.
 Great for startups, developers, and large businesses.
 Helps with testing, hosting apps, data backup, and disaster recovery.
 Examples of IaaS providers:

o Amazon EC2, S3
o Microsoft Azure VMs
o Google Compute Engine
o IBM Cloud
o Oracle Cloud Infrastructure (OCI)

 Saves money and time by avoiding hardware setup.
 Ideal for companies needing flexible and powerful IT resources.

4.1.3 Platform-as-a-Service (PaaS)

 PaaS provides a platform to build, test, and deploy applications.
 It includes tools, libraries, databases, and runtime environments.
 Developers don’t manage servers, storage, or infrastructure.
 Focus is only on writing and running code.
 It handles app hosting, scaling, updates, and security automatically.
 Faster development because everything is ready to use.
 Great for developers and software teams.
 Useful for web apps, mobile apps, and APIs.
 Examples of PaaS providers:

o Google App Engine
o Microsoft Azure App Service
o Heroku
o IBM Cloud Foundry
o Red Hat OpenShift

 Pay only for what you use – no upfront setup or hardware costs.
 Helps teams collaborate easily and launch apps faster.

4.1.4 Software-as-a-Service (SaaS)

 SaaS means using software over the internet.
 No need to install or update anything.
 Accessible from any device with internet.
 Software is managed by the provider, not the user.
 Users pay monthly or yearly (subscription model).
 No hardware or server needed by the user.
 Used for email, file sharing, CRM, video calls, etc.
 Data is stored in the cloud by the provider.
 Saves time, money, and effort.
 Great for businesses and individuals.
 Examples of SaaS:

o Gmail
o Google Docs
o Microsoft 365
o Salesforce
o Zoom



4. Searchable Symmetric Encryption (SSE) – Protects database queries from explicit data leakage while enabling single-keyword, multi-keyword, ranked, and Boolean searches.

5. Private Cloud Risks – While firewalls protect against outsiders, insider threats remain a concern. Access restrictions and monitoring help mitigate risks.

By utilizing OPE and SSE, encrypted databases can support efficient searches while enhancing data security. However, insider threats and query pattern exposure require additional safeguards.

SECURITY OF DATABASE SERVICES

DBaaS allows cloud users to store and manage their data, but security risks include data integrity, confidentiality, and availability concerns.

Major Security Threats:

1. Authorization & Authentication Issues – Weak access controls can lead to data leaks or unauthorized modifications.

2. Encryption & Key Management – Poor encryption handling exposes data to external attacks.

3. Insider Threats – Superusers with excessive privileges may misuse confidential data.

4. External Attacks – Methods like spoofing, sniffing, man-in-the-middle, and DoS attacks can compromise cloud databases.

5. Multi-Tenancy Risks – Shared environments increase data recovery vulnerabilities if proper sanitation isn’t enforced.

6. Data Transit Risks – Without encryption, data transfer over public networks is vulnerable.

7. Data Provenance Challenges – Tracking data origin and movement requires complex metadata analysis.

8. Lack of Transparency – Users may not know where their data is stored, complicating security assessments.

9. Replication & Consistency Issues – Synchronizing data across multiple cloud locations is difficult.

10. Auditing & Compliance Risks – Third-party audits can violate privacy laws if data is stored in restricted locations.

Mitigation Strategies:

• Implement strong authentication and authorization protocols.
• Use robust encryption for stored and transmitted data.
• Restrict superuser access and enforce logging and monitoring.
• Conduct regular audits while ensuring legal compliance.
• Optimize data replication and consistency mechanisms for reliability.

Cloud databases enhance efficiency, but proper security measures are essential to prevent unauthorized access, data breaches, and operational failures.

OPERATING SYSTEM SECURITY

An OS manages hardware resources while protecting applications from malicious attacks like unauthorized access, code tampering, and spoofing. Security policies include access control, authentication, and cryptographic protection.

Key Security Concerns:

1. Mandatory vs. Discretionary Security – Mandatory policies enforce strict security, while discretionary policies leave security decisions to users, increasing risks.

2. Trusted Paths & Applications – Trusted software needs secure communication mechanisms to prevent impersonation.

3. OS Vulnerabilities – Commodity OSs often lack multi-layered security, making them susceptible to privilege escalation.

4. Malicious Software Threats – Java Security Manager uses sandboxing but cannot prevent all security bypasses.

5. Closed vs. Open Systems – ATMs, smartphones, and game consoles have embedded cryptographic keys for stronger authentication.

6. Weak Isolation Between Applications – A compromised app can expose the entire system.

7. Application-Specific Security – Certain applications, like e-commerce, require extra protection like digital signatures.

8. Challenges in Distributed Computing – OS security gaps affect application authentication and secure user interactions.

A secure OS is crucial, but additional security measures like encryption, auditing, and authentication are necessary for comprehensive protection.

VIRTUAL MACHINE SECURITY

Virtual Machine (VM) security primarily relies on hypervisors for isolation and access control, reducing risks compared to traditional OS security.

Key Aspects of VM Security:

1. Hypervisor-Based Security – Ensures memory, disk, and network isolation for VMs.

2. Trusted Computing Base (TCB) – A compromised TCB affects entire system security.

3. VM State Management – Hypervisors can save, restore, clone, and encrypt VM states.

4. Attack Prevention – Dedicated security VMs and intrusion detection systems enhance protection.

5. Inter-VM Communication – Faster than physical machines, enabling secure file migration.

Security Threats:

Hypervisor-Based Threats:

• Resource starvation & DoS due to misconfigured limits or rogue VMs.
• VM side-channel attacks exploiting weak inter-VM isolation.
• Buffer overflow vulnerabilities in hypervisor-managed processes.

VM-Based Threats:

• Deployment of rogue or insecure VMs due to weak administrative controls.
• Tampered VM images from insecure repositories lacking integrity checks.

Mitigation Strategies:

• Enforce strong access controls and isolate inter-VM traffic.
• Use digitally signed VM images to ensure integrity.
• Implement intrusion detection & prevention systems for proactive security.

Virtualization enhances security but requires proper configurations, access control, and monitoring to prevent exploits.
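
As an illustration of the "digitally signed VM images" point above, the sketch below verifies a downloaded image against a keyed SHA-256 digest. The file name and shared key are simplified assumptions; a production system would use asymmetric signatures from the image publisher and a proper key store.

```python
import hashlib
import hmac

# Sketch of checking VM image integrity before deployment. File names and the
# shared key are illustrative; real deployments would verify an asymmetric
# signature from the image publisher rather than a shared-key HMAC.
SHARED_KEY = b"replace-with-a-real-key"

def image_digest(path: str) -> bytes:
    """SHA-256 digest of the image file, streamed in 1 MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1024 * 1024), b""):
            h.update(block)
    return h.digest()

def sign(path: str) -> str:
    """Publisher side: produce a keyed signature over the image digest."""
    return hmac.new(SHARED_KEY, image_digest(path), hashlib.sha256).hexdigest()

def verify(path: str, expected_sig: str) -> bool:
    """Consumer side: recompute and compare in constant time."""
    return hmac.compare_digest(sign(path), expected_sig)

if __name__ == "__main__":
    sig = sign("ubuntu-22.04.qcow2")            # hypothetical image file
    print("Image OK:", verify("ubuntu-22.04.qcow2", sig))
```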
Architecture of MapReduce in Hadoop

The topmost layer of Hadoop is the MapReduce engine that manages the data flow and control flow of MapReduce jobs over distributed computing systems.

Figure 6.11 shows the MapReduce engine architecture cooperating with HDFS.

➢ Similar to HDFS, the MapReduce engine also has a master/slave architecture consisting of a single JobTracker as the master and a number of TaskTrackers as the slaves (workers).
➢ The JobTracker manages the MapReduce job over a cluster and is responsible for monitoring jobs and assigning tasks to TaskTrackers.
➢ The TaskTracker manages the execution of the map and/or reduce tasks on a single computation node in the cluster.
➢ Each TaskTracker manages multiple execution slots based on CPU threads (M * N slots).
➢ Each data block is processed by one map task, ensuring a direct one-to-one mapping between map tasks and data blocks.

Running a Job in Hadoop

➢ Job Execution Components: A user node, a JobTracker, and multiple TaskTrackers coordinate a MapReduce job.
➢ Job Submission: The user node requests a job ID, prepares input file splits, and submits the job to the JobTracker.
➢ Task Assignment: The JobTracker assigns map tasks based on data locality and reduce tasks without locality constraints.
➢ Task Execution: The TaskTracker runs tasks by copying the job's JAR file and executing it in a Java Virtual Machine (JVM).
➢ Task Monitoring: Heartbeat messages from TaskTrackers inform the JobTracker about their status and readiness for new tasks.

Dryad and DryadLINQ from Microsoft

Two runtime software environments are reviewed in this section for parallel and distributed computing, namely Dryad and DryadLINQ, both developed by Microsoft.

Dryad

➢ Flexibility Over MapReduce: Dryad allows users to define custom data flows using directed acyclic graphs (DAGs), unlike the fixed structure of MapReduce.
➢ DAG-Based Execution: Vertices represent computation engines, while edges are communication channels. The job manager assigns tasks and monitors execution.
➢ Job Manager & Name Server: The job manager builds, deploys, and schedules jobs, while the name server provides information about available computing resources.
➢ 2D Pipe System: Unlike traditional UNIX pipes (1D), Dryad's 2D distributed pipes enable large-scale parallel processing across multiple nodes.
➢ Fault Tolerance: Handles vertex failures by reassigning jobs and channel failures by recreating communication links.
➢ Broad Applicability: Supports scripting languages, MapReduce programming, and SQL integration, making it a versatile framework.
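
To make the map and reduce phases described in the Hadoop notes above concrete, here is a minimal local word-count simulation of the split → map → shuffle/sort → reduce flow. It is only a sketch: on a real cluster the JobTracker/TaskTracker (or YARN) machinery runs many map and reduce tasks in parallel, and the splits come from HDFS blocks rather than an in-memory list.

```python
# Word count expressed as map and reduce functions, with a tiny local driver
# that simulates split -> map -> shuffle/sort -> reduce.
from collections import defaultdict

def map_phase(chunk: str):
    """Map task: one input split -> (word, 1) pairs."""
    for word in chunk.split():
        yield word, 1

def reduce_phase(word, counts):
    """Reduce task: all values for one key -> a single (word, total) pair."""
    return word, sum(counts)

if __name__ == "__main__":
    splits = ["the quick brown fox", "the lazy dog", "the fox"]

    # Shuffle/sort: group intermediate pairs by key before reducing.
    grouped = defaultdict(list)
    for split in splits:                 # each split would be one map task
        for word, one in map_phase(split):
            grouped[word].append(one)

    for word in sorted(grouped):         # each key group would feed a reducer
        print(reduce_phase(word, grouped[word]))
```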
PROGRAMMING SUPPORT OF GOOGLE APP ENGINE

Supported Languages
• Java: Comes with Eclipse plug-in and GWT (Google Web Toolkit).
• Python: Supports Django, CherryPy, and Google’s built-in webapp environment.
• Other Languages: JVM-based interpreters enable JavaScript and Ruby compatibility.

Data Management
• NoSQL Data Store: Schema-less, entity-based system (max size: 1MB per entity).
• Java Interfaces: JDO & JPA via Data Nucleus Access platform.
• Python Interface: SQL-like GQL.
• Transactions: Strong consistency with optimistic concurrency control.
• Memcache: Speeds up data retrieval; works independently or with the datastore.
• Blobstore: Supports large files up to 2GB.

External Connectivity & Communication
• Secure Data Connection (SDC): Allows tunneling between intranet and GAE.
• URL Fetch Service: Fetches web resources via HTTP/HTTPS.
• Google Data API: Enables interaction with services like Maps, Docs, YouTube, Calendar.
• Google Accounts Authentication: Simplifies user login using existing Google accounts.
• Email Mechanism: Built-in system for sending emails from GAE applications.

Task & Resource Management
• Cron Service: Automates scheduled tasks (daily, hourly, etc.).
• Task Queues: Allows asynchronous background processing.
• Resource Quotas: Controls resource consumption, ensuring budget adherence.
• Free Tier: Usage is free up to certain quota limits.

Google File System
• Fundamental storage service for Google’s search engine.
• Designed for massive data storage on commodity hardware.
• Optimized for large file sizes (100MB to several GB).
• Not a traditional POSIX-compliant file system but offers a customized API for Google applications.

Key Features
• 64MB block size (much larger than traditional 4KB blocks).
• Optimized for sequential writes & streaming reads (minimal random access).
• Reliability through replication (each chunk stored on at least three chunk servers).
• Single master manages metadata and coordinates file access.
• Shadow master provides backup to prevent single-point failure.
• No caching (as large-scale sequential reads/writes do not benefit from locality).
• Supports snapshot & record append operations.
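
The 64MB chunking and three-way replication can be illustrated with a short sketch; the file size, server names, and round-robin placement policy are simplified assumptions (the real GFS master also weighs disk utilization, rack placement, and load).

```python
# Sketch of GFS-style chunking and replica placement: split a file into 64 MB
# chunks and assign each chunk to three different chunk servers.
CHUNK_SIZE = 64 * 1024 * 1024          # 64 MB chunks
REPLICAS = 3                           # each chunk stored on 3 chunk servers
CHUNK_SERVERS = [f"chunkserver-{i}" for i in range(6)]   # hypothetical hosts

def place_chunks(file_size: int):
    """Return {chunk_index: [servers]} for a file of the given size."""
    num_chunks = (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE
    placement = {}
    for idx in range(num_chunks):
        # Round-robin placement onto distinct servers (real GFS also considers
        # disk usage and rack locality when choosing replicas).
        placement[idx] = [CHUNK_SERVERS[(idx + r) % len(CHUNK_SERVERS)]
                          for r in range(REPLICAS)]
    return placement

if __name__ == "__main__":
    for idx, servers in place_chunks(300 * 1024 * 1024).items():  # ~300 MB file
        print(f"chunk {idx}: {servers}")
```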
• Purpose – Open-source cloud computing framework for scalable and secure infrastructure.
• Community – Involves technologists, developers, researchers, and industries sharing technologies.
• Key Components:
o Compute: Manages large groups of virtual private servers.
o Storage: Offers scalable object storage using clusters of commodity servers.
• Recent Development:
o Image repository prototyped.
o Image registration & discovery service – Locates and manages stored images.
o Image delivery service – Transfers images to compute services from storage.
• Future Direction – Expanding service integration and enhancing its ecosystem.

OpenStack Compute

➢ OpenStack Nova is a cloud fabric controller within the OpenStack IaaS ecosystem, designed for efficient computing support.
➢ It follows a shared-nothing architecture, relying on message queues for communication and using deferred objects to prevent blocking.
➢ The system maintains state through a distributed data system with atomic transactions for consistency.
➢ Built in Python, Nova integrates boto (Amazon API) and Tornado (fast HTTP server for S3).
➢ The API Server processes requests and forwards them to the Cloud Controller, which handles system state, authorization (via LDAP), and node management.
➢ Nova also includes network components such as NetworkController (VLAN allocation), RoutingNode (NAT conversion & firewall rules), AddressingNode (DHCP services), and TunnelingNode (VPN connectivity), with network state stored in a distributed object store managing IP and subnet assignments. This ensures scalability, security, and efficient cloud resource management.

OpenStack Storage

OpenStack's storage solution consists of several interconnected components.

➢ The proxy server routes requests and facilitates lookups for accounts, containers, and objects.
➢ Rings map entity names to physical locations while using zones, devices, partitions, and replicas for fault tolerance and resource isolation.
➢ Object Server stores, retrieves, and deletes binary objects with metadata, but requires specific file system support.
➢ Container Server lists objects, while Account Server manages container listings.
➢ The system ensures resilience through replication, updaters, and auditors while balancing partitions across heterogeneous storage clusters.
➢ OpenStack’s first release, Austin, launched on October 22, 2010, supported by an active developer community.

Manjrasoft Aneka Cloud and Appliances

Developed by Manjrasoft (Melbourne, Australia), Aneka is a cloud application platform for deploying parallel & distributed applications on private or public clouds.

• Deployment Options – Can be hosted on Amazon EC2 (public cloud) or a private cloud with restricted access.
• Key Advantages:
o Supports multiple programming & application environments.
o Allows simultaneous execution across multiple runtime environments.
o Provides rapid deployment tools & frameworks.
o Enables dynamic resource allocation based on QoS/SLA requirements.
o Built on Microsoft .NET, with Linux support via Mono.
• Three Core Capabilities:
1. Build – Includes an SDK with APIs & tools for developing applications. Supports enterprise/private clouds, Amazon EC2, and hybrid clouds.
2. Accelerate – Allows rapid deployment across multiple runtime environments (Windows/Linux/UNIX), optimizing local resources & leasing extra capacity from public clouds when needed.
3. Manage – Offers tools (GUI & APIs) for monitoring, managing, and scaling Aneka compute clouds dynamically based on SLA/QoS requirements.
• Programming Models Supported:
1. Thread Programming Model – Best for utilizing multicore nodes in cloud computing.
2. Task Programming Model – Ideal for prototyping independent task-based applications.
3. MapReduce Programming Model – Efficient for large-scale data processing & computation.

Aneka Architecture
