CC MQP Solutions


Q.01 a. Explain the Platform Evolution of different computer technologies with a neat diagram.
The Platform Evolution
Computer technologies have evolved over five generations, with each lasting 10 to 20 years. The
transitions were not sudden; there was often a 10-year overlap between generations.

Generation-wise Platform Evolution:

1. Mainframe Era (1950–1970):

o Built to serve large businesses and governments.

o Examples: IBM 360, CDC 6400.

o Large centralized computers with limited access.

2. Minicomputer Era (1960–1980):

o Lower-cost systems for smaller businesses and colleges.

o Examples: DEC PDP 11, VAX Series.

o More interactive and affordable.

3. Personal Computer (PC) Era (1970–1990):

o Rise of personal computing with VLSI microprocessors.

o Widespread usage in homes, schools, and offices.

4. Portable Devices Era (1980–2000):

o Growth of laptops, PDAs, and wireless devices.

o Enabled mobility and ubiquitous computing.


5. HPC and HTC Era (1990–Present):

o Use of High-Performance Computing (HPC) and High-Throughput Computing (HTC).

o Technologies: Clusters, Grids, Cloud Computing.

o Used in both scientific and commercial web-scale applications.

Current Trends in Computing:

• Focus on web-based shared resources and big data.

• Supercomputers (MPPs) are gradually being replaced by clusters of homogeneous nodes.

• HTC systems emphasize peer-to-peer (P2P) sharing and cloud/web services.

Q.01 b. Outline eight reasons to adapt the cloud for upgraded Internet
applications and web services.

1. Desired location in areas with protected space and higher energy efficiency
2. Sharing of peak-load capacity among a large pool of users, improving overall utilization
3. Separation of infrastructure maintenance duties from domain-specific application
development
4. Significant reduction in cloud computing cost, compared with traditional computing paradigms
5. Cloud computing programming and application development
6. Service and data discovery and content/service distribution
7. Privacy, security, copyright, and reliability issues
8. Service agreements, business models, and pricing policies
Q.01 c. Briefly explain Message Passing Interface (MPI).

❖ It is a standard library used to allow communication between multiple processes in parallel computing.
❖ It is commonly used in supercomputers and clusters for high-performance tasks.
❖ Processes work independently and exchange data through message passing.
❖ Functions like MPI_Send and MPI_Recv are used to send and receive messages.
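A minimal point-to-point sketch is shown below using the mpi4py Python binding (the answer names the C calls MPI_Send and MPI_Recv; comm.send and comm.recv are their mpi4py equivalents). This is an illustrative sketch only, assuming mpi4py and an MPI runtime are installed.

```python
# Minimal sketch of MPI point-to-point messaging via the mpi4py binding.
# The C API uses MPI_Send/MPI_Recv; mpi4py exposes comm.send/comm.recv.
# Run with (for example): mpiexec -n 2 python mpi_hello.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()          # unique ID of this process

if rank == 0:
    comm.send({"msg": "hello from rank 0"}, dest=1, tag=11)   # like MPI_Send
elif rank == 1:
    data = comm.recv(source=0, tag=11)                        # like MPI_Recv
    print("rank 1 received:", data)
```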
OR
Q.2 a. Summarize VM Primitive Operations with relevant diagram.
• The VMM (Virtual Machine Monitor) gives a virtual machine (VM) view to the guest operating
system.
• With full virtualization, the VMM gives a VM that looks exactly like a real machine.

• This allows standard operating systems like Windows 2000 or Linux to run as if they are on actual
hardware.

• Mendel Rosenblum explained the low-level VMM operations, as shown in Figure 1.13.

Basic VM Operations:

1. A VM can be shared (multiplexed) between different physical hardware machines. (Figure 1.13(a))

2. A VM can be paused (suspended) and saved to stable storage. (Figure 1.13(b))

3. A suspended VM can be resumed or moved to a new hardware platform. (Figure 1.13(c))

4. A VM can be moved (migrated) from one hardware machine to another. (Figure 1.13(d))
Benefits:

• VMs can be used on any available hardware platform.

• Easy to move distributed applications.

• Helps use server resources better.

• Many server functions can run on one hardware machine.

• Avoids too many physical servers (server sprawl).

• VMware said this method can increase server use from 5–15% to 60–80%.

Q.02 b. Illustrate various system attacks and network threats to cyberspace, resulting in four types of losses, with a neat diagram.
Threats to Systems and Networks

• Clusters, grids, clouds, and P2P systems must be protected to be trusted.

• Network viruses have caused major damage to routers and servers.

• These attacks have caused large financial losses in business and government.

• Information leaks cause loss of confidentiality.

• Data integrity may be lost due to user changes, Trojan horses, and spoofing attacks.

• Denial of Service (DoS) attacks stop systems from working and disrupt Internet connections.

• Attackers may misuse systems when there is no proper authentication.

• Open systems like data centers and P2P networks are easy targets for attackers.
• Attacks can damage computers, networks, and storage systems.

• Network issues in routers and gateways reduce trust in public systems.

Loss of Confidentiality

• Happens when private information is exposed without permission.

• Caused by actions like eavesdropping, traffic analysis, or EM/RF interception.

Loss of Integrity

• Occurs when data is modified, tampered, or misused.

• Caused by penetration, masquerade, bypassing controls, and no proper authorization.

Loss of Availability

• Happens when systems or services become unavailable to users.

• Caused by DoS (Denial of Service), Trojan Horses, or service spoofing attacks.

Improper Authentication / Illegitimate Use

• Happens when attackers gain access without proper login or rights.

• Leads to misuse of resources and data theft through weak or missing authentication.
MODULE-02
Q.03 a. Demonstrate the architecture of a computer system before and
after virtualization.

Before Virtualization:

Only one operating system runs directly on the hardware.

All applications run on this single OS.

If the OS crashes, all applications stop working.

Hardware usage is low and not efficient.

Cannot run multiple OS (like Windows and Linux) on the same system.

No separation between apps one faulty app can affect others.

Software testing or running different environments is not possible.

Adding new apps or OS requires system reboot or reinstall.

Not suitable for cloud, data centers, or multi-user environments.

Overall, less flexible and harder to manage.

After Virtualization:

A Hypervisor (Virtual Machine Monitor) runs on top of hardware.

You can create multiple Virtual Machines (VMs) on one system.

Each VM has its own OS and apps, and runs independently.

If one VM fails, others are not affected.

Hardware is used better by sharing across VMs.


You can run Windows, Linux, etc. together on the same machine.

Easy to move, copy, or back up VMs.

Ideal for testing, cloud services, and hosting.

Server load is balanced and managed well.

Overall, more flexible, cost-saving, and easy to manage.

Q.03 b. Compare Physical versus Virtual Clusters.

1. Physical clusters are made of real physical machines connected through a network; virtual clusters are made of virtual machines running on one or more physical servers.

2. In a physical cluster each node needs separate hardware (CPU, RAM, storage); in a virtual cluster multiple VMs share the same hardware resources.

3. Physical clusters have a high cost due to more physical equipment; virtual clusters are cost-effective as fewer physical machines are needed.

4. Physical clusters require large physical space and power supply; virtual clusters save space and energy as many VMs run on fewer machines.

5. Physical clusters are difficult to scale, since new hardware must be added manually; virtual clusters are easy to scale, since a new VM can simply be created.

6. Setup and maintenance of a physical cluster are time-consuming and complex; setup of a virtual cluster is faster using virtualization tools.

7. If one node of a physical cluster fails, it may impact the entire cluster; VM failures in a virtual cluster are isolated, so other VMs remain unaffected.

8. Physical clusters are less efficient, as some machines may stay idle; virtual clusters have high efficiency, since hardware is shared among multiple VMs.

9. Moving data or applications between physical nodes is harder; VMs can be moved easily between servers.

10. Physical clusters need manual monitoring and control of each physical server; virtual clusters are easier to manage using virtualization software (e.g., VMware, VirtualBox).

OR
Q.04 a. Construct the Live migration process of a VM from one host to
another.

VM States Before Migration

Active: the VM is running and performing tasks.
Paused: the VM is created but currently not processing.
Suspended: the VM's data is stored to disk and the VM is inactive.

Steps in Live VM Migration

1. Pre-Migration (Step 0)
VM runs on Host A.
Destination Host B is selected and prepared.

2. Reservation (Step 1)
A container for the VM is initialized on Host B.

3. Iterative Pre-Copy (Step 2)


Memory is copied in multiple rounds to Host B.
Changed memory (dirty pages) is re-copied.
VM continues running during this time.
4. Stop and Copy (Step 3)
VM on Host A is suspended.
Final memory and CPU/network state is transferred.
Network is redirected to Host B.
This short delay is called downtime.
5. Commit (Step 4)
VM's data is released from Host A.
Host A no longer controls the VM.
6. Activation (Step 5)
VM is started on Host B.
It resumes work and connects to local devices.
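To make the iterative pre-copy idea concrete, a small, purely illustrative Python simulation is given below; the page counts and the dirtying model are assumptions for demonstration, not a hypervisor API. It shows that most memory is transferred while the VM keeps running, and only the last, small dirty set is copied during the stop-and-copy downtime.

```python
import random

def simulate_precopy(num_pages=1000, rounds=5, threshold=20):
    """Toy simulation of iterative pre-copy live migration (Steps 2-3).
    Page counts and the dirtying model are invented for illustration only."""
    pages = set(range(num_pages))
    dirty = set(pages)                       # round 0: every page must be sent
    sent = 0
    for r in range(rounds):                  # Step 2: iterative pre-copy
        sent += len(dirty)                   # copy this round's dirty pages to Host B
        # while copying, the still-running VM dirties a shrinking set of pages
        dirty = set(random.sample(sorted(pages), max(1, num_pages // (10 * (r + 1)))))
        if len(dirty) < threshold:           # dirty set small enough to stop early
            break
    # Step 3: stop-and-copy -- VM is suspended, remaining dirty pages sent (downtime)
    sent += len(dirty)
    return sent, len(dirty)

total_sent, downtime_pages = simulate_precopy()
print(f"pages sent in total: {total_sent}, pages sent during downtime: {downtime_pages}")
```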

Q.04 b. Develop an architecture of Livewire for intrusion detection using a dedicated VM.

Intrusion means unauthorized access to a system by local or network users.

Intrusion Detection System (IDS) is used to detect such unauthorized activities.

IDS can be of two types:

o HIDS (Host-based IDS): Runs on the same machine it's monitoring but is at
risk if the system is attacked.

o NIDS (Network-based IDS): Monitors network traffic but can't detect fake
(spoofed) actions.
In a virtualized system, guest VMs are isolated, so even if one VM is attacked, the others are not affected, similar to a NIDS.
The Virtual Machine Monitor (VMM) monitors and audits access requests, which helps detect faked actions and gives it the merits of a HIDS.

VM-based IDS can be implemented in two ways:

1. As a separate process in each VM or a high-privileged VM.

2. Integrated directly into the VMM with full hardware access.

A VM-based IDS includes:

o Policy Engine: Analyzes and applies security rules.

o Policy Module: Uses tools like PTrace to trace and enforce policies in guest
VMs.

It's hard to prevent intrusions immediately, so post-attack analysis is important.

Logs are used to study attack behavior, but if the OS is compromised, logs may be
untrustworthy.

Honeypots and Honeynets are also used:

o They attract attackers with fake systems to protect real systems.

o Can be physical or virtual.

o Virtual honeypots must ensure the VM can't attack the host or VMM.
MODULE-03
Q.05 a. Outline six design objectives for cloud computing.

1. Shift from desktop to data center


– Computing, storage, and software are moved from personal desktops to centralized data centers
via the Internet.

2. Service provisioning and cloud economics


– Cloud services are provided through SLAs (Service Level Agreements) with users.
– Pricing is based on a pay-as-you-go model, and services should use power and resources
efficiently.

3. Scalability in performance
– Cloud systems must support more users by scaling up performance as needed.

4. Data privacy protection


– Users should feel confident that cloud providers can keep their private data safe and secure.

5. High quality of cloud services


– Quality of Service (QoS) should be standardized to allow smooth working across different
providers.

6. New standards and interfaces


– Universal APIs and access protocols are needed to avoid data lock-in and ensure flexibility in
moving apps between cloud platforms.

Q.05 b. With a neat diagram, build a cloud ecosystem with a private cloud.
❖ A cloud ecosystem includes cloud providers, users, and technologies working together.

❖ Public clouds are commonly used and form the base of the cloud ecosystem.

❖ Private and hybrid clouds allow organizations to use both internal and public cloud resources.

❖ Users want flexible platforms to run services like websites and databases.

❖ Cloud management provides virtual resources over an IaaS platform.

❖ Virtual infrastructure management allocates virtual machines across server clusters.

❖ VM managers handle and control VMs running on physical machines like Xen, KVM, and VMware.

❖ Tools like OpenNebula, vSphere, Eucalyptus, and Nimbus are used to manage cloud systems.

❖ Many startup companies use cloud resources instead of building their own IT setups.
❖ Interfaces like Amazon EC2WS, Nimbus WSRF, and ElasticHosts REST are used to access cloud
services.

❖ VI tools also support load balancing, dynamic resizing, and efficient use of server resources.

Q.05 c. Organize Functional Modules of GAE.


• GAE is a Platform-as-a-Service (PaaS) used to build and run web applications on Google’s
cloud.

• It provides several important modules to support app development and hosting.

Functional Modules

1. Runtime Environment
– Runs applications written in Java, Python, Go, or PHP.

2. Datastore
– NoSQL database service for storing structured data.

3. Task Queues
– Handles background tasks without blocking user requests.

4. Memcache
– Provides fast, in-memory caching for frequently accessed data.

5. User Authentication
– Offers APIs to manage user login and identity.

6. App Versioning and Deployment


– Supports multiple versions of the app; easy deployment and rollback.
OR
Q.06 a. Identify basic requirements for managing the resources of a data center.
❖ Managing a data center means handling all its operations smoothly, securely, and efficiently.
❖ These management issues are based on real experiences in IT and cloud service industries.

1. User Satisfaction
– The system should give good service to users for many years (minimum 30 years).
– Quality of service (QoS) must be maintained always.

2. Controlled Information Flow


– Data must flow properly between systems without delay.
– The system must provide high availability (HA) and continuous service.

3. Multiuser Management
– The data center should support many users at the same time.
– It should handle activities like traffic control, database updates, and server monitoring.

4. Scalability
– As more users or data come in, the system should be ready to grow.
– Storage, processing power, I/O, power supply, and cooling must be easily expandable.

5. Reliability in Virtualized Systems


– The system must support fault tolerance, failover, and live migration of virtual machines.
– This helps recover quickly from hardware failure or disasters.

6. Cost Efficiency
– The total cost must be low for both cloud providers and users.
– This includes hardware, electricity, staff, and maintenance.

7. Security and Data Protection


– Strong security must be there to protect data from hackers or internal misuse.
– Data privacy and integrity must be maintained at all times.

8. Green Computing (Energy Efficiency)


– Power-saving systems should be used to reduce energy use.
– Eco-friendly designs help in saving operational costs and reducing pollution.

9. Service Automation
– Automated tools should manage routine tasks like backups, load balancing, and patch updates.
– This improves speed, accuracy, and reduces manual errors.

10. Monitoring and Reporting


– The system must continuously monitor performance and usage.
– Reports help in planning for upgrades, detecting issues, and ensuring smooth operation.
Q.06 b. Summarize six open challenges in cloud architecture development.

Challenge 1 – Service Availability and Data Lock-in Problem

1. If a cloud service fails, the whole system may stop, especially if run by a single company.

2. Using multiple cloud providers can increase service availability.

3. Proprietary APIs cause "lock-in" — users can't easily move apps/data between clouds.

Challenge 2 – Data Privacy and Security Concerns

1. Cloud systems are open to cyberattacks like DDoS, malware, and VM hijacking.

2. Data can be stolen or misused if not properly encrypted or protected.

3. Some countries require data to stay within their borders, adding legal issues.

Challenge 3 – Unpredictable Performance and Bottlenecks

1. VMs share CPU/memory well, but I/O (like disk access) causes slowdowns.

2. Large-scale applications face data transfer delays and traffic problems.

3. Bottlenecks must be avoided using better placement and hardware upgrades.

Challenge 4 – Distributed Storage and Software Bugs

1. Cloud systems need storage that can grow and shrink with demand.

2. Debugging cloud errors is hard because bugs appear only at a large scale.

3. Virtual machines and simulators can help collect useful debugging info.

Challenge 5 – Cloud Scalability, Interoperability, and Standardization

1. Cloud services must scale quickly without breaking SLAs.

2. Standard VM formats (like OVF) help run apps on different platforms.

3. Cross-platform migration between Intel/AMD is still difficult.

Challenge 6 – Software Licensing and Reputation Sharing

1. Cloud needs flexible software licenses (e.g., pay-per-use or bulk licensing).

2. One bad user can damage the whole cloud's reputation (e.g., IP blacklisting).

3. Legal liability between provider and customer must be handled in SLAs.


Q.06 c. Summarize the cloud services offered by AWS (Amazon Web Services).

1. AWS uses the IaaS (Infrastructure as a Service) model


– It gives virtual machines (VMs) through EC2 to run cloud applications.

2. S3 and EBS for storage


– S3 (Simple Storage Service) is used for object-based storage.
– EBS (Elastic Block Store) gives block-level storage for regular apps.

3. SQS and SNS for messaging


– SQS (Simple Queue Service) stores messages even if the receiver is offline.
– SNS (Simple Notification Service) is used for sending notifications.

4. Extra services for performance


– ELB (Elastic Load Balancer) spreads traffic across EC2 instances.
– CloudWatch monitors resources like CPU, memory, and network use.
– Auto Scaling adds or removes instances based on demand.
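As a small illustration of how these services are consumed programmatically, the sketch below uses the boto3 Python SDK (assuming it is installed and AWS credentials are configured); the bucket name, queue name, and file names are hypothetical.

```python
import boto3

# S3: object storage -- upload a local file as an object
s3 = boto3.client("s3")
s3.upload_file("report.csv", "example-bucket", "reports/report.csv")

# SQS: queued messaging -- messages are stored even if the receiver is offline
sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="example-queue")["QueueUrl"]
sqs.send_message(QueueUrl=queue_url, MessageBody="job-42 ready")
reply = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1)
print(reply.get("Messages", []))
```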
MODULE-04
Q.07 a. Demonstrate surfaces of attacks in a cloud computing environment with neat
diagram.

1. Starting cloud use is too easy


– Many users start using cloud services without understanding the security risks or ethics.
– This creates chances for misuse or unsafe actions.

2. Clouds can be used for large attacks


– Cloud systems can be misused to launch big cyber-attacks on other systems.
– Preventing such misuse is an important challenge.

3. Three types of risks in cloud


– (i) Traditional threats, (ii) Availability issues, (iii) Third-party data control risks.
– These problems make cloud computing risky without proper care.

4. Traditional threats
– These include DDoS attacks, phishing, SQL injection, cross-site scripting, etc.
– In clouds, these threats affect many users because resources are shared.

5. Authentication & user access issues


– Users from the same company may need different access levels.
– Mixing company security rules with cloud rules is not easy.

6. Cloud attacks are hard to trace


– It’s difficult to identify how a cloud system is attacked.
– Traditional investigation methods like logs don’t work well in clouds.
7. Service availability problems
– Power failure, hardware crash, or natural disasters can shut down cloud services.
– If data is locked inside the cloud, it affects companies badly.

8. Third-party trust issues


– Cloud providers may use untrusted vendors or low-quality hardware.
– Users lose data because they have no control or transparency.

9. Cloud providers are not responsible


– For example, AWS terms say they are not liable for data loss or failure.
– This creates risk for users because there’s no strong guarantee.

10. Top 7 threats (CSA Report 2010)


– Abuse of cloud, Insecure APIs, Malicious insiders, Shared tech issues, Account hijacking,
Data loss/leakage, and Unknown risk profile.
– IaaS is affected by all 7, PaaS and SaaS are affected by fewer.

Q.07 b. List out the top cloud security threats of CSA2016.

1. Data Breaches
– Unauthorized access to sensitive or confidential data.

2. Weak Identity, Credential and Access Management


– Poor password practices or stolen credentials lead to unauthorized access.

3. Insecure APIs (Application Programming Interfaces)


– Vulnerable APIs can be exploited by attackers to gain access.

4. System Vulnerabilities
– Bugs or flaws in software can allow attackers to exploit systems.

5. Account Hijacking
– Attackers use stolen credentials to take over accounts and services.

6. Malicious Insiders
– Employees or partners with access misuse their privileges.

7. Advanced Persistent Threats (APTs)


– Long-term targeted attacks aimed at stealing data or damaging systems.

8. Data Loss
– Accidental deletion, system failure, or lack of backups leads to permanent data loss.

9. Insufficient Due Diligence


– Organizations adopt cloud without fully understanding responsibilities or risks.
10. Abuse and Nefarious Use of Cloud Services
– Attackers use cloud resources for spamming, malware hosting, or launching DDoS attacks.

11. Denial of Service (DoS)


– Attackers overload cloud services, making them unavailable to legitimate users.

12. Shared Technology Vulnerabilities


– Risks from the multi-tenant model where many users share the same infrastructure.

Q.07 c. Select four widely accepted fair information practices with which "consumer-oriented commercial web sites that collect personal identifying information from or about consumers online" would be required to comply.
1. Notice

o Websites must clearly inform users about their data collection practices.

o This includes what data is collected, how it is collected (e.g., cookies), how it’s used, and if it is
shared with other entities.

2. Choice

o Users must be given choices on how their personal data is used.

o This includes both internal use (like marketing) and external use (sharing with third parties).

3. Access

o Users must be allowed to view, correct, or delete their personal information.

o This ensures transparency and user control over their data.

4. Security

o Websites must take reasonable steps to protect user data from theft or misuse.

o The approach should be technologically neutral and flexible for future developments.

OR
Q.08 a. Summarize the design goals of Xoar.

• Xoar is a modified version of Xen, designed to improve system security using microkernel
principles.

• It assumes trusted system administrators manage the system and threats mainly come from guest VMs or
bugs in the management code.

• It maintains all Xen functionalities while controlling privileges tightly—each component gets
only what it needs.

• Interfaces are minimized to reduce attack surfaces, and sharing is avoided or explicitly logged.

• Components run only when needed to reduce the time window for attacks.

• Xoar allows secure audit logging for better traceability.

• There are four types of components:


– Always running (e.g., XenStore-State)
– Used at boot time and then removed
– Loaded when requested
– Restarted on a timer

• Modular design reduces the risk and footprint of the system, with only a small performance
impact.

• Examples include: Builder (starts VMs), QEMU (device emulation), and drivers like PCIBack and
NetBack.
Q.08 b. Explain mobile devices and cloud security.

1. Mobile Cloud Ecosystem

o Mobile apps use cloud services for data storage, backups, and processing because
devices have limited CPU, memory, and storage.

2. Security Challenges on Mobile Devices

o Mobile devices often connect over public or untrusted Wi-Fi networks, which can be
intercepted by attackers.

o They are frequently lost or stolen, increasing the risk of unauthorized data access.

3. Authentication and Identity Management

o Weak device-side authentication (like reused or weak passwords) can let attackers
access cloud accounts.

o Modern best practices recommend using multi-factor authentication (MFA).

4. Data Protection in Transit and at Rest

o Data must be encrypted both while traveling (e.g., TLS/SSL) and while stored in the cloud.

o End-to-end encryption ensures only device users can read sensitive data.

5. App and API Security

o Malicious or vulnerable apps might access or leak user data saved in the cloud.

o Secure mobile apps need trustworthy APIs with proper access controls.

6. Device and App Management

o Enterprises use Mobile Device Management (MDM) tools to enforce encryption, require strong passcodes, push updates, and enable remote wipe capabilities.

o This protects data even if a device is lost or stolen.

Q.08 c. Model an overview of reputation system design options.

1. Centralized Reputation System


o One central authority collects, evaluates, and manages reputation scores.

o Easy to control and monitor but can be a single point of failure or target for attack.

2. Distributed Reputation System

o Reputation data is shared among peers or systems.


o No single control point, but more complex to manage consistency and trust.

3. User-Based Ratings

o Users give direct feedback (e.g., stars, likes, reviews) after a service or transaction.

o Simple to implement but can be manipulated using fake reviews or Sybil attacks.
4. Behavior-Based Monitoring

o The system monitors actual behavior (e.g., uptime, response time, data accuracy).

o More reliable and objective, but needs complex tracking and analytics.

5. Context-Aware Reputation

o Reputation scores depend on the specific service or environment (e.g., reliability in


storage vs. computation).

o Allows fine-grained trust decisions for different scenarios.

6. Time-Based Reputation

o Reputation fades over time if not updated, encouraging ongoing good behavior.

o Prevents users from building high scores and then acting maliciously later.

7. Incentive-Driven Models

o Users are rewarded (credits, trust scores) for providing accurate feedback or behaving
well.

o Encourages honesty but needs protection against misuse of incentives.
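Relating to option 6 above (time-based reputation), a tiny illustrative sketch of score decay is given below; the half-life and blending rule are assumptions chosen for demonstration, not a standard design.

```python
import time

HALF_LIFE = 30 * 24 * 3600      # assumed: reputation halves every 30 days

def decayed_score(score, last_update, now=None):
    """Score fades exponentially if no new feedback arrives."""
    age = (now or time.time()) - last_update
    return score * 0.5 ** (age / HALF_LIFE)

def record_feedback(score, last_update, feedback, weight=0.1):
    """Blend the decayed old score with new feedback in the range [0, 1]."""
    current = decayed_score(score, last_update)
    return (1 - weight) * current + weight * feedback, time.time()

# A score of 0.9 left unrefreshed for 60 days (two half-lives) decays to ~0.225.
print(round(decayed_score(0.9, time.time() - 60 * 24 * 3600), 3))
```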


MODULE-05
Q.09 a. Outline Important Cloud Platform Capabilities.

1. On-Demand Self-Service

o Users can access computing resources (like servers, storage) whenever they need, without
human help.

2. Broad Network Access

o Services are available over the internet and can be used from laptops, phones, or tablets.
3. Resource Pooling

o Cloud providers share resources (like storage, memory) among many users using
virtualization.

4. Rapid Elasticity

o Resources can be increased or decreased quickly based on need (auto-scaling).

5. Measured Service (Pay-as-You-Go)

o Users only pay for what they use (like mobile recharge) — helps save money.
6. High Availability

o Cloud platforms make sure services run 24/7 without downtime using backup and load
balancing.

7. Security and Compliance

o Provides user authentication, data encryption, firewalls, and follows government


policies.

8. Automation

o Many tasks like backups, updates, scaling can be done automatically without manual work.

9. Multi-Tenancy

o Multiple users can use the same cloud system securely and privately.

10. APIs and Developer Tools

• Easy-to-use tools for developers to build, test, and deploy applications on the cloud.

Q.09 b. Organize the steps involved in MapReduce.


1. Input Splitting

o The input data is split into small parts (blocks) for processing.

2. Map Function

o Each part is processed in parallel by the Map function.

o It converts data into key-value pairs.


(Example: “apple” becomes <apple, 1>)

3. Shuffling

o The system groups all values with the same key together.
(All <apple, 1> pairs are brought together.)

4. Sorting

o Keys are sorted before passing them to the reducer.

5. Reduce Function

o All grouped key-value pairs are processed by the Reduce function.

o It performs operations like counting, summing, or averaging.


(Example: <apple, [1,1,1]> becomes <apple, 3>)

6. Output Generation

o The final output is stored in the file system in key-value format.

Simple Example

Input: A list of words → ["apple", "apple", "banana"]

Map Output:
<apple, 1>, <apple, 1>, <banana, 1>

Shuffle + Reduce Output:


<apple, 2>, <banana, 1>
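A tiny, self-contained Python sketch of this word-count flow is shown below; it illustrates the map/shuffle/reduce idea in plain Python and is not the Hadoop API.

```python
from collections import defaultdict

def map_fn(word):
    return (word, 1)                       # Map: emit <word, 1>

def reduce_fn(word, counts):
    return (word, sum(counts))             # Reduce: sum all counts for the word

words = ["apple", "apple", "banana"]       # input split

mapped = [map_fn(w) for w in words]        # [('apple', 1), ('apple', 1), ('banana', 1)]

groups = defaultdict(list)                 # Shuffle/sort: group values by key
for key, value in sorted(mapped):
    groups[key].append(value)

output = [reduce_fn(k, v) for k, v in groups.items()]
print(output)                              # [('apple', 2), ('banana', 1)]
```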

OR
Q.10 a. Explain with a neat diagram how data flows in running a MapReduce job at
various task trackers using the Hadoop library.

Phase 1: Setup and Input

1. Data Partitioning

o Input file stored in HDFS is split into M pieces (Input Splits).

o Each split is assigned to a Map Task.

2. Computation Partitioning

o User writes Map() and Reduce() functions.

o Hadoop system forks user programs and distributes to workers.

3. Master and Workers Setup

o One instance becomes the Master (JobTracker).

o Others become Workers (TaskTrackers or NodeManagers).

o Master assigns Map/Reduce Tasks to workers.


Phase 2: Map Side Processing

4. Input Reading

o Each Map worker reads its split and passes it to the Map() function.

5. Map Function Execution

o Produces intermediate (key, value) pairs.

6. Combiner (Optional)

o Combines values locally (e.g., local sum) to reduce network data.

7. Partitioning Function

o Intermediate data is split into R partitions (one per Reduce task) using Hash(key) mod R (see the small sketch below).
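A minimal illustration of that partitioning rule follows; Python's built-in hash is used purely for demonstration (Hadoop uses its own partitioner class).

```python
R = 3                                        # number of Reduce tasks

def partition(key, num_reducers=R):
    # Hash(key) mod R decides which Reduce task receives this key
    return hash(key) % num_reducers

for key in ["apple", "banana", "cherry", "apple"]:
    print(key, "-> reduce task", partition(key))
# Within a run, the same key always lands in the same partition, so all of its
# intermediate values reach a single reducer.
```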
Phase 3: Shuffle and Reduce

8. Synchronization

o Reduce workers wait for all Map tasks to complete.

9. Communication

o Reduce workers fetch partitions from all Map workers using RPC.

10. Sorting & Grouping

• Keys are sorted and grouped (all values with same key together).

11. Reduce Function Execution

• Final results are written to HDFS output files.

Q.10 b. Explain the data mutation sequence in GFS with a diagram.


1. Client Requests Chunk Info

o Client contacts Master to ask which chunk server has the lease for the chunk and where other
replicas are.

2. Master Responds

o Master replies with:

▪ Identity of Primary replica

▪ Locations of Secondary replicas

o Client caches this info for future requests.

3. Client Pushes Data

o Client sends the data to all replicas (primary + secondary).

o Data is stored in a buffer cache at each chunk server.

o This step is decoupled from control flow for better performance.

4. Client Sends Write Request to Primary

o After all servers receive the data, the client informs the Primary to begin mutation.
o Primary assigns serial numbers to maintain write order.

5. Primary Forwards to Secondaries

o Primary sends write request to all secondary replicas, enforcing the same
serial order.

6. Secondaries Acknowledge

o Secondaries confirm mutation is applied successfully.

7. Primary Responds to Client

o After all secondaries reply, primary responds to the client.

o If any error occurs, the client:

▪ Marks write as failed

▪ Retries the mutation (steps 3–7 or restart from step 1 if needed)
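The control flow of steps 3-7 can be sketched as a toy, in-memory simulation; the class and method names below are invented for illustration and are not the real GFS API.

```python
# Toy simulation of the GFS write (mutation) control flow, steps 3-7.
class Replica:
    def __init__(self, name):
        self.name = name
        self.buffer = {}          # data pushed by the client (step 3)
        self.chunk = []           # mutations applied in serial order

    def push_data(self, data_id, data):
        self.buffer[data_id] = data                              # step 3

    def apply(self, serial, data_id):
        self.chunk.append((serial, self.buffer.pop(data_id)))    # apply mutation
        return True                                              # step 6: acknowledge

class Primary(Replica):
    def __init__(self, name, secondaries):
        super().__init__(name)
        self.secondaries = secondaries
        self.next_serial = 0

    def write(self, data_id):
        serial = self.next_serial                  # step 4: assign serial number
        self.next_serial += 1
        self.apply(serial, data_id)
        acks = [s.apply(serial, data_id) for s in self.secondaries]   # step 5
        return all(acks)                           # step 7: success only if all ack

secondaries = [Replica("s1"), Replica("s2")]
primary = Primary("primary", secondaries)

data_id, data = "d1", b"hello gfs"
for replica in [primary, *secondaries]:
    replica.push_data(data_id, data)               # step 3: client pushes data to all
print("write succeeded:", primary.write(data_id))  # step 4: client asks primary to commit
```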
