Scalability and Heterogeneity

Colin Perkins
http://csperkins.org/teaching/2004-2005/gc5/
Lecture Outline

• Review of Traditional Distributed Systems

• How is Grid Computing Different?
– Aspects of Heterogeneity
– Aspects of Scalability
– Implications for System Design
• Preparation for Tutorial 1

The aims of today:

• To understand why grid computing is difficult, and to raise a number of
issues to consider throughout the module
• To give more examples of Grid computing systems

Copyright © 2004 University of Glasgow

Review of Traditional Distributed Systems

“A distributed system is a collection of independent computers that
appears to its users as a single coherent system.”
[Tanenbaum & van Steen, 2002]

• The machines are autonomous, but the users think they’re dealing
with a single system
• Typically distributed systems are used to share resources within
an organization:
– Homogeneity eases management, fault tolerance, scheduling, authentication
– E.g. a departmental fileserver, database of exam marks

What if we break the assumption of homogeneity?

…we move from distributed systems to grid computing!
What is Grid Computing?

Infrastructure for Internet-scale Distributed Systems

• A software system, implemented in terms of a middleware layer,
that provides dependable, consistent, pervasive, and inexpensive
access to high-end computational capabilities
• A system that allows sharing of remote resources as-if they were
local, across geographical and organizational boundaries
• A large and widely distributed collection of independent systems
that appears to its users as a single coherent system

There are many definitions, depending on who you ask…

How is Grid Computing Different?

• A computational grid integrates disparate resources into a single
virtual organization
– Varying applications using the services of the Grid
– Large and varied amounts of data to be processed
– Varying classes of user, with different rights and responsibilities
– Running on a range of networks, using varying hardware and software
– Across different administrative and legal domains
• Implies a more heterogeneous environment and greater scaling
than traditional distributed systems

• How does this affect system design?

Heterogeneity comes from several factors:

• Users, applications and data
• Software and the hardware on which it runs
• Interconnection networks
• Organizations
Heterogeneous Users, Applications & Data

• Large scale grid computing started to service the needs of the
e-science community
• The EGEE project (“Enabling Grids for e-Science in Europe”)
– A typical e-science grid development project
– European Union funding: €30 million
– Building a computational grid for physics, health and bio-sciences, earth
sciences, astronomy, etc.
– Share resources of 70 sites in 27 countries; aiming for thousands of active
users, wide range of applications, lots of data
• The Grid2003 Production Grid for physics and astronomy [1]

Consider diversity of user locations and environment,
job processing, data, trust and security models…
Heterogeneous Users, Applications & Data

• Many of the concepts of grid computing are finding their way into
commercial applications
• Google, Amazon or iTunes

– Large database/commerce sites; significant financial value
– Accessed directly via web-site, or embedded in an application
– Worldwide user community; millions of users and transactions
• Business process automation
– Automatic inventory processing, ordering management
– E.g. airline reservation systems, stock trading and financial modelling

Consider diversity of user locations and environment,
job processing, data, trust and security models…
Heterogeneous Hardware and Software

• EGEE aims to allow users at 70 different sites to share data and run
distributed computational jobs
• Google is reputed to manage a distributed system of ~100,000
hosts around the world; caching the entire web in memory
• In large-scale grids like these, you cannot standardize on a
particular hardware or software environment
– By the time you’ve synchronised the system software, some hardware will
have failed and requirements will have changed
– Client software will vary widely
– Multiple versions of server and client software are likely to be in use

Design for compatibility, interoperability
and cross-platform operation
Heterogeneous Networks

• Grids are widely distributed systems connected by the Internet

• What are the characteristics of the Internet?
– Big and complex
– Best effort service; no performance guarantees
– Fragmented ownership

• Implication: the variation in the network will affect how we build
a computational grid
– Paper [2] discusses how network heterogeneity affects design and
modelling of new protocols
Heterogeneous Networks

• Big, and getting bigger:
– Size of the network has more than doubled every year
since the early 1980s
– Approximately 100 million hosts at end of year 2000
– What happens if 0.1% of hosts behave atypically?

• Traffic patterns shift rapidly:

– World-wide web: doubled every ~7 weeks for 2 years
– Mbone: some sites reported >50% traffic in 1995 was
multicast; now virtually none
– Peer-to-peer: many reports of network congestion due
to Napster, Kazaa, BitTorrent, etc.
– Worms and malicious traffic: Nimda; from release to
100 probes-per-second in 30 minutes
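
The figures on this slide can be turned into rough numbers. The sketch below is back-of-the-envelope arithmetic only. It assumes exactly 100 million hosts, treats "2 years" as 104 weeks, and models web traffic growth as a clean exponential; none of those simplifications come from the slides.

```python
# Back-of-the-envelope arithmetic for the figures quoted on this slide.
# Assumptions (not from the slides): 2 years is taken as 104 weeks, and
# growth is treated as a clean exponential.

HOSTS_END_2000 = 100_000_000      # approximate host count at end of 2000
ATYPICAL_FRACTION = 0.001         # 0.1% of hosts

atypical_hosts = round(HOSTS_END_2000 * ATYPICAL_FRACTION)

WEEKS = 2 * 52                    # two years of growth
DOUBLING_PERIOD_WEEKS = 7         # web traffic doubled every ~7 weeks

doublings = WEEKS / DOUBLING_PERIOD_WEEKS
growth_factor = 2 ** doublings

print(f"Atypical hosts: {atypical_hosts:,}")
print(f"Doublings in two years: {doublings:.1f}")
print(f"Traffic growth factor: about {growth_factor:,.0f}x")
```

Even a "rare" 0.1% corresponds to about 100,000 hosts, and doubling every 7 weeks for two years compounds to growth by a factor in the tens of thousands.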
Heterogeneous Networks

• At least 6 orders of magnitude variation in link speed:

– 9.6 kbps GSM wireless → 10 Gbps optical fibre
– Link capacity growing faster than Moore’s law
• At least 4 orders of magnitude variation in latency:
– Sub-millisecond LAN connections; hundreds of milliseconds worldwide
– Varies with queuing delay, network congestion, path changes
– A fundamental limiting factor for synchronous protocols (e.g. web services)
• Wide variation in packet loss rates:
[Figure: loss rate (percent, 0 to 45) plotted against time of day (BST), 8:00 to 18:00]
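
The link-speed range can be checked with a short calculation. This is a sketch under stated assumptions: the 1 megabyte payload is a hypothetical size chosen for illustration, and the transfer time counts serialisation only, ignoring latency, loss and protocol overhead.

```python
import math

# Link speeds quoted on the slide, in bits per second.
GSM_BPS = 9.6e3      # 9.6 kbps GSM wireless
FIBRE_BPS = 10e9     # 10 Gbps optical fibre

# Hypothetical payload used only for illustration: 1 megabyte.
PAYLOAD_BITS = 8 * 1_000_000


def orders_of_magnitude(low, high):
    """Return log10 of the ratio high/low."""
    return math.log10(high / low)


def transfer_seconds(bits, link_bps):
    """Serialisation time only; ignores latency, loss and protocol overhead."""
    return bits / link_bps


speed_orders = orders_of_magnitude(GSM_BPS, FIBRE_BPS)
slow_seconds = transfer_seconds(PAYLOAD_BITS, GSM_BPS)
fast_seconds = transfer_seconds(PAYLOAD_BITS, FIBRE_BPS)

print(f"Link speed spans {speed_orders:.2f} orders of magnitude")
print(f"1 MB over GSM:   {slow_seconds:.0f} s")
print(f"1 MB over fibre: {fast_seconds * 1000:.1f} ms")
```

The same payload takes roughly a quarter of an hour on the slowest link and under a millisecond on the fastest, which is why a single fixed timeout or transfer strategy cannot suit every user.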
Implications of Heterogeneous Networks

• Systems and protocols must be adaptive and scalable

• Decentralisation is essential, to handle load
• Global synchronisation is difficult, tending to impossible, due to
latency

Your system works in the lab today…
Will it still work in a few months, when you have thousands of users?

Implications for systems:
• Widely distributed or peer-to-peer
• Asynchronous and weakly consistent
• Location transparent
• Loss tolerant, rate/latency adaptive
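
One way to make a client "latency adaptive" is to derive its request timeout from measured round-trip times rather than from a fixed constant. The sketch below is modelled on the smoothed estimator used for TCP's retransmission timer (RFC 6298). It is an illustration rather than anything prescribed by the slides: the class name `AdaptiveTimeout` is hypothetical, and the minimum and maximum bounds a real implementation would apply are omitted.

```python
class AdaptiveTimeout:
    """Smoothed round-trip estimator, modelled on TCP's retransmission timer."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variation

    def __init__(self, initial_timeout=1.0):
        self.srtt = None        # smoothed RTT, unknown until the first sample
        self.rttvar = None      # smoothed RTT variation
        self.timeout = initial_timeout

    def observe(self, rtt):
        """Fold one RTT sample (seconds) into the estimate; return the new timeout."""
        if self.srtt is None:
            # First sample: take it as the estimate, with a wide margin.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Update the variation first, using the previous smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.timeout = self.srtt + 4 * self.rttvar
        return self.timeout

    def on_loss(self):
        """Back off exponentially when a request times out."""
        self.timeout *= 2
        return self.timeout


estimator = AdaptiveTimeout()
for sample in (0.001, 0.001, 0.200, 0.250):   # LAN-like, then WAN-like samples
    print(f"rtt={sample:.3f}s -> timeout={estimator.observe(sample):.3f}s")
```

The timeout stays tight while samples are stable and widens quickly when the path changes, so the same client code can run on a LAN and across continents.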
Organizational Heterogeneity

• Goal is to share resources across organizational boundaries, to
form new virtual organizations
• How to authenticate users and resources?
– Who do you trust to do the authentication?
– Do you trust users to delegate authority?
• To other users?
• To software agents?
• Significant implications for security infrastructure
– Do you trust the servers? The data?
• How to provide, control and limit access?
– Full user accounts or a limited subset of functionality
– Firewalls
– Malicious users/applications
• Who sets the acceptable use policy?

– Is it consistent worldwide? Can/should it be?
How is Grid Computing Different?

• A computational grid integrates disparate resources into a single
virtual organization

• How does this affect system design?

When building a grid, need to consider how it will scale in terms of:
• Data Storage and Distribution
• Software
• Scheduling
• Robustness and Fault Tolerance
• System Management
Scalability of Data Storage & Distribution

Storage is cheap:
• Apple Xserve RAID: 3.5 Tbytes for £8,799, in a 5.25 × 17 × 18.4 inch enclosure
• Consider the storage available on a large distributed system…

Grid computing applications produce a lot of data:

• The ATLAS experiment at CERN will produce 1.3 petabytes/year
of raw data (a stack of CDROMs 10 miles high…)
– Simulation and analysis software routinely produces data files around 2
gigabytes in size
• Measurements on Grid2003 show 2 terabytes/day transferred to
support experiments on a 27 site grid
– Continuous 200 Mbits/second transfer rate
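
The quoted volumes can be converted to sustained bit rates as a sanity check. This sketch assumes decimal units (1 terabyte = 10^12 bytes, 1 petabyte = 10^15 bytes) and a 365 day year. The averaging function and the 2 gigabyte transfer estimate are illustrative and ignore protocol overhead.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def average_mbits_per_second(num_bytes, seconds):
    """Average rate in Mbit/s needed to move num_bytes in the given time."""
    return num_bytes * 8 / seconds / 1e6


# Grid2003: 2 terabytes per day (decimal units assumed).
grid2003_rate = average_mbits_per_second(2e12, SECONDS_PER_DAY)

# ATLAS: 1.3 petabytes of raw data per year (decimal units assumed).
atlas_rate = average_mbits_per_second(1.3e15, SECONDS_PER_YEAR)

# Time to move one 2 gigabyte data file at a sustained 200 Mbit/s.
two_gb_seconds = 2e9 * 8 / 200e6

print(f"Grid2003: {grid2003_rate:.0f} Mbit/s sustained")
print(f"ATLAS:    {atlas_rate:.0f} Mbit/s sustained")
print(f"2 GB file at 200 Mbit/s: {two_gb_seconds:.0f} s")
```

Two terabytes per day works out at about 185 Mbit/s, consistent with the slide's rounded figure of 200 Mbits/second.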
Scalability of Data Storage & Distribution

• How to manage, distribute this much data?

– Do you move the data to the job? Or the job to the data?
– Are you allowed to move the data?
• Copyright, confidentiality, privacy, legal reasons
• E.g. Grid computing for oil exploration: governments won’t let geological data
out of the country – remote access to terabytes of data in Africa from the US?
– How to transfer large datasets?
• Manually?
• Automatically and transparently? How?
• How to index and search this much data?
• Need interoperable and standardised data formats
– Long term archival and curation; efficient short term access
• How to do data fusion across heterogeneous databases/sources?
– Transparent database queries across multiple systems, worldwide
• How to maintain data provenance?
Scalable Scheduling

• Job scheduling on single and parallel computers is well understood

• Evolving towards job scheduling for clusters:
– VAX/VMS clustering in the mid-1980s
– Condor, OpenPBS, Sun Grid Engine, etc. more recently

• How to move to Internet-scale job scheduling?

– Location transparency
An open research problem…
See also paper [3]
Naming, Addressing and Middleware

• How to write an application that runs over thousands of hosts,
when you don’t know which hosts it’s using?
• Need a naming scheme and communication protocol that works
independently of location
– Can’t use DNS or IP addresses directly; tied to organizational structure,
network topology
• Peer-to-peer protocols solve some of these problems:
– Distributed hash tables/content addressable networks and event notification
systems built on them
– e.g. Pastry and Scribe
• Lots of research; no standards yet

Paper [3] addresses some of these issues in more depth
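
The idea behind location-independent naming in a distributed hash table can be sketched with a simple consistent-hashing ring. This is a minimal illustration, not Pastry's routing algorithm: the `HashRing` class, the node names, and the use of a truncated SHA-256 digest as a 128-bit identifier are all assumptions made for the example.

```python
import bisect
import hashlib


def key_id(name):
    """Map any name onto a 128-bit identifier space (truncated SHA-256)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:16], "big")


class HashRing:
    """Nodes and keys share one identifier space; a key belongs to the
    first node whose id is >= the key's id, wrapping around the ring."""

    def __init__(self, nodes):
        self._ring = sorted((key_id(n), n) for n in nodes)
        if not self._ring:
            raise ValueError("ring needs at least one node")
        self._ids = [node_id for node_id, _ in self._ring]

    def lookup(self, name):
        """Return the node responsible for the given name."""
        index = bisect.bisect_left(self._ids, key_id(name)) % len(self._ring)
        return self._ring[index][1]

    def without(self, node):
        """Return a new ring with one node removed (e.g. after a failure)."""
        return HashRing(n for _, n in self._ring if n != node)


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.lookup("results/run-42.dat"))
```

Applications address data by name only. When a node leaves, only the keys it was responsible for move to another node, so the rest of the system is undisturbed.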

Robustness and Fault Tolerance

• Systems fail; an internet-scale distributed system might never be
completely operational
– If a system is large enough, statistically likely something will have failed
• Grid2003 reports job failure and restart rates of 30% in some cases… [1]
– How big can a system be before failures become overwhelmingly likely?

• How to detect and recover from failures?

– Routing around failures?
– Recovering from failure while a job is running?
– Avoiding cascading failures?
• Distributed systems and parallel computing have given us many
useful algorithms
– Complicated by the scale of computational grids, and cross-organizational
management issues
See paper [4]
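
The question of how large a system can grow before some failure is almost certain has a simple first-order answer if host failures are assumed to be independent. That assumption is a strong one, and cascading failures violate it. The per-host unavailability of 0.1% used below is a hypothetical figure, not a measurement from the slides. With n hosts each down with probability p, the chance that at least one is down is 1 − (1 − p)^n.

```python
import math


def prob_any_failed(hosts, p_host_down):
    """Probability that at least one of `hosts` independent hosts is down."""
    return 1.0 - (1.0 - p_host_down) ** hosts


def hosts_for_probability(target, p_host_down):
    """Smallest number of hosts at which prob_any_failed reaches `target`."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_host_down))


P_DOWN = 0.001   # hypothetical: each host is unavailable 0.1% of the time

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"{n:>7} hosts: P(at least one down) = {prob_any_failed(n, P_DOWN):.4f}")

print("Hosts needed for a 99% chance of some failure:",
      hosts_for_probability(0.99, P_DOWN))
```

Under these assumptions a system of a few thousand hosts almost always has something broken, which is why detection and recovery have to be part of normal operation rather than exceptional handling.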
System Configuration Management

• Independent of job scheduling and resource management, need to
manage the configuration of the grid
– Operating system updates + patches
– New versions of application software
– Detecting and fixing hardware and software failures

• How to manage thousands of hosts?

• How to manage a system that’s never completely functional?

• Build applications that monitor the system, and reconfigure it as

• The two biggest challenges to designing a computational grid are
heterogeneity and scalability
• These distinguish grids from traditional distributed systems

• Have asked lots of questions… the reading list will raise more
issues
• The rest of the module will try to answer some of these questions;
others are open research topics…
• Next week: discussion of current standard architectures and
protocols for grid computing
References

[1] I. Foster et al., “The Grid2003 Production Grid: Principles and Practice”,
Proceedings of the 13th IEEE Intl. Symp. on High Performance Distributed
Computing, 2004.
[2] S. Floyd and V. Paxson, “Difficulties in Simulating the Internet”, IEEE/ACM
Transactions on Networking, Vol. 9, No. 4, August 2001.
[3] J. A. Crowcroft, S. M. Hand, T. L. Harris, A. J. Herbert, M. A. Parker and I.
A. Pratt, “FutureGRID: A Program for long-term research into GRID systems
architecture”, Proceedings of the UK e-Science All Hands Meeting, Sept 2003.
[4] M. Amin, “Toward Self-Healing Infrastructure Systems”, IEEE Computer,
August 2000.
[5] J. O. Kephart and D. M. Chess, “The Vision of Autonomic Computing”, IEEE
Computer, January 2003.
Preparation for Tutorial 1

We will be discussing two papers on Monday:

• “Computational Grids”
• “The Anatomy of the Grid”

Between now and Monday:

• You should all read both papers
• Prepare a summary of each paper, explain “what is a grid?”
– Work in groups to do this, discuss the papers in advance
– Use the material from Research Techniques to help you prepare
On Monday, two people will be chosen at random for each paper:
– They will stand in front of the class and present the paper (5 minutes)
– The rest of the class will then discuss the paper, to see if they agree
with that view of grid computing
