Computational Grids
Summary
What types of applications will grids be used for? Computational grids will be used for many different purposes, and different problems need different underlying technology. These approaches can be classified into five classes: distributed supercomputing, high throughput, on demand, data intensive and collaborative computing.

Who will use computational grids? Many different groups in society could benefit from a grid. Communities such as government, health maintenance organisations, scientific collaborations and many others all need the ability to share data and CPU power. Since grids could be used in so many areas, we do not expect to see a single grid architecture, but many different ones.

What is involved in building a grid? This depends entirely on what the grid is going to be used for, so a single answer is not possible. We divide grids into four main groups ordered by scale: end systems, clusters, intranets and internets.

What approaches are needed to develop computational grids? Grid development needs to be divided into at least three levels. These levels are like the protocol layers that make up the World Wide Web today, where each layer is standardised so that new applications become easy to develop. The developers of the layers are grid developers, tool developers and application developers.

What is needed for computational grids to become a service everyone uses? Grid development needs to be standardised to make grids robust, effective and easy to use, creating possibilities for new applications to be produced effectively and cheaply.
If a researcher needs to perform very CPU-intensive calculations, he could occasionally borrow CPU time from a grid at a much lower cost than renting time on a supercomputer. A grid could be created in any environment where end users have computers with memory and CPU. Computers are often idle, and over time they use only about 5 % of their capacity, so a great deal of computational power goes to waste. Set up as a computational grid, each end user would be able to enjoy computational power that would otherwise have been wasted.
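To make the 5 % figure concrete, here is a small back-of-the-envelope sketch in Python that estimates how much capacity a pool of desktop machines leaves unused. The machine count and the per-machine capability are hypothetical values chosen only for illustration.

    # Rough estimate of the compute left idle by ordinary desktop machines.
    # All numbers below are hypothetical placeholders, not measurements.
    NUM_MACHINES = 500          # desktops on an organisation's network
    GFLOPS_PER_MACHINE = 50.0   # assumed peak capability of one machine
    AVG_UTILISATION = 0.05      # the roughly 5 % average utilisation cited above

    total_gflops = NUM_MACHINES * GFLOPS_PER_MACHINE
    idle_gflops = total_gflops * (1.0 - AVG_UTILISATION)

    print(f"Aggregate peak capacity: {total_gflops:.0f} GFLOPS")
    print(f"Typically left unused:   {idle_gflops:.0f} GFLOPS")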
Collaborative computing grids make it possible for people spread around the world to work and interact with each other in real time. They are often structured as virtual shared spaces in which resources and data are shared, which is also the main issue in on-demand and data-intensive grid applications. Here, however, the main challenge is to let people interact in real time without disturbance. As we can see, there are many different reasons for and problems with using computational grids, and they require different technical approaches. It will therefore take a great deal of effort to standardise grid technology so that it fits every application.
Intranets

The main difference between an intranet and a cluster is that an intranet introduces heterogeneity into the system.
An intranet also presents the problem of separate administration: the individual systems must negotiate conflicting policies with one another (in a cluster, by contrast, the systems are assumed to be centrally administered). Another problem is the lack of global knowledge; it is impossible for any one system to have accurate knowledge of the state of every other system. Where administration is centralised it has the advantage of simplifying security, and systems like the Distributed Computing Environment (DCE), DCOM and CORBA can be applied successfully to intranets. Programs in these systems do not generally create processes manually, but rather connect to services that encapsulate hardware resources. Interactions occur via RPC (Remote Procedure Call) or remote method invocation, models that have become standard for calling remote functions.
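As a minimal sketch of this service-oriented, RPC-style interaction, the snippet below uses Python's standard xmlrpc module: a small service pretends to encapsulate a storage resource, and a client connects to it and invokes a remote procedure instead of starting any process on the remote host by hand. The service name, the reported value and the port are invented for the example.

    import threading
    import xmlrpc.client
    from xmlrpc.server import SimpleXMLRPCServer

    def free_space_mb():
        """Stand-in for a service that encapsulates a storage resource."""
        return 128_000  # placeholder value, not a real measurement

    # The service side: expose the procedure over RPC.
    server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
    server.register_function(free_space_mb, "free_space_mb")
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # The client side: connect to the service and invoke a remote procedure,
    # rather than creating a process on the remote machine manually.
    proxy = xmlrpc.client.ServerProxy("http://localhost:8000/")
    print("Free space reported by the service:", proxy.free_space_mb(), "MB")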
Wide-area file system services can also be applied successfully in an intranet; DFS (the Distributed File System) is probably the best-known virtual file system. Systems like these break the limits on secondary storage imposed by a single operating system, for example offering petabytes of storage on network drive F:\ even though the local operating system restricts how large a drive may be. Virtual file systems also allow data to be duplicated, securing access and making the system more resilient against data loss (for example in case of a hard drive failure). Because the environment is less secure, systems like Kerberos have evolved, offering security and a unified authentication structure throughout the intranet.

Internets

Internets are the most complicated systems and are characterised by a lack of centralised control, large geographical distribution and international issues. In an internet we cannot rely on the existence of a common scheduler and must therefore explore other alternatives. A common strategy is a scavenging grid system, which lets often-idle computers communicate with a kind of global scheduler called a management node. The management node assigns jobs to computers that match each job's constraints. New technologies like Legion and Globus are being developed that treat hosts as objects in an ordinary object-oriented fashion.
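The management-node idea can be illustrated with a toy matching loop: hosts report whether they are idle and what resources they offer, and the scheduler dispatches each queued job to the first idle host that satisfies the job's constraints. The classes, fields and host names below are illustrative only and do not correspond to any real scavenging system.

    from dataclasses import dataclass

    @dataclass
    class Host:
        name: str
        mem_gb: int
        idle: bool

    @dataclass
    class Job:
        name: str
        mem_gb_needed: int

    hosts = [Host("pc-17", 4, True), Host("pc-42", 16, True), Host("pc-03", 8, False)]
    queue = [Job("render-frame-001", 8), Job("param-sweep-7", 2)]

    # The "management node": match each job against the currently idle hosts.
    for job in queue:
        match = next((h for h in hosts if h.idle and h.mem_gb >= job.mem_gb_needed), None)
        if match is None:
            print(f"{job.name}: no idle host meets the constraints, job stays queued")
        else:
            match.idle = False  # the chosen host is now busy
            print(f"{job.name} -> dispatched to {match.name}")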
These works are often expensive, hard to adapt to other applications and other grid systems, and often fragile. The development of grids must be internationally standardised, and it must be done in smaller modules, just like the protocol layers that are the foundation of the Internet today. The developers can be divided into three classes: grid, tool and application developers.

Grid developers develop protocols and produce routine libraries. The challenge here is to produce a library of protocols that works well with many underlying technologies (for example different types of networks). The library must also satisfy the many different requests from the tool developers, and it is hard to give every request the best performance while accommodating all the underlying technologies; there will therefore be a trade-off between generality and performance. It is very important to standardise all protocols so that the tool developers know how to build on them.

Tool developers concentrate on the services that every application needs. Security must be taken care of: things like authentication and confidentiality have to be implemented. They also develop methods for payment, which are very important in, for example, on-demand grids. Finally, they develop methods for finding and organising resources and information, which include communication, fault detection and much more. Tool developers must adapt their protocols to fit those developed by the grid developers while keeping in mind the requests from the application developers. Everything must be standardised so that application developers can easily make use of the capabilities of the tool layer, and tool developers must also tell application developers which implementations give higher or lower performance.

Finally there are the application developers, who use the methods they need from the tool level to build specific applications that solve hard problems for end users. The challenge for application developers is to find algorithms that divide a task into thousands of smaller tasks that can be handled separately, and to make those tasks work efficiently with the tool layer, as the sketch below illustrates. For the end user it is only important that the request is solved; how it works matters less.
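The sketch below illustrates that decomposition problem under simple assumptions: a numerical integration is split into a thousand independent work units, and a local process pool stands in for the grid workers that a tool layer would otherwise schedule. The integrand, the interval and the chunking scheme are arbitrary examples.

    from concurrent.futures import ProcessPoolExecutor

    def integrate_chunk(bounds, steps=10_000):
        """Integrate f(x) = x * x over one sub-interval with the midpoint rule."""
        a, b = bounds
        h = (b - a) / steps
        return sum((a + (i + 0.5) * h) ** 2 for i in range(steps)) * h

    if __name__ == "__main__":
        # Divide the interval [0, 10] into 1000 independent sub-tasks.
        edges = [i * 0.01 for i in range(1001)]
        chunks = list(zip(edges[:-1], edges[1:]))

        # A local process pool stands in for the grid nodes a tool layer would use.
        with ProcessPoolExecutor() as pool:
            partial_results = list(pool.map(integrate_chunk, chunks))

        print("Approximate integral of x^2 over [0, 10]:", sum(partial_results))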
As with electricity, a single end user can consume a very small amount of power for little money, even though it is produced by very expensive power plants, and payment is made only for the power actually consumed. For computational grids to become common there must be influence from politics and from bodies that act internationally to standardise the technology.