Web services and "grid" computing continue to be hot, rapidly evolving technologies, so it seems logical to examine what they can offer each other. Generally speaking, grid computing is a subset of distributed computing that emphasizes making efficient use of otherwise unused CPU or data resources. The term also implies that access to commodity computing power should be as easy as the access to electrical power that the electric grid provides.
It turns out that the term "grid" gets applied to a wide range of computing configurations, so our first step will be to characterize the main kinds of grid technology along a few conceptual dimensions. This should indicate the areas where grid computing and Web services can cooperate.
|Type|Characteristics|Examples|
|---|---|---|
|Tight|Identical or very similar hardware; physically close to keep connections fast; workers are assigned interconnections to share information|Japan's Earth Simulator|
|Loose|TCP/IP network connectivity; workers request jobs when they have spare time; workers can switch projects; workers connect and disconnect freely; workers only communicate with controller|BOINC (Seti@home, etc); World Community Grid|
|Space|Controller communicates indirectly by placing jobs in a space; workers request jobs when they have spare time; workers communicate only with space; one space can feed a number of different applications|Jini - JavaSpaces|
Tightly coupled clusters
Beowulf clusters are a good example of closely coupled sets of nearly identical hardware. Beowulf clusters use Linux and standard commodity hardware for low cost, while other clustered systems use specialized interconnection technology and CPU architectures for high throughput. Tightly coupled clusters are typically put to work on problems involving massive amounts of computation, such as weather simulations, aerodynamic simulations and computer graphics "render farms."
Although Beowulf clusters have been built from miscellaneous hardware found lying around a university department, the typical job requires that all workers have similar speed. A computer wired into a cluster is there for life and is not going to be able to take other work.
Typically the software has to be specially written to take advantage of the architecture, and every worker runs the same code. Fortran and C are the most commonly used languages. The usual problems tackled require that workers share data, so interconnection speed is very important. Interconnections are assigned as part of the job setup.
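The "same code on every worker, with interconnections assigned at setup" pattern can be sketched in a few lines. This is only a toy illustration, not how a real cluster is programmed: Python threads stand in for cluster nodes, `queue.Queue` objects stand in for the physical links, and the names `worker` and `run_ring` are hypothetical. Each worker passes values around a ring until every worker holds the sum of all the local values.

```python
import queue
import threading


def worker(rank, inbox, outbox, local_value, steps, results):
    """Identical code for every worker; only the rank and the data differ."""
    total = local_value
    carry = local_value
    for _ in range(steps):
        outbox.put(carry)    # send the value we hold to the right-hand neighbour
        carry = inbox.get()  # block until the left-hand neighbour's value arrives
        total += carry
    results[rank] = total


def run_ring(values):
    """Wire the workers into a ring, run them, and return each worker's total."""
    n = len(values)
    # Job setup: link i carries data from worker i to worker (i + 1) % n.
    links = [queue.Queue() for _ in range(n)]
    results = [None] * n
    threads = [
        threading.Thread(
            target=worker,
            args=(rank, links[rank - 1], links[rank], values[rank], n - 1, results),
        )
        for rank in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Every worker blocks on `inbox.get()` at each step, so the slowest worker or the slowest link sets the pace for the whole job. That is the reason similar worker speed and fast interconnections matter so much in this style of grid.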
Loosely coupled systems
The best-known example of a loosely coupled system technology is BOINC (Berkeley Open Infrastructure for Network Computing), which runs Seti@home and many other CPU-intensive distributed projects using worldwide volunteer networks. A controlling server for a particular project hands out blocks of work on request, records the results returned and tracks the effort of each participant. Volunteers can be organized into teams competing for placement in the list of contributors.
Although the volunteer grids are the most publicly conspicuous, the BOINC project encourages what it calls "desktop grid computing" within private organizations. Another volunteer-oriented project, World Community Grid, has been sponsored by IBM. Naturally, the technology developed can be applied to private projects, making use of idle desktop computers for work that would otherwise require heavy investment in high-performance hardware.
In contrast with tightly coupled grids, typical loose grids are not tied to particular hardware or operating systems. The problem types are such that data is organized in blocks which can be operated on by each worker independently. If a worker drops out of the grid with a partially completed block, the controlling server will eventually reassign that block to another worker. Projects may run for months or even years before completion.
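The controller's bookkeeping (hand out blocks on request, record results, credit participants, and eventually reassign blocks from workers that vanish) can be sketched as follows. This is a minimal illustration under stated assumptions, not BOINC's actual design or API: the `Controller` class, its method names, and the lease-timeout rule for deciding that a worker has dropped out are all hypothetical.

```python
import time


class Controller:
    """Toy work-block server for a loosely coupled grid (illustrative only)."""

    def __init__(self, blocks, lease_seconds=60.0, clock=time.monotonic):
        self.pending = list(blocks)  # block ids not currently handed out
        self.leased = {}             # block id -> (worker id, lease deadline)
        self.results = {}            # block id -> returned result
        self.credit = {}             # worker id -> number of blocks completed
        self.lease_seconds = lease_seconds
        self.clock = clock

    def request_work(self, worker_id):
        """Called by a worker with spare time; returns a block id or None."""
        now = self.clock()
        # Reclaim blocks whose worker has been silent past the lease deadline.
        for block, (_, deadline) in list(self.leased.items()):
            if deadline <= now:
                del self.leased[block]
                self.pending.append(block)
        if not self.pending:
            return None
        block = self.pending.pop(0)
        self.leased[block] = (worker_id, now + self.lease_seconds)
        return block

    def submit_result(self, worker_id, block, result):
        """Record a result; returns False if the block was already completed."""
        if block in self.results:
            return False
        self.leased.pop(block, None)
        if block in self.pending:
            self.pending.remove(block)
        self.results[block] = result
        self.credit[worker_id] = self.credit.get(worker_id, 0) + 1
        return True

    def finished(self):
        return not self.pending and not self.leased
```

Because workers never talk to each other, the controller needs no knowledge of their hardware or operating system. A worker that disappears simply lets its lease expire, and the block goes back into the pool the next time anyone asks for work.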
In the "space" or "tuple space" style of distributed computing, participating processors share data-containing objects through an intermediary system that acts like a huge "associative memory." An associative memory lets you address objects by their content rather than their physical location. The difference between location addressing and content addressing is the difference between finding a particular car owner in a room full of people by walking around and talking to each one, and announcing over the PA system "the owner of a green Toyota, license XYZ-123 - you left your lights on."
The term "tuple" in "tuple space" (also written as tuplespace) implies that the data-containing objects may contain more than one value that can be used in locating the object. Processes that need some computing resources write into the space an object with values defining what is needed and defining the variables that must be returned. Worker processes register with the space as being able to perform work on objects with specified values. The space matches jobs with workers and sends the object to a worker, which returns it after performing the needed computation. Complex computation tasks can be accomplished by passing an object from one specialized worker program to the next. For example, one worker might retrieve raw data of daily stock prices and the next might perform statistical analysis. As with the BOINC-style distributed projects, increasing the capacity of a space-based system can be as simple as adding more workers, as long as the computer managing the space storage can keep up.
The most widely distributed space implementation appears to be JavaSpaces, originated by Sun Microsystems, which uses the Jini protocol for distributed networking. Jini is now an open-source project, and there are several commercial and open-source implementations of JavaSpaces.
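To make the content-addressed matching concrete, here is a minimal in-memory sketch of a tuple space with the two-stage stock price pipeline described above. It is a toy written in Python, not JavaSpaces: the `TupleSpace` class is hypothetical, its `write` and `take` methods are only loosely modeled on the JavaSpaces operations of the same names, `None` in a template is assumed to act as a wildcard, and the prices in `SAMPLE_PRICES` are made-up values.

```python
import threading

# Made-up sample data standing in for a real price feed.
SAMPLE_PRICES = {"XYZ": [10.0, 11.0, 12.0]}


class TupleSpace:
    """Toy associative memory: tuples are found by content, not by location."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def write(self, tup):
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    @staticmethod
    def _matches(template, tup):
        # A None field in the template matches any value in that position.
        return len(template) == len(tup) and all(
            want is None or want == have for want, have in zip(template, tup)
        )

    def take(self, template, timeout=None):
        """Remove and return the first matching tuple, or None on timeout."""
        found = []

        def match_available():
            for tup in self._tuples:
                if self._matches(template, tup):
                    found.append(tup)
                    return True
            return False

        with self._cond:
            if not self._cond.wait_for(match_available, timeout):
                return None
            self._tuples.remove(found[-1])
            return found[-1]


def fetch_worker(space):
    """First stage: take a 'fetch' job and write back an 'analyze' job."""
    job = space.take(("fetch", None, None), timeout=0)
    if job is not None:
        _, symbol, _ = job
        space.write(("analyze", symbol, SAMPLE_PRICES[symbol]))


def analyze_worker(space):
    """Second stage: take an 'analyze' job and write back the mean price."""
    job = space.take(("analyze", None, None), timeout=0)
    if job is not None:
        _, symbol, prices = job
        space.write(("done", symbol, sum(prices) / len(prices)))
```

A client would write `("fetch", "XYZ", None)` into the space and later take `("done", "XYZ", None)` to collect the answer. Neither worker knows about the client or about the other worker; each one only describes the kind of tuple it can handle, so adding capacity means starting more copies of either worker against the same space.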
What can Web services and grids cooperate on?
The grid-based computing systems I have described are obviously not applicable to all Web service problems, but I think there are significant areas of cooperation. For the big computing jobs tackled by Beowulf and BOINC, Web services could provide a convenient interface for progress reports. If the grid computing ideal of commodity computing power really takes off, Web services could be used to submit data sets for massive number crunching. The informal networks of space-style systems can be used to expand the computing power available to a Web service by simply adding more workers. In my next article I will examine some of these possibilities more extensively.