Whether in China or abroad, there is a fable about "stone soup." In it, a soldier separated from his unit arrives at a very poor village and announces that with nothing more than a large pot and a stone he can make a pot of delicious soup. As he begins to cook, the villagers are skeptical, but one offers some cabbage, another a few carrots, another a little beef... In the end, the big pot is filled with a rich, satisfying soup. The moral of the fable is that by working together, people can sometimes accomplish the unexpected: even if each individual contribution seems negligible, pooled together those contributions can add up to something great. The scientists Hargrove and Hoffman applied exactly this principle to build their own supercomputer. A powerful supercomputer can typically perform hundreds of billions of operations per second. Most supercomputers are parallel machines: they are equipped with large numbers of powerful processors in order to solve complex problems such as weather forecasting and atomic-bomb simulation. Such machines are generally produced by industry giants like IBM and SGI, and their prices are equally staggering, often running to tens of millions of dollars. That is obviously far too expensive for research groups with limited funding. So some laboratories and universities began using inexpensive PCs to build supercomputers, writing their own software to handle very complex problems. The problem dates back to 1996, when Hargrove and Hoffman at ORNL (Oak Ridge National Laboratory in Oak Ridge, Tennessee, http://www.ornl.gov/) ran into a difficulty. They needed to draw an ecological map of the United States combining, for each area, information on climate, terrain, and soil characteristics. To create a high-resolution map of the continental United States, they had to divide the country into 7.8 million small squares, each representing one square kilometer, and every variable had to be considered for every square. Obviously, no single PC or ordinary workstation could complete such a task.
In other words, they needed a supercomputer with parallel processing power to complete the work, but such a machine was far beyond their means. The solution they finally adopted was to build a computer cluster out of ORNL's retired machines and donated computers. The cluster was named the "Stone Souper Computer" (http://stonesoup.esd.ornl.gov), a name chosen to acknowledge that it was built mainly on donations from all quarters. The result was a supercomputer assembled from discarded PCs that proved remarkably powerful, and it drew the high-resolution ecological maps Hargrove and Hoffman needed. At the time, other research teams were likewise building cluster systems comparable to the best supercomputers in the world. Because of its excellent price/performance ratio, the approach was also warmly received by manufacturers and corporate users. In fact, the cluster concept put very powerful processing capability within the reach of research organizations, schools, and companies of every kind, triggering a real shift in thinking.

The origin of cluster systems lies in the idea of connecting computers together to obtain powerful computing capability. As early as the 1950s, the US Air Force built a network of vacuum-tube computers called SAGE to guard against a nuclear attack by the Soviet Union. In the mid-1980s, DEC integrated its VAX minicomputers into larger systems and coined the term "cluster." At that time, networked workstations (typically smaller and slower than minicomputers, but faster than PCs) were already in wide use at research institutions. By the 1990s, with the sharp decline in the prices of microprocessor chips and Ethernet equipment, scientists began to consider building clusters of PCs. Developments in software also prepared the ground for PC clusters. In the 1980s, UNIX became the most widely used operating system in scientific research; unfortunately, it lacked flexibility on the PC.
In 1991, Linux was born, and Linus Torvalds made it freely downloadable on the Internet. In a short time, thousands of programmers began contributing to Linux's development. Linux has since become a world-famous operating system, and it is also an ideal operating system for building a PC cluster. The first PC cluster was born in 1994 at NASA's Goddard Space Flight Center. NASA had been looking for a way to solve some computation-heavy problems in the earth and space sciences. Its scientists needed a computer that could reach one gigaflops (one billion floating-point operations per second; a floating-point operation is a simple calculation such as a single addition or multiplication). At the time, a supercomputer capable of that performance cost roughly one million dollars, far too expensive for a team devoted to a single field of research. A scientist named Sterling therefore decided to buy PCs and build a cluster. Sterling and his colleagues at the Goddard center assembled 16 PCs, each containing an Intel 486 processor; the operating system was Linux, and the machines were connected by ordinary network cards. This PC cluster reached 70 megaflops, that is, 70 million floating-point operations per second. Trivial by today's standards, this nonetheless matched the computing capacity of some small commercial supercomputers of the day. The cluster cost only about $40,000, roughly one tenth of the 1994 price of a commercial machine of comparable computing capacity. The NASA researchers named the cluster they had built Beowulf (http://www.beowulf.org), and the name has since become synonymous with low-cost cluster systems built from PCs. In 1996, two successors of Beowulf appeared: Hyglac (built by researchers at the California Institute of Technology and the Jet Propulsion Laboratory) and Loki (built at Los Alamos National Laboratory in the US).
Each of these two clusters used 16 Intel Pentium Pro 200 CPUs, connected by 100 Mbit/s Fast Ethernet. The operating system was Linux, and data was exchanged using MPI (the Message Passing Interface). Their performance reached 1 gigaflops, at a price of less than $50,000. It seemed, then, that a Beowulf-style cluster could meet Hargrove and Hoffman's requirements for drawing the US ecological map. Because a single workstation could handle only a few states at a time, they could not simply assign different regions to different independent workstations: every part of the country had to be compared and processed simultaneously. In other words, they needed a system capable of parallel processing. So in 1996, Hargrove and Hoffman applied to their superiors for funds to purchase PCs with Pentium II processors and build a cluster system similar to Beowulf. The plan, however, was not approved. Left with no choice, they looked for an alternative. They learned that the US Department of Energy's Oak Ridge office regularly replaced old PCs with new ones, and that the old PCs were auctioned off on an internal web site; hundreds of discarded machines were waiting to be disposed of. So they began building the "Stone Soup Computer" by collecting discarded PCs in an idle room at ORNL. The design philosophy of a parallel computing system is divide and conquer: break a complex problem into a number of small tasks, then assign those tasks to the nodes of the system (such as the individual PCs in a Beowulf cluster), which work on them at the same time.
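As a rough sketch of this divide-and-conquer idea (in Python rather than the C or Fortran typical of Beowulf codes, and with a toy classification rule standing in for the real climate/terrain/soil model), the map cells can be split into one chunk per node and the chunks processed in parallel:

```python
from multiprocessing import Pool

def classify_cell(cell):
    """Toy stand-in for the real per-square-kilometer computation."""
    elevation, rainfall = cell
    if rainfall > 50:
        return "wet"
    return "dry" if elevation < 1000 else "alpine"

def split_into_chunks(cells, n_nodes):
    """Divide the full list of map cells into n_nodes roughly equal pieces."""
    k, r = divmod(len(cells), n_nodes)
    chunks, start = [], 0
    for i in range(n_nodes):
        end = start + k + (1 if i < r else 0)
        chunks.append(cells[start:end])
        start = end
    return chunks

def classify_chunk(chunk):
    """The work one node performs on its assigned piece of the map."""
    return [classify_cell(c) for c in chunk]

if __name__ == "__main__":
    cells = [(100, 60), (2000, 10), (500, 20), (1500, 80)]
    with Pool(2) as pool:  # two worker processes stand in for two nodes
        results = pool.map(classify_chunk, split_into_chunks(cells, 2))
    print([label for chunk in results for label in chunk])
```

Here `split_into_chunks` plays the role of a master dividing the 7.8 million map squares among the nodes; in a real Beowulf cluster each chunk would go to a separate PC over the network rather than to a local process.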
The efficiency of parallel processing depends to a large extent on the problem being processed. One very important factor to consider is how much the nodes must share intermediate results. Some problems can only be divided into a great many tiny tasks, and such fine-grained decomposition requires frequent communication between nodes, making the problem poorly suited to parallel computation. By contrast, other problems can be divided into relatively large sub-problems that need little communication with one another; these run much faster on a parallel system and are well suited to it. Before building any Beowulf cluster, some design decisions must be made. To connect the PCs, one can choose ordinary Ethernet or a more specialized, faster network. Lacking a budget, Hargrove and Hoffman connected their machines with ordinary Ethernet, because it cost them nothing. They selected one of the PCs as the front-end node and installed two network cards in it: one card communicates with the outside world, while the other communicates with the remaining nodes, which are connected by their own private network (as shown in Figure 1). The PCs cooperate by passing messages to one another. The two most popular message-passing methods today are MPI, short for Message Passing Interface, and PVM, short for Parallel Virtual Machine. Each has its own advantages. PVM appeared earlier; although it was never formally standardized, it is well suited to heterogeneous environments. By comparison, PVM's performance is lower and its feature set less rich than MPI's. MPI not only provides a large number of library functions but also enjoys broad support, and its development has been rapid. Both are freely available online. In building the "Stone Soup Computer," both methods were used.
In most Beowulf cluster systems the hardware is uniform, that is, all the PCs used are identical. Such uniformity simplifies the management and use of the cluster, but it is not required. The "Stone Soup Computer" contains a variety of processors, because it was designed from the outset to use whatever equipment was available. At the start, the "Stone Soup Computer" included machines with Intel 486 processors; later, machines were required to be at least that class, with at least 32 MB of memory and a 200 MB hard disk. In practice, Hargrove and Hoffman found that few machines met even this requirement, so they combined the usable components of different PCs. Each time they added a node to the cluster, they installed the Linux operating system on it; later they streamlined this process so that each node was easier to set up. The "Stone Soup Computer" first ran in 1997, and by 2001 it had grown to 133 nodes, including 75 486-based PCs, 53 Pentium machines, and 5 Alpha workstations from Compaq (as shown in Figures 2 and 3). Upgrading the "Stone Soup Computer" is very simple: just swap out the slowest nodes. As a routine task, each node performs a simple speed test at hourly intervals, and the resulting values help tune the cluster. Unlike a commercial machine, the performance of the "Stone Soup Computer" has risen continuously, because donations from all sides keep upgrading it. As for the parallel processing itself: compared with setting up and assembling the hardware of a Beowulf system, developing parallel programs demands more technique and creativity, and is therefore more challenging. The most common programming model on a Beowulf cluster is master-slave: one node acts as the master, directing one or more slaves associated with it. Another challenge is achieving load balance among the PCs in the cluster.
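A minimal sketch of the master-slave model, again in Python with local processes standing in for cluster nodes (squaring a number stands in for the real per-task computation): the master puts tasks on a shared queue, and each slave pulls a new task as soon as it finishes its current one. Faster nodes therefore take on more tasks automatically, which is a simple form of dynamic load balancing — useful in a cluster whose nodes range from 486s to Alphas.

```python
from multiprocessing import Process, Queue

def slave(tasks, results):
    """A slave pulls tasks as fast as it can finish them, so faster
    nodes automatically end up doing more of the work."""
    while True:
        cell = tasks.get()
        if cell is None:          # sentinel from the master: no more work
            break
        results.put(cell * cell)  # stand-in for the real computation

def master(cells, n_slaves):
    """The master hands out work and collects the answers."""
    tasks, results = Queue(), Queue()
    procs = [Process(target=slave, args=(tasks, results))
             for _ in range(n_slaves)]
    for p in procs:
        p.start()
    for c in cells:
        tasks.put(c)              # distribute the work
    for _ in procs:
        tasks.put(None)           # one stop signal per slave
    out = sorted(results.get() for _ in cells)
    for p in procs:
        p.join()
    return out

if __name__ == "__main__":
    print(master(range(5), 3))  # [0, 1, 4, 9, 16]
```

With a fixed, static division of work, one slow 486 would hold up the whole job; with the shared queue, it simply completes fewer tasks.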