Grid Computing: Computers of the World, Unite!
Author: World Software. Published: 2004.02.05
Editor's note: With the continuing development of supercomputers and high-performance computing, using the Internet to combine computing capacity scattered across different geographic locations into one enormous "virtual supercomputer" has become an important consideration in commercial computing and complex scientific computing. However, in dimensions such as availability, cost, management, and security, this new computing model, called "grid computing", still has to go through a development process. 2003 was a year of rapid development for grid computing, and "grid computing" became one of the most popular keywords in the software industry.
By surveying several large grid projects around the world, introducing the main applications of grid computing, examining the role of leading companies, and explaining the underlying concepts, this journal lifts the veil on the mystery of the huge grid.
In November 2003, a major result of the national 863 Program, the "DeepComp 6800" supercomputer with a speed of 4 trillion operations per second, was successfully developed at Lenovo. Its peak performance is 5.32 trillion operations per second, and its measured LINPACK speed is 4.183 trillion operations per second. In the latest global TOP500 supercomputer ranking, the DeepComp 6800 ranks 14th in measured speed, and its LINPACK efficiency ranks second among high-end supercomputers. This is the best result achieved so far by a domestically built supercomputer. As a major landmark of the national grid project, the birth of the DeepComp 6800 once again drew attention to grid computing.
What is grid computing?
The many dimensions of grid computing
Grid computing, as the name suggests, means linking multiple computers into a grid-like network. The English word "grid" means a lattice of squares; here it stands for a "simulated high-performance computer". For example, suppose a job takes 6 minutes on a single 1 GHz CPU. If the network contains 6 computers with the same CPU, the job can be divided into 6 equal parts and one part handed to each computer. In theory, the processing time is then shortened to 1 minute. This is the basic idea of grid computing.
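The arithmetic in the example above can be checked with a minimal Python sketch. It assumes the job divides into perfectly equal, independent parts and ignores network and coordination overhead, which a real grid cannot avoid. The function names are illustrative and do not come from any grid toolkit.

```python
def ideal_processing_time(single_machine_minutes, machines):
    """Ideal time when a job splits into equal, independent parts.

    Assumption: no communication or scheduling overhead.
    """
    if machines < 1:
        raise ValueError("need at least one machine")
    return single_machine_minutes / machines


def split_evenly(total_units, machines):
    """Divide total_units work items into near-equal shares, one per machine."""
    base, extra = divmod(total_units, machines)
    # The first `extra` machines take one additional unit each.
    return [base + (1 if i < extra else 0) for i in range(machines)]


# The article's example: a 6-minute job spread over 6 identical computers.
print(ideal_processing_time(6, 6))  # 1.0
print(split_evenly(600, 6))         # [100, 100, 100, 100, 100, 100]
```

In practice the achieved time is longer than this ideal figure, because dividing the work, moving data, and collecting results all take time.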
So what exactly is grid computing? It is a multi-dimensional concept.
Grid computing is third-generation Internet computing; it is high-performance computing; it is collaborative computing; and it is the basis and prototype of pervasive computing and utility computing. First, grid computing is "third-generation Internet computing". As the Internet has developed, it has had a huge impact on how people think, work, and live. In the network computing era, the first-generation Internet, used mainly for e-mail, connected computers all over the world through the TCP/IP protocols. In the Web computing era, the second-generation Internet connected web pages worldwide through web browsing and e-commerce applications.
The task facing the third-generation Internet is: how can all resources on the Internet be fully connected? These resources include computing, storage, communication, software, information, and knowledge resources, among others. This has become the basic driving force behind grid computing. In this sense, the primary characteristic of grid computing is high availability.
Second, grid computing is "high-performance computing". Computers are used to solve computationally complex problems, but advances in computer technology and construction never keep pace with the demands that real problems place on computing power. Because the computing environment falls short, computers are often unable to solve complex practical problems.
So some people imagined: if a technology could link computers all over the world and pool all their computing power, what standalone supercomputer or high-performance computer could match it? That technology is grid computing. This gives grid computing "high performance" by its very nature, and this high performance has become the fundamental motivation for academia and industry to pursue grid computing.
Third, grid computing is "collaborative computing". Grid computing is a set of emerging technologies built on the Internet, so its infrastructure must rest on a broadband digital communication network based on the IP protocol. The development of broadband technology in recent years has made this possible. The emergence of grid computing will change the traditional Client/Server and Client/Cluster structures and form a new Pervasive/Grid architecture. Under such an architecture, the entire network can be treated as one huge computer that offers integrated, dynamic, flexible, intelligent, and collaborative information services. This makes grid computing one of the four modes of network computing (the other three are enterprise computing, peer-to-peer computing, and pervasive computing), and it is gradually entering scientific research, manufacturing informatization, e-government, enterprise collaboration, education informatization, and even entertainment. The need for collaboration also drives the demand for manageability and security in grid computing.
Finally, building on high availability, high performance, and collaboration, grid computing became an extremely attractive concept in 2003. The core of the grid concept is "resources and services", together with how "resources" are used. "Resources" here covers a very broad range, including computers, databases, instruments, and information services. The essence of grid computing is to provide users with unprecedented "advanced services" by breaking the traditional restrictions on "resources".
The "advanced services" of grid computing have the following aspects: making resources available on the grid, coordinating grid resources, and fusing grid resources. Through these services, resources are freed from a specific geographic location and can be delivered to any corner via the grid. Any grid resources can be coordinated under the constraints and management of certain rules, removing the obstacles to wide sharing and collaboration among different resources. Because the resources the grid system provides are enhanced resources, the "advanced services" can also break through the original limits on resource capability and resource type. In these senses, ideal grid computing is in fact pervasive computing and utility computing.
Understanding grid computing in depth
To understand grid computing further, you must understand its layered structure and its defining characteristics.
Key technologies for grid computing
Grid computing is a form of distributed computing. The grid computing model first divides the data to be processed into many "small pieces". The software that processes these pieces is usually a pre-written program, and each node downloads one or more data pieces along with the program, according to its own processing capability. Whenever the node's user is not using the computer, the program runs. A grid computing system generally has a basic four-layer structure: grid hardware, grid operating system, grid interface, and grid application. Its most prominent features are resource sharing, collaborative work, and open standards. Consequently, the main obstacles to research and development at present are establishing standard protocols and settling on an architecture.
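The work-splitting model described above can be sketched in a few lines of Python. This is a single-process simulation under stated assumptions, not how any real grid system is implemented: the `Node` class, its `capacity` field (pieces taken per round), and the `process` function standing in for the pre-written program are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity: int      # pieces the node takes per round (hypothetical measure)
    idle: bool = True  # the program only runs while the owner is not using the PC
    results: list = field(default_factory=list)


def make_pieces(data, piece_size):
    """Divide the data into small pieces of at most piece_size items."""
    return deque(data[i:i + piece_size] for i in range(0, len(data), piece_size))


def process(piece):
    """Stand-in for the pre-written program: here it just sums the piece."""
    return sum(piece)


def run_grid(data, nodes, piece_size):
    """Let idle nodes pull pieces according to capacity until none remain."""
    if not any(n.idle and n.capacity > 0 for n in nodes):
        raise RuntimeError("no idle node is available to do the work")
    queue = make_pieces(data, piece_size)
    while queue:
        for node in nodes:
            if not node.idle:
                continue  # a busy machine contributes nothing
            for _ in range(node.capacity):
                if not queue:
                    break
                node.results.append(process(queue.popleft()))
    # Combine the partial results into the final answer.
    return sum(r for n in nodes for r in n.results)


nodes = [Node("fast", 2), Node("slow", 1), Node("busy", 3, idle=False)]
print(run_grid(list(range(100)), nodes, 10))  # 4950
```

The faster node ends up processing more pieces than the slower one, and the busy node processes none, which mirrors the article's description of nodes working according to their capability and only when idle.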
Metadata, component frameworks, intelligent agents, grid public information protocols, and grid computing protocols are the main breakthrough points for grid computing.
A grid system can be divided into three basic layers: the resource layer, the middleware layer, and the application layer. Because the current Internet was not designed for grids, a scalable middleware layer is generally added to keep grid computing compatible with the existing structure. This middleware layer is a collection of tools and protocol software. Its function is to hide the distributed nature of computing resources in the resource layer and to present a transparent, consistent interface to the application layer. Grid middleware is also called the grid operating system. It must provide user programming interfaces and a corresponding environment to support grid applications.
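The role of the middleware layer described above can be illustrated with a small Python sketch. All class and method names here are hypothetical and are not taken from Globus or any other middleware; the point is only that the application calls one consistent interface while the middleware hides which resource actually does the work.

```python
from abc import ABC, abstractmethod


class Resource(ABC):
    """Resource layer: the uniform contract every resource must satisfy (hypothetical)."""

    @abstractmethod
    def execute(self, task):
        """Run a task and return a result string."""


class ClusterResource(Resource):
    def __init__(self, node_count):
        self.node_count = node_count

    def execute(self, task):
        return f"{task}: done on a {self.node_count}-node cluster"


class MppResource(Resource):
    def execute(self, task):
        return f"{task}: done on an MPP system"


class Middleware:
    """Middleware layer: hides where and on what kind of resource a task runs."""

    def __init__(self):
        self._resources = []
        self._next = 0

    def register(self, resource):
        self._resources.append(resource)

    def submit(self, task):
        if not self._resources:
            raise RuntimeError("no resources registered")
        # Simple rotation; a real system would consult load and policy.
        resource = self._resources[self._next % len(self._resources)]
        self._next += 1
        return resource.execute(task)


# Application layer: it only ever talks to the middleware.
grid = Middleware()
grid.register(ClusterResource(4))
grid.register(MppResource())
print(grid.submit("job-1"))
print(grid.submit("job-2"))
```

Adding a new kind of resource only requires another `Resource` subclass; application code that calls `submit` does not change.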
Grid protocol "Globus": Globus is free software that covers key grid computing technologies such as resource management, security, information services, and data management. It provides a grid computing toolkit that runs on various platforms, helps plan and build large grid testbeds, and supports the development of large applications suited to large grid systems. Globus holds that interoperability in a networked environment requires common protocols that describe messages and the rules for exchanging them. The Globus grid computing protocols build on the Internet protocols for communication, routing, name resolution, and so on. They are divided into 5 layers: the fabric layer, connectivity layer, resource layer, collective layer, and application layer. An upper-layer protocol can call the services of the layers beneath it. All global applications in the grid are invoked through the services these protocols provide.
Grid nodes: the computing resources of the grid, including high-end servers, cluster systems, MPP systems, large storage devices, databases, and so on. These resources are geographically distributed, and the systems are heterogeneous.
Broadband network system: In a grid computing environment, making computing power readily available and giving users low-latency, reliable communication service requires the support of a high-quality broadband network.
Resource management and task scheduling tools: Resource management tools address key issues such as describing, organizing, and managing resources. Task scheduling tools dynamically schedule tasks within the system according to its current load, improving operating efficiency. Both belong to the grid middleware.
Monitoring tools: Making full use of grid resources requires performance analysis and monitoring tools.
Application-layer visualization tools: tools that convert computation results into intuitive graphical information.
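Of the components listed above, load-based task scheduling is the easiest to illustrate. The following minimal Python sketch assigns each task to whichever node currently has the lowest load. It assumes each task has a known cost and that load is a single number per node; both are simplifications, and the function and variable names are hypothetical rather than part of any real scheduler.

```python
import heapq


def schedule(tasks, node_loads):
    """Assign each task to the least-loaded node.

    tasks: list of (task_name, cost) pairs, handled in the given order.
    node_loads: dict mapping node name to its current load.
    Returns a dict mapping task name to the chosen node name.
    """
    if not node_loads:
        raise ValueError("at least one node is required")
    # Min-heap ordered by load; ties are broken by node name.
    heap = [(load, name) for name, load in node_loads.items()]
    heapq.heapify(heap)
    assignment = {}
    for task_name, cost in tasks:
        load, name = heapq.heappop(heap)
        assignment[task_name] = name
        # The chosen node's load grows by the cost of the task it received.
        heapq.heappush(heap, (load + cost, name))
    return assignment


plan = schedule([("t1", 5), ("t2", 3), ("t3", 1)], {"a": 0, "b": 4})
print(plan)  # {'t1': 'a', 't2': 'b', 't3': 'a'}
```

A production scheduler would also refresh load figures from monitoring data rather than rely on estimated costs, which is where the monitoring tools mentioned above come in.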
Grid computing keywords
A computing grid is a hardware and software infrastructure that provides reliable, consistent, pervasive, and inexpensive access to high-end computing power. On a virtual platform based on grid computing, computer resources can be redistributed as needed.
"Infrastructure": A computing grid pools resources on a large scale. This requires an effective hardware infrastructure to provide the necessary interconnections and an effective software infrastructure to monitor and control the resulting whole.
"Reliable": Reliable service is the most basic requirement. Users must be assured that the service they receive from the various parts of the grid is predictable, continuous, and high-performance.
"Consistent": Grid computing requires standardized services, standard access interfaces, and standard operating parameters. The challenge in developing standards is how to encapsulate differences without compromising efficient execution.
"Pervasive": However the grid's operating environment changes, reliable service remains available through a universal means of access. Pervasiveness does not mean that resources are everywhere or open to everyone; a computing grid uses restricted access and access control. Within the environment the grid supports, users can count on universal access.
"Inexpensive": For grid computing to be widely accepted and used, its relative cost must be low.
"Resource pooling": Grid computing lets a company's users view the company's entire IT infrastructure as one computer and locate unused resources according to their needs.
"Data sharing": Grid computing enables companies to access remote data. This is especially useful in certain life sciences projects, where companies need to share human genome data with one another.
"Grid collaboration": Grid computing enables widely dispersed organizations to cooperate, integrate business processes, and share everything from engineering blueprints to software applications.
Where is grid computing used?
High-performance computing applications demand computing power that no single computer can provide, so it must be obtained by building a "networked virtual supercomputer" or "metacomputer". In the early 1990s, in response to the persistently low utilization of hosts on the Internet, the US National Science Foundation (NSF) linked its four supercomputing centers into a metacomputer, which was gradually developed to tackle the most challenging parallel problems.
Grid computing brings together various types of computers (including clusters), databases, devices, and storage systems to form a virtual high-performance computing environment that is relatively transparent to users. It supports distributed computing, high-throughput computing, collaborative engineering, data querying, and more.
Main applications of grid computing
Grid computing emerged in scientific research in the 1990s. To date, it has mainly been used by universities and research laboratories for high-performance computing projects that need enormous computing power or access to large volumes of data. Its applications are wide-ranging: rapid analysis of satellite images, design, bioinformatics research, large-scale video conferencing, manufacturing design and production, digital libraries, and general business applications. In IT, the earliest grid computing applications supported e-business across industries. For example, producing complex products such as airplanes and cars requires computation for product design, product assembly, and product lifecycle management. Other examples include simulations of complex financial environments and many life sciences projects. Today, enterprises and governments are gradually adopting grid computing for IT optimization, analytics acceleration, information access, engineering design, and design collaboration.
Enterprises that need high-speed processing for design, research, and analysis have also begun to pay attention to this new resource. Uses of grid computing include genetic research; drug research; cutting-edge design projects such as automobiles and spacecraft; and special effects for the entertainment industry.
The world's major grid computing projects
Because a grid computing environment can connect computers of different standards across a wide area into one huge global computing system, it is an advanced form of Internet development. Countries and organizations around the world therefore attach great importance to grid computing, and many forums, experimental environments, and research projects have been launched. Most of them share their resources online, which greatly facilitates research and use; the grid computing forums, for example, aim to promote related technologies. Representative grid computing projects include various testbeds and the Globus, Legion, Globe, NetSolve, and Javelin projects. They can be roughly divided into two categories: Globus-based projects and Java-based grid computing. The world's most influential grid projects are the SETI@home program, Globus, and the D2OL project.
At present, the largest grid computing project on the Internet is SETI@home, which analyzes radio telescope data in search of extraterrestrial intelligent life; close to 4.5 million computers have taken part. A successful model of grid computing, SETI@home set out in early 1999 to build an array of 2 million personal computers around the world to search radio telescope signals for signs of alien civilizations. The project team said that in less than two years, this approach completed the equivalent of 345,000 years of computation on a single computer. The power of distributed computing is plain to see.
Globus is an R&D project based at the US Argonne National Laboratory, with 12 universities and research institutions participating. Globus technology has been applied in projects such as the NASA grid, the European Data Grid, and the US National Technology Grid. The D2OL project began operating in the first half of 2002, and its initial targets included the Ebola virus, anthrax, and smallpox. In May 2003, as SARS spread around the world and especially in China, the SARS virus was added as a target and became the focus of the work at the time. The combined power of the 55,000 personal computers participating in D2OL reportedly exceeded that of 10 of the best supercomputers of the time.
Grid computing: who has the say?
Grid computing is regarded as the engine technology that will lead the next-generation Internet era, so international IT giants stepped up their grid computing efforts in 2003. Looking back at 2003, the international companies that played leading roles in the technology include IBM, Sun, Oracle, HP, and Japan's Fujitsu.
IBM: "Blue Grid"
IBM expects the grid to become an essential product. In 2003, IBM introduced ten grid solutions for five main industries: financial services, life sciences, government, automotive, and aerospace. These offerings combine IBM's computers with software from smaller grid computing companies, carrying the advantages of grid computing from academia and research into business and the market. Each IBM grid solution includes hardware, software, and services tailored to industry and customer needs. Hardware includes IBM eServer pSeries, IBM eServer xSeries, and IBM storage products; software includes WebSphere, DB2, DiscoveryLink, middleware from grid partners, and the open source Globus Toolkit. Beyond IBM software, hardware, and services, IBM's research centers have also introduced new technologies specifically for grid computing, such as Service Domain. IBM also works with five grid middleware companies to deploy grid solutions in its selected industries. Two of them, Platform Computing and DataSynapse, have signed gold-level agreements with IBM and joined IBM's business partner program.
Among these solutions: the IT optimization grid helps users such as financial institutions discover and use available but idle computing and storage resources; the analytics acceleration grid targets compute-intensive applications, such as those in the biopharmaceutical industry; the information access grid, suited to government agencies, handles queries across existing data resources and the mining of unformatted data; the engineering design grid, suited to industries such as automotive manufacturing, reduces design costs and improves design cycles; and the design collaboration grid, suited to complex system design such as aircraft design, supports data sharing and collaborative work. In addition, IBM has created a grid innovation laboratory to help users tune grid solutions to their business.
IBM has extensive experience and many success stories in grid computing, including the "Blue Grid", the UK National Grid, the North Carolina Bioinformatics Grid, the Asia Pacific grid customer center, and European grid facilities. IBM even built the Butterfly Grid with Butterfly.net, the first commercial grid for the online video game market. IBM divides grid computing into four steps: first, a grid of computers and processors; second, a grid of data, based on data exchange among different computers and individuals; third, a grid of computing services, ensuring uninterrupted operation of computing services; and fourth, on-demand e-business services. IBM believes it has reached the second step.
Sun: "Grid Everywhere"
Since acquiring Gridware in July 2000 and launching the resource management software Sun Grid Engine, Sun has been a leading designer of grid computing technology. Today, Sun's complete set of grid computing technologies, its view of how grid technology evolves ("from cluster, to enterprise, to global grid"), and its commitment to open, standards-based approaches form an important foundation for promoting grid computing applications and development strategies worldwide.
Sun offers Sun Grid Engine software for the Solaris operating environment and the Linux platform as a free download, which has played an important role in spreading grid computing worldwide. In July 2001, by launching the Grid Engine project website, Sun became the first vendor to turn a major grid computing technology into open source. Sun also participates actively in the Global Grid Forum to help establish industry standards, including the Distributed Resource Management Application API (DRMAA) standard.
Sun is developing grid computing vigorously, plays an active leadership role, and has technical advantages. The Java language and related technologies have successfully solved several key problems, such as heterogeneity and security. Java also has a minimal execution environment for grid computing: in theory, any machine in the world with a web browser can take part in the computation. The development and improvement of the Java platform has therefore advanced grid computing models across the board. Grid technology is regarded as a key part of strategies such as the N1 architecture, the virtualized data center, and maximizing customers' return on investment. As a proponent of grid computing and the owner of Java, Sun launched an on-demand service strategy for grid computing in early 2002 and released Sun Grid Engine Enterprise Edition software in June 2002. With it, Sun extended the concept of an open enterprise grid architecture and continued to strengthen its leadership in grid computing. To give enterprises a powerful tool, Sun has released a fully integrated software platform that brings together data, computing, and on-demand services for grid computing.
Where is grid computing headed?
The concept of "grid computing" derives from the "power grid": the hope is to obtain high-performance computing power as easily as electricity. Because a grid computing environment can connect computers of different standards across a wide area into one huge global computing system, an advanced form of Internet development, it is highly valued by countries and organizations worldwide. Many forums have been established on the Internet, experimental environments have been built, and research projects are under way. Most share their resources online, which greatly facilitates research and use.
IBM's grid computing development plan passes through four stages, addressing capacity, data, high availability, and utility-style services (delivered like water and electricity) in turn. The capacity problem has now been solved, and data and high availability will be achieved as soon as possible. As for high availability and utility-style services, IBM believes that turning IT resources into a public utility requires the emergence of corresponding service providers.
More and more people in the industry believe that computing power will eventually become a public resource like electricity and water. At present, "utility computing" solutions are achievable within an enterprise and in digital-city projects such as the "Shanghai Grid". Comprehensive grid computing is still far off, and the reliability and security requirements on infrastructure and services will be more demanding. IBM, Sun, and Oracle have all done a great deal to promote grid computing applications.
To make grid computing technology practical, IBM is working to improve system collaboration, integration, and automation by means of Web Services. Sun has launched its new "Grid Everywhere" strategy, focusing on resource utilization, data grids, and visualization. Other important areas of grid technology include end-to-end collaborative design and supply chain management. By comparison, Sun is more optimistic about enterprise grid computing and the demand for grid computing inside the firewall, and is more pragmatic. This differs from the "global grid" and Internet-based grid plans that IBM advocates.
Oracle gives grid computing a prominent place in its product strategy. The industry believes Oracle is placing its "bet" on database software, hoping to gain an edge over its competitors IBM and Microsoft and make its products the natural choice for large enterprises and service providers adopting the utility computing model. All of these companies are working toward having "all the world's computers work together".
Sun believes that combining Sun Grid Engine software with Sun ONE through web portals, Java, and XML technology is a natural evolution of network computing. Sun has proposed a new grid computing/web services solution that focuses on enterprise-wide grid computing and turns computing resources and data into on-demand services. So far, more than 160,000 CPUs in more than 5,000 cluster and enterprise grid environments are managed by Sun Grid Engine and Sun Grid Engine Enterprise Edition software. Sun has been a leader in the grid computing market for more than 4 years. In mid-November 2003, Sun launched its new "Grid Everywhere" grid computing strategy. This is the second phase of its grid computing strategy: a "building block" approach that adapts to customers' specific needs. The industry likens these building blocks to toy bricks; they provide users with grid computing services that take network computing beyond raw computing power to include collaboration, data grids, and visualization.
The new "building block" approach brings together products, technology, expertise, alliances, and services to design the most suitable grid architecture, helping customers get higher utilization from existing resources and giving them access to network resources that were previously out of reach. It divides Sun's new offerings into four categories: access, data, compute, and visualization building blocks. The access building block makes resources in any location effectively usable. It is delivered through a new grid portal solution based on Sun Grid Engine Enterprise Edition software and the industry-standard Globus Toolkit.
Sun's data building block is a data grid solution composed of the Sun StorEdge Open SAN architecture, the Sun StorEdge 3510 FC array, and Sun StorEdge SAM-FS and QFS software, offering speed, efficiency, and flexibility. Data in any location can be centralized, managed, and protected. Sun's compute building block, also known as the Sun Fire compute grid solution, consists of Sun Fire systems and interconnects. For clusters of small systems, it offers the best price/performance; for very large clusters, it offers the best productivity and price/performance. Sun's visualization building block lets a variety of applications perform graphics operations through local or remote graphics systems. The Visual Grid platform consists of a Sun Fire V880z system, an XVR-4000 high-speed graphics subsystem, and professional software based on the OpenGL industry standard. Sun also provides support through its Grid Reference Architecture and Customer Ready Systems (CRS) programs.
Grid computing in China
"Grid" is a concept that arose from scientific research. It emerged in the early 1990s and began to rise rapidly in 2000. Grid technology is projected to generate an annual output value of $20 trillion. After the grid concept was quickly accepted by enterprises, industry giant IBM became the grid's most enthusiastic participant and advocate, investing heavily to support grid computing research and commercial application development. The same holds in China, where IBM is targeting several major segments of the grid market: e-government applications; university grid applications; design and optimization modeling for automobiles, aircraft, and electronic products; and applications such as hospitals, earthquake prediction, and oil exploration analysis.
The value of the grid is obvious. Like other governments around the world, the Chinese government attaches great importance to grid construction as a way to raise the country's comprehensive national strength and international competitiveness, and under the 863 Program it has proposed building the China National High Performance Computing Environment (China National Grid). The goal of this project is to establish a demonstration computing grid that is geographically distributed and supports heterogeneity; to connect China's high-performance computing centers through the Internet with unified resource management, information management, and user management; and to develop a number of grid applications that need high-performance computing power, yielding a series of research results.
China National Grid will provide high-performance computing, resource sharing, and collaboration capabilities; build application grids for a number of major industries in scientific research, environment and resources, manufacturing, and services; develop high-performance computers for grid computing to equip grid nodes, advancing the research and industrialization of China's high-performance computers; and carry out research on core grid technologies centered on grid software, achieving breakthroughs in key technologies in grid architecture, grid software, grid application technology, grid service models, and grid management and operation mechanisms. China's research on grid computing began in 1998, and in key technology research it is roughly in step with work abroad.
At present, China's grid computing research is concentrated in institutions such as the Institute of Computing Technology of the Chinese Academy of Sciences, the National University of Defense Technology, the Jiangnan Computing Center, and Tsinghua University. These institutions have solid technical foundations and strong research capabilities in high-performance computing. The Chinese Academy of Sciences' main achievement in the field is the Dawning 3000 superserver. Other institutions' main results include the Galaxy supercomputers and the Tongfang Exploration cluster system. From a business standpoint, many companies, such as Haier, TCL, and Lenovo, have made the grid a main focus for the next few years. The DeepComp 6800, with a speed of 4 trillion operations per second and successfully developed in November 2003, is one of the main nodes of the national grid project. In addition, a major domestic manufacturer of high-performance computers has done a great deal of key work on the Shanghai main node of China's national grid.
On December 15, 2003, the Shanghai Supercomputer Center and Dawning (Shuguang) jointly announced that the Dawning 4000A, with a world-competitive speed of 10 trillion operations per second, would be installed at the Shanghai Supercomputer Center. It will handle massive information services and data exchange for grid computing and will be the largest main node in China's national grid. Over the next few years, the grid-oriented Dawning 4000A, 4000H, and 4000T series of high-performance computers will fully support the core computing environment of China's grid.