Build your own Java-based supercomputer

zhaozj, 2021-02-16

If you have ever thought about building your own supercomputer but found the prospect of working in C daunting, pseudo-remote threads can help. This award-winning Java programming model greatly simplifies parallel programming and takes supercomputing out of the laboratory, putting it within reach of every Java programmer.

In the past three years, parallel clusters have changed the face of supercomputing. A field once dominated by multimillion-dollar machines is quickly adopting parallel clusters as the supercomputer of choice. As you might expect, enthusiasm in the open source community has produced hundreds, if not thousands, of parallel cluster projects. The first and best-known open source cluster system is Beowulf. Sponsored by NASA, Thomas Sterling and Donald Becker released Beowulf in 1994; it began as a 16-node demonstration cluster. Today there are hundreds of Beowulf implementations, ranging from the Stone SouperComputer at Oak Ridge National Lab to the custom commercial clusters built by Aspen Systems (see Resources).

Unfortunately for Java programmers, most cluster systems are built around C-based message-passing APIs such as the Message Passing Interface (MPI) or Parallel Virtual Machine (PVM). Parallel programming in C is not easy, so I designed an alternative. This article explains how to combine Java threads and Java Remote Method Invocation (RMI) to create your own Java-based supercomputer. Note that it assumes you have working experience with Java threads and RMI.

What is in a supercomputer?

A supercomputer is defined here as a cluster of eight or more nodes that work together as a single high-performance machine. The Java-based supercomputer consists of a job dispatcher and any number of run servers (also called hosts). The job dispatcher spawns multiple threads, each containing code that performs a different subtask. Each thread migrates its code to a different run server.
Each run server executes the code that was migrated to it and returns the result to the job dispatcher. Finally, the job dispatcher combines the results from all threads. This parallel cluster system is called pseudo-remote threads because the threads are scheduled on the job dispatcher, while the code inside each thread executes on a remote computer.

What are the components of this system?

Here, the term component refers to a logical module of the pseudo-remote thread parallel cluster system. The system includes the following components:

Job Dispatcher is the controlling machine. It spawns the threads, each of which contains one subtask of the main task to be processed by the cluster. The code in each thread is sent to a remote computer for execution. Because the threads are scheduled on the job dispatcher, in theory this machine should not be used to execute any subtasks itself.

Subtask is a user-defined class that defines one data-wise or function-wise part of the main task. You can define different classes for different parts of the main task. The class name Subtask is only an example; you can give a subtask class any name, but the name should describe the subtask assigned to it. When defining a subtask class, you must implement the JobCodeInt interface and its jobCode() method, described below.

JobCodeInt is a Java interface. You must implement this interface and its jobCode() method in every class that defines a subtask. The jobCode() method contains the code that will be executed remotely. If you intend to use a local resource remotely, you must initialize that resource outside the jobCode() method. For example, if you want to send a set of images for remote processing, you must initialize the image objects outside the jobCode() method.
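To make the JobCodeInt contract concrete, here is a minimal sketch. The article does not list the interface's source, so the declaration below is an assumption: a single jobCode() method returning Object, extending Serializable so that an instance can be shipped to another machine. The WordCountTask class and its data are hypothetical; it stands in for the image example by preparing its local resource in the constructor rather than inside jobCode().

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;

// Assumed shape of the interface; the article names it but does not list it.
interface JobCodeInt extends Serializable {
    Object jobCode();
}

// Hypothetical subtask: counts the words in lines of text held by the dispatcher.
class WordCountTask implements JobCodeInt {
    private final String[] lines; // local resource, captured before migration

    WordCountTask(List<String> localLines) {
        // Initialize the resource here, outside jobCode(), so the data
        // travels with the object instead of being looked up remotely.
        this.lines = localLines.toArray(new String[0]);
    }

    @Override
    public Object jobCode() {
        // Uses only standard-library calls, which exist on every run server.
        int words = 0;
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                words += trimmed.split("\\s+").length;
            }
        }
        return Integer.valueOf(words);
    }
}

public class SubtaskSketch {
    public static void main(String[] args) {
        JobCodeInt task = new WordCountTask(
                Arrays.asList("build your own", "Java supercomputer"));
        System.out.println("Words: " + task.jobCode());
    }
}
```

Because the result comes back as a plain Object, the caller has to cast it to the type the subtask is known to return, here Integer.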

You can call classes from the standard Java library inside this method, because those libraries are also available on the remote computer.

RunServer is a Java object that allows remote processes to call its method. The method takes an object that implements the JobCodeInt interface as its parameter. RunServer executes the code inside that object on the computer where RunServer is running (the run server) and returns the result of the computation as an instance of the Object class. Object is the topmost class in the Java class hierarchy.

PseudoRemThr is a Java class that encapsulates a thread and accepts an instance of a given Subtask class. It selects a remote host and sends the Subtask instance to that host. If you want to use a particular resource that is available on one host, such as a database or a printer, you can specify the host.

HostSelector is a module. If you do not specify a remote host, the PseudoRemThr class calls HostSelector to choose one. If no host is idle, HostSelector returns the remote computer with the lowest load. If a remote computer is a multiprocessor system, HostSelector may return its host name more than once. Currently, HostSelector cannot select a host based on the complexity of a given task.

How pseudo-remote threads work

To use pseudo-remote threads, you must implement the job dispatcher and the run servers. This section explains how to implement each part.

Implementing the job dispatcher

1. Decompose the main task into data-wise or function-wise independent subtasks. For each subtask, define a class that implements the JobCodeInt interface (and therefore the jobCode() method). In jobCode(), define the code to be executed for that subtask. Note that you cannot access user-defined local resources on the job dispatcher from inside this method; initialize all such resources outside it, for example in the constructor of the Subtask class.
2. Create as many instances of the PseudoRemThr class as you need and pass an instance of Subtask to each one. To specify a remote host, use the alternate constructor of PseudoRemThr.
3. Wait for the threads to complete.
4. Call getResult() on each PseudoRemThr instance to obtain its result. If the computation has not finished, the method returns a Boolean object whose value is false; otherwise it returns an instance of the Object class that contains the result. You must cast this instance to the class type you expect.
5. Combine the results of all subtasks into the final result.

Implementing the run server

Implementing a run server is simple:

1. Start the RMI registry.
2. Start RunServer. On startup, the run server contacts the job dispatcher and notifies it that it is ready to accept tasks.

A computing example

The following computing example demonstrates the model. It uses two computers in parallel: a 333 MHz Pentium II running Windows 98 and a 500 MHz Pentium III running Windows 2000.

To compute the sum of the square roots of the integers from 1 to 10^9, I created a Sqrt class that calculates the sum of the square roots of all integers between dblStart and dblEnd. Sqrt implements the JobCodeInt interface and therefore the jobCode() method, in which I defined the code that performs the calculation. The constructor passes data into the Sqrt class and initializes any resources that are local to the job dispatcher. In this example, the start and end of the integer range whose square roots are to be summed are passed to the constructor.
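The job dispatcher steps described above can be tried out without any network. The PseudoRemThr source is not shown in this article, so LocalPseudoThr below is a hypothetical stand-in that runs jobCode() in a local thread instead of on a run server. It mimics only the behavior described in the text: getResult() yields a Boolean false until the computation has finished. The JobCodeInt declaration and the RangeSum subtask are likewise assumptions made for illustration.

```java
// Assumed single-method interface, as described in the article.
interface JobCodeInt extends java.io.Serializable {
    Object jobCode();
}

// Hypothetical stand-in for PseudoRemThr: executes the subtask locally.
class LocalPseudoThr {
    private final Thread thread;
    private volatile Object result = Boolean.FALSE; // means "not finished yet"

    LocalPseudoThr(JobCodeInt job, String name) {
        thread = new Thread(() -> result = job.jobCode(), name);
    }

    void start() {
        thread.start();
    }

    void waitForResult() {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
    }

    Object getResult() {
        return result;
    }
}

// Illustrative subtask: sums the integers in [from, to].
class RangeSum implements JobCodeInt {
    private final long from, to;

    RangeSum(long from, long to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public Object jobCode() {
        long sum = 0;
        for (long i = from; i <= to; i++) {
            sum += i;
        }
        return Long.valueOf(sum);
    }
}

public class DispatcherSketch {
    static long dispatch(long n) {
        long mid = n / 2;
        // Steps 1 and 2: decompose the task and wrap each subtask in a thread.
        LocalPseudoThr first = new LocalPseudoThr(new RangeSum(1, mid), "first-half");
        LocalPseudoThr second = new LocalPseudoThr(new RangeSum(mid + 1, n), "second-half");
        first.start();
        second.start();
        // Step 3: wait for the threads to complete.
        first.waitForResult();
        second.waitForResult();
        // Step 4: fetch the results; a Boolean here would mean "not finished".
        Object r1 = first.getResult();
        Object r2 = second.getResult();
        if (r1 instanceof Boolean || r2 instanceof Boolean) {
            throw new IllegalStateException("a subtask has not finished");
        }
        // Step 5: cast and combine.
        return ((Long) r1).longValue() + ((Long) r2).longValue();
    }

    public static void main(String[] args) {
        System.out.println("Sum 1..1000 = " + dispatch(1000));
    }
}
```

One consequence of signalling "not finished" with a Boolean is that a subtask whose legitimate result is a Boolean cannot be told apart from an unfinished one, so in this sketch subtasks should return some other type.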

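The run server side can be sketched with standard java.rmi calls. The article does not list RunServer's remote interface, so RunServerInt, its execute() method, the binding name "RunServer", and the SquareTask subtask are illustrative assumptions; port 3333 is borrowed from the host strings used in the article's own example. For brevity, the sketch creates the registry inside the same JVM rather than starting the rmiregistry tool separately, and it calls the server over the loopback address.

```java
import java.io.Serializable;
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

public class RunServerSketch {
    // Assumed interface from the article: the code to execute remotely.
    public interface JobCodeInt extends Serializable {
        Object jobCode();
    }

    // Hypothetical remote interface; the article does not show the real one.
    public interface RunServerInt extends Remote {
        Object execute(JobCodeInt job) throws RemoteException;
    }

    // Runs whatever job it receives and hands back the result.
    public static class RunServerImpl implements RunServerInt {
        @Override
        public Object execute(JobCodeInt job) {
            return job.jobCode();
        }
    }

    // Trivial subtask used only for this demonstration.
    public static class SquareTask implements JobCodeInt {
        private final int n;

        public SquareTask(int n) {
            this.n = n;
        }

        @Override
        public Object jobCode() {
            return Integer.valueOf(n * n);
        }
    }

    static Object runOnce(int port, JobCodeInt job) {
        // Keep the demo on the loopback interface.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        RunServerImpl server = new RunServerImpl();
        try {
            // Run server side: start a registry and register the server.
            Registry registry = LocateRegistry.createRegistry(port);
            try {
                RunServerInt stub =
                        (RunServerInt) UnicastRemoteObject.exportObject(server, 0);
                registry.rebind("RunServer", stub);

                // Dispatcher side: look the server up and send it the job.
                Registry remote = LocateRegistry.getRegistry("127.0.0.1", port);
                RunServerInt runServer = (RunServerInt) remote.lookup("RunServer");
                return runServer.execute(job);
            } finally {
                // Unexport both objects so that the JVM can exit.
                UnicastRemoteObject.unexportObject(registry, true);
                UnicastRemoteObject.unexportObject(server, true);
            }
        } catch (RemoteException | NotBoundException e) {
            throw new IllegalStateException("RMI call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Remote result: " + runOnce(3333, new SquareTask(12)));
    }
}
```

The job object is serialized on the way to the server and the result is serialized on the way back, which is why both the subtask and its return value must be serializable.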
Listing 1 shows the definition of the Sqrt class.

Listing 1. Defining the Sqrt class

    // The Sqrt class calculates the sum of the square roots of the integers
    // between dblStart and dblEnd.
    // It implements the jobCode() method of the JobCodeInt interface; the code
    // to be executed remotely is located in jobCode().
    // Data is passed to the class through the constructor, which also
    // initializes any resources that are local to the job dispatcher.
    // In this example, the start and end of the integer sequence whose square
    // roots are to be summed are passed to the Sqrt class.
    public class Sqrt implements JobCodeInt
    {
        double dblStart, dblEnd, dblPartialSum;

        public Sqrt(double start, double end)
        {
            dblStart = start;
            dblEnd = end;
        }

        public Object jobCode()
        {
            dblPartialSum = 0;
            // Standard Java functions and objects can be called here.
            for (double i = dblStart; i <= dblEnd; i++)
                dblPartialSum += Math.sqrt(i);
            // Return the result as an object of a standard Java class.
            return Double.valueOf(dblPartialSum);
        }
    }

The JobDispatcher class creates two instances of the Sqrt class. It decomposes the main task, assigning one subtask to one Sqrt object (sqrt1) and the remaining subtask to the other Sqrt object (sqrt2). Next, JobDispatcher creates two objects of the PseudoRemThr class and passes one Sqrt object to each as a parameter. It then waits for the threads to finish. Once they have finished, a partial result can be retrieved from each PseudoRemThr instance. Combining the partial results gives the final result, as shown in Listing 2.

Listing 2. JobDispatcher

    // This class can have any name you choose.
    // JobDispatcher is used here only for convenience.
    public class JobDispatcher
    {
        public static void main(String[] args)
        {
            double fin = 1000000000;      // represents 10^9
            double finByTen = fin / 10;   // represents 10^8
            long nlStartTime = System.currentTimeMillis();

            // Range from 1 to 3 * 10^8
            Sqrt sqrt1 = new Sqrt(1, finByTen * 3);
            // Range from (3 * 10^8) + 1 to 10^9
            Sqrt sqrt2 = new Sqrt((finByTen * 3) + 1, fin);

            // Create two instances of the PseudoRemThr class.
            // The parameters of this constructor are as follows.
            // First parameter:  an instance of the class that represents a subtask
            // Second parameter: the remote host on which to run this subtask
            // Third parameter:  a descriptive name for the PseudoRemThr instance
            PseudoRemThr psr1 = new PseudoRemThr(sqrt1, "//192.168.1.1:3333/", "Win98");
            PseudoRemThr psr2 = new PseudoRemThr(sqrt2, "//192.168.1.2:3333/", "Win2K");

            // Wait for both threads to finish executing.
            psr1.waitForResult();
            psr2.waitForResult();

            // Get the result of each thread.
            Double res1 = (Double) psr1.getResult();
            Double res2 = (Double) psr2.getResult();
            double finalRes = res1.doubleValue() + res2.doubleValue();

            long nlEndTime = System.currentTimeMillis();
            System.out.println("Total time taken: " + (nlEndTime - nlStartTime));
            System.out.println("Sum: " + finalRes);
        }
    }

Evaluation

The total execution time for this computation was between 120,000 and 128,000 milliseconds. When the same task ran locally, without being decomposed, the execution time was between 183,241 and 237,641 milliseconds.

Initially, the main task was the sum of the square roots of all integers from 1 to 10^7. To test performance, I expanded the range to 10^8 and finally to 10^9. As the size of the task grew, the difference between the time required for remote parallel execution and for local execution became increasingly apparent. In other words, for large tasks, remote parallel execution takes less time. It is not suitable for small tasks, because the overhead of communication between machines cannot be ignored. As the task grows, that communication overhead becomes insignificant compared with the cost of performing the whole task on a single machine. I therefore conclude that the pseudo-remote thread system works well for tasks that require a great deal of computation.

What are the advantages of using pseudo-remote threads?

Because pseudo-remote threads are a Java-based system, they can be used to implement clusters, including heterogeneous clusters that contain multiple operating systems.
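In a heterogeneous cluster the machines may also differ in speed, as the two computers in the example did; there the range was split 3:7 between the slower and the faster machine. As a rough illustration of how such a data-wise split could be computed, the sketch below divides an integer range in proportion to per-host weights. The RangePartitioner class, its split() method, and the weights are hypothetical and not part of the pseudo-remote thread system, whose HostSelector does not weigh tasks in this way.

```java
import java.util.ArrayList;
import java.util.List;

public class RangePartitioner {
    // Splits [start, end] into consecutive, non-overlapping sub-ranges whose
    // lengths are proportional to the given positive weights. Each element
    // of the returned list is a {first, last} pair, both inclusive.
    static List<long[]> split(long start, long end, int[] weights) {
        long totalWeight = 0;
        for (int w : weights) {
            totalWeight += w;
        }
        long length = end - start + 1;
        List<long[]> parts = new ArrayList<>();
        long next = start;
        long weightSoFar = 0;
        for (int w : weights) {
            weightSoFar += w;
            // Working from the cumulative weight keeps the pieces contiguous
            // and guarantees that the final piece stops exactly at `end`.
            long last = start + length * weightSoFar / totalWeight - 1;
            parts.add(new long[] {next, last});
            next = last + 1;
        }
        return parts;
    }

    public static void main(String[] args) {
        // Same proportions as the example: 3 parts for the slower machine,
        // 7 parts for the faster one.
        for (long[] part : split(1, 1_000_000_000L, new int[] {3, 7})) {
            System.out.println(part[0] + " .. " + part[1]);
        }
    }
}
```

With weights 3 and 7 over the range 1 to 10^9, the two sub-ranges are the same ones that the example passes to its two Sqrt objects. For very short ranges a piece can come out empty, which a real dispatcher would need to skip.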
With pseudo-remote threads, you avoid writing native C/C++ code and can take advantage of the Java standard library and its various extension libraries. In addition, pseudo-remote threads free you from worrying about memory management. The disadvantage, of course, is that system performance is tied directly to JRE performance.

Business applications are now being built on the Java platform. Given the practical difficulty of using native C/C++ code to exploit parallelism, Java-based supercomputing may well find its way into the commercial world, with parallelism and load balancing taken into account when Java-based applications are created.

The Internet is a good example of a heterogeneous cluster, so pseudo-remote threads could be deployed across the Internet, turning the Web into a single Java-based supercomputer (see Resources). In practice, however, you should note that the best results are obtained on a dedicated cluster that does nothing but execute a single task.

Finally, for everyday use, pseudo-remote threads can turn a local area network (LAN), such as a campus network or a home network, into a miniature supercomputer. This is how Beowulf systems are used. With pseudo-remote threads, Java programmers can build their own supercomputers too.

Resources

- An article (WeatherWorks, May 2000) surveys the open source and closed source cluster solutions currently available for Linux.
- For more information about distributed operating systems, see Modern Operating Systems by Andrew S. Tanenbaum (Prentice Hall, February 1992).
- To learn more about parallel programming, see Practical Parallel Programming by Gregory V. Wilson (MIT Press, December 1995).
- To learn more about clusters, see the Cluster Cookbook from the Scalable Computing Laboratory.
- For an in-depth discussion of supercomputing with Java technology and the Web, see "Create Your Own SuperComputer with Java?" by Laurence Vanhelsuwé (JavaWorld, January 1997).
- The Linux Documentation Project hosts the Beowulf HOWTO.
- Visit the Beowulf website to learn more about the Beowulf project.
- Read about the famous Stone SouperComputer at Oak Ridge National Lab.
- Aspen Systems is one of the few vendors that currently provide custom cluster solutions.