Note: This is a piece I published on Vccode during the SARS outbreak last year. It has recently become fairly popular and has prompted some good discussion.
http://www.vccode.com/file_show.php?id=1852
http://www.vccode.com/file_download.php?id=1852 Source Code
Original work: this article and its programs may be used free of charge; please do not use them commercially.
Original algorithm idea: 1) Assume every number is prime until it is proven not to be. 2) Find the primes by sieving out all multiples of the small primes.

Improvement 1: Use the fact that every prime except 2 is odd. Storing only the odd numbers halves the sieve space.

Improvement 2: Since only odd numbers are stored, products of an odd number and an even number no longer occur. The multiplier starts at 3 and takes only odd values, which halves the number of loop iterations.

Improvement 3 (probably only possible in C): Store each mark in a single bit, which reduces the storage space to 1/8. Although each loop iteration performs more bit operations, the time spent allocating memory drops sharply, and primes can be found over a much larger range. Two macros are defined:

    #define getisp(i) ((isprime[(i) / 8] >> ((i) % 8)) & 0x1)

gets the mark of the i-th number, and

    #define setisp(i, j) ((j) ? (isprime[(i) / 8] |= (0x1 << ((i) % 8))) : (isprime[(i) / 8] &= ~(0x1 << ((i) % 8))))

sets whether the i-th number is prime. They are written as macros rather than functions to avoid the stack operations of a function call.

Improvement 4: Before setting the mark of the i-th number, check whether it has already been marked as composite; if it has, do not set it again. Because a read is cheaper than a write, this saves time.

Improvement 5: (1) In the macros, change the / 8 operation to >> 3 and the % 8 operation to & 0x7. (2) Since the sieve only ever sets a mark to one fixed value, the setisp macro is broken up and the test of whether the value is 1 or 0 is dropped. (3) To reduce the number of divisions, the loop variable is transformed algebraically, although the program becomes much less readable. These changes improve the speed.

Improvement 6: (1) When sieving the multiples of a prime, every product whose multiplier is smaller than that prime has already been sieved as a multiple of a smaller prime, so the multiplier should start from the prime itself. For example, when sieving the multiples of 5, 15 has already been sieved as a multiple of 3, so sieving should start from 25. This is an improvement to the algorithm itself. (2) Fetching the prime mark through the macro takes 4 operations, while testing for a multiple of 3 with an intermediate counter takes only 3 steps (one increment, one comparison, and one assignment when the comparison succeeds), so multiples of 3 are tested separately. For multiples of primes greater than 3, a separate test costs more than the macro check, so no further special cases are added. The memory used for the multiples of 3 could also be saved in this scheme, but the code would become more complicated and slower, so this was not done.
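The improvements described so far can be put together in a small sketch. This is not the downloadable source code, only a minimal illustration under a few assumptions: the names GETISP, CLRISP and largest_prime are made up here, bit j of the table stands for the odd number 2*j + 1, and a set bit means "still assumed prime".

```c
#include <stdlib.h>
#include <string.h>

/* Bit j of the table stands for the odd number 2*j + 1 (1 = still assumed prime). */
static unsigned char *isprime;

/* Hypothetical macro names; >> 3 and & 0x7 replace / 8 and % 8. */
#define GETISP(j) ((isprime[(j) >> 3] >> ((j) & 0x7)) & 0x1)
#define CLRISP(j) (isprime[(j) >> 3] &= (unsigned char)~(0x1 << ((j) & 0x7)))

/* Returns the largest prime <= max (for max >= 2), or 0 if malloc fails. */
static long largest_prime(long max)
{
    long k = (max - 1) / 2;               /* highest index in use */
    size_t nbytes = (size_t)(k >> 3) + 1;
    long i, j;

    isprime = malloc(nbytes);
    if (isprime == NULL)
        return 0;
    memset(isprime, 0xFF, nbytes);        /* assume every odd number is prime */

    /* p = 2*i + 1; the index of p*p is i*(2*i + 2), so this runs while p*p <= max */
    for (i = 1; i * (2 * i + 2) <= k; i++) {
        if (!GETISP(i))
            continue;                     /* p is composite, nothing to sieve */
        /* start at p*p; stepping the index by p steps the value by 2*p */
        for (j = i * (2 * i + 2); j <= k; j += 2 * i + 1) {
            if (GETISP(j))                /* read first, write only when needed */
                CLRISP(j);
        }
    }

    for (j = k; j >= 1; j--) {            /* index 0 stands for 1, which is not prime */
        if (GETISP(j))
            break;
    }
    free(isprime);
    return j >= 1 ? 2 * j + 1 : 2;
}
```

For example, largest_prime(100) returns 97. The sketch leaves out the separate handling of multiples of 3, so it corresponds roughly to improvements 1 through 5 plus part (1) of improvement 6.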
The replacement core statements are as follows. Because the program has become hard to read, the equivalent statements in the earlier form are attached as a comment:

    char _3Times = 0;
    i = (max - 1) / 2;
    for (int j = 4; j <= i; j += 3)    // set all 3's times
    {
        isprime[(j) >> 3] &= ~(0x1 << ((j) & 0x7));
    }
    char i1 = 0;
    int m = (n - 1) / 2;
    int k = (max - 1) / 2;
    for (i = 2; i <= m; i++)
    {
        if (++i1 == 3)    // skip when i is 3's times
        {
            i1 = 0;
            continue;
        }
        if (getisp(i))    // if 2 * i + 1 is a prime
        {
            _3Times = ((i << 2) + 1) % 3;    // (2 * (2 * i + 1) % 3 - 1)
            for (int j = i * (2 * i + 2); j <= k; j = j + i + i + 1)
            {
                if (++_3Times == 3)    // skip when j is 3's times
                {
                    // printf("skip i=%d, j=%d, %d\n", 2 * i + 1, 2 * j + 1, _3Times);
                    _3Times = 0;
                    continue;
                }
                if (getisp(j))    // good idea
                    isprime[(j) >> 3] &= ~(0x1 << ((j) & 0x7));
                // else
                //     ct++;
            }
        }
    }
    /*
    for (i = 5; i <= n; i += 2)
    {
        if (++i1 == 3)    // skip when i is 3's times
        {
            i1 = 0;
            continue;
        }
        int m = (i - 1) / 2;
        if (getisp(m))    // if i is a prime,
        {
            _3Times = (i + i) % 3 - 1;    // locate the j that are 3's times; (i * i) % 3 is always 1
            for (int j = i * i; j <= max; j = j + i + i)    // loop through multiples,
            {
                if (++_3Times == 3)    // skip when j is 3's times
                {
                    // printf("skip i=%d, j=%d, %d\n", i, j, _3Times);
                    _3Times = 0;
                    continue;
                }
                int k = (j - 1) / 2;    // they are not prime.
                if (getisp(k))    // good idea
                {
                    isprime[(k) >> 3] &= ~(0x1 << ((k) & 0x7));
                }
                else
                {
                    // printf("cat i=%d, j=%d, %d\n", i, j, _3Times);
                    ct++;
                }
            }
        }
    }
    */

Improvement 7: Reading up on Java syntax yesterday, I found that Java also supports bit operations but has no macro definitions, and that its type checking is stricter; for example, an integer cannot be implicitly converted to a boolean. Starting from the improvement 6 C program, the type of isprime is changed from unsigned char * to byte[], held as a static data member of the class:

    static byte[] isprime;

and getisp becomes the following function:

    public static int getisp(int i) {
        return (isprime[(i) >> 3] >> ((i) & 0x7)) & 0x1;
    }

Calls are written as if (getisp(j) == 1) instead of if (getisp(j)). Java's byte type is signed, but this does not affect the correctness of the program. The result is exciting: the running speed reaches that of the C program for improvement 3. To avoid the overhead of the function call, the function can be expanded inline. Since Java did not accept the expression if ((isprime[(k) >> 3] >> ((k) & 0x7)) & 0x1) == 1) as written, the result of the bit operation has to be assigned to an int variable f first and then tested with f == 1. The resulting change in speed is not noticeable.
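Returning to the counter that skips multiples of 3 in improvement 6: it can be checked in isolation. The sketch below uses a hypothetical helper, check_skip_counter, which is not part of the original program. It assumes p is odd and not a multiple of 3, walks the values p*p, p*p + 2p, p*p + 4p, ... with the same start value ((i << 2) + 1) % 3 for i = (p - 1) / 2, and confirms that the counter reaches 3 exactly on the multiples of 3.

```c
/* Returns 1 if the counter skips exactly the multiples of 3 among
   p*p, p*p + 2p, p*p + 4p, ... up to max, and 0 otherwise.
   Assumes p is odd and not a multiple of 3 (hypothetical helper). */
static int check_skip_counter(long p, long max)
{
    long i = (p - 1) / 2;                      /* index of p in the odd-only table */
    char times3 = (char)(((i << 2) + 1) % 3);  /* same start value as _3Times */
    long v;

    for (v = p * p; v <= max; v += 2 * p) {
        if (++times3 == 3) {                   /* counter says: multiple of 3 */
            times3 = 0;
            if (v % 3 != 0)
                return 0;                      /* skipped a value that is not one */
            continue;
        }
        if (v % 3 == 0)
            return 0;                          /* visited a multiple of 3 */
    }
    return 1;
}
```

The reason it works is that p*p leaves remainder 1 when divided by 3 for any p that is not a multiple of 3, and each step adds 2p, so the remainders repeat with period 3. The start value only has to line the counter up with the first multiple of 3 in the sequence.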
Supplement: jexegen, provided with MS Java SDK 4.0, converts a .class file into an EXE file (the .class must be compiled with MS jvc rather than with the Sun JDK javac, and only JDK 1.1 features can be used). This speeds up the Java program, but the result is no longer a pure Java program.

    C:\Sieve> jexegen /out:sieve4.exe /main:sieve4 sieve4.class

Speed test method: on the command line, run test followed by the program name and the maximum integer. It shows the time before and after execution, for example:

    test java sieve 50000000

Note: this time includes loading the class into memory and the interpretation time, so it is not an entirely fair result for Java.

Platform: PIII 650, 128 MB, Windows 2003, JDK 1.4.1_02, LCC-Win32 ver 3.3

C program test command line:

    C:\> test sievex 12345678

Java program test command line:

    C:\> test java sievex 12345678

The computed result is: The largest prime less than or equal to 12345678 is 12345653. The average times are as follows:

    /* ------------ Original, Java ------------------ 7.08 s -------- */
    /* ------------ Improvements 1 and 2, Java ------ 3.02 s -------- */
    /* ------------ Improvements 1 and 2, C --------- 2.34 s -------- */
    /* ------------ Improvement 3 ------------------- 1.51 s -------- */
    /* ------------ Improvements 4 and 5 ------------ 1.12 s -------- */
    /* ------------ Improvement 6, Java ------------- 2.30 s -------- */
    /* ------------ Improvement 6, Java EXE --------- 1.64 s -------- */
    /* ------------ Improvement 7, Java ------------- 1.53 s -------- */
    /* ------------ Improvement 7, Java EXE --------- 1.14 s -------- */

Conclusion: 1. Improving the algorithm matters much more than improving individual statements or instructions, and division instructions are not as slow as one might expect. 2. The Java and C programs run at very similar speeds.

This is as far as I can go in C. The next step would be assembly, but I cannot write it :( Is there a mathematically better way to sieve primes? I do not know; corrections are welcome.
Because this program is mainly meant to illustrate the prime sieve, some details are not handled, such as a maximum number of 0 or 1. In some places there are possible rewrites, such as changing x * 3 to (x << 1) + x, or j = j + i + i + 1 to j = j + 1 + (i << 1). I do not know how the compiler optimizes these, so I did not try to be clever.

Explanation: Because of the value range and the limit on malloc space, the maximum number for programs 1-2 cannot exceed 10^8 and for programs 3-5 cannot exceed 2 * 10^8; otherwise the system may crash.
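Part of these limits comes from how much memory each layout needs. As a rough sketch, assuming the original program keeps one byte per integer, improvement 1 keeps one byte per odd number, and improvement 3 keeps one bit per odd number (all function names here are hypothetical), the table sizes and a checked allocation could look like this:

```c
#include <stdlib.h>
#include <string.h>

/* One byte per integer 0..max (assumed layout of the original program). */
static size_t bytes_per_number(unsigned long max)
{
    return (size_t)max + 1;
}

/* One byte per odd number: index j stands for 2*j + 1 (assumes max >= 1). */
static size_t bytes_per_odd(unsigned long max)
{
    return (size_t)((max - 1) / 2) + 1;
}

/* One bit per odd number, eight marks per byte (assumes max >= 1). */
static size_t bytes_bit_per_odd(unsigned long max)
{
    return (size_t)(((max - 1) / 2) >> 3) + 1;
}

/* Allocates the bit table with every mark set to "prime".
   Returns NULL instead of crashing when the memory is not available. */
static unsigned char *alloc_marks(unsigned long max)
{
    size_t n = bytes_bit_per_odd(max);
    unsigned char *p = malloc(n);
    if (p != NULL)
        memset(p, 0xFF, n);
    return p;
}
```

For a maximum of 10^8 these functions give 100000001, 50000000 and 6250000 bytes respectively. Checking the result of malloc, as alloc_marks does, lets a caller report the failure rather than write through a null pointer.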