Data structure learning (C ++) continued - sort [3] exchange sort

zhaozj2021-02-16  61

[3] Exchange Sort

Basic thinking is: Two-two comparison of key codes to be sorted, if in reverse order, exchange it until all objects are ranked.

Bubbling sort

Bubbling Sort is a comparison of two records, and the reverse order is exchanged. Such a method leads to a small key code layer, so the name is obtained. 9CBS Forum once discussed "bubble" and "bubble" are not a thing. It seems that this is a disaster that translators. The English name is Bubble Sort, and it can be in the case of being written. . (Strict version is from the back, Yin version is from the back of the back, it is good to translate the two books as "bubble sort", otherwise it is as the conclusion of some people - one is from The front row, one is from going to the back row)

Template

Void Bubblesort (T A [], INT N, INT & KCN, INT & RMN)

{

KCN = 0; RMN = 0; bool exchange = true;

For (int i = 1; i

For (int J = n - 1; j> = i; j -)

{

EXCHANGE = FALSE;

IF ( KCN && A [J - 1]> A [J]) {SWAP (A [J - 1], A [J]); Exchange = True; RMN = 3;}

}

}

It should be noted that don't write it below, although the result is correct:

Template

Void Bubblesort2 (T A [], INT N)

{

For (int i = 0; i

For (int J = 1; j

IF (A [J - 1]> A [J]) SWAP (A [J - 1], A [J]);

}

Test Results:

Sort ascending n = 10000 timeespared: 0ms

KCN = 9999 kcn / n = 0.9999 kcn / n ^ 2 = 9.999e-005 kcn / nlogn = 0.07525

RMN = 0 RMN / N = 0 RMN / N ^ 2 = 0 RMN / NLOGN = 0

Sort randomness n = 10000 timeespared: 1161ms

KCN = 45409094 kcn / n = 4540.91 kcn / n ^ 2 = 0.454091 KCN / NLOGN = 341.737

RMN = 71526984 RMN / N = 7152.7 RMN / N ^ 2 = 0.71527 RMN / NLOGN = 538.294

Sort descending n = 10000 timeespared: 1022ms

KCN = 49995000 kcn / n = 4999.5 kcn / n ^ 2 = 0.49995 kcn / nlogn = 376.25

RMN = 149985000 RMN / N = 14998.5 RMN / N ^ 2 = 1.49985 RMN / NLOGN = 1128.75

It can be seen that the efficiency is very poor, it is better to sort, I really don't know why people are here, is it to understand rapid sorting? There is another interesting phenomenon. Although the reverse sequence KCN and RMN are more than the downtime, the time in the reverse sequence is less than the chaos, from here you can see the role of the CPU pipeline, here you can give us a signal, one Really good algorithms need to make full use of hardware characteristics. When the number of records (n = 1000000), it can be seen that in the case of complete or order, the foaming is better than the insertion, as mobile records are required at this time. Quick sort

It's really sorrowful for this algorithm, even a name that can indicate the substance of the algorithm (such as straight insertion, table plug) is not, not like Hill sorting is named by inventors, is it because it is too fast? Perhaps "fast" is the highest honor of sorting algorithms.

Basic ideas are: one record of serving sequence as a reference, according to this key code size, divide the entire sequence into two sequences - all records of all records on the left are smaller than the reference small (or etc.), right side Both are larger than the benchmark, the baseline is placed between two subsequences, apparent that the reference is placed in the last position. The above procedure is repeated separately from the left and right subsequences, and all records are placed in the corresponding position.

The following routines are not easy to understand, because this is a few improvements:

Template

INT Partition (T a [], int LEFT, INT RIGHT, INT & KCN, INT & RMN)

{

INT pivotpos = left; t pivot = a [left]; // pivot

For (int i = left 1; i <= right; i )

IF ( KCN && A [i]

{SWAP (a [i], a [pivotpos]); RMN = 3;

SWAP (a [left], a [pivotpos]); RMN = 3;

Return Pivotpos;

}

The calculated pivot position is used as a function to avoid the use of useless temporary variables when recursive. When you decide to use recursive, pay attention to this - put everything can be placed outside. Note how this function reaches our "The left side is smaller than it, the right side is bigger than it".

Template

Void QSrecurve (T a [], int LEFT, INT RIGHT, INT & KCN, INT & RMN)

{

IF (Left

{

INT Pivotpos = Partition (A, LEFT, RIGHT, KCN, RMN);

QSrecurve (A, LEFT, PIVOTPOS - 1, KCN, RMN);

QSrecurve (A, Pivotpos 1, Right, KCN, RMN);

}

}

Template

Void Quicksort (T A [], INT N, INT & KCN, INT & RMN)

{

KCN = 0; RMN = 0;

QSrecurve (A, 0, N - 1, KCN, RMN);

}

These two can only be considered a housing, especially the last one.

Test Results:

Sort ascending n = 10000 Timespared: 1051ms

KCN = 49995000 kcn / n = 4999.5 kcn / n ^ 2 = 0.49995 kcn / nlogn = 376.25

RMN = 29997 rmn / n = 2.9997 RMN / N ^ 2 = 0.0002997 RMN / NLOGN = 0.22575

Sort randomness n = 10000 timeespared: 0ms

KCN = 155655 kcn / n = 15.5655 kcn / n ^ 2 = 0.00155655 KCN / NLOGN = 1.17142

RMN = 211851 RMN / N = 21.1851 RMN / N ^ 2 = 0.00211851 RMN / NLOGN = 1.59434

Sort descending n = 10000 timeespared: 1082ms

KCN = 49995000 kcn / n = 4999.5 kcn / n ^ 2 = 0.49995 kcn / nlogn = 376.25

RMN = 29997 rmn / n = 2.9997 RMN / N ^ 2 = 0.0002997 RMN / NLOGN = 0.22575

It can be seen that the average performance is very good, but the performance at both ends is not as straight. The case of testing n = 100000 is as follows (Million Remember to comment out the test and reverse test, otherwise, "crash" not to find me)

Sort randomness n = 100000 Timespared: 110ms

KCN = 2123221 KCN / N = 21.2322 kcn / n ^ 2 = 0.000212322kcn / nlogn = 1.27831

RMN = 3010848 RMN / N = 30.1085 RMN / N ^ 2 = 0.000301085RMN / NLOGN = 1.81271

It is really very "fast", but its worst situation is really unfameful, in case ..., and because of the use of stack recursive, the worst case is not allowed to collapse. In order to reduce this adverse tendency, the improved approach is "Triple Taking", that is, the first, last, one of the key code hosted in the middle of the sequence to be sorted as the reference. Just change the partition function.

Template

INT Partition (T a [], int LEFT, INT RIGHT, INT & KCN, INT & RMN)

{

INT MID = (LEFT RIGHT) / 2;

IF ( KCN && A [Left]> a [MID])

{

IF ( KCN && A [Left]> a [Right])

{

IF ( KCN && A [MID]> A [Right]) {swap (a [MID], A [Left]); RMN = 3;}

Else {swap (a [right], a [left]); RMN = 3;}

}

}

Else

{

IF ( KCN && A [LEFT]

{

IF ( KCN && A [MID]

}

}

INT pivotpos = left; t pivot = a [left]; // pivot

For (int i = left 1; i <= right; i )

IF ( KCN && A [i]

SWAP (a [left], a [pivotpos]); RMN = 3;

Return Pivotpos;

}

Just add a bold portion to the original partition function. Below is the test results:

Sort ascending n = 10000 timeespared: 0ms

KCN = 131343 kcn / n = 13.1343 kcn / n ^ 2 = 0.00131343 KCN / NLOGN = 0.988455

RMN = 35424 RMN / N = 3.5424 RMN / N ^ 2 = 0.00035424 RMN / NLOGN = 0.266592

Sort randomness n = 10000 timeespared: 0ms

KCN = 154680 kcn / n = 15.468 KCN / N ^ 2 = 0.0015468 KCN / NLOGN = 1.16408

RMN = 204093 RMN / N = 20.4093 RMN / N ^ 2 = 0.00204093 RMN / NLOGN = 1.53595

Sort descending n = 10000 Timespared: 280ms

KCN = 12517506 kcn / n = 1251.75 kcn / n ^ 2 = 0.125175 kcn / nlogn = 94.2036

RMN = 45006 RMN / N = 4.5006 RMN / N ^ 2 = 0.00045006 RMN / NLOGN = 0.338704

The following is the test results of n = 100,000, still very embarrassing during the reverse order, but it is still the past.

Sort ascending n = 100000 Timespared: 60ms

KCN = 1665551 kcn / n = 16.6555 kcn / n ^ 2 = 0.000166555kcn / nlogn = 1.00276

RMN = 393210 RMN / N = 3.9321 RMN / N ^ 2 = 3.9321E-005RMN / NLOGN = 0.236736

Sort randomness n = 100000 Timespared: 110ms

KCN = 1888590 kcn / n = 18.8859 kcn / n ^ 2 = 0.00018859KCN / NLOGN = 1.13704

RMN = 2659857 RMN / N = 26.5986 RMN / N ^ 2 = 0.000265986RMN / NLOGN = 1.60139

Sort descending n = 100000 Timespared: 42120ms

KCN = 1250175006 KCN / N = 12501.8 kcn / n ^ 2 = 0.125018 kcn / nlogn = 752.68RMN = 450006 RMN / N = 4.50006 RMN / N ^ 2 = 4.50006E-005RMN / NLOGN = 0.270931

In fact, we spend so many statements to engage a "three taken" is not as high as "casually choosing one", such as replacing the following statements to the original bold statement:

SWAP (A [Left], A [Right-LEFT) Left]); RMN = 3;

Test Results:

Sort ascending n = 100000 Timespared: 90ms

KCN = 1917756 kcn / n = 19.1776 kcn / n ^ 2 = 0.000191776kcn / nlogn = 1.1546

RMN = 378810 RMN / N = 3.7881 RMN / N ^ 2 = 3.7881e-005RMN / NLOGN = 0.228066

Sort randomness n = 100000 TimeSpared: 120ms

KCN = 1979189 kcn / n = 19.7919 kcn / n ^ 2 = 0.000197919kcn / nlogn = 1.19159

RMN = 3175977 RMN / N = 31.7598 RMN / N ^ 2 = 0.000317598RMN / NLOGN = 1.91213

Sort descending n = 100000 Timespared: 110ms

KCN = 2069369 kcn / n = 20.6937 kcn / n ^ 2 = 0.000206937kcn / nlogn = 1.24588

RMN = 2574174 RMN / N = 25.7417 RMN / N ^ 2 = 0.000257417RMN / NLOGN = 1.54981

It can be seen that the efficiency of the reverse order has a qualitative leap, and the random function has to write itself, because the RAND () of the library function can only output 0x7FFF, because the RAND function uses 32bit integers, in order to prevent (most serious) It is a negative number), which can only be output. A less stringent random function is as follows, the maximum output value is the maximum positive integer of 32bit:

int RND (INT N)

{

Static _int64 x;

X = (2053 * x 13849)% 0x7fffffff;

Return (int) x% N;

}

[Note] Note that the maximum output value problem of the random function, the source self is sorted by tens of thousands of integers, and found that the performance of the fast row becomes very poor. Thanks to Bluesky2008 to give an understanding, this is the relevant post http://expert.9cbs.net/expert/topic/2237/2237721.xml?temp=.9800684.


New Post(0)