Introduction to Combinatorial Algorithms (2)


Heap sort is also a selection sort. Its characteristic is that the results of the keyword comparisons made in earlier selection passes are reused in later selection passes.

The definition of a heap:

A sequence of n elements {R1, R2, ..., Rn} is called a heap if and only if it satisfies

Ri <= R2i and Ri <= R2i+1 (a small-top heap), or

Ri >= R2i and Ri >= R2i+1 (a big-top heap), for i = 1, 2, ..., ⌊n/2⌋, whenever the indices 2i and 2i+1 do not exceed n.

If this sequence is viewed as a complete binary tree, the condition says that the tree is either empty or the value of every node is no greater (respectively, no smaller) than the values of the roots of its left and right subtrees.

Thus, if the above sequence is a heap, R1 must be the minimum or the maximum value in the sequence, and the heap is called a small-top heap or a big-top heap, respectively.
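As a quick illustration of this definition, here is a minimal C++ sketch that checks whether a 1-indexed array R[1..n] satisfies the big-top-heap condition (the int keys and the example values are arbitrary, chosen only for the demonstration):

#include <iostream>
#include <vector>

// Check whether R[1..n] (1-based, R[0] unused) is a big-top heap:
// every R[i] must be >= its children R[2i] and R[2i+1].
bool isBigTopHeap(const std::vector<int>& R, int n) {
    for (int i = 1; i <= n / 2; ++i) {
        if (2 * i     <= n && R[i] < R[2 * i])     return false;
        if (2 * i + 1 <= n && R[i] < R[2 * i + 1]) return false;
    }
    return true;
}

int main() {
    std::vector<int> R = {0, 91, 85, 53, 36, 47, 30, 24, 12};   // R[0] is a placeholder
    std::cout << std::boolalpha << isBigTopHeap(R, 8) << '\n';  // prints true
    return 0;
}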

Heap sort is a sorting method that uses the heap property to sort a record sequence. The specific procedure is: first build the sequence into a "big-top heap", so that the record with the largest keyword is selected at the top of the heap; exchange it with the last record in the sequence; then "sift" the remaining records and re-adjust them into a "big-top heap"; exchange the top record with the (n-1)-th record; and repeat in this way until the sort is finished.

The so-called "sifting" refers to the following adjustment: given a complete binary tree whose left and right subtrees are both heaps, adjust the root node so that the entire binary tree becomes a heap.

The heap sort algorithm is as follows:

template <typename Elem>
void HeapSort(Elem R[], int n) {
    // Sort the record sequence R[1..n] by heap sort.
    for (int i = n / 2; i > 0; --i)
        HeapAdjust(R, i, n);        // build R[1..n] into a big-top heap
    for (int i = n; i > 1; --i) {
        swap(R[1], R[i]);           // exchange the top record with the last record
                                    // of the currently unsorted subsequence R[1..i]
        HeapAdjust(R, 1, i - 1);    // re-adjust R[1..i-1] into a big-top heap
    }
} // HeapSort

The sifting algorithm is given below. To adjust R[s..m] into a "big-top heap", the "sifting" should proceed downward along the path of the children with the larger keywords.

template <typename Elem>
void HeapAdjust(Elem R[], int s, int m) {
    // The keywords of the records in R[s..m] satisfy the heap definition
    // everywhere except possibly at R[s].key. This function adjusts the
    // keyword of R[s] so that R[s..m] becomes a big-top heap
    // (with respect to the record keywords).
    Elem rc = R[s];
    for (int j = 2 * s; j <= m; j *= 2) {            // sift down along the child with the larger key
        if (j < m && R[j].key < R[j + 1].key) ++j;   // j points to the larger child
        if (rc.key >= R[j].key) break;               // rc should be inserted at position s
        R[s] = R[j];  s = j;
    }
    R[s] = rc;   // insert
} // HeapAdjust
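Putting HeapAdjust and HeapSort together, here is a self-contained sketch that can be compiled as-is; the record type Elem holding only an int key is an assumption for the example, since the text does not define Elem:

#include <iostream>
#include <utility>

struct Elem { int key; };   // hypothetical record type; only the keyword is stored

void HeapAdjust(Elem R[], int s, int m) {
    Elem rc = R[s];
    for (int j = 2 * s; j <= m; j *= 2) {
        if (j < m && R[j].key < R[j + 1].key) ++j;
        if (rc.key >= R[j].key) break;
        R[s] = R[j];  s = j;
    }
    R[s] = rc;
}

void HeapSort(Elem R[], int n) {
    for (int i = n / 2; i > 0; --i) HeapAdjust(R, i, n);   // build the big-top heap
    for (int i = n; i > 1; --i) {
        std::swap(R[1], R[i]);                             // move the current maximum to the end
        HeapAdjust(R, 1, i - 1);
    }
}

int main() {
    Elem R[] = {{0}, {49}, {38}, {65}, {97}, {76}, {13}, {27}};   // R[0] is unused (1-based)
    HeapSort(R, 7);
    for (int i = 1; i <= 7; ++i) std::cout << R[i].key << ' ';    // 13 27 38 49 65 76 97
    std::cout << '\n';
    return 0;
}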

Time complexity analysis of heap sort:

1. For a heap of depth k, the number of keyword comparisons required by one "sift" is at most 2(k-1);

2. For n keywords, building a heap of depth h (= ⌊log2n⌋ + 1) requires at most 4n keyword comparisons;

3. The "heap" must be re-adjusted n-1 times, and the total number of keyword comparisons this requires does not exceed

2(⌊log2(n-1)⌋ + ⌊log2(n-2)⌋ + ... + ⌊log2 2⌋) < 2n⌊log2n⌋

Therefore, the time complexity of heap sort is O(nlogn).

4. Merge sort: sorting is achieved by "merging" two or more ordered subsequences, gradually increasing the length of the ordered record sequence. The basic idea of merge sort is to "merge" two or more ordered subsequences into a single ordered sequence.

In internal sorting, 2-way merge sort is usually used: two ordered subsequences that are adjacent in position are merged into one ordered subsequence. The "merge" algorithm is described as follows:

template <typename Elem>
void Merge(Elem SR[], Elem TR[], int i, int m, int n) {
    // Merge the ordered SR[i..m] and SR[m+1..n] into the ordered TR[i..n].
    int j, k;
    for (j = m + 1, k = i; i <= m && j <= n; ++k) {
        // copy the record with the smaller keyword from SR into TR
        if (SR[i].key <= SR[j].key) TR[k] = SR[i++];
        else TR[k] = SR[j++];
    }
    while (i <= m) TR[k++] = SR[i++];   // copy the remaining SR[i..m] into TR
    while (j <= n) TR[k++] = SR[j++];   // copy the remaining SR[j..n] into TR
} // Merge

The merge sort algorithm can take two forms, recursive and non-recursive, which derive from two different programming approaches. Here only the recursive form of the algorithm is discussed.

The analysis proceeds as follows:

If the records in the two halves of the sequence R[s..t], namely R[s..(s+t)/2] and R[(s+t)/2 + 1..t], are each ordered by keyword, they can easily be merged into one ordered record sequence by the Merge algorithm above; therefore, each of the two halves should itself first be sorted by 2-way merge sort.

template <typename Elem>
void MSort(Elem SR[], Elem TR1[], int s, int t) {
    // Sort SR[s..t] into TR1[s..t] by 2-way merge sort.
    if (s == t) TR1[s] = SR[s];
    else {
        Elem TR2[MAXSIZE + 1];        // auxiliary array; MAXSIZE is an assumed capacity bound
        int m = (s + t) / 2;          // divide SR[s..t] into SR[s..m] and SR[m+1..t]
        MSort(SR, TR2, s, m);         // recursively merge-sort SR[s..m] into the ordered TR2[s..m]
        MSort(SR, TR2, m + 1, t);     // recursively merge-sort SR[m+1..t] into the ordered TR2[m+1..t]
        Merge(TR2, TR1, s, m, t);     // merge TR2[s..m] and TR2[m+1..t] into TR1[s..t]
    }
} // MSort

template <typename Elem>
void MergeSort(Elem R[], int n) {
    // Sort the record sequence R[1..n] by 2-way merge sort.
    MSort(R, R, 1, n);
} // MergeSort
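For a compilable illustration of the same idea on plain int keys (a simplified sketch rather than the record-based version above), a 2-way merge sort can be written as follows:

#include <iostream>
#include <vector>

// Merge the ordered halves a[lo..mid] and a[mid+1..hi] using a temporary buffer.
void merge(std::vector<int>& a, int lo, int mid, int hi) {
    std::vector<int> tmp;
    tmp.reserve(hi - lo + 1);
    int i = lo, j = mid + 1;
    while (i <= mid && j <= hi) tmp.push_back(a[i] <= a[j] ? a[i++] : a[j++]);
    while (i <= mid) tmp.push_back(a[i++]);   // copy the remaining left half
    while (j <= hi)  tmp.push_back(a[j++]);   // copy the remaining right half
    for (int k = lo; k <= hi; ++k) a[k] = tmp[k - lo];
}

// Recursive 2-way merge sort of a[lo..hi].
void mergeSort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    int mid = (lo + hi) / 2;
    mergeSort(a, lo, mid);
    mergeSort(a, mid + 1, hi);
    merge(a, lo, mid, hi);
}

int main() {
    std::vector<int> a = {49, 38, 65, 97, 76, 13, 27};
    mergeSort(a, 0, (int)a.size() - 1);
    for (int x : a) std::cout << x << ' ';    // 13 27 38 49 65 76 97
    std::cout << '\n';
    return 0;
}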

It is easy to see that merge-sorting n records takes O(nlogn) time: each merge pass takes O(n) time, and ⌈log2n⌉ passes are required.

5. Radix sort: a "single-keyword" sorting problem is solved with a "multi-keyword sorting" algorithm.

[1] Multi-keyword sort

Suppose there is a sequence of n records

{R1, R2, ..., RN}

in which each record Ri contains d keywords (Ki0, Ki1, ..., Kid-1). The record sequence is said to be ordered with respect to the keywords (K0, K1, ..., Kd-1) if, for any two records Ri and Rj (1 ≤ i < j ≤ n) in the sequence, the following lexicographic relation holds:

(Ki0, Ki1, ..., Kid-1) < (Kj0, Kj1, ..., Kjd-1)

where K0 is called the "most significant" keyword and Kd-1 the "least significant" keyword.
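As a small illustration of this lexicographic ordering, two records with d = 3 keywords could be compared as follows; the keyword meanings in the example (year, class, serial number) are only hypothetical:

#include <algorithm>
#include <array>
#include <iostream>

// A record's keyword tuple with d = 3 keywords, e.g. (year, class, serial number).
using Keys = std::array<int, 3>;

// True if a precedes b in lexicographic (most-significant-keyword-first) order.
bool lexLess(const Keys& a, const Keys& b) {
    return std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end());
}

int main() {
    Keys a = {2, 3, 15};   // year 2, class 3, number 15
    Keys b = {2, 4, 1};    // year 2, class 4, number 1
    std::cout << std::boolalpha << lexLess(a, b) << '\n';   // true: same year, smaller class
    return 0;
}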

There are usually two ways to carry out a multi-keyword sort:

Most-significant-digit-first (MSD) method: first sort on K0, splitting the record sequence into several subsequences according to the different values of K0; then sort each subsequence on K1, and so on, until finally each subsequence is sorted on the least significant keyword.

Least-significant-digit-first (LSD) method: first sort on Kd-1, then on Kd-2, and so on, until the sequence is finally sorted on the most significant keyword K0. With this method it is not necessary, during the sorting process, to split the record sequence into subsequences according to the values of the "previous" keywords.

For example, student records may contain three keywords: the year, the class number, and the serial number within the class, with the year as the most significant keyword. Sorting by the LSD method then proceeds from the serial number, then the class number, and finally the year.

[2] Chained radix sort

In a multi-keyword record sequence, if the range of values of each keyword is the same, sorting by the LSD method can use the "allocation-collection" approach; its advantage is that no comparisons between keywords are needed.

A single keyword of numeric or character type can be viewed as a multi-keyword composed of several digits or several characters. In that case the "allocation-collection" method can be used for sorting; this is called radix sort.

For example, for the following group of keywords:

{209, 386, 768, 185, 247, 606, 230, 834, 539}

first "allocate" them into 10 groups according to the value 0, 1, ..., 9 of their "units digit", then "collect" the groups together in order from 0 to 9; next "allocate" them into 10 groups according to the value of their "tens digit" and "collect" them again from 0 to 9; finally repeat the same operation according to their "hundreds digit", which yields the ordered sequence of this group of keywords.

When radix sort is implemented on a computer, a linked list should be used as the storage structure in order to reduce the auxiliary space needed; this is called chained radix sort. The specific procedure is:

1. Link the records to be sorted with pointers to form a linked list;

2. In "allocation", distribute the records into different "chained queues" according to the value of the current "keyword", so that all records in one queue share the same keyword value;

3. In "collection", relink the queues head to tail in increasing order of the current keyword value;

4. Repeat steps 2) and 3) once for each keyword (digit).

For example:

P → 369 → 367 → 167 → 239 → 237 → 138 → 230 → 139

First allocation (by units digit)

f[0] → 230 ← r[0]

f[7] → 367 → 167 → 237 ← r[7]

f[8] → 138 ← r[8]

f[9] → 369 → 239 → 139 ← r[9]

First collection

P → 230 → 367 → 167 → 237 → 138 → 369 → 239 → 139

Second allocation (by tens digit)

f[3] → 230 → 237 → 138 → 239 → 139 ← r[3]

f[6] → 367 → 167 → 369 ← r[6]

Second collection

P → 230 → 237 → 138 → 239 → 139 → 367 → 167 → 369

Third allocation (by hundreds digit)

f[1] → 138 → 139 → 167 ← r[1]

f[2] → 230 → 237 → 239 ← r[2]

f[3] → 367 → 369 ← r[3]

After the third collection, the ordered sequence of records is obtained:

P → 138 → 139 → 167 → 230 → 237 → 239 → 367 → 369

Two points deserve attention in chained radix sort:

1. The actual operations of "allocation" and "collection" only modify pointers in the linked list and the head and tail pointers of the queues;

2. If searching afterwards requires a sequentially stored ordered table, an additional pass is still needed to rearrange the linked list into such a table.

The time complexity of radix sort is O(d(n + rd)).

Here one "allocation" pass takes O(n), one "collection" pass takes O(rd), where rd is the radix (the number of possible values of a keyword), and d is the number of "allocation-collection" passes.
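To make the "allocation-collection" idea concrete, here is a minimal LSD radix sort sketch for non-negative integers; it uses vectors as the queues instead of the linked lists described above, so it illustrates the method rather than the exact chained storage structure:

#include <iostream>
#include <vector>

// LSD radix sort of non-negative integers with `digits` decimal digits.
void radixSort(std::vector<int>& a, int digits) {
    int divisor = 1;
    for (int d = 0; d < digits; ++d, divisor *= 10) {
        std::vector<std::vector<int>> bucket(10);           // "allocation": 10 queues, one per digit value
        for (int x : a) bucket[(x / divisor) % 10].push_back(x);
        a.clear();
        for (const auto& q : bucket)                        // "collection": concatenate queues 0..9 in order
            a.insert(a.end(), q.begin(), q.end());
    }
}

int main() {
    std::vector<int> a = {209, 386, 768, 185, 247, 606, 230, 834, 539};
    radixSort(a, 3);                                        // 3 passes: units, tens, hundreds
    for (int x : a) std::cout << x << ' ';                  // 185 209 230 247 386 539 606 768 834
    std::cout << '\n';
    return 0;
}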

Below we compare the various internal sorting methods mentioned above. First, in terms of time performance:

1. By average time performance, the sorting methods fall into three classes:

Methods whose time complexity is O(nlogn): quick sort, heap sort, and merge sort, among which quick sort is the best;

Methods whose time complexity is O(n2): direct insertion sort, bubble sort, and simple selection sort, among which direct insertion is the best, especially for record sequences that are already nearly ordered by keyword;

The only sorting method whose time complexity is O(n) is radix sort.

2. When the record sequence to be sorted is already ordered by keyword, direct insertion sort and bubble sort reach a time complexity of O(n); for quick sort, however, this is the worst case, in which its time performance degrades to O(n2), so it should be avoided as much as possible.

3. The time performance of simple selection sort, heap sort, and merge sort does not change with the distribution of keywords in the record sequence.

Second, in terms of space performance:

This refers to the size of the auxiliary space required during the sorting process.

1. All the simple sorting methods (direct insertion, bubble, and simple selection) and heap sort have a space complexity of O(1);

2. Quick sort needs O(logn) auxiliary space for its recursion stack;

3. Merge sort requires the most auxiliary space; its space complexity is O(n);

4. Chained radix sort needs the attached head and tail pointers of the queues; its space complexity is O(rd).

Finally, in terms of the stability of the sorting methods:

A stable sorting method is one in which the relative position of any two records with equal keywords does not change between before and after sorting. When a multi-keyword sequence is sorted by the LSD method, a stable sorting method must be employed for each keyword. To show that a sorting method is unstable, a single counterexample suffices. It should be pointed out that quick sort and heap sort are unstable sorting methods.

Let us now discuss the "lower bound on the time complexity of sorting methods".

Among the various sorting methods discussed here, all except radix sort are based on "comparison of keywords". It can be proven that the best possible time complexity of such sorting methods is O(nlogn). (Radix sort is not based on "keyword comparison", so it is not subject to this bound.) A decision tree can be used to describe any sorting method based on "keyword comparison".

For example, in the decision tree for sorting three keywords K1, K2, K3 (figure not reproduced here), each internal node compares two of the keywords and each leaf corresponds to one possible ordering of the three keywords.

The decision tree that describes a sorting process has two characteristics:

1. Every "comparison" on the tree is necessary;

2. The leaf nodes of the tree cover all possible outcomes.

From the "depth of the decision tree" it can be seen that, for the tree described above, "at most three comparisons" are needed to complete the sort of three keywords. Conversely, the decision tree also shows that, considering the worst case, "at least three comparisons" are needed to sort three keywords. The depth of the decision tree for three keywords is unique: whatever sorting method is used, the depth of its decision tree is 3. When the number of keywords exceeds 3, different sorting methods give decision trees of different depths. For example, when sorting four keywords, the depth of the decision tree of direct insertion is 6, while the depth of the decision tree of binary insertion is 5.

It can be proven that sorting 4 keywords requires at least 5 comparisons, because the result of sorting 4 keywords has 4! = 24 possibilities; that is, the decision tree of the sort must have at least 24 leaf nodes, and the minimum depth of such a tree is 6.

In general, sorting n keywords has n! possible results. Since a binary tree with n! leaf nodes has depth no less than ⌈log2(n!)⌉ + 1, the number of comparisons needed to sort n keywords is at least ⌈log2(n!)⌉. By Stirling's approximation, ⌈log2(n!)⌉ ≈ n·log2n, so the best possible time complexity of any sorting method based on "keyword comparison" is O(nlogn).
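The bound can be spelled out briefly (a standard derivation, written here in LaTeX for clarity):

\[
\log_2(n!) \;=\; \sum_{k=1}^{n}\log_2 k \;\ge\; \sum_{k=\lceil n/2\rceil}^{n}\log_2 k \;\ge\; \frac{n}{2}\log_2\frac{n}{2},
\qquad
\log_2(n!) \;\le\; n\log_2 n,
\]

so \(\lceil\log_2(n!)\rceil = \Theta(n\log n)\), and any sort based on keyword comparison needs at least on the order of \(n\log n\) comparisons in the worst case.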

Finally, let us briefly discuss external sorting.

External sorting:

The common forms of external sorting are disk sorting and tape sorting. This is because external sorting depends not only on the sorting algorithm but also on the characteristics of the external storage device. External storage devices can be broadly divided into two classes: sequential access (such as tape) and direct access (such as disk). The time of tape sorting depends mainly on tape reading and writing; only disk sorting is considered here, since tape storage has essentially been phased out. It should be noted that the working process of external sorting is the same for both kinds of devices, and the basic process consists of two relatively independent steps:

1. According to the size of available memory, use an internal sorting method to build a number of ordered subsequences of the records, usually called initial "merge segments" (runs);

2. Through "merging", gradually increase the length of the ordered record subsequences until the entire record sequence is ordered by keyword.

For example, suppose there is a disk file containing 10,000 records, and the computer at hand can internally sort only 1,000 records at a time. Then the internal sorting method is first used to obtain 10 initial merge segments, after which they are merged pass by pass.

Assume that 2  is returned and (ie two or two returns), then

The first trip is obtained from 10 signs of consolidation.

The second trip is obtained from three somedum segments;

The third quarter is obtained from three signs and segments;

Finally, it is a sequence of ordered sequences throughout the record.

Let us now analyze the number of disk accesses (reads/writes of external storage) in the above external sort. Suppose a "data block" holds 200 records, i.e. each disk access reads or writes 200 records. Then one complete pass over the 10,000 records requires 100 disk accesses (50 reads and 50 writes).

Thus, in the above example:

1) building the 10 initial merge segments requires 100 disk accesses;

2) each merge pass requires another 100 disk accesses;

3) in total, 100 + 4 × 100 = 500 disk accesses are needed.

The total time of external sorting also includes the time needed for internal sorting and for the merging itself. Clearly, apart from the internal-sorting factor, the external sorting time depends mainly on the number of merge passes required.

For example, if 5-way merging is used in the example above, only 2 passes are needed, and the total number of disk accesses drops to 100 + 2 × 100 = 300.

In general, suppose the record sequence yields m initial merge segments and the external sort uses k-way merging; then the number of merge passes is ⌈logk m⌉. Clearly, increasing k reduces the number of passes, so multi-way merging is usually used for external sorting. The value of k can in principle be chosen freely, but various factors must be weighed in choosing it.
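The pass and access counts above can be checked with a few lines of C++; the block size of 200 records and the run counts are simply the assumptions used in the example:

#include <cmath>
#include <iostream>

// Total disk accesses for an external merge sort:
// one full pass to build the initial runs, plus ceil(log_k(m)) merge passes,
// where each full pass over the file costs `accessesPerPass` disk accesses.
int totalAccesses(int m, int k, int accessesPerPass) {
    int passes = (int)std::ceil(std::log((double)m) / std::log((double)k));
    return accessesPerPass * (1 + passes);
}

int main() {
    // 10,000 records, 200 records per block -> 50 reads + 50 writes = 100 accesses per pass.
    std::cout << totalAccesses(10, 2, 100) << '\n';   // 2-way merge of 10 runs: 500
    std::cout << totalAccesses(10, 5, 100) << '\n';   // 5-way merge of 10 runs: 300
    return 0;
}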

The discussion above assumed that all the work is done on a single processor. Below we briefly mention parallel sorting:

Parallel sorting refers to sorting with several processors working in parallel; its main purpose is to increase speed. Although parallel sorting algorithms have many similarities with serial sorting algorithms on a single processor, they cannot be regarded as mere generalizations or extensions of serial algorithms. One of their most prominent features is that they are closely tied to the architecture of the parallel computer: different architectures lead to different speedups and to parallel sorting algorithms of different design styles.

Graph and network optimization algorithms:

The richest part of combinatorial algorithms is graph and network optimization algorithms. Computational problems on graphs include searching, path problems, connectivity problems, planarity testing, coloring problems, and network optimization. Famous graph algorithms include: Kruskal's algorithm for the minimum spanning tree, Dijkstra's algorithm and Floyd's algorithm for shortest paths, the Hungarian algorithm for maximum matching in bipartite graphs (the assignment problem), Edmonds' "blossom" algorithm for maximum matching in general graphs, and algorithms for maximum flow and minimum cut in networks, among others.

Kruskal's algorithm for the minimum spanning tree

Since this is familiar to everyone, only a brief description is given. To make the sum of the edge weights on the spanning tree as small as possible, the weight of each chosen edge should obviously be as small as possible. Kruskal's algorithm works as follows: first construct a subgraph SG containing all n vertices and no edges; then, starting from the edge with the smallest weight, add an edge to SG whenever doing so does not create a cycle in SG; repeat until n-1 edges have been added. The algorithm is as follows:

Construct the unconnected graph ST = (V, { });
k = i = 0;
while (k < n - 1) {
    ++i;
    select the edge (u, v) with the i-th smallest weight from the edge set E;
    if adding (u, v) to ST does not create a cycle in ST {
        add (u, v) to ST and output the edge (u, v);
        ++k;
    }
}
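A compilable C++ sketch of the same greedy procedure is given below; it uses a union-find (disjoint-set) structure to test whether an edge would create a cycle, which is an implementation choice rather than something specified in the pseudocode above:

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

struct Edge { int u, v, w; };

// Union-find (disjoint set) used to detect whether an edge would close a cycle.
struct DSU {
    std::vector<int> parent;
    explicit DSU(int n) : parent(n) { std::iota(parent.begin(), parent.end(), 0); }
    int find(int x) { return parent[x] == x ? x : parent[x] = find(parent[x]); }
    bool unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;      // already connected: adding the edge would create a cycle
        parent[a] = b;
        return true;
    }
};

// Kruskal: examine edges in increasing weight order, keep each edge that joins two components.
int kruskal(int n, std::vector<Edge> edges) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.w < b.w; });
    DSU dsu(n);
    int total = 0, taken = 0;
    for (const Edge& e : edges) {
        if (taken == n - 1) break;
        if (dsu.unite(e.u, e.v)) {
            std::cout << "edge (" << e.u << ", " << e.v << ") weight " << e.w << '\n';
            total += e.w;
            ++taken;
        }
    }
    return total;
}

int main() {
    // A small example graph with 4 vertices (0..3).
    std::vector<Edge> edges = {{0, 1, 1}, {1, 2, 2}, {0, 2, 3}, {2, 3, 4}, {1, 3, 5}};
    std::cout << "total weight: " << kruskal(4, edges) << '\n';   // 1 + 2 + 4 = 7
    return 0;
}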

