Hash

zhaozj2021-02-08  285

Recently I am doing a project, in which a data structure --hash table (hash table), only theoretical knowledge before, now it is very uncommon, so write down and share it.

We know that the hash table is a fixed size array, each element of the array is a linked list (one-way or two-way) head pointer. If Key is the same, then together, if Key is different, it is not together. The query of the hash table is fast. Because it does not need to search from head search, it uses Key's "hash algorithm" to directly locate, find very fast, and the data structure in various databases is basically it. But the problem is that the size, hash algorithm of the hash table.

The array of hash tables is a length, and if it is too big, waste, if it is too small, reflects the efficiency. The appropriate array size is the key to the performance of the hash table. The size of the hash table is preferably a rigmer, the minimum number size is 17.

Of course, according to different amounts of data, there will be different hash tables. For applications where the amount of data is very much, the best design is the use of dynamic variable size hash table, then if you find that the hash table size is too small, for example, the element is 2 times the size of the hash table. We need to expand the hash table size, which is generally doubled. The following repository is a list of size when the hash table changes.

Static int prime_Array [] = {17, / * 0 * / 37, / * 1 * / 79, / * 2 * / 163, / * 3 * / 331, / * 4 * / 673, / * 5 * / 1361 , / * 6 * / 2729, / * 7 * / 5471, / * 8 * / 10949, / * 9 * / 21911, / * 10 * / 43853, / * 11 * / 87719, / * 12 * / 175447, / * 13 * / 350899, / * 14 * / 701819, / * 15 * / 1407303, / * 17 * / 5614657, / * 18 * / 11229331, / * 19 * / 22458671, / * 20 * / 44917381, / * 21 * / 89834777, / * 22 * ​​/ 17969557, / * 23 * / 359339171, / * 24 * 25 * / 1437356741, / * 26 * / 2147483647 / * 27 Largest Signed INT Prime * /};

#define prime_Array_size (28)

To use a hash table, you must use a hash algorithm to determine the key value, which seems to be a hard thing, below is a hash algorithm:

Typedef struct _htab {hlinks * link; / * A linked list * / int Num; / * Member number * / int size; / * Table size * /} HTAB;

Static unsigned intGetHashIndex (htab * tabptr, const char * key) {unsigned int ha = 0; while (* key) HA = (HA * 128 * Key )% tabptr-> size; return ha;

}

(Where Key is a string, HTAB is a hash table structure, tabptr-> size is the size of the hash table array)

This algorithm is implemented, it is more good, it can achieve the effect of dispersion data, if you have a better algorithm, welcome to communicate with me. (Litmouse@km169.net)

---- (All rights reserved, if you need to reprint, please indicate the source and author)

转载请注明原文地址:https://www.9cbs.com/read-1085.html

New Post(0)