Title: QQWRY.DAT file structure analysis
Author: lionall@nkbbs.org
Date: 2004/12/18
CONTENT:
I recently wrote a program, part of which does IP address lookups, and I spent a long time looking for an IP database. The results were not ideal. The only complete database I could find was the plain-text file exported from qqwry.dat, which is relatively large, so I decided it would be better to use qqwry.dat directly; the file is also available wherever there is a QQ client that can display IP locations. The following content draws on two articles, and I thank their authors. One of them is by a person who claims to have designed this format. However, neither article is specific enough: the former was written by the format's author, so the specific details are not spelled out; the latter was not a dedicated study, so some points are not covered thoroughly. Building on these two articles, I describe the file format of qqwry.dat here. If anything is incorrect, please point it out.
Overall, QQWRY.DAT has 3 parts:
[File Header] [End IP Region 1 Region 2][M] [Start IP End IP Offset][N]
Here [M] and [N] denote the number of records of each kind. Let me explain what each part means, using the following record from QQWRY.DAT as an example:
202.113.16.0 - 202.113.16.255 Nankai University Network Center
Here 202.113.16.0 is the [Start IP] and 202.113.16.255 is the [End IP].
[Start IP] - [End IP] forms an IP segment, and all IP segments are sorted in ascending order.
"Nankai University" is [Region 1] and "Network Center" is [Region 2].
Generally, [Region 1] is the broader area, such as the United States; [Region 2] is more specific, such as New York.
The [File Header] and each [Start IP End IP Offset] have a fixed size, but the size of each [End IP Region 1 Region 2] varies. All offsets are absolute offsets, measured from the start of the file.
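Because the IP segments are sorted in ascending order, a lookup can use binary search rather than a linear scan. The following is a minimal sketch of that idea, not code from the original articles: the `ip_segment` type and the `find_segment` function are hypothetical names, and the sketch assumes the segments have already been loaded into an in-memory array and do not overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of one IP segment (not the on-disk layout). */
typedef struct {
    uint32_t start_ip;
    uint32_t end_ip;
} ip_segment;

/* Return the index of the segment containing ip, or -1 if none does.
   Assumes segs[] is sorted by start_ip and segments do not overlap. */
long find_segment(const ip_segment *segs, size_t count, uint32_t ip)
{
    size_t lo = 0, hi = count;              /* search window is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (segs[mid].start_ip <= ip)
            lo = mid + 1;                   /* answer is mid or later */
        else
            hi = mid;                       /* answer is before mid */
    }

    /* lo is now the number of segments whose start_ip <= ip. */
    if (lo == 0)
        return -1;
    return ip <= segs[lo - 1].end_ip ? (long)(lo - 1) : -1;
}
```

The search finds the last segment whose start is not greater than the address, then checks the end of that segment, so addresses that fall in a gap between segments are reported as not found.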
A detailed analysis of each part follows:
1. [File Header]
The file header is 8 bytes long, as follows:
#include <stdint.h>
typedef struct
{
    // Points to [Start IP End IP Offset][0]
    uint32_t first_start_ip_offset;   // 4 bytes on disk
    // Points to [Start IP End IP Offset][N - 1]
    uint32_t last_start_ip_offset;    // 4 bytes on disk
} Header;
So the two pointers in the [File Header] can be used to traverse all [Start IP End IP Offset] records.
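As a rough illustration of that traversal, the sketch below derives the number of index records and the file offset of record i from the 8 header bytes. The function names are hypothetical, and the sketch assumes the two header fields are stored little-endian, like the IP values described in this article, and that each index record is 7 bytes.

```c
#include <stdint.h>

#define INDEX_ENTRY_SIZE 7u   /* size of one [Start IP End IP Offset] record */

/* Read a 4-byte little-endian unsigned integer from p. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Number of index records implied by the header.
   Returns 0 if the two offsets are inconsistent. */
uint32_t index_entry_count(const unsigned char header[8])
{
    uint32_t first = read_le32(header);       /* offset of record [0]     */
    uint32_t last  = read_le32(header + 4);   /* offset of record [N - 1] */

    if (last < first || (last - first) % INDEX_ENTRY_SIZE != 0)
        return 0;
    return (last - first) / INDEX_ENTRY_SIZE + 1;
}

/* Absolute file offset of index record i (caller ensures i < count). */
uint32_t index_entry_offset(const unsigned char header[8], uint32_t i)
{
    return read_le32(header) + i * INDEX_ENTRY_SIZE;
}
```

Since the second pointer refers to the last record itself rather than to the byte after it, the count is the offset difference divided by 7, plus one.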
2. [Start IP End IP Offset]
Each [Start IP End IP Offset] record is 7 bytes long:
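typedef struct
{
    // The actual [Start IP]: 4 bytes (uint32_t is from <stdint.h>)
    uint32_t start_ip;
    // Points to the corresponding [End IP Region 1 Region 2]
    unsigned char end_ip_offset[3];
    // On-disk layout only: sizeof(start_ip) may be 8 in memory due to padding
} start_ip;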
For example, if the recorded [Start IP] is 202.113.16.0, then
start_ip.start_ip = 0xca711000
Note that the data is stored in the file in reversed (little-endian) byte order, i.e. as the bytes 00 10 71 ca.
The end_ip_offset field can then be used to locate the corresponding [End IP Region 1 Region 2].
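To make the byte order concrete, here is a small sketch that decodes one 7-byte index record from raw bytes. The `index_entry` type and both function names are hypothetical, and the sketch assumes the 3-byte offset is little-endian like the IP value.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical decoded form of one [Start IP End IP Offset] record. */
typedef struct {
    uint32_t start_ip;        /* numeric value, e.g. 0xCA711000 */
    uint32_t end_ip_offset;   /* absolute file offset, from 3 bytes */
} index_entry;

/* Decode the 7 raw bytes of one index record. */
index_entry decode_index_entry(const unsigned char raw[7])
{
    index_entry e;

    /* 4-byte start IP, least significant byte first */
    e.start_ip = (uint32_t)raw[0] | ((uint32_t)raw[1] << 8) |
                 ((uint32_t)raw[2] << 16) | ((uint32_t)raw[3] << 24);
    /* 3-byte offset, assumed to use the same byte order */
    e.end_ip_offset = (uint32_t)raw[4] | ((uint32_t)raw[5] << 8) |
                      ((uint32_t)raw[6] << 16);
    return e;
}

/* Format ip as dotted decimal; out must hold at least 16 bytes. */
void format_ip(uint32_t ip, char out[16])
{
    snprintf(out, 16, "%u.%u.%u.%u",
             (unsigned)(ip >> 24), (unsigned)((ip >> 16) & 0xFF),
             (unsigned)((ip >> 8) & 0xFF), (unsigned)(ip & 0xFF));
}
```

Decoding byte by byte avoids relying on the host's endianness or on struct padding, which matters here because the record is 7 bytes on disk while a matching C struct may occupy 8 bytes in memory.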
3. [End IP Region 1 Region 2]
This part is relatively complex and its length varies: a region string that has already appeared earlier in the file is not stored again; only a pointer to it is stored, which provides a modest amount of data compression. [End IP] is stored in the same way as start_ip.start_ip.
The regions are stored in one of the following 4 forms:
1) [Region 1] 0x00 [Region 2] 0x00
Here [Region 1] and [Region 2] are both strings, for example "US" 0x00 "New York" 0x00.
2) 0x01 [Region Offset]
Indicates that [Region 1 Region 2] can be found at the 3-byte [Region Offset]. Note that the [Region 1 Region 2] found there is not necessarily in form 1); it may be in form 3) or 4), but never in form 2). That is, form 2) does not recurse; at least, I traversed all 15,000 records and never encountered such a case.
3) 0x02 [Region Offset] [Region 2] 0x00
Indicates that [Region 1] can be found at the 3-byte [Region Offset], and that the [Region 1] found there satisfies [Region 1][0] != 0x01 && [Region 1][0] != 0x02. In other words, there is no further redirection and form 3) does not recurse, so the data found must be [Region 1] 0x00.
4) 0x02 [Region Offset] 0x02 [Region Offset]
Indicates that both [Region 1] and [Region 2] have appeared earlier, and each of the two offsets leads to a string of the form [Region] 0x00. The data found will not be in form 2) or 3), nor in form 4) itself.
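The four forms above can be sketched as a single decoding routine. This is an illustrative sketch under stated assumptions rather than a reference implementation: the `regions` type and the function names are hypothetical, the whole file is assumed to be loaded into `buf`, `pos` is assumed to point just past the 4-byte [End IP], and the 3-byte offsets are assumed to be little-endian. The sketch handles only the four forms described here and rejects a form 2) that redirects to another form 2).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type: pointers into the file buffer. */
typedef struct {
    const char *region1;
    const char *region2;
} regions;

/* Read a 3-byte little-endian offset. */
static uint32_t read_le24(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
}

/* Return the NUL-terminated string at pos, or NULL if it is out of bounds. */
static const char *str_at(const unsigned char *buf, size_t len, size_t pos)
{
    if (pos >= len || memchr(buf + pos, 0, len - pos) == NULL)
        return NULL;
    return (const char *)(buf + pos);
}

/* Decode [Region 1] and [Region 2] starting at pos.
   Returns 0 on success, -1 on malformed or out-of-range data. */
int decode_regions(const unsigned char *buf, size_t len, size_t pos,
                   regions *out)
{
    if (pos < len && buf[pos] == 0x01) {          /* form 2: redirect */
        if (pos + 4 > len)
            return -1;
        pos = read_le24(buf + pos + 1);
        if (pos < len && buf[pos] == 0x01)        /* form 2 must not recurse */
            return -1;
    }
    if (pos >= len)
        return -1;

    if (buf[pos] == 0x02) {                       /* form 3 or form 4 */
        if (pos + 4 >= len)
            return -1;
        out->region1 = str_at(buf, len, read_le24(buf + pos + 1));
        pos += 4;
        if (buf[pos] == 0x02) {                   /* form 4: second pointer */
            if (pos + 4 > len)
                return -1;
            out->region2 = str_at(buf, len, read_le24(buf + pos + 1));
        } else {                                  /* form 3: inline string */
            out->region2 = str_at(buf, len, pos);
        }
    } else {                                      /* form 1: two inline strings */
        out->region1 = str_at(buf, len, pos);
        if (out->region1 == NULL)
            return -1;
        pos += strlen(out->region1) + 1;
        out->region2 = str_at(buf, len, pos);
    }
    return (out->region1 != NULL && out->region2 != NULL) ? 0 : -1;
}
```

The returned pointers refer to the raw bytes in the buffer, so any character-set conversion of the region strings is left to the caller.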
Several variants of the QQ IP data circulate on the Internet, such as the Phoenix version, the Innovative version, and the Cangzhou version. The analysis above is based on the Pure Network IP data of September 5, 2004.
Bibliography:
http://blog.9cbs.net/taft/ "QQWRY format"
http://blog.9cbs.net/cnss/ "About QQWRY Format"