Title: IP fragment reassembly analysis and common fragment attacks V0.2
Author: yawl@nsfocus.com
Home: www.nsfocus.com
Time: V0.1: 2000/07; V0.2: 2000/09
I. Introduction
This article analyzes the IP fragment reassembly algorithm in Linux. Because IP fragments are often used in attacks such as DoS, the last part of the article describes some of these attack methods in light of the analysis. The main reference kernel version is 2.2.16; some changes in 2.4.0-test3 are also briefly introduced.
This article is based on notes from July of this year, with a lot of new content added. I had actually planned to write something about Netfilter for this month's issue, but work has been too busy, so I had to write about a topic I already know well :-).
Corrections are welcome if there are any problems.
II. Contents
1. Overview
2. Key data structures
3. Important functions
4. Changes in the 2.4 series
5. Common fragment attacks
1. Overview
In the Linux source code, almost all of the IP fragment reassembly code lives in the file /net/ipv4/ip_fragment.c. It exports one interface function, ip_defrag(), whose prototype is:
    struct sk_buff *ip_defrag(struct sk_buff *skb)
As we all know, network packets travel through the Linux network stack as sk_buff structures. ip_defrag() takes a received packet (sk_buff) and tries to reassemble it. Once the complete packet has been assembled, it returns the new sk_buff; otherwise it returns a null pointer.
It is called from other files as follows:
The main receive function of the IP layer is ip_rcv() (/net/ipv4/ip_input.c); every IP packet passes through it. If the packet is addressed to the local machine, ip_local_deliver() (also in /net/ipv4/ip_input.c) is called to handle it. Normally fragments are reassembled only at the final destination (although they may be split into even smaller fragments along the way). In ip_local_deliver() we find the following code:
    if (sysctl_ip_always_defrag == 0 &&             /* early reassembly is not enabled */
        (iph->frag_off & htons(IP_MF|IP_OFFSET))) { /* is this a fragment? */
        skb = ip_defrag(skb);                       /* conditions met: reassemble */
        if (!skb)                                   /* NULL: an error occurred, or the */
            return 0;                               /* packet is not complete yet */
        iph = skb->nh.iph;                          /* re-point at the IP header */
    }
iph->frag_off marks a packet as a fragment only when MF (More Fragments) is set or the offset is non-zero, so this test is what we would expect. But why test sysctl_ip_always_defrag == 0?
Looking at ip_rcv(), we notice that after the version, length, checksum and so on have been validated, there is the following code:
    if (sysctl_ip_always_defrag != 0 &&
        iph->frag_off & htons(IP_MF|IP_OFFSET)) {
        skb = ip_defrag(skb);
        if (!skb)
            return 0;
        iph = skb->nh.iph;
        ip_send_check(iph);
    }
That is, if sysctl_ip_always_defrag == 1, the place where ip_defrag() is called changes: every fragment that comes in is reassembled. As you can imagine, if the machine is a router, it reassembles fragments before forwarding them, which is generally unnecessary. The value can be changed dynamically with sysctl; sysctl -a shows that on a typical system it is set to 0:
    # sysctl -a
    ......
    net.ipv4.ip_always_defrag = 0
    ......
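The fragment test used at both call sites can be tried out in user space. The following is a minimal sketch, not kernel code: the names FRAG_MF, FRAG_OFFSET_MASK and is_fragment() are made up for illustration, and the constants are assumed to mirror the kernel's IP_MF (0x2000) and IP_OFFSET (0x1FFF).

```c
#include <stdint.h>
#include <arpa/inet.h>   /* htons() */

/* Hypothetical names; values assumed to match the kernel's IP_MF / IP_OFFSET. */
#define FRAG_MF          0x2000  /* "more fragments" flag */
#define FRAG_OFFSET_MASK 0x1FFF  /* 13-bit fragment offset, in 8-byte units */

/* frag_off_net is the frag_off field of the IP header, in network byte order.
 * A packet is a fragment if MF is set or the offset is non-zero. */
static int is_fragment(uint16_t frag_off_net)
{
    return (frag_off_net & htons(FRAG_MF | FRAG_OFFSET_MASK)) != 0;
}
```

Note that the DF (don't fragment) bit, 0x4000, is deliberately not part of the mask, so an unfragmented packet with DF set is not mistaken for a fragment.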
2. Key data structures (2.2 series)
Each fragment is represented by an ipfrag structure:
    /* Describe an IP fragment. */
    struct ipfrag {
        int             offset; /* offset of fragment in IP datagram */
        int             end;    /* last byte of data in datagram */
        int             len;    /* length of this fragment */
        struct sk_buff  *skb;   /* complete received fragment */
        unsigned char   *ptr;   /* pointer into real fragment data */
        struct ipfrag   *next;  /* linked list pointers */
        struct ipfrag   *prev;
    };
These fragments form a doubly linked list (in the Linux kernel, when you need a list and have no special requirements, a doubly linked list is recommended; see Documentation/CodingStyle). The list represents the queue of fragments of one IP packet that has not been reassembled yet.
The head pointer of this list is stored in the ipq structure:
    /* Describe an entry in the "incomplete datagrams" queue. */
    struct ipq {
        struct iphdr      *iph;       /* pointer to IP header */
        struct ipq        *next;      /* linked list pointers */
        struct ipfrag     *fragments; /* linked list of received fragments */
        int               len;        /* total length of original datagram */
        short             ihlen;      /* length of the IP header */
        struct timer_list timer;      /* when will this queue expire? */
        struct ipq        **pprev;
        struct device     *dev;       /* device - for ICMP replies */
    };
Note that each ipq has its own timer (the struct timer_list timer field).
The ipq structures also form linked lists: together they hold all the IP packets that the kernel has not finished reassembling. To make lookups fast, a hash table is kept:
    #define IPQ_HASHSZ 64
    struct ipq *ipq_hash[IPQ_HASHSZ];
    #define ipqhashfn(id, saddr, daddr, prot) \
        ((((id) >> 1) ^ (saddr) ^ (daddr) ^ (prot)) & (IPQ_HASHSZ - 1))
              --------
              |  1   |
              --------      ------------      -------------      -------------
 Hash table   |  2   | ---> |   ipq1   | ---> |  ipfrag1  | ---> |  ipfrag2  | ---> ......
              --------      ------------      -------------      -------------
              | ...  |            |
              --------            \/
              |  63  |      ------------      -------------      -------------
              --------      |   ipq2   | ---> |  ipfrag1  | ---> |  ipfrag2  | ---> ......
                            ------------      -------------      -------------
                                  |
                                  \/
                            ------------      -------------      -------------
                            |   ipq3   | ---> |  ipfrag1  | ---> |  ipfrag2  | ---> ......
                            ------------      -------------      -------------
                                  |
                                  \/
                               ........
Each IP packet is identified by the four-tuple (id, saddr, daddr, protocol). Fragments for which all four values match belong to the same ipq and can be assembled into one complete IP packet.
This structure changes in the 2.4 kernel, as described later.
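To make the lookup concrete, here is a small user-space sketch of the hash computation and the four-tuple comparison. The struct frag_key type and both function names are hypothetical; only the hash expression itself is taken from the ipqhashfn macro shown above.

```c
#include <stdint.h>

#define IPQ_HASHSZ 64

/* Hypothetical key type holding the four identifying values of a datagram. */
struct frag_key {
    uint16_t id;
    uint32_t saddr;
    uint32_t daddr;
    uint8_t  protocol;
};

/* Same expression as ipqhashfn(): fold the four values into a bucket index. */
static unsigned int ipq_hash_index(const struct frag_key *k)
{
    return ((unsigned int)(k->id >> 1) ^ k->saddr ^ k->daddr ^ k->protocol)
           & (IPQ_HASHSZ - 1);
}

/* Two fragments belong to the same datagram only if all four values match. */
static int same_datagram(const struct frag_key *a, const struct frag_key *b)
{
    return a->id == b->id && a->saddr == b->saddr &&
           a->daddr == b->daddr && a->protocol == b->protocol;
}
```

The hash only selects a bucket; because different datagrams can land in the same bucket, the full four-tuple comparison is still needed while walking the bucket's list.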
3. Important functions (2.2 series)
3.1 ip_defrag ()
ip_defrag() is the entry point of the whole process, so we describe it first.
(1) To keep queued fragments from exhausting memory, Linux sets a limit: when memory use exceeds the upper limit, queues (ipq) already held in memory are freed (see 3.2). The amount of memory in use is kept in the variable ip_frag_mem, which of course must be manipulated atomically (atomic_sub, atomic_add, atomic_read, etc.).
It is defined near the top of ip_fragment.c:
    atomic_t ip_frag_mem = ATOMIC_INIT(0); /* Memory used for fragments */
The check in ip_defrag() is:
    if (atomic_read(&ip_frag_mem) > sysctl_ipfrag_high_thresh)
        ip_evictor();
The details of ip_evictor() are described below.
(2) Look up the matching ipq by id, saddr, daddr and protocol. If one is found, a pointer to it is returned and its timer is reset.
    qp = ip_find(iph, skb->dst);
(3) Next comes an if/else pair, which works as follows.
If the ipq already exists, other fragments of the same packet have already arrived. Check whether this fragment is the first one (fragments may arrive out of order); if it is, save its IP header and header length in the ipq structure:
    if (offset == 0) {
        /* Fragmented frame replaced by unfragmented copy? */
        if ((flags & IP_MF) == 0)
            goto out_freequeue;
        qp->ihlen = ihl;
        memcpy(qp->iph, iph, (ihl + 8));
    }
If it does not exist, it has to be created:
    qp = ip_create(skb, iph);
    if (!qp)
        goto out_freeskb;
ip_create() allocates memory, initializes the ipq, and registers it in the hash table.
From this point on an ipq exists, whether it was already there or was just created.
(4) Check the length of the packet. If it exceeds the maximum size of an IP packet, log a warning and discard the packet. Jolt2 uses exactly this to attack Windows systems; because Linux performs this check, it is essentially immune.
(5) Compute the end value (the position where this fragment's data ends). If this is the last fragment, the final length of the whole IP packet is now known; it is recorded in the ipq to make reassembly easier.
    /* Determine the position of this fragment. */
    end = offset + ntohs(iph->tot_len) - ihl;
    /* Is this the final fragment? */
    if ((flags & IP_MF) == 0)
        qp->len = end;
(6) Next, a long stretch of code (line 481 to line 586) locates the position of this fragment within the whole packet. If fragments overlap (because of a malicious attack or some other anomaly), this code handles that as well. We return to this issue in Section 5 (Common fragment attacks).
(7) At this point we know exactly where this fragment belongs. We create a new ipfrag structure and put it at the position just found.
    tfp = ip_frag_create(offset, end, skb, ptr);
    if (!tfp)
        goto out_freeskb;
    /* Insert this fragment in the chain of fragments. */
    tfp->prev = prev;
    tfp->next = next;
    if (prev != NULL)
        prev->next = tfp;
    else
        qp->fragments = tfp;
    if (next != NULL)
        next->prev = tfp;
(8) ip_done() checks whether all fragments have arrived. If so, they are assembled into a new sk_buff (by calling ip_glue()), which is finally returned to the caller of ip_defrag().
    if (ip_done(qp)) {          /* all arrived? */
        /* Glue together the fragments. */
        skb = ip_glue(qp);
        /* Free the queue entry. */
    out_freequeue:
        ip_free(qp);            /* the old ipq is no longer needed; release it */
    out_skb:
        return skb;             /* assembled; return it */
    }
If not all fragments have arrived, NULL is returned.
This completes the whole reassembly process.
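Steps (4) to (8) can be condensed into a small user-space model. This is only a sketch under simplifying assumptions: it tracks offsets instead of packet data, it omits the overlap handling of step (6), and the types and function names (frag_node, frag_queue, frag_position, frag_insert, frag_done) are invented for illustration. The completeness test follows the idea of ip_done() as described above (walk the sorted list and look for holes) but is not the kernel's code.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_IP_LEN 65535

struct frag_node  { int offset, end; struct frag_node *prev, *next; };
struct frag_queue { struct frag_node *fragments; int len; /* 0 until last fragment seen */ };

/* Steps (4)-(5): convert the 8-byte-unit offset to bytes, reject oversize
 * packets, and compute where this fragment's data ends. ihl is in bytes. */
static int frag_position(unsigned frag_units, unsigned tot_len, unsigned ihl,
                         int *offset, int *end)
{
    int off = (int)frag_units << 3;
    if ((int)tot_len + off > MAX_IP_LEN)
        return -1;                       /* oversize, as with Jolt2 */
    *offset = off;
    *end = off + (int)tot_len - (int)ihl;
    return 0;
}

/* Steps (6)-(7): find prev/next by offset and link the new node between them. */
static int frag_insert(struct frag_queue *q, int offset, int end, int more_fragments)
{
    struct frag_node *prev = NULL, *next, *nf = malloc(sizeof(*nf));
    if (!nf)
        return -1;
    nf->offset = offset;
    nf->end = end;
    if (!more_fragments)
        q->len = end;                    /* final fragment fixes the total length */
    for (next = q->fragments; next != NULL; next = next->next) {
        if (next->offset >= offset)
            break;
        prev = next;
    }
    nf->prev = prev;
    nf->next = next;
    if (prev != NULL)
        prev->next = nf;
    else
        q->fragments = nf;
    if (next != NULL)
        next->prev = nf;
    return 0;
}

/* Step (8): complete only if the last fragment was seen and there are no holes. */
static int frag_done(const struct frag_queue *q)
{
    const struct frag_node *fp;
    int expected = 0;
    if (q->len == 0)
        return 0;
    for (fp = q->fragments; fp != NULL; fp = fp->next) {
        if (fp->offset > expected)
            return 0;                    /* fragment(s) missing */
        expected = fp->end;
    }
    return expected >= q->len;
}
```

For instance, a fragment with offset field 8190 and total length 48 is rejected by frag_position(), while two fragments covering bytes 0 to 256 and 256 to 640 can be inserted in either order, and frag_done() reports completion only after both are present.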
3.2 ip_evictor ()
ip_evictor() is called to free memory when the memory used by fragments exceeds the upper limit (sysctl_ipfrag_high_thresh). It finds ipq entries that can be freed and frees them until memory use falls to the lower limit (sysctl_ipfrag_low_thresh).
These values are defined in ip_fragment.c:
    int sysctl_ipfrag_high_thresh = 256*1024;
    int sysctl_ipfrag_low_thresh = 192*1024;
Likewise, both parameters can be viewed with sysctl -a and modified dynamically:
    # sysctl -a
    ......
    net.ipv4.ipfrag_low_thresh = 196608
    net.ipv4.ipfrag_high_thresh = 262144
    ......
In theory ip_evictor() should use an LRU algorithm and free the oldest ipq first. However, current Linux (including 2.4.0) does not implement this; it simply walks the hash table in order and frees entries. The advantage of this approach is simplicity.
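The high/low watermark behaviour can be modelled in user space as follows. This is a simplified sketch, not the kernel's ip_evictor(): the queue type, the global table, and the helper add_queue() are hypothetical, and each queue simply carries a byte count instead of real fragments.

```c
#include <stddef.h>
#include <stdlib.h>

#define IPQ_HASHSZ 64

/* Hypothetical stand-in for an ipq: just the memory it accounts for. */
struct queue { size_t mem; struct queue *next; };

static struct queue *table[IPQ_HASHSZ];
static size_t frag_mem;                  /* plays the role of ip_frag_mem */

/* Register a queue of 'mem' bytes in the given bucket. */
static int add_queue(unsigned int bucket, size_t mem)
{
    struct queue *q = malloc(sizeof(*q));
    if (!q)
        return -1;
    q->mem = mem;
    q->next = table[bucket % IPQ_HASHSZ];
    table[bucket % IPQ_HASHSZ] = q;
    frag_mem += mem;
    return 0;
}

/* Walk the hash table in bucket order (no LRU) and free queues until
 * memory use has dropped to the low threshold. */
static void evict(size_t low_thresh)
{
    for (int i = 0; i < IPQ_HASHSZ && frag_mem > low_thresh; i++) {
        while (table[i] != NULL && frag_mem > low_thresh) {
            struct queue *q = table[i];
            table[i] = q->next;
            frag_mem -= q->mem;
            free(q);
        }
    }
}
```

A caller would mirror the check at the top of ip_defrag(): call evict(192 * 1024) only when frag_mem exceeds 256 * 1024. The gap between the two thresholds keeps the evictor from running on every packet once the limit is reached.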
3.3 ip_glue ()
ip_glue() assembles an IP packet all of whose fragments have arrived. By the time it runs, all fragments are already sorted and every overlap has been resolved, so the process is very simple.
First it allocates a new sk_buff (large enough to hold the data of all fragments):
    skb = dev_alloc_skb(len);
    if (!skb)
        goto out_nomem;
After a few necessary pointer adjustments, a while loop copies the contents of the original fragments into the new sk_buff.
After one more round of pointer adjustment the process ends, and the new sk_buff is returned.
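The copy loop at the heart of this step can be sketched in user space like this. The frag_data type and glue() are hypothetical names, a plain malloc'd buffer stands in for the sk_buff, and header handling is left out entirely; the bounds check is a defensive assumption of this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fragment record: where the data goes and where it comes from. */
struct frag_data {
    int offset;                 /* byte offset within the datagram payload */
    int len;                    /* number of payload bytes in this fragment */
    const unsigned char *ptr;   /* fragment payload */
    struct frag_data *next;
};

/* Copy every fragment to its offset in one new buffer of total_len bytes.
 * Returns NULL on allocation failure or if a fragment does not fit. */
static unsigned char *glue(const struct frag_data *head, int total_len)
{
    const struct frag_data *fp;
    unsigned char *buf;

    if (total_len <= 0)
        return NULL;
    buf = malloc((size_t)total_len);
    if (!buf)
        return NULL;
    for (fp = head; fp != NULL; fp = fp->next) {
        if (fp->len <= 0 || fp->offset < 0 || fp->offset + fp->len > total_len) {
            free(buf);          /* invalid fragment list: refuse to copy */
            return NULL;
        }
        memcpy(buf + fp->offset, fp->ptr, (size_t)fp->len);
    }
    return buf;
}
```

Because overlaps were already trimmed when the fragments were queued, each memcpy writes a distinct region and the order of the copies does not matter.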
3.4 ip_expire ()
As mentioned earlier, each ipq has a timer. If reassembly has not completed after a certain period of time, the queue is freed.
The timeout value is kept in sysctl_ipfrag_time:
    int sysctl_ipfrag_time = IP_FRAG_TIME;
(/include/net/ip.h contains #define IP_FRAG_TIME (30 * HZ))
This value can also be set with sysctl.
The implementation of the timer mechanism itself is not analyzed here.
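Without going into the kernel timer mechanism, the timeout semantics alone can be illustrated with a tick counter. Everything here is an assumption made for the example: HZ is taken to be 100, the names timed_queue, queue_touch() and queue_expired() are hypothetical, and the caller is expected to poll, whereas the kernel arranges for ip_expire() to be called when the timer fires.

```c
/* Assumed tick rate for this example only. */
#define HZ 100
#define IP_FRAG_TIME (30 * HZ)   /* 30 seconds, expressed in ticks */

struct timed_queue { unsigned long expires; };

/* (Re)start the timeout, as happens each time a fragment for the queue arrives. */
static void queue_touch(struct timed_queue *q, unsigned long now)
{
    q->expires = now + IP_FRAG_TIME;
}

/* Wraparound-safe "now >= expires" test on a free-running tick counter. */
static int queue_expired(const struct timed_queue *q, unsigned long now)
{
    return (long)(now - q->expires) >= 0;
}
```

Comparing through a signed difference rather than with a plain >= keeps the test correct when the tick counter wraps around.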
4. Changes in the 2.4 series
On closer inspection, the 2.4 fragment reassembly code is basically the same as in the 2.2 series; mainly the division of work among functions has changed.
Because the information held in the old ipfrag structure can be obtained from the sk_buff, that structure was removed in 2.4, and the ipq structure was modified somewhat. The other major changes are:
1) ip_defrag() is split into two parts: ip_defrag() and ip_frag_queue().
2) ip_glue() is renamed ip_frag_reasm(); the process is basically unchanged.
3) The ipq now keeps a running total of the lengths of the fragments received so far (with overlaps already resolved). When this total reaches the full length, all fragments have arrived, so the ip_done() function is gone and there is no need to walk the list every time. This improves efficiency considerably and strengthens resistance to small-fragment attacks.
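The running-total idea in point 3) can be sketched as below. The names reasm_state, bytes_held, saw_last and reasm_add() are hypothetical and do not come from the 2.4 source; the sketch also assumes the caller has already trimmed overlaps, so that each call contributes only new bytes.

```c
/* Hypothetical per-datagram state for the running-total scheme. */
struct reasm_state {
    int len;         /* total payload length, known once the last fragment arrives */
    int bytes_held;  /* sum of the (overlap-free) fragment lengths received so far */
    int saw_last;    /* non-zero once the fragment without MF has been seen */
};

/* Account for a fragment covering [offset, end). Returns 1 when the datagram
 * is complete. Assumes overlaps were removed before the call. */
static int reasm_add(struct reasm_state *q, int offset, int end, int is_last)
{
    if (is_last) {
        q->len = end;
        q->saw_last = 1;
    }
    q->bytes_held += end - offset;
    return q->saw_last && q->bytes_held == q->len;
}
```

The completeness test becomes a constant-time comparison per fragment instead of a walk over the whole list, which is why a flood of tiny fragments costs much less than with the 2.2 approach.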
5. Common fragment attacks
IP fragments are often used for DoS attacks. Typical examples are Teardrop and Jolt2, which work by sending abnormal fragments. If the operating system kernel does not account for every abnormal case when reassembling fragments, it may be driven into an abnormal code path, resulting in denial of service (DoS).
Let us take a closer look at how Linux handles overlapping fragments.
The code is mainly in ip_defrag(). First it walks the list to find the position of this fragment, that is, to set the two pointers prev and next to the correct values. Then it handles overlap with the preceding fragment; the code is as follows:
    /* We found where to put this one.  Check for overlap with
     * preceding fragment, and, if needed, align things so that
     * any overlaps are eliminated.
     */
    if ((prev != NULL) && (offset < prev->end)) {
        i = prev->end - offset;
        offset += i;    /* ptr into datagram */
        ptr += i;       /* ptr into fragment data */
    }
Note that offset has already been multiplied by 8, so it is in bytes.
For example, suppose there are two fragments:
    offset1 = 0             end1 = 256
    -------------------------
    | frag1 (arrived first) | <--------- prev
    -------------------------
          offset2 = 64                             end2 = 640
          ------------------------------------------
          | frag2 (arrived later)                  |
          ------------------------------------------
After processing, this becomes:
    offset1 = 0             end1 = 256
    -------------------------
    | frag1 (arrived first) | <--------- prev
    -------------------------
                            offset2 = 256          end2 = 640
                            -------------------------
                            | frag2 (arrived later) |
                            -------------------------
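The arithmetic in this diagram can be checked with a few lines of user-space C. The new_frag type and trim_front() are hypothetical names; the body repeats the adjustment shown in the kernel excerpt above and additionally returns the remaining length, which the kernel excerpt does not do at this point.

```c
/* Hypothetical description of the fragment being inserted. */
struct new_frag {
    int offset;     /* start of its data within the datagram, in bytes */
    int end;        /* end of its data within the datagram, in bytes */
    int data_skip;  /* bytes to skip at the start of its own payload */
};

/* Trim the front of the new fragment so it no longer overlaps the preceding
 * one, whose data ends at prev_end. Returns the remaining length; a value
 * of zero or less means the new fragment contributes nothing new. */
static int trim_front(struct new_frag *f, int prev_end)
{
    if (f->offset < prev_end) {
        int i = prev_end - f->offset;   /* overlap is 'i' bytes */
        f->offset += i;
        f->data_skip += i;
    }
    return f->end - f->offset;
}
```

With the values from the diagram (offset2 = 64, end2 = 640, end1 = 256) the overlap is 192 bytes and 384 bytes remain. If the new fragment instead lies entirely inside the preceding one, for example offset2 = 64 and end2 = 128, the remaining length comes out as -128. A negative length of this kind is the situation that Teardrop provokes, so code that goes on to use the length must check its sign.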
The next step handles overlap with the following fragments; the code is as follows:
    /* Look for overlap with succeeding segments.
     * If we can merge fragments, do it.
     */
    for (tmp = next; tmp != NULL; tmp = tfp) {
        tfp = tmp->next;
        if (tmp->offset >= end)
            break;                  /* no overlaps at all */
        i = end - next->offset;     /* overlap is 'i' bytes */
        tmp->len -= i;              /* so reduce size of */
        tmp->offset += i;           /* next fragment */
        tmp->ptr += i;
        /* If we get a frag size of <= 0, remove it and the packet
         * that it goes with.
         */
        if (tmp->len <= 0) {
            if (tmp->prev != NULL)
                tmp->prev->next = tmp->next;
            else
                qp->fragments = tmp->next;
            if (tmp->next != NULL)
                tmp->next->prev = tmp->prev;
            /* We have killed the original next frame. */
            next = tfp;
            frag_kfree_skb(tmp->skb);
            frag_kfree_s(tmp, sizeof(struct ipfrag));
        }
    }
The if (tmp->len <= 0) test is there to deal with the Teardrop attack, as described later.
Continuing with diagrams, suppose there are these two fragments:
                    offset1 = 128                         end1 = 960
                    -------------------------------------
    next -------->  | frag1 (arrived first)             |
    (tmp)           -------------------------------------
          offset2 = 64           end2 = 320
          -------------------------
          | frag2 (arrived later) |
          -------------------------
After processing, this becomes:
                                  offset1 = 320           end1 = 960
                                  -------------------------
    next -------->                | frag1 (arrived first) |
    (tmp)                         -------------------------
          offset2 = 64           end2 = 320
          -------------------------
          | frag2 (arrived later) |
          -------------------------
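The same kind of check works for this second case. The queued_frag type and trim_next() are hypothetical; the body applies the per-fragment adjustment from the loop above to a single queued fragment and leaves the unlinking and freeing to the caller.

```c
/* Hypothetical description of a fragment already sitting in the queue. */
struct queued_frag {
    int offset;     /* start of its data within the datagram, in bytes */
    int end;        /* end of its data within the datagram, in bytes */
    int len;        /* bytes of its payload still in use */
    int data_skip;  /* bytes skipped at the start of its payload */
};

/* Trim the front of an already queued fragment that the new fragment,
 * whose data ends at new_end, overlaps. Returns the queued fragment's
 * remaining length; zero or less means it must be removed from the list. */
static int trim_next(struct queued_frag *next, int new_end)
{
    if (next->offset < new_end) {
        int i = new_end - next->offset; /* overlap is 'i' bytes */
        next->len -= i;
        next->offset += i;
        next->data_skip += i;
    }
    return next->len;
}
```

With the values from the diagram (offset1 = 128, end1 = 960, end2 = 320) the overlap is 192 bytes, so the queued fragment now starts at 320 and keeps 640 bytes. A queued fragment that the new one covers completely ends up with a length of zero or less, which is the case the if (tmp->len <= 0) branch removes.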
More complicated cases are not listed here. Let us now look at attack methods that use fragments:
(1) Teardrop (CERT CA-97.29, Bugtraq ID 124)
Many old systems have a vulnerability in fragment handling: sending abnormal fragments makes them malfunction. Teardrop is the classic attack program that exploits this vulnerability. The principle is as follows (taking Linux as an example):
Two IP fragments are sent, where the second fragment lies entirely within the range covered by the first. As shown below:
    <-------- len1 -------->
    -------------------------
    |         frag1         |
    -------------------------
    offset1              end1
          <-- len2 -->
          -------------
          |   frag2   |
          -------------
          offset2  end2
Linux (2.0 kernel) handles this as follows:
When an overlap is found (offset2 < end1), offset2 is moved back to end1 (offset2 = end1), and then the value of len2 is changed:
    len2 = end2 - offset2;
Note that len2 has now become a value less than zero; if later processing does not take care, problems will occur. Exactly what goes wrong is not covered here; after all, this is an old issue by now. Newer versions check this value and discard the fragment if it is less than zero.
(2) Jolt2 (MS00-029)
Jolt2 is a new attack program that appeared in May 2000. It crashes almost all current Windows platforms (95, 98, NT, 2000). The principle is to send a large number of identical fragments whose offset (8190 * 8 = 65520 bytes) plus total length (48 bytes) exceeds the length limit of a single IP packet (65535 bytes). As shown below:
    0                                    65535
    ------------......-------------
    |     max normal fragment     |
    ------------......-------------
                        65520           65568 (> 65535)
                        ------------------
                        | Jolt2 fragment |
                        ------------------
This attack has almost no effect on Linux, because ip_defrag() contains:
    /* Attempt to construct an oversize packet. */
    if ((ntohs(iph->tot_len) + ((int) offset)) > 65535)
        goto out_oversize;
Although the warning message printed later (at out_oversize) is rate-limited, having a message printed every 5 seconds while under attack is still annoying. You can change the interval used by net_ratelimit() or simply turn this warning off.
I do not know how these packets are handled inside Windows, but the CPU load reaches 100%. Windows 2000 SP1 is said to have fixed this problem, but I have not tried it.
(3) Bugtraq ID 376 Linux IP Fragment Overlap Vulnerability
This attack works against the 2.0.33 kernel. Strictly speaking it is not a flaw in the reassembly algorithm but a small oversight in the implementation. ip_glue() contains:
    if (len > 65535) {
        printk("Oversized IP packet from %s.\n", in_ntoa(qp->iph->saddr));
        ip_statistics.IpReasmFails++;
        ip_free(qp);
        return NULL;
    }
The problem lies in the printk: if the attacker keeps sending oversized fragments (len > 65535), the kernel keeps calling printk to report them. printk is a fairly expensive operation, so this results in a DoS.
This was changed in version 2.0.34:
    NETDEBUG(printk("Oversized IP packet from %s.\n", in_ntoa(qp->iph->saddr)));
and in /include/net/sock.h:
    #if 1
    #define NETDEBUG(x) do { } while (0)
    #else
    #define NETDEBUG(x) do { x; } while (0)
    #endif
That is, the message is printed only when debugging is enabled; normally nothing is done.
Later versions added the net_ratelimit() function, which limits kernel warnings to at most one every 5 seconds:
    if (net_ratelimit())
        printk(KERN_INFO "Oversized IP packet from %d.%d.%d.%d.\n", NIPQUAD(qp->iph->saddr));
This problem is not limited to fragment reassembly: every part of the network code that prints debug information must consider that a flood of log messages can itself cause a denial of service. The best general-purpose solution at present is the net_ratelimit() function.
(4) Bugtraq ID 543 Linux Ipchains Fragment Overlap Vulnerability
When handling fragmented packets, ipchains checks only the first fragment (offset == 0 && MF == 1), because only this fragment carries the TCP or UDP header; the subsequent fragments are not matched against the firewall rules and pass straight through. Suppose the fragment reassembly of the system behind the firewall works like this: when fragments overlap, the later fragment overwrites the earlier one. An attacker can then first craft a legitimate fragment that passes the firewall rules (for example, one addressed to a permitted port), and follow it with an overlapping fragment that changes the information in the earlier one (for example, to a port that is not permitted). The end result gets past the firewall's checks.
However, this method is only theoretical; it depends on the specific reassembly algorithm of the system behind the firewall. In any case, if that system is also Linux, the attack fails, because when Linux handles overlap it does not allow a fragment to change the content of the fragment positioned before it (see the discussion above). The method became harder to carry out from kernel version 2.2.11 on: ipchains checks the length of the fragment and, if it is too small, returns FW_BLOCK, that is, discards it.
(5) Other
Fragment attacks are not aimed only at operating systems. Many network tools, such as firewalls and intrusion detection systems (IDS), also reassemble fragments internally, and if they do it improperly they can be attacked too. For example, some versions of the well-known Check Point firewall FW-1 (fixed in the latest version) are vulnerable to a fragment DoS attack (see "Learn the Check Point FW-1 Status Table" in issue 12 of the NSFOCUS monthly magazine).
Fragments can also be used to evade IDS detection: many network intrusion detection systems inspect single IP packets and do not reassemble fragments. Even a company like ISS implemented reassembly only in its latest version, 5.0, to say nothing of Snort, whose IP reassembly plug-in often causes core dumps, so most people turn this feature off.
Reference:
1. Linux 2.0.33, 2.0.34, 2.2.16 and 2.4.0-test3 source code.
2. SecurityFocus vulnerability information.
3. Various posts on Bugtraq; there are too many to list individually.