80x86 Protected Mode Tutorial Series (2): Segmentation Management Mechanism

zhaozj, 2021-02-08

2. Segmentation management mechanism

This article describes how segments are defined in protected mode and how a two-dimensional virtual address, consisting of a segment selector and an offset within the segment, is converted to a one-dimensional linear address.

<1> Segment definition and conversion of virtual addresses to linear addresses

The segment is the basis of the mechanism that converts virtual addresses to linear addresses. In protected mode, each segment is defined by three parameters: the segment base address (Base Address), the segment limit (Limit), and the segment attributes (Attributes).

The segment base address specifies where the segment starts in the linear address space. In 80386 protected mode, the segment base address is 32 bits long. Because the base address is as long as a linear address, a segment can start at any byte of the 32-bit linear address space, unlike real mode, where a segment must start on a 16-byte boundary.

The segment limit specifies the size of the segment. In 80386 protected mode, the limit is a 20-bit value that can be expressed either in bytes or in units of 4K bytes. One bit of the segment attributes selects the unit; it is called the granularity bit and is denoted G. G = 0 means the limit is in bytes, so the 20-bit limit covers 1 byte to 1M bytes in increments of 1 byte; G = 1 means the limit is in units of 4K bytes, so the 20-bit limit covers 4K bytes to 4G bytes in increments of 4K bytes. When the limit is in units of 4K bytes, the actual segment limit LIMIT (in bytes) is calculated from the 20-bit segment limit Limit by the following formula:

LIMIT = Limit * 4K + 0FFFH = (Limit SHL 12) + 0FFFH

Therefore, when the granularity bit is 1, the segment limit is effectively extended to 32 bits. It follows that in 80386 protected mode a segment can be much longer than 64K bytes.
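
The scaling rule can be tried out with a few lines of Python. This is only an illustrative sketch: the function name effective_limit and its parameter names are made up for this example and do not come from any assembler or library.

```python
def effective_limit(limit20: int, g: int) -> int:
    """Return the segment limit in bytes for a 20-bit limit field and a G bit."""
    if not 0 <= limit20 <= 0xFFFFF:
        raise ValueError("the limit field must fit in 20 bits")
    if g == 0:
        return limit20                    # limit is already counted in bytes
    return (limit20 << 12) + 0xFFF        # 4K units: Limit SHL 12, plus 0FFFH


print(f"{effective_limit(0x5678, 0):08X}H")   # byte granularity
print(f"{effective_limit(0x5678, 1):08X}H")   # 4K granularity
print(f"{effective_limit(0xFFFFF, 1):08X}H")  # largest possible segment
```

With G = 1 and a limit field of 0FFFFFH, the result is 0FFFFFFFFH, which is the 4G-byte maximum mentioned above.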

The base address and the limit define the range of linear addresses that the segment maps. The base address Base is the linear address corresponding to offset 0 within the segment, and the virtual address at offset X within the segment corresponds to the linear address Base + X. Virtual addresses at offsets 0 through Limit correspond to linear addresses Base through Base + Limit.

The figure below shows how a segment is positioned from the virtual address space into the linear address space. In the figure, BaseA and so on denote segment base addresses, and LimitA and so on denote segment limits. In addition, segment C follows directly after segment A, that is, BaseC = BaseA + LimitA.

For example, if the base address of segment A is 00012345H, its limit is 5678H, and the limit is in bytes (G = 0), then segment A corresponds to the region 00012345H-000179BDH in the linear address space. If the limit is in units of 4K bytes (G = 1), then segment A corresponds to the region 00012345H-0568B344H (= 00012345H + 5678000H + 0FFFH) in the linear address space.
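
The two regions in this example can be reproduced with a short Python sketch. The helper names linear_address and linear_range are hypothetical, and the limit passed in is assumed to be already expressed in bytes.

```python
def linear_address(base: int, offset: int) -> int:
    """Map an offset within a segment to a 32-bit linear address (Base + X)."""
    return (base + offset) & 0xFFFFFFFF


def linear_range(base: int, byte_limit: int) -> tuple:
    """Return the first and last linear address covered by the segment."""
    return linear_address(base, 0), linear_address(base, byte_limit)


BASE_A = 0x00012345
for byte_limit in (0x5678, 0x5678FFF):      # G = 0 and G = 1 cases of the example
    first, last = linear_range(BASE_A, byte_limit)
    print(f"{first:08X}H-{last:08X}H")
```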

By increasing the segment limit, the capacity of a segment can be expanded. This works for ordinary data segments, which grow upward in memory, but not for stack segments. Because the bottom of the stack is at a high address, the stack grows toward lower addresses as items are pushed. To accommodate growth of ordinary data segments and stack segments in these two opposite directions, the attributes of a data segment include an expansion direction bit, denoted ED. ED = 0 means the segment expands upward (toward higher addresses), and ED = 1 means it expands downward (toward lower addresses). Generally only stack segments use the expand-down attribute (a stack segment may also be an expand-up segment), because expand-down segments were designed for the following two purposes:

First, the stack segment is defined as a separate segment, that is, DS and SS contain different selectors.

Second, the stack segment grows by being copied to a larger segment (rather than by adding pages to the existing segment). Designers who do not intend to implement their stack this way do not need to define an expand-down segment.

Note that only the attributes of a data segment contain the expansion direction bit ED. In other words, only data segments (the stack segment being a special kind of data segment) can expand downward; all other segments naturally expand upward. The expansion direction and the limit of a data segment together determine the valid range of offsets within it. When the maximum segment size is 1M bytes, in an expand-up segment the offsets from 0 to Limit are valid and the offsets from Limit+1 to 1M-1 are invalid. In an expand-down segment the situation is exactly the opposite: the offsets from 0 to Limit are invalid and the offsets from Limit+1 to 1M-1 are valid. Pay attention to the validity of the address that corresponds to the boundary value Limit itself. When the maximum segment size is 4G, the situation is similar. Thus, in an expand-down segment every offset must be greater than the limit, because the limit is the lower bound and the segment starts from the high end. Conversely, in an expand-up segment every offset must be less than or equal to the limit, because the limit is the upper bound and the segment starts from the low end. By using segment wraparound, an expand-down segment can be placed at any linear address and given any size.

During the conversion of a virtual address to a linear address, the offset is checked. If the offset is not within the valid range, an exception is raised.
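
The offset check for both expansion directions can be sketched in Python as follows. This is a simplified model, not a description of the processor's internal logic: the names offset_is_valid, to_linear and SegmentLimitError are invented for illustration, and the upper bound of an expand-down segment is simply passed in as the assumed parameter upper.

```python
class SegmentLimitError(Exception):
    """Stands in for the exception raised on an out-of-range offset."""


def offset_is_valid(offset, limit, expand_down=False, upper=0xFFFFFFFF):
    """Check an offset against the limit for an expand-up or expand-down segment."""
    if expand_down:
        return limit < offset <= upper    # limit is the lower bound (exclusive)
    return 0 <= offset <= limit           # limit is the upper bound (inclusive)


def to_linear(base, offset, limit, expand_down=False, upper=0xFFFFFFFF):
    """Translate base:offset to a linear address, checking the offset first."""
    if not offset_is_valid(offset, limit, expand_down, upper):
        raise SegmentLimitError(f"offset {offset:#x} is outside the segment")
    return (base + offset) & 0xFFFFFFFF


print(offset_is_valid(0x0FFF, 0x0FFF))                      # expand-up, at the limit
print(offset_is_valid(0x0FFF, 0x0FFF, expand_down=True))    # expand-down, at the limit
```

The two calls show the boundary case: the offset equal to Limit is valid in an expand-up segment and invalid in an expand-down segment.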

The segment attributes specify the main characteristics of a segment. For example, the granularity bit G mentioned above is one of the segment attributes. Whenever a segment is accessed, the access is checked for legality, mainly against the segment attributes. For example, an attempt to write to a read-only segment not only fails but also raises an exception. The definition and function of each attribute bit are described in detail below.

<2> Memory segment descriptors

The data structure that holds the three parameters defining a segment is called a descriptor. Each descriptor is 8 bytes long. In protected mode, every segment has a corresponding descriptor that describes it. According to the kind of object described, descriptors fall into three categories: memory segment descriptors, system segment descriptors, and gate descriptors (control descriptors). Memory segment descriptors are described below.

1. Format of the memory segment descriptor

A memory segment is a segment that holds code or data that a program can access directly. A memory segment descriptor describes such a segment, so it is also called a code and data segment descriptor. Its format is shown in the table below. The upper part of the table shows how the 8 bytes of the descriptor are used: the byte at the lowest address (m) is at the far right, and the remaining bytes follow to the left up to the highest byte (at address m+7). The lower part shows the individual bits of the attribute fields.

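Memory segment descriptor format

  Byte:   m+7            | m+6  m+5   | m+4  m+3  m+2         | m+1  m+0
  Field:  Base (31...24) | Attributes | Segment Base (23...0) | Segment Limit (15...0)

Memory segment descriptor attributes

  Byte m+6:  bit 7 | bit 6 | bit 5 | bit 4 | bits 3...0
             G     | D     | 0     | AVL   | Limit (19...16)

  Byte m+5:  bit 7 | bits 6...5 | bit 4 | bits 3...0
             P     | DPL        | DT    | TYPE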

As can be seen from the above, the 32-bit segment base address (the start address of the segment) is split across two fields of the descriptor: bits 0-23 occupy bytes 2 through 4 of the descriptor, and bits 24-31 occupy byte 7. The 20-bit segment limit is also split across two fields: bits 0-15 occupy bytes 0 and 1 of the descriptor, and bits 16-19 occupy the low 4 bits of byte 6.
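
To make the split fields concrete, here is a small Python sketch that reassembles the base address and the limit from the 8 raw bytes of a descriptor. The function name unpack_base_limit is hypothetical; the sample bytes are the memory image of the CodeA descriptor used later in this article.

```python
def unpack_base_limit(desc: bytes) -> tuple:
    """Reassemble the 32-bit base and the 20-bit limit from an 8-byte descriptor."""
    if len(desc) != 8:
        raise ValueError("a descriptor is exactly 8 bytes long")
    # Limit: bits 0-15 in bytes 0-1, bits 16-19 in the low 4 bits of byte 6.
    limit = desc[0] | (desc[1] << 8) | ((desc[6] & 0x0F) << 16)
    # Base: bits 0-23 in bytes 2-4, bits 24-31 in byte 7.
    base = desc[2] | (desc[3] << 8) | (desc[4] << 16) | (desc[7] << 24)
    return base, limit


raw = bytes([0x10, 0x00, 0x78, 0x56, 0x34, 0x98, 0xC0, 0x12])  # byte 0 first
base, limit = unpack_base_limit(raw)
print(f"base={base:08X}H limit={limit:05X}H")
```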

The base address and the limit are split in this way because of the 80286. In 80286 protected mode, the segment base address is only 24 bits long and the segment limit is only 16 bits long. An 80286 memory segment descriptor is also 8 bytes long, but only the low 6 bytes are used and the high 2 bytes must be set to 0. The layout of the 80386 memory segment descriptor therefore allows 80286 memory segment descriptors to remain valid on the 80386. The segment attributes in the 80386 descriptor are likewise split across two fields. Their definitions and meanings are explained below.

(1) The P bit is the present bit. P = 1 means the descriptor is valid for address translation, that is, the segment it describes is present in memory; P = 0 means the descriptor is invalid for address translation, that is, the segment is not present. Using such a descriptor for a memory access raises an exception.

(2) DPL is the descriptor privilege level, 2 bits in total. It specifies the privilege level of the described segment and is used in privilege checks to decide whether the segment may be accessed.

(3) The DT bit indicates the type of the descriptor. For a memory segment descriptor DT = 1, which distinguishes it from system segment descriptors and gate descriptors (DT = 0).

(4) TYPE specifies the particular properties of the memory segment described by the descriptor.

Bit 0 of TYPE indicates whether the descriptor has been accessed; it is denoted A. A = 0 means the segment has not been accessed, and A = 1 means the segment has been accessed. When the selector for the descriptor is loaded into a segment register, the 80386 sets this bit to 1 to show that the descriptor has been accessed. The operating system can test the accessed bit to determine whether the descriptor has been accessed.

Bit 3 of TYPE indicates whether the described segment is a code segment or a data segment; it is denoted E. E = 0 means the segment is a data segment, and the descriptor is a data segment (including stack segment) descriptor. A data segment is not executable, but it is always readable. E = 1 means the segment is executable, that is, a code segment, and the descriptor is a code segment descriptor. A code segment is never writable. To write to a code segment you must use aliasing: describe the same memory with a writable data segment descriptor and write through that data segment.

In a data segment descriptor (E = 0), bit 1 of TYPE indicates whether the described data segment is writable; it is denoted W. W = 0 means the data segment cannot be written; W = 1 means it can be written. Note that a data segment is always readable. Bit 2 of TYPE is the ED bit, which indicates the expansion direction of the data segment. ED = 0 means the data segment expands upward, that is, offsets within the segment must be less than or equal to the segment limit. ED = 1 means the data segment expands downward, and offsets within the segment must be greater than the segment limit.

In a code segment descriptor (E = 1), bit 1 of TYPE indicates whether the described code segment is readable; it is denoted R. R = 0 means the code segment cannot be read and can only be executed. R = 1 means the code segment can be both read and executed. Note that a code segment is never writable; to write to a code segment you must use aliasing. In a code segment descriptor, bit 2 of TYPE indicates whether the code segment is a conforming code segment; it is denoted C. C = 0 means the code segment is not conforming (an ordinary code segment), and C = 1 means it is a conforming code segment. Conforming code segments are described in detail in a later article.

The attributes specified by the TYPE field of a memory segment descriptor are summarized in the following tables:

Data segment types

  TYPE value | Description
  0          | Read-only
  1          | Read-only, accessed
  2          | Read/write
  3          | Read/write, accessed
  4          | Read-only, expand-down
  5          | Read-only, expand-down, accessed
  6          | Read/write, expand-down
  7          | Read/write, expand-down, accessed

Code segment types

  TYPE value | Description
  8          | Execute-only
  9          | Execute-only, accessed
  A          | Execute/read
  B          | Execute/read, accessed
  C          | Execute-only, conforming
  D          | Execute-only, conforming, accessed
  E          | Execute/read, conforming
  F          | Execute/read, conforming, accessed

(5) G is the granularity bit of the segment limit (Granularity). G = 0 means the limit is in bytes; G = 1 means the limit is in units of 4K bytes. Note that the granularity applies only to the segment limit and not to the segment base address, which is always in bytes.

(6) The D bit is a special bit whose meaning differs among three kinds of descriptors: those describing executable segments, those describing expand-down data segments, and those describing segments addressed through the SS register (usually stack segments).

In a descriptor for an executable segment, the D bit determines the default size of the addresses and operands used by instructions. D = 1 means 32-bit addresses and 32-bit or 8-bit operands by default; such a code segment is called a 32-bit code segment. D = 0 means 16-bit addresses and 16-bit or 8-bit operands by default; such a code segment is called a 16-bit code segment and is compatible with the 80286. The address-size prefix and the operand-size prefix can be used to override the default address size and operand size, respectively.

In a descriptor for an expand-down data segment, the D bit determines the upper bound of the segment. D = 1 means the upper bound of the segment is 4G; D = 0 means the upper bound is 64K, which is compatible with the 80286.

In a descriptor for a segment addressed through the SS register, the D bit determines which stack pointer register is used by implicit stack accesses (such as the PUSH and POP instructions). D = 1 means the 32-bit stack pointer register ESP is used; D = 0 means the 16-bit stack pointer register SP is used, which is compatible with the 80286.

(7) The AVL bit is available to software. The 80386 makes no use of this bit, and Intel guarantees that future processors compatible with the 80386 will not define or restrict its use.

In addition, bit 5 of byte 6 of the descriptor must be set to 0; it can be regarded as reserved for future processors.
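
The attribute bits listed in (1) through (7) can be decoded with a few masks and shifts. The following Python sketch is illustrative only: the names Attributes, decode_attributes and describe_type are invented here, and the two arguments are assumed to be bytes 5 and 6 of a memory segment descriptor.

```python
from typing import NamedTuple


class Attributes(NamedTuple):
    present: bool        # P
    dpl: int             # DPL, 0..3
    memory_segment: bool # DT = 1 for code and data segment descriptors
    seg_type: int        # TYPE, 0..15
    granularity_4k: bool # G
    d: bool              # D
    avl: bool            # AVL


def decode_attributes(byte5: int, byte6: int) -> Attributes:
    """Split the two attribute bytes of a descriptor into named fields."""
    return Attributes(
        present=bool(byte5 & 0x80),
        dpl=(byte5 >> 5) & 0x3,
        memory_segment=bool(byte5 & 0x10),
        seg_type=byte5 & 0x0F,
        granularity_4k=bool(byte6 & 0x80),
        d=bool(byte6 & 0x40),
        avl=bool(byte6 & 0x10),
    )


def describe_type(seg_type: int) -> str:
    """Turn a 4-bit TYPE value into the wording of the TYPE tables."""
    if seg_type & 0x8:                                   # E = 1: code segment
        text = "execute/read" if seg_type & 0x2 else "execute-only"
        text += ", conforming" if seg_type & 0x4 else ""
    else:                                                # E = 0: data segment
        text = "read/write" if seg_type & 0x2 else "read-only"
        text += ", expand-down" if seg_type & 0x4 else ""
    return text + (", accessed" if seg_type & 0x1 else "")


attrs = decode_attributes(0xF2, 0x00)
print(attrs, describe_type(attrs.seg_type))
```

For example, attribute byte 0F2H decodes to P = 1, DPL = 3, DT = 1 and TYPE = 2, a present read/write data segment.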

2. Structure type of memory segment descriptor

Based on the layout of the memory segment descriptor, the following assembly-language structure type can be defined:

DESC    STRUC
LimitL  DW 0    ; segment limit, low 16 bits
BaseL   DW 0    ; segment base, low 16 bits
BaseM   DB 0    ; segment base, middle 8 bits
Attrib  DB 0    ; segment attributes
LimitH  DB 0    ; segment limit, high 4 bits (plus the high 4 bits of the attributes)
BaseH   DB 0    ; segment base, high 8 bits
DESC    ENDS

With the structure type DESC, memory segment descriptors can be declared conveniently in a program. For example, the following descriptor DataS describes a present, readable and writable data segment whose base address is 100000H, whose limit in bytes is 0FFFH, and whose descriptor privilege level is DPL = 3. Fields left empty keep their default value of 0.

DataS   DESC <0FFFH, , 10H, 0F2H, , >

As another example, the following descriptor CodeA describes a present, execute-only 32-bit code segment. Its base address is 12345678H, its limit in units of 4K bytes is 10H (10FFFH in bytes), and its descriptor privilege level is DPL = 0.

CodeA   DESC <10H, 5678H, 34H, 98H, 0C0H, 12H>
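
As a cross-check of these two examples, the DESC layout can be mirrored in Python with struct.pack. This is a sketch under stated assumptions: make_desc is a hypothetical helper whose parameters follow the DESC fields, and the DataS values are taken from its description above (base 100000H, limit 0FFFH, attribute byte 0F2H).

```python
import struct


def make_desc(limit_l=0, base_l=0, base_m=0, attrib=0, limit_h=0, base_h=0) -> bytes:
    """Pack the six DESC fields into 8 bytes: DW, DW, DB, DB, DB, DB (little-endian)."""
    return struct.pack("<HHBBBB", limit_l, base_l, base_m, attrib, limit_h, base_h)


# DataS: base 100000H puts 10H into the middle base byte; the other base fields are 0.
data_s = make_desc(limit_l=0x0FFF, base_m=0x10, attrib=0xF2)
# CodeA: all six fields given explicitly, as in the assembly declaration.
code_a = make_desc(0x10, 0x5678, 0x34, 0x98, 0xC0, 0x12)

print(data_s.hex(" "))
print(code_a.hex(" "))
```

The printed bytes are in memory order, lowest address first, so byte 5 is the attribute byte and byte 7 holds bits 24-31 of the base.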

<3> Global and local descriptor tables

A task involves multiple segments, and each segment needs a descriptor to describe it. For ease of organization and management, the 80386 arranges descriptors in linear tables. A linear table of descriptors is called a descriptor table. The 80386 has three kinds of descriptor tables: the global descriptor table GDT (Global Descriptor Table), the local descriptor table LDT (Local Descriptor Table), and the interrupt descriptor table IDT (Interrupt Descriptor Table). The whole system has only one GDT and one IDT, but there can be several LDTs, and each task can have its own. For example, the following descriptor table contains 6 descriptors:

DescTab LABEL BYTE
Desc1   DESC <1234H, 5678H, 34H, 92H, , >
Desc2   DESC <1234H, 5678H, 34H, 93H, , >
Desc3   DESC <5678H, 1234H, 56H, 98H, , >
Desc4   DESC <5678H, 1234H, 56H, 99H, , >
Desc5   DESC <0FFFH, , 10H, 16H, , >
Desc6   DESC <0FFFH, , 10H, 90H, , >

Each descriptor table itself forms a special data segment. Such a segment can contain up to 8K (8192) descriptors.

The interrupt descriptor table IDT is introduced in a later article.

Each task's local descriptor table LDT contains the descriptors of the task's own code, data, and stack segments, as well as some gate descriptors used by the task, such as task gates and call gates. When the task is switched, the current LDT is switched as well.

The global descriptor table GDT contains descriptors for segments that every task may access. It usually contains the descriptors of the code, data, and stack segments used by the operating system, and also various special data segment descriptors, for example the descriptors of the special data segments that hold each task's LDT. The GDT is not switched when the task is switched.

Through its LDT, each task's private segments are isolated from other tasks, which achieves protection. Through the GDT, segments that every task uses can be shared. The following figure shows how the segments used by task A and task B are both protected through isolation and shared. The code segment CodeA and data segment DataA, private to task A, are isolated from the code segment CodeB and data segments DataB and DataB2, private to task B, by the tasks' separate local descriptor tables; but task A and task B share the code segments CodeK and CodeOS and the data segments DataK and DataOS through the global descriptor table GDT.

The whole virtual address space available to a task is divided into two halves: the descriptors for one half are in the global descriptor table, and the descriptors for the other half are in the local descriptor table. Since the global and local descriptor tables can each contain up to 8192 descriptors, and each descriptor can describe a segment of up to 4G bytes, the maximum virtual address space is:

4GB * 8192 * 2 = 64TB
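
The arithmetic can be verified with powers of two. In this small Python check the constant names are chosen only for readability.

```python
GB = 1 << 30                         # 2**30 bytes
TB = 1 << 40                         # 2**40 bytes

max_segment_size = 4 * GB            # largest segment: 2**32 bytes
descriptors_per_table = 1 << 13      # 8192 descriptors (13-bit index)
tables_per_task = 2                  # the GDT and the task's LDT

virtual_space = max_segment_size * descriptors_per_table * tables_per_task
print(virtual_space // TB, "TB")     # 2**32 * 2**13 * 2 = 2**46 bytes
```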

<4> Segment selectors

In real mode, the address of a memory location in the logical address space consists of two parts: a segment value and an offset within the segment. In protected mode, the address of a memory location in the virtual address space (which corresponds to the logical address space) consists of a segment selector and an offset within the segment. Compared with real mode, the segment selector replaces the segment value.

A segment selector is 16 bits long; its format is shown in the table below. As the table shows, the high 13 bits of the selector are the descriptor index (Index), that is, the ordinal number of the descriptor within its descriptor table. Bit 2 of the selector is the table indicator, denoted TI (Table Indicator): TI = 0 means the descriptor is read from the global descriptor table GDT; TI = 1 means the descriptor is read from the local descriptor table LDT.

Selector format

  Bits 15...3      | Bit 2 | Bits 1...0
  Descriptor index | TI    | RPL

The selector determines the descriptor, the descriptor determines the segment base address, and the segment base address plus the offset gives the linear address. Therefore, a two-dimensional virtual address in the virtual address space, consisting of a selector and an offset, determines a one-dimensional linear address in the linear address space.

The lowest two bits of the selector are the requested privilege level RPL (Requested Privilege Level), which is used in privilege checks. The RPL field is used as follows:

Whenever a program tries to access a segment, the current privilege level (CPL) is compared with the privilege level of the segment being accessed to determine whether the program is allowed to access it. The RPL field of the selector can change this rule: in that case, the privilege level compared with that of the accessed segment is not the CPL but the less privileged of the CPL and the RPL. The CPL is stored in the RPL field of the CS register; whenever a code segment selector is loaded into CS, the processor automatically stores the CPL in the RPL field of CS.

Since the descriptor index field of the selector is 13 bits wide, it can distinguish 8192 descriptors. This is why a descriptor table contains at most 8192 descriptors. Since each descriptor is 8 bytes long, the selector format shown in the table above means that the selector value with its low 3 bits masked off is the byte offset, within the descriptor table, of the descriptor that the selector designates. This can be regarded as the reason the high 13 bits of the selector were chosen as the descriptor index.

There is a special selector called the null selector. Its Index = 0 and TI = 0, and its RPL field can have any value. The null selector has a specific use: an exception is raised when memory is accessed through a null selector. The null selector is specially defined and does not correspond to descriptor 0 of the global descriptor table GDT, so descriptor 0 of the GDT is never accessed by the processor and is generally set to all zeros. However, when TI = 1, a selector whose Index is 0 is not a null selector; it designates descriptor 0 of the current task's local descriptor table LDT.
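
The selector fields, the descriptor offset, and the null selector test can be sketched in Python as follows. The function names decode_selector, descriptor_offset and is_null_selector are hypothetical and serve only to illustrate the bit layout.

```python
def decode_selector(selector: int) -> tuple:
    """Split a 16-bit selector into (index, ti, rpl)."""
    index = (selector >> 3) & 0x1FFF   # high 13 bits: descriptor index
    ti = (selector >> 2) & 0x1         # bit 2: 0 = GDT, 1 = LDT
    rpl = selector & 0x3               # low 2 bits: requested privilege level
    return index, ti, rpl


def descriptor_offset(selector: int) -> int:
    """Byte offset of the descriptor in its table: selector with low 3 bits masked."""
    return selector & 0xFFF8


def is_null_selector(selector: int) -> bool:
    """A null selector has Index = 0 and TI = 0; RPL may be anything."""
    return (selector & 0xFFFC) == 0


for sel in (0x001B, 0x0003, 0x0004):
    print(f"{sel:04X}H", decode_selector(sel), descriptor_offset(sel), is_null_selector(sel))
```

Selector 0003H is a null selector despite its non-zero RPL, while 0004H (Index = 0, TI = 1) designates descriptor 0 of the LDT and is not null.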

<5> Segment descriptor cache registers

In real mode, a segment register contains a segment value. To form a physical address, the processor takes the corresponding segment register and multiplies its value by 16 to form a 20-bit segment base address. In protected mode, a segment register contains a segment selector; as described above, to form a linear address for a memory access, the processor must use the base address in the descriptor designated by the selector. To avoid having to fetch the segment descriptor from the descriptor table on every memory access, starting with the 80286 each segment register has an associated register that is invisible to the programmer, called a segment descriptor cache register or descriptor shadow register. Whenever a selector is loaded into a segment register, the processor automatically fetches the corresponding descriptor from the descriptor table and saves the information in the descriptor in the corresponding cache register. From then on, the processor uses the descriptor information in the cache register for accesses to that segment and does not need to fetch the descriptor from the descriptor table again.

The contents of the segment descriptor cache registers are shown in the table below. The 32-bit segment base address is taken directly from the descriptor; the 32-bit segment limit is derived from the 20-bit segment limit in the descriptor, converted to bytes according to the granularity bit in the descriptor attributes. The other ten attributes are determined from the attributes in the descriptor: "Y" means yes, "N" means no, "R" means must be readable, "W" means must be writable, "P" means must be present, and "D" means determined by the attributes in the descriptor.

Segment descriptor cache register contents

The ten attribute columns are, in order: present, privilege level, accessed, granularity, expansion direction, readable, writable, executable, stack size, conforming. A dash means the attribute does not apply.

  Register | Segment base | Segment limit | Attributes
  CS       | 32-bit base  | 32-bit limit  | P D D D - D N Y - D
  SS       | 32-bit base  | 32-bit limit  | P D D D D R W N D -
  DS       | 32-bit base  | 32-bit limit  | P D D D D D D N - -
  ES       | 32-bit base  | 32-bit limit  | P D D D D D D N - -
  FS       | 32-bit base  | 32-bit limit  | P D D D D D D N - -
  GS       | 32-bit base  | 32-bit limit  | P D D D D D D N - -

The segment descriptor cache registers are inside the processor, so they can be accessed quickly. In most cases, memory accesses take place after the corresponding selector has been loaded into a segment register, so the segment descriptor cache registers give good execution performance.

The descriptor information saved in a segment descriptor cache register remains there until a selector is next loaded into the segment register. Although programmers cannot see the segment descriptor cache registers, they must be aware of their existence and of when they are updated. For example, after a descriptor in the descriptor table is changed, the contents of the corresponding segment descriptor cache register must also be updated, even if the segment selector has not changed; this can be done by reloading the segment register.
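
The effect of a stale cache register can be illustrated with a toy Python model. Everything here is a simplification invented for illustration: the class SegmentRegister, the dictionary-based table gdt, and the fact that the TI bit, privilege checks and attributes are ignored.

```python
class SegmentRegister:
    """Toy model: a visible selector plus a hidden copy of the descriptor."""

    def __init__(self):
        self.selector = 0
        self.cache = None                      # hidden part, filled on load

    def load(self, selector, table):
        """Load a selector and copy the designated descriptor into the cache."""
        self.selector = selector
        self.cache = dict(table[selector >> 3])

    def linear(self, offset):
        """Form a linear address from the cached base, not from the table."""
        if offset > self.cache["limit"]:
            raise ValueError("offset exceeds the cached segment limit")
        return self.cache["base"] + offset


gdt = [None, {"base": 0x100000, "limit": 0x0FFF}]   # entry 0 is unused
ds = SegmentRegister()
ds.load(0x0008, gdt)
before = ds.linear(0x10)

gdt[1]["base"] = 0x200000        # change the descriptor in the table
stale = ds.linear(0x10)          # the cache still holds the old base

ds.load(0x0008, gdt)             # reload the same selector
after = ds.linear(0x10)
print(f"{before:#x} {stale:#x} {after:#x}")
```

In this model the change to the table only takes effect after the segment register is reloaded, even though the selector value stays the same.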


