Memory and Process Manager
==========================
But I Fear Tomorrow I'll Be Crying,
Yes I Fear Tomorrow I'll Be Crying.
King Crimson '69, "Epitaph"
Plenty of high-level material about the Windows NT Memory Manager already exists, so this article does not go over the flat model or the basics of virtual memory again. It covers only the low-level specifics. I assume the reader knows the i386 architecture.
Table of Contents
=================
00. Kernel process and thread structures
01. Page Table
02. Hyper Space
03. System PTEs
04. Page Frame Number Database (MmPfnDatabase)
05. Working Set
06. Paging out to the page file
07. Page fault processing
08. Process creation from the memory manager's point of view
09. Context
0A. Some undocumented memory manager functions
0B. Conclusion
Appendix
0C. Some undocumented system calls
0D. Notes and code analysis drafts
00. Kernel process and thread structures
========================================
Each process in Windows NT is represented by an EPROCESS structure. Besides the attributes of the process itself, it references other structures that are closely tied to the process. For example, each process has one or more threads, and threads are represented in the system by ETHREAD structures. I will briefly describe the main information held in EPROCESS, as learned from studying kernel functions. First, EPROCESS embeds a KPROCESS structure, which contains a pointer to the process's kernel threads (KTHREAD), the page directory base (the address space of the process), the base priority, the time the process's threads have spent in kernel mode and in user mode, the processor affinity (a mask that defines which processors may run the process's threads), and the time slice. EPROCESS itself holds the process ID, the parent process ID, the process image name, and a section pointer. The quota fields define the limits on paged and nonpaged pool usage. The VAD (Virtual Address Descriptor) tree describes the state of the memory regions in the user address space. The Working Set information defines which physical pages belong to the process at a given time, together with its limits and statistics. The access token describes the security attributes of the process. The handle table describes the handles of the objects opened by the process; it makes it possible not to check access rights on every access to an object. EPROCESS also contains a pointer to the PEB.
The ETHREAD structure includes the creation and exit times, the process ID and a pointer to the process, the start address, the list of I/O requests, and an embedded KTHREAD structure. KTHREAD contains the following: the time the thread has spent in kernel mode and in user mode, pointers to the kernel stack base, a pointer to the service table, the base priority, the current priority, a pointer to the APC list, and a pointer to the TEB. KTHREAD contains much other data as well; its layout can be worked out by examining that data.
01. Page Table
==============
An operating system normally uses page tables to manage memory. In Windows NT each process has its own private page tables (shared by all threads of the process), so the page tables are switched whenever the process is switched. To speed up access to the page tables, the hardware provides a Translation Lookaside Buffer (TLB). Windows NT uses a two-level translation scheme. On the 386 processor, a virtual address is translated to a physical address as follows (segmentation is not considered):
                     Virtual Address
 31              22 21              12 11               0
 ---------------------------------------------------------
 | Directory Index  | Page Table Index |  Offset In Page  |
 ---------------------------------------------------------
          |                  |                  |
          |                  |                  |
          v                  v                  v
  Page Directory        Page Table            Frame
      (4KB)               (4KB)               (4KB)
 CR3 -> --------     --> --------      --> --------
        |  0   |     |   |  0   |      |   |      |
        |  1   |     |   |  1   |      |   |      |
        | ...  |     |   | ...  |      |   | BYTE |
        | PDE  | -----   | PTE  | ------   |      |
        | ...  |         | ...  |          |      |
        | 1023 |         | 1023 |          |      |
        --------         --------          --------
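The split shown in the diagram can be sketched in a few lines of C. This is only an illustration of the standard i386 4KB-page layout; the type `va_parts` and the function `split_va` are hypothetical names, not kernel identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions of an i386 virtual address with 4KB pages. */
#define DIR_SHIFT   22
#define TABLE_SHIFT 12
#define INDEX_MASK  0x3FFu   /* 10 bits: 1024 entries per table */
#define OFFSET_MASK 0xFFFu   /* 12 bits: 4096 bytes per frame   */

/* Hypothetical helper type: the three parts of a virtual address. */
typedef struct {
    uint32_t dir_index;    /* selects a PDE in the page directory */
    uint32_t table_index;  /* selects a PTE in the page table     */
    uint32_t offset;       /* byte offset inside the 4KB frame    */
} va_parts;

static va_parts split_va(uint32_t va)
{
    va_parts p;
    p.dir_index   = (va >> DIR_SHIFT) & INDEX_MASK;
    p.table_index = (va >> TABLE_SHIFT) & INDEX_MASK;
    p.offset      = va & OFFSET_MASK;
    return p;
}
```

For example, `split_va(0xC0301408)` yields directory index 0x300, table index 0x301, and offset 0x408.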
Windows NT 4.0 uses a flat address model. The NT address space is 4GB. The low 2GB (addresses 0 - 0x7FFFFFFF) belong to the current user process, while the high 2GB (0x80000000 - 0xFFFFFFFF) belong to the kernel. On a context switch the CR3 register is reloaded, which replaces the user address space and thereby isolates processes from one another.
Note: starting with version 4.0, Windows NT uses 4MB pages (Pentium and higher) in addition to 4KB pages to map the kernel code. However, Windows NT has no general support for 4MB pages.
The format of PTE and PDE is actually the same.
PTE
 31                          12 11  10   9   8   7   6   5   4   3   2   1   0
--------------------------------------------------------------------------------
|                              | T | P | C | U | R | D | A | P | P | U | R | P |
|    Base Address (20 bits)    | R | P | W |   |   |   |   | C | W | S | W |   |
|                              | N | T |   |   |   |   |   | D | T |   |   |   |
--------------------------------------------------------------------------------
Some important bits are defined on the i386 as follows:
---------------------------------------------------------------------------
P   - Present bit. If it is not set, an exception is raised during address translation. In a number of cases the NT kernel uses PTEs that do not have this bit set. For example, when a page has been paged out to the page file, the remaining bits record its location in the page file and the page file number.
U/S - Whether the page can be accessed from user mode. This bit is what protects the kernel space (usually 2GB).
R/W - Whether the page can be written.
NT also uses the bits that are left available to the OS designer:
---------------------------------------------------------------------------
PPT - Proto PTE
TRN - Transition PTE
When the P bit is not set, bits 5 through 9 form a field used in page fault processing. It is called the Protection Mask and looks as follows:
---------------------------------------------------------------------------
* MiCreatePagingFileMap
  9 8 7 6 5
  ---------
  | | | | |
  | | | | ---- Write Copy
  | | | ------ Execute
  | | -------- Write
  | ---------- No Cache
  ------------ Guard
The Guard | NoCache combination means No Access.
* MmGetPhysicalAddress
The function is very short, but a lot can be learned from it. The virtual addresses 0xC0000000 - 0xC03FFFFF hold the page tables of the process. The mapping mechanism is quite elegant: the entry of the Directory Table (hereinafter DT) that corresponds to addresses 0xC0000000 - 0xC03FFFFF points to the DT itself, so for these addresses the DT acts as a page table! Take, for example, this address (in binary, for readability):
1100000000.0000000101.0000001001.00b
---------- ---------- -------------
 0xC0...   page table   offset in
 selects   selection    page table
 the page
 directory
This gives us PTE 1001b of page table 101b. But that is not all: the DT itself is mapped at addresses 0xC0300000 - 0xC0300FFC. MmSystemPteBase holds the value 0xC0300000. An example shows why:
1100000000.1100000000.0000001001.00b
---------- ---------- -------------
 0xC0...    0xC0...     offset in
 selects    selects     page directory
 the page   the page
 directory  directory
            as the
            page table
Finally, address 0xC0300C00 holds the PDE for the directory itself. The base address stored in this PDE is kept in MmSystemPageDirectory. The system also reserves a PTE for mapping the physical page MmSystemPageDirectory; it is MmSystemPagePtes.
This layout simplifies address arithmetic. For example, given the address of a PTE, the address of the page it describes is PTE << 10. Conversely: PTE = (Addr >> 10) + 0xC0000000, with the two low bits of the result cleared.
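The address arithmetic of the self-mapped page tables can be sketched as follows. The constant 0xC0000000 comes from the text; the function names (`pte_address`, `pde_address`, `va_of_pte`) are hypothetical helpers written for this illustration, not kernel routines.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_BASE 0xC0000000u  /* start of the self-mapped page tables */

/* Address of the PTE that describes virtual address va.
 * Shifting right by 12 and left by 2 equals (va >> 10) with the
 * two low bits cleared. */
static uint32_t pte_address(uint32_t va)
{
    return PTE_BASE + ((va >> 12) << 2);
}

/* Address of the PDE for va: the "PTE of the PTE". */
static uint32_t pde_address(uint32_t va)
{
    return pte_address(pte_address(va));
}

/* Base address of the page described by the PTE at pte_addr.
 * The upper bits fall off in 32-bit arithmetic. */
static uint32_t va_of_pte(uint32_t pte_addr)
{
    return (uint32_t)(pte_addr << 10);
}
```

With these helpers, `pte_address(0xC0300000)` gives 0xC0300C00, the PDE of the directory itself, which matches the address quoted above.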
In addition, the kernel has a global variable MmKseg2Frame = 0x20000. It gives the number of frames that are mapped directly to physical memory starting at 0x80000000; that is, the virtual addresses 0x80000000 - 0x9FFFFFFF are mapped to the physical addresses 00000000 - 1FFFFFFF.
There are a few more interesting points. The page tables describing addresses 0 - 7FFFFFFF take 0x1000 * 0x200 = 0x200000 = 2MB and begin at C0000000. The PDEs that describe these page tables are at C0300000 - C03007FC. On the i486, addresses C0200000 - C027FFFC would hold the page tables describing the 512MB from 80000000 to A0000000, but on the Pentium the area 0xC0300800 - 0xC03009FC holds 4MB PDEs, which describe 4MB physical pages from 0 to 1FC00000 in steps of 00400000; that is, 4MB pages are used. The virtual addresses corresponding to these PDEs are 80000000 - 9FFFFFFF. This gives the following layout of the page tables:
Range C0000000 - C01FFFFC page tables for 00000000 - 7FFFFFFF
Range C0200000 - C027FFFC "eaten" by the addresses of the 4MB pages
Range C0280000 - C02FFFFC page tables for A0000000 - BFFFFFFF
Range C0300000 - C0300FFC the PD itself (describes C0000000 - C03FFFFF)
Range C0301000 - C03013FC page table for C0400000 - C04FFFFF, HyperSpace (more precisely, 1/4 of HyperSpace)
Range C0301400 - C03FFFFC page tables for C0500000 - FFFFFFFF
Note: addresses 0xC0301000 - 0xC0301FFC hold the page table that describes Hyper Space. This is a region of the kernel address space whose contents are mapped differently for each process (the rest of the kernel space, by contrast, is the same in the context of every user process). It is the process-private part of kernel space. For example, the Working Set is located in Hyper Space. The first 256 PTEs of this page table (the first 1/4 of Hyper Space) are reserved for the kernel and are used when a virtual address has to be mapped to a frame quickly.
Here is an example of address translation for the region 0xC0200000 - 0xC027F000.
1100000000.1000000000.000000000000b = 0xC0200000
1) PDE #1100000000 is taken (a 4KB page); it selects the page directory itself.
2) PTE #1000000000 is selected in the directory (C0300800).
This is a 4MB PDE, but the page size bit is ignored here
because the PDE is being used as a PTE. As a result, C0200000 - C0200FFF is mapped to
80000000 - 80000FFF.
C0201000 is mapped to the next 4MB page: 80400000 - 80400FFF.
And so on up to C027F000, which is mapped to 9FC00000.
The PTEs that would be located at C0200000 - C027FFFC describe 80000000 - 9FFFFFFF (512MB).
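The arithmetic of this example can be written out as a small sketch. It assumes exactly the layout described above (4MB PDEs reused as PTEs, kernel mapping starting at 0x80000000); `window_target` is a hypothetical name used only here.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_BASE 0xC0200000u  /* start of the "eaten" page-table range  */
#define KSEG_BASE   0x80000000u  /* start of the directly mapped region    */
#define LARGE_PAGE  0x00400000u  /* 4MB step between consecutive 4MB PDEs  */

/* For a page-aligned address inside C0200000 - C027F000, return the
 * kernel virtual address whose first 4KB appears at that address
 * when the 4MB PDE is interpreted as an ordinary PTE. */
static uint32_t window_target(uint32_t window_va)
{
    uint32_t n = (window_va - WINDOW_BASE) >> 12;  /* index of the 4MB PDE */
    return KSEG_BASE + n * LARGE_PAGE;
}
```

The last page of the range, C027F000, corresponds to index 0x7F and therefore to 9FC00000, as stated in the example.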
02. Hyper Space
===============
HyperSpace is a 4MB region of kernel space whose contents are mapped differently for each process. A single page table is enough to translate 4MB. This table is located at addresses 0xC0301000 - 0xC0301FFC (its PDE is at 0xC0300C04). Internally, the kernel maps a physical page into HyperSpace (when a virtual address is needed quickly for a frame) with:
DWORD MiMapPageInHyperSpace (DWORD BaseAddr, OUT PDWORD Irql);
It returns a virtual address in HyperSpace that is mapped to the requested physical page. How does this function work, and what does it use? The kernel has these variables:
MmFirstReservedMappingPte = 0xC0301000
MmLastReservedMappingPte = 0xC03013FC
These two variables delimit 255 PTEs, which describe the region:
0xC0400000 - 0xC04FFFFF (1/4 of HyperSpace)
The entry at MmFirstReservedMappingPte is a PTE whose base address field serves as a counter (from 0 to 255); the PTE is of course invalid (its P bit is clear). The PTE for the desired page is written at the position given by the current counter value, and the counter behaves like a stack that grows downward, starting from FF. This is not the only case in which a PTE inside a page table is used this way.
03. System PTEs
===============
The kernel has a region of memory called the system PTEs. What are system PTEs, and how does the kernel use them?
* See the function MiReserveSystemPtes (...)
The system maintains certain structures for free PTEs. First, to satisfy intensive requests quickly (when the kernel needs PTEs to map some physical pages), there is a System PTE pool. The pool holds blocks of PTEs (requests are satisfied in whole blocks), and a block contains 1, 2, 4, 8, or 16 PTEs.
The system has the following tables:
BYTE MmSysPteTables [17] = {0, 0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4};
DWORD MmSysPteIndex [5] = {1, 2, 4, 8, 16};
DWORD MmFreeSysPteListBySize [5];
PPTE MmLastSysPteListBySize [5];
DWORD MmSysPteListBySizeCount [5];
DWORD MmSysPteMinimumFree [5] = {100, 50, 30, 20, 20};
PVOID MmSystemPteBase; // 0xC0200000
The free PTEs in the pool are organized into lists (the PTEs live in the page table, so the list structure itself is located in the page table). A list element looks like this:
typedef struct _FREE_SYSTEM_PTES_BLOCK {
/* PTE0 */ SYSPTE_REF NextRef; // refers to the next block
/* PTE1 */ DWORD FlushUnkn; // used when flushing
/* PTE2 */ DWORD ArrayOfNulls [ANY_SIZE_ARRAY]; // free PTEs
} FREE_SYSTEM_PTES_BLOCK, *PFREE_SYSTEM_PTES_BLOCK;
The address of the PTE that NextRef refers to is obtained as: va = (NextRef >> 10) + MmSystemPteBase (the low 10 bits are always 0, and accordingly the P bit is 0). The NextRef field of the last element is 0xFFFFF000 (-1). Accordingly, there are 5 lists (for blocks of 1, 2, 4, 8, and 16 PTEs).
* See the functions MiReserveSystemPtes2 (...) / MiInitializeSystemPtes
In addition to the pool, there are plain lists of free system PTEs.
PPTE MmSystemPtesStart [2];
PPTE MmSystemPtesEnd [2];
SYSPTE_REF MmFirstFreeSystemPte [2];
DWORD MmTotalFreeSystemPtes [2];
As the declarations show, there are two such lists. A list element looks like this:
typedef struct _FREE_SYSTEM_PTES {
SYSPTE_REF Next; // #define ONLY_ONE_PTE_FLAG 2, last = 0xFFFFF000
DWORD NumOfFreePtes;
} FREE_SYSTEM_PTES, *PFREE_SYSTEM_PTES;
List 1 is, in principle, not used. List 0 receives PTEs when they are released (MiReleaseSystemPtes); released PTEs may also end up in the System PTE pool. If more than 16 PTEs are requested from MiReserveSystemPtes (...), they are allocated from list 0. In other words, list 0 is associated with the pool and list 1 is not.
To keep the results of this work consistent with the TLB, the system either reloads CR3 or uses the invlpg instruction. The "high-level" function
MiFlushPteList (PTE_LIST *PteList, BOOLEAN bFlushCounter, DWORD PteValue);
does the following:
it initializes the PTEs and executes invlpg (an assembly instruction) for them.
typedef struct _PTE_LIST {
DWORD Counter; // max = 15
PVOID PtePointersInTable [15];
PVOID PteMappingAddresses [15];
} PTE_LIST;
If Counter is greater than 15, KeFlushCurrentTb (which simply reloads CR3) is called instead, and if bFlushCounter is set, 0x1000 is added to MmFlushCounter.
04. Page Frame Number Database (MmPfnDatabase)
==============================================
The kernel keeps information about physical pages in the PFN database (MmPfnDatabase). In essence it is just an array of structures, each 0x18 bytes long. Each structure corresponds to one physical page, in order, which is why an element is often referred to by its PFN (Page Frame Number). The number of structures equals the number of 4KB pages in the system (or the number of pages visible to the kernel: if needed, an option in boot.ini can hide "bad" pages from the NT kernel). The structure looks roughly like this:
typedef struct _PfnDatabaseEntry
{
    union {
        DWORD NextRef;       // 0x0  if the frame is on a list, the number of the next frame;
                             //      the last one holds -1
        DWORD Misc;          //      otherwise the content depends on the context,
                             //      see the pseudo code (usually TmpPfn->0 ...);
                             //      usually a *KTHREAD, *KPROCESS,
                             //      *PAGESUPPORT_BLOCK ...
    };
    PPTE PtePpte;            // 0x4  points to the PTE or PPTE
    union {                  // 0x8
        DWORD PrevRef;       //      previous frame, or -1 for the first one
        DWORD ShareCounter;  //      share counter
    };
    WORD Flags;              // 0xC  see below
    WORD RefCounter;         // 0xE  reference count
    DWORD Trans;             // 0x10 ?? see below; used for the page file
    DWORD ContFrame;         // 0x14 containing frame
} PfnDatabaseEntry;
/*
Flags (names taken from WinDbg !pfn)
Mask  Bit     Name   Value
----  ------  -----  ----------------------------------
0001  0       M      Modified
0002  1       R      ReadInProgress
0004  2       W      WriteInProgress
0008  3       P      Shared
0070  [4:6]   Color  Color (in fact always null for x86)
0080  7       X      ParityError
0700  [8:10]  State  0 - Zeroed
              /List  1 - Free
                     2 - Standby
                     3 - Modified
                     4 - ModifiedNoWrite
                     5 - BadPage
                     6 - Active
                     7 - Trans
0800  11      E      InPageError
*/
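The flag layout in the table can be decoded with a few masks. The following is a minimal sketch based only on the masks listed above; the enum, macro, and function names are hypothetical and chosen for readability.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Single-bit flags, taken from the table above. */
#define PFN_MODIFIED        0x0001u
#define PFN_READ_IN_PROG    0x0002u
#define PFN_WRITE_IN_PROG   0x0004u
#define PFN_SHARED          0x0008u
#define PFN_PARITY_ERROR    0x0080u
#define PFN_INPAGE_ERROR    0x0800u

/* State values stored in bits [8:10]. */
enum pfn_state {
    PFN_ZEROED, PFN_FREE, PFN_STANDBY, PFN_MODIFIED_LIST,
    PFN_MODIFIED_NOWRITE, PFN_BAD, PFN_ACTIVE, PFN_TRANS
};

static enum pfn_state pfn_state_of(uint16_t flags)
{
    return (enum pfn_state)((flags >> 8) & 0x7u);
}

static unsigned pfn_color_of(uint16_t flags)
{
    return (flags >> 4) & 0x7u;   /* bits [4:6] */
}

static const char *pfn_state_name(uint16_t flags)
{
    static const char *names[8] = {
        "Zeroed", "Free", "Standby", "Modified",
        "ModifiedNoWrite", "BadPage", "Active", "Trans"
    };
    return names[pfn_state_of(flags)];
}
```

A Flags value of 0x0601, for instance, decodes as an Active frame with the Modified bit set.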
The Trans field is used when the contents of the frame are in the page file or in an image file; it holds the corresponding Page File PTE.
Here are examples of PTEs with the P bit clear (the format of such a PTE is defined by the OS, not by the platform architecture).
* See also MiReleasePageFileSpace (Trans)
Page File PTE
 31                  12  11    10    9        5   4        1   0
------------------------------------------------------------------
|        Offset        | TRN | PPT | Protect.   | Page File  | 0 |
|                      |     |     | Mask       | Num        |   |
------------------------------------------------------------------
Transition PTE
 31                  12  11    10    9        5   4        1   0
------------------------------------------------------------------
|         PFN          | TRN | PPT | Protect.   | CD WT O W  | 0 |
|                      |     |     | Mask       |            |   |
------------------------------------------------------------------
W - Write
O - Owner
WT - Write Through
CD - Cache Disable
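The Page File PTE layout above can be unpacked field by field. This is a sketch under the assumption that the bit positions are exactly those drawn in the diagram; the structure `soft_pte` and the function `decode_pagefile_pte` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical unpacked view of a PTE whose P bit is clear. */
typedef struct {
    uint32_t offset;      /* bits 31..12: page offset in the page file */
    unsigned transition;  /* bit 11: TRN                               */
    unsigned prototype;   /* bit 10: PPT                               */
    unsigned protection;  /* bits 9..5: Protection Mask                */
    unsigned pagefile;    /* bits 4..1: page file number (0..15)       */
    unsigned present;     /* bit 0: must be 0 for a Page File PTE      */
} soft_pte;

static soft_pte decode_pagefile_pte(uint32_t pte)
{
    soft_pte s;
    s.offset     = pte >> 12;
    s.transition = (pte >> 11) & 0x1u;
    s.prototype  = (pte >> 10) & 0x1u;
    s.protection = (pte >> 5) & 0x1Fu;
    s.pagefile   = (pte >> 1) & 0xFu;
    s.present    = pte & 0x1u;
    return s;
}
```

The four-bit page file number agrees with the limit of 16 page files; a value such as 0x01234086 decodes to offset 0x1234, protection mask 4, and page file 3.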
All of this may still be hard to follow, but it will become clearer below. Of course, this structure is undocumented. Obviously, the entries can be organized into lists. The following structure tracks the frame lists:
struct _MmPageLocationList {
    PPfnListHeader ZeroedPageListHead;          // &MmZeroedPageListHead
    PPfnListHeader FreePageListHead;            // &MmFreePageListHead
    PPfnListHeader StandbyPageListHead;         // &MmStandbyPageListHead
    PPfnListHeader ModifiedPageListHead;        // &MmModifiedPageListHead
    PPfnListHeader ModifiedNoWritePageListHead; // &MmModifiedNoWritePageListHead
    PPfnListHeader BadPageListHead;             // &MmBadPageListHead
} MmPageLocationList;
It contains 6 lists. The field names describe their purpose well. The state of a frame is closely tied to these lists. The frame states are listed below:
---------------------------------------------------------------------------
| Status          | Description                                     | List |
---------------------------------------------------------------------------
| Zeroed          | Free page that has been zeroed                  | 0    |
| Free            | Free page                                       | 1    |
| Standby         | Page not in use but easy to recover             | 2    |
| Modified        | Dirty page to be paged out                      | 3    |
| ModifiedNoWrite | Dirty page that is not paged out                | 4    |
| Bad             | Unusable page (has errors)                      | 5    |
| Active          | Active page, mapped by at least one virtual     | -    |
|                 | address                                         |      |
---------------------------------------------------------------------------
A frame may be on one of the six lists or on none of them (state Active). If the page belongs to a process, it is recorded in the working set (see below). Frames used by the memory manager itself, however, may not be recorded anywhere at all.
The list heads have the following form:
typedef struct _PfnListHeader {
    DWORD Counter;  // number of frames in the list
    DWORD LogNum;   // list index: 0 - Zeroed, 1 - Free, etc.
    DWORD FirstFn;  // number of the first frame in MmPfnDatabase
    DWORD LastFn;   // number of the last frame
} PfnListHeader, *PPfnListHeader;
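To illustrate how a list can be threaded through the database by frame numbers rather than pointers, here is a simplified user-mode sketch. It models only the link fields and the list header; the names `pfn_entry`, `pfn_list`, `list_push_tail`, and `list_pop_head` are hypothetical, and whether the kernel inserts at the head or the tail of a given list is an assumption made here for simplicity.

```c
#include <assert.h>
#include <stdint.h>

#define NO_FRAME 0xFFFFFFFFu   /* -1 marks the end of a list */

/* Only the link fields of a PFN database entry are modeled. */
typedef struct {
    uint32_t next;   /* number of the next frame, or NO_FRAME     */
    uint32_t prev;   /* number of the previous frame, or NO_FRAME */
} pfn_entry;

typedef struct {
    uint32_t count;     /* number of frames in the list     */
    uint32_t list_num;  /* 0 - Zeroed, 1 - Free, ...        */
    uint32_t first;     /* first frame number, or NO_FRAME  */
    uint32_t last;      /* last frame number, or NO_FRAME   */
} pfn_list;

static void list_init(pfn_list *l, uint32_t list_num)
{
    l->count = 0;
    l->list_num = list_num;
    l->first = NO_FRAME;
    l->last = NO_FRAME;
}

/* Append frame to the tail of list l inside database db. */
static void list_push_tail(pfn_entry *db, pfn_list *l, uint32_t frame)
{
    db[frame].next = NO_FRAME;
    db[frame].prev = l->last;
    if (l->last != NO_FRAME)
        db[l->last].next = frame;
    else
        l->first = frame;
    l->last = frame;
    l->count++;
}

/* Remove and return the first frame, or NO_FRAME if the list is empty. */
static uint32_t list_pop_head(pfn_entry *db, pfn_list *l)
{
    uint32_t frame = l->first;
    if (frame == NO_FRAME)
        return NO_FRAME;
    l->first = db[frame].next;
    if (l->first != NO_FRAME)
        db[l->first].prev = NO_FRAME;
    else
        l->last = NO_FRAME;
    l->count--;
    return frame;
}
```

Because the links are array indices, moving a frame between lists never allocates memory; only a few fields in the database and in the list headers change.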
In addition, free frames (Zeroed or Free) can be addressed by "color". This is easy to understand from the pseudo code in the appendix. Here are two structures:
struct {
    ColorHashItem *Zeroed; // (-1) if none
    ColorHashItem *Free;
} MmFreePagesByColor;
typedef struct _ColorHashItem {
    DWORD FrameNum;
    PfnDatabaseEntry *Pfn;
} ColorHashItem;
There is a set of functions that use color to handle frames, for example MiRemovePageByColor (FrameNum, Color). The purpose of these functions is easy to guess from their names, parameters, and return values, so they are not described here. Note that these functions are not exported. When color is used, the color mask is applied and the final color is then selected. Windows NT complies with the C2 security level, so a page must be zeroed before it is given to a process. Let's look at the system process thread that zeroes frames. At the end of Phase1Initialization (), MmZeroPageThread is called. As is easy to guess, this thread zeroes free pages and moves them to the list of zeroed pages.
MmZeroPageThread ()
{
    //
    // ... unimportant details skipped ;)
    //
    while (1)
    {
        KeWaitForSingleObject (MmZeroingPageEvent, 8, 0, 0); // wait for the event
        while (!KeTryToAcquireSpinLock (MmPfnLock, &OldIrql)); // lock the PFN database
        while (MmFreePageListHead.Counter) {
            FrameNum = MmFreePageListHead.FirstFn;
            MiRemoveAnyPage (FrameNum & MmSecondaryColorMask);
            // take the page off the free list
            Va = MiMapPageToZeroInHyperSpace (FrameNum);
            KeLowerIrql (OldIrql);
            memset (Va, 0, 0x1000); // zero the page
            while (!KeTryToAcquireSpinLock (MmPfnLock, &OldIrql));
            MiInsertPageInList (&MmZeroedPageListHead, FrameNum);
            // put the zeroed page on the zeroed list
        }
        MmZeroingPageThreadActive = 0; // clear the flag
        KeLowerIrql (OldIrql);
    }
    // never exits
}
// The function simply maps the frame at a fixed address
// so that it can be zeroed
DWORD MiMapPageToZeroInHyperSpace (FrameNum)
{
    // if (FrameNum ...   (the rest of this check is lost in the source)
    TmpPte = 0xC0301404;
    TmpVa = 0xC0501000;
    *TmpPte = 0;
    invlpg ((void *) TmpVa); // asm instruction in fact
    *TmpPte = FrameNum << 12 | ValidPtePte;
    return TmpVa; // always 0xC0501000
}
When is MmZeroingPageEvent signaled? This happens when a frame is added to the free page list:
MiInsertPageInList ()
{
    .....
    if (MmFreePageListHead.Counter >= MmMinimumFreePagesToZero &&
        !MmZeroingPageThreadActive)
    {
        MmZeroingPageThreadActive = 1;
        KeSetEvent (&MmZeroingPageEvent, 0, 0);
    }
    ....
}
Note: the kernel does not always rely on this thread. Sometimes you come across code that takes a free page and zeroes it itself.
05. Working Set
===============
A working set is the set of physical pages that currently belong to a process. The memory manager uses certain mechanisms to track the working set of each process. A working set has two limits: the maximum working set and the minimum working set. The memory manager maintains the working set of a process based on these two values (the working set size is kept no smaller than the minimum and no larger than the maximum). Under certain conditions the working set is trimmed, and the trimmed frames go to the free lists. In the kernel a working set is described by a group of structures. At offset 0xC8 of the process structure (NT 4.0) there is the following structure:
typedef struct _VM {
/* C8 */ LARGE_INTEGER UpdateTime;        // 0
/* D0 */ DWORD Pages;                     // 8  called so by the S-ICE authors
/* D4 */ DWORD PageFaultCount;            // 0C faults; in fact the number of
                                          //    MiLocateAndReserveWsle calls
/* D8 */ DWORD PeakWorkingSetSize;        // 10 these four fields are all
/* DC */ DWORD WorkingSetSize;            // 14 in pages,
/* E0 */ DWORD MinimumWorkingSet;         // 18 not in
/* E4 */ DWORD MaximumWorkingSet;         // 1C bytes
/* E8 */ PWS_LIST WorkingSetList;         // 20 data table
/* EC */ LIST_ENTRY WorkingSetExpansion;  // 24 expansion
/* F4 */ BYTE fl0;                        // 2C operation ???
         BYTE fl1;                        // 2D always 2 ???
         BYTE fl2;                        // 2E reserved ??? always 0
         BYTE fl3;                        // 2F
} VM, *PVM;
The WinDbg extension command !procfields can be used to examine VM. The important values are the number of page faults, MaximumWorkingSet, and MinimumWorkingSet; the manager maintains the working set based on them.
Note: PageFaultCount is not a strict count. It is incremented in the MiLocateAndReserveWsle function, and this function is called not only on a page fault but also in some other cases (though rarely).
The following structure describes the table that holds the working set.
typedef struct _WS_LIST {
    DWORD Quota;              // 0  ??? I'm not sure...
    DWORD FirstFreeWsle;      // 4  start of the indexed list of free items
    DWORD FirstDynamic;       // 8  number of working set WSLE entries at the start,
                              //    up to FirstDynamic
    DWORD LastWsleIndex;      // C  above it: only empty items
    DWORD NextSlot;           // 10 in fact always == FirstDynamic
    PWSLE Wsle;               // 14 pointer to the table of WSLEs
    DWORD Reserved1;          // 18 ???
    DWORD NumOfWsleItems;     // 1C number of items in the WSLE table
                              //    (last initialized)
    DWORD NumOfWsleInserted;  // 20 number of WSLE items inserted (WsleInsert /
                              //    WsleRemove)
    PWSHASH_ITEM HashPtr;     // 24 pointer to the hash; with it the index of a
                              //    WSLE item can be found by address. Present only if
                              //    NumOfWsleItems > 0x180
    DWORD HashSize;           // 28 hash size
    DWORD Reserved2;          // 2C ???
} WS_LIST, *PWS_LIST;
typedef struct _WSLE { // element of the working set table
    DWORD PageAddress;
} WSLE, *PWSLE;
// PageAddress is the virtual address of a working set page.
// The low 12 bits are used as page attributes (the virtual address is always a multiple of 4K).
#define WSLE_DONOTINHASH 0x400 // not placed in the hash
#define WSLE_PRESENT     0x1   // non-empty element
#define WSLE_INTERNALUSE 0x2   // frame used by the memory manager
// A free WSLE (one without WSLE_PRESENT) holds the index of the next free WSLE.
// In this way the free WSLEs are organized into a list. The last free WSLE is marked with -1.
#define EMPTY_WSLE(next_empty_wsle_index) ((next_empty_wsle_index) << 4)
#define LAST_EMPTY_WSLE 0xFFFFFFF0
typedef struct _WSHASH_ITEM {
    DWORD PageAddress; // value
    DWORD WsleIndex;   // index in the WSLE table
} WSHASH_ITEM, *PWSHASH_ITEM;
// The hash function is simple. Pseudo code of the internal function:
// MiLookupWsleHashIndex (Value, WorkingSetList)
// {
//     Val = Value & 0xFFFFF000;
//     TmpPtr = WorkingSetList->HashPtr;
//     Mod = (Val >> 0xA) % (WorkingSetList->HashSize - 1);
//     if (*(TmpPtr + Mod * 8) == Val) return Mod;
//     while (*(TmpPtr + Mod * 8) != Val) {
//         Mod++;
//         if (WorkingSetList->HashSize > Mod) continue;
//         Mod = 0;
//         if (fl) KeBugCheckEx (0x1A, 0x41884, Val, Value, WorkingSetList);
//         fl = 1;
//     }
//     return Mod;
// }
Let's look at a typical process working set. WorkingSetList is located at the address MmWorkingSetList (0xC0502000). This is in the Hyper Space area, so on a process switch these virtual addresses are remapped, and each process has its own working set structures. The dynamic WSLE table starts at the address MmWsle (0xC0502690). The end address of the table is always a multiple of 0x1000; that is, the table can end at 0xC0503000, 0xC0504000, and so on (this simplifies changing the size of the WSLE table). The hash (if any) is located at an offset that the WSLE table does not grow up to. Let's look at the table in detail:
// WsList - 0xC0502000 ---
// ....
// ------- 0xC0502030 ----
// PDE 00 fault counter
// PDE 01 fault counter
// PDE 02 fault counter
// ....
// - Wsle == 0xC0502690 --------------------------------- PDE/PTE ------ PFN[0] ---
// |  0 C0300000                    | 403 Page Directory | C0300C00 PDE | PPROCESS
// |  4 C0301000                    | 403 Hyper Space    | C0300C04 PTE | 1
// |  8 MmWorkingSetList (C0502000) | 403                | C0301408 PTE | 2
// |  C MmWorkingSetList + 0x1000   | 403                | .            | 3
// | 10 MmWorkingSetList + 0x2000   | 403                | .
// | ....
// | FirstDynamic * 4   FrameN
// | ....
// | LastWsleIndex * 4  FrameM
// --------------------------------
// | free items
// | ....
// | 0xFFFFFFF0
// --------------------------------
// hash
// ....
There is an interesting detail here: the first FirstDynamic entries at the start of the table describe the pages that hold the WSLE table, WorkingSetList, and the hash themselves. They also include the page directory frame, the HyperSpace page table, and some other pages that the memory manager needs and that cannot be removed from the working set (flag WSLE_INTERNALUSE). We can also see two uses of the field at offset 0 of the PFN entry: for the page directory frame it is a pointer to the process, and for a page that belongs to the working set it is the index in the table. There is also a small gap of 0x660 bytes between WorkingSetList and the start of the WSLE table. There is no exact information on how this space is used, but it begins with an array of counters for the page tables of user space (usually the low 2GB): if, say, the element with index 0x100 has the value 3, then (probably) page faults have brought in pages from the range [0x40000000 - 0x403FFFFF] three times. The working set quota can be changed from kernel mode with an exported undocumented function:
NTOSKRNL MmAdjustWorkingSetSize (
    DWORD MinimumWorkingSet OPTIONAL, // if both == -1,
    DWORD MaximumWorkingSet OPTIONAL, // empty the working set
    PVM Vm OPTIONAL);
To work with the working set, the manager uses many internal functions; understanding them makes the principles of its operation clear.
06. Paging out to the page file
===============================
A frame can be free: its RefCounter is 0 and it is on one of the lists. A frame can also belong to a working set. When free frames run short, or a threshold is reached, frames are paged out. The high-level function in this regard is [...]. The description here is based on analysis of the code. NT supports up to 16 page files. Page files are created by the SMSS.EXE module: the file is opened and its handle is copied into the handle table of the PsInitialSystemProcess process.
Here is the prototype of the undocumented system call that creates a page file (if it is not called from the kernel, the caller must have the corresponding privilege for this file).
NTSTATUS NTAPI NtCreatePagingFile (
    PUNICODE_STRING FileName,
    PLARGE_INTEGER MinLen, // high dword should be 0
    PLARGE_INTEGER MaxLen, // MinLen should be greater than 1MB
    DWORD Reserved         // ignored
);
Each page file has a PAGING_FILE structure.
typedef struct _PAGING_FILE {
    DWORD MinPagesNumber;       // 0
    DWORD MaxPagesNumber;       // 4
    DWORD MaxPagesForFlush;     // 8  (biggest value for page out)
    DWORD FreePages;            // C  (free pages in the page file)
    DWORD UsedPages;            // 10 busy pages
    DWORD MaxUsedPages;         // 14
    DWORD CurFlushingPosition;  // 18 ???
    DWORD Reserved1;            // 1C
    PPAGEFILE_MDL Mdl1;         // 20 0x61 - empty ???
    PPAGEFILE_MDL Mdl2;         // 24 0x61 - empty ???
    PRTL_BITMAP PageFileMap;    // 28 0 - free, 1 - contains a paged-out page
    PFILE_OBJECT FileObject;    // 2C
    DWORD NumberOfPageFile;     // 30
    UNICODE_STRING FileName;    // 34
    DWORD Lock;                 // 3C
} PAGING_FILE, *PPAGING_FILE;
DWORD MmNumberOfActiveMdlEntries;
DWORD MmNumberOfPagingFiles;
#define MAX_NUM_OF_PAGE_FILES 16
PPAGING_FILE MmPagingFile [MAX_NUM_OF_PAGE_FILES];
When the memory subsystem starts (MmInitSystem (...)), it launches the MiModifiedPageWriter thread, which does the following: it initializes MiPaging and MiMappedFileHeader, creates and initializes MmMappedFileMdl in the nonpaged pool, sets its priority to LOW_REALTIME_PRIORITY + 1, waits on a KEVENT, initializes MmMappedPageWriterEvent and the MmMappedPageWriterList list, launches the MiMappedPageWriter thread, and enters the function MiModifiedPageWriterWorker.
MiModifiedPageWriterWorker waits for the event MmModifiedPageWriterEvent, processes the MmModifiedPageList and MmModifiedNoWritePageList lists, and prepares pages to be written out to a mapped file or to the page file (by calling MiGatherMappedPages or MiGatherPagefilePages, respectively). MiGatherPagefilePages uses the IoAsynchronousPageWrite () function to write frames out, and not a single frame at a time but a cluster (of size MmModifiedWriteClusterSize). The page file pages in use are tracked by PageFileMap in the PAGING_FILE structure. Pseudo code for the functions studied is in appendix.txt. There is no point in reproducing it here; it is quite simple.
07. Page fault processing
=========================
We now have all the information needed to turn to page faults. If the PDE/PTE used to translate a linear address (with paging enabled) is invalid, or the access violates the protection rules, the i386 processor raises exception 14. An error code is pushed on the stack with the following information: the user/supervisor bit (did the exception happen in ring 3 or ring 0?), the read/write bit (was it an attempted read or write?), and the page-present bit. In addition, the CR2 register holds the 32-bit linear address that caused the exception. The kernel handler for interrupt 14 is _KiTrap0E. When the page being accessed has no physical page behind it, the memory manager does some work to "correct" the situation. This is done by the fault handling function MmAccessFault (Wr, Addr, P). Before analyzing the code, it helps to consider the situations in which the exception occurs. The most obvious one is an access violation: ring 3 code attempts to access a page that does not have the U bit set in the PTE/PDE, or to write to a read-only page (the W bit is not set in the PTE/PDE). Next, the page may have been paged out to the page file: the P bit is not set in the PTE of such a page, but the PTE records which page file holds the frame and at what offset. A similar case is a frame that belongs to an image file. In addition, the page being accessed may simply belong to an allocated memory region (NtAllocateVirtualMemory) and not have been touched yet.
In that case the VMM hands out a zero-filled frame (this is a C2 requirement). Finally, the exception may also be triggered by a write to a copy-on-write page or by an access to shared memory. Only the main situations are listed above. The usual result of handling the fault is that the corresponding frame is added to the working set of the current process.

Each of these situations has internal structures associated with it, and the VMM works with these structures. They are fairly complicated, and a full description would require disassembling a large number of functions. Complete information about most of them is not available at present, but it is not needed to understand the fault handler. I will roughly describe the concepts of the VAD and the PPTE and then study the pseudocode of the handler.

VAD

Working with virtual addresses requires VADs (Virtual Address Descriptors). The well-known (there is a Win32 function with almost the same name that calls it) but undocumented function NtAllocateVirtualMemory (ZwAllocateVirtualMemory in ring 0) operates on these structures. Each VAD describes a region of the virtual address space: in essence, the boundary addresses of the region and its attributes (see the parameters of ZwAllocateVirtualMemory), plus other specific information (at present I have no complete information about the VAD beyond its first part). VADs are meaningful only for user addresses (the low 2 GB); the VMM uses them to work out which region a fault landed in. The VADs form a balanced binary tree (internal functions are responsible for rebalancing it), which is optimized for lookup. Each VAD has two pointers to child elements, the left and right branches. The root of the tree is in the VadRoot field of the EPROCESS structure (offset 0x170 in NT 4.0). Naturally, every process has its own VAD tree.
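The lookup that the VMM performs on a fault can be sketched as an ordinary binary-search-tree descent. This is only an illustration: `vad_node` and `vad_find` are hypothetical names, the node is reduced to the four fields the search needs, and the real kernel routine is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced VAD node: only the fields the lookup needs. */
typedef struct vad_node {
    uint32_t start;           /* first address of the region */
    uint32_t end;             /* last address of the region (inclusive) */
    struct vad_node *left;    /* regions at lower addresses */
    struct vad_node *right;   /* regions at higher addresses */
} vad_node;

/* Returns the node whose [start, end] range contains addr, or NULL
 * when the address does not belong to any described region. */
static const vad_node *vad_find(const vad_node *root, uint32_t addr)
{
    while (root != NULL) {
        assert(root->start <= root->end);
        if (addr < root->start)
            root = root->left;
        else if (addr > root->end)
            root = root->right;
        else
            return root;
    }
    return NULL;
}
```

A NULL result corresponds to a fault on an address that no region covers; a non-NULL result gives the handler the region whose attributes decide what happens next.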
The first part of the VAD looks like this:

    typedef struct vad_header {
        void       *StartingAddress;
        void       *EndingAddress;
        struct vad *ParentLink;
        struct vad *LeftLink;
        struct vad *RightLink;
        ULONG       Flags;
    } VAD_HEADER, *PVAD;

PPTE

A prototype PTE (PPTE) is one more level of linear address translation, used for shared memory. Suppose a file is mapped into the address spaces of several (say three) processes. The PPTE table contains the PPTEs that describe the physical pages of the file loaded into memory. Some PPTEs have the P bit set (its position and meaning are the same as in a PTE/PDE) and some do not; those without the P bit hold the information needed to load the frame from the page file or from the mapped file. The file is mapped at a different address in each of the three processes. The PTEs of those pages do not have the P bit set and contain a reference to the PPTE of the file page. Thus, on an access to the mapped file, exception 14 occurs, the VMM finds the PTE, obtains the reference to the PPTE, and can now "correct" the PTE so that it points to the frame belonging to the file. If necessary, the frame is loaded from the file at this point. Here is the format of a PTE in the page table that does not have the P bit set and points to a prototype PTE.

    PTE pointing to a PPTE
    +-----------------------------------------+---+-----+-------------+---+
    |3 3 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1| 1 | 0 0 |0 0 0 0 0 0 0| 0 |
    |1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1| 0 | 9 8 |7 6 5 4 3 2 1| 0 |
    +-----------------------------------------+---+-----+-------------+---+
    |            Address [7:27]               | 1 | Un- |   Address   | 0 |
    |                                         |   | used|   [0:6]     |   |
    +-----------------------------------------+---+-----+-------------+---+

* MmAccessFault

We begin studying the pseudocode of MmAccessFault. Prototype:

    NTSTATUS MmAccessFault(BOOL Wr, DWORD Addr, BOOL P);

The meaning of the parameters is evident: the write flag, the address that caused the exception, and the page-present bit. This information is enough to determine the cause of the exception.
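The bit layout of a PTE that points to a prototype PTE can be decoded with a few shifts and masks. The sketch below follows the diagram only: it reassembles the 28-bit address field and deliberately stops there, because how the kernel turns that field into a virtual address is not covered here. All function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit layout taken from the diagram (i386, non-PAE):
 *   bit 0       : P (valid) bit, must be 0
 *   bits 1..7   : address field bits [0:6]
 *   bits 8..9   : unused
 *   bit 10      : 1, marks "points to a prototype PTE"
 *   bits 11..31 : address field bits [7:27]
 */
#define PTE_VALID_BIT     0x00000001u
#define PTE_PROTOTYPE_BIT 0x00000400u

/* Nonzero when the PTE is invalid and refers to a prototype PTE. */
static int pte_points_to_proto(uint32_t pte)
{
    return (pte & PTE_VALID_BIT) == 0 && (pte & PTE_PROTOTYPE_BIT) != 0;
}

/* Reassembles the 28-bit address field that is split across the PTE. */
static uint32_t pte_proto_field(uint32_t pte)
{
    assert(pte_points_to_proto(pte));
    uint32_t low  = (pte >> 1) & 0x7Fu;  /* field bits [0:6]  */
    uint32_t high = pte >> 11;           /* field bits [7:27] */
    return (high << 7) | low;
}

/* Inverse helper for experiments: packs a 28-bit field into a PTE. */
static uint32_t pte_make_proto(uint32_t field)
{
    assert(field < (1u << 28));
    return ((field >> 7) << 11) | PTE_PROTOTYPE_BIT | ((field & 0x7Fu) << 1);
}
```

Packing and unpacking are exact inverses for any 28-bit value, which is a quick way to check that the two halves of the field have not been swapped.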
Depending on whether Addr belongs to the kernel address space or to the user address space, the handler takes one of two branches. The first case is simpler: the handler either detects an access violation or brings the page into the working set (MiDispatchFault). The user-space case is more complicated. First, if the PDE is not in memory, the fault is handled for the PDE. Then the code branches. The first branch is for a present page: this is either an access violation or copy-on-write handling. The second branch handles demand pages, access violations, guard pages (stack growth) and, where necessary, returning the page to the working set. Interestingly, the system increases the size of the working set when a large number of page faults occur. For a zero PTE, the handler has to use the VAD tree to determine the attributes of the region being accessed. This is the job of MiAccessCheck, which returns the access status.

In the normal case, the main work of the fault handler is done by the function MiDispatchFault. It identifies the situation more precisely and decides what to do next. It relies mainly on a few lower-level functions: MiResolveDemandZeroFault, MiResolveTransitionFault, MiResolveProtoPteFault and MiResolvePageFileFault. As the names suggest, they handle the more specific situations: the page must be a zero-filled frame, the page is in the Transition state (and can be returned to the working set), the PTE points to a PPTE, and the frame has been paged out to a page file. In the page file case, and in some of the PPTE cases, the frame may have to be read from a file; the function then returns 0xC0033333, which indicates that the page must be read from the file. The read is performed by IoPageRead in MiDispatchFault. Let us look more carefully at the functions mentioned, starting with MiResolveDemandZeroFault.
The pseudocode of this function makes its logic easy to follow. A zeroed frame is requested and obtained, by calling MiRemoveZeroPage or MiRemoveAnyPage. The first function takes a page from the list of zeroed pages. If that fails, any page is taken with the second function, and in that case the page is cleared by MiZeroPhysicalPage. Finally, MiAddValidPageToWorkingSet adds the cleared page to the working set (incidentally, this proves that a process cannot gain access to an uncleared page when it allocates memory).

Now let us study a more complicated situation: a page that is in a page file. The pseudocode here requires one more structure. When a page is about to be read from a file, a PAGE_SUPPORT_BLOCK structure is filled in. After that, the following is done for every PFN that takes part in the operation: the Read In Progress flag is set, and the address of the PAGE_SUPPORT_BLOCK is written into the Misc field (function MiInitializeReadInProgressPfn). Finally, the function returns the magic number 0xC0033333, which indicates that this structure will then be used in a call to IoPageRead (incidentally, IoPageRead is exported but undocumented; its prototype is easy to obtain from the pseudocode).

    typedef struct _PAGE_SUPPORT_BLOCK {       // size: 0x98
        DISPATCHER_HEADER DispHeader;           // 0x00 FastMutex
        IO_STATUS_BLOCK   IoStatusBlock;        // 0x10
        LARGE_INTEGER     AddrInPageFile;       // 0x18 (file offset)
        DWORD             RefCounter;           // 0x20 (0|1) ???
        PKTHREAD          Thread;               // 0x24
        PFILE_OBJECT      FileObject;           // 0x28
        DWORD             AddrPte;              // 0x2C
        PPFN              pPfn;                 // 0x30
        MDL               Mdl;                  // 0x34
        DWORD             MdlFrameBuffer[0x10]; // 0x50
        LIST_ENTRY        PageSupportList;      // 0x90 links related to MmInPageSupportList
    } PAGE_SUPPORT_BLOCK, *PPAGE_SUPPORT_BLOCK;

    struct _MMINPAGESUPPORTLIST {
        LIST_ENTRY PageSupportList;
        DWORD      Count;
    } MmInPageSupportList;

The function MiResolvePageFileFault itself is very simple: it does nothing beyond filling in the corresponding structures and returning 0xC0033333.
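Going back to the demand-zero case described at the start of this subsection, its logic can be modelled in a few lines of user-mode C. This is a toy simulation, not the kernel code: `toy_memory` and the three functions are hypothetical stand-ins for the zeroed-page list, the free list, MiRemoveZeroPage, MiRemoveAnyPage and MiZeroPhysicalPage.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE  4096
#define TOY_NUM_FRAMES 4

/* Hypothetical model of physical memory with per-frame state. */
typedef struct {
    unsigned char data[TOY_NUM_FRAMES][TOY_PAGE_SIZE];
    int is_free[TOY_NUM_FRAMES];    /* frame is available */
    int is_zeroed[TOY_NUM_FRAMES];  /* frame is known to be zero-filled */
} toy_memory;

/* Stand-in for MiRemoveZeroPage: take a frame that is already zeroed. */
static int remove_zero_page(toy_memory *m)
{
    for (int i = 0; i < TOY_NUM_FRAMES; i++) {
        if (m->is_free[i] && m->is_zeroed[i]) {
            m->is_free[i] = 0;
            m->is_zeroed[i] = 0;
            return i;
        }
    }
    return -1;
}

/* Stand-in for MiRemoveAnyPage: take any available frame. */
static int remove_any_page(toy_memory *m)
{
    for (int i = 0; i < TOY_NUM_FRAMES; i++) {
        if (m->is_free[i]) {
            m->is_free[i] = 0;
            m->is_zeroed[i] = 0;
            return i;
        }
    }
    return -1;
}

/* Returns a frame number whose contents are guaranteed to be zero,
 * or -1 when no frame is available. */
static int resolve_demand_zero(toy_memory *m)
{
    assert(m != NULL);
    int frame = remove_zero_page(m);
    if (frame < 0) {
        frame = remove_any_page(m);
        if (frame < 0)
            return -1;
        /* Stand-in for MiZeroPhysicalPage. */
        memset(m->data[frame], 0, TOY_PAGE_SIZE);
    }
    return frame;
}
```

Whichever path is taken, the caller never sees stale contents, which is the property the text points out.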
For a page file fault, the rest of the work is done in MiDispatchFault. This is quite reasonable if you keep the principle of code reuse in mind. There is also a less complex function, MiResolveTransitionFault. A few more words about frames in the Transition state: from this state a frame can quickly be returned to the working set of a process. That leaves the last situation, the prototype PTE. The function that handles it is not very complicated, and the groundwork for it has already been covered. In fact there is one more function related to this situation, MiCompleteProtoPteFault, which is called from MiDispatchFault. To understand how these functions work, see the pseudocode.

07. Section object
==================
A section object in NT is a block of memory that either belongs to one process or is shared by several processes. In the Win32 subsystem a section is a file mapping object. Let us see what a section object really is.

Sections are very common objects in NT: the executive uses sections to load executable images into memory and to manage the cache. Sections are also used to map files into the address space of a process, so that accessing the file looks like accessing memory. Like other objects, a section object is created by the Object Manager. High-level documentation tells us that the object body contains the following kinds of information: the maximum size of the section, the protection attributes, and other attributes. What the maximum size of a section is needs no explanation. The protection attributes are the attributes applied to the pages of the section. The other attributes tell whether the section is backed by a file or is empty (mapped into the page file), and whether the section is based. A based section is mapped at the same virtual address in all processes.

To get real information about the structure of this object, I disassembled some of the memory manager functions that work with sections. The following information has not been published elsewhere.
Let us look at the structures. Every open file in the system is a FILE_OBJECT (described in NTDDK.H). This structure contains SectionObjectPointer, and its type is also described in ntddk.h.

    // :
    PSECTION_OBJECT_POINTERS SectionObjectPointer;
    // :

    typedef struct _SECTION_OBJECT_POINTERS {
        PVOID DataSectionObject;
        PVOID SharedCacheMap;
        PVOID ImageSectionObject;
    } SECTION_OBJECT_POINTERS;

Two pointers in this structure are of interest: DataSectionObject and ImageSectionObject. ntddk.h declares them as PVOID because they refer to undocumented structures. DataSectionObject is used when the file is opened as data, ImageSectionObject when it is opened as an image. Both pointers have the same type, which can be called PCONTROL_AREA. All of the structures below are for Windows 2000 and differ somewhat from NT 4.0.

    typedef struct _CONTROL_AREA {       // for NT 5.0, size = 0x38
        PSEGMENT      pSegment;          // 00
        PCONTROL_AREA Flink;             // 04
        PCONTROL_AREA Blink;             // 08
        DWORD         SectionRef;        // 0C
        DWORD         PfnRef;            // 10
        DWORD         MappedViews;       // 14
        WORD          Subsections;       // 18
        WORD          FlushCount;        // 1A
        DWORD         UserRef;           // 1C
        DWORD         Flags;             // 20
        PFILE_OBJECT  FileObject;        // 24
        DWORD         Unknown;           // 28
        WORD          ModWriteCount;     // 2C
        WORD          SystemViews;       // 2E
        DWORD         PagedPoolUsage;    // 30
        DWORD         NonPagedPoolUsage; // 34
    } CONTROL_AREA, *PCONTROL_AREA;

We can see that CONTROL_AREA structures form a linked list and hold statistics and flags.
To understand the information the flags carry, here are their values (for NT 5.0):

    /****************** NT 5.0 ******************/
    #define BeingDeleted             0x1
    #define BeingCreated             0x2
    #define BeingPurged              0x4
    #define NoModifiedWriting        0x8
    #define FailIo                   0x10
    #define Image                    0x20
    #define Based                    0x40
    #define File                     0x80
    #define Networked                0x100
    #define NoCache                  0x200
    #define PhysicalMemory           0x400
    #define CopyOnWrite              0x800
    #define Reserve                  0x1000
    #define Commit                   0x2000
    #define FloppyMedia              0x4000
    #define WasPurged                0x8000
    #define UserReference            0x10000
    #define GlobalMemory             0x20000
    #define DeleteOnClose            0x40000
    #define FilePointerNull          0x80000
    #define DebugSymbolsLoaded       0x100000
    #define SetMappedFileIoComplete  0x200000
    #define CollidedFlush            0x400000
    #define NoChange                 0x800000
    #define HadUserReference         0x1000000
    #define ImageMappedInSystemSpace 0x2000000

The control area is followed by its subsections; their number is given by the Subsections field. Each subsection describes a specific part of the mapped file, for example read-only, read-write, copy-on-write and so on. The NT 5.0 subsection structure:

    typedef struct _SUBSECTION {         // size = 0x20 NT 5.0
                                         // 0x10 if GlobalOnlyPerSession
        PCONTROL_AREA ControlArea;       // 38, 00
        DWORD         Flags;             // 3C, 04
        DWORD         StartingSector;    // 40, 08
        DWORD         NumberOfSectors;   // 44, 0C
        PVOID         BasePte;           // 48, 10 pointer to start PTE
        DWORD         UnusedPtes;        // 4C, 14
        DWORD         PtesInSubsect;     // 50, 18
        PSUBSECTION   pNext;             // 54, 1C
    } SUBSECTION, *PSUBSECTION;

A subsection holds a pointer to the control area, flags, a pointer to the base proto PTE, and the number of proto PTEs. StartingSector is the number of the 4K block in the file at which the section starts.
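As an illustration of how the subsection fields fit together, the sketch below walks a chain of subsections through the next pointer, sums the proto PTE counts, and converts StartingSector to a byte offset. It assumes, as stated above, that sectors are 4K blocks; `toy_subsection` is a hypothetical reduced structure, not the kernel layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SECTOR_SIZE 4096u  /* assumption: StartingSector counts 4K blocks */

/* Hypothetical, reduced subsection: only the fields used below. */
typedef struct toy_subsection {
    uint32_t starting_sector;      /* first 4K block in the file */
    uint32_t ptes_in_subsect;      /* number of proto PTEs described */
    struct toy_subsection *next;   /* next subsection, NULL at the end */
} toy_subsection;

/* Total number of proto PTEs over the whole chain. */
static uint32_t total_proto_ptes(const toy_subsection *s)
{
    uint32_t total = 0;
    for (; s != NULL; s = s->next)
        total += s->ptes_in_subsect;
    return total;
}

/* Byte offset in the file at which the subsection starts. */
static uint64_t subsection_file_offset(const toy_subsection *s)
{
    assert(s != NULL);
    return (uint64_t)s->starting_sector * TOY_SECTOR_SIZE;
}
```

The 64-bit result avoids overflow for sector numbers above 2^20, where a 32-bit byte offset would wrap.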
The subsection Flags field carries additional information:

    #define SS_PROTECTION_MASK           0x1f0
    #define SS_SECTOR_OFFSET_MASK        0xfff00000 // (low 12 bits)
    #define SS_STARTING_SECTOR_HIGH_MASK 0x000ffc00 // (NT5 only) (in pages)
    // other 5 bit(s)
    #define ReadOnly     1
    #define ReadWrite    2
    #define CopyOnWrite  4
    #define GlobalMemory 8
    #define LargePages   0x200

Let us look at the last remaining structure, the segment, which describes the whole mapping and the proto PTEs used to map the section. Memory for the segment is allocated from paged pool. Here is the segment structure (NT 5.0):

    typedef struct _SEGMENT {
        PCONTROL_AREA ControlArea;    // 00
        DWORD         BaseAddr;       // 04
        DWORD         TotalPtes;      // 08
        DWORD         NonExtendPtes;  // 0C
        LARGE_INTEGER SizeOfSegment;  // 10
        DWORD         ImageCommit;    // 18
        DWORD         ImageInfo;      // 1C
        DWORD         ImageBase;      // 20
        DWORD         Commit;         // 24
        PTE           PteTemplate;    // 28 or 64 bits if PAE enabled
        DWORD         BasedAddr;      // 2C
        DWORD         BaseAddrPae;    // 30 if PAE enabled
        DWORD         ProtoPtes;      // 34
        DWORD         ProtoPtesPae;   // 38 if PAE enabled
    } SEGMENT, *PSEGMENT;

As expected, the structure contains a reference to the control area, a pointer to the pool of proto PTEs, and information about the section as a whole. One thing needs explaining: the layout of the structure depends on whether PAE is supported. PAE stands for Physical Address Extension. Starting with version 5, Windows NT includes the kernel ntkrnlpa.exe, which supports PAE. In general terms, PAE support means that NT can use 64 GB of physical memory rather than 4 GB. Address translation with PAE has one more level: the whole virtual address space is divided into 4 parts. With PAE enabled the size of a PTE and a PDE is not 4 bytes but 8, as can be seen from the SEGMENT structure. There is no need to discuss PAE in detail now; this much is enough for our purposes.

All the structures that describe a section have now been introduced, but the section object itself has not been mentioned yet.
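The subsection flag masks listed earlier in this section can be applied with one generic helper that shifts a field down by the number of trailing zero bits in its mask. This is a sketch for experimenting with the values from the text; the `SS_READ_*` names and the `ss_field` function are hypothetical, and the meaning of the extracted numbers is not interpreted here.

```c
#include <assert.h>
#include <stdint.h>

/* Mask values as given in the text. */
#define SS_PROTECTION_MASK           0x000001F0u
#define SS_SECTOR_OFFSET_MASK        0xFFF00000u
#define SS_STARTING_SECTOR_HIGH_MASK 0x000FFC00u

/* Single-bit flags as given in the text (names are hypothetical). */
#define SS_READ_ONLY      0x1u
#define SS_READ_WRITE     0x2u
#define SS_COPY_ON_WRITE  0x4u
#define SS_GLOBAL_MEMORY  0x8u

/* Extracts the field selected by mask, shifted down to bit 0. */
static uint32_t ss_field(uint32_t flags, uint32_t mask)
{
    assert(mask != 0);
    while ((mask & 1u) == 0) {
        mask >>= 1;
        flags >>= 1;
    }
    return flags & mask;
}
```

For example, `ss_field(flags, SS_PROTECTION_MASK)` yields a 5-bit value and `ss_field(flags, SS_SECTOR_OFFSET_MASK)` a 12-bit one, matching the widths of the masks.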
Intuitively, it should reference the segment or the control area, because all the remaining information can be reached through those two structures. The body of the section object, as obtained by disassembly, has the following form:

    typedef struct _SECTION_OBJECT {   // size 0x28
        VAD_HEADER    VadHeader;       // 0x00
        PSEGMENT      pSegment;        // 0x14 segment
        LARGE_INTEGER SectSize;        // 0x18
        DWORD         ControlFlags;    // 0x20
        DWORD         PgProtection;    // 0x24
    } SECTION_OBJECT, *PSECTION_OBJECT;

    #define PageFile    0x10000
    #define MappingFile 0x8000000
    #define Based       0x40
    #define Unknown     0x800000   // not sure, in fact it's AllocAttrib & 0x400000

We see that the resulting structure agrees fully with the existing high-level description. The only thing that may raise questions is VAD_HEADER. It describes the position of a based section in the address space. The VAD_HEADER is placed in the VAD tree whose root is _MmSectionBasedRoot.

Once again we see that to understand how the operating system works, we must understand its internal structures. To give an overall picture, here is a diagram of the relationships between the structures that describe sections.

    SECTION_OBJECT -> SEGMENT <-> CONTROL_AREA -> FILE_OBJECT -> SECTION_OBJECT_POINTERS
                                       ^                                   |
                                       |                                   |
                                       +-----------------------------------+

08. From the perspective of the memory manager, the process is created
======================================================================
We have already described process creation from the Win32 point of view, and have also covered the working principles of the Memory Manager and the Object Manager and the structure of the section object. The most interesting thing now is the part the memory manager plays in process creation. A process is created by the undocumented system call NtCreateProcess().
Here it is in pseudocode:

    /******************************************************************/
    /* - Here it is, just wrapper - */
    NtCreateProcess(
        OUT PHANDLE Handle,
        IN ACCESS_MASK Access,
        IN POBJECT_ATTRIBUTES ObjectAttrib,
        IN HANDLE Parent,
        IN BOOLEAN InheritHandles,
        IN HANDLE SectionHandle,
        IN HANDLE DebugPort,
        IN HANDLE ExceptionPort
    )
    {
        if (Parent) {
            Ret = PspCreateProcess(Handle, Access, ObjectAttrib, Parent,
                                   InheritHandles, SectionHandle,
                                   DebugPort, ExceptionPort);
        } else
            Ret = STATUS_INVALID_PARAMETER;
        return Ret;
    }

We see that NtCreateProcess is a wrapper around another internal function, PspCreateProcess. The only job NtCreateProcess does itself is to check Parent (the parent process handle). As we will see, though, this is of no real significance for NT, because in general process inheritance has no particular meaning by itself. Now let us look at PspCreateProcess().

    PspCreateProcess(
        OUT PHANDLE Handle,
        IN ACCESS_MASK Access,
        IN POBJECT_ATTRIBUTES ObjectAttrib,
        IN HANDLE Parent,
        IN BOOLEAN InheritHandles,
        IN HANDLE SectionHandle,
        IN HANDLE DebugPort,
        IN HANDLE ExceptionPort
    );

Note straight away that the Parent parameter of this function may be 0, which shows that the check in NtCreateProcess exists to restrict user mode. The parameters include references to the section, the debug port, the exception port and the parent process. Pointers to these objects are obtained by calling ObReferenceObjectByHandle. In practice, the parent process handle that is passed is usually -1, which means the current process. If Parent is 0, the affinity of the process is not taken from the parent process but from a system variable.
    if (Parent) {
        // Get pointer to father's body
        ObReferenceObjectByHandle(Parent, 0x80, PsProcessType, PrevMode, &pFather, 0);
        AffinityMask = pFather->Affinity;   // on which processors it will be executed
        Prior = 8;
    } else {
        pFather = 0;
        AffinityMask = KeActiveProcessors;
        Prior = 8;
    }

The priority is always 8. Next, the process object is created. In NT 4.0 its size is 504 bytes.

    // size of process body - 504 bytes
    // creating process object ... (type object PsProcessType)
    ObCreateObject(PrevMode, PsProcessType, ObjectAttrib, PrevMode, 0, 504, &pProcess);
    // clear body
    memset(pProcess, 0, 504);

Some fields and the quota block are initialized (see the description of the Object Manager).

    pProcess->CreateProcessReported = 0;
    pProcess->DebugPort = pDebugPort;
    pProcess->ExceptPort = pExceptPort;
    // inherit quota block; if pFather == NULL, PspDefaultQuotaBlock
    PspInheritQuota(pProcess, pFather);
    if (pFather) {
        pProcess->DefaultHardErrorMode = pFather->DefaultHardErrorMode;
        pProcess->InheritedFromUniqueProcessId = pFather->UniqueProcessId;
    } else {
        pProcess->InheritedFromUniqueProcessId = 0;
        pProcess->DefaultHardErrorMode = 1;
    }

After that, MmCreateProcessAddressSpace is called and the address context is created. Its parameters are the working set size, a pointer to the process, and a pointer to a result structure. The structure looks like this:

    struct PROCESS_ADDRESS_SPACE_RESULT {
        DWORD Dt;          // Dir. table phys. addr.
        DWORD HypSpace;    // Hyper space page phys. addr.
        DWORD WorkingSet;  // Working set page phys. addr.
    } CasResult;

    MmCreateProcessAddressSpace(PsMinimumWorkingSet, pProcess, &CasResult);

We see that the function returns the physical address of the page directory (the CR3 contents of the new address space), the address of the hyper space page and the address of the working set page.
This is followed by the initialization of some more fields of the process object:

    pProcess->MinimumWorkingSet = MinWorkingSet;
    pProcess->MaximumWorkingSet = MaximumWorkingSet;
    KeInitializeProcess(pProcess, Prior, AffinityMask, &CasResult,
                        pProcess->DefaultHardErrorProcessing & 0x4);
    pProcess->ForegroundQuantum = PspForegroundQuantum;

If there is a parent process and the flag parameter is set, the handle table of the parent process is inherited:

    if (pFather) {   // if there is father and InheritHandles, inherit handle db
        pFather2 = 0;
        if (bInheritHandle) pFather2 = pFather;
        ObInitProcess(pFather2, pProcess);   // see info about Object Manager
    }

What follows is more interesting and shows a flexibility of the NT executive that is not visible from the surface. If a section is specified in the parameters, it is used to initialize the address space of the process; otherwise the function works like fork() in *UNIX.

    if (pSection) {
        MmInitializeProcessAddressSpace(pProcess, 0, pSection);
        ObDereferenceObject(pSection);
        Res = ObInitProcess2(pProcess);   // work with unknown byte 0x22 in process
        if (Res >= 0) PspMapSystemDll(pProcess, 0);
        Flag = 1;   // created addr space
    } else {
        // if there is father, but no section, do operation like fork()
        if (pFather) {
            if (PsInitialSystemProcess == pFather) {
                MmRes = MmInitializeProcessAddressSpace(pProcess, 0, 0);
            } else {
                pProcess->SectionBaseAddress = pFather->SectionBaseAddress;
                MmRes = MmInitializeProcessAddressSpace(pProcess, pFather, 0);
                Flag = 1;   // created addr space
            }
        }
    }

Next, the process is inserted into the list of active processes headed by PsActiveProcessHead, the PEB is created, and other auxiliary work is done. We will not dwell on it. Finally, when everything else is finished, the security subsystem does its part. We have studied the security subsystem before, so only the pseudocode is given here. I will only note that if the parent process is the system process (PspInitialSystemProcessHandle), the access check is not performed.
    // finally, security operations
    if (pFather && PspInitialSystemProcessHandle != Father) {
        ObGetObjectSecurity(pProcess, &SecurityDescriptor, &MemoryAllocated);
        pToken = PsReferencePrimaryToken(pProcess);
        AccessRes = SeAccessCheck(SecurityDescriptor, &SecurityContext, 0, 0x2000000,
                                  0, 0, &PsProcessToken->GenericMapping, PrevMode,
                                  pProcess->GrantedAccess, &AccessStatus);
        ObDereferenceObject(pToken);
        ObReleaseObjectSecurity(SecurityDescriptor, MemoryAllocated);
        if (!AccessRes) pProcess->GrantedAccess = 0;
        pProcess->GrantedAccess |= 0x6fb;
    } else {
        pProcess->GrantedAccess = 0x1f0fff;
    }
    if (SeDetailedAuditing) SeAuditProcessCreation(pProcess, pFather);

The most interesting functions are KeInitializeProcess and MmCreateProcessAddressSpace. The former, besides initializing other members of the process object, also initializes the offset of the I/O bitmap in the TSS.

    pProcess->IopmOffset = 0x20ad;   // I/O map base !!!
                                     // you can patch kernel here and
                                     // get I/O port control ;)

The chosen offset determines where the processor looks for the I/O permission bitmap in the TSS, and therefore whether the process can use I/O ports directly.

The function MmCreateProcessAddressSpace creates the address space of the process. I will not give all of the pseudocode, only the main operations. The function picks pages for hyper space, the working set and the page directory. The disassembled code confirms that they are taken from the list of zeroed frames or are cleared by the MiZeroPhysicalPage function. The newly created page directory is then initialized.

    pProcess->WorkingSetPage = Frame3;   // WorkingSetPage
    (MmPfnDatabase + 0x18 * Frame)->Pte = 0xC0300000;
    ValidPde_U = ValidPdePde & 0xeff ^ Frame2;   // hyperspace
    /******************************************************************/
    /* Important! Here, PD                                            */
    /******************************************************************/
    Va = MiMapPageInHyperSpace(Frame, &LastIrql);
    // now we have the VA of our new page directory
    // fill some fields
    *(Va + 0xC04) = ValidPde_U;                   // hyperspace
    ValidPde_U = ValidPde_U & 0xfff ^ PhysAddr;   // DT
    *(Va + 0xC00) = ValidPde_U;                   // self-PDE
    // copy from current process, kernel address mapping
    memcpy(
        ((MmVirtualBias + 0x80000000) >> 0x14) + Va,          // this is how we found out
                                                              // what MmVirtualBias is ;)
        ((MmVirtualBias + 0x80000000) >> 0x14) + 0xC0300000,
        0x80                                                  // 32 PDEs -> 4MB * 32 = 128MB
    );
    memcpy(   // copy PDEs corresponding to the nonpaged area
        (MmNonPagedSystemStart >> 0x14) + Va,
        (MmNonPagedSystemStart >> 0x14) + 0xC0300000,
        ((0xC0300FFC - ((MmNonPagedSystemStart >> 0x14) + 0xC0300000)) & 0xfffffffc) + 4
    );
    memcpy(
        Va + 0xC0C,   // cache; forget about it for now, it's another story ;)
        0xC0300C0C,
        (MmSystemCacheEnd >> 0x14) - 0xC0C + 4
    );

In other words, the PDEs of the kernel address space are copied (this space is the same for all processes, except for hyper space), including those of the nonpaged area. The PDEs of the space that belongs to the system cache are copied as well.

09. Context
===========
Knowing how the ETHREAD and EPROCESS structures and the memory manager work, it is not hard to guess what happens on a context switch. The designers of Windows NT schedule threads without regard to which process owns them, so there are two possibilities: the next thread belongs to the current process, and only a switch to another thread is needed (update the stack and fix up a GDT descriptor); or the thread belongs to another process, and a switch to that process is needed as well (reload CR3). To confirm this guess, I disassembled the KeAttachProcess function. This function is undocumented but well known; it is used to switch to the address space of another process. KeDetachProcess returns to the current process.
KeAttachProcess uses the following internal functions:

    KiAttachProcess - KeAttachProcess is just a wrapper around this function
    KiSwapProcess   - replaces the address space (essentially, reloads CR3)
    SwapContext     - replaces the context; generally it deals with the thread
                      only, regardless of any address space switch
    KiSwapThread    - switches to the next thread in the list (calls SwapContext)

The pseudocode of these internal functions is given below.

    /************** KeAttachProcess **************/
    // Just wrapper
    //
    KeAttachProcess(EPROCESS *Process)
    {
        KiAttachProcess(Process, KeRaiseIrqlToSynchLevel());
    }

    /************** KiAttachProcess **************/
    KiAttachProcess(EPROCESS *Process, Irql)
    {
        // CurThread = fs:124h
        // CurProcess = CurThread->ApcState.Process;
        if (CurProcess != Process) {
            if (CurProcess->ApcStateIndex || KPCR->DpcRoutineActive) KeBugCheckEx ...
        }
        // if we are already in the process's context
        if (CurProcess == Process) { KiUnlockDispatcherDatabase(Irql); return; }
        Process->StackCount++;
        KiMoveApcState(&CurThread->ApcState, &CurThread->SavedApcState);
        // init lists
        CurThread->ApcState.ApcListHead[0].Blink = &CurThread->ApcState.ApcListHead[0];
        CurThread->ApcState.ApcListHead[0].Flink = &CurThread->ApcState.ApcListHead[0];
        CurThread->ApcState.ApcListHead[1].Blink = &CurThread->ApcState.ApcListHead[1];
        CurThread->ApcState.ApcListHead[1].Flink = &CurThread->ApcState.ApcListHead[1];
        // fill CurThread's fields
        CurThread->ApcState.Process = Process;
        CurThread->ApcState.KernelApcInProgress = 0;
        CurThread->ApcState.KernelApcPending = 0;
        CurThread->ApcState.UserApcPending = 0;
        CurThread->ApcState.ApcStatePointer.SavedApcState = &CurThread->SavedApcState;
        CurThread->ApcState.ApcStatePointer.ApcState = &CurThread->ApcState;
        CurThread->ApcStateIndex = 1;
        // if process ready, just swap it ...
        if (!Process->State) {   // state == 0, ready
            KiSwapProcess(Process, CurThread->SavedApcState.Process);
            KiUnlockDispatcherDatabase(Irql);
            return;
        }
        CurThread->State = 1;   // r?
        CurThread->ProcessReadyQueue = 1;
        // put process in thread's wait list
        CurThread->WaitListEntry.Flink = &Process->ReadyListHead.Flink;
        CurThread->WaitListEntry.Blink = Process->ReadyListHead.Blink;
        Process->ReadyListHead.Flink->Flink = &CurThread->WaitListEntry.Flink;
        Process->ReadyListHead.Blink = &CurThread->WaitListEntry.Flink;
        // else, move process to swap list and wait
        if (Process->State == 1) {   // idle?
            Process->State = 2;      // trans
            Process->SwapListEntry.Flink = &KiProcessInSwapListHead.Flink;
            Process->SwapListEntry.Blink = KiProcessInSwapListHead.Blink;
            KiProcessInSwapListHead.Blink = &Process->SwapListEntry.Flink;
            KiSwapEvent.Header.SignalState = 1;
            if (KiSwapEvent.Header.WaitListHead.Flink != &KiSwapEvent.Header.WaitListHead.Flink)
                KiWaitTest(&KiSwapEvent, 0xa);   // fastcall
        }
        CurThread->WaitIrql = Irql;
        KiSwapThread();
        return;
    }

The following conclusions can be drawn from this function. A process can be in one of these states: 0 (ready), 1 (idle), 2 (trans, in transition). This confirms the high-level information. KiAttachProcess uses two other functions, KiSwapProcess and KiSwapThread.
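The decision logic of KiAttachProcess, stripped of the APC bookkeeping and list manipulation, can be modelled as a small state machine. This is a toy sketch built only from the pseudocode above; `toy_process`, `attach_action` and `toy_attach` are hypothetical names, and the state numbering (0, 1, 2) is the one deduced in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Process states as deduced from the pseudocode. */
typedef enum {
    PROC_READY = 0,
    PROC_IDLE = 1,
    PROC_TRANSITION = 2
} proc_state;

/* What the caller has to do after the decision. */
typedef enum {
    ATTACH_NOTHING,             /* already in the target's context */
    ATTACH_SWAP_ADDRESS_SPACE,  /* target is ready: swap the address space now */
    ATTACH_WAIT_FOR_INSWAP      /* thread is queued and must wait for the process */
} attach_action;

/* Hypothetical, reduced process object. */
typedef struct {
    proc_state state;
    int stack_count;
} toy_process;

static attach_action toy_attach(toy_process *current, toy_process *target)
{
    assert(current != NULL && target != NULL);
    if (current == target)
        return ATTACH_NOTHING;
    target->stack_count++;
    if (target->state == PROC_READY)
        return ATTACH_SWAP_ADDRESS_SPACE;
    /* An idle process is moved to the transition state so that it
     * can be brought back in; the attaching thread then waits. */
    if (target->state == PROC_IDLE)
        target->state = PROC_TRANSITION;
    return ATTACH_WAIT_FOR_INSWAP;
}
```

Only the ready case leads to an immediate address space swap; in the other two cases the thread gives up the processor.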
    /************** KiSwapProcess **************/
    KiSwapProcess(EPROCESS *NewProcess, EPROCESS *OldProcess)
    {
        // just reload CR3 and do a little work with the TSS
        // TSS = KPCR->TSS;
        // xor eax, eax
        // mov gs, ax
        TSS->CR3 = NewProcess->DirectoryTableBase;   // 0x1c
        // mov cr3, NewProcess->DirectoryTableBase
        TSS->IopmOffset = NewProcess->IopmOffset;    // 0x66
        if (WORD(NewProcess->LdtDescriptor) == 0) { lldt 0x00; return; }
        // GDT = KPCR->GDT;
        (QWORD)GDT->0x48 = (QWORD)NewProcess->LdtDescriptor;
        (QWORD)GDT->0x108 = (QWORD)NewProcess->Int21Descriptor;
        lldt 0x48;
        return;
    }

This function switches the process context. As expected, it just reloads the CR3 register plus a few related operations. For example, the offset of the I/O bitmap in the TSS is set from the IopmOffset field. The LDT selector also has to be loaded (this is needed only for VDM sessions).

    /************** SwapContext **************/
    SwapContext(NextThread, CurThread, WaitIrql)
    {
        NextThread.State = ThreadStateRunning;   // 2
        KPCR.DebugActive = NextThread.DebugActive;
        cli();
        // save stack
        CurThread.KernelStack = ESP;
        // set stack
        KPCR.StackLimit = NextThread.StackLimit;
        KPCR.StackBase = NextThread.InitialStack;
        tmp = NextThread.InitialStack - 0x70;
        NewCr0 = CR0 & 0xFFFFFFF1 | NextThread.NpxState | *(tmp + 0x6C);
        if (NewCr0 != CR0) ReloadCr0();
        if (!(*(tmp - 0x1c) & 0x20000)) tmp -= 0x10;
        TSS = KPCB.TSS;
        TSS->ESP0 = tmp;
        // set pTeb
        KPCB.Self = NextThread.pTeb;
        ESP = NextThread.KernelStack;
        sti();
        // correct GDT
        GDT = KPCB.GDT;
        WORD(GDT->0x3a) = NextThread.pTeb;
        BYTE(GDT->0x3c) = NextThread.pTeb >> 16;
        BYTE(GDT->0x3f) = NextThread.pTeb >> 24;
        // if we must swap processes, do it (like KiSwapProcess)
        if (CurThread.ApcState.Process != NextThread.ApcState.Process) {
            // ******** like KiSwapProcess
        }
        NextThread->ContextSwitches++;
        KPCB->KeContextSwitches++;
        if (!NextThread->ApcState.KernelApcPending) return 0;
        // popf;
        // jnz HalRequestSoftwareInterrupt
        // return 0
        return 1;
    }

The function switches the stack and fixes up the GDT so that the FS register points to the TEB. If the thread belongs to the current process, the process context is not switched. Otherwise the operations performed are roughly the same as in KiSwapProcess. For completeness, here is the prototype of KeDetachProcess.

    KeDetachProcess(EPROCESS *Process, Irql);

We see that the pseudocode of these functions in effect describes context switching in the operating system. In general, the code analysis shows that the main way to understand an OS is to know its internal structures.

0A. Some undisclosed memory managers functions
==============================================
The memory manager of NTOSKRNL.EXE (SP3) exports the following symbols:

    467  1D0  00051080  MmAdjustWorkingSetSize
    468  1D1  0001EDFA  MmAllocateContiguousMemory
    469  1D2  00051A14  MmAllocateNonCachedMemory
    470  1D3  0001EAE8  MmBuildMdlForNonPagedPool
    471  1D4  000206BC  MmCanFileBeTruncated
    472  1D5  0001EF5A  MmCreateMdl
    473  1D6  0002095C  MmCreateSection
    474  1D7  00021224  MmDbgTranslatePhysicalAddress
    475  1D8  000224AC  MmDisableModifiedWriteOfSection
    476  1D9  000230C8  MmFlushImageSection
    477  1DA  0001FA9C  MmForceSectionClosed
    478  1DB  0001EEA0  MmFreeContiguousMemory
    479  1DC  00051AFE  MmFreeNonCachedMemory
    480  1DD  0001EEAC  MmGetPhysicalAddress
    481  1DE  00024028  MmGrowKernelStack
    482  1DF  0004E144  MmHighestUserAddress
    483  1E0  0002645A  MmIsAddressValid
    484  1E1  00026CD8  MmIsNonPagedSystemAddressValid
    485  1E2  0001F5D8  MmIsRecursiveIoFault
    486  1E3  00026D56  MmIsThisAnNtAsSystem
    487  1E4  000766C8  MmLockPagableDataSection
    488  1E5  000766C8  MmLockPagableImageSection
    489  1E6  0001F160  MmLockPagableSectionByHandle
    490  1E7  0001ED18  MmMapIoSpace
    491  1E8  0001EB74  MmMapLockedPages
    492  1E9  0001F5F6  MmMapMemoryDumpMdl
    493  1EA  00076A14  MmMapVideoDisplay
    494  1EB  0005206C  MmMapViewInSystemSpace
    495  1EC  00079B0E  MmMapViewOfSection
    496  1ED  0007A7EE  MmPageEntireDriver
    497  1EE  0001E758  MmProbeAndLockPages
    498  1EF  00026D50  MmQuerySystemSize
    499  1F0  00052A8A  MmResetDriverPaging
    500  1F1  0004E0A4  MmSectionObjectType
    501  1F2  00079D28  MmSecureVirtualMemory
    502  1F3  0001EFCE  MmSetAddressRangeModified
    503  1F4  0007684E  MmSetBankedSection
    504  1F5  0001EF2C  MmSizeOfMdl
    505  1F6  0004E0A0  MmSystemRangeStart
    506  1F7  0001F516  MmUnlockPagableImageSection
    507  1F8  0001EA16  MmUnlockPages
    508  1F9  0007669A  MmUnmapIoSpace
    509  1FA  0001ECA8  MmUnmapLockedPages
    510  1FB  00076A2E  MmUnmapVideoDisplay
    511  1FC  00052284  MmUnmapViewInSystemSpace
    512  1FD  0007AFE4  MmUnmapViewOfSection
    513  1FE  0007A00A  MmUnsecureVirtualMemory
    514  1FF  0004DDCC  MmUserProbeAddress

A '+' next to an entry marks a function that is documented in the DDK; the marks did not survive in the listing above. Here are the prototypes of some of the undocumented functions.

    // Adjust the size of the working set.
    NTOSKRNL NTSTATUS MmAdjustWorkingSetSize(
        DWORD MinimumWorkingSet OPTIONAL,   // if both == -1
        DWORD MaximumWorkingSet OPTIONAL,   // empty working set
        PVM Vm OPTIONAL
    );

    // can file be truncated???
    NTOSKRNL BOOLEAN MmCanFileBeTruncated(
        PSECTION_OBJECT_POINTERS SectionPointer,   // see FILE_OBJECT
        PLARGE_INTEGER NewFileSize
    );

    // Create section. NtCreateSection calls this function ...

::

    NTOSKRNL NTSTATUS MmCreateSection(
        OUT PVOID *SectionObject,
        IN ACCESS_MASK DesiredAccess,
        IN POBJECT_ATTRIBUTES ObjectAttributes OPTIONAL,
        IN PLARGE_INTEGER MaximumSize,
        IN ULONG SectionPageProtection,   // PAGE_XXXX
        IN ULONG AllocationAttributes,    // SEC_XXX
        IN HANDLE FileHandle OPTIONAL,
        IN PFILE_OBJECT File OPTIONAL
    );

    typedef enum _MMFLUSH_TYPE {
        MmFlushForDelete,
        MmFlushForWrite
    } MMFLUSH_TYPE;

    NTOSKRNL BOOLEAN MmFlushImageSection(
        IN PSECTION_OBJECT_POINTERS SectionObjectPointer,
        IN MMFLUSH_TYPE FlushType
    );

    NTOSKRNL DWORD MmHighestUserAddress;   // usually 0x7FFEFFFF

    NTOSKRNL BOOLEAN MmIsRecursiveIoFault();
    // #define _MmIsRecursiveIoFault() ( \
    //     (PsGetCurrentThread()->DisablePageFaultClustering) | \
    //     (PsGetCurrentThread()->ForwardClusterOnly) \
    // )

    NTOSKRNL POBJECT_TYPE MmSectionObjectType;   // standard section object
    NTOSKRNL DWORD MmSystemRangeStart;           // usually 0x80000000
    NTOSKRNL DWORD MmUserProbeAddress;           // usually 0x7FFF0000

    NTOSKRNL PVOID MmMapVideoDisplay(   // on i386, a wrapper around MmMapIoSpace
        IN PHYSICAL_ADDRESS PhysicalAddress,
        IN ULONG NumberOfBytes,
        IN BOOLEAN CacheEnable
    );

    NTOSKRNL VOID MmUnmapVideoDisplay(   // on i386, a wrapper around MmUnmapIoSpace
        IN PVOID BaseAddress,
        IN ULONG NumberOfBytes
    );

    // Mark the frames as modified and perform the corresponding operations
    NTOSKRNL VOID MmSetAddressRangeModified(
        PVOID StartAddress,
        DWORD Length
    );

    // Called from NtMapViewOfSection
    typedef enum _SECTION_INHERIT {
        ViewShare = 1,
        ViewUnmap = 2
    } SECTION_INHERIT;

    NTOSKRNL NTSTATUS MmMapViewOfSection(
        PVOID pSection,
        PEPROCESS pProcess,
        OUT PVOID *BaseAddress,
        DWORD ZeroBits,
        DWORD CommitSize,
        OUT PLARGE_INTEGER SectionOffset OPTIONAL,
        OUT PDWORD ViewSize,
        SECTION_INHERIT InheritDisposition,
        DWORD AllocationType,
        DWORD ProtectionType
    );

    NTOSKRNL NTSTATUS MmUnmapViewOfSection(
        PEPROCESS Process,
        PVOID Address
    );

    PVOID MmLockPagableImageSection(
        PVOID AddressWithinImageSection   // same entry as MmLockPagableDataSection
    );

    // Lower StackLimit (stack growth)
    NTSTATUS MmGrowKernelStack(
        PVOID CurESP   // the address of the top
    );

I talk to the wind
My words are all carried away
I talk to the wind
The wind does not hear
The wind cannot hear.
King Crimson'69-I Talk to the Wind

0b. Conclusion
==============

That is all. Taken together, these descriptions should give a reasonable picture of how the memory manager works. Unfortunately, the picture still cannot be called complete. The memory manager is probably the most complex and most important component of the kernel, and a complete description would mean working through more than a dozen further functions. Still, the main fundamentals are written down here. For anyone who goes on to disassemble the kernel, they should be a real help, who knows... ;)

Best regards, Peter Kosyh aka Gloomy. Melancholy Coding'2001. Mailto: GL00MY@mail.ru

P.S. I know that my "masterpiece" inevitably contains errors. I will be very glad to hear criticism and suggestions.

Appendix

0c. Some undocumented system calls
==================================

Here I describe some useful Zw/Nt functions that can be called from user mode or from a driver (the Zw variants). Almost all of them come from Kobernichenko's book "Undocumented Features of Windows NT" ("Недокументированные возможности Windows NT"). With what we now know about the working set structures, MEMORY_WORKING_SET_INFORMATION for NtQueryVirtualMemory can be described as well.

::

    NTSYSAPI NTSTATUS NTAPI NtAllocateVirtualMemory(
        HANDLE Process,
        OUT PVOID *BaseAddr,
        DWORD ZeroBits,
        OUT PDWORD RegionSize,
        DWORD AllocationType,   // MEM_RESERVE | MEM_COMMIT | MEM_TOP_DOWN
        DWORD Protect);         // PAGE_XXXX ...

    NTSYSAPI NTSTATUS NTAPI NtFreeVirtualMemory(
        HANDLE Process,
        OUT PVOID *BaseAddr,
        OUT PULONG RegionSize,
        DWORD FreeType   // MEM_DECOMMIT | MEM_RELEASE
    );

    NTSYSAPI NTSTATUS NTAPI NtCreateSection(
        OUT PHANDLE Section,
        ACCESS_MASK DesiredAccess,   // SECTION_MAP_XXX ...
        POBJECT_ATTRIBUTES ObjectAttributes OPTIONAL,
        PLARGE_INTEGER MaximumSize OPTIONAL,
        DWORD SectionPageProtection,   // PAGE_...
        DWORD AllocationAttributes,    // SEC_XXX
        HANDLE FileHandle OPTIONAL     // NULL - pagefile
    );

    typedef enum _SECTION_INHERIT {
        ViewShare = 1,
        ViewUnmap = 2
    } SECTION_INHERIT;

    NTSYSAPI NTSTATUS NTAPI NtMapViewOfSection(
        HANDLE Section,
        HANDLE Process,
        OUT PVOID *BaseAddress,
        DWORD ZeroBits,
        DWORD CommitSize,
        OUT PLARGE_INTEGER SectionOffset OPTIONAL,
        OUT PDWORD ViewSize,
        SECTION_INHERIT InheritDisposition,
        DWORD AllocationType,   // MEM_TOP_DOWN, MEM_LARGE_PAGES, MEM_AUTO_ALIGN = 0x40000000
        DWORD ProtectionType    // PAGE_...
    );

    #define UNLOCK_TYPE_NON_PRIVILEGED 0x00000001L
    #define UNLOCK_TYPE_PRIVILEGED     0x00000002L

    NTSYSAPI NTSTATUS NTAPI NtLockVirtualMemory(
        IN HANDLE ProcessHandle,
        IN OUT PVOID *RegionAddress,
        IN OUT PULONG RegionSize,
        IN ULONG UnlockTypeRequired
    );

    NTSYSAPI NTSTATUS NTAPI NtUnlockVirtualMemory(
        IN HANDLE ProcessHandle,
        IN OUT PVOID *RegionAddress,
        IN OUT PULONG RegionSize,
        IN ULONG UnlockTypeRequested
    );

    NTSYSAPI NTSTATUS NTAPI NtReadVirtualMemory(
        IN HANDLE ProcessHandle,
        IN PVOID StartAddress,
        OUT PVOID Buffer,
        IN ULONG BytesToRead,
        OUT PULONG BytesReaded OPTIONAL
    );

    NTSYSAPI NTSTATUS NTAPI NtWriteVirtualMemory(
        IN HANDLE ProcessHandle,
        IN PVOID StartAddress,
        IN PVOID Buffer,
        IN ULONG BytesToWrite,
        OUT PULONG BytesWritten OPTIONAL
    );

    NTSYSAPI NTSTATUS NTAPI NtProtectVirtualMemory(
        IN HANDLE ProcessHandle,
        IN OUT PVOID *RegionAddress,
        IN OUT PULONG RegionSize,
        IN ULONG DesiredProtection,
        OUT PULONG OldProtection
    );

    NTSYSAPI NTSTATUS NTAPI NtFlushVirtualMemory(
        IN HANDLE ProcessHandle,
        IN PVOID *StartAddress,
        IN PULONG BytesToFlush,
        OUT PIO_STATUS_BLOCK StatusBlock
    );

    typedef enum _MEMORYINFOCLASS {
        MemoryBasicInformation,
        MemoryWorkingSetInformation,
        // There is also class 2: information from the VAD. I have not
        // fully worked out the VAD structure, so I cannot write down
        // the corresponding info structure.
    } MEMORYINFOCLASS;

    typedef struct _MEMORY_BASIC_INFORMATION {
        PVOID BaseAddress;
        PVOID AllocationBase;
        ULONG AllocationProtect;
        ULONG RegionSize;
        ULONG State;
        ULONG Protect;
        ULONG Type;
    } MEMORY_BASIC_INFORMATION, *PMEMORY_BASIC_INFORMATION;

    #define WSFRAMEINFO_SHARED_FRAME 0x100
    #define WSFRAMEINFO_INTERNAL_USE 0x4
    #define WSFRAMEINFO_UNKNOWN      0x3

    typedef struct _MEMORY_WORKING_SET_INFORMATION {
        ULONG SizeOfWorkingSet;
        DWORD WsEntries[ANYSIZE_ARRAY];   // is page VA | WSFRAMEINFO ...
    } MEMORY_ENTRY_INFORMATION, *PMEMORY_ENTRY_INFORMATION;

    NTSYSAPI NTSTATUS NTAPI NtQueryVirtualMemory(
        IN HANDLE ProcessHandle,
        IN PVOID RegionAddress,
        IN MEMORYINFOCLASS MemoryInformationClass,
        IN PVOID VirtualMemoryInfo,
        IN ULONG Length,
        OUT PULONG ActualLength OPTIONAL
    );

0D. Notes and code analysis drafts
==================================

On MmCreateProcessAddressSpace ...
==================================

::

    _ts DWORD NumOfPages);   // EDX:ECX

Statistics::

    DD MmTotalCommitLimit
    DD MmTotalCommittedPages

If NumOfPages + MmTotalCommittedPages does not exceed the limit, everything is fine and the statistics are simply updated. Otherwise, coordination between threads begins. A timeout value is chosen: 20 seconds if the request is for >= 10 pages, otherwise -1 seconds.
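The fast path and the timeout choice described above can be sketched in user-mode C. This is only an illustration under stated assumptions: the variables are hypothetical stand-ins for MmTotalCommitLimit and MmTotalCommittedPages, the function names are invented for the example, and reading the text's "-1 seconds" as a 1-second timeout is a guess, not something the disassembly confirms.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel counters named in the text. */
static unsigned long total_commit_limit = 1000;    /* MmTotalCommitLimit    */
static unsigned long total_committed_pages = 990;  /* MmTotalCommittedPages */

/* Timeout choice as described: 20 s for requests of 10 pages or more.
 * ASSUMPTION: the short timeout ("-1 seconds" in the text) is taken as 1 s. */
static unsigned timeout_seconds(unsigned long pages)
{
    return pages >= 10 ? 20u : 1u;
}

/* Fast path: if the request fits under the limit, only the statistics
 * change. A false result means the caller would have to take the slow
 * path (queue a request and wait), which is not modeled here. */
static bool try_charge_commitment(unsigned long pages)
{
    assert(pages > 0);
    if (total_committed_pages + pages > total_commit_limit)
        return false;
    total_committed_pages += pages;
    return true;
}
```

With the starting values above, a request for 5 pages fits and is charged, while a following request for 8 pages would exceed the limit and falls through to the waiting logic.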
Then a structure is filled in, which probably looks like this::

    typedef struct _REQUEST_FOR_COMMITED_MEMORY {
        LIST_ENTRY ListEntry;
        DWORD PagesToCommit;
        DWORD Result;
        KSEMAPHORE Semaphore;
    } REQUEST_FOR_COMMITED_MEMORY;

This structure (that is, its list element) is inserted into the global linked list ListOfRequest, which lives in a global structure::

    [prev list item] <-> [our list item] <-> [ListOfRequest]

    typedef struct _COMMIT_MEMORY_REQUEST_LIST {
        KSEMAPHORE CommitMemorySemaphore;
        LIST_ENTRY ListOfRequest;
    } COMMIT_MEMORY_REQUEST_LIST;

Then CommitMemorySemaphore is released with KeReleaseSemaphore, and the thread waits on the semaphore in REQUEST_FOR_COMMITED_MEMORY with the timeout chosen earlier.

If the wait did not time out and Result is nonzero, the limit is checked once more and the function returns OK (if the limit is still a problem, everything starts over). If Result is 0, MiCauseOverCommitPopup is called.

If the timeout expired, the analysis goes as follows:

If ListOfRequest.Flink == &ListOfRequest.Flink, that is, all requests have already been taken off the queue, wait for the timeout once more, because the problem is not ours.

If ListOfRequest.Flink == &RequestForCommitedMemory.ListEntry, the next request in the queue is ours (???). Remove the request from the queue, because the holdup is on our side. Now look at how many pages were requested: if >= 10, MiCauseOverCommitPopup; otherwise MiChargeCommitmentCantExpand; then return.

All of these operations are done under CLI/STI and with a FastMutex (offset 0x10C in the process structure). When the function is called during process creation, this is not done.

Now for MiCauseOverCommitPopup(PagesNum, CommitTotalLimitDelta) and what it does. If more than 128 pages are requested, it calls ExRaiseStatus(STATUS_COMMITMENT_LIMIT); if fewer, it calls IoRaiseInformationalHardError(STATUS_COMMITMENT_LIMIT, 0, 0). Both of these functions are documented.
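The two queue tests used after a timeout are ordinary doubly linked list checks. The sketch below reproduces them in user-mode C with a minimal stand-in for LIST_ENTRY; the type and function names are hypothetical and chosen for the example, and the semaphores and waiting are left out. Because Flink is the first member of a list entry, comparing against the address of the list head is the same test as the text's comparison against &ListOfRequest.Flink.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal user-mode stand-in for the kernel's LIST_ENTRY. */
typedef struct list_entry {
    struct list_entry *flink;  /* next element     */
    struct list_entry *blink;  /* previous element */
} list_entry;

/* Hypothetical mirror of the request structure guessed at in the text. */
typedef struct commit_request {
    list_entry entry;
    unsigned long pages_to_commit;
    unsigned long result;
} commit_request;

/* An empty list head points to itself in both directions. */
static void list_init(list_entry *head)
{
    head->flink = head->blink = head;
}

/* Append e just before the head, i.e. at the tail of the queue. */
static void list_insert_tail(list_entry *head, list_entry *e)
{
    e->flink = head;
    e->blink = head->blink;
    head->blink->flink = e;
    head->blink = e;
}

/* Unlink e from whatever list it is on. */
static void list_remove(list_entry *e)
{
    e->blink->flink = e->flink;
    e->flink->blink = e->blink;
    e->flink = e->blink = e;
}

/* First test from the text: nothing is left in the queue. */
static bool queue_is_empty(const list_entry *head)
{
    return head->flink == head;
}

/* Second test from the text: our request is the next one in the queue. */
static bool request_is_first(const list_entry *head, const commit_request *r)
{
    return head->flink == &r->entry;
}
```

A caller would queue its request with list_insert_tail and, after a timeout, call queue_is_empty to decide whether to wait again, or request_is_first followed by list_remove to withdraw the request.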
If the call to IoRaiseInformationalHardError succeeds, the statistics are accumulated::

    MiOverCommitCallCount++;
    MmTotalCommitLimit += CommitTotalLimitDelta;
    MmExtendCommit += CommitTotalLimitDelta;
    MmTotalCommittedPages += PagesNum;

MmPeakCommitment is not corrected. If the call fails but MiOverCommitCallCount == 0, the statistics are updated all the same; otherwise ExRaiseStatus(STATUS_COMMITMENT_LIMIT) is called.

Auxiliary function::

    DWORD NTOSKRNL RtlRandom(PDWORD Seed);

Not surprisingly, this function is undocumented. It uses a table of 128 DWORDs. Both the table and Seed are updated after each call. Apparently this gives the maximum period.

There are two events, MmAvailablePagesEventHigh and MmAvailablePagesEventHigh [sic].

MiSectionInitialization: MmDereferenceSegmentHeader is the structure described above with a spin lock added on top. It creates the thread MiDereferenceSegmentThread.

::

    PsChargePoolQuota(PVOID Process, DWORD Type /* NP / P */, DWORD Charge);

[To do] ->> MmInfoCounters!!! With the corresponding NtQueryInfo... call you can get a lot of useful information; take a look!!!

----------------------------------------------------------------------------

(c) Gloomy aka Peter Kosyh, Melancholy Coding'2001
http://gLoomy.cjb.net
Mailto: GL00MY@mail.ru