Understanding and Using the NT Driver's Execution Context (2)
The context in which a driver's dispatch routines execute deserves special attention. In many cases, a kernel-mode driver's dispatch routines run in the context of the calling user thread. Figure 1 shows why. When a user thread issues an I/O function call, for example by calling the Win32 ReadFile(...) function, a system service request is generated. On Intel architecture, such requests are implemented as a software interrupt that passes through an interrupt gate. The interrupt gate changes the processor's current privilege level to kernel mode, switches to the kernel stack, and then calls the system service dispatcher. The system service dispatcher in turn calls the function within the operating system that handles the requested system service. For ReadFile(...), this is the NtReadFile(...) function in the I/O subsystem. NtReadFile(...) builds an IRP and then calls the read dispatch routine of the driver that corresponds to the file object referenced by the ReadFile(...) request. All of this happens at IRQL PASSIVE_LEVEL.
Throughout the process described above, the user request is never scheduled or queued, so the user thread and process context never changes. In this example, the driver's dispatch routine runs in the context of the user thread that issued the ReadFile(...) request. This means that while the driver's read dispatch routine is running, it is the user thread that is executing the kernel-mode driver's code.
Do a driver's dispatch routines always run in the context of the requesting user thread? Well, no. Section 16.4.1.1 of the version 4.0 Kernel Mode Drivers Design Guide tells us, "Only highest-level NT drivers, such as file system drivers, can be sure their dispatch routines are called in the context of a user-mode thread." As our example shows, this statement is not entirely accurate. File system drivers (FSDs) are of course called in the context of the requesting user thread. In fact, any driver that is called directly as the result of a user I/O request, without the request first passing through another driver, is guaranteed to be called in the context of the requesting user thread. This includes FSDs. It also means that most standard kernel-mode drivers written to serve user applications directly, such as drivers for process control devices, will have their dispatch routines called in the context of the requesting user thread.
In fact, the only case in which a driver's dispatch routine is not called in the context of the calling thread is when the user request is first directed to a higher-level driver, such as a file system driver. If the higher-level driver hands the request off to a system worker thread, the context changes. When the IRP is finally passed down to the lower-level driver, there is no guarantee that the context in which the higher-level driver was running when it forwarded the IRP is that of the requesting user thread. The lower-level driver then runs in an arbitrary thread context.
The general rule is that when a device is accessed directly by the user, with no other drivers in between, the dispatch routines of that device's driver always run in the context of the requesting user thread. This has some very interesting consequences, and it lets us do some equally interesting things.
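The rule can be pictured with a small user-mode model. This is only a sketch, not driver code: a "thread context" is reduced to an integer, and every name in it (current_thread, direct_request, high_level_dispatch, run_system_worker) is hypothetical. It shows a low-level dispatch routine observing the user thread when it is called directly, and an arbitrary worker context when a higher-level driver queues the request first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: a thread context is just an integer id. */
enum { USER_THREAD = 42, SYSTEM_WORKER_THREAD = 7 };
enum { QUEUE_CAPACITY = 8 };

static int current_thread = USER_THREAD;   /* which thread is executing now */

typedef struct Irp { int dispatched_in_thread; } Irp;

/* Low-level driver's dispatch routine: records the context it ran in. */
static void low_level_dispatch(Irp *irp)
{
    irp->dispatched_in_thread = current_thread;
}

/* Case 1: the user request reaches the driver directly, with no queuing. */
static void direct_request(Irp *irp)
{
    low_level_dispatch(irp);
}

/* Case 2: a higher-level driver queues the request for a worker thread. */
static Irp *work_queue[QUEUE_CAPACITY];
static size_t queued_count = 0;

static void high_level_dispatch(Irp *irp)
{
    assert(queued_count < QUEUE_CAPACITY);
    work_queue[queued_count++] = irp;
}

/* The worker drains the queue later, in its own (arbitrary) context. */
static void run_system_worker(void)
{
    int saved_thread = current_thread;
    current_thread = SYSTEM_WORKER_THREAD;
    for (size_t i = 0; i < queued_count; i++)
        low_level_dispatch(work_queue[i]);
    queued_count = 0;
    current_thread = saved_thread;
}
```

Calling direct_request() leaves USER_THREAD recorded in the IRP, while high_level_dispatch() followed by run_system_worker() leaves SYSTEM_WORKER_THREAD there, which is the context change described above.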
Implications
What are the consequences of dispatch routines running in the context of the calling user thread? Well, some are useful and some are annoying. For example, suppose a driver creates a file with ZwCreateFile(...) in a dispatch routine. When the same driver later tries to read that file with ZwReadFile(...), the read fails unless it is issued in the context of the same process as the create. This is because handles are kept on a per-process basis.
Continuing the example, if the ZwReadFile(...) request is issued successfully, the driver can choose to wait for the read to complete by waiting on an event associated with the read operation. What happens during that wait? The current user thread is placed in a wait state, referencing the event's dispatcher object. So much for asynchronous I/O requests! The operating system's dispatcher finds the next highest-priority thread to run. When the event object is set to the signaled state because the read has completed, the driver resumes only when the user thread is once again one of the N highest-priority ready threads (N being the number of processors).
Running in the context of the requesting user thread also has some very useful benefits. For example, calling ZwSetInformationThread(...) with a handle value of -2 (meaning "current thread") lets the driver change all of the current thread's attributes. Similarly, calling ZwSetInformationProcess(...) with the NtCurrentProcess(...) handle value (defined in ntddk.h) lets the driver change all of the current process's attributes. Note that because both calls are issued from kernel mode, no security checks are made. In other words, this approach can change thread or process attributes that the thread itself would not be allowed to access.
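Returning to the ZwCreateFile(...)/ZwReadFile(...) example, the failure is easier to see with a toy model in which a handle is nothing more than an index into the table of whichever process is current. This is a user-mode sketch under that assumption; Process, FileObject, create_handle and lookup_handle are hypothetical names, not NT structures or APIs.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLES 8

typedef struct FileObject { const char *name; } FileObject;

/* Each process owns its own handle table (hypothetical layout). */
typedef struct Process { FileObject *handle_table[MAX_HANDLES]; } Process;

/* "Create": store the object in this process's table.
 * Returns a 1-based handle, or 0 if the table is full. */
static int create_handle(Process *process, FileObject *object)
{
    assert(process != NULL && object != NULL);
    for (int i = 0; i < MAX_HANDLES; i++) {
        if (process->handle_table[i] == NULL) {
            process->handle_table[i] = object;
            return i + 1;
        }
    }
    return 0;
}

/* "Read": resolve the handle against the table of the process that is
 * current at the time of the call. Returns NULL if it does not resolve. */
static FileObject *lookup_handle(const Process *process, int handle)
{
    assert(process != NULL);
    if (handle < 1 || handle > MAX_HANDLES)
        return NULL;
    return process->handle_table[handle - 1];
}
```

A handle returned by create_handle() for one process resolves in that process and yields NULL in any other, which is the analogue of the read failing when it is issued from a different context than the create.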
However, perhaps the most useful aspect of running in the context of the requesting user thread is the ability to access user virtual addresses directly. Consider, for example, a driver for a simple shared-memory device that is used directly by a user application. Assume that a write operation on this device consists of copying up to 1K bytes of data from the user's buffer directly to the device's shared memory area, and that the device's shared memory area is always accessible.
The traditional design of a driver for this device would probably use buffered I/O, since the amount of data to be moved is much less than a page. That is, for each write request the I/O Manager allocates a buffer in non-paged pool that is the same size as the user's buffer and copies the data from the user's buffer into it. The I/O Manager then calls the driver's write dispatch routine, providing a pointer to the non-paged pool buffer in the IRP (Irp->AssociatedIrp.SystemBuffer). The driver then copies the data from the non-paged pool buffer to the device's shared memory area. How efficient is this design? Well, the data is copied twice to get one thing done, not to mention the fact that the I/O Manager also has to perform a pool allocation for the non-paged pool buffer. I wouldn't call that a minimum-overhead design.
Suppose we need to increase the performance of this design while still using traditional methods. We could have the driver use direct I/O. In this case, the I/O Manager finds the pages containing the user's data and locks them in memory. It then describes the user's data buffer with a Memory Descriptor List (MDL) and supplies a pointer to the MDL to the driver in the IRP (Irp->MdlAddress). Now, when the driver's write dispatch routine gets the IRP, it must use the MDL to build a system address that it can use as the source of the copy. It does this by calling MmGetSystemAddressForMdl(...), which in turn calls MmMapLockedPages(...) to map the pages described by the MDL into kernel virtual address space. Using the kernel virtual address returned by MmGetSystemAddressForMdl(...), the driver copies the data from the user's buffer to the device's shared memory area. How efficient is this design? Well, it is better than the first design. But mapping is not a cheap operation either.
So what is the alternative to these two traditional designs? Well, assuming the user application talks directly to this driver, we know that the driver's dispatch routines will always be called in the context of the requesting user thread. We can therefore bypass both the buffered I/O and direct I/O designs by using "neither I/O". The driver indicates neither I/O by setting neither the DO_BUFFERED_IO bit nor the DO_DIRECT_IO bit in the Flags field of the device object. When the driver's write dispatch routine is called, the user-mode virtual address of the user's data buffer can be found in Irp->UserBuffer. Because the kernel-mode virtual address of a location in user space is identical to the user-mode virtual address of that same location, the driver can use Irp->UserBuffer directly and copy the data from the user's buffer to the device's shared memory area. Of course, to guard against faults while accessing the user's buffer, the driver should wrap the access in a try...except block.
No mapping, no extra copy, no pool allocation. Just a straightforward copy. Now that is what I call low overhead.
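To make the overhead comparison concrete, here is a user-mode model that simply counts the copies and allocations in the buffered and neither designs. It is a sketch under simplifying assumptions: malloc() stands in for the non-paged pool allocation, a static array stands in for the device's shared memory, and the names write_buffered, write_neither and CopyStats are hypothetical. The direct I/O mapping cost is not modeled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEVICE_MEM_SIZE 1024   /* the 1K shared memory area from the example */

static unsigned char device_shared_memory[DEVICE_MEM_SIZE];

typedef struct CopyStats {
    int copies;            /* number of data copies performed */
    int pool_allocations;  /* number of intermediate buffers allocated */
} CopyStats;

/* Buffered I/O model: an intermediate buffer is allocated and filled from
 * the user buffer, then the driver copies from it to the device. */
static int write_buffered(const void *user_buffer, size_t length, CopyStats *stats)
{
    assert(length <= DEVICE_MEM_SIZE);
    void *system_buffer = malloc(length ? length : 1);  /* stand-in for pool */
    if (system_buffer == NULL)
        return -1;
    stats->pool_allocations++;
    memcpy(system_buffer, user_buffer, length);           /* copy 1 */
    stats->copies++;
    memcpy(device_shared_memory, system_buffer, length);  /* copy 2 */
    stats->copies++;
    free(system_buffer);
    return 0;
}

/* Neither I/O model: the driver copies straight from the user buffer. */
static int write_neither(const void *user_buffer, size_t length, CopyStats *stats)
{
    assert(length <= DEVICE_MEM_SIZE);
    memcpy(device_shared_memory, user_buffer, length);    /* the only copy */
    stats->copies++;
    return 0;
}
```

For the same 1K write, the buffered path records two copies and one allocation, while the neither path records a single copy and no allocation.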
There is one downside to using neither I/O, though. What happens if the user passes the driver a buffer pointer that is valid in kernel mode but not legal for the user process? The try...except block cannot catch this problem. An example would be a pointer to memory that is mapped read-only for the user process but is readable and writable from kernel mode. In that case, the driver's move operation will simply put the data into memory that the user application sees as read-only! Is this a problem? Well, it depends on the driver and the application. Only you can decide whether this design is worth the potential risk.
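The scenario can be reduced to a toy model in which a page carries a single "user writable" flag that is only enforced for user-mode accesses. This is purely illustrative and assumes the behavior the text describes; Page, AccessMode and store_byte are hypothetical names and do not correspond to any NT interface.

```c
#include <assert.h>
#include <stddef.h>

enum { PAGE_BYTES = 16 };

typedef enum { USER_MODE, KERNEL_MODE } AccessMode;

typedef struct Page {
    unsigned char bytes[PAGE_BYTES];
    int user_writable;   /* 0 = read-only as seen from user mode */
} Page;

/* Store one byte. The protection check applies only to user-mode access;
 * a kernel-mode store goes through regardless, and no fault is raised.
 * Returns 0 on success, -1 for a (modeled) access violation. */
static int store_byte(Page *page, size_t offset, unsigned char value, AccessMode mode)
{
    assert(offset < PAGE_BYTES);
    if (mode == USER_MODE && !page->user_writable)
        return -1;
    page->bytes[offset] = value;
    return 0;
}
```

In this model a user-mode store to a read-only page is refused, but the same store made in kernel mode succeeds silently, so there is no exception for a try...except block to catch.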
Taking it to the limit
Finally, an example will demonstrate what becomes possible when a driver runs in the context of the user thread that issued the request. It shows that when a driver runs, all that is really happening is that the calling user process is running in kernel mode. We wrote a pseudo-device driver called SwitchStack. Because it is a pseudo-device, it is not associated with any hardware. The driver supports create, close, and a single IOCTL that uses METHOD_NEITHER. When a user application issues this IOCTL, it supplies a pointer to void as the IOCTL's input buffer and a pointer to a function (taking a pointer to void and returning void) as the IOCTL's output buffer. When processing the IOCTL, the driver calls the specified user function and passes the PVOID as a context argument. The resulting function, located in the user's address space, executes in kernel mode.
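The article does not show how IOCTL_SWITCH_STACKS is defined. As a hedged illustration, the sketch below reproduces the standard CTL_CODE bit layout in portable C and builds a METHOD_NEITHER control code from it. The function code 0x800 and the choice of FILE_DEVICE_UNKNOWN and FILE_ANY_ACCESS are assumptions made for the example, and the helpers make_neither_ioctl and ioctl_transfer_method are hypothetical.

```c
#include <assert.h>

/* Same bit layout as the CTL_CODE macro in the DDK headers:
 * device type in bits 16-31, access in bits 14-15,
 * function in bits 2-13, transfer method in bits 0-1. */
#ifndef CTL_CODE
#define CTL_CODE(DeviceType, Function, Method, Access) \
    (((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method))
#endif

#define FILE_DEVICE_UNKNOWN 0x00000022
#define METHOD_NEITHER      3
#define FILE_ANY_ACCESS     0

/* Assumed definition; the article does not give the real value. */
#define IOCTL_SWITCH_STACKS \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_NEITHER, FILE_ANY_ACCESS)

/* Build a METHOD_NEITHER code for an arbitrary function number. */
static unsigned long make_neither_ioctl(unsigned long function)
{
    assert(function <= 0xFFF);   /* the function code is a 12-bit field */
    return CTL_CODE((unsigned long)FILE_DEVICE_UNKNOWN, function,
                    METHOD_NEITHER, FILE_ANY_ACCESS);
}

/* Extract the transfer method (low two bits) from a control code. */
static unsigned long ioctl_transfer_method(unsigned long code)
{
    return code & 3UL;
}
```

With these assumed values the control code works out to 0x222003, and its low two bits identify METHOD_NEITHER, which is what leaves the input pointer in Type3InputBuffer and the output pointer in Irp->UserBuffer.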
Given the design of NT, there is very little that the called-back function cannot do. It can issue Win32 function calls, pop up dialog boxes, and perform file I/O. The only difference is that the user application is running in kernel mode, on the kernel stack. When an application runs in kernel mode, it is not subject to privilege limits, quotas, or protection checks. Because all functions executing in kernel mode have IOPL, the user application can even issue IN and OUT instructions (on Intel architecture systems, of course). Your imagination (plus a little common sense) is the only limit on the kinds of things you could do with this driver.
//
// SwitchStackDispatchIoctl
//
// This is the dispatch routine which processes
// Device I/O Control functions sent to this device
//
// Inputs:
//   DeviceObject  Pointer to a Device Object
//   Irp           Pointer to an I/O Request Packet
//
// Returns:
//   NTSTATUS      Completion status of IRP
//
//
NTSTATUS
SwitchStackDispatchIoctl(IN PDEVICE_OBJECT DeviceObject, IN PIRP Irp)
{
    PIO_STACK_LOCATION Ios;
    NTSTATUS Status;

    //
    // Get a pointer to the current I/O Stack Location
    //
    Ios = IoGetCurrentIrpStackLocation(Irp);

    //
    // Make sure this is a valid IOCTL for us...
    //
    if (Ios->Parameters.DeviceIoControl.IoControlCode != IOCTL_SWITCH_STACKS)
    {
        Status = STATUS_INVALID_PARAMETER;
    }
    else
    {
        //
        // Get the pointer to the function to call
        // (METHOD_NEITHER: the output buffer pointer arrives in UserBuffer)
        //
        VOID (*UserFunctToCall)(PULONG) = Irp->UserBuffer;

        //
        // And the argument to pass
        // (METHOD_NEITHER: the input buffer pointer arrives in Type3InputBuffer)
        //
        PVOID UserArg;
        UserArg = Ios->Parameters.DeviceIoControl.Type3InputBuffer;

        //
        // Call the user's function with the parameter
        //
        (*UserFunctToCall)(UserArg);
        Status = STATUS_SUCCESS;
    }

    Irp->IoStatus.Status = Status;
    Irp->IoStatus.Information = 0;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    return (Status);
}
The above is the driver's DispatchIoctl function. The driver is called using the standard Win32 system service call, as shown below:
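DeviceIoControl(hDriver, (DWORD) IOCTL_SWITCH_STACKS,
                &UserData,          // input buffer: argument passed to the callback
                sizeof(PVOID),
                &OriginalWinMain,   // output buffer: function the driver will call
                sizeof(PVOID),
                &cbReturned,
                NULL);              // lpOverlapped: NULL assumed (synchronous call)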
This example is, of course, not meant to encourage you to write programs that run in kernel mode. It does, however, demonstrate that when your driver is running, it really is running in the context of an ordinary Win32 program, with all of its variables, message queues, window handles, and so on. The only difference is that it is running in kernel mode, on the kernel stack.
Summary
That's all there is to it. Understanding context can be a useful tool, and it can help you avoid some annoying problems. And of course, it lets you write some pretty cool drivers. We hope this helps you. Happy driver writing!