Anti-virus engine design

zhaozj  2021-02-16

Excerpt SEOUL Author: NJUE

Introduction to anti-virus engine design

1. Introduction

As its title indicates, this article is about designing and writing an advanced anti-virus engine. First I should explain the word "advanced". It is well known that traditional anti-virus software relies on signature-based static scanning: it looks for a specific hex string in a file, and if the string is found, the file is judged to be infected with a particular virus. This method no longer works well against today's virus techniques, for reasons I will describe in the following sections. This article therefore does not analyze the signature-scanning and virus-removal modules of an anti-virus engine. Instead it discusses the two major anti-virus techniques developed to counter current virus technology: virtual machines and real-time monitoring. What a virtual machine is, and what real-time monitoring is, will be explained in detail in the corresponding chapters. What I want to say here is that although both technologies are already in use (by some advanced anti-virus vendors at home and abroad), they have not been fully disclosed, for commercial reasons, so you will find few inside details about them in books or online material. In the relevant chapters I will analyze a large amount of source code (mainly the complete virtual machine source code in Section 2.4) and reverse-engineered code (Sections 3.3.3 and 3.4.3, my disassembly of the real-time monitoring code and the client program of a well-known anti-virus product), and at the same time publish some undisclosed operating-system mechanisms and data structures that I have dug out myself. With that, let us get to the topic.
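The signature-based static scanning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the signature names and hex strings in `SIGNATURES` are invented for the example and do not correspond to any real virus, and `xor_encrypt` is a hypothetical stand-in for the encryption layer of an encrypted virus.

```python
from typing import Dict, List, Tuple

# Hypothetical signature database: name -> hex string (made up for this sketch).
SIGNATURES: Dict[str, str] = {
    "Demo.A": "DEADBEEF4142",
    "Demo.B": "9090EBFECC",
}


def scan_bytes(data: bytes,
               signatures: Dict[str, str] = SIGNATURES) -> List[Tuple[str, int]]:
    """Return (signature name, offset) for every occurrence found in data."""
    hits = []
    for name, hex_pattern in signatures.items():
        pattern = bytes.fromhex(hex_pattern)
        offset = data.find(pattern)
        while offset != -1:
            hits.append((name, offset))
            offset = data.find(pattern, offset + 1)
    return hits


def xor_encrypt(data: bytes, key: int) -> bytes:
    """Toy single-byte XOR layer, standing in for a virus's encryption."""
    return bytes(b ^ key for b in data)


if __name__ == "__main__":
    sample = b"\x00" + bytes.fromhex("DEADBEEF4142") + b"\x00"
    print(scan_bytes(sample))                     # the plain body is found
    print(scan_bytes(xor_encrypt(sample, 0x5A)))  # the encrypted body is not
```

The second call shows, in miniature, why a fixed hex string stops matching once the virus body is encrypted with a changing key, which is the gap the virtual machine is meant to close.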
Contents

1.1 Background
1.2 Development status of today's virus technology
1.2.1 System core-state viruses
1.2.2 Resident viruses
1.2.3 Intercepting system operations
1.2.4 Encrypted polymorphic viruses
1.2.5 Anti-tracing / anti-virtual-execution viruses
1.2.6 Direct API calls
1.2.7 Virus hiding
1.2.8 Special infection methods

1.1 Background

The two technologies covered in this article are the two main technologies used in today's anti-virus field. What are they for? Virtual machine technology is designed mainly to detect and remove encrypted polymorphic viruses. Strictly speaking, the so-called virtual machine is not a whole virtual machine; it would be more accurate to call it a virtual CPU (a CPU implemented in software), but "virtual machine" is the established term in the anti-virus world. Its job is to emulate an Intel x86 CPU and interpret code: like a real CPU, it fetches, decodes, and executes the machine instructions it is given. What encrypted polymorphic viruses are, why they need to be executed virtually, and how virtual execution is implemented will all be answered in the corresponding chapters.

The other focus, real-time monitoring, has a wider scope and is not limited to virus detection. Many kinds of objects can be monitored in real time, such as INTMON, PPMON, disk access (DiskMon), and more. Monitoring for anti-virus purposes mainly targets file access: whenever a file is accessed, the real-time monitor checks whether the file is infected, and if it is, the user chooses whether to remove the virus or cancel the operation request. This gives the user a relatively safe execution environment. At the same time, real-time monitoring degrades system performance, and many anti-virus users complain that it makes their systems slow and unstable. That raises the bar for us: real-time monitoring must consume as few system resources as possible while still intercepting file operations accurately.
I will discuss this problem in the chapter on real-time virus monitoring. Both technologies are used in the products of advanced anti-virus vendors at home and abroad. Their source code is not public, but we can still get a glimpse of their design through reverse engineering. In fact, if you open their executables in a hex editor, you may see debug symbols, variable names, or output strings that were not stripped, and such traces are a great help.

In addition, the files with the .vxd or .sys suffix in the anti-virus software's installation directory are the drivers that perform real-time monitoring, and they can be disassembled (see my analysis of the driver code in the relevant chapter). By now you should have a general picture of these two technologies; later we will go into their technical details.

1.2 Development status of today's virus technology

Any discussion of anti-virus techniques has to begin with virus technology itself; as the saying goes, "know yourself and know your enemy." I think the view that studying virus technology is illegal does great harm. It is hard to imagine someone with no virus-writing experience becoming an anti-virus expert. As far as I know, some well-known anti-virus companies in China have no shortage of expert virus writers; they simply use the same techniques on the right side, fighting poison with poison. So I hope this article can serve as a modest start that draws out better contributions, and I look forward to more people presenting virus technology to the public. Today's viruses are different from those of the DOS and Win3.1 era. The biggest shift, I think, is that boot-sector viruses have declined and script viruses have begun to flood. The reason is that directly rewriting the boot sector is difficult under current operating systems (DOS had no protection and allowed direct writes through INT 13h), and changes to the boot sector are easily noticed, so few people write such viruses any more. Script viruses, in contrast, are favored by virus authors because they spread efficiently and are easy to write. However, both kinds of virus can be detected and removed with the signature-based static scanning mentioned earlier, so they are outside the scope of our discussion.
The techniques I want to discuss come mainly from binary shell-type viruses (viruses that infect files). These techniques are tied to the lower layers of the operating system or to 386 protected mode, which makes them worth studying. As everyone knows, shell-type viruses under DOS mainly infect 16-bit COM or EXE files. Because DOS has no protection, they can easily go resident, reduce available memory (by modifying the MCB chain), modify system code, and intercept system services or interrupts. In the Win9x and WinNT/2000 era, writing a working 32-bit Windows virus is not so easy. Because of page protection, you cannot modify the system's code pages. Because of the I/O permission bitmap, you cannot access ports directly. Under Windows you cannot intercept all file operations by hooking INT 21h as under DOS. In short, you run as a user-state program, your behavior is strictly controlled by the operating system, and you cannot do as you please as under DOS. It is also worth mentioning that the executable formats used under Windows differ greatly from the DOS EXE (ordinary programs use the PE format, drivers use LE), which makes infecting files harder (PE and LE are more complex and are divided into several sections; if the infection is done wrong, the file can no longer be executed). Today's viruses use too many new techniques to discuss one by one, so each section of this chapter covers a few important and representative ones.

1.2.1 System core-state viruses

Before introducing system core-state viruses, we need to discuss the concepts of core state and user state. Open any textbook on 386 protected-mode assembly programming and you will find a description of these two concepts.
CPUs from the 386 onward implement four privilege levels (Windows uses only two). Privilege level 0 (Ring0) is reserved for operating-system code and device-driver code, which run in the system core state; privilege level 3 (Ring3) is used by ordinary user programs, which run in user state.
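As a small illustration of how the privilege level shows up in practice, the sketch below decodes an x86 segment selector. The bit layout (index in bits 3-15, table indicator in bit 2, requested privilege level in bits 0-1) is the standard one; for the CS register the low two bits give the current privilege level. The sample values 0x08 and 0x1B are assumptions chosen for the example, while 0x3B is the selector that appears in the VxDCall0 trace later in this article.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # descriptor index within the table
    table: str  # "GDT" or "LDT"
    rpl: int    # requested privilege level, 0 (core state) .. 3 (user state)


def decode_selector(value: int) -> Selector:
    """Split a 16-bit segment selector into its three fields."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a segment selector is 16 bits wide")
    return Selector(index=value >> 3,
                    table="LDT" if value & 0x4 else "GDT",
                    rpl=value & 0x3)


if __name__ == "__main__":
    for cs in (0x08, 0x1B, 0x3B):
        print(hex(cs), decode_selector(cs))
```

A selector whose low two bits are 0 denotes Ring0 code, and one whose low two bits are 3 denotes Ring3 code.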

Code running in the processor's core state is unrestricted: it can freely access any valid address and access ports directly. Code running in user state is subject to many checks by the processor: it can only access virtual addresses mapped into its address space whose page-table entries mark the page as accessible from user state, and it can only directly access the ports permitted by the I/O permission bitmap in the task state segment (TSS). (At this point the IOPL field in the processor's status and control flags register EFLAGS is usually 0, which means that the lowest privilege level allowed to perform direct I/O is Ring0.) This discussion applies only to protected-mode operating systems; a real-mode operating system such as DOS has no such concepts, and all of its code can be regarded as running in core state. Since running in core state has so many advantages, a virus has every reason to want Ring0. The processor switches from Ring3 to Ring0 in two cases: a far CALL through a call gate, and an INT instruction through an interrupt gate or trap gate. The details of the transfer involve complex protection checks and stack switching; please refer to the relevant literature. Modern operating systems typically provide system services and perform the mode switch through an interrupt gate, which on Intel x86 means an INT instruction: INT 30h in Win9x, INT 80h in Linux, and INT 2Eh in WinNT/2000. A user-state service routine (such as a system DLL) requests a system service by executing INT xx; the processor then switches to core state, the corresponding system code running in core state services the request, and the result is returned to the user program. The following example shows how a virus enters the system core state. In the Win9x system area (3G-4G), everything except the page tables in the top 4 MB can be read and written by user programs.
If you view the page attributes of these addresses with SoftICE's PAGE command, you will be surprised to see the U and RW bits set, which means that these addresses can be read and written directly from user state. Any user program can therefore damage the operating system's code pages, maliciously or by accident, while it runs. A virus can thus freely construct gate descriptors in the GDT (Global Descriptor Table) or LDT (Local Descriptor Table) and use them to enter core state. Gate descriptors are not the only way; there are many methods of reaching Ring0. I know of more than ten, including call gates (CallGate), interrupt gates (IntGate), trap gates (TrapGate), faults, interrupt requests (IRQs), ports, the Virtual Machine Manager (VMM), thunks, device I/O control, an API function (SetThreadContext), and the interrupt 2Eh service (NTKERN.VXD). For reasons of space I cannot describe them all, so I have chosen one piece of code from the most representative example, the CIH virus version 1.5. It is often said that CIH uses VxD (virtual device driver) technology, but in fact it is not a VxD.

It merely exploits a Win9x weakness: in the IDT (Interrupt Descriptor Table) it sets up an interrupt gate whose DPL (descriptor privilege level) is 3 (meaning that the INT instruction for that gate may be executed from Ring3), and makes the descriptor point to the address of a function in its private address space that needs to run in Ring0. CIH can then execute an INT xx instruction to enter the system core state and call VMM and VxD services. (CIH chooses INT 3 so that the system debugger SoftICE, which also hooks INT 3, no longer works properly.) The following is a piece of CIH 1.5 source code with my comments:

    ; *********************************************
    ; * Modify the IDT to obtain core-state       *
    ; * privilege                                 *
    ; *********************************************
        push    eax
        sidt    [esp-02h]                   ; get the IDT base address
        pop     ebx
        add     ebx, HookExceptionNumber*08h+04h ; ZF = 0
        cli                                 ; disable interrupts while system data is read and modified
        mov     ebp, [ebx]
        mov     bp, [ebx-04h]               ; get the original interrupt entry address
        lea     esi, MyExceptionHook-@1[ecx] ; get the offset address of the function to run in Ring0
        push    esi
        mov     [ebx-04h], si
        shr     esi, 16
        mov     [ebx+02h], si               ; set the new interrupt entry address
        pop     esi
    ; *********************************************
    ; * Generate an exception to enter Ring0      *
    ; *********************************************
        int     HookExceptionNumber         ; generate the exception
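From the defender's side, the same 8-byte gate descriptor can be decoded to see where an IDT entry points and whether it can be invoked from Ring3. The sketch below assumes the standard 32-bit x86 gate layout (offset bits 0-15 at byte 0, selector at byte 2, attribute byte at byte 5, offset bits 16-31 at byte 6), which matches the two 16-bit writes in the listing above. The sample descriptor bytes and the helper names `decode_gate` and `callable_from_ring3` are invented for the illustration.

```python
import struct
from typing import NamedTuple


class Gate(NamedTuple):
    offset: int     # 32-bit entry address of the handler
    selector: int   # code-segment selector of the handler
    gate_type: int  # 0xE = 32-bit interrupt gate, 0xF = 32-bit trap gate
    dpl: int        # lowest privilege level allowed to use INT n on this gate
    present: bool


def decode_gate(raw: bytes) -> Gate:
    """Decode one 8-byte IDT gate descriptor (32-bit protected mode)."""
    if len(raw) != 8:
        raise ValueError("a gate descriptor is exactly 8 bytes")
    off_lo, selector, _reserved, attr, off_hi = struct.unpack("<HHBBH", raw)
    return Gate(offset=(off_hi << 16) | off_lo,
                selector=selector,
                gate_type=attr & 0x0F,
                dpl=(attr >> 5) & 0x3,
                present=bool(attr & 0x80))


def callable_from_ring3(gate: Gate) -> bool:
    """True if user-state code may enter the handler with an INT instruction."""
    return gate.present and gate.dpl == 3


if __name__ == "__main__":
    sample = bytes.fromhex("7856280000EE3412")  # made-up descriptor bytes
    gate = decode_gate(sample)
    print(hex(gate.offset), gate.dpl, callable_from_ring3(gate))
```

A monitoring tool could compare the decoded offsets against the address ranges of known system modules; a Ring3-callable gate whose handler lies outside them would deserve a closer look.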

Of course, there is further code that restores the original interrupt entry address and the exception-handling frame. The technique just discussed works only on Win9x; entering Ring0 under WinNT/2000 is not so easy. The main reason is that WinNT/2000 does not have the weakness described above: its system code pages (2G-4G) are well protected. Virtual addresses above 0x80000000 are inaccessible to user programs. If you view the page attributes of these addresses with SoftICE's PAGE command, you will see the S bit, which means they can be accessed only from core state. Constructing a descriptor in the IDT or GDT, or patching the kernel at run time, is therefore out of the question from user state. The only option is to load a driver and use it to do what cannot be done in Ring3. In the loaded driver a virus can modify kernel code, or create a call gate for the virus itself (using the NT functions KeI386AllocateGdtSelectors, KeI386SetGdtSelector, and KeI386ReleaseGdtSelectors exported by NTOSKRNL.EXE). For example, the FunLove virus uses a driver to modify system files (NTOSKRNL.EXE, NTLDR) in order to bypass security checks. Two problems arise here. The first is where the driver comes from. Modern viruses generally use a technique called "dropping": the virus body itself contains the driver's binary code (possibly compressed, or constructed dynamically), and when the virus needs it, it generates the driver file on disk and then immediately runs it by registering it with the SCM (Service Control Manager) and finally calling StartService. The second is that loading a driver requires administrator rights: for an ordinary account the load calls above fail, because the security subsystem checks the user's access token. However, most users log in as administrator (otherwise even the anti-virus real-time monitoring driver could not be loaded), so viruses still have plenty of opportunities.
1.2.2 Resident viruses

A resident virus is one that finds a suitable page in memory, copies itself into it, and keeps its code in memory for as long as the system runs. Resident viruses are stealthier than direct-action viruses, and they usually intercept certain system operations in order to infect. A virus that has entered core state can use system services for this. CIH, for example, calls _PageAllocate, a service exported by the VMM (VMMCall _PageAllocate), to allocate a page at an address above 0xC0000000. For a virus running in user state, staying in memory after its program exits seems impossible, because memory allocated by a user program belongs to its process, and once the process ends its resources are released immediately. What we need, then, is memory that survives the exit of the process. A technique from a member of the virus-writing group 29A is very creative. He creates a section object with CreateFileMappingA, maps a view of it into his own address space with MapViewOfFile, and copies the virus body into it. File mappings live at virtual addresses in the shared area (visible to all processes; that is, the page-table entries that every process uses to map virtual addresses in the shared area point to the same physical pages). The next step is to inject a piece of code into Explorer.exe (using WriteProcessMemory to write data into another process); that code opens the same file mapping again from within Explorer.exe's address space.

As a result, even after the virus exits, Explorer.exe still holds the mapped pages, so a copy of the virus code stays in memory pages visible to all processes until Explorer.exe exits. The same goal can be reached by modifying a system dynamic-link library (DLL). Under Win9x, system DLLs (for example kernel32.dll, mapped at BFF70000) live in the system shared area (2G-3G), so a small piece of virus code written into a gap in the code section affects all other processes. But the code section of kernel32.dll is read-only in user state, so its page protection must first be changed by special means. Under WinNT/2000, system DLLs are mapped into the private space of each process (for example kernel32.dll at 77ED0000) with the copy-on-write attribute: as long as no process writes to a page, all processes share it; when a process tries to write to the page, the processor raises a page fault, the system's page-fault handler determines that the fault is a copy-on-write fault rather than an access violation, allocates a new page for the faulting process, copies the original page content into it, and updates the process's page table to point to the new page. This optimization of shared memory makes life harder for virus writers. A virus cannot simply patch the kernel32.dll code once, as it can under Win9x, and affect other processes. It has to use WriteProcessMemory to write the virus code into each process, so that every process gets its own copy of the virus body; in the virus scene this is called multi-process residency or per-process residency.

1.2.3 Intercepting system operations

Intercepting system operations is a long-standing trick of viruses; it was used in the DOS era, and the Windows era is no exception.
Under DOS, a virus intercepts DOS system services by modifying the entry address of INT 21h in the interrupt vector table (DOS provides its system calls, including a large number of file operations, through INT 21h). Most boot-sector viruses hook INT 13h (the BIOS interrupt that provides disk services) to gain control of disk access. Windows viruses have also found ways to hook system services. A typical example is CIH, which uses the system-level file hook provided by IFSMgr.vxd (the Installable File System manager) to intercept all file operations in the system. I will discuss this in detail in the relevant chapter, because real-time monitoring under Win9x relies mainly on the same service. Other methods exist, but none is as effective as the system-level file hook, mainly because they are not low-level enough and miss some file operations. One of them is API hooking. The system offers no ready-made service for it: SetWindowsHookEx can hook mouse messages, but it cannot intercept API functions. So we have to construct the hook ourselves. The method is simple. To intercept the function CreateFile exported by kernel32.dll, for example, you overwrite the start of the function's code (BFF7xxxx) with a jump to your hook function, and at the end of your hook you jump back. As shown below:

    ;; Target function (the function to be intercepted)
    ...
    TargetFunction:
        jmp     DetourFunction      ; jump to the hook function (a 5-byte jump instruction)
    TargetFunction+5:
        push    edi
        ...

    ;; Trampoline (used by your hook function when it wants to return to the original function)
    ...
    TrampolineFunction:
        push    ebp
        mov     ebp, esp
        push    ebx
        push    esi                 ; the lines above are the first instructions of the original function, 5 bytes in total
        jmp     TargetFunction+5    ; jump back into the original function
    ...

But this method catches only a small part of the file-open operations. Win9x offers another way to intercept file operations, which should be regarded as a big back door in Win9x: an API function in kernel32.dll called VxDCall0. The disassembly of this function is as follows:

    mov     eax, dword ptr [esp+00000004h]  ; get the service code
    pop     dword ptr [esp]                 ; fix up the stack
    call    fword ptr cs:[BFFC9004]         ; far call through a call gate to a routine in segment 3B

If we continue tracing, we see:

003b: xxxxxxx int 30h;

This is a protected-mode callback that traps into VWIN32.VXD. For details of VxDCall, see Matt Pietrek's "Windows 95 System Programming Secrets". When the service code is 0x002A0010, the protected-mode callback ends up in a VWIN32.VXD service called VWIN32_Int21Dispatch. This shows that Win9x still depends on MS-DOS, although Microsoft claims that it does not. The calling convention is as follows:

    my_int21h:
        push    ecx
        push    eax                         ; function number, similar to the DOS one
        push    002A0010h
        call    dword ptr [ebp+a_VxDCall]
        ret
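To make the role of the two values concrete, here is a minimal Python sketch of the filtering decision that a monitor sitting on this path has to make. It rests on assumptions that the article does not state: that the 32-bit service code splits into a device ID (high word) and a service number (low word), as is usual for VxD services, and that the function number in EAX follows the classic INT 21h convention with the function in AH. The table lists only a few well-known DOS file functions, and the names `split_service_code` and `classify_request` are invented for the example.

```python
from typing import Optional, Tuple

VWIN32_INT21_DISPATCH = 0x002A0010  # service code quoted in the text

# A few classic DOS INT 21h file functions, keyed by the value of AH.
FILE_FUNCTIONS = {
    0x3C: "create file",
    0x3D: "open file",
    0x41: "delete file",
    0x4E: "find first file",
    0x56: "rename file",
}


def split_service_code(code: int) -> Tuple[int, int]:
    """Split a service code into (high word, low word)."""
    return (code >> 16) & 0xFFFF, code & 0xFFFF


def classify_request(service_code: int, eax: int) -> Optional[str]:
    """Name the file operation requested, or None if it is not one we track."""
    if service_code != VWIN32_INT21_DISPATCH:
        return None
    ah = (eax >> 8) & 0xFF
    return FILE_FUNCTIONS.get(ah)


if __name__ == "__main__":
    print(split_service_code(VWIN32_INT21_DISPATCH))
    print(classify_request(0x002A0010, 0x3D00))
    print(classify_request(0x002A0010, 0x4C00))
```

Long-file-name functions (AX = 71xxh) are left out here; a real filter would need them too.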

We can take the six bytes at BFFC9004, the user-writable address in the kernel32.dll data segment that is read by the far call at the entry of VxDCall0, and change them to point to our own hook function. In the hook we check the service number and function number passed in to decide whether the request is a file service destined for VWIN32_Int21Dispatch. The well-known HPS virus uses this technique to intercept file operations directly from user state, although this method too catches only part of the file operations.

1.2.4 Encrypted polymorphic viruses

Encrypted polymorphic viruses are the main subject of the virtual machine chapter and will be covered there.

1.2.5 Anti-tracing / anti-virtual-execution viruses

Anti-tracing and anti-virtual-execution viruses are closely related to the virtual machine, so they too will be covered in the corresponding chapter.

1.2.6 Direct API calls

Direct API calling is a common technique of today's Win32 viruses: the virus locates the address of an API function in memory by itself and then calls it. When an ordinary program makes an API call, the compiler turns the call statement into several instructions that push the parameters, followed by an indirect call (this is what the Microsoft compiler does; the Borland compiler uses JMP DWORD PTR [xxxxxxxxh]). The form is as follows:

    push    arg1
    push    arg2
    ...
    call    dword ptr [xxxxxxxxh]

The address xxxxxxxxh lies in the import section of the program image. When the program is loaded, the loader fills in the addresses of the API functions there; this is the so-called dynamic linking mechanism. When a virus infects an executable, it is awkward for it to construct, in the host's import section, link information for the APIs that its own code uses, so it chooses instead to locate the API addresses itself at run time. In fact these function addresses are fixed for a given version of the operating system, but a virus cannot rely on that. The more common practice is first to locate the load base address of the DLL that contains the API, and then to look up the required API address in its export section. The second step is hardly difficult as long as you know the export structure. The key is the first step: determining the DLL's load address. The load base of a system DLL is also fixed for a given operating-system version, but again the virus cannot rely on it. Most current viruses use a technique called structured exception handling (SEH) to catch the exceptions that the virus body triggers. This lets the virus search a range of memory for the desired DLL (a DLL is in PE format and has fixed signatures in its header) without worrying about being terminated by the operating system for touching an invalid page. Structured exception handling can be summarized as follows. There are two kinds of exception handling: final exception handling and per-thread exception handling.

One: final exception handling. When an exception occurs in any thread of your process, the operating system calls the handler that was installed, usually in the main thread, by calling SetUnhandledExceptionFilter. You do not need to remove the handler when your program exits; the system clears it automatically.

        ; program entry point
        push    offset final_handler
        call    SetUnhandledExceptionFilter
        ...                         ; code covered by the final handler
        call    ExitProcess
    ; *********************************
    final_handler:
        ...                         ; code to provide a polite exit
                                    ; (eax = -1: reload context and continue)
        mov     eax, 1              ; eax = 1 stops display of the closure box
                                    ; eax = 0 enables display of the box
        ret

Two: per-thread exception handling. The FS register holds a 16-bit selector that points to the TIB (Thread Information Block), a data structure holding important per-thread information. Its first doubleword points to a structure we call ERR:

    Field       Offset  Meaning
    1st dword   +0      pointer to the next ERR structure
    2nd dword   +4      pointer to this level's own exception handler

Exception handlers are thus chained together. If your own handler catches and handles the exception, the operating system does not call its default handler when your program faults, and the annoying red-cross dialog reporting an illegal operation does not appear. Here is the exception-handling fragment from CIH:

    MyVirusStart:
        push    ebp
        lea     eax, [esp-04h*2]
        xor     ebx, ebx
        xchg    eax, fs:[ebx]       ; swap the current ERR pointer with the previous one
                                    ; eax = address of the previous ERR structure
                                    ; fs:[0] = pointer to the new ERR structure (on the stack)
        call    @0
    @0:
        pop     ebx
        lea     ecx, StopToRunVirusCode-@0[ebx] ; offset of your exception handler
        push    ecx                 ; push your exception handler address
        push    eax                 ; push the address of the previous ERR structure
                                    ; the ERR structure is now built; call the ESP at this point (the ERR pointer) ESP0
        ...
    StopToRunVirusCode:
    @1 = StopToRunVirusCode
        xor     ebx, ebx            ; when the exception occurred the system pushed an ERR structure
                                    ; in front of ours, so we must find the original structure's address
        mov     eax, fs:[ebx]       ; get the address of the current ERR structure
        mov     esp, [eax]          ; take its next-structure pointer, restoring ESP0 into ESP
    RestoreSE:                      ; without an exception execution arrives here, and ESP is ESP0
        pop     dword ptr fs:[ebx]  ; pop the previous structure's address back into fs:[0]
        pop     eax                 ; pop your exception handler address
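The chain of ERR structures described above can be modelled without touching real memory. The sketch below walks such a chain over a simulated address space; `read_dword` is a hypothetical callback that returns the doubleword stored at an address, the addresses in the sample are invented, and the end-of-chain marker 0xFFFFFFFF is the value Win32 stores in the last record's next pointer.

```python
from typing import Callable, List

END_OF_CHAIN = 0xFFFFFFFF  # next pointer of the last ERR structure


def walk_err_chain(read_dword: Callable[[int], int],
                   fs0: int, limit: int = 64) -> List[int]:
    """Return the handler addresses in the order the system would try them.

    fs0 is the value stored at fs:[0], i.e. the address of the first ERR
    structure; each structure holds the next pointer at +0 and the handler
    address at +4.
    """
    handlers = []
    node = fs0
    while node != END_OF_CHAIN:
        if len(handlers) >= limit:
            raise RuntimeError("ERR chain is too long or cyclic")
        next_node = read_dword(node)
        handlers.append(read_dword(node + 4))
        node = next_node
    return handlers


if __name__ == "__main__":
    # Simulated memory: a newly installed record at 0x1000 that links to an
    # older record at 0x2000 (all addresses are made up).
    memory = {
        0x1000: 0x2000, 0x1004: 0x00401000,
        0x2000: END_OF_CHAIN, 0x2004: 0x77E00000,
    }
    print([hex(h) for h in walk_err_chain(memory.__getitem__, 0x1000)])
```

Installing a handler, as the CIH fragment does, amounts to pushing a new record whose next field holds the old fs:[0] value and then storing the record's address in fs:[0]; removing it restores the saved value.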

1.2.7 Virus hiding

Hiding its process or module is a feature any successful virus must have. Under Win9x, kernel32.dll exports a function RegisterServiceProcess that makes a process disappear from the process manager's process list, but it does not let a virus escape some process-browsing tools. Once you know how those tools enumerate processes, though, you can find ways to defeat them as well. Under Win9x, most process-browsing tools use the Process32First and Process32Next functions of the ToolHelp32 API to enumerate processes; under WinNT/2000, the EnumProcesses function exported by psapi.dll serves the same purpose. A virus can therefore patch parts of these public functions so that information about a particular process is never returned, and thereby hide itself. But things are far from that simple. As the saying goes, "when virtue rises one foot, vice rises ten," and it holds true here. Thanks to the efforts of many reverse engineers, many secrets that Microsoft kept hidden have gradually been dug out, including the internal data structures and code in the Windows kernel that manage processes and modules. For example, WinNT/2000 describes all active processes in the system with a doubly linked list of EPROCESS blocks, reachable from the NTOSKRNL.EXE variable PsInitialSystemProcess. If a process-browsing tool uses a driver to read this data directly from the kernel, no virus can escape it. For the structure and function of EPROCESS, see "Inside Windows 2000", third edition, by David A. Solomon and Mark E. Russinovich.

1.2.8 Special infection methods

Anyone with a little knowledge of viruses knows that an ordinary virus attaches itself to the end of the host (so the host grows in size) and modifies the program entry point so that the virus runs first.
Many viruses today, however, use special infection techniques that leave both the host's size and the entry point in the file header unchanged. Attaching virus code without changing the size of the infected file sounds incredible, but it simply exploits a property of the PE file format: alignment leaves gaps between the sections of a PE file, and if the virus body is small enough it can split itself into several pieces and insert each piece into the unused gap at the end of a section, so it need not add a section and the file size stays the same. The well-known CIH virus is the typical example of this technique (it is only about 1 KB in size). Now suppose the virus wants to gain control without modifying the entry point in the file header. An unchanged entry point means that execution starts at the original program's entry code, so the virus must patch the original program code with a jump instruction that leads to the virus entry. That is the principle, but much remains open to discussion, such as where in the original code to insert the jump. Some checking tools scan the entry-point field of the executable's header, and if it looks abnormal, that is, if it points not into the code section but into the resource section or the relocation section, there is reason to suspect a virus. The technique just described, known in the virus scene as EPO (entry point obscuring), defeats such scans, and it is also an important anti-virtual-execution technique. It is also worth mentioning that many viruses now support infecting compressed files. Win32.Crypto, for example, can infect several archive types, including ZIP, ARJ, RAR, ACE, and CAB. These viruses contain code that decompresses and recompresses the specific archive types: they first extract the archive's contents, infect the suitable files, then compress the files back and update the checksum in the archive header.
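The section-gap and entry-point observations earlier in this section translate into two simple checks on an already parsed PE section table. The sketch below is a minimal illustration, not a full PE parser: the `Section` fields mirror the usual section-header values (VirtualAddress, VirtualSize, SizeOfRawData, Characteristics), the two flag constants are the standard PE values, and the sample table is invented. As the text notes, an EPO virus leaves the entry point untouched and therefore passes the second check.

```python
from typing import NamedTuple, Optional, Sequence

IMAGE_SCN_CNT_CODE = 0x00000020     # section contains code
IMAGE_SCN_MEM_EXECUTE = 0x20000000  # section is executable


class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA of the section
    virtual_size: int     # bytes actually used in memory
    raw_size: int         # bytes occupied in the file (aligned)
    characteristics: int


def slack_bytes(section: Section) -> int:
    """Unused bytes at the end of the section's file data (the 'gap')."""
    return max(section.raw_size - section.virtual_size, 0)


def section_of(sections: Sequence[Section], rva: int) -> Optional[Section]:
    """Return the section that contains the given RVA, if any."""
    for s in sections:
        span = max(s.virtual_size, s.raw_size)
        if s.virtual_address <= rva < s.virtual_address + span:
            return s
    return None


def entry_point_suspicious(sections: Sequence[Section], entry_rva: int) -> bool:
    """Flag an entry point that does not lie inside a code section."""
    s = section_of(sections, entry_rva)
    if s is None:
        return True
    return not s.characteristics & (IMAGE_SCN_CNT_CODE | IMAGE_SCN_MEM_EXECUTE)


if __name__ == "__main__":
    table = [  # made-up section table
        Section(".text", 0x1000, 0x1A40, 0x1C00, 0x60000020),
        Section(".rsrc", 0x3000, 0x0300, 0x0400, 0x40000040),
    ]
    print([(s.name, hex(slack_bytes(s))) for s in table])
    print(entry_point_suspicious(table, 0x1200))
    print(entry_point_suspicious(table, 0x3010))
```

A scanner might also treat non-zero bytes inside these gaps as a hint of cavity infection, since linkers normally pad them with zeros.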
Many anti-virus products now support scanning compressed files in multiple formats, but they cannot disinfect some infected archives. I think the reason may be the fear that, if for some reason the decompression or recompression goes wrong and the checksum no longer matches, the archive would be corrupted by the cleaning. A virus is not responsible for damage to the user's files, so it has no such worry.

Main references

David A. Solomon, Mark Russinovich, "Inside Microsoft Windows 2000", September 2000
David A. Solomon, "Inside Windows NT", May 1998
Prasad Dabak, Sandeep Phadke, Milind Borate, "Undocumented Windows NT", October 1999
Matt Pietrek, "Windows 95 System Programming Secrets", March 1996
Walter Oney, "System Programming for Windows 95", March 1996
Walter Oney, "Programming the Windows Driver Model", 1999
Lu Lin, "Windows 9x File Read/Write Internals", 2001

Anti-virus engine design: virus detection with a virtual machine

Editor's note: "Introduction to anti-virus engine design" introduced the development of virus technology and the characteristics and infection mechanisms of some viruses. Below we focus on virus detection with a virtual machine.

Contents

2.1 Virtual machine
2.2 Encrypted polymorphic viruses
2.3 Virtual machine implementation details
2.4 Virtual machine code analysis
2.4.1 Analysis of emulation functions for instructions that do not depend on the flags register
2.4.2 Analysis of emulation functions for instructions that depend on the flags register
2.5 Anti-virtual-machine techniques

2. Virus detection with a virtual machine

2.1 Virtual machine

In recent years the virtual machine, also known in the anti-virus field as a generic decryptor, has found its way into anti-virus software. Although its use by anti-virus vendors is still far from perfect, the virtual machine, in incarnations such as the "virus code emulator" and "Stryker", has brought bright prospects for the market sales of anti-virus products. The following discussion will take us into the fascinating world of virtual execution.
The first thing to clarify is the concept of a virtual machine and how the virtual machine discussed here differs from products such as VMware (from the US company VMware, which lets other operating systems such as Linux run under WinNT/2000) and the VDM under Win9x (the DOS virtual machine, used to run 16-bit real-mode code in a 32-bit protected-mode environment). The design ideas behind these virtual machines have a long history. Beginning in the 1960s, IBM developed a line of operating systems that became VM/370. VM/370 provided preemptive multitasking among different programs by presenting multiple virtual machines on a single piece of real hardware. In a typical VM/370 session, the user sits at a remote terminal connected by cable and, through a control program command, simulates the initial program load operation of a real machine; a complete operating system is then loaded into the virtual machine and begins a session for the user. The simulation is so complete that system programmers can even run a virtual copy of the system itself to debug a new version.

VMware is very similar. It runs as an application under the original operating system and creates a virtual machine for the target operating system, which behaves as if it were running on a separate real machine, unaware that it is under VMware's control. When the power button is pressed in VMware, the machine's self-test screen appears, followed by the loading of the operating system; everything looks real.

Win9x uses VMs to let multiple programs share the CPU and other hardware resources (all Win32 applications run in a single system virtual machine, while each 16-bit DOS program has its own DOS virtual machine). A VM is constructed entirely in software and responds to an application's requests in the same way a real computer would. From one point of view, the standard PC architecture can be treated as an API whose elements include the hardware I/O system and the interrupt-based BIOS and MS-DOS services. Win9x often substitutes its own software for these traditional API elements so that scarce hardware resources can be shared. Applications running in a VM believe they own the whole machine: they think they receive input from a real keyboard and mouse and write output to a real screen. With only slight restrictions, they can even believe that they have the CPU and all of memory to themselves.

The key to virtualization lies in two things: software virtualization and hardware virtualization. The following briefly describes how DOS virtual machines are implemented under Win9x.

Once Windows moved to protected mode, protected-mode programs could no longer directly call the real-mode MS-DOS service routines or the real-mode BIOS. Software virtualization describes how the protected-mode parts of Windows nevertheless interact with real-mode MS-DOS and the BIOS. It requires the operating system to intercept calls that cross the boundary between protected mode and real mode, adjust the relevant parameter registers, and switch the CPU mode. Win9x uses virtual device drivers (VxDs) to intercept interrupts issued from protected mode and convert them into real-mode interrupt calls through the real-mode interrupt vector table (IVT). As part of the conversion, the VxD must take the parameters located in protected-mode extended memory, produce the corresponding parameters, and place them where the real-mode (V86) operating system can access them.
After the service completes, the VxD copies the results back into extended memory for the protected-mode caller. The many INT 21h and INT 13h calls in 16-bit DOS programs are handled this way, but many direct port I/O operations remain, and these require hardware virtualization. Virtual hardware shows itself by generating interrupt requests on the hardware interrupt request lines, by responding to IN and OUT instructions, and by changing special memory-mapped locations. Hardware virtualization depends on several features of the Intel 80386. One is the I/O permission bitmap, which lets the operating system trap IN/OUT instructions to any port. Another is the hardware-assisted paging mechanism, which lets the operating system provide virtual memory and intercept accesses to particular memory addresses. The last necessary feature is the CPU's virtual 8086 (V86) mode, which lets DOS programs execute as they would in real mode.

The virus-scanning virtual machine discussed here is not what some people imagine: a virtual execution environment like VMware that provides every element a program might use, including hard drives, ports and so on, lets the virus run freely inside it, and finally decides whether it is a virus according to its behavior. That is a good idea, but because it is too hard to design (too many elements to emulate, and the behavior analysis would have to draw on artificial intelligence theory), it can only serve as a direction for future development. Strictly speaking, what I have designed should not be called a virtual machine; virtual CPU or universal decryptor would be more accurate, but the name is established in the anti-virus field, so we continue to use it. Such a virtual machine is a CPU emulated in software: it can fetch, decode and execute instructions like a real CPU, and it can reproduce the result of running a piece of code on the real CPU.

Given a sequence of machine code, the virtual machine automatically fetches the opcode of the first instruction, determines the opcode type and addressing mode and from them the instruction length, executes the instruction in the corresponding handler function, and uses the result to determine where the next instruction is. It repeats this loop until some specific condition ends it. This is the basic working principle and simplified flow of the virtual machine.

The purpose of designing the virtual machine is to deal with encrypted polymorphic viruses. The virtual machine first determines the virus entry point from the file and reads it, then interprets the decryptor at the head of the virus following the steps above, and finally scans the results of execution for the virus signature (the decrypted virus body is plaintext). "Virtual" here does not mean that a virtual environment is created; it means that the infected file is never actually executed, and the virtual machine merely simulates the effect of executing it. That is the basic principle; see the later sections for details.

Virtual execution is of course useful for more than automatic unpacking (virus scanning with a virtual machine is essentially automatic tracing of the virus's decryptor as it decrypts the encrypted virus body). It can also be applied to cross-platform high-level language interpreters, malicious code analysis, and debuggers. For example, TRDOS, a debugger written by Liu Tao in China, executes the debugged program entirely by interpreting each instruction. Compared with traditional breakpoint debuggers (Debug, SoftICE and so on), such a debugger has many advantages: for instance, it is hard for the debugged program to detect, and there is no limit on the number of breakpoints.
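As a rough illustration of the fetch, decode and execute cycle described in this section, the following C++ sketch interprets a handful of real single-opcode x86 instructions (NOP, INC/DEC r32, MOV r32,imm32, JNZ/JMP rel8, RET). The names VCpu, Stop and run are hypothetical and are not taken from any product discussed here. Memory operands, prefixes and every flag except ZF are omitted, so this is a minimal sketch under those assumptions rather than a usable emulator.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical virtual CPU state: eight 32-bit registers in x86 encoding
// order (EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI), EIP and the zero flag.
struct VCpu {
    uint32_t reg[8] = {0};
    uint32_t eip = 0;
    bool zf = false;
};

enum class Stop { Ret, Unknown, OutOfBounds, StepLimit };

// Interprets `code` until RET, an unsupported opcode, a fetch outside the
// buffer, or `max_steps` instructions, whichever comes first.
Stop run(VCpu& cpu, const std::vector<uint8_t>& code, int max_steps) {
    for (int step = 0; step < max_steps; ++step) {
        if (cpu.eip >= code.size()) return Stop::OutOfBounds;
        const uint8_t op = code[cpu.eip];                 // fetch
        if (op == 0x90) {                                 // NOP
            cpu.eip += 1;
        } else if (op >= 0x40 && op <= 0x47) {            // INC r32
            cpu.zf = (++cpu.reg[op & 7] == 0);
            cpu.eip += 1;
        } else if (op >= 0x48 && op <= 0x4F) {            // DEC r32
            cpu.zf = (--cpu.reg[op & 7] == 0);
            cpu.eip += 1;
        } else if (op >= 0xB8 && op <= 0xBF) {            // MOV r32, imm32
            if (cpu.eip + 5 > code.size()) return Stop::OutOfBounds;
            uint32_t imm = 0;
            for (int i = 0; i < 4; ++i)                   // little-endian immediate
                imm |= static_cast<uint32_t>(code[cpu.eip + 1 + i]) << (8 * i);
            cpu.reg[op & 7] = imm;
            cpu.eip += 5;
        } else if (op == 0x75 || op == 0xEB) {            // JNZ rel8 / JMP rel8
            if (cpu.eip + 2 > code.size()) return Stop::OutOfBounds;
            const int8_t rel = static_cast<int8_t>(code[cpu.eip + 1]);
            cpu.eip += 2;                                 // rel8 is relative to the next instruction
            if (op == 0xEB || !cpu.zf)
                cpu.eip += static_cast<uint32_t>(static_cast<int32_t>(rel));
        } else if (op == 0xC3) {                          // RET ends the sketch
            return Stop::Ret;
        } else {
            return Stop::Unknown;                         // instruction not supported
        }
    }
    return Stop::StepLimit;
}
```

Running it on the ten bytes B9 03 00 00 00 40 49 75 FC C3 (mov ecx,3; inc eax; dec ecx; jnz back to the inc; ret) leaves EAX = 3 and ECX = 0 without the code ever executing on the real CPU.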
2.2 Encrypted Polymorphic Viruses

As noted above, the virtual machine is designed to deal with encrypted polymorphic viruses. This section focuses on encryption and polymorphism techniques.

Early viruses did not use any sophisticated anti-detection techniques. Opened in a disassembler, the virus code is plain machine code. A virus can therefore be uniquely identified by a particular string of machine code inside the virus body together with its offset from the virus entry point (note: not from the file header). Scanning only requires locating the virus entry point and looking for the specific code string at the specified offset. This static scanning technique works very well against ordinary viruses.

As virus technology developed, a class of encrypted viruses appeared. Their characteristics are as follows: there is a decryptor at the entry point, and the main body of the virus is encrypted. At run time the decryptor gains control first and decrypts the virus body in a loop; when it finishes, it hands control to the virus body. When the virus body infects a file, it copies the decryptor, encrypts the virus body with a random key, and writes both to the infected file, storing the key either in the virus or in the decryptor. Because the virus bodies of different infected instances of the same virus are encrypted with different keys, no single code string and offset can characterize the virus, and static scanning appears to fail. On reflection, however, the decryptor remains unchanged plaintext machine code across infections (in theory some machine code must be left unencrypted, or the program could not execute), so a signature can be taken from the decryptor. This carries some risk of false positives (decryptor code lacks distinctly viral features, so the same signature may appear in normal programs), but it is still a valid method.
Because encrypted viruses could not completely escape static signature scanning, virus writers improved on them so that the decryptor code itself varies from one infection to the next, producing encrypted polymorphic viruses. These are very similar to encrypted viruses; the only improvement is that, for each file it infects, the virus body constructs a decryptor with the same function but different code. In other words, the decryptors of different infected instances are functionally identical but look very different. For example, an operation that a single instruction could perform may be split across several instructions, and useless junk instructions may be inserted in between. Static scanning then fails completely, because no constant signature can be found.
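The effect of per-infection keys on static scanning can be illustrated with a small C++ sketch. It assumes a single-byte XOR cipher purely for brevity (the real decryptors shown in this section work on doublewords), and the names Bytes, contains and xor_crypt are hypothetical helpers, not part of any engine discussed here.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Static signature scan: does the byte string `sig` occur anywhere in `data`?
bool contains(const Bytes& data, const Bytes& sig) {
    return std::search(data.begin(), data.end(), sig.begin(), sig.end()) != data.end();
}

// XOR every byte with a one-byte key. Applying the same key twice restores
// the input, so this serves as both "encrypt" and "decrypt" in the sketch.
Bytes xor_crypt(Bytes body, uint8_t key) {
    for (uint8_t& b : body) b ^= key;
    return body;
}
```

With a made-up body DE AD BE EF 13 37 and the signature BE EF 13, the copies encrypted with keys 5A and C3 differ from each other and neither contains the signature, while decrypting either copy makes the signature match again. This is the property that dynamic signature scanning relies on.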

Two examples of polymorphic decryptors are given below, followed by a discussion of how to detect encrypted polymorphic viruses with virtual execution.

Polymorphic decryptor of the well-known Marburg virus:

    00401020: movsx edi, si                  ; virus entry point
    00401023: movsx edx, bp
    00401026: jmp 00408A99
    ...
    00408A94:                                ; initial value of the decryption pointer
    ...
    00408A99: mov dl, F7
    00408A9B: movsx edx, bx
    00408A9E: mov ecx, CF4B9B4F
    00408AA3: call 00408AC4
    ...
    00408AC4: pop ebx
    00408AC5: jmp 00408ADE
    ...
    00408ADE: mov cx, di
    00408AE1: add ebx, 9FDBD22D
    00408AE7: jmp 00408B08
    ...
    00408B08: add ecx, 80C1FBC1
    00408B0E: mov ebp, 7FCDEFF3              ; initial value of the loop counter
    00408B13: sub cl, 39
    00408B16: movsx esi, si
    00408B19: add dword ptr [ebx+60242DBF], 9EF42073 ; decryption statement, 9EF42073 is the key
    00408B23: mov edx, 6FD1D4CF
    00408B28: mov di, dx
    00408B2B: inc ebp
    00408B2C: xor dl, A3
    00408B2F: mov cx, si
    00408B32: sub ebx, 00000004              ; move the decryption pointer (decrypts backwards)
    00408B38: mov ecx, 86425DF9
    00408B3D: cmp ebp, 7FCDF599              ; has decryption finished?
    00408B43: jnz 00408B16
    00408B49: jmp 00408B62
    ...
    00408B62: mov di, bp
    00408B65: jmp 00407400                   ; hand control to the decrypted virus body

Polymorphic decryptor of the well-known HPS virus:

    005365B8:                                ; initial value of the decryption pointer and entry of the encrypted virus body
    ...
    005379CD: call 005379E2
    ...
    005379E2: pop ebx
    005379E3: sub ebx, 0000141A              ; set the initial value of the decryption pointer
    005379E9: ret
    ...
    005379F0: dec edx                        ; decrement the loop counter
    005379F1: ret
    ...
    00537A00: xor dword ptr [ebx], 10E7ED59  ; decryption statement, 10E7ED59 is the key
    00537A06: ret
    ...
    00537A1A: sub ebx, FFFFFFFF
    00537A20: sub ebx, FFFFFFFD              ; move the decryption pointer
    ...
    00537A44: call 00537A30                  ; entry point
    00537A49: call 00537A00
    00537A4E: call 00537A1A
    00537A53: call 005379F0
    00537A58: mov esi, edx
    00537A5A: cmp esi, 74D9C696              ; has decryption finished?
    00537A60: jnz 00537A49
    00537A66: jmp 005365B8                   ; hand control to the decrypted virus entry

The code above is clearly neither compiler output nor hand-written by a programmer, because it is full of disorder and garbage. The instructions without comments can be regarded as junk code; all that the useful part does is loop over the encrypted virus body, adding a fixed value to each doubleword or XORing each doubleword with a fixed value. These are only single infected instances of each polymorphic virus; the decryptor and virus body of other instances will be different again, changed beyond recognition. How polymorphic engines are implemented is outside the scope of this discussion because of the complexity of the algorithms and control involved.

Static signature scanning is obviously no longer able to detect such encrypted polymorphic viruses. The method we adopt instead is dynamic signature scanning. "Dynamic signature scanning" means first decrypting the virus with the help of the virtual machine and then scanning the decrypted virus body for signatures as before. We know that the decrypted virus body is stable, so once we have the decrypted body we can scan it with signatures. To obtain it, the virtual machine must interpret and execute the virus's decryptor; by tracing it and determining when its decryption loop has finished, the virtual machine ends up with the whole virus body, or part of it, saved as plaintext in an internal buffer. The virtual machine is called a universal decryptor because it does not need to know the virus's encryption algorithm in advance; it obtains the plaintext by tracing the virus's own decryption process. How the virtual machine interprets instructions, and how it decides whether the code being executed contains a decryption loop, is described in the next section.

2.3 Virtual Machine Implementation in Detail

The previous section introduced encrypted polymorphic viruses.
We now know that the key to dynamic signature scanning is obtaining the plaintext of the decrypted virus body, and that the plaintext becomes available at the moment the virus's own decryption code finishes. There are currently two ways to control every step of the virus's execution and to read the virus body's plaintext from memory once the decryption loop has completed. One is single-step and breakpoint tracing, as used by today's program debuggers; the other is, of course, virtual execution. The technical details of both are analyzed below.

Single-stepping and breakpoints are the most fundamental techniques of a traditional debugger. Single-stepping works simply: before executing an instruction the CPU checks the flags register, and if the trap flag is set it raises a single-step trap (INT 1) after the instruction has executed. As for breakpoints, a software breakpoint means that the debugger replaces the first byte of the instruction at which it wants to stop with the single-byte breakpoint instruction (CC, that is, INT 3). When execution reaches the breakpoint, the debug exception handler is invoked, and the segment/offset address saved on the stack is the address of the byte immediately after the breakpoint instruction. A hardware breakpoint relies on support in the processor itself: the linear address of the triggering instruction is placed in a debug address register (DR0 to DR3) and the relevant control bits are set in the debug control register (DR7), after which the CPU automatically raises a debug trap at the chosen instruction.
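The bookkeeping behind a software breakpoint, as described above, can be sketched in portable C++ on an in-memory copy of the code. The BreakpointTable type and its set and on_hit members are hypothetical names introduced for illustration; a real debugger would patch the debuggee's memory through operating system services rather than a local vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

constexpr uint8_t kInt3 = 0xCC;  // single-byte INT 3 breakpoint opcode

// Hypothetical breakpoint table operating on a local copy of the code bytes.
struct BreakpointTable {
    std::map<size_t, uint8_t> saved;  // address -> original first byte

    // Replaces the first byte at `addr` with INT 3 and remembers the original.
    bool set(std::vector<uint8_t>& mem, size_t addr) {
        if (addr >= mem.size() || saved.count(addr)) return false;
        saved[addr] = mem[addr];
        mem[addr] = kInt3;
        return true;
    }

    // `reported_eip` is the address saved on the stack after INT 3, which is
    // one byte past the breakpoint. Restores the original byte and returns
    // the address at which execution must resume, or SIZE_MAX if the trap
    // did not come from one of our breakpoints.
    size_t on_hit(std::vector<uint8_t>& mem, size_t reported_eip) {
        if (reported_eip == 0) return SIZE_MAX;
        const size_t addr = reported_eip - 1;
        auto it = saved.find(addr);
        if (it == saved.end()) return SIZE_MAX;
        mem[addr] = it->second;
        saved.erase(it);
        return addr;
    }
};
```

The detail worth noting is the off-by-one: because the saved address points past the CC byte, the debugger has to back up by one byte before restoring the original instruction and resuming.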
Windows itself provides a set of debug APIs that make tracing a program very simple. The debugger does not have to hook the default debug exception handler; it just calls WaitForDebugEvent to wait for debug events. To single-step, it calls GetThreadContext on the suspended debuggee thread to obtain its context, sets the trap flag in the flags register of that context, and applies the change with SetThreadContext. It can also use two powerful debug APIs, ReadProcessMemory and WriteProcessMemory, to inject breakpoint instructions into the address space of the debugged process. According to my reverse analysis, the VC++ debugger is written directly on top of these debug APIs. Since these techniques suffice to write a full-featured debugger like the one in VC++, there is no reason they cannot be used to decrypt virus code automatically.

The simplest method is to create the file under examination as a debuggee child process and single-step it with the method above. Each time an event with the exception code EXCEPTION_SINGLE_STEP arrives, the instruction just executed in single-step mode can be analyzed, and once the virus's whole decryption process is judged to be complete, ReadProcessMemory is used to read out the plaintext of the virus body.

The sole advantage of single-step and breakpoint tracing is that the tracer does not have to handle the execution of each instruction itself. It does not need a large number of instruction-specific handler functions, because all of the decryption code is executed by the CPU; the debugger merely regains control when the code is interrupted by a single-step trap. The disadvantages are equally obvious. First, it is easily detected by the virus: the virus only needs a simple stack check, or a direct call to IsDebuggerPresent, to determine that it is being debugged. Second, because there is no machine code analysis module and instructions are decoded and executed entirely by the CPU, it is impossible to obtain accurate details of execution or to control it effectively. Third, single-step and breakpoint tracing requires the file under examination to really execute: it runs as a real process in its own address space, which a virus scanner certainly cannot allow. Clearly, single-step and breakpoint tracing is suitable for debuggers, automatic unpackers and the like, but not for virus scanning.
The sole disadvantage of virtual execution is that it must handle the execution of every instruction internally, which means writing a large number of instruction-specific handler functions to simulate the effect of each instruction. There is no question of when control is regained, because control always remains with the virtual machine. Emulating a CPU in software is not easy; it requires a thorough understanding of the CPU's mechanisms, or the emulation will diverge from real execution. Two examples follow. The first is AAM, the ASCII adjust after multiplication instruction, which viruses often use to test the quality of a virtual machine because it has undocumented behavior. Normally AAM is a two-byte instruction with the encoding D4 0A (the 0A implicitly specifies the operand 10); however, it can also be treated as a one-byte opcode, D4, followed by an arbitrary 8-bit immediate. The virtual machine must handle this latter case of an explicitly specified divisor for its results to be correct. The second example is how the processor responds to interrupts: right after interrupts are enabled, the CPU does not respond to a pending interrupt immediately but only after one more instruction has executed. If the virtual machine does not take this into account, virtual execution may well diverge from the real behavior.

The advantages of virtual execution, however, are also obvious, and they make up for the shortcomings of single-step and breakpoint tracing. First, the virus cannot detect it, because the virtual machine sets up a dedicated stack in its internal buffer for the virtually executed code, so a stack check gives the same result as real execution (unlike a single-step or breakpoint interruption, nothing such as a return address is pushed onto the stack). Second, because the virtual machine itself decodes instructions and calculates addresses, it can obtain the details of every instruction and control it. Finally, and most importantly, virtual execution really is "virtual": the system creates no process for the file being examined, because the register set, the stack and the other elements of execution are all implemented inside the virtual machine, so the code can be regarded as executing within the virtual machine's address space. Given these advantages, virtual execution is the better choice for generic virus scanning.

A virtual machine design typically takes one of three forms: the self-contained code virtual machine (SCCE), the buffered code virtual machine (BCE), or the limited code virtual machine (LCE).

A self-contained virtual machine works like a real CPU. An instruction is fetched from memory, decoded by the SCCE, and passed to the routine that emulates it; the next instruction then goes through the same loop.
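Returning to the AAM case mentioned earlier in this section, a minimal C++ sketch of an emulation routine that honours an arbitrary immediate might look as follows. The Flags struct and the emulate_aam function are hypothetical names; the sketch models only ZF and SF (the parity flag is left out) and reports a zero divisor to the caller instead of raising a real divide error.

```cpp
#include <cstdint>

// Hypothetical subset of the flags affected by AAM that this sketch models.
struct Flags {
    bool zf = false;
    bool sf = false;
};

// Emulates "D4 ib" (AAM with an explicit divisor): AH = AL / imm8 and
// AL = AL % imm8. Returns false for imm8 == 0, where a real CPU raises a
// divide error; in that case ax and the flags are left untouched.
bool emulate_aam(uint16_t& ax, uint8_t imm8, Flags& f) {
    if (imm8 == 0) return false;
    const uint8_t al = static_cast<uint8_t>(ax & 0xFF);
    const uint8_t quotient = static_cast<uint8_t>(al / imm8);
    const uint8_t remainder = static_cast<uint8_t>(al % imm8);
    ax = static_cast<uint16_t>((quotient << 8) | remainder);
    f.zf = (remainder == 0);           // flags reflect the new AL
    f.sf = (remainder & 0x80) != 0;
    return true;
}
```

An emulator that hard-codes the divisor 10 gives the right answer for D4 0A but the wrong one for, say, D4 10, which is exactly the kind of discrepancy a virus can test for.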

The SCCE contains a routine for decoding memory/register addressing operands and a set of routines for emulating every instruction that might execute on the CPU. As you might expect, the SCCE's code becomes very large and slow. However, an SCCE is very useful to an advanced anti-virus product. Because every instruction is handled internally, the virtual machine can report on each instruction's actions in great detail, and these reports can be cross-referenced to build an effective anti-virus system. At the same time, the anti-virus program has the most precise control over memory and port access, because it performs the address decoding and calculation itself.

The buffered code virtual machine is a cut-down version of the SCCE, smaller and faster than the SCCE. In a BCE, an instruction is fetched from memory and compared against a table of special instructions. If it is not a special instruction, it is decoded just far enough to find its length, and all such instructions are passed to a single small routine that can emulate every non-special instruction. Special instructions, which make up only a small part of the instruction set, are emulated by dedicated small handlers. By handling all non-special instructions with one small generic handler, the BCE reduces the number of instructions that need special treatment, which shrinks its size and increases its speed. But this also means that it cannot truly restrict access to a memory region, port or similar resource, and it cannot produce reports as comprehensive as those of an SCCE.

The limited code virtual machine is something like a bare-bones emulation system for generic decryption. An LCE is not really a virtual machine, because it does not truly emulate instructions; it merely tracks the register contents of a piece of code and may keep a small table of modified memory addresses or of the interrupts that were called.
The reason for choosing an LCE over a larger and more complex system is that support for even a very small number of instructions goes a long way toward decrypting a simply encrypted virus, because viruses use only a small part of the Intel instruction set to encrypt their bodies. An LCE avoids the substantial cost of handling the whole Intel instruction set and so gains a great deal of speed. This, of course, comes at the price of being unable to handle complex decryptors. An LCE is useful when a fast file scan is needed: something as small as an LCE can quickly check an executable for suspicious behavior, whereas running the SCCE algorithm on every file would be slow. And if a file looks suspicious, the LCE can hand it over to an SCCE for a full inspection.

We now introduce the 32-bit self-contained code virtual machine W32Encode (W32Encode.cpp, tw32asm.h and tw32asm.cpp, built together with the other detection and cleaning modules of the scanning engine into RSENGINE.DLL). Because this is a fully engineered, complex, large commercial virtual machine, it inevitably contains special handling for particular viruses; to keep the virtual machine model clear, I simplify where appropriate in the analysis. W32Encode works very simply. It first sets the initial values of a simulated register set (DWORD global variables, such as ENEAX, emulate the registers of the real CPU) and initializes the execution stack pointer (the virtual machine uses an internal array, static int stack[0x20], to emulate the stack). It then enters a loop that interprets the first 256 instructions in the instruction buffer ProgBuffer; if no decryption loop has been found when the loop exits, the file can be judged not to carry an encrypted polymorphic virus. If a decryption loop is found, the EncodeInst function is called to execute the loop repeatedly, and the decrypted virus body is placed in DataSeg1 or DataSeg2.
The relevant code is as follows.

Overall flow control code in W32Encode0:

    for (i = 0; i < 0x100; i++)       // first virtually execute 256 instructions to look for the virus's decryption loop
    {
        if (InstLoc >= 0x280)
            return (0);
        if (InstLoc + ProgSeekOff >= ProgEndOff)
            return (0);               // the two checks above validate the offsets
        SaveInstLoc();                // store the offset of the current instruction in the instruction buffer
        HasadDNewSt = 0;
        if (!(j = parse()))           // virtually execute the instruction in the instruction buffer
            return (0);               // exit when an unrecognized instruction is encountered
        if (j == 2)                   // a return value of 2 means a decryption loop was found
            break;
    }
    if (i == 0x100)                   // no decryption loop found after 256 instructions: give up
        return (0);
    pre corresponds = 0;
    ProcessInst();
    if (!EncodeInst())                // call the decryption function to execute the decryption loop repeatedly
        return (0);

Code that decides whether a JMP forms a loop:

    if ((LOC >= 0) && (LOC ...

The process by which parse virtually executes each instruction is more complicated. Typically, parse fetches the first two bytes of the current instruction from the instruction buffer ProgBuffer and calls the corresponding instruction handler according to their values. For example, when the first byte is 0F and the second byte is BE, the instruction is a sign-extending move (MOVSX), and the handler that the source calls MOVSZX is invoked to process it. On entry to a specific instruction handler, the instruction length is first determined by examining the addressing mode (by calling ModRegRM or ModRegRM1), and control is then passed to the SaveInst function. After saving information about the instruction, SaveInst calls the real instruction execution function, W32ExecuteInst. This function is very similar to parse: it fetches the first two bytes of the current instruction from SaveInstBuf1 and calls the corresponding instruction emulation function according to their values to complete the execution of the instruction. The relevant code is as follows:

Instruction dispatch code in W32ExecuteInst:

    if ((c & 0xF0) == 0x50) {
        if (ExecutePushPop1(c))       // emulate PUSH and POP
            return (GotoNext());
        return (0);
    }
    if (c == (char)0x9C) {
        if (ExecutePushf())           // emulate PUSHF
            return (GotoNext());
        return (0);
    }
    if (c == (char)0x9D) {
        if (ExecutePopf())            // emulate POPF
            return (GotoNext());
        return (0);
    }
    if ((c == 0xF) && ((c2 & 0xBE) == 0xBE)) {
        if (i = ExecuteMovszx(0))     // emulate MOVSZX
            return (GotoNext());
        return (0);
    }

2.4 Virtual Machine Code Analysis

The code for overall flow control and dispatch was analyzed in the previous section. The specific instruction emulation functions, which are the essence of the virtual machine, are analyzed below. I divide instructions into two major classes: those that do not depend on the flags register and those that do.

...

2.4.1 Analysis of Instruction Emulation Functions That Do Not Depend on the Flags Register

    static int ExecutePushPop1(int c)
    {
        if (c <= 0x57) {
            if (StackP < 0)           // check the stack buffer pointer before a push
                return (0);
        }
        else if (StackP >= 0x40)      // check the stack buffer pointer before a pop
            return (0);
        if (c <= 0x57) {
            StackP--;                 // push: decrement the stack pointer before storing
        }
        switch (c) {
        case 0x50:
            stack[StackP] = ENEAX;    // emulate PUSH EAX
            break;
        ...
        case 0x5F:
            ENEDI = stack[StackP];    // emulate POP EDI
            break;
        }
        if (c >= 0x58) {
            StackP++;                 // pop: increment the stack pointer after loading
        }
        return (1);
    }

2.4.2 Analysis of Instruction Emulation Functions That Depend on the Flags Register

Emulation of the CMP instruction in the CW32ASM class:

    void CW32ASM::CMPW(int c1, int c2)
    {
        char flgreg;
        __asm {
            mov eax, c1               // load the first operand
            mov ecx, c2               // load the second operand
            cmp eax, ecx              // compare
            lahf                      // copy the resulting flags into AH
            mov flgreg, ah            // save them in the local variable flgreg
        }
        FlagReg = flgreg;             // save them in the global variable FlagReg
    }

Emulation of the JNZ instruction in the CW32ASM class:

    int CW32ASM::JNE()
    {
        int i;
        char flgreg = FlagReg;        // initialize the local variable flgreg
        __asm {
            mov ah, flgreg            // load AH with the saved FlagReg
            pushf                     // save the virtual machine's own flags register
            sahf                      // load the emulated flags into the real flags register
            mov eax, 1
            jne l                     // execute JNZ against the real flags register
            xor eax, eax
        l:  popf                      // restore the virtual machine's flags register
            mov i, eax
        }
        return (i);                   // a return value of 1 means the jump is taken
    }
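The listings above borrow the host CPU's flags through inline assembly. As a portable alternative, the flags that CMP produces can also be computed arithmetically; the C++ sketch below shows one way to do so. VFlags, emulate_cmp and the *_taken helpers are hypothetical names and are not part of the CW32ASM class. Note that LAHF copies only SF, ZF, AF, PF and CF into AH, so the overflow flag computed here is something the LAHF-based approach does not capture.

```cpp
#include <cstdint>

// Hypothetical container for the four flags most conditional jumps consult.
struct VFlags {
    bool cf;  // carry (unsigned borrow)
    bool zf;  // zero
    bool sf;  // sign
    bool of;  // signed overflow
};

// CMP computes a - b, discards the difference and keeps only the flags.
VFlags emulate_cmp(uint32_t a, uint32_t b) {
    const uint32_t r = a - b;                      // wraps modulo 2^32
    VFlags f;
    f.cf = a < b;                                  // a borrow was needed
    f.zf = (r == 0);
    f.sf = (r >> 31) != 0;
    // Subtraction overflows when the operands have different signs and the
    // sign of the result differs from the sign of the first operand.
    f.of = (((a ^ b) & (a ^ r)) >> 31) != 0;
    return f;
}

// Conditional jumps reduce to simple predicates over the saved flags.
bool jne_taken(const VFlags& f) { return !f.zf; }
bool jb_taken(const VFlags& f)  { return f.cf; }           // unsigned a < b
bool jl_taken(const VFlags& f)  { return f.sf != f.of; }   // signed a < b
```

Computing the flags this way keeps the emulator independent of the compiler's inline assembly support and of the host CPU, at the cost of writing out each flag rule by hand.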

2.5 Anti-Virtual-Machine Techniques

Nothing is perfect or beyond criticism, and virtual machines are no exception. With the appearance of anti-emulation techniques, virus scanning with a virtual machine faces real challenges. Several typical anti-emulation techniques are described here.

The first is inserting special instructions: the virus's decryptor includes special instructions such as floating-point, 3DNow! or MMX instructions to defeat virtual execution. Although a virtual machine uses software to emulate the workings of a real CPU, it is not a real CPU. With limited time and effort, its author may not be able to support the entire Intel instruction set, so the virtual machine stops immediately when it meets an instruction it does not recognize. However, analysis and statistics of such virus code show that these special instructions have no effect on the virus's decryption; they are inserted only to interfere with the virtual machine. In other words, the virus does not use the results computed by these random junk instructions. We can therefore just build a table of instruction lengths for all the special instructions under their different addressing modes, without writing a dedicated emulation function for each one. With this table, when the virtual machine meets an instruction it does not recognize, it can look up the instruction's length and add it to the current emulated instruction pointer (EIP) to skip the junk instruction. There is also a safer method: after obtaining the length, copy the unrecognized instruction into a buffer filled with no-operation instructions (NOP) and jump into that buffer. This amounts to letting the real CPU execute the instruction for us; the last step, of course, is to copy the results from the real registers back to the emulated registers. The advantage of combining virtual and real execution in this way is that even when the special instruction matters to the virus, that is, when the virus depends on its result, the virtual machine still produces the correct result.

The second is structured exception handling: the virus's decryption code first installs its own exception handler and then deliberately triggers an exception to divert control flow to that handler. This transfer is the result of cooperation between the CPU and the operating system, and the operating system plays a large part in it. Because current virtual machines emulate only the CPU's operation, without its protection checks, and do nothing to emulate system mechanisms, an instruction that should raise an exception leads to one of two outcomes. In the first, a defective virtual machine cannot judge the legality of the instruction being emulated, and emulating it makes the virtual machine itself perform an illegal operation. In the second, the virtual machine judges that the instruction is illegal and immediately stops virtual execution. A virus using this technique usually places its real decryption loop in the exception handler, so the virtual machine stops before ever entering the handler and the decryptor escapes virtual execution. A good virtual machine should therefore be able to detect and record the virus's installation of an exception filter, and to transfer control automatically to the exception handler when the virus raises an exception.
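The behaviour asked of a good virtual machine above, recording the handler a program installs and diverting control to it when an exception is raised, can be sketched with a toy interpreter. Everything here is hypothetical: the four-opcode instruction set (HALT, SET_HANDLER, FAULT, WORK), the ToyVm struct and the run_toy function are invented for illustration and do not model real x86 code or the Windows exception mechanism.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy instruction set.
enum Op : uint8_t {
    HALT = 0,         // stop
    SET_HANDLER = 1,  // followed by one byte: address of the exception handler
    FAULT = 2,        // deliberately raises an exception
    WORK = 3          // stands in for one step of the real decryption loop
};

struct ToyVm {
    size_t pc = 0;
    long handler = -1;     // recorded handler address, -1 if none installed
    int work_done = 0;     // number of WORK steps executed
    bool aborted = false;  // set when emulation had to give up
};

void run_toy(ToyVm& vm, const std::vector<uint8_t>& code, int max_steps) {
    for (int i = 0; i < max_steps && vm.pc < code.size(); ++i) {
        switch (code[vm.pc]) {
        case SET_HANDLER:
            if (vm.pc + 1 >= code.size()) { vm.aborted = true; return; }
            vm.handler = code[vm.pc + 1];     // record the installation
            vm.pc += 2;
            break;
        case FAULT:
            if (vm.handler < 0) { vm.aborted = true; return; }  // nothing to divert to
            vm.pc = static_cast<size_t>(vm.handler);            // follow the exception
            vm.handler = -1;                  // treat the handler as one-shot in this sketch
            break;
        case WORK:
            ++vm.work_done;
            ++vm.pc;
            break;
        case HALT:
            return;
        default:
            vm.aborted = true;                // unknown opcode
            return;
        }
    }
}
```

For the program 1 5 2 0 0 3 3 0, the interpreter records handler address 5, follows the fault into it and executes both WORK steps. An emulator that simply stopped at FAULT would never see the code hidden in the handler.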
The third is entry-point obscuring (EPO): the virus gains control by inserting a jump instruction somewhere in the body of the host code, without modifying the host's entry point. From the earlier analysis we know that, for efficiency, a virtual machine cannot virtually execute all the code of the file under test. The usual practice is to start at the file's entry point and, if no decryption loop is found within a specified number of steps, conclude that the file does not carry an encrypted polymorphic virus. EPO exploits exactly this assumption: because the virus takes control only partway through the host's execution, the virtual machine begins by interpreting the host's entry code, naturally finds no decryption loop within the specified number of steps, and reports a false negative.
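A minimal sketch of why a fixed step budget misses an EPO virus follows. The trace format, the loop heuristic (one memory-writing instruction executed many times) and all names are assumptions made for this illustration; the article does not say how its virtual machine recognizes a decryption loop.

```c
#include <stddef.h>

/* One emulated step, as a hypothetical trace record. */
typedef struct {
    unsigned pc;         /* address of the executed instruction       */
    int      wrote_mem;  /* nonzero if the instruction stored to RAM  */
} Step;

#define ADDR_SPACE 4096u /* toy address space, assumption of the sketch */

/* Returns 1 if, within the first max_steps steps, a single
 * memory-writing instruction executed at least `threshold` times. */
static int looks_like_decrypt_loop(const Step *trace, size_t n,
                                   size_t max_steps, unsigned threshold)
{
    unsigned hits[ADDR_SPACE] = {0};
    size_t limit = n < max_steps ? n : max_steps;
    for (size_t i = 0; i < limit; i++) {
        if (!trace[i].wrote_mem) continue;
        if (++hits[trace[i].pc % ADDR_SPACE] >= threshold) return 1;
    }
    return 0;
}

/* Builds a trace of host_steps straight-line host instructions followed
 * by a two-instruction loop (store, branch) repeated loop_iters times.
 * Returns the number of steps written. */
static size_t build_epo_trace(Step *out, size_t cap,
                              unsigned host_steps, unsigned loop_iters)
{
    size_t n = 0;
    for (unsigned i = 0; i < host_steps && n < cap; i++) {
        out[n].pc = i;
        out[n].wrote_mem = 0;
        n++;
    }
    for (unsigned i = 0; i < loop_iters && n + 1 < cap; i++) {
        out[n].pc = host_steps;     out[n].wrote_mem = 1; n++; /* store  */
        out[n].pc = host_steps + 1; out[n].wrote_mem = 0; n++; /* branch */
    }
    return n;
}
```

With 1000 host steps ahead of the loop, a budget of 500 steps reports nothing while a budget of 2000 steps finds the loop, which is the trade-off between scan time and missed detections.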

Increasing the step limit might let the virtual machine follow the inserted jump and trace the virus's decryption, but choosing the limit is hard: too large, and scanning normal programs takes too long; too small, and viruses are missed. We need not worry too much, though, because such viruses are rare owing to their technical difficulty. Without a disassembler and a virtual-execution engine of its own, a virus can hardly locate a complete instruction inside the host at which to insert its jump, can hardly ensure that the inserted jump lies deeper than the virtual machine's step limit, and has no guarantee that the inserted jump will ever execute.

The fourth is multithreading: the virus starts an additional worker thread from the decryption part of its main thread and places the real decryption loop in the worker thread. Because thread scheduling is managed by the operating system, our virtual machine can only assume that the emulated thread runs exclusively and is never preempted. As a result, the virtual machine finds it hard to simulate genuinely multithreaded work. Both multithreading and structured exception handling exploit specific operating-system mechanisms to defeat virtual execution, so adding support for such mechanisms to the virtual CPU is one of our goals for future improvement.

Finally there is metamorphism: the virus does not have the usual structure of a polymorphic decryptor followed by an encrypted body; instead the whole virus mutates. Every part of such a virus changes, and there is no so-called plaintext virus body. Naturally, such viruses are very difficult to write.
The earlier anti-virtual-machine techniques exploit defects in virtual machine design that can be fixed by adding code. Metamorphism, by contrast, renders the virtual machine's dynamic signature scanning completely ineffective, and a more advanced approach such as behavioral analysis must be sought.

Main references
David A. Solomon, Mark Russinovich, "Inside Microsoft Windows 2000", September 2000
David A. Solomon, "Inside Windows NT", May 1998
Prasad Dabak, Sandeep Phadke, Milind Borate, "Undocumented Windows NT", October 1999
Matt Pietrek, "Windows 95 System Programming Secrets", March 1996
Walter Oney, "System Programming for Windows 95", March 1996
Walter Oney, "Programming the Windows Driver Model", 1999
Miss Lin, "Windows 9x File Read/Write Internals", 2001

Real-time monitoring in anti-virus engine design

Note: in "Virtual machine design in anti-virus engine design" we focused on virus scanning with a virtual machine. Now let us look at how viruses are monitored in real time.

Contents
3.1 Introduction to real-time monitoring
3.2 Introduction to virus real-time monitoring
3.3 Win9x virus real-time monitoring
3.3.1 Implementation techniques
3.3.2 Program structure and flow
3.3.3 HOOKSYS.VXD reverse-engineered code analysis
3.3.3.1 Hook function entry code
3.3.3.2 Code for obtaining the current process name
3.3.3.3 Communication code
3.4 WinNT/2000 virus real-time monitoring
3.4.1 Implementation techniques
3.4.2 Program structure and flow
3.4.3 HOOKSYS.SYS reverse-engineered code analysis
3.4.3.1 Code for obtaining the current process name
3.4.3.2 Code for starting the hook function
3.4.3.3 Code for mapping system memory into user space

3. Virus real-time monitoring

3.1 Introduction to real-time monitoring

Real-time monitoring is not a new technology; it already existed in the era of DOS programming, although nobody gave it that professional name at the time. The hard disk write-protection software commonly used in early computer rooms relied on it. Such software typically writes part of its code to the reserved sectors at the start of the hard disk (the 64 sectors beginning at head 0, cylinder 0, sector 1 are reserved and are not accessed by DOS) and modifies the original master boot record so that the write-protection program gains control at startup. Once in control, the program redirects the INT 13h interrupt vector to hook code that stays resident in memory, so that every disk operation is intercepted. The hook code examines the entry parameters, including the function number and the target disk address, to decide whether the operation may proceed; in this way write protection of a particular area is implemented.
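The decision such hook code makes can be sketched as a pure function. This is an illustrative model, not code from any product: the structure names and the cylinder-range policy are assumptions, and only the INT 13h function numbers 03h (write sectors) and 05h (format track) are taken from the standard BIOS interface.

```c
#include <stdint.h>

/* Entry parameters of an INT 13h request, reduced to what the filter
 * needs. Field layout is hypothetical. */
typedef struct {
    uint8_t  function;  /* value of AH on entry to INT 13h */
    uint16_t cylinder;
    uint8_t  head;
    uint8_t  sector;    /* sectors are numbered from 1 */
} DiskRequest;

/* Protected area expressed as an inclusive cylinder range
 * (an assumption of this sketch). */
typedef struct {
    uint16_t first_cylinder;
    uint16_t last_cylinder;
} ProtectedRange;

/* AH=03h writes sectors, AH=05h formats a track. */
static int is_destructive(uint8_t function)
{
    return function == 0x03 || function == 0x05;
}

/* Returns 1 if the request may be passed to the original INT 13h
 * handler, 0 if it must be rejected. */
static int request_allowed(const DiskRequest *req, const ProtectedRange *area)
{
    if (!is_destructive(req->function))
        return 1;                          /* reads always pass        */
    if (req->cylinder >= area->first_cylinder &&
        req->cylinder <= area->last_cylinder)
        return 0;                          /* write into protected area */
    return 1;
}
```

In a resident hook, a rejected request would return an error to the caller instead of chaining to the saved original vector.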
Later, disk recovery cards were developed as an improvement on such write-protection software. They redirect write operations to a temporary partition outside the target area and save the disk's previous state, among other techniques, to provide a restore function. Either way, the core technology of such products is still real-time monitoring of disk operations. Interested readers can refer to Gao Yunqing's "hard disk protection technical manual". There are also many programs that stay resident and intercept useful interrupts for specific purposes. These are commonly called TSRs (terminate-and-stay-resident programs). They are not easy to write: they require considerable knowledge of hardware and DOS interrupts, and they must solve the DOS re-entrancy and TSR re-entrancy problems, or the machine will crash.

Under Windows, real-time monitoring is not easy either. Ordinary user programs cannot monitor system activity, which is a deliberate security measure. The HPS virus could monitor file operations directly from user mode only because of a design flaw in Win9x. The two virus real-time monitors we discuss (for Win9x and for WinNT/2000) both use driver programming, so that a driver working at the system level intercepts all file access. Because the two operating systems differ, the two drivers differ in structure and working principle, and naturally in how they are written, so each is discussed in its own section. The virus real-time monitoring mentioned above is really file monitoring, and that would be the more accurate name. Besides file monitors there are many other real-time monitoring tools, each with its own characteristics and functions.
A recommended site on Windows kernel programming is www.sysinternals.com. It offers many small real-time monitoring tools, such as Regmon, which monitors registry access (by modifying the registry-related service entries in the system call table), and TDImon, which shows TCP and UDP activity in real time (by hooking the dispatch functions of the system protocol driver tcpip.sys to intercept the requests sent to it). These tools are very helpful for understanding the internal operation of the system.

With the background covered, let us look at the specific implementation techniques of virus real-time monitoring.

3.2 Introduction to virus real-time monitoring

As noted above, virus real-time monitoring is really a file monitor. When a file is opened, closed, deleted or written, it checks whether the file carries a virus and, if so, applies the action the user chose: cleaning the virus, denying access to the file, deleting the file, or ignoring it. This effectively stops a virus from spreading on the local machine. The executable loader must open a file before running it, and that open request is caught by the monitor at once, which ensures that every file executed is clean and gives a virus no opportunity to run and trigger. This is only a rough outline of how the monitor works; the details are left to the corresponding sections. Designing a virus real-time monitor presents the following main difficulties.

The first is that writing a driver differs from writing an ordinary user program and is considerably harder. In a user program you call familiar API functions to get things done; to open a file, for instance, you just call CreateFile. In a driver, the familiar CreateFile is not available.
Under NT/2000 you can use ZwCreateFile or NtCreateFile (the native API), but these functions usually must be called at a particular IRQL (interrupt request level). If you are not clear about concepts such as interrupt request levels, deferred and asynchronous procedure calls, and nonpaged and paged memory, the driver you write will easily cause a blue screen (BSOD). An exception in ring 0 usually crashes the system, because ring 0 code is always trusted by the system and no handler exists to catch the exception. Under NT, a call to KeBugCheckEx produces the blue screen, after which the system writes a dump and restarts. Debugging a driver is also less convenient than debugging a user program: a debugger like VC is of no use, and you must use a system-level debugger such as SoftICE, KD or TRW.

The second is communication between the driver and the ring 3 client program. The problem arises naturally. When the driver intercepts a file open request, it must notify the scanning module in ring 3 to check the file being opened; the scanning module must then return its result to the monitor in ring 0; and finally the driver decides, based on that result, whether to allow the request. This is clearly two-way communication. Anyone who writes drivers knows the API for sending device I/O control information to a driver, DeviceIoControl, which is documented in MSDN, but it is one-way: the ring 3 client can call DeviceIoControl to pass information to the ring 0 monitor, but not the other way round. Since no ready-made function lets the ring 0 monitor reach the ring 3 client, we must achieve this indirectly.
To this end we introduce asynchronous procedure calls (APCs) and event objects, which are the key to waking code across privilege levels. Both concepts are introduced briefly here; the implementation details appear in the later sections. An asynchronous procedure call is a mechanism for executing a procedure in the context of a particular thread when conditions are right. When an APC is queued to a thread's APC queue, the system issues a software interrupt, and the APC function runs the next time the thread is scheduled. There are two kinds: APCs created by the system are called kernel-mode APCs, and APCs created by applications are called user-mode APCs.

In addition, a user-mode APC runs only when the thread is in an alertable state. For example, when calling an asynchronous I/O function you can specify a user-defined callback, FileIOCompletionRoutine, which is called when the asynchronous I/O operation completes or is canceled and the thread is alertable; this is a typical use of APCs. The QueueUserAPC function exported by kernel32.dll adds an APC object to the queue of a specified thread, but since we are writing a driver, it is not the function we need. Fortunately, VWIN32.VXD exports a counterpart service, _VWIN32_QueueUserApc. After the monitor intercepts a file open request, it immediately calls this service to queue a ring 3 APC in the client program that is waiting to be woken; the function is called as soon as the client is scheduled. This APC wake-up method applies to Win9x. Under WinNT/2000 we use globally shared event and semaphore objects to solve the mutual wake-up problem, as explained in Section 3.4.2; in the NT/2000 monitor we use KeReleaseSemaphore to wake a thread of the ring 3 client that is waiting. Many anti-virus products have now moved the scanning module into ring 0, the "seamless connection with the operating system" they advertise, which saves the cost of communication. Writing the scanning module as a driver brings its own troubles, however, such as being unable to call many familiar APIs or to interact with the user, so we still choose to analyze the monitor of a traditional anti-virus product.

The third is the resources the driver consumes. If frequent interception of file operations degrades system performance too much, the cost outweighs the benefit.
This article analyzes in depth the monitor of a successful anti-virus product, which uses several techniques to improve performance, such as keeping a history list, built-in file type filtering, and wait timeouts.

3.3 Win9x virus real-time monitoring

3.3.1 Implementation techniques

Real-time monitoring under Win9x relies mainly on three techniques: virtual device driver (VxD) programming, installing a file system hook (IFS hook), and communication between the VxD and the ring 3 client (APC/event). We have said that only a driver working at the system level can effectively intercept file operations system-wide, and the VxD is the virtual device driver of Win9x, so it is the natural choice. VxDs do far more than intercept files through IFSMgr.vxd: the system's VxDs provide almost all the low-level interfaces, and you can think of a VxD as a DLL in ring 0. The virtual machine manager (VMM) is itself a VxD; the interfaces it exports are generally called VMM services, and the interfaces exported by other VxDs are called VxD services. Both kinds are called from ring 0 in the same way: an INT 20h (CD 20) followed immediately by a service identifier. The VMM uses the first half of the service identifier to find the corresponding VxD and the second half to locate the entry in that VxD's service table, and then makes the call:

    CD 20          INT 20h
    01 00 0D 00    DD  VKD_DEFINE_HOTKEY

The first time this instruction executes, the VMM replaces it with an indirect call instruction of the same 6 bytes (not always a CALL; sometimes a JMP is used), so that the service table lookup is skipped from then on:

    FF 15 XX XX XX XX    CALL [$VKD_DEFINE_HOTKEY]
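The two halves of the service identifier can be separated as follows. This is a sketch under the assumption, consistent with both examples in this article (01 00 0D 00 for a VKD service and 00400067h for an IFSMgr service), that the high-order word of the little-endian doubleword selects the VxD and the low-order word selects the entry in its service table. The type and function names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint16_t device_id;  /* selects the VxD                      */
    uint16_t service;    /* index into that VxD's service table  */
} VxdServiceId;

/* Decodes the six bytes "CD 20 xx xx xx xx" of a dynamic link.
 * Returns 0 on success, -1 if the bytes do not start with INT 20h. */
static int decode_dynalink(const uint8_t bytes[6], VxdServiceId *out)
{
    if (bytes[0] != 0xCD || bytes[1] != 0x20)
        return -1;

    /* The identifier is stored as a little-endian doubleword. */
    uint32_t id = (uint32_t)bytes[2]
                | ((uint32_t)bytes[3] << 8)
                | ((uint32_t)bytes[4] << 16)
                | ((uint32_t)bytes[5] << 24);

    out->device_id = (uint16_t)(id >> 16);
    out->service   = (uint16_t)(id & 0xFFFFu);
    return 0;
}
```

Applied to the bytes CD 20 01 00 0D 00 shown above, the function yields device 000Dh and service 0001h.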

Note that the INT 20h calling method above applies only to ring 0, that is, it is the ring 0 interface used when a VxD calls VxD or VMM services. VxDs also provide call interfaces for V86 (virtual 8086) mode, Win16 protected mode and Win32 protected mode. The call interface for V86 and Win16 protected mode is rather odd:

    xor   di, di
    mov   es, di
    mov   ax, 1684h                    ; INT 2Fh, AX = 1684h -> get device entry point
    mov   bx, 002Ah                    ; 002Ah = VWIN32.VXD device ID
    int   2Fh
    mov   ax, es                       ; ES:DI should now contain the entry point
    or    ax, ax
    je    failure
    mov   ah, 00h                      ; VWIN32 service 0 = VWIN32_GET_VERSION
    push  ds
    mov   ds, word ptr cs:[0002]
    mov   word ptr [lpfnvmin32], di
    mov   word ptr [lpfnvmin32+2], es  ; save ES and DI
    call  far [lpfnvmin32]             ; call gate

ES:DI points to a protected-mode callback in segment 3Bh:

    003B:000003D0  INT 30    ; #0028:C025DB52 VWIN32(04)+0742

INT 30h forces the CPU from ring 3 into ring 0. The Win95 INT 30h handler first checks whether the call came from segment 3Bh; if so, it uses the caller's CS:IP to index a protected-mode callback table and obtain a ring 0 address. In this example that address is 0028:C025DB52, the entry point of the requested service VWIN32_GET_VERSION.

The Win32 protected-mode call interfaces of VxDs have already been mentioned. One is DeviceIoControl, which our ring 3 client uses for one-way communication with the monitoring driver. The other is VxDCall, an undocumented export of kernel32.dll that the system itself uses frequently but that is of little use to us. See the Win95 DDK help for a detailed description of the call interfaces provided by each system VxD, and choose the services you need.

Installing a file system hook (IFS hook) relies on a service provided by IFSMgr.vxd, IFSMgr_InstallFileSystemApiHook, with which a driver registers a hook function with the system. All file operations in the system pass through this hook. File reads and writes under Win9x proceed as follows. When a read or write is performed, the MustCompleteCount variable is first incremented, telling the operating system that this operation must run to completion. The function that does this sets an internal variable in the KERNEL32 module to indicate that a critical operation is in progress. Incidentally, the VMM has a function for the same purpose, also named EnterMustComplete, which tells the VMM that a critical operation is in progress and prevents the thread from being killed or suspended. Next, Win9x performs a _MapHandleWithContext operation. Its exact significance is unclear, but what it does is obtain a pointer to the object the handle refers to and increment the reference count.
Then comes the essential operation: KERNEL32 issues a VxDCall named VWIN32_Int21Dispatch. Once in VWIN32, the call is checked to see whether it is a read or a write. If so, the file handle is converted into one that IFS Manager recognizes, and IFSMgr_Ring0_FileIO is called. The task then passes to IFS Manager. IFS Manager builds an ioreq and jumps to the internal routine Ring0ReadWrite. Ring0ReadWrite checks the validity of the handle, obtains the context the FSD returned when the file handle was created, and passes both to the internal routine CallIoFunc. CallIoFunc checks whether an IFS hook exists. If none exists, IFS Manager uses a default IFS hook and calls the corresponding VFatReadFile/VFatWriteFile routine (Microsoft itself ships only the VFAT driver); if an IFS hook exists, the hook function gets control and IFS Manager itself steps out of the file read or write. The calls then return layer by layer. KERNEL32 calls an undocumented function, LeaveMustComplete, to decrement MustCompleteCount, and finally returns to the caller.

This shows that intercepting local file operations through an IFS hook misses nothing, whereas intercepting through an API hook or VxDCall misses more. The well-known CIH virus uses this technique to implement its resident infection; the relevant code fragment is:

    lea   eax, FileSystemApiHook-@6[edi]   ; get the address of the hook function to install
    push  eax
    int   20h                              ; call IFSMgr_InstallFileSystemApiHook
    IFSMgr_InstallFileSystemApiHook = $
    dd    00400067h
    mov   dr0, eax                         ; save the address of the previous hook
    pop   eax

As we can see, all hook functions installed in the system form a chain, and the most recently installed hook is called first. When installing a hook we must save the address of the previous hook so that requests can be passed down the chain:

    mov   eax, dr0                         ; get the address of the previous hook
    jmp   [eax]                            ; jump there to continue
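The chaining rule (the newest hook runs first and must remember its predecessor) can be modeled in a few lines. This is a user-mode illustration only: the function signatures are invented and bear no relation to the real IFS hook prototype, and the substring test merely stands in for a scan result.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hook signature: returns 0 to allow the request,
 * -1 to deny it. */
typedef int (*HookFn)(const char *path);

static HookFn g_top_hook;           /* most recently installed hook   */
static HookFn g_prev_for_monitor;   /* predecessor saved by our hook  */

/* Puts fn at the head of the chain and returns the previous head,
 * which the caller must keep in order to pass requests on. */
static HookFn install_hook(HookFn fn)
{
    HookFn prev = g_top_hook;
    g_top_hook = fn;
    return prev;
}

/* Stand-in for the file system's default handler. */
static int default_handler(const char *path)
{
    (void)path;
    return 0;
}

/* A monitoring hook: denies by NOT calling the previous hook,
 * otherwise forwards the request unchanged. */
static int monitor_hook(const char *path)
{
    if (strstr(path, "infected") != NULL)  /* placeholder for a scan */
        return -1;
    return g_prev_for_monitor(path);
}

/* What the system does for every file request. */
static int dispatch(const char *path)
{
    return g_top_hook(path);
}
```

A typical sequence installs default_handler first, then stores the value returned by install_hook(monitor_hook) in g_prev_for_monitor before any request is dispatched.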

For virus real-time monitoring we likewise save the address of the previous hook when installing ours. If the file being operated on carries a virus, we cancel the request simply by not calling the previous hook; otherwise we must pass the request on promptly. If too much time is spent in the hook, for example waiting for feedback from the ring 3 scanning module, the user will notice the system slowing down. For the structure of the hook function's entry parameters and how to obtain the operation type (such as IFSFN_OPEN) and the file name (in Unicode), see the corresponding code analysis section.

The other technique we need, APC/event, also comes from services exported by a VxD, the well-known VWIN32.VXD. This curious VxD exports many services that correspond to Win32 APIs, such as _VWIN32_QueueUserApc, _VWIN32_WaitSingleObject, _VWIN32_ResetWin32Event, _VWIN32_Get_Thread_Context and _VWIN32_Set_Thread_Context. The VxD is called "virtual Win32", which is probably where the name comes from. Although the service names match the Win32 APIs, the calling conventions differ greatly, so the two must not be confused. _VWIN32_QueueUserApc registers a user-mode APC; the APC function here belongs to the scanning thread of our ring 3 client, which waits in an alertable state. The ring 3 client first passes the thread's address to the driver via IOCTL. When the hook function intercepts a file of interest, it calls this service to queue an APC, and the APC routine executes when the ring 3 client is next scheduled. _VWIN32_WaitSingleObject waits on an object, suspending the current ring 0 thread.
Our ring 3 client first calls the Win32 API CreateEvent to create a set of event objects, converts each event handle into a VxD handle (which should be a pointer to the object) through an undocumented API, OpenVxDHandle, and sends it to the ring 0 VxD via IOCTL. After queuing the APC, the hook function calls _VWIN32_WaitSingleObject on the event's VxD handle and waits; when the ring 3 client has finished scanning, it calls the Win32 API SetEvent to release the waiting hook function.

There is a serious problem here, however. If you do as described, everything works for a while, but after some time the system hangs. Even the driver programming expert Walter Oney acknowledges in his book "System Programming for Windows 95" that his APC routine fails in some cases. Microsoft's engineers state that file operation requests cannot be interrupted: you cannot block a file operation in a driver and depend on ring 3 feedback to respond. The problem has been discussed without a firm conclusion. Some think that the system DLL KERNEL32 holds a mutex when it calls into ring 0 to process a file request, and that in some cases processing the APC requires the same mutex, so deadlock occurs. Others point out that although 32-bit threads under Win9x are preemptively multitasked, the Win16 subsystem runs cooperatively. To run old 16-bit programs smoothly, the system introduces a global mutex, Win16Mutex. Any 16-bit thread holds Win16Mutex for its entire lifetime, and a 32-bit thread must also acquire it when it switches into 16-bit code, because parts of the Win9x core, such as krnl386.exe and gdi.exe, are 16-bit.

If a file request from a thread holding Win16Mutex is blocked, the system deadlocks. The correct answer probably cannot be established without the Win9x source code, but this is the crux of our real-time monitor, so it must be solved. By tracing Win95 file operations and experimenting repeatedly, I finally found a workable solution. Before blocking an intercepted file request, obtain the current thread's ring 0 TCB through Get_Cur_Thread_Handle, find the TDBX from it, and from the TDBX obtain the ring 3 TCB. According to its structure, the Flags field is at offset 44h. I found that if it equals 10h or 20h, blocking easily causes deadlock. This is only an experimental result and I cannot fully explain it; probably such file requests come from threads holding Win16Mutex and therefore cannot be blocked. A more fundamental remedy is to specify a timeout when calling _VWIN32_WaitSingleObject: if no wake-up signal arrives from ring 3 within the specified time, the wait ends automatically, which prevents deadlock.

The main techniques of real-time monitoring under Win9x have now been described. The structure of a VxD and how to write and compile one are not covered here for reasons of space. For more details see Walter Oney's "System Programming for Windows 95"; a translation published in Taiwan, "Windows 95 system program design", is also available.

3.3.2 Program structure and flow

The following structure and flow analysis comes from Hooksys.vxd, the Win9x real-time monitoring virtual device driver of a well-known anti-virus product.

1. When the VxD receives the ON_SYS_DYNAMIC_DEVICE_INIT message from the VMM, it initializes its global variables and data structures. (Note that this is a dynamic VxD, so it does not receive the Sys_Critical_Init, Device_Init and Init_Complete control messages sent while the system virtual machine initializes.) Initialization includes allocating memory on the heap; creating five doubly linked circular lists (free, history, open-file, waiting and close-file) and five semaphores (Create_Semaphore) that protect list operations; and setting the global variable _Gnumoffilters, the number of file name filter entries, to 0.

2. When the VxD receives the ON_W32_DEVICEIOCONTROL message from the VMM, it takes from the entry parameters the I/O control code (IOCTLCODE) that the user program passed with DeviceIoControl, to determine what the user program wants. The ring 3 client program that works with Hooksys.vxd sends I/O control requests to it to carry out a series of tasks. The codes, in order, and their meanings are:

83003C2B: passes the operating system version obtained by GUIDLL to the driver (saved in the iOSVersion variable). Depending on this value, different offsets are used when extracting certain fields from the ring 0 TCB structure, because the operating system version affects kernel data structures.
83003C1B: initializes the free list; each list element stores a set of event pointers converted with OpenVxDHandle.

83003C2F: passes the drive type value obtained by GUIDLL to the driver (saved in the Drivertype variable). Depending on this value, a different wait timeout is set when calling _VWIN32_WaitSingleObject, because reads and writes on non-fixed drives may take somewhat longer.
83003C0F: saves the user-specified types of files to intercept. This filter already exists in the scanning module; setting it here as well improves efficiency, because files of other types are never sent to the ring 3 scanning module, which saves communication overhead. Pointers to the parsed file type filter blocks are stored in the _gafilenamefilterarra array, and the filter count variable _Gnumoffilters is updated.
83003C23: saves the address of the GUIDLL APC function that scans opened files, together with the current thread's KThread pointer.
83003C13: installs the system file hook and starts the hook function FilemonHookProc, which intercepts file operations.
83003C27: saves the address of the GUIDLL APC function that scans closed files, together with the current thread's KThread pointer.
83003C17: uninstalls the system file hook and stops the hook function FilemonHookProc.

The control codes above are issued in a fixed order. Once the hook function is running, the following control codes are also issued as needed:

83003C07: the driver removes the head element of the open-file list, that is, the earliest pending open request, inserts it at the tail of the waiting list, and passes the element's address to the ring 3 APC function that scans opened files.
83003C0B: the driver removes the head element of the close-file list, that is, the earliest pending close request, inserts it at the tail of the free list, and passes the file name string in the element to the ring 3 APC function that scans closed files.
83003C1F: when a scanned file is found to carry a virus, updates the history list.

The following describes how the hook function and the GUIDLL APC function that scans opened files work; the handling of writes and closes is similar. When a file request enters the hook function FilemonHookProc, the function first takes the function code from the entry parameters and checks whether it is an open operation (IFSFN_OPEN, 24h). If not, it passes the request straight down: it rebuilds the entry parameters and calls the previous hook function saved in PrevIFSHookProc. If it is an open request, control moves to the open-request branch. At the start of this branch the function checks whether the current process is our own; if so, the request must be let through, because the scanning module performs frequent file operations and intercepting its own requests would deadlock the system. Next the function obtains the full file path from the stack parameters and checks against the saved filter array whether the file is of an intercepted type. If it is, the function further checks whether the file is one of the following: system.dat, user.dat, /pipe/. It then searches the history list to see whether the file has been scanned and recorded recently. If a record for the file is found and has not expired, that is, the difference between the current system time and its timestamp is not greater than 1F4h, the scan result is read directly from the record. Otherwise the real open-file scanning function, _RavCheckOpenFile, is entered. At entry it takes a free element from the head of the free, waiting or close-file list (_GetFreeEntry) and fills it in (file path name and so on). Before the file request is queued, a value in an undocumented data structure (Ring3TCB->Flags) is examined to decide whether the request may be blocked.
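The history lookup with its 1F4h expiry can be sketched as a small cache. Only the threshold value comes from the text; the unit of time is not stated there, and the table size, structure layout and function names are assumptions of this sketch rather than the product's actual data structures.

```c
#include <string.h>

#define HISTORY_SIZE    8        /* capacity chosen for this sketch        */
#define HISTORY_MAX_AGE 0x1F4ul  /* expiry threshold quoted in the text    */

typedef struct {
    char          path[260];
    unsigned long stamp;     /* time of the scan, unit unspecified */
    int           infected;  /* cached scan result                 */
    int           used;
} HistoryEntry;

typedef struct {
    HistoryEntry entries[HISTORY_SIZE];
    unsigned     next;       /* next slot to overwrite (round robin) */
} History;

/* Records a scan result, reusing the slot of an existing record for
 * the same path if there is one. */
static void history_add(History *h, const char *path,
                        unsigned long now, int infected)
{
    HistoryEntry *slot = NULL;
    for (unsigned i = 0; i < HISTORY_SIZE; i++) {
        if (h->entries[i].used && strcmp(h->entries[i].path, path) == 0) {
            slot = &h->entries[i];
            break;
        }
    }
    if (slot == NULL) {
        slot = &h->entries[h->next];
        h->next = (h->next + 1) % HISTORY_SIZE;
    }
    strncpy(slot->path, path, sizeof slot->path - 1);
    slot->path[sizeof slot->path - 1] = '\0';
    slot->stamp = now;
    slot->infected = infected;
    slot->used = 1;
}

/* Returns 1 and stores the cached result if a record exists whose age
 * does not exceed HISTORY_MAX_AGE; returns 0 otherwise. */
static int history_lookup(const History *h, const char *path,
                          unsigned long now, int *infected)
{
    for (unsigned i = 0; i < HISTORY_SIZE; i++) {
        const HistoryEntry *e = &h->entries[i];
        if (e->used && strcmp(e->path, path) == 0) {
            if (now - e->stamp > HISTORY_MAX_AGE)
                return 0;               /* record has expired */
            *infected = e->infected;
            return 1;
        }
    }
    return 0;
}
```

A hit spares the round trip to the ring 3 scanning module, which is the performance benefit the history list is meant to provide.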

If the request may be blocked, the element is appended to the tail of the open-file list and a ring 3 APC for the open-file scanning function is queued. The hook then calls _VWIN32_WaitSingleObject on an event object stored in the element and waits for ring 3 to finish scanning. While the hook function is suspended, the ring 3 APC function runs. It sends the driver a request with I/O control code 83003C07 to obtain the head element of the open-file list, which holds the earliest file request not yet processed. The driver can pass the element's kernel-space virtual address to it directly, without remapping: because Win9x kernel space has no page protection, a ring 3 program can read it directly. The APC function then calls fnScanOneFile in RsEngine.dll to scan the file, sets the scan result bit in the element, and calls SetEvent on the event object stored in the element to wake the hook function waiting on it. The awakened hook function examines the result set by the ring 3 scanning code and either passes the file request down the chain or cancels it by setting the return value in EAX directly, and then adds a history record. This is only a brief description of the hook and APC function flow; details such as checking for fixed drives and timeouts are omitted. For the details see the annotated disassembly of guidll.dll and hooksys.vxd.

3. When the VxD receives the ON_SYS_DYNAMIC_DEVICE_EXIT message from the VMM, it frees the heap memory allocated at initialization (HeapFree) and destroys the five semaphores used for mutual exclusion (Destroy_Semaphore).

3.3.3 HOOKSYS.VXD reverse-engineered code analysis

Before analyzing the code it is worth introducing the concept of reverse engineering. Reverse engineering means disassembling an executable to understand the meaning of the machine code itself, without the source code.
Reverse engineering has many uses: defeating software protection, studying how a program was designed and written, and exploring the inner workings of the operating system. Many of the undocumented data structures and services used in this article were obtained by reverse engineering. Its difficulty is easy to imagine: a 1 KB EXE file disassembles to roughly 1000 lines, and the three files we have to reverse add up to more than 80 KB, so the total amount of code exceeds 80,000 lines. You must therefore master certain reversing skills, otherwise the work will be very hard.

First, to carry out the reversing work you must choose a good disassembler and a good debugger. IDA (The Interactive Disassembler) is a powerful disassembly tool. It is known for its interactivity: users can add labels and comments and define variables and function names. It also copes with specially treated files that defeat many other disassemblers, such as files with a damaged import section. Dynamic tracing is needed when a file is packed or has had interfering instructions inserted. Numega's SoftICE is a leader among debugging tools: it supports all types of executable files, including VxD and SYS drivers, can be called up with a hotkey, and can set breakpoints on execution, memory access, and port access. In short, it is very powerful; even the president of Microsoft was amazed by it.

Second, you need some understanding of the habits of commonly used compilers, which helps in understanding the meaning of the code. The following code is the form in which MS compilers commonly compile a high-level language function:
0001224A push ebp ; save the base pointer register

0001224B mov ebp, esp
0001224D sub esp, 5Ch ; local variable space on the stack
00012250 push ebx
00012251 push esi
00012252 push edi
...
0001225B lea edi, [ebp-34h] ; reference a local variable
...
0001238D mov esi, [ebp+08h] ; reference a parameter
...
00012424 pop edi
00012425 pop esi
00012426 pop ebx
00012427 leave
00012428 retn 8 ; function return

The following code is the form MS compilers commonly generate for computing the length of a string in a high-level language:
0001170D lea edi, [eax+1Ch] ; string address pointer
00011710 or ecx, 0FFFFFFFFh ; set ECX to -1
00011713 xor eax, eax ; scan for the string terminator (NULL)
00011715 push offset 00012C04h ; compiler optimization
0001171A repne scasb ; scan to the position of the string terminator
0001171C not ecx ; invert to get the string length
0001171E sub edi, ecx ; restore the string address pointer

The last point is that you must have perseverance and a clear mind. Reverse engineering is painful work in itself. The variable and function names used in the high-level source code are nothing more than addresses here, and repeated debugging is needed to determine their meaning. Compiler optimization adds many further obstacles to understanding the code, as in the example above, where the push of a parameter for a later function call has been moved ahead of the string scan. Without perseverance and a clear head the work cannot be done.

We now turn to the analysis of the hooksys.vxd code. Because there is too much code, I have chosen only the representative and most interesting parts to present. The variable, function, and label names in the code were added by me after analysis and may differ somewhat from the original author's.

3.3.3.1 Hook function entry code

C00012E0 PUSH EBP

C00012E1 mov ebp, esp
C00012E3 sub esp, 11Ch
C00012E9 push ebx
C00012EA push esi
C00012EB push edi
C00012EC mov eax, [ebp+arg_4] ; code of the function to be executed
C00012EF mov [ebp+var_11C], eax
C00012F5 cmp [ebp+var_11C], 1 ; IFSFN_WRITE
C00012FC jz writefile
C0001302 cmp [ebp+var_11C], 0Bh ; IFSFN_CLOSE
C0001309 jz closefile
C000130F cmp [ebp+var_11C], 24h ; IFSFN_OPEN
C0001316 jz short openfile
C0001318 jmp irqpassdown

On entry to the hook function, the stack parameters are laid out as follows:
EBP+00h -> saved EBP value.
EBP+04h -> return address.
EBP+08h -> address of the FSD function to be called for this API.
EBP+0Ch -> code of the function to be executed.
EBP+10h -> 1-based drive code on which the operation is performed (-1 if UNC).
EBP+14h -> type of the resource on which the operation is performed.
EBP+18h -> code page in which the user string was passed.
EBP+1Ch -> pointer to the IOREQ structure.
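Read back into C, the entry code and parameter layout above amount to a dispatch on the function code. The sketch below is a user-mode illustration only, not the driver's source: the `HookArgs` structure, the `HookAction` enum, and `classify_request` are hypothetical names, and only the three function codes that appear in the disassembly are used.

```c
#include <assert.h>
#include <stddef.h>

/* Function codes taken from the comparisons in the disassembly above. */
enum { IFSFN_WRITE = 0x01, IFSFN_CLOSE = 0x0B, IFSFN_OPEN = 0x24 };

/* Hypothetical labels for the four branch targets of the entry code. */
typedef enum { PASS_DOWN, HANDLE_WRITE, HANDLE_CLOSE, HANDLE_OPEN } HookAction;

/* Mirrors the six stack parameters at EBP+08h .. EBP+1Ch. */
typedef struct {
    void *fsd_fn;     /* EBP+08h: FSD function to call for this API   */
    int   fn_code;    /* EBP+0Ch: code of the function to be executed */
    int   drive;      /* EBP+10h: 1-based drive, -1 if UNC            */
    int   res_type;   /* EBP+14h: resource type                       */
    int   code_page;  /* EBP+18h: code page of the user string        */
    void *ioreq;      /* EBP+1Ch: pointer to the IOREQ structure      */
} HookArgs;

/* Decide which branch the hook takes, as the cmp/jz chain does. */
HookAction classify_request(const HookArgs *a)
{
    assert(a != NULL);
    switch (a->fn_code) {
    case IFSFN_WRITE: return HANDLE_WRITE;
    case IFSFN_CLOSE: return HANDLE_CLOSE;
    case IFSFN_OPEN:  return HANDLE_OPEN;
    default:          return PASS_DOWN;  /* everything else goes to the previous hook */
    }
}
```

Anything the hook does not recognize is forwarded untouched, so the cost for uninteresting requests is a handful of comparisons.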

The hook function uses the code saved at [EBP+0Ch] to determine the type of the request. It also uses the pointer to the IOREQ structure saved at [EBP+1Ch] to obtain the complete file path name from that structure's path_t ir_ppath field.

3.3.3.2 Code that gets the current process name

C0000870 push ebx

C0000871 push esi
C0000872 push edi
C0000873 call VWIN32_GetCurrentProcessHandle ; returns the ring0 PDB (process database) in eax
C0000878 mov eax, [eax+38h] ; HTASK W16TDB; offset 38h holds the selector of the Win16 task database
C000087B push 0 ; DWORD Flags
C000087D or al,
C000087F push eax ; DWORD Selector
C0000880 call Get_Sys_VM_Handle@0
C0000885 push eax ; system VM handle, VMHANDLE hVM
C0000886 call _SelectorMapFlat ; map the selector's base address to a flat-model linear address
C000088B add esp, 0Ch
C000088E cmp eax, 0FFFFFFFFh ; mapping error
C0000891 jnz short loc_C0000899
...
C0000899 lea edi, [eax+0F2h] ; get the module name from offset 0F2h; char TDB_MODNAME[8]
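The last instruction above reads an 8-byte module name at offset 0F2h of the flat-mapped task database. A minimal user-mode sketch of that step, and of the "is this our own process" test the hook needs, might look as follows. The buffer standing in for the TDB, the function names, and the example module name are all assumptions for illustration; only the offset and field length come from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TDB_MODNAME_OFFSET 0xF2  /* offset used in the disassembly above */
#define TDB_MODNAME_LEN    8     /* char TDB_MODNAME[8]                  */

/* Copy the module name out of a flat-mapped TDB image into out. */
void tdb_module_name(const unsigned char *tdb, char out[TDB_MODNAME_LEN + 1])
{
    assert(tdb != NULL && out != NULL);
    memcpy(out, tdb + TDB_MODNAME_OFFSET, TDB_MODNAME_LEN);
    /* An 8-byte field has no room for a terminator when the name is
       8 characters long, so terminate the copy explicitly. */
    out[TDB_MODNAME_LEN] = '\0';
}

/* Return 1 when the request comes from the scanner's own process. */
int is_own_process(const unsigned char *tdb, const char *own_name)
{
    char name[TDB_MODNAME_LEN + 1];
    assert(own_name != NULL);
    tdb_module_name(tdb, name);
    return strcmp(name, own_name) == 0;
}
```

In the hook, a positive result means the request is passed straight down, which avoids the deadlock described earlier.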

3.3.3.3 Communication part of the code

Code in Hooksys.vxd:

C00011BC push ecx ; ring0 thread handle of the client program
C00011BD push ebx ; parameter
C00011BE push edx ; ring3 APC function
C00011BF call _VWIN32_QueueUserApc ; queue the APC
C00011C4 mov eax, [ebp+0Ch] ; ring0 handle of the event object
C00011C7 push eax
C00011C8 call _VWIN32_ResetWin32Event ; set the event object to the non-signaled state
......
C00011E7 mov eax, [ebp+0Ch]
C00011EA push 3E8h ; timeout setting
C00011EF push eax ; ring0 handle of the event object
C00011F0 call _VWIN32_WaitSingleObject ; wait for ring3 to finish scanning

Code in Guidll.dll:

10001AD1 mov eax, hDevice ; get the device handle
10001AD6 lea ecx, [esp+4]
10001ADA push 0
10001ADC push ecx ; bytes returned
10001ADD lea edx, [esp+8]
10001AE1 push 4 ; output buffer size
10001AE3 push edx ; output buffer pointer
10001AE4 push 0 ; input buffer size
10001AE6 push 0 ; input buffer pointer
10001AE8 push 83003C07h ; IO control code
10001AED push eax ; device handle
10001AEE call ds:DeviceIoControl
10001AF4 test eax, eax
10001AF6 jz short loc_10001B05
10001AF8 mov ecx, [esp+0] ; head element of the open-file list
10001AFC push ecx
10001AFD call ScanOpenFile ; virus scanning function

Inside the ScanOpenFile function:

1000185D call ds:fnScanOneFile ; call the export function of the real scanning library
10001863 mov edx, hMutex
10001869 add esp, 8
1000186C mov esi, esi ; PROUSEMUTEX
10001875 test esi, esi ; check the result
10001877 jnz short OpenFileisvirus ; if a virus was found, jump to OpenFileisvirus for further processing
10001879 mov eax, [ebp+10h] ; ring3 handle of the event object
1000187C mov byte ptr [ebp+16h], 0 ; set the result in the element
10001880 push eax
10001881 call ds:SetEvent ; signal the event object to wake the hook function

3.4 Real-time virus monitoring under WinNT/2000

3.4.1 Detailed implementation techniques

Real-time virus monitoring under WinNT/2000 relies mainly on three techniques: NT kernel-mode driver programming, IRP interception, and communication between the driver and the Ring3 client program (named event and semaphore objects). The design and overall flow of the program are very similar to real-time virus monitoring under Win9x; only the techniques differ greatly, because the operating environment is different. VxDs are no longer supported under WinNT/2000. The hooksys.sys I will analyze is actually what is called an NT kernel-mode device driver. Such a driver differs greatly from a VxD in both structure and working mode. Writing one is also harder than writing a VxD. The programmer must be familiar with the overall architecture and operating mechanisms of WinNT/2000 (NT/2000 is a pure 32-bit microkernel operating system, very different from Win9x) and must be able to work flexibly with kernel data structures such as driver objects, device objects, file objects, I/O request packets, executive process/thread blocks, and the system service dispatch table. The programmer must also watch many important matters while programming, such as the current interrupt request level and paged versus non-paged memory.

I first introduce several important kernel data structures that are used constantly when programming NT kernel-mode device drivers: file objects, driver objects, device objects, I/O request packets (IRPs), and I/O stack locations (IO_STACK_LOCATION).

Files clearly conform to the object criteria in NT: they are system resources that can be shared by two or more user-mode processes; they can have names; they are protected by object-based security; and they support synchronization.
For user-mode protected subsystems, a file object typically represents an open instance of a file, device, directory, or volume; for device and intermediate drivers, a file object usually represents a device. Most fields of the file object structure are opaque to drivers. The fields a driver may access include:
PDEVICE_OBJECT DeviceObject: pointer to the device object on which the file was opened.
UNICODE_STRING FileName: name of the file opened on the device. If the device represented by DeviceObject was itself opened, the length of this string is 0.

A driver object represents the image of a loaded kernel-mode driver. The I/O Manager creates it when the driver is loaded into the system. A pointer to the driver object is passed as an input parameter to the driver's initialization routine (DriverEntry), reinitialization routines, and unload routine. Most fields of the driver object structure are opaque. The fields a driver may access include:
PDEVICE_OBJECT DeviceObject: pointer to the device objects created by the driver. This field is updated automatically after a successful call to IoCreateDevice in the initialization routine. When the driver is unloaded, its unload routine uses this field and the device objects to call IoDeleteDevice and delete every device object the driver created.
PDRIVER_INITIALIZE DriverInit: entry address of the initialization routine (DriverEntry), set by the I/O Manager. This routine is responsible for creating a device object for each of the driver's devices. It can also create a symbolic link between the device name and a name visible to user mode.

It also fills the entry points of the driver's routines into the fields of the driver object.
PDRIVER_UNLOAD DriverUnload: entry address of the driver's unload routine.
PDRIVER_DISPATCH MajorFunction[IRP_MJ_MAXIMUM_FUNCTION+1]: array of entry addresses of one or more driver dispatch routines. Every driver must set at least one dispatch entry point in this array for the IRP_MJ_XXX requests it handles; in that case the I/O Manager routes all IRP_MJ_XXX requests to the same dispatch routine. A driver can of course also set a separate dispatch entry point for each IRP_MJ_XXX request.

A driver may contain many more routines than those listed above: for example StartIo routines, interrupt service routines (ISRs), DPC routines for interrupt servicing, one or more completion routines, cancel I/O routines, system shutdown notification routines, and error logging routines. The hooksys.sys we will analyze uses only a few of them, so the rest are not described in detail.

A device object represents a logical, virtual, or physical device for which the loaded driver handles I/O requests. Every NT kernel-mode driver must call IoCreateDevice in its initialization routine to create the device objects it supports. For example, tcpip.sys creates three device objects that share the driver in its DriverEntry: Tcp, Udp, and Ip. A driver model that is currently popular is WDM (Windows Driver Model). In most cases a WDM binary image is compatible with both Win98 and Win2000 (32-bit versions). The main difference between a WDM driver and an NT kernel-mode driver is how devices are created. With a WDM driver, the Plug and Play (PnP) manager tells the driver when a device is added to the system or removed from it.
A WDM driver has a special AddDevice routine, which the PnP manager calls for each device instance that shares the driver. An NT kernel-mode driver has to do a lot of additional work: it must detect its own hardware, create device objects for the hardware (usually in DriverEntry), and configure and initialize the hardware so that it works properly. Most fields of the device object are opaque. The fields a driver may access include:
PDRIVER_OBJECT DriverObject: pointer to the driver object that represents the driver's loaded image.

All I/O is driven by I/O request packets (IRPs). IRP-driven means that the I/O Manager allocates a block of space in the system's non-paged memory; when a user command arrives or an event occurs, it places the work instructions in that block and passes it to the driver's service routine. In other words, the IRP contains the information the driver's service routine needs. An IRP has two parts: a fixed part (called the header) and one or more stack locations. The information in the fixed part includes the type and size of the request, whether the request is synchronous or asynchronous, a pointer to the buffer used for buffered I/O, and status information that changes as the request progresses.
PMDL MdlAddress: pointer to a memory descriptor list (MDL) that describes a user-mode buffer associated with the request. If the Flags field of the top-level device object contains DO_DIRECT_IO, the I/O Manager creates this MDL for IRP_MJ_READ and IRP_MJ_WRITE requests. If an IRP_MJ_DEVICE_CONTROL request specifies the METHOD_IN_DIRECT or METHOD_OUT_DIRECT transfer method, the I/O Manager creates an MDL for the output buffer used by the request.

The MDL itself describes the user-mode virtual buffer, but it also contains the physical addresses of the buffer's locked memory pages.
PVOID AssociatedIrp.SystemBuffer: the SystemBuffer pointer points to a data buffer located in kernel-mode non-paged memory. For IRP_MJ_READ and IRP_MJ_WRITE operations, the I/O Manager creates this buffer if the top-level device specifies the DO_BUFFERED_IO flag. For IRP_MJ_DEVICE_CONTROL operations, the I/O Manager creates it if the I/O control function code indicates that a buffer is required. The I/O Manager copies the data that the user-mode program sends to the driver into this buffer as part of creating the IRP. This data can be the data of a WriteFile call or the so-called input data of a DeviceIoControl call. For a read request, the device driver fills the buffer with the data it has read, and the I/O Manager then copies the buffer's contents to the user-mode buffer. For an I/O control operation that specifies METHOD_BUFFERED, the driver puts the so-called output data in this buffer, and the I/O Manager then copies it to the user-mode output buffer.
IO_STATUS_BLOCK IoStatus: IoStatus (IO_STATUS_BLOCK) is a structure with only two fields, which the driver sets when it finally completes the request. The IoStatus.Status field receives an NTSTATUS code.
PVOID UserBuffer: for an IRP_MJ_DEVICE_CONTROL request that uses METHOD_NEITHER, this field contains the user-mode virtual address of the output buffer. It is also used to save the user-mode virtual address of the buffer of read and write requests, but a driver that specifies the DO_BUFFERED_IO or DO_DIRECT_IO flag usually has no need to access this field in its read and write routines. When handling a METHOD_NEITHER control operation, the driver can create its own MDL from this address.
Whenever a kernel-mode program creates an IRP, it also creates an associated array of IO_STACK_LOCATION structures. Each stack location in the array corresponds to one driver that will process the IRP, and there is one more stack location for use by the creator of the IRP. A stack location contains the IRP's type code and parameter information and the address of the completion routine.
UCHAR MajorFunction: the major function code of the IRP. The code is a value such as IRP_MJ_READ and corresponds to a dispatch function pointer in the MajorFunction table of the driver object.
UCHAR MinorFunction: the minor function code of the IRP. It further specifies which function within the major function class the IRP belongs to.
PDEVICE_OBJECT DeviceObject: address of the device object that corresponds to this stack location. This field is filled in by the IoCallDriver function.
PFILE_OBJECT FileObject: address of the kernel file object that is the target of the IRP.

The following is a brief introduction to I/O request processing in WinNT/2000. Consider first a synchronous I/O request to a single-layer driver:
1. The I/O request passes through the subsystem DLL.
2. The subsystem DLL calls the corresponding service in the I/O Manager.
3. The I/O Manager sends the request to the device driver in the form of an IRP.
4. The driver starts the I/O operation.
5. When the device completes the operation and interrupts the CPU, the device driver services the interrupt.
6. Finally the I/O Manager completes the I/O request.

These six steps are only a very rough description; interrupt handling and the I/O completion phase are more complicated. When the device completes the I/O operation, it issues an interrupt to request service. When the device interrupts, the processor transfers control to the kernel trap handler, which locates the ISR for the device in its interrupt dispatch table (IDT).

Once the driver's ISR gains control, it usually stays at device IRQL only briefly: it stops the device's interrupt, queues a DPC, dismisses the interrupt, and exits. Until IRQL drops to DISPATCH/DPC level, all interrupts of intermediate priority can be serviced. When the DPC routine gains control, it starts the next I/O request in the device queue and then finishes servicing the interrupt.

After the driver's DPC routine has run, some work remains before the I/O request can be considered complete. In some cases the I/O system must copy data stored in system memory into the caller's virtual address space, for example to record the result of the operation in the I/O status block supplied by the caller, or to return data to the calling thread for buffered I/O. So when the DPC routine calls the I/O Manager to complete the original I/O request, the I/O Manager queues a kernel-mode APC to the calling thread. When that thread is scheduled, the pending APC is delivered. It copies the data and the return status into the caller's address space, frees the IRP that represents the I/O operation, and sets the caller's file handle, or the event or I/O completion port supplied by the caller, to the signaled state. If the caller specified a user APC with the asynchronous I/O functions ReadFileEx and WriteFileEx, that user APC must be queued as well. Only then can the I/O be considered complete, and threads waiting on the file or other object handles are released.

I/O request processing for file-system-based devices is basically the same; the main difference is the addition of one or more extra layers of processing. For a read request, for example, the system service dispatcher KiSystemService in Ntoskrnl.exe locates NtReadFile in Ntoskrnl.exe through the system service dispatch table, and the interrupt is dismissed. This service routine is part of the I/O Manager.
It first checks the parameters passed to it, to protect system security and to prevent a user-mode program from accessing data illegally, and then creates an IRP of type IRP_MJ_READ and sends it to the entry point of the file system driver. The remaining work is done by the file system driver together with the disk driver. The file system driver can reuse the IRP, or create a set of related IRPs that work in parallel, for a single I/O request. The disk driver that carries out the IRP is the one that finally accesses the hardware. For PIO-mode devices, an IRP_MJ_READ operation results in a direct read of a device port or of a memory-mapped register implemented by the device. Although drivers running in kernel mode can talk to their hardware directly, they usually access it through the hardware abstraction layer (HAL): a read operation eventually calls the READ_PORT_UCHAR routine in HAL.DLL to read a single byte from an I/O port.

Devices and drivers in WinNT/2000 form a clearly layered stack. The lowest device object in the stack is called the physical device object, or PDO, and its driver is called the bus driver. Somewhere in the middle of the device object stack is an object called the functional device object, or FDO, whose driver is called the function driver. There may be filter device objects above and below the FDO. A filter device object above the FDO is called an upper filter, and its driver an upper filter driver; a filter device object below the FDO (but still above the PDO) is called a lower filter, and its driver a lower filter driver. This stacked structure makes the flow of I/O request processing clearer. Every operation that affects a device uses an IRP. An IRP is normally sent first to the topmost driver in the device stack and then filters down step by step to the drivers below. Each driver can decide how to handle the IRP.

Sometimes a driver does nothing except pass the IRP to the layer below. Sometimes a driver handles the IRP itself and does not pass it down. Sometimes a driver both processes the IRP and passes it down. Which of these happens depends on the device and on the contents of the IRP.

From this introduction it follows that to intercept the system's file operations, we must intercept the IRPs that the I/O Manager sends to the file system driver. The simplest way to intercept IRPs is to create an upper filter device object and add it to the device stack that contains the file system device. The procedure is as follows: first create your own device object with IoCreateDevice, then call IoGetDeviceObjectPointer to get a pointer to the file system device object (Ntfs, Fastfat, Rdr or Mrxsmb, Cdfs), and finally place your own device on the device stack as a filter with IoAttachDeviceToDeviceStack. This is the most common and the safest way to intercept IRPs. Art Baker's "Windows NT Device Driver Design Guide" describes it in detail. For real-time virus monitoring, however, it has two problems. First, the filter is attached at the top of the stack, so when other upper filters are present there is no guarantee that our filter sits directly above the file system device. Second, because the filter device has to stand in for the file system device, all of the file system device's characteristics must be replicated in the filter device. In addition, the filter driver must support every dispatch routine that the file system driver supports. This means we cannot reserve the filter driver's dispatch routines for our own Ring3 client program, because IRPs originally sent to the file system driver's dispatch routines now pass through the filter driver's dispatch routines.

So hooksys.sys does not use this method. Its method is simpler and more direct: it obtains a pointer to the file system driver object through ObReferenceObjectByName.
It then replaces the entry addresses of the open, close, cleanup, set file information, and write dispatch routines in the driver object's MajorFunction array with the entry addresses of the corresponding hook functions in hooksys.sys, and in this way intercepts the IRPs. Refer to the code for details.

The following describes the techniques for communication between the driver and the Ring3 client program. As under Win9x, NT/2000 still supports one-way communication from Ring3 to Ring0 through DeviceIoControl, but waking a Ring3 thread from Ring0 by queuing an APC is not feasible. The reason is that I did not find a documented function that does this (Walter Oney says there is an undocumented function that can queue an APC from Ring0). In fact we can also achieve two-way wake-up with named event and semaphore objects, and this may be more reliable than APCs.

The Object Manager has an extremely important position in the Windows NT/2000 kernel, and one of its most important functions is to organize and manage the system's kernel objects. Windows NT/2000 brought many object-oriented ideas into the kernel: all kernel objects are encapsulated inside the Object Manager and are opaque to every other subsystem that wants to reference them, which must access these structures through the Object Manager. Microsoft stresses that kernel driver code should follow this principle (user code cannot access this data directly at all) and provides a series of routines whose names start with Ob for us to use. Named kernel objects live in a system-wide namespace organized much like the traditional DOS directory and file tree. The Object Manager manages this tree as well, so that kernel objects can be retrieved quickly.
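The dispatch-table patching that hooksys.sys is described as using for IRP interception can be modeled in user mode with an array of function pointers. The sketch below is not kernel code and not the driver's source: `MockDriverObject`, `MockIrp`, and every function name are hypothetical stand-ins, and only the IRP_MJ_XXX numeric values are the real constants. It shows the three essential steps: save the original entries, swap in the hook, and forward to the saved entry.

```c
#include <assert.h>
#include <stddef.h>

/* Real major function codes; everything else here is a mock. */
#define IRP_MJ_CREATE           0x00
#define IRP_MJ_CLOSE            0x02
#define IRP_MJ_WRITE            0x04
#define IRP_MJ_SET_INFORMATION  0x06
#define IRP_MJ_CLEANUP          0x12
#define IRP_MJ_MAXIMUM_FUNCTION 0x1B

typedef struct { int major; } MockIrp;
typedef int (*Dispatch)(MockIrp *);
typedef struct { Dispatch MajorFunction[IRP_MJ_MAXIMUM_FUNCTION + 1]; } MockDriverObject;

static Dispatch g_saved[IRP_MJ_MAXIMUM_FUNCTION + 1];  /* original entry points */
static int g_intercepted;                               /* requests seen by the hook */
static const int k_hooked[] = { IRP_MJ_CREATE, IRP_MJ_CLOSE, IRP_MJ_CLEANUP,
                                IRP_MJ_SET_INFORMATION, IRP_MJ_WRITE };
#define N_HOOKED (sizeof k_hooked / sizeof k_hooked[0])

/* Stand-in for a file system dispatch routine. */
int mock_fsd_dispatch(MockIrp *irp) { return 0x1000 + irp->major; }

void mock_driver_init(MockDriverObject *drv)
{
    for (int i = 0; i <= IRP_MJ_MAXIMUM_FUNCTION; i++)
        drv->MajorFunction[i] = mock_fsd_dispatch;
}

/* The hook: observe the request, then forward it to the original routine. */
int hook_dispatch(MockIrp *irp)
{
    assert(irp != NULL && g_saved[irp->major] != NULL);
    g_intercepted++;
    return g_saved[irp->major](irp);
}

void install_hooks(MockDriverObject *drv)
{
    for (size_t i = 0; i < N_HOOKED; i++) {
        int mj = k_hooked[i];
        if (drv->MajorFunction[mj] == hook_dispatch)
            continue;                          /* already hooked */
        g_saved[mj] = drv->MajorFunction[mj];
        drv->MajorFunction[mj] = hook_dispatch;
    }
}

void remove_hooks(MockDriverObject *drv)
{
    for (size_t i = 0; i < N_HOOKED; i++) {
        int mj = k_hooked[i];
        if (drv->MajorFunction[mj] == hook_dispatch)
            drv->MajorFunction[mj] = g_saved[mj];
    }
}

int intercepted_count(void) { return g_intercepted; }
```

Only the five hooked slots change, so every other request type reaches the file system driver exactly as before. A real driver would presumably also have to make the pointer swap safe against requests arriving on other processors while it is in progress, which this single-threaded model ignores.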

Organizing named kernel objects in a tree has another advantage: all named objects are kept in good order. Device objects, for example, are under \Device, and object type names are under \ObjectTypes. It also makes it possible for user processes to access only the objects under \BaseNamedObjects, while kernel code is under no such restriction. As for how these named objects are organized internally, Windows NT/2000 starts from the directory object pointed to by the kernel variable ObpRootDirectoryObject and uses hash tables to organize the named kernel objects.

hooksys.sys uses named semaphores to wake the Ring3 threads. The procedure is as follows. Guidll.dll first calls CreateSemaphore to create a named semaphore hookopen in the non-signaled state, and calls CreateThread to create a thread. At its entry, the thread code calls WaitForSingleObject to wait until it is woken by the Ring0 hook function. During initialization the driver obtains a pointer to the named semaphore object hookopen through the undocumented routine ObReferenceObjectByName (\BaseNamedObjects\hookopen). When it intercepts a file open request, it calls KeReleaseSemaphore to signal hookopen and wake the Ring3 thread that is waiting to check opened files. GUIDLL.DLL actually creates two named semaphores; the other, hookclose, wakes the Ring3 thread that is waiting to check closed files.

GUIDLL.DLL uses named events to wake the Ring0 hook function that has been suspended temporarily to wait for the scan result. The procedure is as follows. During initialization hooksys.sys creates a group of named events with the ZwCreateEvent function (the security descriptor must be set properly here, otherwise the Ring3 thread will not be able to use the event handle) and obtains their handles, and it obtains pointers to the event objects referenced by the handles through ObReferenceObjectByHandle.
hooksys.sys then saves the handle, pointer, and name of each event in an element of the standby list: Ring3 uses the handle, Ring0 uses the pointer. When the hook function intercepts a file request, it first wakes the Ring3 scanning thread and then calls KeWaitForSingleObject to wait on the event \BaseNamedObjects\hookxxxx until scanning is complete. The awakened Ring3 scanning thread obtains the event's handle by name with the OpenEventA function and, when it has finished scanning, calls SetEvent to set the event to the signaled state and wake the suspended Ring0 hook function. This discussion applies only to file open operations. When the hook function intercepts other file requests, it does not call KeWaitForSingleObject to wait for scanning to complete; it wakes the Ring3 scanning thread and returns directly. Correspondingly, the Ring3 scanning thread does not call SetEvent to wake anything after the scan.

There are also several matters that need attention when writing an NT kernel-mode driver. The first is the interrupt request level (IRQL), which deserves particular attention in NT driver programming. Every kernel routine must run at a certain IRQL. If you cannot be sure which IRQL is current at the time of a call, you can call KeGetCurrentIrql to obtain the current IRQL and decide accordingly.

For example, to obtain a pointer to the current process's EPROCESS, first check the current IRQL: if it is greater than or equal to DISPATCH_LEVEL, call IoGetCurrentProcess; if it is below the dispatch/deferred procedure call level (DISPATCH_LEVEL/DPC), PsGetCurrentProcessId and PsLookupProcessByProcessId can be used.

The second matter is paged versus non-paged memory. Because the system cannot handle a page fault while executing at or above DISPATCH_LEVEL (page faults are handled at APC level), the general rule is that code running at these levels must never cause a page fault. This means that code executed at a level at or above DISPATCH_LEVEL must reside in non-paged memory, and all data that such code accesses must reside in non-paged memory as well.

The last matter is synchronization and mutual exclusion, which is especially important for a shared driver such as a real-time virus monitor. Although hooksys creates no threads of its own (PsCreateSystemThread), it hooks the file system, so the file requests of every thread in the system pass through hooksys. While one thread's file request is being processed, hooksys accesses global shared data such as the filter array and the history list. The thread may be preempted for some reason in the middle of such an access, and the file requests of other threads that then pass through hooksys and access the same data will go wrong. The driver must therefore use kernel synchronization objects such as spin locks, mutexes, and resources to synchronize all threads that share the global data.

3.4.2 Program structure and flow

The following analysis of structure and flow comes from hooksys.sys, the NT kernel-mode device driver that implements real-time monitoring under WinNT/2000 in a well-known anti-virus product:

1. Initialization routine (DriverEntry): call _GetProcessNameOffset to obtain the offset of the process name within EPROCESS.
Initialize the five doubly linked circular lists (standby, open-file, waiting-operation, close-file, and history) together with the 4 spin locks and 1 fast mutex that make list operations mutually exclusive. Set the global variable _irqcount (the IRP count) to 0. Create the unload protection event object. Initialize the synchronization resource variable for the file name filter array. Obtain the two named semaphores hookopen and hookclose from the system-wide named kernel object area (_CreateSemaphore). Allocate the standby list in the system non-paged pool (_allocatebuff), create a group of named event objects hookxxxx, and save them in the elements of the standby list (_createoneEvent). Create the device, set the dispatch routine entry points, and create a symbolic link for the device. Build the list of disk drive device object pointers (_QuerySymbolicLink) and the list of file system driver object pointers (_hooksys).

2. Open routine (IRP_MJ_CREATE): map the standby list in system non-paged memory (address saved in _sysbufaddr) into user space (address saved in _userbufaddr) so that this memory can be accessed directly from user mode (_mapMemory).

3. Device control routine (IRP_MJ_DEVICE_CONTROL): It reads from the IRP's current stack location the control code that the user program passed through DeviceIoControl, which tells the driver what the user program wants done. The ring-3 client program that works with Hooksys.sys (GuidLL) sends I/O control requests to Hooksys.sys to complete a series of tasks. The codes, in the order issued, and their meanings are as follows:

83003C2F: Passes the drive type value obtained by GuidLL to the driver (saved in the Drivertype variable). The driver sets a different wait timeout depending on this value, because reads and writes on non-fixed drives take somewhat longer.
83003C0F: Passes the file types that the user has chosen to intercept. This filter already exists in the scanning module; setting it in the driver as well improves efficiency, because files of other types are never sent to the ring-3 scanning module, which saves communication overhead. The parsed file type filter block pointers are saved in the _gafilenamefilterarra array, and the filter count variable _Gnumoffilters is updated.
83003C13: Modifies the dispatch routine entries of the file system driver objects, which starts the hook functions that intercept file operations.
83003C17: Restores the original dispatch routine entries of the file system drivers, which stops the hook functions that intercept file operations.

The control codes above are issued in a fixed order. Once the hook functions are running, the following control codes are also issued, in no fixed order:

83003C07: The driver removes the head element of the open-file list (the earliest pending open request), inserts it at the tail of the waiting list, and passes the element's user-space address to the ring-3 thread that waits to scan opened files.
83003C0B: The driver removes the head element of the close-file list (the earliest pending close request), inserts it at the tail of the standby list, and passes the file name string in the element to the ring-3 thread that waits to handle closed files.
83003C1F: Updates the history list when a scanned file turns out to be a virus.

The following describes how the hook function _HookCreateDispatch and the GuidLL thread that waits to scan opened files cooperate. Close, cleanup, set-file-information, and write operations are handled in a similar way.

When a file request enters the hook function _HookCreateDispatch, the function first locates the current stack location in the incoming IRP and obtains the file object that represents the request. It then determines whether the current process is our own. If it is, the request must be let through: the scanning module performs file operations frequently, and intercepting file requests that come from RavMon would cause a serious system deadlock. Next, the function uses the file object in the stack location to obtain the full file path name and makes sure the file is not /PIPE/ or /IPC. It then searches the history list to see whether the file has already been scanned and recorded. If a record for the file is found and has not expired (the difference between the current system time and the record's timestamp must not exceed 1F4h), the scan result is read directly from the record. If the history list has no record of the file, the filter array is checked to see whether the file belongs to an intercepted file type. If it does, the function that actually checks opened files, _ravcheckopenfile, is entered. This function first takes a free element from the head of the standby list (_GetfreeEntry) and fills it in, for example with the file path name. It then appends the element to the tail of the open-file list and releases the HookOpen semaphore to wake the ring-3 thread that waits to scan opened files.

The function then calls KeWaitForSingleObject on the event object saved in the element to wait for ring 3 to finish the scan. While the hook function is suspended, the ring-3 scanning thread runs: it sends the driver a control code asking for the head of the open-file list, which holds the earliest submitted file request that has not yet been processed, and the driver passes it the element's address as mapped into user space. The thread then calls the fnscanonefile function in RSENGINE.DLL to scan the file, sets the scan result bit in the element, and calls SetEvent on the event object saved in the element to wake the hook function waiting on that event. The awakened hook function checks the result bit set by the ring-3 scanning code to decide whether the file request is passed on to the saved original dispatch routine or is cancelled by calling IofCompleteRequest and returning directly. It also adds a record to the history list.

The above is only a brief description of how the hook function and the ring-3 thread interact. It omits details such as the fixed-drive test and timeout handling. For details, see the disassembly comments for Guidll.dll and Hooksys.sys.

4. Close routine (IRP_MJ_CLOSE): Stop the hook functions and restore the original dispatch entries of the file system drivers (_stopfilter). Unmap the memory that was mapped into user space.

5. Unload routine (DriverUnload): Stop the hook functions and restore the original dispatch entries of the file system drivers. Delete the device and its symbolic link. Delete the set of named event objects HookXXXX created at initialization, which includes releasing the pointer references and closing the open handles. Free the memory allocated for the MDL (_pMdl), the standby list (_SysBufAddr), the history records, and the filters. Delete the resource variable (_filterResource) that guards access to the file name filter array. Release the two pointer references to HookOpen and HookClose in the system's global named kernel object area.
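The order of checks that the hook function applies before it hands a file to ring 3 can be sketched in plain user-mode C. This is only an illustration under stated assumptions, not the driver's code: the names decide, history_rec_t, and type_is_intercepted are hypothetical, the filter is modeled as a case-sensitive extension match, the pipe and IPC names are assumed to be path prefixes written with backslashes, and the unit of the 1F4h expiry window is not given in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Possible outcomes for one intercepted open request. */
typedef enum {
    PASS_THROUGH,         /* forward to the original dispatch routine unchecked */
    USE_CACHED_CLEAN,     /* unexpired history record says the file is clean */
    USE_CACHED_INFECTED,  /* unexpired history record says the file is infected */
    SEND_TO_SCANNER       /* queue the request for the ring-3 scanning thread */
} decision_t;

/* One history record (hypothetical layout): path, cached verdict, timestamp. */
typedef struct {
    const char   *path;
    int           infected;
    unsigned long stamp;
} history_rec_t;

/* Expiry window taken from the text; its time unit is an assumption. */
#define HISTORY_TTL 0x1F4UL

/* Hypothetical stand-in for the filter array: match on the file extension. */
static int type_is_intercepted(const char *path,
                               const char *const *exts, size_t n_exts)
{
    const char *dot = strrchr(path, '.');
    if (dot == NULL)
        return 0;
    for (size_t i = 0; i < n_exts; i++)
        if (strcmp(dot, exts[i]) == 0)
            return 1;
    return 0;
}

/* Applies the checks in the order the text describes them. */
static decision_t decide(const char *path, int from_scanner_process,
                         const history_rec_t *hist, size_t n_hist,
                         const char *const *exts, size_t n_exts,
                         unsigned long now)
{
    assert(path != NULL);

    /* 1. Requests from the scanner's own process are never intercepted. */
    if (from_scanner_process)
        return PASS_THROUGH;

    /* 2. Pipe and IPC names are skipped (assumed backslash prefix form). */
    if (strncmp(path, "\\PIPE\\", 6) == 0 || strncmp(path, "\\IPC", 4) == 0)
        return PASS_THROUGH;

    /* 3. An unexpired history record answers the request directly. */
    for (size_t i = 0; i < n_hist; i++) {
        if (strcmp(hist[i].path, path) == 0 &&
            now - hist[i].stamp <= HISTORY_TTL)
            return hist[i].infected ? USE_CACHED_INFECTED : USE_CACHED_CLEAN;
    }

    /* 4. Only intercepted file types are sent to ring 3. */
    if (!type_is_intercepted(path, exts, n_exts))
        return PASS_THROUGH;

    return SEND_TO_SCANNER;
}
```

With a history record stamped at time 1000 and a current time of 1200, a lookup for the same path returns the cached verdict. The same lookup against a record stamped at 100 falls outside the 1F4h window and goes on to the filter check.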
3.4.3 Hooksys.sys Reverse Engineering Code Analysis

3.4.3.1 Code for Getting the Current Process Name

Locating the process name offset in the initialization routine:

00011889 call ds:__imp__IoGetCurrentProcess@0 ; get the EPROCESS pointer of the current process (System)
0001188F mov edi, eax ; EPROCESS base address
00011891 xor esi, esi ; initialize the offset to 0
00011893 lea eax, [esi+edi] ; scan pointer
00011896 push 6 ; process name length
00011898 push eax ; scan pointer
00011899 push offset $SG8452 ; "System", the process name string
0001189E call ds:__imp__strncmp ; compare the bytes at the scan pointer with the process name
000118A4 add esp, 0Ch ; restore the stack
000118A7 test eax, eax ; test the comparison result
000118A9 jz short loc_118B9 ; found, leave the loop
000118AB inc esi ; increment the offset
000118AC cmp esi, 3000h ; scan within a 12 KB range
000118B2 jb short loc_11893 ; still within range, continue comparing

Getting the current process name:

00010D1E call ds:__imp__IoGetCurrentProcess@0 ; get the EPROCESS pointer of the current process
00010D24 mov ecx, _ProcessNameOffset ; the saved process name offset
00010D2A add eax, ecx ; pointer to the process name

3.4.3.2 Code for Starting the Hook Functions

000114F4 push 4 ; file system drivers collected in advance
000114F6 mov esi, offset FSDRIVEROBJECTPTRLIST
...
test eax, eax ; test whether the driver object pointer is valid
00011500 jz short loc_11548 ; invalid, go on to modify the next driver
00011502 mov edx, offset _HookCreateDispatch@8 ; address of our hook function
00011507 lea ecx, [eax+38h] ; address of the open dispatch routine entry (IRP_MJ_CREATE)
0001150A call @InterlockedExchange@8 ; atomic operation, replaces the dispatch routine entry in the driver object
0001150F mov [esi-10h], eax ; save the original open dispatch routine entry
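The swap in this listing, an atomic exchange that installs the hook and returns the previous entry so it can be saved, can be imitated in user-mode C11 with <stdatomic.h>. The sketch below is an analogy only: fake_driver_t, g_drivers, g_original, start_hooks, and stop_hooks are hypothetical names, the "driver object" is reduced to a single function pointer slot, and atomic_exchange stands in for the kernel's InterlockedExchange.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct fake_driver fake_driver_t;
typedef int (*dispatch_fn)(fake_driver_t *drv, int request);

/* Much-reduced stand-in for a driver object: only the open routine slot. */
struct fake_driver {
    _Atomic(dispatch_fn) open_dispatch;
};

#define MAX_DRIVERS 4

static fake_driver_t *g_drivers[MAX_DRIVERS];   /* entries may be NULL */
static dispatch_fn    g_original[MAX_DRIVERS];  /* saved original routines */
static int            g_hook_calls;             /* counts intercepted requests */

/* Stand-in for a file system's own open routine. */
static int original_open(fake_driver_t *drv, int request)
{
    (void)drv;
    return request;
}

/* The hook: record the call, then forward to the saved original routine. */
static int hook_open(fake_driver_t *drv, int request)
{
    g_hook_calls++;
    for (size_t i = 0; i < MAX_DRIVERS; i++)
        if (g_drivers[i] == drv && g_original[i] != NULL)
            return g_original[i](drv, request);
    return -1; /* no saved routine for this driver */
}

/* Install the hook in every valid driver and save what was there before.
 * A request arriving between the exchange and the store of the saved
 * pointer would find no original routine; this sketch ignores that window. */
static void start_hooks(void)
{
    for (size_t i = 0; i < MAX_DRIVERS; i++) {
        if (g_drivers[i] == NULL || g_original[i] != NULL)
            continue; /* invalid entry, or already hooked */
        g_original[i] = atomic_exchange(&g_drivers[i]->open_dispatch, hook_open);
    }
}

/* Put the saved routines back. */
static void stop_hooks(void)
{
    for (size_t i = 0; i < MAX_DRIVERS; i++) {
        if (g_drivers[i] == NULL || g_original[i] == NULL)
            continue;
        atomic_store(&g_drivers[i]->open_dispatch, g_original[i]);
        g_original[i] = NULL;
    }
}
```

After start_hooks, a call through the slot reaches hook_open first and is then forwarded to original_open. After stop_hooks, the slot holds original_open again. Using a single exchange means no thread ever observes a half-written pointer, which is presumably why the driver uses InterlockedExchange rather than a plain store.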

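The offset scan from 3.4.3.1 is easy to reproduce outside the kernel. The sketch below searches a byte block for a known name and returns its offset, giving up after 3000h positions as the listing does. It is a user-mode illustration: find_name_offset is a hypothetical name, the block is an ordinary buffer rather than an EPROCESS structure, and the extra bound on the block size is an addition that the listing does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCAN_LIMIT 0x3000u /* 12 KB, the bound used in the listing */

/* Returns the offset of `name` inside `block`, or -1 if it does not occur
 * within the first SCAN_LIMIT offsets (or within the block, if smaller).
 * memcmp is used in place of the listing's strncmp; for a name with no
 * embedded NUL the two comparisons give the same answer. */
static long find_name_offset(const unsigned char *block, size_t block_size,
                             const char *name)
{
    assert(block != NULL && name != NULL);
    size_t len = strlen(name);
    assert(len > 0);

    for (size_t off = 0; off < SCAN_LIMIT && off + len <= block_size; off++) {
        if (memcmp(block + off, name, len) == 0)
            return (long)off;
    }
    return -1;
}
```

The driver runs this scan once, at initialization, against the process whose name it already knows ("System" in the listing) and saves the result in _ProcessNameOffset. Afterwards, adding that offset to any EPROCESS pointer yields that process's name, which is what the three-instruction sequence at 00010D1E does.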
3.4.3.3 Code for Mapping System Memory to User Space

0001068E push esi ; system memory size
0001068F push _SysBufAddr ; system memory base address
00010695 call ds:__imp__MmSizeOfMdl@8 ; compute the size of the memory descriptor list (MDL) needed to describe the system memory
0001069B push 206B6444h ; debugging tag
000106A0 push eax ; MDL size
000106A1 push 0 ; allocate from the system non-paged pool
000106A3 call ds:__imp__ExAllocatePoolWithTag@12 ; allocate memory for the MDL
000106A9 push esi ; system memory size
000106AA mov _pMdl, eax ; save the MDL pointer
000106AF push _SysBufAddr ; system memory base address
000106B5 push eax ; MDL pointer
000106B6 call ds:__imp__MmCreateMdl@12 ; initialize the MDL
000106BC push eax ; MDL pointer
000106BD mov _pMdl, eax ; save the MDL pointer
000106C2 call ds:__imp__MmBuildMdlForNonPagedPool@4 ; fill in the MDL's physical page array
000106C8 push 1 ; access mode (1 = UserMode)
000106CA push _pMdl ; MDL pointer
000106D0 call ds:__imp__MmMapLockedPages@8 ; map the physical memory pages described by the MDL
...
000106DB mov _UserBufAddr, eax ; save the mapped user-space address

_UserBufAddr and _SysBufAddr map to the same physical addresses.
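The effect of the mapping code in 3.4.3.3, two different virtual addresses backed by the same physical pages, can be observed from ordinary user-mode code. The sketch below is a POSIX analogy, not the kernel technique: it uses mmap on an anonymous temporary file in place of the MDL calls, and dual_map_t, dual_map_create, and dual_map_destroy are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Two views of the same pages. */
typedef struct {
    unsigned char *kernel_view; /* plays the role of _SysBufAddr */
    unsigned char *user_view;   /* plays the role of _UserBufAddr */
    size_t         size;
    FILE          *backing;     /* temporary file that owns the pages */
} dual_map_t;

/* Maps the same `size` bytes twice. Returns 0 on success, -1 on failure. */
static int dual_map_create(dual_map_t *m, size_t size)
{
    assert(m != NULL && size > 0);

    m->backing = tmpfile();
    if (m->backing == NULL)
        return -1;

    int fd = fileno(m->backing);
    if (ftruncate(fd, (off_t)size) != 0) {
        fclose(m->backing);
        return -1;
    }

    /* MAP_SHARED makes both mappings refer to the same underlying pages. */
    void *a = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void *b = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED) munmap(a, size);
        if (b != MAP_FAILED) munmap(b, size);
        fclose(m->backing);
        return -1;
    }

    m->kernel_view = a;
    m->user_view = b;
    m->size = size;
    return 0;
}

/* Removes both mappings and releases the backing file. */
static void dual_map_destroy(dual_map_t *m)
{
    munmap(m->kernel_view, m->size);
    munmap(m->user_view, m->size);
    fclose(m->backing);
}
```

A byte written through kernel_view is immediately visible through user_view and vice versa, with no copy in between. This is the property the driver relies on when it fills a standby list element in kernel mode and lets the ring-3 thread read and update it through the user-space address.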
