File operations are among the most basic functions of an application. The Win32 API and MFC both provide functions and classes that support file processing; commonly used ones include the Win32 API functions CreateFile(), WriteFile(), and ReadFile(), and the CFile class provided by MFC. In general these meet the requirements of most cases. However, some special application areas need mass storage of tens of GB, hundreds of GB, or even several TB, and the usual file-handling methods are clearly inadequate for files of that size. At present, such large files are generally handled with memory-mapped files. This Windows core programming technique is discussed below.
Memory-mapped files are similar to virtual memory. Through a memory-mapped file you can reserve a region of address space and commit physical storage to that region. The difference is that the physical storage comes from a file that already exists on disk rather than from the system page file, and the file must be mapped before it can be operated on, after which it behaves as if the entire file had been loaded from disk into memory. As a result, when a memory-mapped file is used to process a file stored on disk, it is no longer necessary to perform explicit I/O operations on the file, which means there is no need to request and allocate buffers while processing it. All file buffering is managed directly by the system. Because the steps of loading file data into memory, writing data from memory back to the file, and releasing memory blocks are eliminated, memory-mapped files play a very important role when processing large amounts of data. In addition, real-world systems often need to share data between multiple processes. If the amount of data is small, there are many flexible ways to do this; if the shared data is very large, a memory-mapped file is needed. In fact, memory-mapped files are the most effective way to share data between processes on the same machine.
Memory-mapped files are not simple file I/O; they rely on a Windows core programming technique: memory management. To understand memory-mapped files in depth, you therefore need a clear understanding of the memory management mechanism of the Windows operating system. Memory management is a complex topic and is beyond the scope of this article; interested readers can consult other books on the subject. The general procedure for using a memory-mapped file is given below:
First, create or open a file kernel object with CreateFile(). This object identifies the file on disk that will be used as the memory-mapped file. Once CreateFile() has told the operating system where the file image is located, only the path of the image file has been specified; the length of the image has not. To specify how much physical storage the file mapping object needs, call CreateFileMapping() to create a file mapping kernel object, which tells the system the size of the file and how the file will be accessed. After the file mapping object has been created, an address space region must be reserved for the file data, and the file data must be committed as the physical storage mapped to that region. MapViewOfFile() is responsible for mapping all or part of the file mapping object into the process address space, under the management of the system. From this point on, using the memory-mapped file is basically the same as working with file data that has been loaded into memory in the usual way. When you have finished with the memory-mapped file, a series of operations cleans up and releases the resources used. This part is relatively simple: UnmapViewOfFile() removes the file data image from the address space of the process, and CloseHandle() closes the file mapping object and the file object created earlier.
Related functions:
The API functions used with memory-mapped files are mainly those mentioned above. They are introduced in turn below:
(1) HANDLE CreateFile(LPCTSTR lpFileName, DWORD dwDesiredAccess, DWORD dwShareMode, LPSECURITY_ATTRIBUTES lpSecurityAttributes, DWORD dwCreationDisposition, DWORD dwFlagsAndAttributes, HANDLE hTemplateFile);
CreateFile() is frequently used to create and open files even in ordinary file operations. When working with a memory-mapped file, it creates or opens a file kernel object and returns its handle. When calling it, set the parameters dwDesiredAccess and dwShareMode according to whether the data needs to be read or written and how the file is to be shared. Incorrect settings cause the corresponding later operations to fail.
(2) HANDLE CreateFileMapping(HANDLE hFile, LPSECURITY_ATTRIBUTES lpFileMappingAttributes, DWORD flProtect, DWORD dwMaximumSizeHigh, DWORD dwMaximumSizeLow, LPCTSTR lpName);
CreateFileMapping() creates a file mapping kernel object. The parameter hFile specifies the handle of the file to be mapped into the process address space (this handle is obtained from CreateFile()). Because the physical storage of a memory-mapped file is actually the file on disk rather than storage allocated from the system page file, the system does not reserve an address space region on its own initiative, nor does it automatically map the file's storage to such a region. So that the system can determine which protection attributes to give the pages, set the parameter flProtect. The protection attributes PAGE_READONLY, PAGE_READWRITE, and PAGE_WRITECOPY mean that, once mapped, the file data can be read, read and written, or read and written copy-on-write, respectively. With PAGE_READONLY, CreateFile() must have been called with GENERIC_READ; PAGE_READWRITE requires that CreateFile() was called with GENERIC_READ | GENERIC_WRITE; PAGE_WRITECOPY requires that CreateFile() was called with either GENERIC_READ or GENERIC_READ | GENERIC_WRITE. The DWORD parameters dwMaximumSizeHigh and dwMaximumSizeLow are also important: together they specify the maximum size of the file in bytes. Because the two parameters form a 64-bit value, the maximum supported file length is 16 EB, which satisfies almost any requirement for processing files with large volumes of data.
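The way a 64-bit size is divided between dwMaximumSizeHigh and dwMaximumSizeLow can be sketched in portable C++. This is only an illustration of the arithmetic: SizeParts, SplitSize, and JoinSize are hypothetical names that are not part of the Win32 API, and no Windows call is made.

```cpp
#include <cstdint>

// Hypothetical helper type (not part of the Win32 API): the two 32-bit
// halves of a 64-bit byte count.
struct SizeParts {
    std::uint32_t high;  // value that would be passed as dwMaximumSizeHigh
    std::uint32_t low;   // value that would be passed as dwMaximumSizeLow
};

// Split a 64-bit byte count into its high and low 32-bit halves.
inline SizeParts SplitSize(std::uint64_t size) {
    SizeParts parts;
    parts.high = static_cast<std::uint32_t>(size >> 32);
    parts.low = static_cast<std::uint32_t>(size & 0xFFFFFFFFu);
    return parts;
}

// Recombine the two halves into the original 64-bit byte count.
inline std::uint64_t JoinSize(SizeParts parts) {
    return (static_cast<std::uint64_t>(parts.high) << 32) | parts.low;
}
```

For a 30 GB file (30 * 2^30 bytes, or 0x780000000), SplitSize yields a high part of 7 and a low part of 0x80000000. Passing only the low part with a high part of 0 would silently describe a 2 GB mapping instead.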
(3) LPVOID MapViewOfFile(HANDLE hFileMappingObject, DWORD dwDesiredAccess, DWORD dwFileOffsetHigh, DWORD dwFileOffsetLow, SIZE_T dwNumberOfBytesToMap);
MapViewOfFile() maps the file data into the address space of the process. The parameter hFileMappingObject is the file mapping object handle returned by CreateFileMapping(). The parameter dwDesiredAccess again specifies how the file data will be accessed, and must be compatible with the protection attribute passed to CreateFileMapping(). Although the protection attribute appears to be set twice, this gives the application finer control over how the data is protected. MapViewOfFile() can map all or part of the file; when mapping, you specify the offset into the data file and the length to map. The file offset is the 64-bit value formed by the DWORD parameters dwFileOffsetHigh and dwFileOffsetLow, and it must be an integer multiple of the allocation granularity of the operating system. On Windows the allocation granularity is 64 KB. It can also be obtained dynamically from the operating system with the following code:
SYSTEM_INFO sinf;
GetSystemInfo(&sinf);
DWORD dwAllocationGranularity = sinf.dwAllocationGranularity;
The parameter dwNumberOfBytesToMap specifies the length of the file data to map. Note in particular that on Windows 9x, if MapViewOfFile() cannot find a region large enough to hold the entire file mapping object, it returns NULL. On Windows 2000, MapViewOfFile() only needs to find a region large enough for the requested view, regardless of the size of the entire file mapping object. After you have finished processing the file data mapped into the address space of the process, release the file data image with UnmapViewOfFile(), whose prototype is as follows:
(4) BOOL UnmapViewOfFile(LPCVOID lpBaseAddress);
The only parameter, lpBaseAddress, specifies the base address of the mapped region, and must be set to the return value of MapViewOfFile(). Every call to MapViewOfFile() must be matched by a corresponding call to UnmapViewOfFile(); otherwise the reserved region is not released until the process terminates.
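Because the offset passed to MapViewOfFile() must be a multiple of the allocation granularity, a caller that wants data at an arbitrary file position has to round the offset down and remember the difference. The following portable C++ sketch shows that calculation only. ViewRequest and MakeViewRequest are hypothetical names, the granularity is supplied by the caller (for example from sinf.dwAllocationGranularity), and no Windows call is made.

```cpp
#include <cstdint>

// Hypothetical description of one view (not a Win32 type).
struct ViewRequest {
    std::uint64_t base;        // aligned file offset at which the view starts
    std::uint32_t offsetHigh;  // would be passed as dwFileOffsetHigh
    std::uint32_t offsetLow;   // would be passed as dwFileOffsetLow
    std::uint64_t delta;       // distance from the view start to the wanted byte
    std::uint64_t length;      // would be passed as dwNumberOfBytesToMap
};

// Compute a view that covers [wantedOffset, wantedOffset + wantedLength).
// Precondition (assumed, not checked): wantedOffset < mappingSize and
// granularity > 0.
inline ViewRequest MakeViewRequest(std::uint64_t wantedOffset,
                                   std::uint64_t wantedLength,
                                   std::uint64_t mappingSize,
                                   std::uint32_t granularity) {
    ViewRequest r{};
    // Round the offset down to a multiple of the allocation granularity.
    r.base = wantedOffset - (wantedOffset % granularity);
    r.delta = wantedOffset - r.base;
    // The view must also cover the bytes skipped by rounding down.
    r.length = r.delta + wantedLength;
    // Never ask for bytes beyond the end of the mapping object.
    if (r.base + r.length > mappingSize) {
        r.length = mappingSize - r.base;
    }
    r.offsetHigh = static_cast<std::uint32_t>(r.base >> 32);
    r.offsetLow = static_cast<std::uint32_t>(r.base & 0xFFFFFFFFu);
    return r;
}
```

After mapping, the wanted data would begin at the returned view address plus delta, while the address returned by MapViewOfFile() itself is the one to pass to UnmapViewOfFile().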
In addition, the file kernel object and the file mapping kernel object created earlier by CreateFile() and CreateFileMapping() must be released before the process terminates, using:
(5) CloseHandle(). Otherwise a resource leak occurs.
Besides these essential API functions, other helper functions should be chosen as appropriate when using memory-mapped files. For example, to improve speed the system caches the data pages of the file, and does not update the disk image of the file immediately while the mapped view is being processed. To address this, use:
(6) FlushViewOfFile(). This function forces the system to write some or all of the modified data back to the disk image, ensuring that all data updates are saved to disk in a timely manner.
-------------------------------------------------- -------------------------------------------------- -------------------------------------------------- --------
Using a memory-mapped file to process a large file, an application example: a concrete example further illustrates how memory-mapped files are used. The example receives data from a port and stores it on disk in real time. Because the data volume is large (tens of GB), a memory-mapped file is used.
The following is part of the main code of the worker thread MainProc. The thread starts when the program runs. When data arrives at the port, event hEvent[0] is signaled; after WaitForMultipleObjects() catches this event, the received data is saved to disk. When reception ends, event hEvent[1] is signaled, and its handler releases the resources and closes the file. The implementation of the thread function is given below:
...
// Create a file kernel object; its handle is saved in hFile
HANDLE hFile = CreateFile("Recv1.zip", GENERIC_WRITE | GENERIC_READ, FILE_SHARE_READ, NULL, CREATE_ALWAYS, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
// Create a file mapping kernel object; its handle is saved in hFileMapping
HANDLE hFileMapping = CreateFileMapping(hFile, NULL, PAGE_READWRITE, 0, 0x4000000, NULL);
// Release the file kernel object
CloseHandle(hFile);
// Set the size, offset, and other parameters (sinf was filled in earlier by GetSystemInfo())
__int64 qwFileSize = 0x4000000;
__int64 qwFileOffset = 0; // total number of bytes received so far
__int64 qwViewBase = 0;   // file offset at which the current view starts
__int64 T = 600 * sinf.dwAllocationGranularity;
DWORD dwBytesInBlock = 1000 * sinf.dwAllocationGranularity;
// Map the file data into the address space of the process
PBYTE pbFile = (PBYTE)MapViewOfFile(hFileMapping, FILE_MAP_ALL_ACCESS, (DWORD)(qwViewBase >> 32), (DWORD)(qwViewBase & 0xFFFFFFFF), dwBytesInBlock);
while (bLoop)
{
    // Wait for event hEvent[0] or event hEvent[1]
    DWORD ret = WaitForMultipleObjects(2, hEvent, FALSE, INFINITE);
    ret -= WAIT_OBJECT_0;
    switch (ret)
    {
    // Receive-data event triggered
    case 0:
        // Receive data from the port and save it into the current view
        nReadLen = syio_Read(port[1], pbFile + (qwFileOffset - qwViewBase), QueueLen);
        qwFileOffset += nReadLen;
        // When the view is 60% full, open a new view to prevent data overflow
        if (qwFileOffset > T)
        {
            T = qwFileOffset + 600 * sinf.dwAllocationGranularity;
            UnmapViewOfFile(pbFile);
            // The view offset must be a multiple of the allocation granularity
            qwViewBase = qwFileOffset - (qwFileOffset % sinf.dwAllocationGranularity);
            pbFile = (PBYTE)MapViewOfFile(hFileMapping, FILE_MAP_ALL_ACCESS, (DWORD)(qwViewBase >> 32), (DWORD)(qwViewBase & 0xFFFFFFFF), dwBytesInBlock);
        }
        break;
    // Terminate event triggered
    case 1:
        bLoop = FALSE;
        // Remove the file data image from the address space of the process
        UnmapViewOfFile(pbFile);
        // Close the file mapping object
        CloseHandle(hFileMapping);
        break;
    }
}
...
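The bookkeeping behind the "60% full" rule in the loop above can be modeled separately from the Windows calls. The sketch below is a hypothetical, portable illustration: ViewWindow is not part of the original program, it only tracks where the current view starts, how much has been written (qwFileOffset in the code above), and the threshold T. It assumes the caller performs the actual UnmapViewOfFile() and MapViewOfFile() calls whenever Advance() returns true.

```cpp
#include <cstdint>

// Hypothetical bookkeeping for a sliding view; no Windows calls are made here.
struct ViewWindow {
    std::uint64_t granularity;  // allocation granularity, e.g. 65536
    std::uint64_t viewBase;     // file offset at which the current view starts
    std::uint64_t writeOffset;  // total bytes written so far
    std::uint64_t threshold;    // absolute offset that triggers a remap

    explicit ViewWindow(std::uint64_t gran)
        : granularity(gran), viewBase(0), writeOffset(0),
          threshold(600 * gran) {}

    // Position inside the current view at which the next byte belongs.
    std::uint64_t PositionInView() const { return writeOffset - viewBase; }

    // Record n received bytes. Returns true when the caller should unmap the
    // old view and map a new one that starts at viewBase.
    bool Advance(std::uint64_t n) {
        writeOffset += n;
        if (writeOffset <= threshold) {
            return false;
        }
        // The new view must start on an allocation-granularity boundary.
        viewBase = writeOffset - (writeOffset % granularity);
        threshold = writeOffset + 600 * granularity;
        return true;
    }
};
```

The point of keeping viewBase is that, after a remap, the write position is an offset within the new view rather than an offset from the start of the file.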
In the handler for the terminate event, simply calling UnmapViewOfFile() and CloseHandle() does not leave the file at its actual size. For example, if the memory-mapped file was opened at 30 GB and only 14 GB of data was received, the saved file is still 30 GB long after the program above finishes. The file therefore has to be restored to its actual size after processing is complete. The main code for this is as follows:
// Create another file kernel object
hFile2 = CreateFile("Recv.zip", GENERIC_WRITE | GENERIC_READ, FILE_SHARE_READ, NULL, CREATE_ALWAYS, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
// Create another file mapping kernel object, sized to the actual data length
hFileMapping2 = CreateFileMapping(hFile2, NULL, PAGE_READWRITE, (DWORD)(qwFileOffset >> 32), (DWORD)(qwFileOffset & 0xFFFFFFFF), NULL);
// Close the file kernel object
CloseHandle(hFile2);
// Map the file data into the address space of the process
pbFile2 = (PBYTE)MapViewOfFile(hFileMapping2, FILE_MAP_ALL_ACCESS, 0, 0, (SIZE_T)qwFileOffset);
// Copy the data from the original memory-mapped file to this memory-mapped file
// (this assumes pbFile still maps the original file from offset 0 up to qwFileOffset)
memcpy(pbFile2, pbFile, (size_t)qwFileOffset);
// Remove the file data images from the address space of the process
UnmapViewOfFile(pbFile);
UnmapViewOfFile(pbFile2);
// Close the file mapping objects
CloseHandle(hFileMapping);
CloseHandle(hFileMapping2);
// Delete the temporary file
DeleteFile("Recv1.zip");
-------------------------------------------------- -------------------------------------------------- -------------------------------------------------- --------
Conclusion:
Practical testing shows that memory-mapped files perform well when processing files with large volumes of data, and have a clear advantage over the usual file-handling approaches based on the CFile class or on ReadFile() and WriteFile().