Author: Chen Zhaohui Li Guilin
Unlike DOS/Windows, a file deleted under UNIX is difficult to recover, and this follows from the structure of the UNIX file system. In DOS/Windows, a directory entry still keeps the file name, the file length, and the starting cluster number (that is, the first disk block number) after the file is deleted. A UNIX directory does not work this way: all information about a file is described by a data structure called the i node, and the i node is cleared when the file is deleted, so directly recovering the deleted content is almost impossible. Drawing on practical experience, this article presents several file recovery strategies and their key steps.
First, UNIX file system structure
UNIX uses the file volume as the storage format of its file systems. The volume format differs between UNIX systems, and even different versions of the same UNIX operating system may not use exactly the same file system. For example, the file system structures of SCO UNIX 4.1 and 5.0 differ significantly. Still, on any UNIX system the basic structure of the file volume is the same, as analyzed below.
Whatever the UNIX system and version, a file volume contains at least a boot block, a super block, an i node table, and a data area. Individual UNIX versions add structures of their own, for example the bitmap index block of SCO UNIX and the logical volume of AIX. These system-specific features do not affect the recovery strategies below, so they are not discussed here; only the standard UNIX file volume structure is introduced.
1. Boot block
The boot block is the first sector of the file volume. Its 512 bytes hold the boot code of the file system. Only the root file system uses it; in other file systems it is empty.
2. Super block
The super block is the second sector of the file system, immediately after the boot block. It describes the structure of the file system, such as the length of the i node table and the file system size. Its structure is defined in /usr/include/sys/filsys.h as follows:
struct filsys
{
    ushort  s_isize;            /* number of data blocks occupied by the disk i node area */
    daddr_t s_fsize;            /* number of data blocks in the entire file system */
    short   s_nfree;            /* number of free blocks currently listed in s_free */
    daddr_t s_free[NICFREE];    /* free block list */
    short   s_ninode;           /* number of free i nodes currently listed in s_inode */
    ino_t   s_inode[NICINOD];   /* free i node list */
    char    s_flock;            /* lock flag */
    char    s_ilock;            /* i node lock flag */
    char    s_fmod;             /* super block modified flag */
    char    s_ronly;            /* file system read-only flag */
    time_t  s_time;             /* time of last super block modification */
    short   s_dinfo[4];         /* device information */
    daddr_t s_tfree;            /* total number of free blocks */
    ino_t   s_tinode;           /* total number of free i nodes */
    char    s_fname[6];         /* file system name */
    char    s_fpack[6];         /* file system pack name */
    long    s_fill[13];         /* padding */
    long    s_magic;            /* magic number identifying the file system */
    long    s_type;             /* new file system type */
};
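3. I node table
The i node table is stored immediately after the super block, and its length is given by the s_isize field of the super block. It describes each file's attributes, length, owner, group, data block address table, and so on. Its data structure is defined in /usr/include/sys/ino.h as follows:
struct dinode
{
    ushort di_mode;      /* file type and access permissions */
    short  di_nlink;     /* number of links to the file */
    ushort di_uid;       /* owner */
    ushort di_gid;       /* group */
    off_t  di_size;      /* file length in bytes */
    char   di_addr[40];  /* data block address table */
    time_t di_atime;     /* time of last access */
    time_t di_mtime;     /* time of last modification */
    time_t di_ctime;     /* time of last i node change */
};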
4. Directory structure
Every file is stored in a directory, and a directory is itself a file. A file is reached through a directory as follows: first, the directory file, like an ordinary file, occupies an i node; second, that i node gives the location of the directory's contents; third, a file name and its corresponding i node number are read from those contents, and the file is accessed through that i node. The directory structure is as follows:
i node number (2 bytes)    .  (this directory)      (14 bytes)
i node number (2 bytes)    .. (parent directory)    (14 bytes)
i node number (2 bytes)    file name                (14 bytes)
i node number (2 bytes)    file name                (14 bytes)
i node number (2 bytes)    file name                (14 bytes)
A directory entry therefore holds only the file name and the i node number; the content and all other information about the file are described by the i node.
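To make the 16-byte entry layout concrete, here is a minimal sketch in C of looking up a name in a raw directory block. The names `dir_entry`, `read_entry`, and `lookup_name` are hypothetical. The sketch assumes that the 2-byte i node number is stored in the host's byte order and that an i node number of 0 marks an unused slot; neither detail is stated above, and both may vary between systems.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NAME_LEN   14   /* file name field, as in the layout above */
#define ENTRY_SIZE 16   /* 2-byte i node number + 14-byte name */

/* Hypothetical in-memory view of one directory entry. */
struct dir_entry {
    uint16_t ino;                /* i node number */
    char name[NAME_LEN + 1];     /* NUL-terminated copy of the name */
};

/* Decode entry number idx from a raw directory block. */
static struct dir_entry read_entry(const unsigned char *block, size_t idx)
{
    struct dir_entry e;
    const unsigned char *p = block + idx * ENTRY_SIZE;

    memcpy(&e.ino, p, 2);               /* assumption: host byte order */
    memcpy(e.name, p + 2, NAME_LEN);
    e.name[NAME_LEN] = '\0';            /* a 14-character name has no terminator on disk */
    return e;
}

/* Return the i node number recorded for name, or 0 if it is not found. */
uint16_t lookup_name(const unsigned char *block, size_t nbytes, const char *name)
{
    size_t count = nbytes / ENTRY_SIZE;
    size_t i;

    for (i = 0; i < count; i++) {
        struct dir_entry e = read_entry(block, i);
        if (e.ino != 0 && strcmp(e.name, name) == 0)   /* assumption: 0 = unused slot */
            return e.ino;
    }
    return 0;
}
```

Calling `lookup_name(block, nbytes, "notes.txt")` on the bytes of a directory file would return the i node number stored next to that name, which is the only link between the name and the rest of the file's metadata.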
Second, the deletion process of the file
Deleting a file under UNIX is simple: the system releases the data blocks the file occupied and its entry in the i node table, and clears the i node, but it does not clear the file content. Deleting a file differs from deleting a directory, however, and different commands delete files in different ways.
1. Delete a file
To delete a file, UNIX releases the disk data blocks recorded in the file's i node, then clears the i node, and finally releases the i node.
2. Delete a directory
To delete a directory, UNIX first deletes the files in the directory one by one and then deletes the directory itself. Because a directory is also a file, it is deleted in the same way as a file.
3. Several different deletion commands
rm command
The ordinary delete command; its deletion process is described above.
mv command
Format: mv file1 file2
If file2 already exists, its data blocks are released, file1 is renamed to file2, and the i node of the old file2 is then released.
> command
Format: > filename
If this creates a new file, the > command only allocates an i node and writes no content. If it empties an existing file, the data blocks occupied by the file are released and the file length is set to zero.
Third, the recovery strategy of deleted files
A deleted file can only be recovered from what the deletion leaves behind. What remains after a file is deleted? The analysis above shows two things: first, the content of the file; second, a "site", that is, the state of the disk at the moment of deletion. Recovery strategies can only build on these two. Several are described below.
1. Recovery according to the disk site
If the site is intact after the deletion (that is, nothing has been written to the hard disk since the file was deleted), and only one file was deleted, the file can be recovered by replaying the system's allocation algorithm. When the system creates a file, it chooses the data blocks for the file according to a particular allocation algorithm. When the file is deleted, those data blocks are released and returned to the system's allocation table. If a new file is then created, the same allocation algorithm hands out the blocks the original file occupied. We also know that the unused bytes at the tail of the last data block of a UNIX file are all 0. So it suffices to call the system's block allocation algorithm repeatedly, requesting one block at a time: as soon as the tail of an allocated block is all 0, the file can be considered to have ended. This determines the length and content of the file and so achieves the recovery. The method is as follows:
(1) Allocate an i node by creating a new, empty file without writing anything to it, for example: # > /tmp/xx
(2) Call the system's data block allocation algorithm, getNextFreeBlock(), to obtain a data block number, and record it in an address table variable.
(3) Read this data block and check whether its tail is all zeros. If not, return to (2); if so, go to (4).
(4) Use the system function fstat to obtain the i node number of /tmp/xx, then write the address table built in (2) into the address table of that i node (taking care with how addresses are encoded there), compute the file size from the number of data blocks and the length of valid data in the last block, and write it into the di_size field of the i node.
(5) Update the system i node table.
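The end-of-file test in step (3) and the size calculation in step (4) can be sketched in C as follows. This is an illustration only: the function names are hypothetical, and BSIZE is an assumed block size of 1024 bytes, whereas the real value depends on the file system.

```c
#include <stddef.h>

#define BSIZE 1024   /* assumed block size; the real value depends on the file system */

/* Length of valid data in a block: the offset just past its last nonzero byte. */
size_t effective_length(const unsigned char *block)
{
    size_t n = BSIZE;

    while (n > 0 && block[n - 1] == 0)
        n--;
    return n;
}

/* Step (3): a block whose tail is zero-filled is taken to be the last block. */
int looks_like_last_block(const unsigned char *block)
{
    return effective_length(block) < BSIZE;
}

/* Step (4): file size from the number of blocks collected and the last block. */
long recovered_size(size_t nblocks, const unsigned char *last_block)
{
    if (nblocks == 0)
        return 0;
    return (long)(nblocks - 1) * BSIZE + (long)effective_length(last_block);
}
```

The heuristic has known blind spots: a file whose length is an exact multiple of the block size has no zero tail, and a file that legitimately ends in zero bytes is reported shorter than it was.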
Two points should be noted. First, the algorithm for allocating data blocks differs between UNIX versions. Second, some UNIX systems, such as SCO UNIX 5.0, allocate and reclaim free data blocks with a dynamic linked list; recovery is easier on them, because it is enough to search backward from the tail of the free list. The authors will describe this separately.
2. Recovery according to content
If the site has been destroyed, that is, the hard disk has been written to since the deletion, the file can only be recovered from its content. Because UNIX is a multi-process, multi-user system, the disk is written to constantly: every startup or shutdown, every hardware or communication fault, and every update of files such as .sh_history writes to the hard disk and destroys the site. Recovery according to content is therefore of greater practical value. The following four strategies, obtained through practical exploration, are offered for reference.
(1) Keyword search method
If several bytes of the deleted file's content are known and the file is no longer than one disk block, this byte string can be searched for across the whole file system to find the data block holding the file. Writing that block number into an i node restores the file. The search algorithm is simple and is described as follows:
a. Run # df -k to determine the device file name of the file system (such as /dev/root).
b. Search with the following function. On success it returns the data block number; otherwise it returns -1. Here fsname is the device name of the file system, such as /dev/root, and comp is a function that implements the search criterion; it is passed the block just read.
#include <stdio.h>

long searchfs(char *fsname, int (*comp)(char *))
{
    FILE *fp;
    char buf[1024];
    long i = 0;

    fp = fopen(fsname, "r");
    if (fp == NULL)
        return -1;                          /* cannot open the device */
    while (fread(buf, 1024, 1, fp) == 1)    /* stop at end of device or on a read error */
    {
        if (comp(buf))                      /* check whether the block meets the search criterion */
        {
            fclose(fp);
            return i;                       /* success: return the block number */
        }
        i++;
    }
    fclose(fp);
    return -1;                              /* no matching block found */
}
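A search criterion for the keyword method can be as simple as a byte-string match within one block. The sketch below is an illustration rather than the authors' code; `block_contains` and its parameters are hypothetical names. It uses memcmp rather than the string functions, because a disk block may contain zero bytes.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the klen-byte string key occurs anywhere in the blen-byte block, else 0. */
int block_contains(const char *block, size_t blen, const char *key, size_t klen)
{
    size_t i;

    if (klen == 0 || klen > blen)
        return 0;                        /* nothing to search for, or key cannot fit */
    for (i = 0; i + klen <= blen; i++) {
        if (memcmp(block + i, key, klen) == 0)
            return 1;
    }
    return 0;
}
```

A keyword that straddles two blocks is not detected by a per-block test like this one, which fits the method's stated restriction to files no longer than one block.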
(2) Exact length search
If the exact length of the file in bytes is known, the exact amount of data in the file's last data block can be computed from the block size, and the remaining bytes of that block must all be 0. The whole file system can be searched for data blocks that satisfy this condition. If several blocks qualify, other conditions are needed to tell them apart. Even so, analysis of the exact length is a workable strategy for recovering data.
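As a sketch of the exact-length condition in C, the following computes how many bytes of the last block hold data and tests whether a candidate block is zero from that offset on. The function names are hypothetical, and the block size is passed in as a parameter because it depends on the file system.

```c
#include <stddef.h>

/* Bytes of file data expected in the last block of a file of file_len bytes.
 * Assumes file_len > 0. */
size_t tail_bytes(long file_len, size_t bsize)
{
    size_t r = (size_t)(file_len % (long)bsize);

    return r == 0 ? bsize : r;   /* an exact multiple fills the last block completely */
}

/* Return 1 if block could be the last block of a file of file_len bytes. */
int matches_length(const unsigned char *block, size_t bsize, long file_len)
{
    size_t used, i;

    if (file_len <= 0)
        return 0;
    used = tail_bytes(file_len, bsize);
    for (i = used; i < bsize; i++) {
        if (block[i] != 0)       /* data beyond the expected end: not a match */
            return 0;
    }
    return 1;
}
```

For a 2524-byte file with 1024-byte blocks, the last block holds 476 bytes of data. The test is weak when the length is an exact multiple of the block size, since every block then matches; requiring the byte just before the expected end to be nonzero is one possible extra filter for text files.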
(3) Content association method
If the file content is known to carry some relationship that can be checked by a program, such as a checksum of the file or a contextual relationship within the content, the whole file system can likewise be searched, trying repeatedly until the disk data blocks that satisfy the relationship are found, and the file can then be restored.
(4) Environment comparison method
If the installation history of the file system that held the file is known, find an identical machine and install the same version of UNIX and the same software by exactly the original steps. The new machine's environment should then be essentially the same as the original one. Comparing the contents of the same file system on the two machines indicates the approximate location of the deleted file, or at least greatly narrows the search range. Once the range is small enough, the candidate blocks can be examined and tried one by one, using other conditions, to restore the data; this reduces the difficulty of recovery and raises the chance of success.
The specific implementation of file recovery under UNIX depends on the file system structure and the disk block allocation algorithm of the particular operating system and version. This article has tried to summarize general ideas and strategies; for reasons of space, it cannot discuss their implementation in detail.
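One way to picture the repeated trying is a checksum test over candidate blocks, sketched below in C. Everything here is hypothetical: `sum16` is a stand-in additive checksum chosen for brevity, and a real file would have to be checked with whatever scheme it actually uses.

```c
#include <stddef.h>

/* Hypothetical checksum: running sum of bytes modulo 2^16, continued from start. */
unsigned int sum16(const unsigned char *data, size_t len, unsigned int start)
{
    unsigned int s = start;
    size_t i;

    for (i = 0; i < len; i++)
        s = (s + data[i]) & 0xFFFFu;
    return s;
}

/*
 * prefix_sum is the checksum of the blocks already recovered. Try each
 * candidate as the next block and return the index of the first one that
 * brings the running checksum to expected, or -1 if none does.
 */
int pick_candidate(unsigned int prefix_sum, const unsigned char *cands[],
                   size_t ncands, size_t len, unsigned int expected)
{
    size_t i;

    for (i = 0; i < ncands; i++) {
        if (sum16(cands[i], len, prefix_sum) == expected)
            return (int)i;
    }
    return -1;
}
```

A checksum this weak produces false matches easily, so a hit only narrows the candidates; it does not prove that the right block was found.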
Excerpt from:
ccpi.gov.cn