Linux Backup and Restore: A Quick Guide
Level: Getting Started
Chris Walden (cmwalden-at-us.ibm.com), e-commerce architect, IBM Developer Relations, January 2004
IBM e-commerce architect Chris Walden is your guide through a nine-part developerWorks series on applying your Windows operating skills in a Linux environment. This installment looks at what is on a Linux system and at how to plan and implement routine backups with recovery and security in mind.
Linux is a stable and reliable environment. But any computing system can suffer unforeseen events, such as hardware failure. A reliable backup of critical configuration information and data is an integral part of any responsible administration plan. There is a wide variety of ways to perform backups in Linux. The techniques range from very simple script-driven methods to elaborate commercial software. Backups can be saved to remote network devices, tape drives, and other removable media. Backups can be file-based or drive-image based. Among the available options, you can mix and match these techniques to design the ideal backup plan for your environment.
Decide on a strategy
There are many different approaches to backing up a system. For some perspectives on this, read the article "Introduction to Backing Up and Restoring Data" listed in the references at the end of this article.
What you back up depends largely on why you are backing up. Are you trying to recover from a critical failure, such as a hard drive problem? Are you archiving so that old files can be recovered when needed? Do you plan to restore onto a cold system, or onto a preloaded standby system?
Deciding what to back up
The file-based nature of Linux is a great advantage when backing up and restoring the system. In a Windows system, the registry is very system specific. Configuration and software installation are not simply a matter of placing files on the system, so restoring a system requires software that can deal with these Windows idiosyncrasies. In Linux, the story is different. Configuration files are text based and, except where they deal directly with hardware, are largely system independent. The modern approach to hardware drivers is to make them available as dynamically loaded modules, so kernels have become more system independent as well. Rather than a backup having to deal with the intricacies of how the operating system is installed on the system and hardware, Linux backups are about packaging and unpackaging files.
In general, the following directories should be backed up:
/etc contains all of the core configuration files. This includes network configuration, system name, firewall rules, users, groups, and other global system items.
/var contains information used by the system's daemons (services), including DNS configuration, DHCP leases, mail spool files, HTTP server files, DB2 instance configuration, and more.
/home contains the default home directories for all users. This includes their personal settings, downloaded files, and other information users do not want to lose.
/root is the home directory of the root user.
/opt is where much non-system software is installed. IBM software goes here. OpenOffice, the JDK, and other software are also installed here by default.
Some directories should be considered for exclusion from backups:
/proc should never be backed up. It is not a real file system, but a virtualized view of the running kernel and environment. It includes files such as /proc/kcore, a virtual view of the entire running memory. Backing these up only wastes resources.
/dev contains the file representations of hardware devices. If you plan to restore to a blank system, you can back up /dev. However, if you plan to restore to an installed Linux base, backing up /dev is unnecessary.
The other directories contain system files and installed packages. In a server environment, much of this information is not customized. Most customization occurs in the /etc and /home directories. For completeness, however, you may wish to back them up as well. In a production environment where I wanted to be sure no data would be lost, I would back up the entire system except for the /proc directory. If I were mostly worried about users and configuration, I would back up only the /etc, /var, /home, and /root directories.
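The configuration-only backup described above can be sketched as a small shell helper. This is a minimal sketch, assuming GNU tar; the function name backup_dirs and the destination path in the example are hypothetical, not something the article prescribes.

```shell
#!/bin/sh
# Sketch: archive a chosen set of directories under a root into one tar file.
# backup_dirs is a hypothetical helper name; passing the root as a parameter
# lets you try it on a test tree before pointing it at /.

backup_dirs() {
    archive=$1      # destination archive file
    root=$2         # tree to back up from ("/" on a real system)
    shift 2         # remaining arguments: directories relative to $root
    # -c create, -p keep permissions, -f archive file;
    # -C makes the stored paths relative to $root.
    tar -cpf "$archive" -C "$root" "$@"
}

# Example (run as root on a real system; /mnt/backup is an assumed path):
#   backup_dirs /mnt/backup/config.tar / etc var home root
```

Storing paths relative to the root keeps the archive easy to unpack into a scratch directory for inspection.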
Backup tools
As mentioned earlier, Linux backups are largely about packaging and unpackaging files. This lets you use existing system utilities and scripts to perform backups rather than purchasing a commercial package. In many cases, this kind of backup is adequate, and it gives the administrator a great deal of control. The backup script can be automated with the cron command, which controls scheduled events in Linux.
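As a rough illustration of scheduling, the sketch below builds a date-stamped archive name so that nightly runs do not overwrite each other, and shows in a comment what a crontab entry might look like. The function name archive_name and every path here are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: date-stamped archive names for scheduled backups.
# archive_name and the paths below are hypothetical.

archive_name() {
    prefix=$1                       # e.g. /mnt/backup/config
    # date +%Y%m%d prints the date as eight digits, e.g. 20040115.
    echo "${prefix}-$(date +%Y%m%d).tar"
}

# A crontab entry (edited with "crontab -e") that would run a backup script
# at 02:30 every night; /usr/local/sbin/nightly-backup.sh is a made-up path:
#   30 2 * * * /usr/local/sbin/nightly-backup.sh
```

The five leading crontab fields are minute, hour, day of month, month, and day of week.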
tar
tar is a classic UNIX command that has been ported to Linux. tar is short for tape archive and was originally designed for packaging files onto tape. If you have downloaded any Linux source code, you have probably already encountered tar files. It is a file-based command that essentially stacks files end to end, one after another.
An entire directory tree can be packaged with tar, which makes it especially suited to backups. Archives can be restored in their entirety, or individual files and directories can be extracted from them. Backups can go to file-based devices or tape devices. Files can be redirected on restore, so that they are placed in a different directory (or system) from the one where they were originally saved. tar is file system independent. It can be used on ext2, ext3, JFS, Reiser, and other file systems.
Using tar is very similar to using a file utility such as PKZip. You point it toward a destination, which can be a file or a device, and then name the files you want to package. You can compress archives on the fly with standard compression types, or specify an external compression program of your choice. With GNU tar, the -z switch compresses or uncompresses the archive through gzip, and the -j switch does the same through bzip2.
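On-the-fly compression might look like the following. This is a minimal sketch assuming GNU tar, where -z selects gzip and -j selects bzip2; the helper name compress_dir is hypothetical.

```shell
#!/bin/sh
# Sketch: create compressed archives of a directory with GNU tar.
# -z filters the archive through gzip, -j through bzip2 (bzip2 must be
# installed for the second case). compress_dir is a hypothetical helper.

compress_dir() {
    method=$1    # "gzip" or "bzip2"
    archive=$2   # output file
    dir=$3       # directory to archive
    case $method in
        gzip)  tar -czpf "$archive" -C "$dir" . ;;
        bzip2) tar -cjpf "$archive" -C "$dir" . ;;
        *)     echo "unknown method: $method" >&2; return 1 ;;
    esac
}

# Example:
#   compress_dir gzip /tmp/etc.tar.gz /etc
#   tar -tzf /tmp/etc.tar.gz      # list the compressed archive
```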
To use tar to back up the entire file system, except for the /proc directory, to a SCSI tape device:
tar -cpf /dev/st0 / --exclude=/proc
In the example above, the -c switch indicates that an archive is being created. The -p switch indicates that we want to preserve file permissions, which is critical for a good backup. The -f switch gives the file name of the archive. In this example, we are using the raw tape device, /dev/st0. The / indicates what we want to back up. Since we want the entire file system, we specify the root. tar automatically recurses when pointed at a directory (a name ending in /). Finally, we exclude the /proc directory, since it contains nothing we need to save. If the backup will not fit on a single tape, we would add the -M switch (not shown) for a multi-volume backup.
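The same exclusion idea can be tried safely on a scratch directory before pointing it at a real system. The sketch below assumes GNU tar; backup_except is a hypothetical helper, and writing to a file instead of a tape device is a choice made only for the example.

```shell
#!/bin/sh
# Sketch: archive a tree while leaving out one top-level subdirectory.
# backup_except is a hypothetical helper; on a real system the tree would
# be / and the excluded directory proc.

backup_except() {
    archive=$1   # output archive (a file, or a tape device such as /dev/st0)
    root=$2      # tree to archive
    skip=$3      # top-level directory to leave out, e.g. proc
    # Paths are stored relative to $root as ./name, so the exclude pattern
    # is written the same way.
    tar -cpf "$archive" --exclude="./$skip" -C "$root" .
}

# Example:
#   backup_except /mnt/backup/full.tar / proc
```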
Just in case: do not forget that Linux is case sensitive. For example, the tar command should always be typed in lowercase. Command line switches may be uppercase, lowercase, or a mixture. For example, -t and -T perform different functions. File and directory names can be mixed case and, like commands and command line switches, are case sensitive.
To restore a file or files, use the tar command with the extract switch (-x):
tar -xpf /dev/st0 -C /
The -f switch again points to our archive, and -p indicates that we want to restore the archived permissions. The -x switch indicates an extraction from the archive. -C / indicates that we want the restore to start from /. tar normally restores into the directory from which the command is run; the -C switch makes the current directory irrelevant. Two other tar switches you will probably use often are -t and -d. The -t switch lists the contents of an archive. The -d switch compares the contents of the archive with the current files on the system.
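The -t and -d switches lend themselves to a simple post-backup check. The following is a minimal sketch assuming GNU tar, whose compare mode exits with a non-zero status when files differ; verify_archive is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: verify an archive against the files it was made from (GNU tar).
# verify_archive is a hypothetical helper.

verify_archive() {
    archive=$1   # archive to check
    root=$2      # directory the archive was created relative to
    # -t lists the members; the listing is discarded, only success matters.
    tar -tf "$archive" >/dev/null || return 1
    # -d (--compare) reports differences and exits non-zero if any file
    # on disk no longer matches the archived copy.
    tar -df "$archive" -C "$root"
}

# Example:
#   verify_archive /mnt/backup/config.tar / && echo "backup matches"
```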
For ease of operation and editing, you can put the files and directories you want to archive in a text file and reference that file on the command line with the -T switch. These can be combined with other files and directories listed on the command line. The following line backs up all the files and directories listed in myfiles, the /root directory, and all of the ISO files in the /tmp directory:
tar -cpf /dev/st0 -T myfiles /root /tmp/*.iso
The file list is simply a text file with one file or directory per line. Here is an example:
/etc
/var
/home
/usr/local
/opt
Note that the tar -T (or --files-from) switch cannot accept wildcards. Files must be listed explicitly. The example above shows one way to reference files separately. You could also run a script to search the system and build the list. Here is an example of such a script:
#!/bin/sh
cat myfiles > templist
find /usr/share -iname '*.png' >> templist
find /tmp -iname '*.iso' >> templist
tar -cpzf /dev/st0 -T templist
The above script first copies the existing file list from myfiles to templist. It then runs two find commands to search the file system for files that match a pattern and append them to templist. The first search is for all files in the /usr/share directory tree that end in .png. The second is for all files in the /tmp directory tree that end in .iso. The patterns are quoted so that the shell passes them to find unexpanded. Once the list is built, tar creates a new archive on the file device /dev/st0 (the first SCSI tape device), compressed in gzip format and retaining all file permissions. The names of the files to be archived are taken from the file templist. GNU tar cannot combine compression with a multi-volume (-M) archive, so if the backup has to span several tapes, drop -z and add -M instead.
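A list file with one name per line breaks on file names that contain newlines, and is easy to get wrong with spaces. As an alternative sketch, assuming GNU find and GNU tar, the search results can be piped to tar as a NUL-separated list with no temporary file; archive_matching is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: feed find results straight to GNU tar, NUL-separated, so file
# names with spaces or newlines survive. archive_matching is hypothetical.

archive_matching() {
    archive=$1   # output archive
    dir=$2       # directory tree to search
    pattern=$3   # case-insensitive name pattern, e.g. '*.png'
    # Quoting "$pattern" stops the shell from expanding it before find runs.
    # --null must come before -T; "-T -" reads the list from standard input.
    find "$dir" -type f -iname "$pattern" -print0 |
        tar -cpf "$archive" --null -T -
}

# Example:
#   archive_matching /mnt/backup/png.tar /usr/share '*.png'
```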
Scripting can also be used to perform much more elaborate tasks, such as incremental backups. Gerhard Mourani provides an excellent script in his book Securing and Optimizing Linux; details of the book are in the references at the end of this article.
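The sketch below is not the script from the book. It is a rough illustration of the same idea using a different mechanism, GNU tar's --listed-incremental option, which keeps its own record of what has already been saved. The helper name incremental_backup and the file names in the example are assumptions.

```shell
#!/bin/sh
# Sketch: incremental backups with GNU tar's --listed-incremental option.
# incremental_backup is a hypothetical helper.

incremental_backup() {
    archive=$1    # archive to write for this run
    snapshot=$2   # snapshot file where tar records what it has seen
    dir=$3        # directory to back up
    # If $snapshot does not exist yet, this run is a full backup;
    # otherwise only files changed since the snapshot are stored.
    # tar updates the snapshot file on every run.
    tar -cpf "$archive" --listed-incremental="$snapshot" -C "$dir" .
}

# Example (paths are made up):
#   incremental_backup /mnt/backup/home-full.tar /mnt/backup/home.snar /home
#   incremental_backup /mnt/backup/home-mon.tar  /mnt/backup/home.snar /home
```

To start a new cycle with a fresh full backup, delete or rename the snapshot file before the next run.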
Scripts can also be written to restore files, though restoration is often done manually. As mentioned above, for extraction the -x switch replaces the -c switch. You can restore an entire archive, or name individual files or directories to restore. Wildcards can be used to refer to files in the archive. Backups can also be made and restored with the dump and restore commands.
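Restoring only the members that match a pattern could be sketched as follows. This assumes GNU tar, which needs the --wildcards switch before it treats member names as patterns on extraction; restore_matching is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: pull only the members matching a pattern out of an archive.
# restore_matching is a hypothetical helper.

restore_matching() {
    archive=$1   # archive to read
    dest=$2      # directory to restore into (created if missing)
    pattern=$3   # member-name pattern, e.g. '*.conf'
    mkdir -p "$dest"
    # -x extract, -p restore permissions, -C change to $dest first.
    # The pattern is quoted so tar, not the shell, does the matching.
    tar -xpf "$archive" -C "$dest" --wildcards "$pattern"
}

# Example: restore every .conf file into a scratch directory for review.
#   restore_matching /mnt/backup/config.tar /tmp/restore '*.conf'
```

Restoring into a scratch directory first, rather than straight over /, leaves a chance to inspect the files before copying them into place.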
dump and restore
dump can perform functions similar to tar. However, dump tends to look at file systems rather than individual files. Quoting from the dump man page: "dump examines files on an ext2 filesystem and determines which files need to be backed up. These files are copied to the given disk, tape, or other storage medium for safe keeping. ... A dump that is larger than the output medium is broken into multiple volumes. On most media the size is determined by writing until an end-of-media indication is returned."
The companion program to dump is restore, which is used to restore files from a dump image.
The restore command performs the inverse function of dump. A full backup of a file system can be restored first, and subsequent incremental backups layered on top of it. Single files and directory subtrees can be restored from full or partial backups.
Both dump and restore can be run across the network, so you can back up to or restore from remote devices. dump and restore work with tape drives and file devices and provide a wide variety of options. However, both are limited to the ext2 and ext3 file systems. If you are working with JFS, Reiser, or other file systems, you will need a different utility, such as tar.
Backing up with dump
Running a backup with dump is fairly straightforward. The following commands perform a full backup of a Linux system, with all of its ext2 and ext3 file systems, to a SCSI tape device:
dump 0f /dev/nst0 /boot
dump 0f /dev/nst0 /
In this example, the system has two file systems, one for /boot and another for /, a common configuration. They must be referenced individually when a backup is performed. /dev/nst0 refers to the first SCSI tape drive, but in non-rewinding mode. This ensures that the volumes are placed one after the other on the tape.
An interesting feature of dump is its built-in incremental backup capability. In the example above, the 0 indicates a level 0, or base-level, backup. This is the full system backup that you would perform periodically to capture the entire system. For subsequent backups you can use other numbers (1-9) in place of the 0 to change the backup level. A level 1 backup saves all files that have changed since the level 0 backup was made. A level 2 backup saves everything that has changed since the level 1 backup, and so on. The same can be done with tar and scripting, but the script writer then has to provide a mechanism for determining when the last backup was made. dump has its own mechanism: when run with the -u switch, it records each backup in an update file, /etc/dumpdates. The update file is reset whenever a level 0 backup is performed. Backups at subsequent levels keep their marks until another level 0 backup is made. If you are making a tape-based backup, dump automatically tracks multiple volumes.
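One way to put the levels to work is a weekly cycle: a full backup on Sunday and a higher level on each following day. The sketch below only prints the dump command it would run, so it can be tried without a tape drive. The schedule, and the helper names dump_level and dump_command, are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: choose a dump level from the day of the week and print the dump
# command that would run. dump_level and dump_command are hypothetical
# helpers; full on Sunday and levels 1-6 afterwards is one possible scheme.

dump_level() {
    day=$1   # day of week as printed by "date +%u": 1 = Monday ... 7 = Sunday
    if [ "$day" -eq 7 ]; then
        echo 0            # Sunday: full (level 0) backup
    else
        echo "$day"       # Monday..Saturday: levels 1..6
    fi
}

dump_command() {
    day=$1; device=$2; fs=$3
    # -u records the run in /etc/dumpdates so later levels know what changed.
    echo "dump -$(dump_level "$day")u -f $device $fs"
}

# Example: print today's command for the root file system.
#   dump_command "$(date +%u)" /dev/nst0 /
```

With this scheme each weekday backup holds only what changed since the previous day's backup, so a full restore needs the Sunday tape plus every daily volume since.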
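Skipping files
Files and directories can be marked so that dump skips them. The command for this is chattr, which changes the extended attributes on ext2 and ext3 file systems. Setting the no-dump attribute with chattr +d, followed by the file or directory name, tells dump to leave that item out of the backup.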
Restoring with restore
To restore information saved with dump, use the restore command. Like tar, restore can list (-t) the contents of an archive and compare them with the current files (-C). Where you must be careful with restore is in restoring data. There are two very different approaches, and you must use the correct one to get predictable results.
Rebuild (-r)
Remember that dump is designed with file systems in mind more than individual files. There are therefore two different styles of restoring files. To rebuild a file system, use the -r command line switch. Rebuild is designed to work on an empty file system and restore it to the saved state. Before running a rebuild, you should have created, formatted, and mounted the file system. Do not run a rebuild on a file system that contains files.
Here is an example of performing full reconstruction using the dump described above.
restore -rf /dev/nst0
The above command needs to be executed separately for each file system to be restored.
This process can be repeated to add incremental backups when needed.
Extract (-x)
If you need to work with individual files rather than the entire file system, you must use the -x switch to extract them. For example, to extract only the /etc directory from our tape backup, use the following command:
restore -xf /dev/nst0 /etc
Interactive restore (-i)
Another feature that restore provides is an interactive mode. Using the command:
restore -if /dev/nst0
places you in an interactive shell that shows the items contained in the archive. Typing "help" displays a list of commands. You can then browse and select the items you wish to extract. Bear in mind that any files you extract go into your current directory.
dump vs. tar
Both dump and tar have their followings, and both have advantages and disadvantages. If you are running anything other than an ext2 or ext3 file system, dump is not available to you. If that is not the case, however, dump can be run with a minimum of scripting, and it has an interactive mode available to assist with restoration.
I tend to use tar, because I am fond of scripting for that extra level of control. There are also multi-platform tools for working with .tar files.
Other tools
Virtually any program that can copy files can be used to perform some sort of backup in Linux. Some people use cpio and dd for backups. cpio is another packaging utility along the lines of tar, though it is much less common. dd is a file system copy utility that makes binary copies of file systems. dd can also be used to make an image of a hard drive, similar to products such as Symantec's Ghost. However, dd is not file based, so you can only use it to restore data to an identical hard drive partition.
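A byte-for-byte copy with dd can be sketched as below. A regular file stands in for a partition so the example is safe to run; the helper name image_copy and the device name in the comment are assumptions.

```shell
#!/bin/sh
# Sketch: make and verify a byte-for-byte image with dd. A regular file
# stands in for a partition here; image_copy is a hypothetical helper.

image_copy() {
    source=$1   # partition or file to read, e.g. /dev/sda1 (assumed name)
    image=$2    # image file to write
    # if= input, of= output, bs= block size used for each read and write.
    dd if="$source" of="$image" bs=64k 2>/dev/null || return 1
    # cmp -s is silent and exits 0 only if the two are identical.
    cmp -s "$source" "$image"
}

# Example (the partition should be unmounted or mounted read-only):
#   image_copy /dev/sda1 /mnt/backup/sda1.img
```

Restoring is the same command with the arguments to if= and of= swapped, written to a partition of identical size.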
Commercial backup products
There are several commercial backup products available for Linux. Commercial products generally provide a convenient interface and reporting system, whereas with tools such as dump and tar you have to roll your own. The commercial offerings are broad and typically provide a large number of features. The biggest benefit of a commercial package is a pre-built strategy for handling backups that you can put to work right away. Commercial developers have already made many of the mistakes you are about to make, and the cost of their wisdom is cheap compared with the loss of your precious data.
Tivoli Storage Manager
Tivoli Storage Manager is perhaps the best commercial backup and storage management utility available for Linux. The Tivoli Storage Manager server runs on several platforms, including Linux, and the client runs on many more.
Essentially, a Storage Manager server is configured with the devices appropriate for backing up the environment. Any system that is to take part in the backups loads a client that communicates with the server. Backups can be run on a schedule, performed manually from the Tivoli Storage Manager client interface, or performed remotely through a web-based interface.
The policy-based nature of TSM means that central rules can be defined for backup behavior without constantly having to adjust a file list. In addition, IBM Tivoli Storage Resource Manager can identify, evaluate, control, and predict the utilization of enterprise storage assets, and can detect potential problems and automatically apply self-healing adjustments. For more details, see the Tivoli Web site (link in the references).
Figure 1. Tivoli Storage Manager menu
The backup and restore are then processed by a remote device.
Figure 2. Tivoli Storage Manager interface
Preview and review
The first step in having a good backup is having a plan. Know what data you need to save and what your recovery strategy is, then use the tools that best fit that strategy.
Linux comes with some useful backup tools out of the box. Two of the most common are tar and dump/restore. Both can perform full system backups. With creative scripting, you can design a custom scheme to back up your systems both locally and remotely.
However, writing your own backup scripts can be a heavy responsibility, especially for complex enterprises. Commercial software such as Tivoli Storage Manager reduces the learning curve and lets you take control of your backups right away, but you may have to adjust your strategy to fit the capabilities of the tool.
References
Read the other parts of the Windows-to-Linux roadmap series (developerWorks, November 2003).
Linux Administrator's Security Guide is a security guide with a very good section on Linux backup and restore practices.
Introduction to Backing Up and Restoring Data is an overview that is independent of operating system or system architecture. The author discusses backup techniques and how to plan backups.
Linux Administration Made Easy is an older reference, but still useful, because the general processes and techniques of Linux have remained consistent.
The Linux System Administrator's Guide introduces Linux system administration to beginners.
Securing and Optimizing Linux - A Hands On Guide (Red Hat Edition), Chapter 7, "Backup and Restore," is another good guide and includes a script for performing tar-based incremental backups.
The Tao of Backup is an interesting presentation of backup philosophy in the form of philosophical teachings. Although it is tied to a commercial product, the information is very good.
The IBM developerWorks tutorial "Linux Machine Backup" walks you through creating and implementing a backup strategy.
Storing data and other content on CD is easy; the IBM developerWorks article "Burning CDs on Linux" shows how.
If you are moving from a Windows environment to Linux, also read the technical FAQ for Linux users.
Tivoli Storage Manager was rated best storage solution at LinuxWorld 2003.
Learn more about Tivoli Storage Manager for Linux in the Linux area of the IBM site.
The Tivoli product page contains more information about Tivoli, including security and privacy features.
Chapter 3 of Introduction to Linux, from the Linux Documentation Project, discusses file permissions and security.
More resources for Linux developers can be found in the developerWorks Linux zone.
About the author
Chris Walden is an e-commerce architect with IBM Developer Relations Technical Consulting (also known as the dragonslayers), which provides education, enablement, and consulting to IBM business partners. He is dedicated to Linux-related work and takes every opportunity to promote the benefits of Linux to the people around him. In addition to his architect duties, he is well versed in many areas of Linux infrastructure servers, including file, print, and other application services in mixed-platform user environments. Chris has 10 years of experience in the computer industry, with work ranging from on-site support to Web application development and consulting. You can contact Chris by e-mail at cmwalden-at-us.ibm.com.