Testing Linux reliability: The Linux Technology Center assesses the long-term reliability of Linux


Contents:

- Linux reliability goals
- Test environment summary
- Test infrastructure
- Methodology
- Test strategy
- System monitoring
- Conclusions
- Resources
- About the authors


The Linux Technology Center evaluates the long-term reliability of Linux

Level: Introductory

Li Ge

(LGE@us.ibm.com), Assistant Software Engineer, Linux Technology Center, IBM

Linda Scott

(Lindajs@us.ibm.com), Senior Software Engineer, Linux Technology Center, IBM

Mark Vanderwiele

(Markv@us.ibm.com), Senior Technical Staff Member, Linux Technology Center, IBM

February 2004

This article presents the test results and analysis for the Linux kernel and other core OS components, from libraries and device drivers to file systems and networking. The test coverage was broad, all tests were run under quite unfavorable conditions, and they ran for a long time. The IBM Linux Technology Center has just completed more than three months of comprehensive testing, and shares the results of its LTP (Linux Test Project) runs with developerWorks readers.

The IBM Linux Technology Center (LTC) was founded in August 1999 with the goal of working with the Linux development community to help make Linux succeed. Its more than 200 employees make it one of the larger organized teams of open source developers. The code they contribute ranges from patches to structural kernel changes, and from file systems and internationalization to GPL'd drivers. They are also committed to tracking Linux-related development within IBM. The areas the LTC finds particularly interesting are Linux scalability, serviceability, reliability, and systems management, all with the aim of making Linux more fit for the enterprise. They have made many contributions to the Linux community, including enabling Linux to run on the S/390 mainframe and porting the JFS journaling file system to Linux. Another core task of the LTC is professional testing of Linux under laboratory conditions, in the way commercial projects are tested. The LTC, together with SGI, OSDL, Bull, and Wipro Technologies, contributes to the Linux Test Project (LTP). Below are the overall results of running the LTP suite against the Linux core. As you might guess, Linux stood up admirably under sustained stress.

The test results support the following summary, based on the tests and on observations made during the runs:

- The Linux kernel and other core OS components (including libraries, device drivers, file systems, networking, IPC, and memory management) ran stably and completed all expected runtimes without any serious system failure.
- Each run had a high success rate (over 95%), with only a small number of expected intermittent failures that, by design, resulted from resource overload.
- Linux system performance did not degrade over the long runs.
- On SMP systems, the Linux kernel scaled correctly to use the hardware resources (CPUs, memory, disks).
- The Linux systems withstood sustained CPU load (over 99%) and extreme memory pressure.
- The Linux systems handled overload conditions well.

The tests demonstrate that the Linux kernel and the other core components are reliable and stable over 30-, 60-, and 90-day runs, and can provide users with a robust, enterprise-level environment for long-running use.

Linux reliability goals

For the IBM Linux Technology Center, the Linux reliability effort consists of long-duration testing of the Linux operating system using the LTP test suite, focusing on workloads relevant to Linux user environments (see Resources to learn more about LTP). It is not aimed at finding defects.

Test environment summary

This article describes the test results and analysis of 30-day and 60-day Linux reliability runs using the LTP test suite. SuSE Linux Enterprise Server V8 (SLES 8) served as the kernel under test, and IBM pSeries servers served as the test hardware. A specially designed LTP stress scenario exercised a broad cross-section of kernel components in parallel, along with networking and memory management, to generate heavy workload stress on the test system. The Linux kernel, TCP, NFS, and I/O test components were the targets of the heavy workload stress.
Test results

30 days

pSeries 30-day LTP stress run:

Machine: p650 LPAR
CPU: (2) POWER4, 1.2 GHz
Kernel: Linux 2.4.19-ul1-ppc64-SMP (SLES 8 SP 1)
LTP version: 20030514
99.00% average CPU utilization (user: 48.65%, system: 50.35%)
80.09% average memory utilization (8 GB)

Observations:

The 30-day SLES 8 PPC64 run on the p650 LPAR completed successfully. The test tool was ltpstress. Test cases were executed both in parallel and serially. The kernel, TCP, NFS, and I/O test components were the targets of the heavy workload stress. Success rate: 97.88%. There were no serious system failures.

Figure 1. 30-day LTP stress run results

60 days

pSeries 60-day LTP stress run:

Machine: B80
CPU: (2) POWER3, 375 MHz
Kernel: Linux 2.4.19-ul1-ppc64-SMP (SLES 8 SP 1)
LTP version: 20030514
99.96% average CPU utilization (user: 75.02%, system: 24.94%)
61.69% average memory utilization (8 GB)
3.86% average swap partition utilization (1 GB)

Observations:

The 60-day SLES 8 PPC64 run on the pSeries B80 completed successfully. The test tool was ltpstress. Test cases were executed both in parallel and serially. The kernel, TCP, NFS, and I/O test components were the targets of the heavy workload stress. Success rate: 95.12%. There were no serious system failures.

Figure 2. 60-day LTP stress run results

Test infrastructure

Hardware and software environment

Table 1 lists the hardware environment.

Table 1. Hardware environment

pSeries 650 (LPAR) Model 7038-6M2
  Processors: (2) POWER4(TM), 1.2 GHz
  Memory: 8 GB (8196 MB)
  Hard disk: 36 GB U320 IBM Ultrastar (other disks present but unused)
  Swap partition: 1 GB
  Network: Ethernet controller, AMD PCnet32

pSeries 630 Model 7026-B80
  Processors: (2) POWER3(TM), 375 MHz
  Memory: 8 GB (7906 MB)
  Hard disk: 16 GB
  Swap partition: 1 GB
  Network: Ethernet controller, AMD PCnet32

The software environment is the same on the pSeries 630 Model 7026-B80 and the pSeries 650 (LPAR) Model 7038-6M2. Table 2 lists the software environment.

Table 2. Software environment

Linux: SuSE SLES 8 with Service Pack 1
Kernel: 2.4.19-ul1-ppc64-SMP
LTP: 20030514

Methodology

The stability and reliability of a system are usually measured by how long it stays up and how long it runs reliably. The initial runs were a set of 30-day baseline runs, later extended to 60-day and 90-day Linux test runs on xSeries and pSeries servers. The initial focus was on kernel, network, and I/O testing.

Test tools

The Linux Test Project (LTP; see Resources) is a joint project of SGI, IBM, OSDL, Bull, and Wipro Technologies. Its goal is to provide the open source community with a test suite for validating the reliability, robustness, and stability of Linux. The Linux Test Project is a collection of tools for testing the Linux kernel and related components; its purpose is to help improve the Linux kernel by automating kernel testing. There are currently more than 2,000 test cases in the LTP suite, covering most of the kernel: system calls, memory, IPC, I/O, file systems, networking, and so on. The test suite is updated monthly and runs on a variety of architectures. Architectures on which the LTP test suite is known to have been run include i386, IA-64, PowerPC, PowerPC 64, S/390, S/390x (64-bit), MIPS, MIPSel, CRIS, AMD Opteron, and embedded architectures. The LTP version used in our reliability tests was 20030514, the latest version available at the time.

Test strategy

The baseline runs had two distinct phases: a 24-hour "sanity test," followed by the stress reliability run phase, or "stress test." Passing the sanity test is the precondition for starting the stress test. The sanity test consists of a successful 24-hour run of the LTP test suite on the hardware and operating system that will be used for the reliability run. The driver script runalltests.sh, shipped with the LTP test suite, is used to verify the kernel. It runs a set of test packages serially and reports all results. It can also optionally run several instances simultaneously. By default, this script executes:

- File system stress tests
- Disk I/O tests
- Memory management stress tests
- IPC stress tests
- Scheduler tests
- Command functional verification tests
- System call functional verification tests

Stress testing verifies the robustness of a product under high system utilization. To supplement runalltests.sh, a test scenario called ltpstress.sh was designed to exercise a large cross-section of kernel components in parallel, together with networking and memory management, generating a high stress load on the test system. ltpstress.sh is also part of the LTP test suite. The script runs different test cases in parallel and in serial so as to avoid intermittent failures caused by simultaneous access to the same resource or by interference between tests. By default, this script executes:

- NFS stress tests
- Memory management stress tests
- File system stress tests
- Math (floating point) tests
- Multithreaded stress tests
- Disk I/O tests
- IPC (pipeio, semaphore) tests
- System call functional verification tests
- Network stress tests

System monitoring

The system monitoring tool was a modified top tool included with the LTP test suite. top makes it possible to observe processor behavior in real time. The modified top tool adds features that save snapshots of the top output to a file and produce an average summary of that file, including CPU, memory, and swap space utilization. In our tests, a snapshot of system utilization (the top output) was captured every 10 seconds and saved to a result file. In addition, the system utilization snapshots and the LTP test output files were processed daily or weekly to determine whether performance degraded over the long runs. This processing was driven by cron jobs and scripts.

Before testing, the hardware configuration of each selected test system was made as minimal as possible; additional hardware was removed to reduce potential hardware failures.
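As a concrete illustration, the snapshot-and-summarize flow described above can be sketched in shell. This is a minimal sketch under stated assumptions: the output directory, the one-shot wrapper around top, and the "cpu% mem%" sample file format are all invented here for illustration; the actual modified top tool shipped with LTP works differently.

```shell
#!/bin/sh
# Illustrative sketch only: paths and the sample-file format are
# assumptions, not the actual modified-top tool shipped with LTP.
OUTDIR="${OUTDIR:-/tmp/ltp-top}"
mkdir -p "$OUTDIR"

# Capture one batch-mode snapshot of system utilization, appended with a
# timestamp to a per-day data file (the real runs took one every 10 s).
snapshot() {
    file="$OUTDIR/top-$(date +%Y%m%d).dat"
    { date '+%s'; top -b -n 1; } >> "$file" 2>/dev/null
}

# Average a file of "cpu% mem%" samples, one pair per line, roughly as
# the modified top's summary step might (format assumed for the sketch).
summarize() {
    awk '{ cpu += $1; mem += $2; n++ }
         END { if (n) printf "avg cpu: %.2f%% avg mem: %.2f%% (%d samples)\n",
                             cpu/n, mem/n, n }' "$1"
}

# Demo: three fabricated samples, one snapshot, then the averages.
printf '%s\n' "99.1 80.2" "98.7 79.9" "99.3 80.1" > "$OUTDIR/summary.dat"
snapshot
summarize "$OUTDIR/summary.dat"   # prints the averaged utilization
```

Keeping capture and summarizing separate mirrors the setup described in the article, where cron could run the summary step daily or weekly while the long-running capture continued undisturbed.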
The lowest security option was selected during image installation, and at least 2 GB of disk space was reserved for the top data files and LTP log files. Note that this is a test scenario; in real life, users are strongly advised to keep their security settings well above the minimum. The systems were left undisturbed for the duration of the tests, with occasional access to confirm that the tests were still running acceptably; confirmation methods included using the ps command, checking the top data, and checking the LTP log data. Immediately after each test, the system monitoring tool top was stopped. All top data files, including the daily or weekly snapshots, and the LTP log files were saved and processed to provide data for analysis.

Conclusions

The results discussed here are based on a solution created and tested in a laboratory environment. These results may not be obtainable in all environments, and other environments may require additional steps, configuration, and performance analysis. However, because most Linux kernel testing efforts are short, this series of tests provides first-hand data and results from long-duration runs. It also provides data on the Linux core components, and on the TCP, NFS, and other test components, under heavy workload stress. The tests demonstrate that Linux systems are reliable and stable over long durations, providing a robust, enterprise-class environment.

Resources

- Useful information and links can be found at the Linux Test Project home on SourceForge and on the LTP project home page. Project documentation includes the LTP HOWTO and the LTP man pages, among other documents. The LTP pages also provide summaries of, and links to, other Linux test tools.
- Visit the IBM Linux Technology Center home page to read its latest news and announcements.
- The IBM Linux Technology Center projects page lists the projects the group is currently working on.
- The articles "IBM's Linux Technology Center" (ITworld.com) and "Inside IBM: Dan Frye and the Linux Technology Center" cover the background of the IBM LTC.
- The IBM developerWorks article "Behind the scenes at the IBM Linux Technology Center" also profiles the IBM Linux Technology Center.
- More Linux development material can be found in the IBM developerWorks Linux zone.

About the authors

Li Ge is an assistant software engineer at the IBM Linux Technology Center. She received a master's degree in computer science from New Mexico University in 2001. She has three years of Linux experience, and her current work is Linux kernel verification and Linux reliability measurement. You can contact her at lrge@us.ibm.com.

Linda Scott is a senior software engineer. She graduated from Jackson State University and has worked in IBM development laboratories ever since. At IBM she has participated in many UNIX and Linux projects and currently works on the Linux Test Project, which has provided more than 2,000 test cases to the open source community. You can contact her at lindajs@us.ibm.com.

Mark Vanderwiele is a Senior Technical Staff Member and architect at the IBM Linux Technology Center. He graduated from Florida State University in 1983. Most of his work has been devoted to various aspects of operating system development. You can contact him at Markv@us.ibm.com.
