Performance and Tuning under Solaris [ZT]  Author: C.Arthur


Performance and Tuning under Solaris [ZT] http://www.chinaunix.net Author: C.Arthur Posted: 2004-03-19 19:39:20

1. Defining the performance problem
2. Performance monitoring
2.1. Start from the visible symptoms
2.2. Know how your system behaves under normal conditions
2.3. Look for performance bottlenecks
3. Some common problems and some recommendations
3.1. What 64-bit operation and capacity can bring
3.2. Free memory
3.3. Priority paging
3.4. ISM - Intimate Shared Memory
3.5. Swap space settings
3.6. Shared memory (IPC) parameters

When a system is slow, it is often hard to know why. Is it a memory leak, a bottleneck in the disk subsystem, or a particular application that does not scale? There are ways to find and understand the root causes of performance problems and, where possible, to eliminate them.

This article offers some suggestions on where to start. It describes how to approach performance work and how to locate common bottlenecks, and it introduces some performance-related concepts such as Intimate Shared Memory (ISM) and priority paging. The article focuses on the Solaris 2.6, 7, and 8 operating environments.

1. Defining the performance problem

Performance, perhaps more than any other aspect of a computer system's behavior, needs careful thought. To trace a problem back to its root cause in one or more components, you need a structured method.

In practice, the most important part of solving a performance problem is defining what you are trying to solve. In practical terms, this means defining an operation or test case so that you can:

A) know how fast the system is now.
B) know that the system needs to be "X" times faster, or that it was "X" times faster in a different environment.

Setting a baseline is the first step. Performance analysis is a top-down process that starts from a simple definition of the problem you need to solve. If you want a system to run faster, you still need to define which of its properties you want to improve and which costs are acceptable or unacceptable. Unless you can describe the symptoms or opportunities clearly, any attempt to identify the root of the problem depends on luck.

Performance analysis is much like detective work: we establish the facts through evidence and observation, and we are careful not to jump to a conclusion that the facts do not support. A conclusion is accepted only when the evidence is overwhelming.

Be suspicious of every assumption. A fact that someone else asserts is really only a hypothesis that may or may not be correct. If it is wrong, you may be working from a false basis and reach false conclusions.

Some words of warning. In most cases the Solaris operating environment does a very good job of tuning itself to the workload. The newer the release, the less tuning is needed. The root of a performance problem often turns out to be a change that was made in an attempt to improve performance. Look at the application first and at the operating environment last.

Any changes to the system configuration, such as tuning settings, memory size, and disk layout, should be checked to see whether they are still appropriate. Similarly, parameters carried over through a system upgrade may affect the performance of the new operating environment.

2. Performance monitoring

2.1. Start from the visible symptoms

Which operation shows the symptoms of the performance problem?

For example, is a specific type of database query, file operation, or network operation slower than you expect? Can you make the steps of the test case more specific, such as a single SQL query or a 30-line C program?

Use your knowledge to describe as precisely as possible what is wrong with what. Good problem statements look like this:

A SQL query takes twice as long on VxFS as on UFS.
SVR4 message queue operations take 30 percent longer than on operating environment version B.
Logging in to system A takes three times as long.

A problem statement should not include a solution or a possible cause. A clear statement of the problem means that, at most, half of the work of solving it is done. It is also important to take the user's view into account when you try to solve the problem, which means looking at it from the application's perspective. This runs against human nature: people tend to try to prove a cause they already have in mind rather than evaluate each possible cause.

Poor problem statements look like this:

The "wt" column of mpstat shows too much wait time.
User tasks take too long.

The boundary between functional problems and performance problems in a system and its applications is often a gray area. System hangs and process hangs are outside the scope of this article. If you suspect that the system is functioning incorrectly rather than performing poorly, call your Sun Solution Center to resolve the problem. A high-performance system must first of all function correctly.

As part of your proactive maintenance plan, check that /var/adm/messages shows no hardware problems, such as disk retries, and no unusual messages.

The history of the system is also very valuable. If the system once performed better, draw up a detailed timeline that records when performance first deteriorated and since when it has been poor.

2.2. Know how your system behaves under normal conditions

Keeping a record of how your system behaves under normal conditions is a good idea. You can easily collect and keep a month of performance data, such as:

The *stat family: vmstat, mpstat, iostat, vxstat
sar
ps output showing which processes are running (prstat in the Solaris 8 operating environment)

In addition, many commercial and unsupported products are available for performance monitoring. One free, unsupported option is the SE Toolkit (see the Sun Performance SE Toolkit page). The SE Toolkit reports disk activity, CPU utilization, TCP and network connections, memory, and other information. In our experience it is easy to install, does not require a reboot, and produces graphical displays that are easy to understand.

Many of these products share a common problem: the appropriate thresholds differ from one hardware configuration to another. For example, a particular threshold value may indicate a load that slows a 400-MHz system to a crawl, yet be acceptable on a 900-MHz system.

2.3. Look for performance bottlenecks

Once you have defined the performance problem to be solved, the next step is to narrow the search down to the bottleneck.

At this stage, ask questions such as:

Can the application tell me where the bottleneck is? Taking Oracle as an example, an Oracle database administrator should know what BSTAT/ESTAT are and how to run and interpret them. Again, from the application's point of view, BSTAT/ESTAT can reveal the bottlenecks that limit Oracle's performance, which can then be analyzed further.
Where is most of the time spent, in the kernel or in user processes? vmstat, mpstat, sar, ps, and prstat can answer this question.
Are all resources of the same type used evenly? The point of this question is to find an uneven distribution of load. For example, one disk may be a bottleneck, or one CPU may be busier than the others. For CPUs, look at mpstat. For disks, use iostat.
Which process or processes are using the most resources? Use these commands to see the processes using the most CPU and memory:

ps -eo pid,pcpu,args | sort +1n        (percentage of CPU)
ps -eo pid,vsz,args | sort +1n         (kilobytes of virtual memory)
/usr/ucb/ps aux | more

The output is sorted, with the processes using the most CPU and the most memory at the top.
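As a minimal sketch of the same idea, the helper below keeps only the N largest consumers from ps output. The function name top_by_column is hypothetical, and the POSIX key syntax sort -k 2,2nr is used here in place of the older +1n form, sorting in descending order so the heaviest process comes first.

```shell
# top_by_column N (hypothetical helper, not part of Solaris):
# read "ps -eo pid,<metric>,args" output on stdin, drop the header line,
# sort numerically on column 2 in descending order and keep N rows.
top_by_column() {
    n=${1:-5}
    sed 1d | LC_ALL=C sort -k 2,2nr | head -n "$n"
}

# Example use on a live system:
#   ps -eo pid,pcpu,args | top_by_column 10    # top 10 by CPU percentage
#   ps -eo pid,vsz,args  | top_by_column 10    # top 10 by virtual size (KB)
```

Because the helper reads standard input, the same function works on saved ps output collected for a baseline.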

The Solaris 8 operating environment provides prstat, which gives a continuously updated view of CPU and memory usage. The output of prstat -cvm is particularly useful.

Let's take a look at how to use some common Solaris commands to start performance analysis.

2.3.1. vmstat - using the vmstat command

The vmstat command is simple to use. Here is an example in which the CPU capacity is not sufficient for the applications being run.

% vmstat 15
 procs      memory               page                 disk         faults      cpu
 r  b w    swap   free re  mf  pi po   fr de  sr m0 m1 m2 m3   in   sy   cs us sy id
45  0 0 2887216 182104  3 707 449  6  455  0  80  2  6  1  0 1531 5797  983 61 30  9
58  0 0 2831312  46408  5 983 582 56 3211  0 492  0  0  0  0 1413 4797 1027 69 31  0
55  0 0 2830944  56064  2 649 656  3  806  0 121  0  0  0  0 1441 4627  989 69 31  0
57  0 0 2827704  48760  4 818 723  6  800  0 121  0  0  1  0 1606 4316 1160 66 34  0
56  0 0 2824712  47512  6 857 604 56 1736  0 261  0  0  1  0 1584 4939 1086 68 32  0
58  0 0 2813400  47056  7 856 673 33 2374  0 355  0  0  0  0 1676 5112 1114 70 30  0
60  1 0 2816712  49464  7 861 720  6  731  0 110  7  0  3  0 2329 6131 1067 64 36  0
58  0 0 2817552  48392  4 585 521  0  996  0 146  0  0  0  0 1357 6724 1059 71 29  0

The first line of vmstat output should always be ignored, because it reports averages since boot. The column labeled "r" under "procs" is the number of processes in the run queue waiting for a CPU. The "id" column is the CPU idle time. This machine does not have enough CPU resources to meet the demands of its processes, and most of its CPU time is spent in user space (see the "us" column).
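The reading of the "r" and "id" columns described above can be sketched as a small filter. The function name cpu_starved and the rule "run queue larger than the number of CPUs while idle time is 0" are assumptions made for illustration, not thresholds given in this article; the number of CPUs is supplied by the caller.

```shell
# cpu_starved NCPU (hypothetical helper): read vmstat output on stdin and
# print the samples whose run queue (column 1, "r") exceeds NCPU while
# idle time (last column, "id") is 0.
cpu_starved() {
    awk -v ncpu="${1:-1}" '
        $1 !~ /^[0-9]+$/ { next }   # skip header lines, including repeats
        ++sample == 1    { next }   # skip the first sample (averages since boot)
        $1 > ncpu && $NF == 0 {
            print "r=" $1, "us=" $(NF-2), "sy=" $(NF-1), "id=" $NF
        }
    '
}

# Example use on a live system with 4 CPUs:
#   vmstat 15 | cpu_starved 4
```

A system that prints a line for nearly every sample, as the one in the example above would, is a candidate for more CPUs or for application profiling.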

There are two ways forward: first, add more CPUs, or second, profile the application's code to see whether parts of it can be optimized. Optimizing code can take considerable effort, and sometimes a very large one. When time is limited, it is best to consider the likely return on the investment.

2.3.2. mpstat - using the mpstat command

The mpstat command reports per-processor statistics. Each row of the table represents the activity of one processor.

$ mpstat 5
CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0   20   0 3592 3350 2338 1355   43  184  285   0  4578   9   6  1  84
  1   19   0  304  465  283 2139  135  398  140   0  6170   9   6  1  85
  2   25   0  352  507  295 2153  158  433  183   0  7508  12   7  1  81
  3   26   0  357  513  302 2082  155  425  181   0  7460  12   7  0  81
CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0    3   0 3879 3773 2754 1832   61  322  339   0  3424  12   7  0  81
  1    2   0  555  544  264 3040  197  670  112   0  4828  15   6  0  78
  2   11   0  188  595  269 3141  219  738  121   0  5291  18   6  1  75
  3   65   0  185  585  279 2660  211  673  110   0  5420  22   9  0  69
CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0    6   0 4028 3633 2620 1695   51  287  343   0  2857  12   8  0  80
  1    7   0  150  545  265 3044  196  663  117   0  4374  14   4  0  81
  2   14   0  226  602  279 2823  225  707  103   0  4715  22   4  1  73
  3    2   0  125  600  282 2810  230  699  118   0  4665  18   4  0  78

mpstat shows how each CPU is spending its time: for example, how it is divided among system, user, wait, and idle time, and the rates of system calls, lock contention, interrupts, faults, and cross-calls.

See the mpstat(1M) man page for the detailed meaning of each column.
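To look for an uneven load across processors, the per-CPU rows can be averaged over several reports. This is only a sketch: the function name mpstat_avg is hypothetical, and it assumes the column layout shown above, with usr, sys, wt, and idl as the last four columns.

```shell
# mpstat_avg (hypothetical helper): read mpstat output on stdin and print,
# for each CPU, the average usr, sys and idl percentages over every report
# after the first one.
mpstat_avg() {
    awk '
        $1 == "CPU"      { report++; next }  # each header starts a new report
        $1 !~ /^[0-9]+$/ { next }            # ignore blank or unexpected lines
        report < 2       { next }            # first report holds since-boot averages
        { usr[$1] += $(NF-3); sys[$1] += $(NF-2); idl[$1] += $NF; n[$1]++ }
        END {
            for (cpu in n)
                printf "cpu %s usr %.1f sys %.1f idl %.1f\n",
                       cpu, usr[cpu] / n[cpu], sys[cpu] / n[cpu], idl[cpu] / n[cpu]
        }
    ' | sort -k 2,2n
}

# Example use on a live system:
#   mpstat 5 4 | mpstat_avg
```

A CPU whose averages differ sharply from the others is worth a closer look, for example at its intr and xcal columns.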

2.3.3. iostat - using the iostat command

The iostat command reports disk usage. Each row of the table represents the activity of one disk. Commonly used options are:

Option      Description
-n          Display disk names in cXtYdZ format.
-x          Report extended statistics.
-z          New in the Solaris 8 operating environment. Suppresses rows for disks with no activity during the sampling interval, which shortens the output and highlights the active disks.
-p and -P   Report per-partition I/O statistics; useful when checking swap activity.
-e          Useful for finding disks that are generating errors.

Table 1: iostat options

iostat can also report activity for disks accessed over NFS, although this can make the report rather long.

2.3.4. truss - your friend

The truss(1) tool executes a specified command and produces a trace of the system calls it makes, the signals it receives, and the machine faults it incurs (traps and interrupts; translator's note).

truss can also be used to trace a process that is already running. It is a very useful tool for finding out which requests made by an application are slow or consume excessive resources. If you do not know truss, read the man page and try it. The -m option is useful for showing, for example, page faults. The -c option gives summary information such as:

Counts of system calls, faults, and signals
The time spent in, and the number of failures of, each type of system call

2.3.5. lockstat - resource contention

Kernel locks protect data structures against concurrent updates and control access to resources such as the disk cache, network caches, and kernel caches.

lockstat executes a command and reports the activity of all kernel locks during its execution, including which processes or devices requested the locks. See the lockstat(1M) man page. The -s 10 option reports the kernel thread stacks that were contending for each lock.

2.3.6. trapstat - runtime trap statistics

trapstat is a tool that provides runtime trap statistics on UltraSPARC® processors running a normal Solaris kernel. For I-TLB and D-TLB misses, trapstat can optionally display the amount of time spent in the operating system's TLB miss handlers. For interrupt vector traps, trapstat can optionally display the interrupting devices.

2.3.7. gprof - application profiling

For C, C++, and Fortran applications, compile with the -xpg option and run the program under the typical load that produces the performance problem. Then run gprof on the resulting gmon.out file. This shows where the application spends most of its time.

Forte [TM] TeamWare (formerly Sun WorkShop [TM] TeamWare) includes many useful tools, such as an analyzer that shows graphically where time is spent. For further information, see the Forte TeamWare documentation and the Sun [TM] BluePrints book by Rajat Garg and Ilya Sharapov, Techniques for Optimizing Applications: High Performance Computing.

2.3.8. proc tools

The proc tools are utilities that use the features of /proc to report process attributes such as these:

pstack - call stack
ptree - process tree
pfiles - list of open file descriptors
pldd - list of dynamic libraries linked into the running process

For more information, see the proc(1) man page.

3. Some common problems and some recommendations

3.1. What can 64-bit operation bring?

From a performance point of view, running 64-bit applications has two benefits. First, larger problems can be handled efficiently by using the larger process address space. Second, integer operations can use 64-bit registers and instructions.

Overall, because the pointers and data structures in the code are larger, the program is slightly larger. In turn, this means the CPU caches may hold less useful data per cache line, so code that runs well in a 32-bit environment may run slightly slower.

The kernel thread stack is 16 KB instead of 8 KB, but the effect of this is usually negligible.

3.2. Free memory

Working out how much free memory a Solaris system has has always caused confusion.

For releases before the Solaris 8 operating environment, do not rely on the "free" column to see whether memory is short; look at the "sr" column instead. The value in the "free" column does not indicate a memory shortage. The page cache holds on to pages in case they are needed again, and the virtual memory subsystem reclaims memory only when it is needed.

This topic has been covered in SunWorld articles and in Sun Performance and Tuning - Java [TM] and the Internet. To determine whether memory is short, look at column 12 of the vmstat output ("sr", the scan rate) and at the disk I/O traffic to swap (with iostat -p). If the file system generates a large amount of I/O and the page scanner has to run to free pages for that I/O, the "sr" column will show relatively large values. The pageout scanner runs only when the free list shrinks to a threshold (lotsfree, measured in pages). Any inactive process or file pages that are not locked in memory may then be paged out. The free list appears to shrink to that value (lotsfree) and stay there. When the size of the free list falls below the lotsfree threshold, the page daemon starts and scans for memory to reclaim from the page cache and from exited and idle processes. There is no way to push the "free" value above this limit, because memory is not reclaimed beyond this threshold. Leaving pages in the page cache is more efficient than putting them on the free list unnecessarily.
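Reading column 12 by eye over a long run is tedious, so the scan rate can be averaged with a short filter. This is a sketch only: the function name avg_scan_rate is hypothetical, it assumes "sr" is the twelfth column as described above, and what counts as a high average is left to the reader.

```shell
# avg_scan_rate (hypothetical helper): read vmstat output on stdin and print
# the mean of column 12 ("sr", the scan rate) over all samples after the first.
avg_scan_rate() {
    awk '
        $1 !~ /^[0-9]+$/ { next }   # skip header lines, including repeats
        ++sample == 1    { next }   # skip the first sample (averages since boot)
        { total += $12; n++ }
        END { if (n) printf "%.1f\n", total / n }
    '
}

# Example use on a live system:
#   vmstat 15 20 | avg_scan_rate
```

Comparing this figure with the value recorded when the system was healthy is more informative than judging it in isolation.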

The Solaris 8 operating environment implements a more efficient algorithm for supplying the pages needed for I/O in the segmap driver. The "free" column in vmstat now does reflect memory that is free and not used by the page cache. The -p option was added to vmstat to give a more detailed and accurate picture of paging activity.

For individual processes, the pmap command reports the address space layout of a process (the -x option is the most useful).

3.3. Priority paging

Priority paging was introduced in the Solaris 7 operating environment and backported to the Solaris 2.6 operating environment (kernel patch 105181-xx) and the Solaris 2.5.1 operating environment (kernel patch 103640-xx). The latest versions of these two patches can be found on SunSolve Online [SM].

Priority paging provides an improved paging algorithm that significantly improves system response when the file system is in use. Priority paging introduces a new tunable, cachefree. The paging parameters are now:

minfree

By default, this feature is turned off in the Solaris 2.5.1, 2.6, and 7 operating environments, so enable it on systems that page heavily. When priority_paging is not enabled, cachefree is set equal to lotsfree. When it is enabled, cachefree defaults to 2 times lotsfree.
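On those releases the feature is normally switched on with a "set priority_paging=1" line in /etc/system, followed by a reboot. The sketch below only checks whether such a line is present. The function name priority_paging_enabled is hypothetical, and the check assumes the usual /etc/system convention that comment lines begin with "*". Do not apply the setting on the Solaris 8 operating environment.

```shell
# priority_paging_enabled [FILE] (hypothetical helper): succeed if FILE
# (default /etc/system) contains an uncommented "set priority_paging=1" line.
# Lines starting with "*" are comments in /etc/system and do not match.
priority_paging_enabled() {
    grep -E '^[[:space:]]*set[[:space:]]+priority_paging[[:space:]]*=[[:space:]]*1[[:space:]]*$' \
        "${1:-/etc/system}" >/dev/null 2>&1
}

# Example use:
#   if priority_paging_enabled; then echo "priority paging is set"; fi
```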

Enabling this parameter tends to make switching between windows on a workstation faster, and it is a great help on systems that read large files into memory through the file system. On systems that perform large amounts of I/O through the file system, compute-intensive tasks have been seen to speed up by several hundred percent.

The Solaris 8 operating environment uses a different algorithm that removes the limitation of earlier releases, in which the page scanner had to scan memory to supply the segmap driver with pages for I/O. All memory pages that segmap no longer needs are placed on a list from which they can be reused immediately. Do not set priority_paging in the Solaris 8 operating environment. Also, the virtual memory parameters should not be tuned manually in the Solaris 8 operating environment, other than setting fastscan and maxpgio to higher values on large systems.

For more information on priority paging, see:

Sun Performance, priority paging
FAQ document 17946: New kernel tunables for priority paging in 2.5.1

3.4. ISM - Intimate Shared Memory

ISM locks shared memory in physical memory so that it cannot be paged out. The memory management data structures that would otherwise be created separately for each process are created once and then shared by all processes. The Solaris 2.6 operating environment adds a further optimization: the kernel tries to find contiguous 4-MByte blocks of physical memory that can be used as large pages to map the shared memory. This greatly reduces the overhead of the memory management unit. (See Sun Performance and Tuning - Java [TM] and the Internet, page 333.) By default, applications such as Oracle, Informix, and Sybase use a special flag to indicate that they want to use ISM. ISM is a virtual memory implementation that makes more efficient use of kernel and hardware resources. ISM also provides a way to lock frequently used shared memory pages in memory.

ISM is enabled by default, and no editing of the /etc/system file is required for it. On kernels at the current patch level, disabling ISM degrades system performance and may cause hangs. The database configuration file, such as Oracle's init.ora file, should not contain use_ism=false, because this turns ISM off.

3.5. Swap space settings related to shared memory

To understand swap space configuration as it relates to shared memory, see "Clearing Up Swap Space Confusion" by Adrian Cockcroft.

There are two main considerations when sizing swap space. There should be enough:

Memory, so that swapping is avoided during normal operation
Swap space, so that a crash dump can be saved

3.6. Interprocess communication (IPC) parameters

The IPC parameter values below need to be determined by your database administrator (DBA). The Sun Solution Center cannot recommend actual IPC parameter settings, because the values depend on the application.

Typing errors in the IPC parameter settings in /etc/system are quite common. Such an error can have a serious impact on application performance. To check for spelling errors, search /var/adm/messages for a message like this:

genunix: [ID 492708 kern.notice] sorry, variable 'seminfo_semopn'
is not defined in 'semsys'

This indicates a spelling error. Use grep to search for "sorry".
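The grep step can be taken a little further to list just the rejected variable names. This is a sketch under the assumption that each message appears on a single line in the log, in the form shown above; the function name rejected_tunables is hypothetical.

```shell
# rejected_tunables [FILE] (hypothetical helper): print, once each, the
# /etc/system variable names that appear in "sorry, variable '...'" messages
# in FILE (default /var/adm/messages).
rejected_tunables() {
    grep "sorry, variable" "${1:-/var/adm/messages}" |
        sed "s/.*sorry, variable '\([^']*\)'.*/\1/" |
        sort -u
}

# Example use:
#   rejected_tunables                      # scan /var/adm/messages
#   rejected_tunables /var/adm/messages.0  # scan an older log
```

Each name printed is a line in /etc/system to compare against the parameter names your DBA intended.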

The Solaris 8 operating environment has better default values for the IPC parameters than earlier releases.

For releases before the Solaris 2.6 operating environment, shared memory requires additional swap space (that is, "backing store"). With swap -l, which reports 512-byte blocks, the size in kilobytes can be obtained by dividing the blocks value by 2. There should be at least twice as much swap space as the shared memory that has been allocated (SHMMAX).
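The arithmetic can be sketched as follows. swap -l reports sizes in 512-byte blocks, so dividing by 2 gives kilobytes and dividing by 2048 gives megabytes. The function name swap_total is hypothetical, and the sketch assumes the usual swap -l layout in which "blocks" is the fourth column.

```shell
# swap_total (hypothetical helper): read "swap -l" output on stdin and print
# the total configured swap. The blocks column (4th) is in 512-byte units,
# so blocks / 2 is kilobytes and blocks / 2048 is megabytes (truncated).
swap_total() {
    awk 'NR > 1 { blocks += $4 }
         END    { printf "%d KB (%d MB)\n", blocks / 2, blocks / 2048 }'
}

# Example use on a live system:
#   swap -l | swap_total
```

The result can then be compared with twice the SHMMAX value agreed with your DBA.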

Here are the default and maximum values of SHMMAX:

Parameter   Default            Maximum               Release
SHMMAX      1048576 (1 MB)     4294967295 (4 GB)     Solaris 2.5.1, 2.6, and 32-bit Solaris 7
                               2147483647 (2 GB)     Solaris 2.5 or earlier

In the Solaris 2.6 operating environment, SHMMAX and SHMMIN are unsigned int (32-bit). In the 32-bit Solaris 7 operating environment, SHMMAX and SHMMIN are unsigned int (32-bit). In the 64-bit Solaris 7 operating environment, SHMMAX and SHMMIN are unsigned long (64-bit). In all cases, SHMMNI and SHMSEG are signed int (32-bit). Table 2 summarizes these parameters and their types.

Parameter   Solaris 2.6 32-bit   Solaris 7 32-bit   Solaris 7 64-bit
SHMMAX      unsigned int         unsigned int       unsigned long
SHMMIN      unsigned int         unsigned int       unsigned long
SHMMNI      signed int           signed int         -
SHMSEG      signed int           signed int         -

Table 2: Parameter types

SHMMAX limits the maximum size of a shared memory segment that shmget(2) can request. The resource it controls is not preallocated; it is allocated as needed.

In the 64-bit Solaris 7 and 8 environments, the 4-GByte limit no longer applies. This maximum is theoretical. The actual setting needs to be determined from the memory size, the database size, and the system configuration. The maximum segment size (SHMMAX) is only an upper limit.

Additional resources

A. IPC articles from SunSolve Online [SM]

The Sun Solution Center has written many articles on IPC parameters. They are available on SunSolve Online [SM]. (Contract customers have access to additional related publications.) Some of them are listed below.

If changes you make to the /etc/system file do not seem to take effect, see document 12824: sysdef -i does not report the IPC parameters set in /etc/system.

General information on IPC parameters:

Document 6328: All about shared memory parameters in 2.x
Document 2270: Understanding semaphores, seminfo_ semaphore information
Document 12075: How to configure IPC semaphores and shared memory on your system
Document 5288: How to determine IPC parameter values with adb
Document 2273: Kernel tunable parameters
Document 7241: Determining message queue parameters

Debugging questions:

Document 12174: How to check the amount of shared memory on the system
Document 16985: A process using shared memory has terminated, but the swap space does not seem to be reclaimed

B. Sun Performance

The Sun Performance page provides a variety of resources.
SunWorld online columns, 1995-1999.
Cockcroft, Adrian, and Richard Pettit, Sun Performance and Tuning - Java [TM] and the Internet, Sun Microsystems Press, 1998. This is a useful book whose principles have stood the test of time. However, many of its details should be treated with some caution when applied to current systems.
Garg, Rajat, and Ilya Sharapov, Techniques for Optimizing Applications: High Performance Computing, Sun BluePrints, 2001. This book gives practical guidance on performance optimization of compute-intensive programs on platforms based on Sun UltraSPARC technology, and it helps in understanding how applications use system resources.
Mauro, Jim, and Richard McDougall, Solaris Internals: Core Kernel Architecture, Sun Microsystems Press, 2001. This book is regarded as an in-depth guide to the inner workings of the Solaris operating environment.
For performance and other system management topics such as capacity planning, see the many publications of the Sun BluePrints [TM] program.

C. Related articles and books

ITworld.com columns derived from UnixInsider: Jim Mauro has written many articles describing how components of the Solaris kernel work. If you want the inside story, this is a good place to start.
Alomari, Ahmed, Oracle8i and UNIX Performance Tuning, Prentice Hall PTR, 2001. (Note: the earlier 1999 edition covers the Solaris 2.6 operating environment.)

About the authors

Karen Edwards has worked at Sun for seven years. She started out supporting peripherals and the kernel at the Sun Enterprise Services Solution Center in California. She then worked for the SunSolve Online [SM] program, helping to provide content and supporting customers with patch issues. She currently works as an author for the Sun [SM] Alert program. Take a look at the new Sun Alert notification collection on the SunSolve website.

Clive King works in Sun's Customer Problem Resolution Engineering group. He specializes in performance analysis and in diagnosing problems in I/O subsystem devices. As an instructor for the Kepner-Tregoe troubleshooting course, he applies the Rational Process to problem solving, decision making, and risk management.

When reposting, please credit the original address: https://www.9cbs.com/read-59620.html
