[Repost] Summary of C++ Code Optimization Methods

xiaoxiao2021-03-06 128

Optimization is a very big topic. This article is not going to explore performance analysis theory or the efficiency of algorithms, nor am I qualified to do so. I just want to summarize some optimization techniques that can be applied to your C++ code, so that when you face several different programming strategies you can make a rough estimate of the performance of each one. That is the purpose of this article.

I. Before optimization

Before optimizing, we should first find the bottleneck in our code. However, when you do this, you should not draw conclusions from a debug build, because a debug build contains a lot of additional code. A debug executable can be about 40% larger than the release version. The additional code is there to support debugging, for example symbol information. Most implementations provide different versions of operator new and the library functions for debug and release builds. Moreover, the executable of a release build may have been optimized in a variety of ways, including elimination of unnecessary temporary objects, loop unrolling, moving objects into registers, inlining, and so on.

In addition, we have to distinguish between debugging and optimization; they accomplish different tasks. The debug build is used to hunt bugs and check for logic errors. The release build is used for performance tuning and optimization.
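One simple way to look for a bottleneck is to time a candidate piece of code in a release build. The sketch below is only an illustration: average_microseconds and sum_values are hypothetical names that do not come from the original article, and a real profiler will usually give a fuller picture.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs `work` `repeats` times and returns the average
// wall-clock time per run in microseconds.
template <typename Work>
double average_microseconds(Work work, int repeats)
{
    assert(repeats > 0);
    typedef std::chrono::steady_clock timer;
    const timer::time_point start = timer::now();
    for (int i = 0; i < repeats; ++i) {
        work();
    }
    const std::chrono::duration<double, std::micro> elapsed = timer::now() - start;
    return elapsed.count() / repeats;
}

// Hypothetical candidate for measurement: sums the elements of a vector.
long long sum_values(const std::vector<int>& values)
{
    long long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}
```

A call such as average_microseconds([&] { sink += sum_values(v); }, 1000) times the summation; accumulating into a variable like sink that is used afterwards makes it less likely that the optimizer removes the measured work altogether.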

Let's take a look at the code optimization techniques:

II. Placement of declarations

Where variables and objects are declared in a program can have a significant impact on performance. Similarly, the choice between postfix and prefix operators can also affect performance. This section focuses on four topics: initialization vs. assignment, where to place declarations, constructor initialization lists, and prefix vs. postfix operators.

(1) Use initialization rather than assignment

In C, a variable could only be declared at the beginning of a block (this was relaxed in C99); in C++, however, declarations may appear anywhere in the program. The purpose is to delay the declaration of an object until the point where it is used. This has two benefits:

1. It ensures that the object is not modified by other parts of the program before it is used. If the object is declared at the beginning but only used 20 lines later, no such guarantee can be made.
2. It gives us the opportunity to improve performance by replacing assignment with initialization. Previously a declaration could only be placed at the beginning, where we often did not yet have the value we wanted, so the benefit of initialization could not be realized. Now we can initialize directly at the point where we have the value, which saves a step.

Note that for basic types there may be no difference between initialization and assignment, but for user-defined types the difference can be significant, because assignment makes an extra function call: operator=. Therefore, when choosing between assignment and initialization, initialization should be preferred.

(2) Put declarations in a suitable place

In some cases, the performance gain from moving a declaration to a suitable location deserves our attention.
For example:

    bool is_c_needed();

    void use()
    {
        C c1;
        if (is_c_needed() == false) {
            return; // c1 was not needed
        }
        // use c1 here
        return;
    }

In the code above, the object c1 is constructed even when it may never be used, so we pay an unnecessary cost for it. You may say that constructing c1 cannot take much time, but what if the declaration were C c1[1000]; instead? That would be pure waste. We can change this by moving the declaration of c1:

    void use()
    {
        if (is_c_needed() == false) {
            return; // c1 was not needed
        }
        C c1; // moved from the block's beginning
        // use c1 here
        return;
    }

The performance of the program improves considerably. So analyze your code carefully and put declarations in suitable places; the benefit can be larger than you expect.

(3) Initialization lists

We all know that initialization lists are generally used to initialize const or reference data members. But because of how they work, we can also gain performance by using them.

Let's first look at a program:

    class Person
    {
    private:
        C c_1;
        C c_2;
    public:
        Person(const C& c1, const C& c2) : c_1(c1), c_2(c2) {}
    };

Of course, we could also write the constructor like this:

    Person::Person(const C& c1, const C& c2)
    {
        c_1 = c1;
        c_2 = c2;
    }

What performance difference does this make? To answer that, we must first understand how each version executes. First the initialization list: data members are constructed before the constructor body executes, and with an initialization list each member is initialized directly as it is created, so only a copy constructor runs. Now the case of assignment in the constructor body: each data member is first created by its default constructor before the body runs, and is then assigned a value via operator= inside the body. So one more function call is made per member than with the initialization list. That is where the performance difference comes from. Note, however, that if your data members are basic types you may skip the initialization list for the sake of readability, because the compiler generates the same assembly code for both forms.

(4) Postfix vs. prefix operators

The prefix operators ++ and -- are more efficient than their postfix versions, because the postfix operator needs a temporary object to hold the previous value. For basic types the compiler will eliminate this extra copy, but for user-defined types it often cannot. So use the prefix operators whenever you can.
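The cost difference described in item (3) can be checked by counting calls. The following is a minimal sketch, not code from the original article: Tracked, PersonInit, PersonAssign and calls_to_construct are hypothetical names, and Tracked plays the role of the member type C.

```cpp
#include <cassert>

// Hypothetical member type that counts which special functions run.
struct Tracked
{
    static int default_ctor_calls;
    static int copy_ctor_calls;
    static int assign_calls;

    Tracked() { ++default_ctor_calls; }
    Tracked(const Tracked&) { ++copy_ctor_calls; }
    Tracked& operator=(const Tracked&) { ++assign_calls; return *this; }

    static void reset() { default_ctor_calls = copy_ctor_calls = assign_calls = 0; }
    static int total() { return default_ctor_calls + copy_ctor_calls + assign_calls; }
};

int Tracked::default_ctor_calls = 0;
int Tracked::copy_ctor_calls = 0;
int Tracked::assign_calls = 0;

// Members are copy-constructed directly from the arguments.
class PersonInit
{
    Tracked c_1;
    Tracked c_2;
public:
    PersonInit(const Tracked& c1, const Tracked& c2) : c_1(c1), c_2(c2) {}
};

// Members are default-constructed first, then assigned in the body.
class PersonAssign
{
    Tracked c_1;
    Tracked c_2;
public:
    PersonAssign(const Tracked& c1, const Tracked& c2) { c_1 = c1; c_2 = c2; }
};

// Returns the number of Tracked special-function calls made while
// constructing one object of type Person.
template <typename Person>
int calls_to_construct()
{
    Tracked a, b;
    Tracked::reset();
    assert(Tracked::total() == 0);
    Person p(a, b);
    (void)p;
    return Tracked::total();
}
```

With these definitions, calls_to_construct<PersonInit>() counts two copy-constructor calls, while calls_to_construct<PersonAssign>() counts two default-constructor calls plus two assignments.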

III. Inline functions

Inline functions remove the overhead of a function call while keeping the benefits of ordinary functions. However, inline functions are not a cure-all; in some cases they can even reduce the performance of the program. So use them with caution.

1. Let's first look at the benefits of inline functions. From the user's point of view, an inline function looks like a normal function: it can have parameters and a return value, and it has its own scope, yet it does not incur the overhead of a normal function call. In addition, it is safer and easier to debug than a macro. Be aware, of course, that the inline specifier is only a suggestion to the compiler, and the compiler is free to ignore it. So how does the compiler decide whether to inline? Normally, the key factors include the size of the function, whether it declares local objects, the complexity of the function, and so on.

2. So what happens if a function is declared inline but is not actually inlined? In theory, when the compiler refuses to inline a function, the function is treated like a normal function, but some other problems arise. Consider the following code:

    // filename time.h
    #include <ctime>
    #include <iostream>
    using namespace std;

    class Time
    {
    public:
        inline void Show() { for (int i = 0; i < 10; i++) cout << time(0) << endl; }
    };

Because the member function Time::Show() includes a local variable and a for loop, the compiler generally refuses to inline it and treats it as an ordinary member function. However, the header file containing the class declaration is included separately in each independent compilation unit:

    // filename f1.cpp
    #include "time.h"
    void f1()
    {
        Time t1;
        t1.Show();
    }

    // filename f2.cpp
    #include "time.h"
    void f2()
    {
        Time t2;
        t2.Show();
    }

As a result, the compiler generates two identical copies of the member function for this program:

    void f1();
    void f2();
    int main()
    {
        f1();
        f2();
        return 0;
    }

When the program is linked, the linker faces two identical copies of Time::Show(), so a "function redefined" link error occurs. Older C++ implementations dealt with this situation by treating a function that was not inlined as static. Each copy is then visible only in its own compilation unit, so the link error is avoided, but multiple copies of the function remain in the program. In this case the performance of the program does not improve, while compile time, link time and the size of the final executable all increase. Fortunately, the new C++ standard has changed this: a standard-conforming C++ implementation should generate only one copy of the function. However, it may take a long time before all compilers support this.

In addition, inline functions bring two more headaches. The first is maintenance. A function may start out inline, but as the system grows, the function body may need additional functionality that makes it unsuitable for inlining, so you have to remove the inline specifier and move the function body into a separate source file. The other problem arises when inline functions are used in a code library. When an inline function changes, users must recompile their code to pick up the change. For a non-inline function, users only need to relink.

What I want to say here is that inline functions are not a performance panacea. They give the effect we want only when the function is very short; if the function is not very short and is called in many places, the size of the executable increases. The most troublesome case is still when the compiler refuses to inline.
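This section began by noting that an inline function is safer than a macro. A minimal sketch of that point follows; SQUARE_MACRO, square and next_value are hypothetical names chosen for illustration and do not appear in the original article.

```cpp
#include <cassert>

// Macro version: the argument text is pasted twice, so side effects repeat.
#define SQUARE_MACRO(x) ((x) * (x))

// Inline function version: the argument is evaluated exactly once.
inline int square(int x)
{
    return x * x;
}

// Hypothetical helper that counts how often it is called.
inline int next_value(int& counter)
{
    ++counter;
    return 3;
}

// Returns how many times next_value() ran when used as the macro argument.
inline int evaluations_with_macro()
{
    int counter = 0;
    int result = SQUARE_MACRO(next_value(counter));
    assert(result == 9);
    (void)result;
    return counter;
}

// Returns how many times next_value() ran when used as the function argument.
inline int evaluations_with_function()
{
    int counter = 0;
    int result = square(next_value(counter));
    assert(result == 9);
    (void)result;
    return counter;
}
```

The macro expands to two calls of next_value(), so evaluations_with_macro() returns 2, whereas evaluations_with_function() returns 1 because a function argument is evaluated once before the call.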
In older implementations the results are very unsatisfactory; newer implementations are much improved but still not perfect. Some compilers can point out which functions can be inlined, but most compilers are not that smart, so we have to rely on our own experience to judge. If an inline function does not improve performance, avoid using it!

IV. Optimizing your memory usage

Optimization usually has several goals: faster running speed, efficient use of system resources, and smaller memory usage. In general, code optimization tries to improve on these aspects. The declaration-relocation technique shown earlier eliminates the construction and destruction of unneeded objects, which both reduces the size of the program and speeds it up. Other optimization techniques, however, address only one aspect: either faster speed or a smaller memory footprint. Sometimes these goals are mutually exclusive: squeezing memory usage often slows the code down, while fast code often needs more memory. The following summarizes two optimization methods for memory usage:

1. Bit fields

In C/C++, bit fields let you store and access data in the smallest possible unit: the bit. Because the bit is not a basic access unit in C/C++, the aim here is to reduce the use of memory and secondary storage, possibly at some cost in speed. Note: some hardware architectures provide special processor instructions for accessing bits, so the effect of bit fields on the speed of the program depends on the specific platform. In practice, many bits of a data item are wasted, because applications often do not need such a large value range.

You may ask: a bit is so small, can it really reduce the use of storage space? Indeed, when the amount of data is small the saving is negligible, but when the amount of data is huge the space saved is remarkable. You may also say that memory and hard drives keep getting cheaper, so why go to such trouble to save so little money. But there is another reason that will surely convince you: digital information transmission. A distributed database keeps multiple copies in different locations, so transmitting millions of records becomes very expensive. Now let's see how to do it. First look at the following code:

    struct BillingRec
    {
        long cust_id;
        long timestamp;
        enum CallType
        {
            toll_free, local, regional, long_distance, international, cellular
        } type;
        enum CallTariff
        {
            off_peak, medium_rate, peak_time
        } tariff;
    };

The structure above takes up 16 bytes on a 32-bit machine. Many of its bits are wasted, and the waste is especially serious in the two enum members. Now look at the improved version:

    struct BillingRec
    {
        int cust_id : 24;   // 23 bits + 1 sign bit
        int timestamp : 24;
        enum CallType { /* ... */ };
        enum CallTariff { /* ... */ };
        unsigned call : 3;
        unsigned tariff : 2;
    };

Now a record shrinks from 16 bytes to 8 bytes, half the size. The effect is quite remarkable :)

2. Unions

A union reduces memory waste by placing two or more data members at the same memory address, which requires that only one data member be valid at any time. A union can have member functions, including constructors and destructors, but it cannot have virtual functions. C++ supports anonymous unions. An anonymous union is an unnamed object of an unnamed type. For example:

    union { long n; void* p; }; // anonymous
    n = 1000L; // members are accessed directly
    p = 0;     // n is now also 0

Unlike a named union, it cannot have member functions or non-public data members. So when are unions useful? The class below retrieves a person's information from a database.
The key can be either a unique ID or a name, but the two are never valid at the same time:

    class PersonalDetails
    {
    private:
        char* name;
        long id;
        // ...
    public:
        PersonalDetails(const char* nm); // key is of type char* used
        PersonalDetails(long id) : id(id) {} // numeric key used
    };

The code above wastes memory, because only one key is valid at a time.
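The waste can be made visible with sizeof. The sketch below uses two hypothetical stand-in types, BothKeys and EitherKey, rather than the PersonalDetails class itself; the exact sizes depend on the platform, so the code only compares them.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the two-member layout: both keys occupy storage.
struct BothKeys
{
    char* name;
    long id;
};

// Hypothetical alternative: the two keys share the same storage.
union EitherKey
{
    char* name;
    long id;
};

// Larger of the two member sizes; a union needs at least this much.
inline std::size_t larger_member_size()
{
    return sizeof(char*) > sizeof(long) ? sizeof(char*) : sizeof(long);
}

// Bytes saved per record when the keys share storage.
inline std::size_t bytes_saved_per_record()
{
    assert(sizeof(BothKeys) >= sizeof(EitherKey));
    return sizeof(BothKeys) - sizeof(EitherKey);
}
```

On a typical 32-bit platform this reports 4 bytes saved per record; on common 64-bit platforms the figure is usually 8. Either way the saving only matters when it is multiplied by a very large number of records.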

An anonymous union can be used here to reduce memory usage, for example:

    class PersonalDetails
    {
    private:
        union // anonymous
        {
            char* name;
            long id;
        };
    public:
        PersonalDetails(const char* nm);
        PersonalDetails(long id) : id(id) { /**/ } // direct access to a member
        // ...
    };

By using a union, the size of the PersonalDetails class is halved. It should be said, however, that saving 4 bytes of memory is not worth the trouble a union brings, unless the class represents tens of millions of database records or records sent over a very slow communication line. Note that unions add no runtime overhead, so there is no speed loss here. The advantage of an anonymous union is that its members can be accessed directly.

V. Speed optimization

In applications with very demanding speed requirements, every CPU cycle counts. This section shows some simple methods of speed optimization.

1. Wrap long parameter lists in a class

The overhead of a function call grows with the length of the parameter list. The runtime system has to set up the stack to store the argument values, and when there are many parameters this can take a long time. Packing the parameter list into a separate class and passing it by reference saves a lot of time. Of course, if the function itself runs for a long time, the time needed to set up the stack is negligible and there is no need to do this. But for functions that run briefly and are called frequently, wrapping a long parameter list in an object and passing it by reference can improve performance.

2. Register variables

The register specifier tells the compiler that an object will be used heavily and may be placed in a register. For example:

    void f()
    {
        int* p = new int[3000000];
        register int* p2 = p; // store the address in a register
        for (register int j = 0; j < 3000000; j++)
        {
            *p2++ = 0;
        }
        // ... use p
        delete [] p;
    }

Loop counters are the best candidates for register variables.
When they are not kept in a register, much of the loop time is spent fetching the variable from memory and storing its new value. Keeping it in a register greatly reduces this overhead. Note that the register specifier is only a suggestion to the compiler. As with inline functions, the compiler may decline to store an object in a register. In addition, modern compilers already place variables in registers on their own as an optimization, and the register keyword was deprecated in C++11 and removed in C++17. The register storage specifier is not limited to basic types; it can be applied to any type of object. If an object is too large to fit in a register, the compiler may still put it in fast memory, such as cache. Declaring function parameters with the register storage specifier suggests that the compiler pass the arguments in registers instead of on the stack. For example:

    void f(register int j, register Date d);
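Item 1 of this section, wrapping a long parameter list in a class, might look like the sketch below. DrawParams, score_long_list and score_wrapped are hypothetical names, and the arithmetic is arbitrary. Whether the wrapped form is actually faster depends on the compiler and calling convention (on many current platforms the first few arguments are passed in registers), so the gain should be measured rather than assumed.

```cpp
#include <cassert>

// Hypothetical bundle of values that would otherwise be eight parameters.
struct DrawParams
{
    int x, y, width, height;
    int red, green, blue, alpha;
};

// Long parameter list: every call passes eight separate arguments.
long score_long_list(int x, int y, int width, int height,
                     int red, int green, int blue, int alpha)
{
    return static_cast<long>(x) * y + static_cast<long>(width) * height
           + red + green + blue + alpha;
}

// Wrapped version: every call passes a single reference.
long score_wrapped(const DrawParams& p)
{
    assert(p.width >= 0 && p.height >= 0);
    return static_cast<long>(p.x) * p.y + static_cast<long>(p.width) * p.height
           + p.red + p.green + p.blue + p.alpha;
}
```

A caller fills one DrawParams object and reuses it across calls, which also keeps call sites readable when the parameter set grows.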

3. Declare constant objects as const

By declaring an object const, you allow the compiler to take advantage of the declaration and place the object in a register.

4. Runtime overhead of virtual functions

When you call a virtual function, no extra overhead is incurred if the compiler can resolve the call statically. In addition, a very short virtual function can be inlined. In the following example, a smart compiler can call the virtual function statically:

    #include <iostream>
    using namespace std;

    class V
    {
    public:
        virtual void show() const { cout << "I'm V" << endl; }
    };

    // The definition of f() is truncated in this copy of the article; the
    // signature and parameter names here are assumed from the call f(v, &v).
    void f(V& v, V* pV)
    {
        v.show();
        pV->show();
    }

    void g()
    {
        V v;
        f(v, &v);
    }

    int main()
    {
        g();
        return 0;
    }

If the entire program is compiled as a single compilation unit, the compiler can inline g() in main(). The call to f() in g() can be inlined as well. Since the dynamic type of the arguments passed to f() is known at compile time, the compiler can resolve the virtual call statically. There is no guarantee that every compiler does this. Some compilers, however, do take advantage of knowing the dynamic type of the arguments at compile time, so the call is resolved at compile time and the overhead of dynamic binding is avoided.

5. Function objects vs. function pointers

The benefits of replacing function pointers with function objects are not limited to generality and easier maintenance. The compiler can also inline calls to a function object, which improves performance further.

VI. Last resorts

The optimization techniques shown so far do not compromise the design or the readability of the code. In fact, some of them also improve the robustness and maintainability of the software. However, in software developed under tight time and memory constraints, the techniques above may not be sufficient; it may be necessary to use techniques that affect the portability and extensibility of the software.
However, these techniques should be used only when all other optimization techniques have been applied and the requirements are still not met.

1. Turn off RTTI and exception handling support

When you move pure C code to a C++ compiler, you may notice some performance loss. This is not a fault of the language or the compiler, but the result of some adjustments made by the compiler. If you want the same performance as with a C compiler, turn off the compiler's support for RTTI and exception handling. Why? Because in order to support RTTI and exception handling, the C++ compiler inserts additional code. This increases the size of the executable and thereby reduces efficiency. For pure C code that additional code is unnecessary, so you can avoid it by turning the support off.

2. Inline assembly

Time-critical code can be rewritten in native assembly. The result may be a significant increase in speed. However, this approach should not be taken lightly, because it makes future modification very difficult. The programmers who maintain the code may not know assembly. If you want to run the software on other platforms, the assembly sections have to be rewritten. In addition, developing and testing assembly code is hard work that takes longer.

3. Interact directly with the operating system

API functions let you interact directly with the operating system. Sometimes executing a system command directly can be much faster. For this purpose you can use the standard function system().
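Looking back at item 5 of the speed section (function objects vs. function pointers), the difference can be illustrated with std::sort. This is a sketch with hypothetical names (greater_than, GreaterThan, sorted_with_pointer, sorted_with_object); both versions produce the same order, and whether the function-object version is actually faster depends on how well the compiler inlines each call.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Plain function, passed to std::sort through a function pointer.
bool greater_than(int a, int b)
{
    return a > b;
}

// Function object: the call operator is part of the type, so the compiler
// knows at compile time exactly which code std::sort will call.
struct GreaterThan
{
    bool operator()(int a, int b) const { return a > b; }
};

// Sorts a copy in descending order using a function pointer.
std::vector<int> sorted_with_pointer(std::vector<int> values)
{
    std::sort(values.begin(), values.end(), &greater_than);
    return values;
}

// Sorts a copy in descending order using a function object.
std::vector<int> sorted_with_object(std::vector<int> values)
{
    std::sort(values.begin(), values.end(), GreaterThan());
    assert(std::is_sorted(values.begin(), values.end(), GreaterThan()));
    return values;
}
```

Timing the two functions on a large vector in a release build is a reasonable way to see whether the difference matters for a given compiler.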

When reposting, please credit the original address: https://www.9cbs.com/read-102126.html
