September 2003
Although the C language has been around for nearly 30 years, its appeal has not faded. It continues to attract many people who must learn new skills in order to write new applications, or to port or maintain existing ones.
Introduction
This article was written with the needs of developers in mind. We have put together a set of guidelines that have served us well for many years, both as developers and as consultants, and we offer them as suggestions to help you in your work. You may not agree with all of them, but we hope you will like some of them and use them in your programming or porting projects.
Styles and guidelines
Use a source code style that makes the code readable and consistent. Unless you have a team style or a style of your own, you could use a style similar to the Kernighan and Ritchie style used by most C programmers. Taken to an extreme, however, it is possible to end up with code like the following:

    int i;main(){for(;i["]<i;++i){--i;}"];read('-'-'-',i+++"hell\
    o, world!\n",'/'/'/'));}read(j,i,p){write(j/p+p,i---j,i/i);}

(Dishonorable mention, Obfuscated C Code Contest, 1984. The author requested anonymity.)

It is common to see the main routine defined as main(). The ANSI way of writing this is int main(void) (if the command-line arguments are of no interest) or int main(int argc, char **argv). Pre-ANSI compilers would omit the void declaration, or list the variable names and follow them with their declarations.

Whitespace
Use horizontal and vertical whitespace generously. Indentation and spacing should reflect the block structure of the code. A long string of conditional operators should be split onto separate lines. For example:

    if (foo->next==NULL && number && node_active(this_input)) {...

might be better as:

    if (foo->next == NULL
        && number
        && node_active(this_input))
    {
        ...

Similarly, an elaborate for loop should be split across several lines:

    for (curr = *varp, trail = varp;
         curr != NULL;
         trail = &(curr->next), curr = curr->next)
    {
        ...

Other complex expressions, such as those using the ternary ?: operator, are also best split across several lines:

    z = (x == y)
        ? n + f(x)
        : f(y) - n;

Comments
Comments should describe what is happening, how it is being done, which global variables are used, and any restrictions or bugs. Avoid unnecessary comments, however. If the code is clear and uses good variable names, it should be able to explain itself. Because the compiler does not check comments, there is no guarantee that they are right. A comment that disagrees with the code has negative value, and too many comments clutter the code.

Here is a superfluous comment:

    i = i + 1;    /* Add one to i */

It is perfectly clear that the variable i is being incremented by one. There are even worse ways to do it:

    /************************************
     *                                  *
     *          Add one to i            *
     *                                  *
     ************************************/
    i = i + 1;

Naming conventions
Names with leading and trailing underscores are reserved for system use and should not be used for any user-created names. Convention dictates that:

- #define constants should be in all capitals.
- enum constants should be capitalized or in all capitals.
- Function, typedef, and variable names, as well as struct, union, and enum tag names, should be in lowercase.

For clarity, avoid names that differ only in case, such as foo and Foo. Likewise, avoid pairs such as foobar and foo_bar. Avoid names that look like one another. On many terminals and printers, "l", "1" and "I" look very similar. A variable named "l" is especially unwise because it looks so much like the constant "1".
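As a rough illustration of the naming conventions above, here is a minimal sketch. Every name in it (MAX_QUEUE_LEN, queue_state, queue, queue_state_of) is hypothetical and chosen only to show the capitalization rules; it is not taken from any real library.

```c
#include <stddef.h>

/* All names below are hypothetical; they only illustrate the conventions. */

#define MAX_QUEUE_LEN 8          /* #define constant: all capitals */

/* enum tag in lowercase, enum constants in all capitals */
enum queue_state { QUEUE_EMPTY, QUEUE_PARTIAL, QUEUE_FULL };

struct queue {                   /* struct tag: lowercase */
    size_t item_count;           /* variable name: lowercase */
};

/* Function name in lowercase; the name says what is returned. */
enum queue_state queue_state_of(const struct queue *q)
{
    if (q->item_count == 0)
        return QUEUE_EMPTY;
    if (q->item_count >= MAX_QUEUE_LEN)
        return QUEUE_FULL;
    return QUEUE_PARTIAL;
}
```

A caller can then write a test such as queue_state_of(&q) == QUEUE_FULL, which reads naturally inside an if clause.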

Variable names
When choosing a variable name, length is not important but clarity of expression is. A long name can be used for a global variable that is rarely used, but an array index used on every line of a loop need not be named anything more elaborate than i. Using "index" or "elementnumber" instead is not only more to type but can also obscure the details of the computation. With long variable names it is sometimes harder to see what is going on. Compare:

    for (i = 0; i < 100; i++)
        array[i] = 0;

with:

    for (elementnumber = 0; elementnumber < 100; elementnumber++)
        array[elementnumber] = 0;

Function names
Function names should reflect what the function does and what it returns. Functions are used in expressions, often in an if clause, so they need to read appropriately. For example:

    if (checksize(x))

is unhelpful because it does not tell us whether checksize returns true on error or on success, whereas:

    if (validsize(x))

makes the intent clear.

Declarations
All external data declarations should be preceded by the extern keyword.

The pointer qualifier "*" should be placed next to the variable name rather than the type. That is, write:

    char *s, *t, *u;

instead of:

    char* s, t, u;

The latter declaration is not wrong, but it is probably not what was intended, because t and u are not declared as pointers.

Header files
Header files should be organized by function; that is, declarations for separate subsystems should be in separate header files. In addition, declarations that are likely to change when the code is ported from one platform to another should be in a separate header file.

Avoid private header file names that are the same as library header file names. The statement #include "math.h" includes the standard library math header if the intended file is not found in the current directory. If that is what you want to happen, add a comment saying so.
Finally, using absolute path names for header files is not a good idea. The include-path option of the C compiler (-I on many systems) is the preferred method for handling large collections of private header files; it permits reorganizing the directory structure without altering the source files.

scanf
scanf should never be used in serious applications. Its error detection is inadequate. Consider the following example:

    #include <stdio.h>

    int main(void)
    {
        int i;
        float f;

        printf("Enter an integer and a float: ");
        scanf("%d %f", &i, &f);
        printf("I read %d and %f\n", i, f);
        return 0;
    }

Test run:

    Enter an integer and a float: 182 52.38
    I read 182 and 52.380001

Another test run:

    Enter an integer and a float: 6713247896 4.4
    I read -1876686696 and 4.400000

++ and --
When the increment or decrement operator is applied to a variable in a statement, that variable should not appear more than once in the statement, because the order of evaluation is compiler-dependent. Do not write code that assumes an order, or that works as desired on one machine but has no clearly defined behavior:

    int i = 0, a[5];

    a[i] = i++;    /* assign to a[0]? or a[1]? */

Don't let yourself believe you see what isn't there
Look at the following example:

    while (c == '\t' || c = ' ' || c == '\n')
        c = getc(f);

At first glance, the condition in the while clause appears to be valid C. However, using the assignment operator instead of the comparison operator makes the code invalid. The precedence of = is lower than that of == and ||, so the statement would have to be interpreted as follows (parentheses added for clarity):

    while ((c == '\t' || c) = (' ' || c == '\n'))
        c = getc(f);

The clause on the left side of the assignment operator is:

    (c == '\t' || c)

which does not yield an lvalue. If c contains a tab, the result is "true" and no further evaluation is performed, and "true" cannot stand on the left side of an assignment.

Be clear in your intentions
When you write something that could be interpreted in more than one way, use parentheses or other means to make sure your intent is clear. This helps you understand what you meant if you have to deal with the program later, and it makes things easier for anyone else who has to maintain the code.

It is sometimes possible to code in a way that anticipates likely mistakes. For example, you can put constants on the left side of a comparison. That is, instead of writing:

    while (c == '\t' || c == ' ' || c == '\n')
        c = getc(f);

you can write:

    while ('\t' == c || ' ' == c || '\n' == c)
        c = getc(f);

With this style, a mistyped operator produces a compiler diagnostic:

    while ('\t' = c || ' ' == c || '\n' == c)
        c = getc(f);

This style lets the compiler find the problem; the statement above is invalid because it tries to assign a value to '\t'.

Trouble from unexpected corners
C implementations generally differ from one another in some respects. It helps to stick to the parts of the language that are likely to be common to all implementations. Doing so makes it easier to port your program to a new machine or compiler, and less likely that you will run into compiler idiosyncrasies. For example, consider the string:

    /*/*/2*/**/1

This takes advantage of the "maximal munch" rule. If comments nest, it is interpreted as:

    /* /* /2 */ * */ 1

The two /* symbols match the two */ symbols, so the value is 1. If comments do not nest, some systems ignore a /* inside a comment and others issue a warning for it. In either case, the expression is interpreted as:

    /* / */ 2 * /* */ 1

and 2 * 1 evaluates to 2.

Flushing the output buffer
When an application terminates abnormally, the tail end of its output is often lost. The application may not have had the opportunity to flush its output buffers completely. Part of the output may still be in memory and will never be written. On some systems, this output could be several pages long.
Losing output this way can be misleading, because it gives the impression that the program failed much earlier than it actually did. The way to address this problem is to force the output to be unbuffered, especially when debugging. The exact method varies from system to system, but it usually looks something like this:

    setbuf(stdout, (char *) 0);

This statement must be executed before anything is written to stdout. Ideally, it should be the first statement in the main program.

getchar(): macro or function
The following program copies its input to its output:

    #include <stdio.h>

    int main(void)
    {
        register int a;

        while ((a = getchar()) != EOF)
            putchar(a);
    }

Removing the #include statement from the program would cause it to fail to compile, because EOF would be undefined. We could rewrite the program as follows:

    #define EOF -1

    int main(void)
    {
        register int a;

        while ((a = getchar()) != EOF)
            putchar(a);
    }

This works on many systems, but on some it runs much more slowly. Because function calls usually take a long time, getchar is often implemented as a macro. This macro is defined in stdio.h, so when #include <stdio.h> is removed, the macro definition is no longer available to the compiler.

What does a+++++b mean?
The only meaningful way to parse this expression is:

    a ++ + ++ b

However, the "maximal munch" rule requires it to be broken down as:

    a ++ ++ + b

This is syntactically invalid; it is equivalent to:

    ((a++)++) + b

But the result of a++ is not an lvalue, so it is not acceptable as an operand of ++. Thus the rules for resolving lexical ambiguity make it impossible to resolve this example in a way that is syntactically meaningful. In practice, the prudent course is to avoid constructs like this unless you are absolutely certain what they mean. Adding whitespace helps the compiler understand the intent of the statement, but from a code maintenance perspective it is preferable to split the construct across more than one line:

    ++b;
    (a++) + b;

Treat functions with care
Functions are the most general structuring concept in C.
They should be used to implement "top-down" problem solving, that is, breaking a problem into smaller and smaller subproblems until each piece can be readily expressed in code. This aids the modularity and documentation of programs. Moreover, programs composed of many small functions are easier to debug.

Cast all function arguments to the expected type if they are not of that type already, even when you are convinced that this is unnecessary, because an unconverted argument may hurt you when you least expect it. In other words, the compiler will often promote and convert data types to conform to the declaration of the function parameters. Doing so manually in the code, however, makes the programmer's intent explicit and may ensure correct results if the code is ever ported to another platform.

If the header files fail to declare the return types of library functions, declare them yourself. Surround your declarations with #ifdef/#endif statements in case the code is ever ported to another platform.

Function prototypes should be used to make code more robust and, in some cases, faster.

Dangling else
Stay away from the "dangling else" problem unless you know what you are doing:

    if (a == 1)
        if (b == 2)
            printf("***\n");
        else
            printf("###\n");

The rule is that an else attaches to the nearest if. When in doubt, or if there is any potential for ambiguity, add curly braces to make the block structure of the code clear.

Array bounds
Check the array bounds of all arrays, including strings, because where you type "fubar" today, someone may someday type "floccinaucinihilipilification". Robust production software should not use gets().

The fact that C subscripts start from zero makes all kinds of counting problems easier. However, it takes some effort to learn how to handle them.
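To make the array-bounds advice concrete, here is a minimal sketch of a bounded string copy. The helper copy_bounded is hypothetical (it is not a standard library function), and the choice to truncate and report an error, rather than reject the input outright, is an assumption made for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * copy_bounded is a hypothetical helper, not a standard library function.
 * It copies src into dest without ever writing more than dest_size bytes.
 * Returns 0 if the whole string fit, and -1 if it had to be truncated
 * or if the arguments left no room at all.
 */
int copy_bounded(char *dest, size_t dest_size, const char *src)
{
    size_t src_len;

    if (dest == NULL || src == NULL || dest_size == 0)
        return -1;

    src_len = strlen(src);
    if (src_len >= dest_size) {
        memcpy(dest, src, dest_size - 1);   /* keep only what fits */
        dest[dest_size - 1] = '\0';         /* always terminate */
        return -1;
    }
    memcpy(dest, src, src_len + 1);         /* includes the terminating '\0' */
    return 0;
}
```

With an 8-byte buffer, "fubar" is copied whole, while the longer word is cut off after seven characters and the caller is told about it. For reading input, the standard fgets(buf, sizeof buf, stdin) call gives the same kind of bound that gets() lacks.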

Null statement
The null body of a for or while loop should be alone on a line and commented, so that it is clear that the null body is intentional and not missing code.

    while (*dest++ = *src++)
        ;    /* VOID */

Test for true or false
Do not default the test for non-zero; that is:

    if (f() != FAIL)

is better than:

    if (f())

even though FAIL may have the value 0, which C considers to be false. (Of course, balance this against constructs such as the one shown in the "Function names" section.) An explicit test will help you later when somebody decides that a failure return should be -1 instead of 0.

A frequent trouble spot is using the strcmp function to test for string equality; its result should never be defaulted. The preferred approach is to define a macro STREQ:

    #define STREQ(str1, str2) (strcmp((str1), (str2)) == 0)

With this macro, a statement such as:

    if (STREQ(inputstring, somestring)) ...

carries an implied behavior that is unlikely to change behind your back (people tend not to rewrite or redefine standard library functions such as strcmp()).

Do not check a Boolean value for equality with 1 (TRUE, YES, and so on); instead, test for inequality with 0 (FALSE, NO, and so on). Most functions are guaranteed to return 0 if false, but only non-zero if true. Thus:

    if (func() == TRUE) {...

is better written as:

    if (func() != FALSE)

Embedded statements
There is a time and a place for embedded assignment statements. In some constructs there is no better way to achieve the result without making the code bulkier and less readable:

    while ((c = getchar()) != EOF) {
        /* process the character */
    }

Using embedded assignment statements to improve run-time performance is also possible. However, you should weigh the trade-off between increased speed and decreased maintainability that results when embedded assignments are used in artificial places.
For example:

    x = y + z;
    d = x + r;

should not be replaced by:

    d = (x = y + z) + r;

even though the latter may save one cycle. In the long run, the time difference between the two will decrease as optimizers improve, while the difference in ease of maintenance will increase.

goto statements
goto should be used sparingly. The one place where it can be usefully employed is to break out of several levels of switch, for, and while nesting, although the need to do so may indicate that the inner constructs should be broken out into a separate function.

    for (...) {
        while (...) {
            ...
            if (wrong)
                goto error;
        }
    }
    ...
    error:
        /* print a message */

When a goto is necessary, the accompanying label should be alone on a line, and either set one tab stop to the left of the code that follows or placed at the beginning of the line. Both the goto statement and its target should be commented to explain their role and purpose.

Fall-through in switch
When a block of code has several labels, place the labels on separate lines. This style is consistent with the use of vertical whitespace and makes rearranging the case options, should that be required, a simple task. The fall-through feature of the C switch statement must be commented for the sake of future maintenance. If you have ever been "bitten" by this feature, you will appreciate how important this is.

    switch (expr) {
    case ABC:
    case DEF:
        statement;
        break;
    case UVW:
        statement;    /* FALLTHROUGH */
    case XYZ:
        statement;
        break;
    }

Although the last break is technically unnecessary, using it consistently prevents a fall-through error if another case is later added after the last one. The default case, if used, should always be last, and it does not require a final break statement if it is last.

Constants
Symbolic constants make code easier to read. Numerical constants should generally be avoided; use the #define facility of the C preprocessor to give constants meaningful names.
Defining the value in one place (preferably a header file) also makes large programs easier to administer, because the constant can be changed everywhere by changing only the definition. Consider using the enumeration data type as an improved way to declare variables that take on only a discrete set of values. Using enumerations also lets the compiler warn you of any misuse of an enumerated type. At the very least, any directly coded numerical constant must have a comment explaining the derivation of the value.

Constants should be defined consistently with their use; for example, use 540.0 for a floating-point number rather than 540 with an implicit conversion to floating point. That said, there are some cases in which the constants 0 and 1 may appear as themselves rather than as defined names. For example, if a for loop indexes through an array, then:

    for (i = 0; i < ...; i++)

is quite reasonable, whereas the code:

    gate_t *front_gate = opens(gate[i], 7);

    if (front_gate == 0)
        error("can't open %s\n", gate[i]);

is not. In the second example, front_gate is a pointer; when a value is a pointer, it should be compared to NULL rather than 0. Even simple values such as 1 or 0 are often better expressed using definitions such as TRUE and FALSE (sometimes YES and NO read better).

Do not use floating-point variables where discrete values are needed, because floating-point numbers are represented inexactly (see the first test run in the scanf section above, where 52.38 is read back as 52.380001). Test floating-point numbers using <= or >=; an exact comparison (== or !=) may not detect an "acceptable" equality.

Simple character constants should be defined as character literals rather than numbers. Non-text characters are discouraged because they are not portable. If non-text characters are necessary, particularly in strings, they should be written using an escape sequence of three octal digits rather than one (for example, "\007"). Even so, such usage should be considered machine-dependent and treated as such.
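One common way to apply the advice about testing floating-point values with <= rather than == is to compare against a tolerance. The sketch below is only an illustration: the helper nearly_equal and the constant LENGTH_TOLERANCE are hypothetical names, and the tolerance value is an assumption that would have to be chosen to suit the data at hand.

```c
/* Hypothetical tolerance; pick a value that suits your data. */
#define LENGTH_TOLERANCE 0.001

/*
 * nearly_equal is a hypothetical helper.
 * Returns 1 if a and b differ by no more than tolerance, otherwise 0.
 * The absolute difference is computed by hand so that no math library
 * is needed.
 */
int nearly_equal(double a, double b, double tolerance)
{
    double diff = a - b;

    if (diff < 0.0)
        diff = -diff;
    return diff <= tolerance;
}
```

With this helper, the value 52.380001 from the scanf test run would be accepted as equal to 52.38 under LENGTH_TOLERANCE, whereas an exact == comparison would reject it.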

Conditional compilation
Conditional compilation is useful for machine dependencies, debugging, and setting certain options at compile time. Various controls can easily combine in unforeseen ways. If you use #ifdef for machine dependencies, make sure that when no machine is specified the result is an error, not a default machine. The #error directive comes in handy for this purpose. If you use #ifdef for optimizations, the default should be the unoptimized code rather than an uncompilable or incorrect program. Be sure to test the unoptimized code.

Miscellaneous
Utilities for compiling and linking, such as make, considerably simplify the task of moving an application from one environment to another. During development, make recompiles only those modules that have changed since the last time make was run.

Use lint frequently. lint is a C program checker that examines C source files to detect and report type incompatibilities, inconsistencies between function definitions and calls, and potential program bugs.

Also, study the compiler documentation for switches that encourage it to be "picky". The compiler's job is to be precise, so let it report potential problems by using the appropriate command-line options.

Minimize the number of global symbols in the application. One benefit is a lower probability of conflicts with system-defined functions.

Many programs fail when their input is missing. All programs should be tested for empty input. This is also likely to help you understand how the program works.

Do not assume any more about your users or your implementation than you have to. Things that "cannot happen" sometimes do happen. A robust program defends against them. If there is a boundary condition to be found, your users will somehow find it!

Never make any assumptions about the size of a given type, especially pointers.
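One way to avoid assumptions about a type is to let the compiler supply its size through sizeof instead of writing a byte count by hand. The sketch below shows the idea; make_int_array is a hypothetical helper, and the decision to zero the elements explicitly is an illustrative choice rather than a requirement.

```c
#include <stdlib.h>

/*
 * make_int_array is a hypothetical helper.
 * It allocates room for count ints and sets every element to zero.
 * The element size comes from sizeof, so nothing is assumed about how
 * many bytes an int occupies on a given platform.
 * Returns NULL if count is zero or the allocation fails; the caller
 * releases the array with free().
 */
int *make_int_array(size_t count)
{
    int *items;
    size_t i;

    /* Refuse empty requests and requests whose byte count would overflow. */
    if (count == 0 || count > (size_t)-1 / sizeof *items)
        return NULL;

    items = malloc(count * sizeof *items);
    if (items == NULL)
        return NULL;

    for (i = 0; i < count; i++)
        items[i] = 0;            /* set explicitly; malloc does not initialize */
    return items;
}
```

Writing sizeof *items rather than a literal such as 4 means the same source line stays correct if the code is later built on a platform where int has a different size.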

When char types are used in expressions, some implementations treat them as unsigned and others treat them as signed. It is advisable to cast them explicitly whenever they are used in arithmetic expressions.

Do not rely on the initialization of automatic variables or of memory returned by malloc.

Make your program's purpose and structure clear.

Keep in mind that you or someone else will probably be asked to modify your code or run it on a different machine at some point. Write your code carefully so that it can be ported to other machines.

Conclusion
It is common knowledge that application maintenance takes up a significant amount of a programmer's time. Part of the reason is the use of non-portable and non-standard features, along with less than desirable programming style. In this article we have presented some guidelines that have served us well over the years. We believe that these guidelines, when followed, will make application maintenance easier in a team environment.

References
Obfuscated C and Other Mysteries, by Don Libes, John Wiley and Sons, Inc., ISBN 0-471-57805-3
The C Programming Language, Second Edition, by Brian W. Kernighan and Dennis M. Ritchie, Prentice-Hall, ISBN 0-13-110370-9
Safer C, by Les Hatton, McGraw-Hill, ISBN 0-07-707640-0
C Traps and Pitfalls, by Andrew Koenig, AT&T Bell Laboratories, ISBN 0-201-17928-9

About the authors
Shiv Dutta is a technical consultant in the IBM Systems Group, where he helps independent software vendors enable their applications on pSeries servers. Shiv has extensive experience as a software developer, system administrator, and instructor. He provides support in AIX system administration, problem determination, performance tuning, and sizing guidance. Shiv has worked with AIX since its inception. He holds a doctorate in physics from Ohio University and can be reached at SDUTTA@us.ibm.com.

Gary R.
Hook is a senior technical consultant at IBM, providing application development, porting, and technical assistance to independent software vendors. Mr. Hook's professional experience is mainly in UNIX-based application development. When he joined IBM in 1990, he worked at the AIX Technical Support Center in Southlake, Texas, providing consulting and technical support services to customers, with a focus on AIX application architecture. Now based in Austin, Mr. Hook was a member of the AIX Kernel Development team from 1995 to 2000, specializing in the AIX linker, loader, and general application development tools. You can reach him at ghook@us.ibm.com.