Writing Solid Code: Microsoft's Techniques for Developing Bug-Free C Programs

xiaoxiao, 2021-03-06

Dedicated to my wife, Beth,

and to my parents, Joseph and Julia Maguire,

for their love and support

Preface

In 1986, after consulting and working for several small companies to gain experience writing Macintosh applications, I deliberately sought out Microsoft and joined its Macintosh development team. This team was responsible for developing Microsoft's graphical spreadsheet application.

At the time I was not sure what the code would look like; I assumed it would be dazzling and elegant. But the code I saw was quite ordinary, no different from other code I had seen. Excel had a beautiful user interface, and it was easier to use and more intuitive than the character-based spreadsheets of the day. What impressed me more, though, was the multi-purpose debugging system built into the product.

The system was designed to alert programmers and testers to errors automatically, much as the warning lights in the cockpit of a Boeing 747 alert the pilots. The debugging system was used mainly to monitor the code rather than to test it. Although the concepts behind it were not new, the extent to which the team used it and how effectively it caught errors fascinated and inspired me. Before long I found that most Microsoft projects had extensive internal debugging systems, and that Microsoft programmers paid close attention to the errors in their code and to their causes.

After two years on Macintosh Excel, I left the team to help another group whose code was riddled with errors. I found that during my two years on Excel, Microsoft had doubled in size, but many of the concepts familiar to the older projects had not been passed on to the new project groups as the company grew. New programmers did not place special value on the programming habits that were taken seriously when I joined Microsoft; they gave them only passing attention.

After I moved to the new project group, I once remarked to a fellow programmer, "Some of the concepts for writing bug-free code should be written down so that they can spread to the new project groups." Another programmer said to me, "You're always wanting to write documents. Why don't you write a book and ask whether Microsoft Press is willing to publish it? After all, this information isn't proprietary; its purpose is just to make programmers pay more attention to errors."

At the time I did not think much of the proposal, mainly because I had no time and had never written a book. The closest I had come was helping to run a programming column for Hi-Res magazine in the 1980s, which is hardly the same thing.

As you can see, the book got written anyway. The reason is simple: in 1990, Microsoft canceled an unannounced product because of a growing number of bugs. Canceling a project is nothing unusual now, and several of Microsoft's competitors have canceled projects. But it was the first time Microsoft had canceled a project for this reason. More recently, product bugs kept appearing; management finally had enough and took a series of measures to try to bring the bug rate back down to its earlier level. Even so, nobody was writing down the techniques for eliminating these bugs.

Microsoft is now more than nine times the size it was when I joined. It is hard to imagine the company bringing its bug rate back down to the earlier level without an accurate guide, especially now that Windows and Macintosh applications have become so much more complex.

The above is the reason I finally decided to write this book.

Microsoft Press has agreed to publish this book.

And that is how this book came about.

I hope you enjoy this book. I have tried to make it as entertaining, and as far from boring, as possible.

Steve Maguire

Seattle, Washington

October 22, 1992

Acknowledgments

I would like to thank everyone at Microsoft Press who helped with this book, especially the two people who guided me throughout the writing. First, I would like to thank my editor Mike Halvorson, who let me finish this book at my own pace and patiently answered the many questions this first-time author raised. I would also like to thank my manuscript editor, Erin O'Connor, who put in many extra hours to give me early feedback on the chapters as I wrote them. Without their help there would be no book. Erin also encouraged me to write in a humorous style, and it certainly did not hurt that she laughed at the jokes in the text.

I would also like to thank my father, Joseph Maguire, who introduced me to the early microcomputer world in the mid-1970s (the Altair, the IMSAI, and the Sol-20) and so set me on this career. I would like to thank Evan Rosen, my partner at Valpar International from 1981 to 1983. He is one of the most influential people in my career, and his knowledge and insights are reflected in this book. And there is Paul Davis: over the past ten years I have had the pleasure of working with him on many projects across the country, and he has undoubtedly shaped the way I think.

Finally, I thank everyone who took the time to read drafts of this book and provide technical feedback: Mark Gerber, Melissa Glerum, Chris Mason, Dave Moore, John Rae-Grant, and Alex Tilles. Special thanks go to Eric Schlegel and Paul Davis, who not only reviewed the drafts but also helped me shape the ideas in this book.

Naming conventions

The naming convention used in this book is similar to the "Hungarian" naming convention used at Microsoft. The convention was developed by Charles Simonyi, who was born in Budapest, Hungary. It builds extra information into data and function names so that programmers can understand the program more easily. For example:

    char ch;    /* all character variables begin with ch */
    byte b;     /* all bytes begin with b */
    long l;     /* all longs begin with l */

For a pointer to a data type, form the type's name as above and then add the prefix letter p:

    char *pch;      /* pointers to characters begin with pch */
    byte *pb;       /* similarly for bytes */
    long *pl;       /* and for longs */
    void *pv;       /* the void pointer, a deliberate special case */
    char **ppch;    /* pointer to a character pointer */
    byte **ppb;     /* pointer to a byte pointer */

Hungarian names are usually not easy to pronounce, but when you read them in code they convey a lot of information. For example, when you see a variable named pch in a function, you know without looking at its declaration that it is a pointer to a character.

To make Hungarian names more descriptive, or to distinguish between two variables, you can add a "tag" that begins with a capital letter after the base name. For example, the strcpy function takes two character pointers: a source pointer and a destination pointer. With Hungarian naming, its prototype is:

    char *strcpy(char *pchTo, char *pchFrom);    /* prototype */

In this example the two character pointers have something in common: both point to strings terminated by a zero character. So in this book, whenever a character pointer points to a string, we use the more meaningful name str. The prototype of strcpy then becomes:

    char *strcpy(char *strTo, char *strFrom);    /* prototype */

This book also uses the ANSI standard type size_t. Some typical uses of this type are shown below:

    size_t sizeNew, sizeOld;                     /* variables */
    void *malloc(size_t size);                   /* prototype */
    void *realloc(void *pv, size_t sizeNew);     /* prototype */

Functions and arrays follow the same convention: the name begins with the name of the type returned, followed by a descriptive tag. For example:

    ch = chLastKeyPressed;      /* get a character from a variable */
    ch = chInputBuffer[i];      /* get a character from an array */
    ch = chReadKeyboard();      /* get a character from a function */

With Hungarian naming, malloc and realloc could be written as follows:

    void *pvNewBlock(size_t size);                     /* prototype */
    void *pvResizeBlock(void *pv, size_t sizeNew);     /* prototype */

Because Hungarian naming is meant to make programs easier for programmers to understand, most Hungarian names exceed the six-character limit that strict ANSI C imposes. This is not a serious problem: unless the system you use was designed decades ago, the six-character limit is only a historical relic.

The above does not cover the details of the Hungarian naming convention. It gives only what readers need in order to understand the variable and function names used in this book. Readers interested in the details can refer to Simonyi's doctoral dissertation, listed in the references at the end of this book.
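To see how these prefixes read in practice, here is a minimal sketch of a small routine written with the conventions described above. The function sizeCountByte and the byte typedef are assumptions made for illustration; neither appears in the original text.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte;   /* assumed definition of the book's byte type */

/* sizeCountByte is a hypothetical helper: the "size" prefix says it
   returns a size_t, and the tag says what that size means. */
size_t sizeCountByte(const void *pv, byte b, size_t size)
{
    const byte *pb = (const byte *)pv;   /* pb: pointer to a byte */
    size_t sizeFound = 0;                /* size_t with a descriptive tag */

    assert(pv != NULL || size == 0);

    while (size-- > 0) {
        if (*pb++ == b)
            sizeFound++;
    }
    return sizeFound;
}
```

Reading the body, each name announces its type: pv is a void pointer, pb a byte pointer, b a byte, and anything beginning with size is a size_t.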

Some background

This book mentions some software and hardware systems whose names may be unfamiliar to some readers. The following briefly describes the most common ones.

Macintosh: The Macintosh is Apple's graphical windowing computer, announced in 1984. It was the first popular computer to support a "what you see is what you get" user interface.

Windows: Windows is Microsoft's graphical windowing operating system. Microsoft announced Windows 3.0 in 1990, a version significantly better than its predecessors.

Excel: Excel is Microsoft's graphical spreadsheet. It was first released for the Macintosh in 1985 and later ported to Windows after a great deal of rewriting and painstaking work. For many years Macintosh Excel and Windows Excel shared a name but not the same code.

In this book I mention my experience as a Macintosh Excel programmer many times. I should point out, though, that most of my work was porting Windows Excel code to Macintosh Excel or implementing features similar to those in Windows Excel. I had nothing particular to do with the product's amazing success.

The only real contribution I made to Macintosh Excel was convincing Microsoft to abandon the separate Macintosh Excel code and build the Macintosh product directly from the Windows Excel source code. Macintosh Excel 2.2 was the first version based on Windows Excel, sharing 80% of its source code. This mattered a great deal to Macintosh Excel users, who saw a big jump in features and quality with version 2.2.

Word: Word is Microsoft's word-processing application. There are actually three versions: the character-based DOS version that runs on MS-DOS, the Macintosh version, and the Windows version. So far the three have been built from different source code, but they are similar enough that users can move between them without difficulty. Given Excel's success with shared code, Microsoft has decided that later versions of Word will also be developed from shared code.

80x86: The 80x86 is the Intel CPU family commonly used in MS-DOS and Windows machines.

680x0: The 680x0 is the Motorola CPU family used in the various Macintosh models.

Introduction

A few years ago, while I was reading Donald Knuth's TeX: The Program, I was struck by a passage in the preface:

I believe that the final bug in TeX was discovered and removed on November 27, 1985. But if, somehow, an error still lurks in the code, I shall gladly pay a finder's fee of $20.48 to the first person who discovers it. (This is twice the previous amount, and I plan to double it again in a year; you see, I am confident.)

I am not interested in whether Knuth ever paid anyone $20.48, or even $40.96. That is not important. What matters is his confidence in his program. How many programmers do you know who would seriously claim that their programs are bug-free? How many would dare to print that claim in a book and offer to pay whoever finds a bug?

A programmer who was confident that the testing group had found all the bugs might dare to make such a statement. But that is exactly the problem. Whenever code is packaged up and sent to the distributors, people say with hope in their hearts, "I hope all the bugs have been found." I have seen this scene myself.

Because modern programmers have given up the duty of thoroughly testing their own code, they cannot know whether the code has bugs. No manager ever announces, "Don't worry about testing; the testers will do it for you." It is subtler than that: managers do want programmers to test their code, but they expect the testers to be more thorough, because that is their job after all.

As you will see in this book, there are hundreds of techniques that programmers can use to find bugs but testers cannot, because the techniques are tied directly to writing the code.

Two key questions

All the techniques presented in this book came from finding a bug and then persistently asking the following two questions:

• How could I have detected this bug automatically?

• How could I have prevented this bug?

The first question may make readers think this is a book about testing. It is not. Is an editor testing when it flags a syntax error? No. The editor is simply detecting an error in the code automatically. Syntax errors are only the most basic kind of error that programmers can have detected automatically. This book describes in detail methods for alerting the programmer to errors automatically.

The best way to write bug-free code is to prevent bugs in the first place, and there are many techniques for this as well. Some relate to common coding practices, but they are not presented as "everyone breaks this rule" or "nobody breaks this rule"; instead, the details are discussed. Remember that following the crowd is often the worst path you can choose. Be sure a practice is right before you adopt it, and do not adopt it just because others do. The last chapter of this book discusses the importance of having the right attitude toward writing code. Without the right attitude, finding and preventing bugs is like heating a house in winter with the windows open: it can be done, but it wastes a great deal of energy.

Every chapter except Chapters 4 and 8 has exercises. Note that these exercises are not meant to test the reader's understanding of the material. Many of them cover points that I wanted to make in the body of the chapter but could not fit in. Others are meant to make readers think about questions related to the chapter, open up their thinking, and ponder concepts they have not considered before.

Either way, I use the exercises to add new techniques and information, so they are worth reading. To help readers understand my intent, Appendix C gives answers to all the exercises. Most chapters also include some projects, but these have no answers, because they are usually tasks rather than questions.

Rules or suggestions

This book is modeled on the classic The Elements of Programming Style by Brian Kernighan and P. J. Plauger, which in turn was modeled on The Elements of Style by William Strunk Jr. and E. B. White. Both books use the same basic approach:

• Present an example;

• Point out some problems in the example;

• Improve the example using a general guideline.

This is a proven format, and one that readers find comfortable, so this book uses it too. I have tried to make the book enjoyable even though it follows a formula. I hope readers will find it interesting.

This book also gives some "general guidelines" that may seem as though they should never be violated. Our first guideline is:

Every guideline has exceptions

Because guidelines describe the general case, this book generally does not spell out their exceptions and leaves that to the reader. I am sure that when readers meet a guideline they will sometimes object: "Oh, but when ..., that isn't so." If someone tells you, "Don't run red lights," that is a guideline, yet you can surely think of a special situation in which running a red light is the right thing to do. The key point is that guidelines apply in general, so violate one only when you have a very good reason.

About the code in this book

All the code in this book is written in ANSI C and has been tested with five popular compilers for MS-DOS, Microsoft Windows, and the Apple Macintosh:

    Microsoft C/C++ 7.0    Microsoft
    Turbo C/C++ 3.0        Borland International
    Aztec C 5.2            Manx Software Systems
    MPW C 3.2              Apple Computer
    THINK C 5.0            Symantec

One more caution: if you want to lift code from this book into your own programs, be careful. Many of the examples contain bugs on purpose, to illustrate points made in the text. In addition, the functions used in the book match the ANSI C standard library functions but have small modifications to their interfaces. For example, the interface of the ANSI version of memchr is:

    void *memchr(const void *s, int c, size_t n);

Internally, memchr treats c as an unsigned char. In many places in this book, readers will see the character type declared explicitly as unsigned char rather than int:

    void *memchr(const void *pv, unsigned char ch, size_t size);

The ANSI standard declares all character parameters as int so that its library functions can also be used by non-prototyped programs written before the ANSI standard, which declare functions with extern. Since this book uses only ANSI C, there is no need to consider such backward-compatibility details; you can use more precise types to make the program clearer and to strengthen prototype checking (see Chapter 1).

"Is it Macintosh?"

For some reason, a book that fails to mention the PDP-11, the Honeywell 6000, and of course the IBM 360 is not taken seriously, so I have mentioned them here. Readers will not see those names again in this book. What they will see most is MS-DOS, Microsoft Windows, and especially the Apple Macintosh, because those are the systems I have been writing code for in recent years. Note, however, that none of the code in this book is tied to these particular systems. It is written in portable C and should work with any ANSI C compiler. So even if your system is not mentioned, you need not worry about the details of these operating systems.

It should be mentioned that on most microcomputer systems a program can read and write through the NULL pointer, trash the stack frame, and scribble over memory, including memory that belongs to other applications, and the hardware does nothing about it; it lets the program do as it pleases. I mention this because readers who are used to getting a hardware fault when writing through the NULL pointer might otherwise be confused by some statements in this book. Unfortunately, protected operating systems are still not common on microcomputers, so bugs that trash memory must be found by means other than hardware protection (which usually does not provide adequate protection anyway).

About bugs

There is no need to define a bug for readers of this book; I trust you know what one is. But bugs fall into two categories: those introduced while a feature is being developed, and those that remain in the code after the programmer believes the feature is finished.

For example, at Microsoft each product is built from master sources that are never supposed to contain bugs. When programmers add a new feature to a product, they do not change the master sources directly; they change copies. Only when the changes are complete and the programmer is confident that the code has no bugs is it merged into the master sources. So from the standpoint of product quality, it does not matter how many bugs are created while a feature is being implemented, as long as they are removed before the code is merged into the master sources.

All bugs are harmful, but the bugs most damaging to a product are those in the master sources. So the bugs this book is concerned with are those that have made it into the master sources. I do not expect programmers always to write bug-free code as they type, but I do believe it is possible to keep bugs out of the master sources, especially once programmers are using the techniques in this book.

Chapter 1 A Hypothetical Compiler

Consider what your programs would be like if the compiler could correctly point out every problem in the code. Not just syntax errors, but any problem in the program, no matter how well hidden. For example, suppose the program has an off-by-one error, and the compiler could somehow find it and give the following error message:

    -> line 23: while (i <= j)
    off by one error: this should be '<'

As another example, the compiler could find the following error in an algorithm:

    -> line 42: int itoa(int i, char *str)
    algorithm error: itoa fails when i is -32768

Or, when an invalid argument is passed, the compiler could give the following error message:

    -> line 318: strCopy = memcpy(malloc(length), str, length);
    invalid argument: memcpy fails when malloc returns NULL

Admittedly, this is asking a bit much of a compiler. But if compilers really could do this, imagine how easy it would be to write bug-free programs. It would be almost trivial, and nothing like what programmers go through today.
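No real compiler reports the last two of these problems, but the checks they describe can be written by hand. The following is a minimal sketch, not the book's own code: strCopyBlock and IntToStr are hypothetical names, and the approach of converting through an unsigned magnitude is one assumption about how an itoa-style routine might avoid failing on the most negative int.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy length bytes of str into a new block.
   It returns NULL rather than passing a failed malloc to memcpy. */
char *strCopyBlock(const char *str, size_t length)
{
    char *strCopy;

    assert(str != NULL);

    strCopy = (char *)malloc(length);
    if (strCopy == NULL)
        return NULL;
    return (char *)memcpy(strCopy, str, length);
}

/* Hypothetical itoa-style helper: write the decimal form of i into str.
   The digits come from the unsigned magnitude of i, so negating the
   most negative int never overflows. */
void IntToStr(int i, char *str)
{
    char rgchDigits[sizeof(int) * CHAR_BIT];   /* more than enough digits */
    size_t cDigits = 0;
    unsigned int u = (i < 0) ? 0u - (unsigned int)i : (unsigned int)i;

    assert(str != NULL);

    do {                                       /* digits, least significant first */
        rgchDigits[cDigits++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);

    if (i < 0)
        *str++ = '-';
    while (cDigits > 0)                        /* emit in reverse order */
        *str++ = rgchDigits[--cDigits];
    *str = '\0';
}
```

A caller of strCopyBlock still has to test the result for NULL; the sketch only moves the failure to a place where it can be detected.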

If you trained the camera of a spy satellite on a typical software shop, you would see programmers hunched over their keyboards tracking down bugs. Next to them you would see testers pounding on an internal release, bombarding it with enormous amounts of data in the hope of finding new bugs. You would also find testers checking whether old bugs have slipped back into the new release. Finding bugs this way takes far more effort than it would with the hypothetical compiler, and it takes a bit of luck.

Luck?

Yes, luck. Don't testers find bugs because they happen to notice that a number is wrong, that a feature does not work, or that the program does not behave as expected? Look again at the errors above: if the program has an off-by-one error but still appears to work, will the tester see it? And even if so, what about the other two bugs?

This may sound harsh, but testers pump large amounts of input into the program and hope that latent bugs reveal themselves. "Oh, no! Our testers aren't that simple. We also use code coverage tools, automated test suites, random 'monkey' programs, screen snapshots, and more." Perhaps so, but look at what these tools do. Coverage tools show which parts of the program have not been tested, and testers can use that information to derive new test cases. The other tools amount to nothing more than "feed in data, observe the results."

Please do not misunderstand: I am not saying that testers are doing anything wrong. I am saying that the black-box method only lets you stuff data into the program and see what pops out. It is like trying to determine whether a person is insane. You ask some questions and get answers, but you still cannot be sure, because you cannot know what is going on in the person's head. You will always wonder, "Did I ask enough questions? Did I ask the right questions?"

So do not rely on black-box testing alone. Try to imitate the hypothetical compiler described earlier, to take luck out of testing and to seize every opportunity to catch bugs automatically.

Consider the language you use

When did you last look at an advertisement for a word processor? If it was written by the Madison Avenue crowd, it probably went like this: "Whether you're writing a note for the children or the next Great American Novel, WordSmasher can handle it. WordSmasher comes with an astounding 233,000-word spelling dictionary, 51,000 more words than its nearest competitor. It easily finds the typos in your drafts. Hurry to your dealer and buy a copy. WordSmasher is the most revolutionary writing tool since the ballpoint pen!"

Through constant marketing, users have almost come to believe that the bigger the spelling dictionary the better, but that is not so. Words such as em, abel, and si can be found in any large dictionary, but when me, able, and is are so common, do you want the spell checker to treat em, abel, and si as correctly spelled words? If you see suing in something I wrote, it is very likely a typo for an entirely different word. The problem is not that suing is not a real word; it is that here it is almost certainly an error.

Fortunately, some good spell checkers let users delete troublesome words like em from the dictionary, so that the checker treats an otherwise legal word as a misspelling. A good compiler should likewise be able to flag legal C idioms that are errors more often than not. For example, such a compiler could detect that the following while loop has a stray semicolon:

    /* memcpy: copy a nonoverlapping memory block */

    void *memcpy(void *pvTo, void *pvFrom, size_t size)
    {
        byte *pbTo = (byte *)pvTo;
        byte *pbFrom = (byte *)pvFrom;

        while (size-- > 0);
            *pbTo++ = *pbFrom++;

        return (pvTo);
    }

From the indentation we can tell that the semicolon after the while expression is surely an error, but the compiler sees a perfectly legal while statement whose body is a null statement. Because null statements are sometimes wanted and sometimes not, compilers often offer an optional warning for null statements, which automatically alerts you to possible errors like the one above. When you do need a null statement, use one, but it is best to write NULL so that it is explicit. For example:

    char *strcpy(char *pchTo, char *pchFrom)
    {
        char *pchStart = pchTo;

        while (*pchTo++ = *pchFrom++)
            NULL;

        return (pchStart);
    }

Since NULL is a legal C expression, this program has no errors. A further benefit of using NULL is that the compiler generates no code for the NULL statement, because NULL is just a constant. This way the compiler accepts the explicit NULL statement but automatically flags implicit null statements as errors. Allow only one form of null statement in your programs, just as, to keep your writing consistent, you might use only one plural of zero and remove the other plural from the spelling dictionary.
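For comparison, here is a sketch of the copy loop from the memcpy example with the stray semicolon removed, so that the assignment really is the loop body. The name pvCopyBlock is hypothetical (chosen to avoid clashing with the standard memcpy), and the byte typedef is an assumed definition of the book's byte type.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte;   /* assumed definition of the book's byte type */

/* pvCopyBlock: copy a nonoverlapping memory block (hypothetical name) */
void *pvCopyBlock(void *pvTo, const void *pvFrom, size_t size)
{
    byte *pbTo = (byte *)pvTo;
    const byte *pbFrom = (const byte *)pvFrom;

    assert(pvTo != NULL && pvFrom != NULL);

    while (size-- > 0)            /* no semicolon: the next line is the body */
        *pbTo++ = *pbFrom++;

    return pvTo;
}
```

With the semicolon in place, the loop would only count size down to zero and then copy a single byte; without it, every byte is copied.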

Another common problem is unintentional assignment. C is a very flexible language that lets you use assignment anywhere an expression is allowed. If you are not careful, this extra flexibility will trip you up. For example, the following code contains this common error:

    if (ch = '\t')
        ExpandTab();

Although the code is clearly meant to compare ch with the horizontal tab character, it actually assigns the tab to ch. The compiler does not generate an error for this code, because it is legal C.

Some compilers let you disallow simple assignment in && and || expressions and in the controlling expressions of if, for, and while constructs, which helps catch this error. The rationale is that in these five cases you have very likely typed the assignment operator = by accident when you meant the equality operator ==.

This option does not stop you from making assignments, but to avoid the warning you must compare the result of the assignment explicitly against another value, such as zero or the null character. So, for the earlier strcpy example, if the loop is written as

    while (*pchTo++ = *pchFrom++)
        NULL;

the compiler will generate a warning. Write it instead as

    while ((*pchTo++ = *pchFrom++) != '\0')
        NULL;

This has two benefits. First, modern commercial-grade compilers generate no extra code for the redundant comparison; they optimize it away, so a compiler that offers this warning option will not penalize you for using it. Second, it reduces risk: although both forms are legal, this one is safer.
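Put together, the two fixes might look like the sketch below. The names strCopyString and fIsTab are hypothetical. The loop body is written as an empty block rather than the NULL statement the text recommends, on the assumption that some current compilers may warn that a bare NULL; statement has no effect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strcpy-style routine: the assignment inside the loop
   condition is compared explicitly against the null character. */
char *strCopyString(char *strTo, const char *strFrom)
{
    char *strStart = strTo;

    assert(strTo != NULL && strFrom != NULL);

    while ((*strTo++ = *strFrom++) != '\0') {
        /* deliberately empty: all the work happens in the condition */
    }
    return strStart;
}

/* Hypothetical predicate: the intended comparison, written with ==
   instead of the accidental assignment =. Returns nonzero for a tab. */
int fIsTab(char ch)
{
    return ch == '\t';
}
```

Because the comparison is explicit, a reader can tell at a glance that the assignment in the loop condition is intentional.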

Another class of bugs can be filed under "argument errors." For example, years ago, when I was learning C, I called fputc like this:

    fprintf(stderr, "unable to open file %s.\n", filename);
    ......
    fputc(stderr, '\n');

This code looks fine, but the arguments to fputc are in the wrong order. For some reason I always thought that the stream pointer (stderr) was the first argument to all such stream functions. It is not, so I often passed garbage to these functions. Fortunately, ANSI C provides function prototypes, which catch such errors automatically at compile time.

Because the ANSI C standard requires every library function to have a prototype, the prototype of fputc can be found in the stdio.h header file:

    int fputc(int c, FILE *stream);

If the program includes stdio.h, the compiler compares each argument passed in a call against the prototype. If the types differ, it generates a compile error. In the example above, a FILE * argument is passed in the int position, so the prototype catches the fputc bug automatically.
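As a small illustration, the sketch below makes the same two calls with the arguments in the order the fputc prototype requires. The wrapper fReportOpenFailure and its parameter names are hypothetical; only fprintf and fputc come from stdio.h.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: report a file that could not be opened.
   Returns nonzero if both writes succeeded. */
int fReportOpenFailure(FILE *pfile, const char *strFilename)
{
    assert(pfile != NULL && strFilename != NULL);

    if (fprintf(pfile, "unable to open file %s.\n", strFilename) < 0)
        return 0;

    /* Per the prototype int fputc(int c, FILE *stream), the character
       comes first and the stream second. */
    return fputc('\n', pfile) != EOF;
}
```

Writing fputc(pfile, '\n') here instead would pass a FILE * where the prototype expects an int, which is the mismatch the compiler can report once stdio.h is included.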

ANSI C Although the standard library function must have prototype, the function that the user must be written must also be prototype. Strictly speaking, they can have prototypes or there is no prototype. If the user wants to check out the call error in his own program, you must establish a prototype yourself and keep it consistent with the corresponding function.

I have heard the programmer recently complained that they must maintain the prototype of the function. Especially when you just transfer from the traditional C project to the ANSI C project, this complaint is more. This complaint is of a reason, but if you do not use prototype, you have to rely on traditional test methods to find out call errors in the program. You can ask yourself, which is more important, is it to reduce some maintenance workload, or can I find errors when compile? If you are still not satisfied, consider this fact that you can use prototype to generate a better quality code. This is because: ANSI C standard makes the compiler can be optimized according to the prototype information. In traditional C, the compiler is basically less about its information for functions that are not currently being compiled. Despite this, the compiler still must generate calls to these functions, and the generated call must work. Compiler implementors to solve this problem is to use standard call conventions. Although this method works, it often means that the compiler must generate additional code to meet the requirements of the invocation. However, if you use the "requires all functions must have prototype" this compiler, the compiler is used to understand the parameters of each function in the program, so you can choose the most efficient call for different functions. Agree.

The null statement, mistaken assignment, and prototype checks are only a small part of what many C compilers offer; there are often many more options. The point is that you can choose which warnings the compiler issues, and that these warnings work much the way a spelling checker flags possible spelling mistakes.

Peter Lynch is said to have been the best mutual fund manager of the 1980s. He once said that the difference between investors and gamblers is that investors take advantage of every opportunity, however small, to gain an edge, while gamblers rely on luck alone. Apply the same idea to programming: enable every optional warning your compiler offers, and think of those warnings as a risk-free, high-return investment. Don't ask, "Should I enable this warning?" Ask instead, "Why shouldn't I enable it?" Turn on every warning unless you have an excellent reason not to.

Enable all optional compiler warnings.

Enhanced prototypes

Unfortunately, if a function has two parameters of the same type, the prototype cannot detect a call in which those two arguments are swapped. For example, if the prototype of memchr is:

void *memchr(const void *pv, int ch, int size);

then the compiler issues no warning even if a call swaps the character ch and the size arguments. However, you can strengthen the checking a prototype provides by using more precise types in the interface. For example, with the following prototype:

void *memchr(const void *pv, unsigned char ch, size_t size);

the compiler can issue a warning when a call swaps the character ch and the size arguments.

The drawback of more precise types is that you must often cast arguments explicitly to silence type-mismatch warnings, even when the arguments are in the right order.
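
To make the problem concrete, here is a minimal sketch of a swapped call that compiles cleanly under the loosely typed prototype. The names find_byte, found_with_correct_order and found_with_swapped_order are hypothetical stand-ins introduced only for this illustration, and the example assumes an ASCII character set, where 'b' has the value 98.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for memchr, using the loosely typed prototype. */
void *find_byte(const void *pv, int ch, int size)
{
    const unsigned char *pb = (const unsigned char *)pv;

    assert(pv != NULL && size >= 0);

    while (size-- > 0)
    {
        if (*pb == (unsigned char)ch)
            return (void *)pb;
        pb++;
    }
    return NULL;
}

/* Correct call: look for 'b' in the first 3 bytes. */
int found_with_correct_order(const char *buf)
{
    return find_byte(buf, 'b', 3) != NULL;
}

/* Swapped call: also compiles, but it searches the first 'b' (98) bytes
 * for the byte value 3, so buf must be at least 98 bytes long. */
int found_with_swapped_order(const char *buf)
{
    return find_byte(buf, 3, 'b') != NULL;
}
```

With a 128-byte buffer that starts with "abc" and is otherwise zero, the correct call finds the character and the swapped call quietly finds nothing. Both arguments are plain int, so the prototype gives the compiler nothing to complain about.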

Lint is not so bad

Another way to catch bugs, one that is more detailed and more thorough, is to use lint, a tool that is easy to overlook. Originally, lint scanned C source files and warned about nonportable code. Most lint utilities have since become much stricter: they flag not only portability problems but also code that is portable and syntactically legal yet probably wrong. The suspicious errors described earlier fall into this category.

Unfortunately, many programmers still think of lint as a portability checker that produces a flood of irrelevant warnings. In short, lint has a reputation for being more trouble than it is worth. If you share that view, you may want to reconsider. Which tool comes closer to the hypothetical compiler described earlier: the compiler you are using, or lint? In fact, once your source code is free of lint errors, keeping it that way is easy. Run lint on the code you have changed, and merge it into the master sources only when it is clean. You do not need to think much about it; within a week or two you will be writing lint-free code without effort. At that point you get the full benefit of lint.

Use lint to catch bugs that your compiler misses.

But it was only a trivial change

When one of the technical reviewers was reading this book, he asked whether I planned to cover unit tests. I replied, "No." Although unit tests are related to bug-free code, they really belong to a different category: how to write test programs for your code.

He said, "No, you misunderstand. I mean, do you plan to point out that programmers should actually run their unit tests before merging new changes into the master sources? One of my team members let a bug into our master sources because he didn't run the unit test after changing the code."

This surprised me, because at Microsoft most project leads require programmers to run the corresponding unit tests before merging source code.

"Didn't you ask him why he didn't run the unit test?" I asked.

My friend looked up from the table and said, "He said he hadn't written any new code; he had only moved some existing code around. He said he didn't think he needed to run the unit test."

This kind of thing has happened in my group too.

It reminds me of a programmer who, after making a change, merged the code into the master sources without even compiling it. I found the problem, of course, because I got an error when I compiled the master sources. When I asked him how he had missed the compile error, he said, "It was a trivial change; I couldn't have made a mistake." But he had.

Neither of these bugs should have reached the master sources, because both could have been caught almost effortlessly. Why do programmers make this mistake? Because they overestimate their ability to write correct code.

Sometimes it seems harmless to skip a step that was designed to catch bugs, but the day you take the shortcut is the day trouble starts. I doubt that many programmers "finish" a feature without even compiling the code; I know that is a rare case. But the temptation to bypass unit tests is much stronger, especially for simple changes.

If you find yourself about to skip a step that could easily catch bugs, stop yourself. Use every tool you have to check for errors. And although unit tests exist to find bugs, they cannot find anything if you never run them.

If you have unit tests, run them.

Summary

Do you know any programmers who would rather spend their time tracking down bugs than writing new code? I am sure such programmers exist, but I have yet to meet one. The programmers I know would give up Chinese food for life if you promised them they would never have to track down another bug.

As you write code, keep the idea of the hypothetical compiler in mind so that you take every opportunity to catch bugs. Think about the bugs the compiler reports, the bugs lint reports, and the reasons unit tests fail. How much special skill does it take to use these tools? How many bugs would your product have if catching all of them took so little skill? If you want to find bugs quickly and easily, use the features of your tools that point them out. The sooner a bug is located, the sooner it is fixed.

Key points:

• The best way to eliminate bugs is to find them as early and as easily as possible, so look for ways to catch bugs automatically with minimal effort.

• Strive to reduce the skill programmers need to catch bugs. Optional compiler or lint warnings require no bug-hunting skill at all. At the other extreme, advanced coding practices can catch or reduce bugs, but they demand more skill, because programmers must first learn those practices.

Exercise:

1) If you use a compiler option that disallows assignment in a while condition, why does it catch the precedence bug in the following code?

while (ch = getchar() != EOF)
    ......

2) You have seen how a compiler can catch unintentional null statements and assignments. What would you have the compiler do to warn about the following common problems, and how would you suppress those warnings?

a) if (flight == 063). The programmer meant to test for flight 63, but the leading 0 makes 063 an octal number, so the code tests for flight 51.

b) if (pb != NULL & *pb != 0xFF). Here && was accidentally typed as &, so *pb != 0xFF is evaluated even when pb is NULL.

c) quot = numer/*pdenom. Here /* is unintentionally interpreted as the start of a comment.

d) word = bHigh<<8 + bLow. Because of operator precedence, this statement is interpreted as word = bHigh << (8 + bLow).

3) How could a compiler warn about "dangling else" bugs? How would you suppress the warning?

4) Look at the following code:

if (ch = '\t')
    ExpandTab();

Besides having the compiler disallow simple assignment in if statements, another well-known way to catch this bug is to reverse the operands on either side of the operator:

if ('\t' == ch)
    ExpandTab();

That way, if you type = when you meant ==, the compiler reports an error, because assigning to a constant is not allowed. How thorough is this technique? Why is it not as good as a compiler switch? Why might new programmers still type the assignment operator instead?

5) The preprocessor can also cause unexpected results. For example, the macro UINT_MAX is defined in limits.h, but if you forget to include that header, the following directive fails silently, because the preprocessor replaces the undefined UINT_MAX with 0:

......
#if UINT_MAX > 65535u
......
#endif

How could the preprocessor be made to report this error?

Question:

To ease the work of maintaining prototypes, some compilers automatically generate prototypes for the programs they compile. If your compiler does not offer this option, write a utility that does the job. Why does a standard coding convention make such a utility relatively easy to write?

Question:

If your compiler does not support the warnings mentioned in this chapter (including the exercises), urge the vendor to support them, and urge the vendor to let users enable or disable each class of check, and specific warnings, selectively. Why should you do this?

Chapter 2: Designing and Using Assertions

Having the compiler catch bugs automatically is good, but I would bet that if you looked at the more obvious bugs in your project, you would find that the compiler catches only a small part of them. I would also bet that, even with those bugs, the program works correctly most of the time.

Remember the following code from Chapter 1?

strCopy = memcpy(malloc(length), str, length);

This statement works well in most cases, unless the call to malloc fails. When malloc fails, it returns a NULL pointer, which is passed to memcpy. Since memcpy does not handle NULL pointers, a bug results. If you are lucky, the bug crashes the program before you ship, and so it is exposed. If you are unlucky and do not find the bug in time, one of your customers is sure to be the "lucky" one.

The compiler cannot catch this bug or others like it. Likewise, the compiler cannot detect an error in an algorithm and cannot verify the assumptions a programmer makes. Nor can the compiler check whether the arguments being passed are valid.

Finding bugs like these is hard; only highly skilled programmers or testers can root them out without causing other problems.

However, if you know how, finding these bugs is easy.

A tale of two versions

Let us go straight to memcpy and see how to catch the bug above. The first approach is to have memcpy check for NULL pointers: if either pointer is NULL, it prints an error message and aborts. Here is the code for that approach.

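#include <stddef.h>     /* size_t */
#include <stdio.h>      /* fprintf, stderr */
#include <stdlib.h>     /* abort */

/* byte is assumed to be defined as: typedef unsigned char byte; */

/* memcpy -- copy a nonoverlapping memory block */
void *memcpy(void *pvTo, void *pvFrom, size_t size)
{
    byte *pbTo = (byte *)pvTo;
    byte *pbFrom = (byte *)pvFrom;

    if (pvTo == NULL || pvFrom == NULL)
    {
        fprintf(stderr, "Bad args in memcpy\n");
        abort();
    }

    while (size-- > 0)
        *pbTo++ = *pbFrom++;

    return (pvTo);
}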

With this function, any NULL pointer passed by mistake is caught. The only problem is that the test code doubles the size of the function and slows it down. If you think this cure is worse than the disease, you are right, because it is not practical. The fix is to use the C preprocessor.

How about keeping two versions? One is sleek and fast, for shipping; the other is bloated and slow (because of the extra checks), for debugging. You maintain both versions in the same source and use the C preprocessor to include or exclude the checks.

void *memcpy(void *pvTo, void *pvFrom, size_t size)
{
    byte *pbTo = (byte *)pvTo;
    byte *pbFrom = (byte *)pvFrom;

#ifdef DEBUG
    if (pvTo == NULL || pvFrom == NULL)
    {
        fprintf(stderr, "Bad args in memcpy\n");
        abort();
    }
#endif

    while (size-- > 0)
        *pbTo++ = *pbFrom++;

    return (pvTo);
}

The idea is to maintain both a debug version and a nondebug (ship) version. While you are writing the program, you compile the debug version and use its checks to catch bugs automatically as you add features. Once the program is finished, you compile the ship version and hand it to the distributors.

Of course, you would not be foolish enough to wait until the last moment to run the version you intend to ship, but throughout development you should use the debug version. As this chapter and the next show, the main reason is that doing so significantly reduces development time. Imagine how robust an application would be if every function performed some minimal error checking and tested for conditions that should never occur.

The key to this approach is ensuring that the debug code never appears in the final product.

Maintain both ship and debug versions of your program.

Assertions to the rescue

Frankly, the debug code in memcpy above is clumsy and overshadows the real work of the function. So although it produces good results, most programmers would not tolerate it, which is why clever programmers decided to hide all the debug code in an assertion macro called assert, defined in the header file assert.h. assert is just shorthand for the #ifdef code shown earlier, but with the macro the check shrinks from seven lines to one.

void *memcpy(void *pvTo, void *pvFrom, size_t size)
{
    byte *pbTo = (byte *)pvTo;
    byte *pbFrom = (byte *)pvFrom;

    assert(pvTo != NULL && pvFrom != NULL);

    while (size-- > 0)
        *pbTo++ = *pbFrom++;

    return (pvTo);
}

assert is a debug-only macro. If its argument evaluates to false, it aborts the calling program. In the code above, therefore, either pointer being NULL triggers the assert.

assert is not a slapdash macro. It must be defined carefully so that it causes no important differences between the ship and debug versions of a program. assert should not disturb memory, should not initialize data that would otherwise be uninitialized, and should not have any other side effects. Because the debug version must behave exactly like the ship version, assert is a macro rather than a function; if it were a function, calling it could cause unexpected memory or code swapping. Remember that programmers who use assert regard it as a harmless check that can be used safely in any state of the system.

You should also realize that once programmers learn to use assertions, they often redefine the assert macro. For example, instead of aborting when an error occurs, assert can be defined to drop into the debugger at the point of failure. Some versions of assert even let the user choose to continue running as if no error had occurred. If you define your own assertion macro, give it a different name so that you do not interfere with the standard assert. This book uses a nonstandard assertion macro, and because it is nonstandard I have named it ASSERT so that it stands out in the code. The main difference between assert and ASSERT is that assert is an expression that can be used freely in a program, whereas ASSERT is a more restricted statement. For example, with assert you can write:

if (assert(p != NULL), p->foo != bar)
    ......

If you try that with ASSERT, you get a syntax error. The difference is deliberate. Unless you intend to use assertions in expression contexts, you should define ASSERT as a statement. Only then can the compiler report a syntax error when the macro is used incorrectly in an expression. Remember, every little bit helps when you are fighting bugs. Why keep flexibility you never use?

Here is one way to define the ASSERT macro yourself:

#ifdef DEBUG

    void _Assert(char *, unsigned);     /* prototype */

    #define ASSERT(f)               \
        if (f)                      \
            NULL;                   \
        else                        \
            _Assert(__FILE__, __LINE__)

#else

    #define ASSERT(f)   NULL

#endif

As you can see, if DEBUG is defined, ASSERT expands to an if statement. The NULL statement in the if looks odd, but the if must have an else clause so that it cannot accidentally pair with an else in the surrounding code. You might think a semicolon is needed after the closing parenthesis of the call to _Assert, but it is not, because the user supplies the semicolon when using ASSERT.
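
The reason for the else clause can be seen in a small sketch. The names below (note_failure, CHECK_NAIVE, CHECK_PAIRED and the two functions) are hypothetical and exist only for this illustration; note_failure stands in for _Assert, and (void)0 plays the role of the NULL statement. Many compilers warn about the naive form, which is itself a good argument for enabling warnings.

```c
#include <assert.h>     /* not used by the macros below; handy for checking the results */

static int failures;                        /* hypothetical stand-in for _Assert */
static void note_failure(void) { failures++; }

/* Naive form: an if with no else of its own. */
#define CHECK_NAIVE(f)   if (!(f)) note_failure()

/* Form used in the text: the macro's if already owns an else. */
#define CHECK_PAIRED(f)  if (f) (void)0; else note_failure()

/* Each function returns 1 if the caller's else branch ran, 0 otherwise. */
int else_ran_naive(int outer)
{
    int ran = 0;

    if (outer)
        CHECK_NAIVE(1);     /* the else below binds to the macro's if */
    else
        ran = 1;
    return ran;
}

int else_ran_paired(int outer)
{
    int ran = 0;

    if (outer)
        CHECK_PAIRED(1);    /* the else below binds to "if (outer)" */
    else
        ran = 1;
    return ran;
}
```

With the paired form, the caller's else runs exactly when outer is false, as the indentation suggests. With the naive form, the caller's else is captured by the macro's if, so it runs when outer is true and the check passes, and never runs when outer is false.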

When an ASSERT fails, it calls _Assert with the file name and line number that the preprocessor supplies through the macros __FILE__ and __LINE__. _Assert prints an error message on the standard error stream, stderr, and then aborts:

void _Assert(char *strFile, unsigned uLine)
{
    fflush(stdout);
    fprintf(stderr, "\nAssertion failed: %s, line %u\n", strFile, uLine);
    fflush(stderr);
    abort();
}

Before calling abort, _Assert calls fflush to write out all buffered output. In particular, if stdout and stderr refer to the same device, stdout must be flushed before the message is written to stderr, so that the error message appears only after all the earlier output.

Now if you call memcpy with a NULL pointer, ASSERT catches the bug and displays the following message:

Assertion failed: string.c, line 153

This shows another difference between assert and ASSERT. In addition to the information above, the standard macro also displays the test expression that failed. For example, the assert in the compiler I usually use displays the following:

Assertion failed: pvTo != NULL && pvFrom != NULL

File string.c, line 153

The only trouble with including the test expression in the error message is that every use of assert must generate a printable string for its condition. The question is, where should the compiler store that string? Compilers on the Macintosh, DOS, and Windows typically store strings in the global data area, but on the Macintosh the global data area is usually limited to 32K, and on DOS and Windows it is limited to 64K. For large programs such as Microsoft Word and Excel, assertion strings would quickly eat up this memory.

There are several solutions to this problem, but the easiest is to omit the test expression string from the error message. After all, you only need to look at line 153 of string.c to see what the problem is and what condition was being tested.
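
Where the expression text comes from can be shown with the preprocessor's stringizing operator. The sketch below is only an illustration: EXPR_TEXT and sample_condition_text are hypothetical names, and how a particular compiler's assert is implemented is not assumed here.

```c
#include <assert.h>     /* standard assert: reports the argument text on failure */
#include <string.h>

/* Hypothetical helper macro: the text of an expression as a string literal.
 * The # operator turns the macro argument into a string at compile time. */
#define EXPR_TEXT(f)  #f

/* Returns the stored text for one sample condition. Every distinct
 * condition reported this way adds another string to the program's data. */
const char *sample_condition_text(void)
{
    return EXPR_TEXT(pvTo != NULL && pvFrom != NULL);
}
```

An assertion macro that passes only __FILE__ and __LINE__ stores no per-condition string at all, which is the saving the text describes.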

If you want to see how the standard assert macro is defined, look at the assert.h file that comes with your compiler. The ANSI C standard also discusses assert in its Rationale and gives a possible implementation. P. J. Plauger gives a slightly different implementation of the standard assert in his book The Standard C Library.

However you end up defining your assertion macro, use it to validate the arguments passed to your functions. If every call to a function is checked, bugs are found quickly. The best thing about assertion macros is that they catch bugs automatically, at the moment the bugs occur.

Use assertions to validate function arguments.

"Undefined" means "stay away"

If you stop to read the definition of memcpy in the ANSI C standard, you will see that its last line says: "If copying takes place between objects that overlap, the behavior is undefined." Other books describe this a little differently. The corresponding description in Standard C, by P. J. Plauger and Jim Brodie, is: "The elements of both arrays can be accessed and stored in any order."

In short, these books say that if you rely on memcpy behaving in a particular way when the memory blocks overlap, you are making an absurd assumption, because the result may differ from one compiler to another (including between versions of the same compiler).

Some programmers deliberately rely on undefined behavior, but I think most programmers wisely steer clear of it, and we should not imitate the former. For programmers, undefined behavior should be treated as illegal behavior, and it should therefore be checked with assertions. If you call memcpy when you meant to call memmove, wouldn't you want to know?

By adding an assertion that verifies the two memory blocks do not overlap, memcpy can be strengthened as follows:

/* memcpy -- copy a nonoverlapping memory block */
void *memcpy(void *pvTo, void *pvFrom, size_t size)
{
    byte *pbTo = (byte *)pvTo;
    byte *pbFrom = (byte *)pvFrom;

    ASSERT(pvTo != NULL && pvFrom != NULL);
    ASSERT(pbTo >= pbFrom+size || pbFrom >= pbTo+size);

    while (size-- > 0)
        *pbTo++ = *pbFrom++;

    return (pvTo);
}

You may find the check above less than obvious: how can a single line do the whole overlap check? It is easy to understand if you picture the two memory blocks as two cars waiting in a parking line. Two cars do not overlap if the rear bumper of one is at or beyond the front bumper of the other. The check expresses exactly this idea: pbTo and pbFrom are the "rear bumpers" of the two blocks, and pbTo+size and pbFrom+size are the points just beyond their respective "front bumpers". It is that simple.

By the way, if you have not yet grasped how serious an overlapping copy is, consider the case in which pbTo equals pbFrom+1 and at least two bytes must be moved. In that case memcpy produces the wrong result.
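
Both points can be checked with a short sketch. The function names blocks_are_disjoint and copy_forward are hypothetical, and the sketch assumes both pointers refer to the same array so that comparing them is well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Nonzero when the two size-byte blocks share no bytes. Both pointers are
 * assumed to point into the same array. */
int blocks_are_disjoint(const unsigned char *pbTo,
                        const unsigned char *pbFrom, size_t size)
{
    return pbTo >= pbFrom + size || pbFrom >= pbTo + size;
}

/* Byte-by-byte forward copy, like the loop in the memcpy shown in the text. */
void copy_forward(unsigned char *pbTo, const unsigned char *pbFrom, size_t size)
{
    assert(pbTo != NULL && pbFrom != NULL);

    while (size-- > 0)
        *pbTo++ = *pbFrom++;
}
```

Starting from the buffer "abcdefg", a three-byte forward copy from the start of the buffer to the position one byte later leaves "aaaaefg", because each byte is overwritten before it is read. A copy that handled overlap, as memmove does, would leave "aabcefg".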

So from now on, stop every so often and look for undefined behavior in your program. If you find any, either remove it from the design or add an assertion that notifies programmers when they use it.

This practice is especially important when you provide a code library (or an operating system) to other programmers. If you have ever provided such a library, you know that programmers will use all sorts of undefined behavior to get the results they want. The bigger challenge comes when you release an improved library: even though the new library is fully compatible with the old one, some applications always break when they try to use it. The problem is that the new library is not 100% compatible with the old one in its "undefined behavior".

Remove undefined behavior from your program,

or use assertions to catch illegal uses of undefined behavior.

Don't let this happen to you

In 1988, Microsoft's cash cow, DOS Word, slipped by three months, which noticeably affected the company's sales. An important reason was that for six months the members of the development group kept believing they could ship Word at any moment.

The problem was that the Word group depended on a key component being developed by another group in the company. That group kept telling the Word group that they were almost done, and its members believed it. What they did not realize was that their code was full of bugs.

A clear difference between that group's code and the Word code was that the Word code had long used assertions and debug code, while theirs had almost none. Their programmers therefore had no good way to gauge the real number of bugs in their code, and the bugs surfaced only slowly. Had they used assertions, those bugs would have been found months earlier.

Code that shouts "Danger"

Although we have moved on to a new topic, I want to return to the overlap assertion in memcpy:

ASSERT(pbTo >= pbFrom+size || pbFrom >= pbTo+size);

Suppose this assertion fails when memcpy is called. If you had never seen an overlap check before and did not know what it was doing, could you figure out what had gone wrong? I doubt it. That is not to say the assertion is too tricky or unclear, because from any angle it is straightforward. But straightforward does not mean obvious.

Believe me, few things are more frustrating than hitting an assertion in a program and not knowing what it is for. You waste a lot of time, not tracking down the bug, but just figuring out what the bug is. And that is not all: programmers occasionally write an assertion that is itself wrong. So if you do not know what an assertion checks, it is hard to tell whether the bug is in the program or in the assertion. Fortunately, the problem is easy to fix: whenever an assertion is not clear enough, add a comment. I know this sounds obvious, but surprisingly few programmers do it.

Programmers go to great lengths to guard against danger, yet never say what the danger is. It is like walking through a forest and seeing a big sign on a tree that says "DANGER" in red letters. But what is the danger? Falling trees? An abandoned well? Bigfoot? Unless the sign says what the danger is, or the danger is obvious, the sign does nothing to make people more alert, and they will ignore it. Likewise, programmers ignore assertions they do not understand. They assume the assertion is wrong and remove it.

So, to make sure programmers understand the intent of an assertion, comment any assertion that is not clear. It is even better if the comment points to a possible fix. For example, when a programmer calls memcpy with overlapping memory blocks, the comment is a good place to mention memmove, which does what the programmer wants and has no overlap restriction:

/* Do the blocks overlap? Use memmove if they do. */
ASSERT(pbTo >= pbFrom+size || pbFrom >= pbTo+size);

You do not need to write an essay when you comment an assertion. A short, carefully chosen question is often more helpful than a whole paragraph that systematically covers every detail. Note, however, that you should not suggest a fix in the comment unless you are sure it will help other programmers. Nobody who writes a comment wants it to lead others astray.

Don't waste other people's time: comment unclear assertions.

Assertions are not for handling errors

When programmers first start using assertions, they sometimes mistakenly use them to check for genuine errors rather than for illegal conditions. Look at the two assertions in strdup below:

char *strdup(char *str)
{
    char *strNew;

    ASSERT(str != NULL);

    strNew = (char *)malloc(strlen(str)+1);
    ASSERT(strNew != NULL);
    strcpy(strNew, str);

    return (strNew);
}

The first assertion is used correctly, because it checks for an illegal condition that should never occur when the program is working properly. The second assertion is quite different: it tests for an error condition, one that will certainly show up in the final product and must be handled.
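
One way to separate the two cases is sketched below: the illegal call is still asserted, while the allocation failure is handled in code that remains in the ship version. The name dup_string is hypothetical, chosen to avoid clashing with a library strdup, the standard assert stands in for the book's ASSERT, and returning NULL on failure is an assumption about how the caller wants errors reported.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical replacement for the strdup shown in the text. */
char *dup_string(const char *str)
{
    char *strNew;

    assert(str != NULL);            /* illegal call: a bug in the caller */

    strNew = (char *)malloc(strlen(str) + 1);
    if (strNew == NULL)             /* genuine error: must be handled */
        return NULL;

    strcpy(strNew, str);
    return strNew;
}
```

The caller is then responsible for testing the returned pointer and for freeing the copy when it is no longer needed.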

Are you making assumptions?

Sometimes you must make assumptions about the environment your program runs in. But that does not mean you must always make such assumptions. For example, the following memset makes no assumptions about its environment, so although it is not necessarily efficient, it runs under any ANSI C compiler:

/* memset -- fill memory with "byte" values */
void *memset(void *pv, byte b, size_t size)
{
    byte *pb = (byte *)pv;

    while (size-- > 0)
        *pb++ = b;

    return (pv);
}

On many computers, however, memset can be made faster by first packing the small fill value into a larger value and then filling memory with the larger value, because this reduces the number of stores actually performed. On the 68000, for example, the following memset fills four times faster than the one above.

/* longfill -- fill a memory block with "long" values. Returns a pointer
 * to the first long after the last long filled.
 */
long *longfill(long *pl, long l, size_t size);     /* prototype */

void *memset(void *pv, byte b, size_t size)
{
    byte *pb = (byte *)pv;

    if (size >= sizeThreshold)
    {
        unsigned long l;

        l = (b << 8) | b;       /* pack 4 bytes into a long */
        l = (l << 16) | l;

        pb = (byte *)longfill((long *)pb, l, size / 4);
        size = size % 4;
    }

    while (size-- > 0)
        *pb++ = b;

    return (pv);
}

Apart from the test against sizeThreshold, the code is straightforward. If you are wondering why the test is there, consider that it takes time to pack a long and to call longfill. Testing sizeThreshold makes memset fill with longs only when doing so is actually faster; otherwise it still fills byte by byte.
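
The packing step on its own can be sketched as follows. The function name pack_byte is hypothetical, and the cast to unsigned long before the first shift is an addition of this sketch, made so that the result does not depend on the width of int.

```c
#include <assert.h>
#include <limits.h>

/* Replicate one byte into the low 32 bits of an unsigned long.
 * Assumes 8-bit bytes, as the memset in the text does. */
unsigned long pack_byte(unsigned char b)
{
    unsigned long l;

    assert(CHAR_BIT == 8);

    l = ((unsigned long)b << 8) | b;    /* two copies of the byte  */
    l = (l << 16) | l;                  /* four copies of the byte */
    return l;
}
```

For the byte 0xAB this yields 0xABABABAB, the value that is then stored one long at a time.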

The only problem with this version of memset is that it makes assumptions about the compiler and the operating system. For example, the code clearly assumes that a long occupies four bytes and that a byte is eight bits wide. These assumptions hold on many computers, and at present on almost all microcomputers without exception. But that does not mean we can ignore the issue, because what is true today may not be true a few years from now.

Some programmers "improve" the routine by rewriting it in a more portable form:

void *memset(void *pv, byte b, size_t size)
{
    byte *pb = (byte *)pv;

    if (size >= sizeThreshold)
    {
        unsigned long l;
        size_t sizeSize;

        l = 0;
        for (sizeSize = sizeof(long); sizeSize-- > 0; NULL)
            l = (l << CHAR_BIT) | b;

        pb = (byte *)longfill((long *)pb, l, size / sizeof(long));
        size = size % sizeof(long);
    }

    while (size-- > 0)
        *pb++ = b;

    return (pv);
}

Because it uses the sizeof operator, this routine looks more portable, but "looks" is not the same as "is". If you move it to a new environment, you still have to check it. For example, if the routine runs on a Macintosh Plus or another 68000-based computer, it crashes when pv starts out pointing to an odd address. On the 68000, byte * and long * are not freely interchangeable types, so storing a long at an odd address causes a hardware fault. So what should you do?

In this case you should not try to write memset as a portable function at all. Accept that it is nonportable and do not try to change that. On the 68000, to avoid the odd-address problem, you can fill byte by byte until the pointer is aligned and then switch to filling with longs. Aligning longs on an even address is sufficient, but on newer Macintoshes based on the 68020, 68030, and 68040, performance is better if longs are aligned on a 4-byte boundary. As for the other assumptions in the code, you can verify them with assertions and conditional compilation:

void *memset(void *pv, byte b, size_t size)
{
    byte *pb = (byte *)pv;

#ifdef MC680x0
    if (size >= sizeThreshold)
    {
        unsigned long l;

        ASSERT(sizeof(long) == 4 && CHAR_BIT == 8);
        ASSERT(sizeThreshold >= 3);

        /* Fill with bytes until pb is aligned on a long boundary. */
        while (((unsigned long)pb & 3) != 0)
        {
            *pb++ = b;
            size--;
        }

        /* Now pack a long and fill the rest of memory with longs. */
        l = (b << 8) | b;      /* pack 4 bytes into one long */
        l = (l << 16) | l;

        pb = (byte *)longfill((long *)pb, l, size / 4);
        size = size % 4;
    }
#endif /* MC680x0 */

    while (size-- > 0)
        *pb++ = b;

    return (pv);
}

As you can see, the machine-specific part of the routine is enclosed in the MC680x0 preprocessor definition. This not only keeps the nonportable code from being used accidentally on a different target, it also lets you find all target-specific code by searching for the string MC680x0.

To verify that a long occupies 4 bytes and that a byte is 8 bits wide, a fairly straightforward assertion has been added. These facts are unlikely to change, but who knows what the future holds?

Finally, a loop is used to align pb on a 4-byte boundary before longfill is called. Whatever the value of size, this loop can execute up to three times, so an assertion before the loop checks that sizeThreshold is at least 3. (sizeThreshold should normally be much larger, but it must be at least 3 or the code will not work.)
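The claim that the alignment loop runs at most three times can be checked with a little address arithmetic. The sketch below is illustrative only: LeadBytes is a hypothetical helper, and it works on a plain integer address (uintptr_t) rather than on a real pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: how many single bytes must be filled before
 * the address addr reaches a 4-byte boundary? */
size_t LeadBytes(uintptr_t addr)
{
    size_t n = (size_t)((4 - (addr & 3)) & 3);

    assert(n <= 3);     /* this is why sizeThreshold must be at least 3 */
    return n;
}
```

An address that is already aligned needs no lead bytes, and the worst case, an address one past a boundary, needs three.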

With these changes the routine is clearly no longer portable, but every assumption has either been eliminated or is verified by an assertion. These measures make it very unlikely that the routine will be used incorrectly.

Either remove implicit assumptions, or use assertions to check that they are valid

Knowing the compiler is not enough

Recently, several groups at Microsoft found that they had to review their code because so much of it was riddled with "+2" instead of "+sizeof(int)", with comparisons of unsigned values against 0xFFFF instead of UINT_MAX, and with int used in data structures where a 16-bit data type was really meant.
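The contrast is easy to show in code. In this sketch the function names are hypothetical; the point is only that the portable forms ask the compiler for the size and the limit instead of hard-coding the values that one particular compiler happened to use.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Step from one int to the next by sizeof(int), not by a literal 2. */
int *SecondInt(int *pi)
{
    assert(pi != NULL);
    return (int *)((char *)pi + sizeof(int));
}

/* Compare against UINT_MAX rather than a hard-coded 0xFFFF. */
int IsLargestUnsigned(unsigned u)
{
    return u == UINT_MAX;
}
```

Both functions keep working when the size of an int changes, which is exactly the change that broke the code described here.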

You might think these programmers were simply lazy, but they would disagree. They believed they had a good reason to use "+2" safely: the C compiler they used was written by Microsoft itself. That gave them a false sense of security. As one programmer put it a few years ago, "The compiler group would never make a change that would break all of our code."

That programmer was wrong.

To generate faster and smaller programs for the Intel 80386 and later processors, the compiler group changed the size of an int (among other things). The compiler group did not want to break the company's internal code, but staying competitive in the marketplace was clearly more important. After all, it was the Microsoft programmers who had made the faulty assumptions.

Can the impossible happen?

A function's arguments do not always supply all of its input data; sometimes they merely supply a pointer to that data. For example, consider this simple decompression routine:

byte *pbExpand(byte *pbFrom, byte *pbTo, size_t sizeFrom)
{
    byte b, *pbEnd;
    size_t size;

    pbEnd = pbFrom + sizeFrom;   /* points just past the end of the buffer */

    while (pbFrom < pbEnd)
    {
        b = *pbFrom++;

        if (b == bRepeatCode)
        {
            /* Store "size" copies of b in pbTo. */
            b = *pbFrom++;
            size = (size_t)*pbFrom++;

            while (size-- > 0)
                *pbTo++ = b;
        }
        else
            *pbTo++ = b;
    }

    return (pbTo);
}

This routine copies the contents of one data buffer to another, expanding any compressed runs of characters it finds along the way. When it encounters the special byte bRepeatCode in the input, it takes the next two bytes to be the character to repeat and the number of times to repeat it. The scheme is simplistic, but it is still useful in some situations, such as in program editors, where runs of horizontal tabs and of the spaces used for indentation are common.

To make pbExpand more robust, you could add an assertion at the entry point to check the validity of pbFrom, sizeFrom, and pbTo. But there is more you can do. For example, you can validate the data in the buffer.

Since it takes three bytes to encode a run, the corresponding compression routine never compresses two consecutive characters. Compressing three consecutive characters is possible, but gains nothing. The compression routine therefore compresses only runs of more than three characters.

There is one exception. If the original data contains bRepeatCode, it needs special handling; otherwise pbExpand would mistake it for the start of a compressed run. When the compressor finds bRepeatCode in the original data, it encodes it as a run that repeats bRepeatCode once, which distinguishes it from a true compressed run.
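The text does not show the compressor, so the following is only a sketch of one that obeys the rules just described. The name pbCompressSketch, the value chosen for bRepeatCode, and the cap of 255 on a run (so that the count fits in one byte) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte;

#define bRepeatCode ((byte)0xFF)    /* assumed value; the text gives none */

/* Hypothetical compressor that follows the rules described above.
 * pbTo must have room for 3 * sizeFrom bytes (the worst case).
 * Returns a pointer just past the last byte written. */
byte *pbCompressSketch(const byte *pbFrom, byte *pbTo, size_t sizeFrom)
{
    const byte *pbEnd = pbFrom + sizeFrom;

    assert(pbFrom != NULL && pbTo != NULL);

    while (pbFrom < pbEnd)
    {
        byte   b = *pbFrom;
        size_t run = 1;

        /* Measure the run; cap it at 255 so the count fits in a byte. */
        while (run < 255 && pbFrom + run < pbEnd && pbFrom[run] == b)
            run++;

        if (run >= 4)
        {
            *pbTo++ = bRepeatCode;      /* a true compressed run */
            *pbTo++ = b;
            *pbTo++ = (byte)run;
        }
        else if (b == bRepeatCode)
        {
            run = 1;
            *pbTo++ = bRepeatCode;      /* a literal bRepeatCode, */
            *pbTo++ = bRepeatCode;      /* encoded as a run that  */
            *pbTo++ = 1;                /* repeats it once        */
        }
        else
        {
            run = 1;                    /* too short to be worth compressing */
            *pbTo++ = b;
        }
        pbFrom += run;
    }
    return pbTo;
}
```

Every run this sketch writes has a count of at least 4, or a count of 1 with bRepeatCode as the repeated character.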

In summary, the repeat count of every compressed run is either at least 4 or exactly 1, and in the latter case the repeated character must be bRepeatCode itself. We can verify this with an assertion:

byte *pbExpand(byte *pbFrom, byte *pbTo, size_t sizeFrom)
{
    byte b, *pbEnd;
    size_t size;

    ASSERT(pbFrom != NULL && pbTo != NULL && sizeFrom != 0);

    pbEnd = pbFrom + sizeFrom;   /* points just past the end of the buffer */

    while (pbFrom < pbEnd)
    {
        b = *pbFrom++;

        if (b == bRepeatCode)
        {
            /* Store "size" copies of b in pbTo. */
            b = *pbFrom++;
            size = (size_t)*pbFrom++;

            ASSERT(size >= 4 || (size == 1 && b == bRepeatCode));

            while (size-- > 0)
                *pbTo++ = b;
        }
        else
            *pbTo++ = b;
    }

    return (pbTo);
}

If this assertion fails, either pbFrom points to garbage or the compression routine has a bug. Either way there is an error, and one that would be hard to find without the assertion.

Use assertions to detect conditions that cannot happen

Handling it silently

Suppose you were hired to write the software for a nuclear reactor and had to handle the case of the core overheating.

Some programmers would solve the problem by having the program automatically flood the core with water, insert the cooling rods, or do whatever else it takes to cool the reactor. And as long as the program has things under control, they would see no need to alert anyone.

Other programmers would take a different approach and alert the reactor staff whenever the core overheats. The computer would still handle the situation automatically, but the operators would always know about it.

Which approach would you choose if you were writing this program?

I doubt there would be much disagreement: always alert the operators. That the computer can bring the reactor back under control is a separate matter. Cores do not overheat without a reason; something unusual must have happened to cause it. So while the computer deals with the problem, the operators should be told what happened so that it can be kept from happening again.

Surprisingly, programmers, especially experienced ones, routinely write code that does just the opposite: when something unexpected happens, the program quietly handles it without telling anyone. Some programmers even write their code this way on purpose. Perhaps you code this way yourself.

I am talking, of course, about defensive programming.

The previous section introduced pbExpand. That function was written defensively. The modified version below is not, as you can see from its loop conditions.

byte *pbExpand(byte *pbFrom, byte *pbTo, size_t sizeFrom)
{
    byte b, *pbEnd;
    size_t size;

    pbEnd = pbFrom + sizeFrom;   /* points just past the end of the buffer */

    while (pbFrom != pbEnd)
    {
        b = *pbFrom++;

        if (b == bRepeatCode)
        {
            /* Store "size" copies of b in pbTo. */
            b = *pbFrom++;
            size = (size_t)*pbFrom++;

            do
                *pbTo++ = b;
            while (--size != 0);
        }
        else
            *pbTo++ = b;
    }

    return (pbTo);
}

Although this version reflects the algorithm more precisely, experienced programmers rarely code this way. If they did, we might as well seat them in a two-seat Cessna that has neither seat belts nor doors. The code above simply feels too risky.

An experienced programmer thinks: "I know pbFrom should never be greater than pbEnd, but if that impossible situation ever did arise, it would be better for the outer loop to exit."

Likewise for the inner loop: even though size should always be greater than or equal to 1, using a while loop instead of a do loop keeps the program from going haywire if size is ever 0.

Protecting yourself against these "impossible" conditions seems reasonable, even smart. But what happens if pbFrom somehow gets incremented past pbEnd? Which is more likely to expose the bug, the risky version or the earlier defensive one? When this error occurs, the risky version will probably bring down the whole system, because pbExpand will try to expand everything in memory. The user will certainly notice that. The defensive version, on the other hand, will exit before it has done much damage, if any. The user might still notice the error, but I doubt it.

This is what typically happens: defensive programming is touted as good coding style, yet it hides bugs. Remember, the errors we are talking about should never happen, and guarding against them silently makes it harder to write bug-free code. The problem is especially hard when the code has a leapfrogging pointer like pbFrom, whose value advances by a different amount on each pass through the loop.

Does this mean we should abandon defensive programming?

The answer is no. Although defensive programming hides bugs, it does have value. The worst thing a program can do is crash and lose data that the user may have spent hours creating. In a less than ideal world, programs do crash, so anything that keeps user data from being lost is worthwhile. Defensive programming is aimed at exactly that goal. Without it, a program would be like a house of cards, collapsing at the slightest hiccup in the hardware or the operating system. At the same time, we do not want defensive programming to hide our bugs.

Suppose a function calls pbExpand with invalid arguments, for example with sizeFrom too small, so that the last byte of the data buffer happens to be bRepeatCode. Since this looks like the start of a compressed run, pbExpand will read 2 bytes beyond the end of the buffer, pushing pbFrom past pbEnd. The result? The risky version of pbExpand will probably crash, while the defensive version may save the user from losing data, although it could also trash up to 255 bytes of unknown memory. Since we want both things, a debug version that raises an alarm and a ship version that recovers safely, we can code defensively on the one hand and use an assertion to raise the alarm on the other:

byte *pbExpand(byte *pbFrom, byte *pbTo, size_t sizeFrom)
{
    byte b, *pbEnd;
    size_t size;

    pbEnd = pbFrom + sizeFrom;   /* points just past the end of the buffer */

    while (pbFrom < pbEnd)
    {
        b = *pbFrom++;

        ......
    }

    ASSERT(pbFrom == pbEnd);

    return (pbTo);
}

The assertion simply verifies that the function terminated normally. In the ship version of the function, the defensive code protects the user from harm if something goes wrong; in the debug version, the error is still reported.
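The same pairing of a defensive loop with an assertion can be tried on a smaller example. The sketch below is hypothetical throughout: CountRecords is an invented function, and the ASSERT macro is a simplified stand-in that maps onto the standard assert in debug builds and compiles to nothing otherwise.

```c
#include <assert.h>
#include <stddef.h>

#ifdef DEBUG
#define ASSERT(f) assert(f)     /* debug build: report the problem */
#else
#define ASSERT(f) ((void)0)     /* ship build: compiles to nothing */
#endif

/* Hypothetical example: count the 3-byte records in a buffer of
 * size bytes.  size is supposed to be a multiple of 3. */
size_t CountRecords(size_t size)
{
    size_t pos = 0, count = 0;

    while (pos < size)          /* defensive: "<" rather than "!=" */
    {
        pos += 3;
        count++;
    }
    ASSERT(pos == size);        /* raise the alarm if the loop overshot */

    return count;
}
```

If a caller passes a size that is not a multiple of 3, the ship build still terminates, while a build with DEBUG defined stops at the assertion and points at the caller's bug.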

You do not need to be overzealous about this in practice, however. For example, if pbFrom were always incremented by 1 on each pass, it would take a stray cosmic ray to push pbFrom past pbEnd and cause a problem. In that case the assertion would be pointless and could be removed. Decide whether an assertion is needed according to the specific case and common sense. Finally, note that loops are just one place where programmers habitually program defensively. Wherever you do it, ask yourself as you write the code: "Am I hiding bugs with this defensive programming?" If the answer is yes, add assertions to alert you to those bugs.

Don't hide bugs when you program defensively

Two algorithms are better than one algorithm

Checking for bad inputs and bad assumptions is not enough to catch every bug. Just as a caller can pass garbage to a function, the function can return garbage to its caller. We want neither.

Since memcpy and memset each return just one value, it is highly unlikely that they would return garbage. For more complex routines, you cannot be so sure.

For example, I recently wrote a development tool for Macintosh programmers: a 68000 disassembler. In writing it, I was not concerned with how fast it ran; what mattered was that it worked correctly. So I chose a simple table-driven algorithm, because it would be relatively easy to test. I also used assertions to catch any bugs that might slip through testing.

If you have ever looked at an assembly language reference manual, you know that such manuals describe each instruction in detail, including its binary encoding. For example, if you look up the ADD instruction in a 68000 reference manual, you will find that it has the following binary form:

Bit:   15 14 13 12 | 11 10 9  | 8 7 6   | 5 4 3 | 2 1 0
ADD:    1  1  0  1 | Register | Op-mode | Effective address
                   |          |         | Mode  | Register

We can ignore the Register and Mode fields and concern ourselves only with the bits that are explicitly 0 or 1. For ADD, those are the top four bits. Masking off every bit that is not explicitly 0 or 1 and checking whether the result is binary 1101, or hexadecimal 0xD, tells you whether the instruction is an ADD:

if ((inst & 0xF000) == 0xD000)
    it's an ADD instruction...

Seven bits of the DIVS instruction, which performs signed division, are explicitly 0 or 1:

Bit:   15 14 13 12 | 11 10 9  | 8 7 6 | 5 4 3 | 2 1 0
DIVS:   1  0  0  0 | Register | 1 1 1 | Effective address
                   |          |       | Mode  | Register

Similarly, masking off the Register and Mode fields, which are not explicitly 0 or 1, tells you whether the instruction is a DIVS:

if ((inst & 0xF1C0) == 0x81C0)
    it's a DIVS instruction...

You can check every instruction with this mask-then-compare technique. Once you have identified an instruction as an ADD or a DIVS, you can call a decode function to recover the contents of the Register and Mode fields that were ignored.

That is how I designed the disassembler.
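The mask-then-compare test can be wrapped in one small function. This is a sketch only: MatchesPattern is a hypothetical name, and defining instruction as an unsigned short assumes a 16-bit short.

```c
#include <assert.h>

typedef unsigned short instruction;   /* assumed: one 16-bit instruction word */

/* Nonzero if the bits of inst selected by mask equal pat. */
int MatchesPattern(instruction inst, instruction mask, instruction pat)
{
    assert((pat & ~mask) == 0);   /* pat may use only bits the mask keeps */
    return (inst & mask) == pat;
}
```

For example, MatchesPattern(inst, 0xF000, 0xD000) performs the ADD test and MatchesPattern(inst, 0xF1C0, 0x81C0) performs the DIVS test, whatever the Register and Mode fields contain.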

Naturally, the program does not use 142 different if statements to check for each of the 142 possible instructions. Instead it uses a table of masks, patterns, and decode functions, together with a loop that checks the instruction against each table entry. When an entry matches, the corresponding decode function is called to translate the instruction's Register and Mode fields.

Part of the table, and the code that uses it, is shown below:

/* idInst is a table of masks and patterns that identify the
 * bit patterns of the different types of instructions.
 */
static identity idInst[] =
{
    { 0xFF00, 0x0600, pcDecodeADDI  },   /* mask, pattern, function */
    { 0xF130, 0xD100, pcDecodeADDX  },
    { 0xF000, 0xD000, pcDecodeADD   },
    { 0xF000, 0x6000, pcDecodeBcc   },   /* short branches */
    { 0xF1C0, 0x4180, pcDecodeCHK   },
    { 0xF138, 0xB108, pcDecodeCMPM  },
    { 0xFF00, 0x0C00, pcDecodeCMPI  },
    { 0xF1C0, 0x81C0, pcDecodeDIVS  },
    { 0xF100, 0xB100, pcDecodeEOR   },
    /* ... */
    { 0xFF00, 0x4A00, pcDecodeTST   },
    { 0xFFF8, 0x4E58, pcDecodeUNLK  },
    { 0x0000, 0x0000, pcDecodeError }
};

/* pcDisasm
 *
 * Disassemble one instruction and fill in the opcode structure opc.
 * pcDisasm returns the updated program counter.
 *
 * Typical use: pcNext = pcDisasm(pc, &opc);
 */
instruction *pcDisasm(instruction *pc, opcode *popcRet)
{
    identity *pid;
    instruction inst = *pc;

    for (pid = &idInst[0]; pid->mask != 0; pid++)
    {
        if ((inst & pid->mask) == pid->pat)
            break;
    }

    return (pid->pcDecode(inst, pc + 1, popcRet));
}

As you can see, pcDisasm is not large. Its algorithm is simple: fetch the current instruction, find the matching entry in the table, call the corresponding decode function to fill in the opcode structure that popcRet points to, and finally return the updated program counter. The program counter must be updated because not all 68000 instructions are the same length. The decode functions read any additional words of the instruction if necessary, and they return the new program counter to pcDisasm.
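To see the table search run on its own, here is a cut-down sketch. It borrows three mask and pattern pairs from the table above but replaces the decode functions with a small enum; the type identitySketch, the enum, and the function IdentifyInstruction are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short instruction;     /* assumed: 16-bit instruction word */

typedef enum { opUnknown, opADDX, opADD, opDIVS } opKind;

typedef struct
{
    instruction mask;       /* which bits are explicitly 0 or 1   */
    instruction pat;        /* the values those bits must have    */
    opKind      kind;       /* stands in for the decode function  */
} identitySketch;

static const identitySketch idInstSketch[] =
{
    { 0xF130, 0xD100, opADDX    },
    { 0xF000, 0xD000, opADD     },
    { 0xF1C0, 0x81C0, opDIVS    },
    { 0x0000, 0x0000, opUnknown }       /* sentinel ends the search */
};

opKind IdentifyInstruction(instruction inst)
{
    const identitySketch *pid;

    for (pid = &idInstSketch[0]; pid->mask != 0; pid++)
    {
        if ((inst & pid->mask) == pid->pat)
            break;
    }

    /* If nothing matched, pid must be resting on the sentinel. */
    assert(pid->mask != 0 || pid->kind == opUnknown);
    return pid->kind;
}
```

Because the search stops at the first match, ADDX has to appear before ADD: every ADDX instruction also satisfies the looser ADD mask.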

Now let's go back to the original problem.

With a function like pcDisasm, it is hard for the programmer to know whether the data it returns is valid. pcDisasm may identify the instruction correctly, yet the decode function may produce garbage, and that kind of bug is hard to spot. One way to catch it is to put assertions in the decode function for every instruction. That would work, but a better place for the assertions is pcDisasm itself, because it is the bottleneck through which every decode function is called.

But how can pcDisasm tell whether the result is correct? The answer is that you must write code to validate what was filled into the opcode structure. And how do you do that? Essentially, you must write a routine that compares the 68000 instruction with the contents of the opcode structure. In other words, you must write a second disassembler.

That sounds a little crazy. Is it really necessary?

Consider how Microsoft Excel handles recalculation. Since speed is critical to the success of a spreadsheet, Excel uses a fairly complex algorithm to make sure it never recalculates formulas in cells that do not need it. The only problem is that the algorithm is so complex that changes to it can easily introduce bugs. Excel's programmers naturally do not want that to happen, so they wrote a second recalculation engine that exists only in the debug version of Excel. After the original engine finishes, the debug engine recalculates every cell that contains a formula, slowly but thoroughly. If the two results differ, an assertion fires.

Microsoft Word faces a similar problem. Because page layout speed is critical to a word processor, Word's programmers wrote that code in assembly language so they could optimize it by hand. That made it fast, but it also made bugs harder to guard against. And unlike Excel's recalculation engine, which seldom changes, Word's page layout code changes regularly as new features are added. So to detect layout bugs automatically, Word's programmers wrote a C version of each hand-optimized assembly language routine. If the two versions produce different results, an assertion fires.

In the same way, we can validate the first disassembler with a second one that exists only for debugging.

I won't bore you with the details of the second disassembler, pcDisasmAlt. Briefly, it is logic-driven rather than table-driven: it uses nested switch statements to peel off the significant bits of the instruction until it has isolated the one it wants. The following code shows how pcDisasmAlt is used to validate the first disassembler:

instruction *pcDisasm(instruction *pc, opcode *popcRet)
{
    identity *pid;
    instruction inst = *pc;
    instruction *pcRet;

    for (pid = &idInst[0]; pid->mask != 0; pid++)
    {
        if ((inst & pid->mask) == pid->pat)
            break;
    }

    pcRet = pid->pcDecode(inst, pc + 1, popcRet);

#ifdef DEBUG
    {
        opcode opc;

        /* Check that both outputs are valid. */
        ASSERT(pcRet == pcDisasmAlt(pc, &opc));
        ASSERT(memcmp(popcRet, &opc, sizeof(opcode)) == 0);
    }
#endif

    return (pcRet);
}

Normally, adding debug checks should have no other effect on the existing code, but that was not possible here: we had to introduce the local variable pcRet in order to validate the pointer returned by pid->pcDecode. Fortunately, this does not violate the basic rule that debug code should be executed in addition to the original code, not instead of it. The rule may seem obvious, but once you start using assertions and debug code you will sometimes be tempted to replace the original code with debug code. Chapter 3 shows an example of this; for now, just remember to resist the urge. Although we had to modify pcDisasm to add the debug checks, the debug code does not run in place of the original code.

None of this means that every function in a program needs two versions; that would be absurd and a waste of time. The right approach is to do this only for the critical parts of the program. I am sure most programs have such parts: the recalculation engine in a spreadsheet, the page layout code in a word processor, the task scheduler in a project manager, the search and retrieval code in a database. In addition, the code that ensures user data is not lost is critical in every program.

As you write code, seize every opportunity to validate your results. Bottleneck functions, through which all other functions are called, are especially good places for such checks. Use a different algorithm wherever you can, not just a second implementation of the same one. A different algorithm not only exposes bugs in the implementation, it also increases the chance of finding bugs in the algorithm itself.
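A second disassembler is too large to show, but the idea fits in a few lines. In this sketch, which is not taken from the original text, a fast bit-counting routine is validated in debug builds by a slower routine that uses a different algorithm. The names BitCount and BitCountSlow are hypothetical, and the standard assert stands in for the ASSERT macro.

```c
#include <assert.h>

/* Slow but obviously correct: test one bit at a time. */
unsigned BitCountSlow(unsigned v)
{
    unsigned n = 0;

    for (; v != 0; v >>= 1)
        n += v & 1u;
    return n;
}

/* Faster: each pass through the loop clears the lowest 1 bit. */
unsigned BitCount(unsigned v)
{
    unsigned n = 0, w = v;

    while (w != 0)
    {
        w &= w - 1;
        n++;
    }

#ifdef DEBUG
    assert(n == BitCountSlow(v));   /* validate with the other algorithm */
#endif

    return n;
}
```

The check costs nothing in a ship build, and in a debug build any disagreement between the two algorithms is reported at the moment it occurs.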

Use a different algorithm to validate your results

Hey, what's going on here?

Earlier in this chapter I said that you must be careful when defining the ASSERT macro. In particular, I said that it must not move memory, call other functions, or cause any other unexpected side effects. If that is so, why does pcDisasm use the following assertions, which break those rules?

/* Check that both outputs are valid. */
ASSERT(pcRet == pcDisasmAlt(pc, &opc));
ASSERT(memcmp(popcRet, &opc, sizeof(opcode)) == 0);

The reason ASSERT itself must not call other functions is that it could disturb the code surrounding the assertion in unpredictable ways. In the code above, however, it is not ASSERT that calls the functions; it is I, the user of ASSERT. Because I know that it is safe to call those functions from pcDisasm, calling them in these assertions causes no problem.

Catch errors at startup

So far we have ignored the Register and Mode fields of the instructions. But what happens if those bits happen to match the pattern of another instruction in the table? For example, the binary form of the EOR instruction is:

Bit:   15 14 13 12 | 11 10 9  | 8 | 7 6  | 5 4 3 | 2 1 0
EOR:    1  0  1  1 | Register | 1 | Mode | Effective address
                   |          |   |      | Mode  | Register

The binary form of the instruction CMPM is very similar to it:

Bit:   15 14 13 12 | 11 10 9  | 8 | 7 6  | 5 4 3 | 2 1 0
CMPM:   1  0  1  1 | Register | 1 | Size | 0 0 1 | Register

Notice that if the effective address Mode field of an EOR instruction is 001, the instruction looks like a CMPM. Consequently, if EOR appears before CMPM in the idInst table, every CMPM instruction passed in will be misidentified as an EOR.

Fortunately, because pcDisasm and pcDisasmAlt use different algorithms, an assertion fails the first time a CMPM instruction is disassembled: pcDisasm fills in the opcode structure for an EOR, while pcDisasmAlt fills in the CMPM we expect. When the debug code compares the two structures, the assertion fails. That is the power of using a different algorithm in the debug code.

The unpleasant part is that the bug is discovered only when someone tries to disassemble a CMPM instruction. A thorough enough test suite would find it, of course. But remember what I said in Chapter 1: the goal is to find bugs automatically and as early as possible, without relying on other people.

So although we could leave this to the testing group, we should not. Many programmers believe that testers exist to test their code for them, but a tester's job is not just to test your code, and finding the bugs in your own code is your own job. If you disagree, name another profession in which it is acceptable to do sloppy work just because someone else is there to check it. There isn't one, so why should programming be the exception? If you want to write bug-free code, you have to take responsibility for it. So let's start now.

As you program, whenever you notice something risky in the code, ask yourself, "How can I detect this bug automatically?" Ask that question habitually and you will find ways to make your programs more robust.

For example, to catch the bug above, you can scan the table during program initialization, in main, and check that no earlier entry wrongly intercepts an instruction meant for a later one. The routine that performs this check is short, but it is not obvious:

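void CheckIdInst(void)
{
    identity *pid, *pidEarlier;
    instruction inst;

    for (pid = &idInst[0]; pid->mask != 0; pid++)
    {
        for (pidEarlier = &idInst[0]; pidEarlier < pid; pidEarlier++)
        {
            /* Force pid's don't-care bits to match the earlier pattern. */
            inst = pid->pat | (pidEarlier->pat & ~pid->mask);

            /* bitcount (assumed helper) returns the number of 1 bits. */
            if ((inst & pidEarlier->mask) == pidEarlier->pat)
                ASSERT(bitcount(pid->mask) < bitcount(pidEarlier->mask));
        }
    }
}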

The routine checks for this bug by comparing each entry in the table with every entry that precedes it. Every instruction has its "don't care" bits, the bits that are masked off when its pattern is formed. But what if those don't-care bits happen to make the instruction match another entry in the table? Then the two entries conflict. For entries that may conflict, which one should come first in the table?

The answer is simple: if two entries in the table match the same instruction, the one with more 1 bits in its mask must come first. If that is not intuitive, consider the binary forms of EOR and CMPM. If both entries match the same instruction, which should be taken as the "correct" match, and why? Each bit that the instruction's binary form defines explicitly as 0 or 1 has a 1 in the corresponding mask, so the entry with more 1 bits in its mask is the more specific match, and therefore the more correct one.

Determining whether two entries conflict is more involved. The approach is to take the pattern of one entry and force its don't-care bits to match the pattern of an earlier entry in the table. The resulting value is what the routine assigns to inst. By construction, inst must match the current entry, because only bits that play no part in the current entry's pattern were changed. If inst also matches the earlier entry, the two entries conflict, and their masks must be compared.

By calling CheckIdInst during initialization, you catch such conflicts the first time the disassembler is run, rather than when it happens to disassemble a conflicting instruction. Programmers should look for similar startup checks in their own programs so that bugs are found as early as possible. Otherwise the bugs may stay hidden for a long time.
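The ordering rule can be tried on a two-entry table. The sketch below is a testable variant of the startup check that counts conflicts instead of asserting, so that a badly ordered table can be examined without stopping the program. The names maskPat, CountOnes, and CountOrderConflicts are hypothetical; only the EOR and CMPM masks and patterns come from the table shown earlier.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short instruction;     /* assumed: 16-bit instruction word */

typedef struct
{
    instruction mask;
    instruction pat;
} maskPat;                              /* hypothetical cut-down table entry */

static unsigned CountOnes(unsigned v)
{
    unsigned n = 0;

    for (; v != 0; v >>= 1)
        n += v & 1u;
    return n;
}

/* Count the pairs in which an earlier entry can intercept an instruction
 * meant for a later entry without having more 1 bits in its mask. */
size_t CountOrderConflicts(const maskPat *table, size_t cEntries)
{
    size_t i, j, cConflicts = 0;

    assert(table != NULL);

    for (i = 0; i < cEntries; i++)
    {
        for (j = 0; j < i; j++)
        {
            /* The later pattern, with its don't-care bits taken
             * from the earlier pattern. */
            instruction inst = (instruction)
                (table[i].pat | (table[j].pat & ~table[i].mask));

            if ((inst & table[j].mask) == table[j].pat &&
                CountOnes(table[i].mask) >= CountOnes(table[j].mask))
                cConflicts++;
        }
    }
    return cConflicts;
}
```

With CMPM (mask 0xF138) listed before EOR (mask 0xF100) the function reports no conflicts; with the two entries swapped it reports one.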

Don't wait for bugs to happen; use startup checks

A word of caution

Once you start using assertions, you may see the number of reported bugs rise dramatically. That can alarm people who are not prepared for it.

I once rewrote a code library that was shared by several groups at Microsoft. The original library had been written without assertions and was full of bugs; I added assertions to the new one. To my surprise, when I handed the new version to these groups, one programmer got very angry and demanded that I give him back the original library.

I asked him why.

"We've had lots of bugs since we installed this library," he said.

"Are you saying the new library is causing bugs?" I was shocked.

"It seems that way. We're getting lots of assertion failures that we never had before."

"Have you looked into those assertions?" I asked.

"Yes. They point to bugs in our code. But with this many assertions, some of them have to be wrong, and we don't have time to track down problems that aren't ours. I want the old library back."

I certainly did not think the assertions were wrong, so I asked him to keep using the new library until he found an assertion that was. He was not happy, but he agreed. In the end he found that all of the bugs were in his own project, not in my library. He had panicked because I had not told anyone that I had added assertions to the new library, and nobody likes to see bugs appear. Had I told everyone that assertion failures are a good thing, he might not have panicked. Programmers are not the only ones who panic over bugs, though. Because the company evaluates projects by the number of unfinished features and the number of outstanding bugs, every project group gets nervous whenever its bug count jumps. So let others know when you add assertions. Otherwise, have some Excedrin ready for them.

Summary

In this chapter we introduced ways to use assertions to detect errors in a program automatically. Although this technique is very valuable for finding the "last error" in a program, like any tool it can be overused, and you need to apply it with judgment. To some programmers, asserting before every division that the denominator is not 0 may be important; to others it may seem ridiculous. Use your own judgment.

Assertions should stay in the program for the entire life of the project. Do not delete them once the program has shipped; they will still be useful when you add new features in the future.

Key points

• Maintain both a ship version and a debug version of the program. Deliver the ship version, but use the debug version as much as possible to detect errors automatically.

• Assertions are a shorthand way to write debugging checks. Use them to catch illegal conditions that should never occur. Do not confuse illegal conditions with error conditions; the latter must be handled in the final product.

• Use assertions to validate function arguments and to alert programmers when they rely on undefined behavior. The more strictly a function is defined, the easier it is to validate its arguments.

• When you write a function, review it repeatedly and ask yourself, "What am I assuming?" Once you have identified an assumption, either assert it or rewrite the code to remove it. Also ask, "What is most likely to go wrong in this code, and how can I detect the problem automatically?" Strive to implement tests that catch errors as early as possible.

• Textbooks commonly encourage defensive programming, but remember that this coding style hides errors. When you write defensive code, use assertions to raise an alarm if the "impossible" case actually happens.
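The distinction between illegal conditions and error conditions can be illustrated with a minimal sketch. The helper fCopyName is hypothetical and does not appear in the text, and the standard assert macro from <assert.h> stands in for the book's own assertion macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* fCopyName is a hypothetical helper used only for illustration.
 * Returns 1 on success, 0 if the name does not fit in the buffer. */
int fCopyName(char *pchDst, size_t cbDst, const char *pchSrc)
{
    /* Illegal conditions: only a bug in the caller can produce these,
     * so they are asserted. Defining NDEBUG removes the check. */
    assert(pchDst != NULL && pchSrc != NULL && cbDst != 0);

    /* Error condition: this can legitimately happen at run time,
     * so it is handled in every build, including the ship version. */
    if (strlen(pchSrc) >= cbDst)
        return 0;

    strcpy(pchDst, pchSrc);
    return 1;
}
```

A caller that passes a name that is too long gets a 0 back and must cope with it; a caller that passes NULL has a bug, and the debug build stops at the assertion.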

Exercises

1) Suppose you must maintain a shared library and want to use assertions in it, but you do not want to release the library's source code. How could you define an assertion that displays a meaningful message on failure instead of a source file name and line number? For example, memcpy might display the following assertion:

Assertion failed in memcpy: the blocks overlap

2) Every time you use ASSERT, the __FILE__ macro produces a file name string. That is, if you use 73 assertions in the same file, the compiler may generate 73 identical copies of the file name string. How could you implement the ASSERT macro so that the file name string is defined only once per file?

3) What is wrong with the assertion in the following function?

/* getline -- read a \n-terminated line into a buffer */

void getline(char *pch)
{
    int ch;    /* ch "must" be an int */

    do
        ASSERT((ch = getchar()) != EOF);
    while ((*pch++ = ch) != '\n');
}

4) When programmers add a new element to an enumerated type, they sometimes forget to add a new case to the corresponding switch statements. How could you use assertions to help detect this problem?

5) CheckIdInst verifies the contents of the idInst table, but other problems can still occur in the table. Because the table has many entries, it is easy to enter an incorrect mask or instruction pattern. How could you enhance CheckIdInst so that it detects such input errors automatically?

6) As mentioned, when the "effective address mode" field of the EOR instruction is 001, the instruction is really a CMPM instruction. The EOR instruction has other restrictions as well: the two bits of its mode field can never be 11 (which would make it a CMPA.L instruction), and if its "effective address mode" is 111, the "effective address register" field must be 000 or 001. Since pcDecodeEOR should never be called for these illegal EOR combinations, how could you detect such errors in the table?

7) How could you use a different algorithm to verify the qsort function? A binary search routine? The itoa function?

Project:

Contact the company that develops your operating system and encourage it to provide programmers with a debug version of the system. (This benefits both parties: companies that develop operating systems want people to write applications for them, because applications make their operating systems easier to sell.)

Chapter 3: Fortify your subsystems

In the previous chapter we said that bottleneck routines are excellent places to add assertions, because they let you do very thorough error checking with very little code. It is like a football stadium: 50,000 fans may come to watch the game, but only a few ticket takers are needed if they stand at the stadium entrances. Programs have such entrances too: the call points of their subsystems.

Take a file system, for example. Users can open files, close files, read and write files, and create files. These are five basic file operations, and they typically require a large amount of complex supporting code. With these basic operations, users can do their file work simply by calling them, without worrying about implementation details such as file directories, free-space maps, or the particular hardware device (disk drive, tape drive, or networked device).

As another example, with a memory manager the user can allocate memory, release memory, and sometimes change the size of allocated memory. These operations also require a lot of supporting code.

Typically, a subsystem hides its implementation details, which may be quite complex, and gives users a few key entry points. Programmers communicate with the subsystem by calling these entry points. So if a program uses such a subsystem and you add debug checks at its entry points, you can do a great deal of error checking at little cost.

For example, suppose you were asked to write the malloc, free, and realloc routines (sometimes you have to). You might fill the code with assertions, test it thoroughly, and write an excellent programmer's guide. Even so, we know that users will still run into problems when they use these routines. What can we do to help them?

The recommendation here is that once you have written a subsystem, you ask yourself, "How might programmers misuse this subsystem, and how can I detect those problems automatically?" Normally you should already have asked this question when you started coding, in order to reduce the risk in the code. Ask it again anyway. For a memory manager, the errors programmers might make are:

• Allocating a memory block and using its uninitialized contents;

• Releasing a memory block but continuing to reference its contents;

• Calling realloc to expand a memory block, which moves the contents to a new location, while the program keeps referencing the old location;

• Allocating a memory block and then "losing" it because no pointer to it was saved;

• Reading or writing beyond the boundaries of an allocated memory block;

• Failing to check for error conditions.

These problems are not hypothetical; they happen all the time. Worse, they are often not reproducible, so they are hard to track down. The program fails once and you never see the failure again, until one day an angry user who keeps running into it calls and asks you to "please" get rid of the error.

Indeed, these errors are hard to find. But that does not mean we cannot improve matters. Assertions are useful, but only if they get a chance to fire. Would assertions in the memory manager catch the problems listed above? Obviously not.

This chapter describes other techniques for fortifying subsystems. These techniques can save you a lot of trouble. Although the chapter uses the C memory manager as its example, the conclusions apply equally to other subsystems, whether a simple linked-list manager or a multi-user shared text-retrieval tool.

Now you see it, now you don't

Normally, the way to address the problems above would be to add test code directly to the subsystem. For two reasons, this book does not do that. First, I do not want to clutter the examples with implementations of malloc, free, and realloc. Second, users sometimes cannot get the source code for the subsystems they use. I say this because only two of the six compilers used to test the examples in this book provided source code for their standard libraries.

Because you may not have the subsystem's source code, and because implementations may differ even when you do, this book does not add test code directly to the subsystem's source. Instead, it wraps the memory manager in so-called "shell" functions and adds the test code in that wrapper layer. This approach works even when the subsystem's source code is unavailable. The shell functions follow the naming conventions described earlier in this book.

Let's start with the shell function for malloc. It looks like this:

/* fNewMemory -- allocate a memory block */

flag fNewMemory(void **ppv, size_t size)
{
    byte **ppb = (byte **)ppv;

    *ppb = (byte *)malloc(size);
    return (*ppb != NULL);    /* Success? */
}

This function looks more complicated than malloc, mainly because of its void ** pointer parameter. But when you see how programmers call it, you will find it clearer than a call to malloc. With fNewMemory, a call of the following form:

if ((pbBlock = (byte *)malloc(32)) != NULL)
    success -- pbBlock points to the allocated block
else
    failure -- pbBlock is NULL

can be replaced by:

if (fNewMemory(&pbBlock, 32))
    success -- pbBlock points to the allocated block
else
    failure -- pbBlock is NULL

The latter form does the same thing as the former. The only difference between fNewMemory and malloc is that fNewMemory separates the "success" flag from the pointer to the memory block, whereas malloc combines these two outputs in a single return value. With either form, if the allocation succeeds, pbBlock points to the allocated block; if it fails, pbBlock is NULL.
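For readers who want to compile the basic shell function, here is a minimal sketch. The typedefs for flag and byte are assumptions (the text uses both names without showing their definitions here), fMakeBuffer is a hypothetical caller, and the explicit (void **) cast is added because many compilers reject passing a byte ** where a void ** is expected.

```c
#include <stddef.h>
#include <stdlib.h>

typedef int flag;            /* assumed definition */
typedef unsigned char byte;  /* assumed definition */

/* Shell function for malloc: the status comes back as the return value,
 * the block comes back through ppv. */
flag fNewMemory(void **ppv, size_t size)
{
    byte **ppb = (byte **)ppv;

    *ppb = (byte *)malloc(size);
    return (*ppb != NULL);
}

/* Hypothetical caller: allocate a 32-byte buffer and hand it back. */
flag fMakeBuffer(byte **ppbOut)
{
    byte *pbBlock;

    if (fNewMemory((void **)&pbBlock, 32)) {
        pbBlock[0] = 0;          /* success: pbBlock is usable */
        *ppbOut = pbBlock;
        return 1;
    }
    *ppbOut = NULL;              /* failure: pbBlock is NULL */
    return 0;
}
```

The caller never has to compare a pointer against NULL itself; the test reads as a plain success check.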

In the previous chapter we said that you should either eliminate undefined behavior from your program or use assertions to verify that nobody relies on it. Applying this rule to malloc, we find two cases in which its behavior is undefined and which must be dealt with. First, according to the ANSI standard, the result of asking malloc for a zero-length block is not defined. Second, when malloc succeeds, the contents of the block it returns are undefined; they may be zeros or they may be garbage, and there is no way to know.

Zero-length blocks are easy to handle: an assertion can check for that case. But for the other case, can an assertion check whether the contents of an allocated block are valid? No, that would be meaningless. So we have no choice but to eliminate the undefined behavior. The obvious way is to have fNewMemory return a zero-filled block when the allocation succeeds. That would solve the problem, but in a correct program the initial contents of an allocated block should not affect execution, so the unnecessary fill would only burden the ship version and should be avoided.

Unnecessary fills can also hide errors.

Suppose you allocate memory for a data structure but forget to initialize one of its fields (or a maintenance programmer extends the data structure and forgets to initialize the new fields). That is an error. But if fNewMemory fills those fields with zero or some other value that happens to be useful, it may hide the error.
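How a zero fill can hide a forgotten initialization is easy to demonstrate. The following sketch is illustrative only: Record and pNewRecord are hypothetical names, and the fill byte is passed in as a parameter so that both policies can be compared side by side.

```c
#include <stdlib.h>
#include <string.h>

#define bGarbage 0xA3

/* Hypothetical record; byte-sized fields avoid padding concerns. */
typedef struct {
    unsigned char cUsed;
    unsigned char cRetries;   /* the field the programmer forgets to set */
} Record;

/* Allocate a Record pre-filled with bFill, then "forget" one field. */
Record *pNewRecord(unsigned char bFill)
{
    Record *pr = malloc(sizeof(Record));

    if (pr != NULL) {
        memset(pr, bFill, sizeof(Record));
        pr->cUsed = 0;
        /* Deliberate bug: pr->cRetries is never initialized. */
    }
    return pr;
}
```

With a fill of 0x00, cRetries reads as 0 and the program appears to work; with a fill of bGarbage, cRetries reads as 0xA3 and the missing initialization shows up the first time the field is used.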

Still, we do not want the contents of allocated blocks to be arbitrary, because that makes errors hard to reproduce. What happens if an error shows up only when the garbage in an allocated block happens to have a particular value? Most of the time you will not see the error, yet the program will keep failing occasionally for no obvious reason. Imagine how hard it would be to remove all the errors from a program if each one appeared only at particular moments. It would drive programmers (and testers) crazy. The key to exposing errors is to eliminate their randomness.

How you do that depends on the particular subsystem and its random characteristics. For malloc, you can eliminate the randomness by filling the memory block it returns. Of course, the fill should happen only in the debug version of the program; that solves the problem without affecting the ship code. But remember that we do not want to hide errors, so the fill value should look bizarre, like garbage, and should be chosen to expose errors.

For Macintosh programs, for example, you can use the value 0xA3. I chose it by asking the following questions: What value would expose illegal pointers? What value would expose illegal counters or index values? What if the newly allocated block were executed as code? On some Macintosh machines, a program cannot reference 16-bit or 32-bit values through an odd pointer, so the fill value should be odd. Also, an illegal counter or index value that is large will cause noticeable delays or make the system behave abnormally, which increases the chance of discovering the error. So the fill value should be a strange-looking, large, odd number that fits in one byte. I chose 0xA3 not only because it meets these requirements but also because it is an illegal machine-language instruction. If a memory block is somehow executed, the program crashes at once, and under a system debugger it produces an "undefined A-Line trap." This last point may sound like looking for a needle in a haystack, since the chance of finding an error this way is tiny. But why not take every opportunity, however small, to detect errors automatically?

The appropriate fill value differs from machine to machine. On Intel 80x86-based machines, for example, pointers can be odd, so an odd fill value is not important. The selection process is similar, though: first work out how uninitialized data could be misused, then do everything you can to make those situations visible. For Microsoft applications, the fill value can be 0xCC. It is large and easy to spot, and if it is executed, the program drops safely into the debugger.

With the check on the block size and the fill of the memory block added, fNewMemory looks like this:

#define bGarbage 0xA3

flag fNewMemory(void **ppv, size_t size)
{
    byte **ppb = (byte **)ppv;

    ASSERT(ppv != NULL && size != 0);

    *ppb = (byte *)malloc(size);

    #ifdef DEBUG
    {
        /* Fill the new block with recognizable garbage. */
        if (*ppb != NULL)
            memset(*ppb, bGarbage, size);
    }
    #endif

    return (*ppb != NULL);
}

This version of fNewMemory not only helps make errors reproducible but often makes them easy to spot. If you see during debugging that a loop index has the value 0xA3A3, or that a pointer has the value 0xA3A3A3A3, it is clear that they were never initialized. More than once, while tracking down one error, I found another because I happened to run across some combination of 0xA3 bytes.
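The patterns 0xA3A3 and 0xA3A3A3A3 fall directly out of the byte-wise fill. As a small sketch, the hypothetical helpers below (neither appears in the text) show the value a 32-bit field takes in garbage-filled memory and a check for whether a region still holds nothing but the fill byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define bGarbage 0xA3

/* Hypothetical debug helper: returns 1 if every byte of the region
 * still holds the fill value, 0 otherwise (or if the region is empty). */
int fLooksUninitialized(const void *pv, size_t size)
{
    const unsigned char *pb = (const unsigned char *)pv;
    size_t i;

    for (i = 0; i < size; i++)
        if (pb[i] != bGarbage)
            return 0;
    return size != 0;
}

/* The value a 32-bit field holds when read from garbage-filled memory.
 * Every byte is the same, so byte order does not matter. */
uint32_t dwGarbage(void)
{
    uint32_t dw;

    memset(&dw, bGarbage, sizeof dw);
    return dw;
}
```

Such a check is only a hint: a field that was legitimately set to a value made up of 0xA3 bytes would be reported as well, which is one more reason to pick a fill value that rarely occurs as real data.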

So examine the subsystems in your application to find where the design allows random errors. Once you have found those places, eliminate them by changing the design, or add debug code around them to minimize the randomness of errors.

Eliminate random behavior. Force errors to be reproducible.

Shred your garbage

The shell function for free looks like this:

void FreeMemory(void *pv)
{
    free(pv);
}

According to the ANSI standard, if you pass an invalid pointer to free, the result is undefined. That seems reasonable, but how can you tell whether pv is valid? How can you tell that pv points to the start of an allocated block? You can't, at least not without more information.

Things can get even worse.

Suppose a program maintains some kind of tree, and its deleteNode routine calls FreeMemory to release nodes. Now suppose deleteNode has an error: when it releases a node, it fails to update the link pointers in the adjacent nodes. The tree will then contain a node that has been freed. But so what? On most systems, the freed node will still look like a valid tree node. This should not be surprising. When you call free, you are telling the memory manager that you no longer need the memory, so why should it waste time scrambling the contents?

From an optimization standpoint, that is reasonable. But it has a nasty side effect: released memory, which is now garbage, still appears to contain valid data. With such a node in the tree, traversing the tree does not cause the failure you might expect. Instead, the tree appears to have no problems at all; it looks like a valid tree. How would you ever find this problem? Unless you are as lucky as a lottery winner, you probably won't.

"No problem," you might say. "Just add some debug code to FreeMemory so that it fills the block with bGarbage before calling free. Then the contents of the block will look like garbage, and the tree-handling code will fail as soon as it reaches the freed node." That is a good idea, but do you know the size of the block being released? Unfortunately, no.

You may be ready to throw up your hands and admit defeat by FreeMemory. After all, there is no way to use an assertion to check the validity of pv, and no way to destroy the contents of the released block, because you do not know how big the block is.

But don't give up. Let's assume for the moment that there is a debug function, sizeofBlock, that returns the size of any allocated memory block. If you have the source code for the memory manager, writing such a function is probably not hard. Even if you don't, there is no cause for concern; later in this chapter we will present one way to implement sizeofBlock.

So let's assume that sizeofBlock exists. With it, you can destroy the contents of a memory block before releasing it:

void FreeMemory(void *pv)
{
    ASSERT(pv != NULL);

    #ifdef DEBUG
    {
        /* Shred the block before handing it back. */
        memset(pv, bGarbage, sizeofBlock(pv));
    }
    #endif

    free(pv);
}

The debug code in this function not only fills the released block with garbage; the call to sizeofBlock also validates pv. If the pointer is not valid, sizeofBlock will catch it (the function can do this because it necessarily knows the details of every allocated block).

Since NULL is a legal argument to free (according to the ANSI standard, free does nothing in that case), isn't it strange to assert that pv is not NULL? The reason is what you might expect: I do not believe in passing meaningless NULL pointers to functions merely for convenience, and the assertion verifies that this does not happen. You may have a different view, and if so you may want to remove the assertion. My point is that you do not have to follow the ANSI standard blindly. Just because others think free should accept NULL pointers does not mean you must accept that idea.
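The text defers its own implementation of sizeofBlock to later in the chapter. For experimenting in the meantime, one possible stand-in is to store each block's size in a small header placed in front of the block. This is a different technique from the one the chapter goes on to describe, and the names BlockHeader, pvDebugAlloc, and DebugFree are hypothetical. The sketch assumes a C11 compiler for max_align_t, and unlike a real sizeofBlock it cannot detect an invalid pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define bGarbage 0xA3

/* Assumed layout: a header recording the size precedes every block.
 * The union keeps the user's memory suitably aligned. */
typedef union {
    size_t      size;
    max_align_t align;
} BlockHeader;

/* Allocate size bytes, remember the size, and garbage-fill the block. */
void *pvDebugAlloc(size_t size)
{
    BlockHeader *ph;

    assert(size != 0);
    ph = malloc(sizeof(BlockHeader) + size);
    if (ph == NULL)
        return NULL;
    ph->size = size;
    memset(ph + 1, bGarbage, size);
    return ph + 1;
}

/* Return the size recorded for a block obtained from pvDebugAlloc. */
size_t sizeofBlock(void *pv)
{
    assert(pv != NULL);
    return ((BlockHeader *)pv - 1)->size;
}

/* Shred the block's contents, then release it together with its header. */
void DebugFree(void *pv)
{
    assert(pv != NULL);
    memset(pv, bGarbage, sizeofBlock(pv));
    free((BlockHeader *)pv - 1);
}
```

The cost is one header per block, which matters little in a debug build. Blocks from pvDebugAlloc must be released with DebugFree, never with free directly, because the pointer handed to the caller is not the one malloc returned.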

realloc is another function that releases memory and creates garbage. Its shell function is shown below; it closely resembles fNewMemory, the shell function for malloc:

flag fResizeMemory(void **ppv, size_t sizeNew)
{
    byte **ppb = (byte **)ppv;
    byte *pbResize;

    pbResize = (byte *)realloc(*ppb, sizeNew);
    if (pbResize != NULL)
        *ppb = pbResize;
    return (pbResize != NULL);
}

Like fNewMemory, fResizeMemory returns a status flag that indicates whether the block was resized successfully. If pbBlock points to an allocated memory block, you can resize it like this:

if (fResizeMemory(&pbBlock, sizeNew))
    success -- pbBlock points to the new block
else
    failure -- pbBlock points to the old block

Note that, unlike realloc, fResizeMemory does not hand back a null pointer when the operation fails. In that case the pointer still refers to the original memory block, whose contents are unchanged.
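A compilable version of the basic resize shell makes the failure behavior concrete. As before, the typedefs for flag and byte are assumed definitions, and fGrowBuffer is a hypothetical caller added for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

typedef int flag;            /* assumed definition */
typedef unsigned char byte;  /* assumed definition */

/* Shell function for realloc: the caller's pointer is updated only
 * when the resize succeeds. */
flag fResizeMemory(void **ppv, size_t sizeNew)
{
    byte **ppb = (byte **)ppv;
    byte *pbResize = (byte *)realloc(*ppb, sizeNew);

    if (pbResize != NULL)
        *ppb = pbResize;
    return (pbResize != NULL);
}

/* Hypothetical caller. Compare with the common idiom
 *     pb = realloc(pb, n);
 * which overwrites pb with NULL on failure and loses the old block. */
flag fGrowBuffer(byte **ppb, size_t sizeNew)
{
    return fResizeMemory((void **)ppb, sizeNew);
}
```

Because the old pointer survives a failed call, the caller can keep using the original block or release it in an orderly way.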

Interestingly, realloc in effect does the work of both free and malloc. Which one it does depends on whether the block is shrinking or growing. In FreeMemory, the contents of a block are destroyed before the block is released; in fNewMemory, the block newly allocated by malloc is filled with strange-looking "garbage." For fResizeMemory to be as robust, it must do both. So the function contains two separate blocks of debug code:

flag fResizeMemory(void **ppv, size_t sizeNew)
{
    byte **ppb = (byte **)ppv;
    byte *pbResize;
    #ifdef DEBUG            /* debug-only local variable */
    size_t sizeOld;
    #endif

    ASSERT(ppv != NULL && sizeNew != 0);

    #ifdef DEBUG
    {
        sizeOld = sizeofBlock(*ppb);

        /* If shrinking, shred the tail that is about to be released. */
        if (sizeNew < sizeOld)
            memset((*ppb)+sizeNew, bGarbage, sizeOld-sizeNew);
    }
    #endif

    pbResize = (byte *)realloc(*ppb, sizeNew);
    if (pbResize != NULL)
    {
        #ifdef DEBUG
        {
            /* If expanding, initialize the new tail. */
            if (sizeNew > sizeOld)
                memset(pbResize+sizeOld, bGarbage, sizeNew-sizeOld);
        }
        #endif

        *ppb = pbResize;
    }
    return (pbResize != NULL);
}

All of this seems to add a lot of extra code to the function. But if you read it closely, you will find that most of it is fluff: braces, #ifdef directives, and comments. Even if it really did add a lot of extra code, you would not need to worry. The debug version does not have to be small or especially fast; it only has to meet the daily needs of programmers and testers. So unless the debug code would make the program too big or too slow to use, you can generally add whatever debug code you think necessary to strengthen the program's error detection. The important thing is to examine your subsystems, identify every case in which they create or release data, and turn that data into garbage.

Destroy your garbage so that it cannot be misused.

Using #ifdef to remove debug-only local variables

Look at sizeOld, a local variable used only for debugging. Enclosing the declaration of sizeOld in an #ifdef sequence makes the program look ugly, but it is important, because all debugging code should be removed from the ship version. I know that without the #ifdef directives the program would be more readable and that the debug and ship versions would be equally correct. The only catch is that in the ship version sizeOld would be declared but never used.

Declaring sizeOld without using it in the ship version might seem harmless. It isn't. If a maintenance programmer fails to notice that sizeOld is a debug-only variable and uses it in the ship version, it will be uninitialized there and may cause serious problems. Enclosing the declaration of sizeOld in #ifdef directives makes it clear that sizeOld is a debug-only variable. Then, if a programmer uses sizeOld in non-debug code (code outside any #ifdef), the compiler will report an error when the ship version is built. It is like double insurance.

Using #ifdef directives to remove debug variables makes the program harder to read, but it helps eliminate one source of potential errors.
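The "double insurance" can be seen in a small sketch. The function cbClamp and the variable cbOld are hypothetical; the standard assert macro stands in for the book's assertion macro, and DEBUG is assumed to be defined only for debug builds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: limit a byte count to a maximum. */
size_t cbClamp(size_t cbCur, size_t cbMax)
{
    #ifdef DEBUG
    size_t cbOld = cbCur;       /* debug-only: exists only when DEBUG is defined */
    #endif

    if (cbCur > cbMax)
        cbCur = cbMax;

    #ifdef DEBUG
    assert(cbCur <= cbOld);     /* the count must never grow */
    #endif

    /* Writing "return cbOld;" here would compile in the debug build but
     * produce an undeclared-identifier error in the ship build, which is
     * exactly the protection the #ifdef around the declaration provides. */
    return cbCur;
}
```

The function behaves the same with or without DEBUG defined; only the extra check disappears from the ship build.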

Create movers and shakers

Suppose the program does not release a node of the tree but instead calls fResizeMemory to expand the node to accommodate a growing data structure. If fResizeMemory moves the node when it expands it, there are now two nodes: the real node at the new location and the garbage that remains at the original location.

Now, what happens if the programmer who wrote expandNode did not consider that fResizeMemory might move the node when expanding it? The tree will be left as it was: the neighbors of the node will still point to the original block, which has been released but still looks valid. And the new, expanded node will float in memory with no pointer referring to it. You get a tree that looks valid but is in fact wrong, plus a memory block that can no longer be reached. That is very bad.

We might think of modifying fResizeMemory so that it destroys the contents of the original block whenever expanding moves the block. A simple call to memset would do it:

flag fResizeMemory(void **ppv, size_t sizeNew)
{
    ......

    pbResize = (byte *)realloc(*ppb, sizeNew);
    if (pbResize != NULL)
    {
        #ifdef DEBUG
        {
            /* If the block moved, shred the original block. */
            if (pbResize != *ppb)
                memset(*ppb, bGarbage, sizeOld);

            /* If expanding, initialize the new tail. */
            if (sizeNew > sizeOld)
                memset(pbResize+sizeOld, bGarbage, sizeNew-sizeOld);
        }
        #endif

        *ppb = pbResize;
    }
    return (pbResize != NULL);
}

Unfortunately, you can't do this. Even though you know the size and location of the original block, you cannot destroy its contents, because you do not know what the memory manager does with the memory it has released. Some memory managers leave released memory alone. Others use it to store free-list links or other internal data. Once memory has been released, it no longer belongs to you, so you should not touch it. If you do, you risk corrupting the entire system.

To give an extreme example: once, while I was adding new features to Microsoft's internal 68000 cross assembler, the Macintosh Word and Excel groups asked me to help track down a long-standing error that made the assembler fail unexpectedly. The difficulty was that the error did not occur often, yet it occurred regularly enough to get people's attention. I won't go into the details, but it took me three days just to find the conditions that reproduced the error.

Finding those conditions took a long time, and even then I did not know why the error occurred. Whenever I looked at the data structures involved, they seemed fine. It never occurred to me that these seemingly valid data structures were actually garbage left behind by realloc!

The real problem, however, was not how long it took to find the exact cause of the error but how long it took to reproduce it. The error required not only that realloc move the block while expanding it but also that the original block be reallocated and filled with new data. In the assembler, both events were rare.

This gives us another guideline for writing error-free code: "Don't let things happen rarely." Determine what can happen in your subsystem and make sure that it does happen, and happens often. If the subsystem has rare behaviors, find a way to make them occur regularly.

Have you ever traced an error into an error handler and thought, "There are so many mistakes in this error handler that I bet it has never been executed"? Every programmer has had this experience. Error handlers tend to be error-prone precisely because they are rarely executed.

Similarly, if it had not been so rare for realloc to move a block while expanding it, the error in the assembler could have been found within hours instead of years. But how can you make realloc move blocks more often? You can't, at least not without support from the operating system. Even so, you can simulate what realloc does. When a programmer calls fResizeMemory to expand a block, you can allocate a new block, copy the contents of the original block into it, and then release the original block, which accurately simulates everything realloc does.

flag fResizeMemory(void **ppv, size_t sizeNew)
{
    byte **ppb = (byte **)ppv;
    byte *pbResize;
    #ifdef DEBUG
    size_t sizeOld;
    #endif

    ASSERT(ppv != NULL && sizeNew != 0);

    #ifdef DEBUG
    {
        sizeOld = sizeofBlock(*ppb);

        /* If shrinking, first shred the tail that is about to be
         * released. If expanding, force the block to move (instead
         * of letting it expand in place) by simulating what realloc
         * does. If the new size equals the old size, do nothing.
         */
        if (sizeNew < sizeOld)
            memset((*ppb)+sizeNew, bGarbage, sizeOld-sizeNew);
        else if (sizeNew > sizeOld)
        {
            byte *pbNew;

            if (fNewMemory(&pbNew, sizeNew))
            {
                memcpy(pbNew, *ppb, sizeOld);
                FreeMemory(*ppb);
                *ppb = pbNew;
            }
        }
    }
    #endif

    pbResize = (byte *)realloc(*ppb, sizeNew);
    ......
}

In the code above, the new code runs only when a block is being expanded. Because the new block is allocated before the original block is released, the block is guaranteed to move whenever the allocation succeeds. If the allocation fails, the new code amounts to one large no-op.

Note, however, that the new code does more than move the block: it also destroys the contents of the original block. When FreeMemory releases the original block, it fills the block with garbage.

Now you may be thinking: since the code above simulates realloc, why call realloc at all? Wouldn't adding a return statement to the new code, for example:

    if (fNewMemory(&pbNew, sizeNew))
    {
        memcpy(pbNew, *ppb, sizeOld);
        FreeMemory(*ppb);
        *ppb = pbNew;
        return (TRUE);
    }

make the code run faster?

Why don't we do this? We could, but we shouldn't, because it is a bad habit. Remember that debug code is extra code, not different code. Unless there is a compelling reason not to, always execute the original non-debug code, even when the added debug code has made it redundant. After all, the best way to find errors in code is to execute it, so execute the original non-debug code as often as possible.
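The forced move and the "extra code, not different code" rule can be seen together in a small self-contained sketch. It is not the chapter's fResizeMemory: the function name fResizeMoving is hypothetical, and because the sketch has no sizeofBlock, it assumes the caller passes the old size explicitly. It also calls malloc and free directly instead of fNewMemory and FreeMemory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEBUG 1          /* assumption: the sketch is built as a debug version */
#define bGarbage 0xA3    /* fill value used throughout the chapter */

typedef unsigned char byte;

/* Hypothetical variant of fResizeMemory. The caller supplies sizeOld
 * because this sketch keeps no allocation log.
 * Returns 1 on success and 0 on failure. */
int fResizeMoving(void **ppv, size_t sizeOld, size_t sizeNew)
{
    byte **ppb = (byte **)ppv;
    byte *pbResize;

    assert(ppb != NULL && *ppb != NULL && sizeNew != 0);

#ifdef DEBUG
    if (sizeNew < sizeOld) {
        /* Shrinking: shred the tail that is about to be released. */
        memset(*ppb + sizeNew, bGarbage, sizeOld - sizeNew);
    } else if (sizeNew > sizeOld) {
        /* Expanding: allocate before freeing, so the block must move. */
        byte *pbNew = (byte *)malloc(sizeNew);
        if (pbNew != NULL) {
            memcpy(pbNew, *ppb, sizeOld);
            memset(pbNew + sizeOld, bGarbage, sizeNew - sizeOld);
            memset(*ppb, bGarbage, sizeOld);   /* shred the old contents */
            free(*ppb);
            *ppb = pbNew;
        }
        /* If the allocation failed, the debug code did nothing. */
    }
#endif

    /* Extra code, not different code: the real realloc still runs. */
    pbResize = (byte *)realloc(*ppb, sizeNew);
    if (pbResize != NULL)
        *ppb = pbResize;
    return pbResize != NULL;
}
```

A caller that keeps a stale copy of the old pointer will now read garbage after every expansion in the debug version, instead of only on the rare occasions when realloc happens to move the block.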

Sometimes when I explain these ideas to programmers, they object: "Always moving the block is just as harmful as never moving it. You have simply gone to the other extreme." That is a perceptive objection, so it deserves some explanation.

If something always happens in both the debug version and the delivery version of a program, that is indeed as harmful as its never happening. But fResizeMemory does not have this problem, even though its debug version moves blocks relentlessly, as if it had taken stimulants.

It is not a problem for something to happen rarely, as long as it is not rare in both the delivery version and the debug version of the program.

If something happens rarely, make it happen often.

Keep a log to jog your memory

From a debugging standpoint, the trouble with the memory manager is that it knows the size of a block when the block is created but loses that information almost immediately unless a record is kept somewhere. We have seen how valuable sizeofBlock is, but it would be even more valuable to know how many blocks have been allocated and where each one is stored in memory. With that information we could determine whether any pointer is valid, whatever its value. Think how useful that would be, especially for validating function parameters. Suppose we have a function fValidPointer that takes two parameters, pv and size, and returns TRUE when pv actually points into an allocated memory block. With this function we can write stricter special-purpose versions of commonly used routines. For example, if we often fill part of an allocated block, we can bypass memset, which checks its pointer only loosely, and call a FillMemory routine of our own that validates its pointer parameter more strictly:

    void FillMemory(void *pv, byte b, size_t size)
    {
        ASSERT(fValidPointer(pv, size));
        memset(pv, b, size);
    }

By calling fValidPointer, the function ensures that pv points into a valid memory block and that at least size bytes lie between pv and the end of the block.

If you prefer, you can call FillMemory in the debug version of the program and call memset directly in the delivery version. To do so, include the following global macro definition in the delivery version:

    #define FillMemory(pb, b, size) memset((pb), (b), (size))

But I digress.

The point to emphasize is that keeping extra information in the debug version of a program often makes much stronger error checking possible.

So far we have introduced ways of filling memory blocks, as in FillMemory and fResizeMemory. Filling memory, however, is a relatively "weak" way of finding errors compared with what can be done by keeping a record of every allocated block.

As before, we assume the worst case: the subsystem itself gives us no information about allocated blocks. That is, the memory manager cannot tell us the size of a block, whether a pointer is valid, whether a given block exists, or how many blocks have been allocated. If the program needs this information, it must supply the information itself; in other words, the program must keep some kind of allocation log. How the log is kept is unimportant. What matters is that the information is available when it is needed.

One possible way to maintain the log is this: when fNewMemory allocates a block, also allocate a block for the log information; when FreeMemory releases a block, release the corresponding log information as well; when fResizeMemory changes the size of a block, update the log information to reflect the block's new size and new location. Naturally, these three actions can be wrapped in three debug interfaces:

    /* Create a log record for a newly allocated memory block */
    flag fCreateBlockInfo(byte *pbNew, size_t sizeNew);

    /* Release the log record that corresponds to a memory block */
    void FreeBlockInfo(byte *pb);

    /* Update the log record of an existing memory block */
    void UpdateBlockInfo(byte *pbOld, byte *pbNew, size_t sizeNew);

Of course, how these three routines maintain the log information is unimportant, as long as they do not slow the system down to the point where it becomes unusable. The reader can find implementations of these functions in Appendix B.
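As one possible shape for such a log, here is a minimal sketch built on a singly linked list. It is not the Appendix B implementation: the blockinfo structure, the pbiHead list head, and the pbiFind helper are assumptions made for illustration, and fValidPointer here simply returns a flag instead of asserting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef unsigned char byte;
typedef int flag;

/* Hypothetical log record: one per allocated block. */
typedef struct BLOCKINFO {
    struct BLOCKINFO *pbiNext;
    byte  *pb;     /* start of the block  */
    size_t size;   /* length of the block */
} blockinfo;

static blockinfo *pbiHead = NULL;

/* Find the record whose block contains pb, or NULL if none does.
 * Addresses are compared as integers to avoid relational comparisons
 * between pointers into different objects. */
static blockinfo *pbiFind(byte *pb)
{
    blockinfo *pbi;
    for (pbi = pbiHead; pbi != NULL; pbi = pbi->pbiNext) {
        uintptr_t u = (uintptr_t)pb, lo = (uintptr_t)pbi->pb;
        if (u >= lo && u - lo < pbi->size)
            return pbi;
    }
    return NULL;
}

flag fCreateBlockInfo(byte *pbNew, size_t sizeNew)
{
    blockinfo *pbi;
    assert(pbNew != NULL && sizeNew != 0);
    pbi = (blockinfo *)malloc(sizeof(blockinfo));
    if (pbi == NULL)
        return 0;
    pbi->pb = pbNew;
    pbi->size = sizeNew;
    pbi->pbiNext = pbiHead;
    pbiHead = pbi;
    return 1;
}

void FreeBlockInfo(byte *pb)
{
    blockinfo **ppbi;
    for (ppbi = &pbiHead; *ppbi != NULL; ppbi = &(*ppbi)->pbiNext) {
        if ((*ppbi)->pb == pb) {
            blockinfo *pbiDead = *ppbi;
            *ppbi = pbiDead->pbiNext;   /* unlink, then release the record */
            free(pbiDead);
            return;
        }
    }
    assert(0 && "FreeBlockInfo: no log record for this block");
}

void UpdateBlockInfo(byte *pbOld, byte *pbNew, size_t sizeNew)
{
    blockinfo *pbi = pbiFind(pbOld);
    assert(pbi != NULL && pbi->pb == pbOld && pbNew != NULL && sizeNew != 0);
    pbi->pb = pbNew;
    pbi->size = sizeNew;
}

size_t sizeofBlock(byte *pb)
{
    blockinfo *pbi = pbiFind(pb);
    assert(pbi != NULL && pbi->pb == pb);   /* pb must be the block start */
    return pbi->size;
}

/* Returns 1 if pv lies in a logged block with at least size bytes
 * between pv and the end of that block. */
flag fValidPointer(void *pv, size_t size)
{
    byte *pb = (byte *)pv;
    blockinfo *pbi = pbiFind(pb);
    return pbi != NULL && size != 0 &&
           size <= pbi->size - (size_t)(pb - pbi->pb);
}
```

A linear search is slow for large numbers of blocks, but the log exists only in the debug version, where that cost is usually acceptable.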

Modifying FreeMemory and fResizeMemory to call the appropriate routines is simple. The modified FreeMemory looks like this:

    void FreeMemory(void *pv)
    {
    #ifdef DEBUG
        {
            memset(pv, bGarbage, sizeofBlock(pv));
            FreeBlockInfo(pv);
        }
    #endif

        free(pv);
    }

In fResizeMemory, UpdateBlockInfo is called only if realloc succeeds in changing the size of the block (if realloc fails, there is nothing to update). The latter part of fResizeMemory looks like this:

    flag fResizeMemory(void **ppv, size_t sizeNew)
    {
        ......
        pbResize = (byte *)realloc(*ppb, sizeNew);

        if (pbResize != NULL)
        {
    #ifdef DEBUG
            {
                UpdateBlockInfo(*ppb, pbResize, sizeNew);

                /* If expanding, initialize the new tail of the block */
                if (sizeNew > sizeOld)
                    memset(pbResize + sizeOld, bGarbage, sizeNew - sizeOld);
            }
    #endif

            *ppb = pbResize;
        }

        return (pbResize != NULL);
    }

fNewMemory is somewhat more complicated, which is why it has been left for last. When fNewMemory allocates a block, it must actually allocate two: one to satisfy the caller's request and one to hold the corresponding log information. The call succeeds only if both allocations succeed. Otherwise some blocks would exist without log information. Every block must have log information, because without it the functions that validate pointer parameters would report assertion failures.

In the code below, if fNewMemory succeeds in allocating the space the caller requested but fails to allocate the memory needed for the log record, it releases the first block and reports an allocation failure. This keeps the allocated blocks in step with the log.

The code for fNewMemory is as follows:

    flag fNewMemory(void **ppv, size_t size)
    {
        byte **ppb = (byte **)ppv;

        ASSERT(ppv != NULL && size != 0);

        *ppb = (byte *)malloc(size);

    #ifdef DEBUG
        {
            if (*ppb != NULL)
            {
                memset(*ppb, bGarbage, size);

                /* If the block's log information cannot be created,
                 * simulate a total memory allocation failure.
                 */
                if (!fCreateBlockInfo(*ppb, size))
                {
                    free(*ppb);
                    *ppb = NULL;
                }
            }
        }
    #endif

        return (*ppb != NULL);
    }

That's it.

Now we have a complete record of the memory system. With this information we can easily write functions such as sizeofBlock and fValidPointer (see Appendix B), as well as any other useful function.

Keep debug information to allow stronger error checking

Don't wait for an error

Until now, everything we have done only helps programmers notice errors once they occur. That is good, but it does not find errors automatically. Take the DELETENODE function discussed earlier: if it calls FreeMemory to release a node but leaves pointers to the freed memory in the tree, and those pointers are never used, would we discover the problem? No. As another example, what would happen if we forgot to call FreeMemory in fResizeMemory?

    ......
    if (fNewMemory(&pbNew, sizeNew))
    {
        memcpy(pbNew, *ppb, sizeOld);
        /* FreeMemory(*ppb); */
        *ppb = pbNew;
    }

The result is a subtle bug. It is subtle because on the surface nothing looks wrong. Yet every time this code executes we "lose" a block of memory, because the only pointer to the block is overwritten when pbNew is assigned to *ppb. Can the debug code in this function help us find the error? Not at all.

These errors differ from the earlier ones because they do not cause anything improper to happen. Just as a roadblock is useless if the gangsters never plan to leave town, debug code is useless when the bad data is never used, because it has nothing to check. That an error goes undetected does not mean it does not exist. These errors do exist; we simply have not seen them, because they are hidden very deep.

To find these errors, you must do the programmer's equivalent of a door-to-door search. Don't wait for errors to reveal themselves; put debug code into the program that actively looks for them.

The examples above present two cases. In the first, we have a "dangling pointer" that points to a block that has already been released. In the second, we have allocated a block but no pointer refers to it. Such errors are usually hard to find, but if we have kept the corresponding debug information in the program, they can be found easily.

Consider how people check for mistakes in their bank statements: we keep our own record of the account, and the bank keeps its record. By comparing the two lists we can find errors. The same method can find dangling pointers and lost memory blocks. We can compare the table of pointers the program knows about against the table of known allocations (kept in the program's debug information). If a pointer refers to a block that has not been allocated, or an allocated block is not referenced by any pointer, something is definitely wrong.

But programmers, especially experienced ones, tend to shy away from checking every pointer stored in every data structure, because tracking all the data structures in a program and all the pointers within them seems very difficult, if not impossible. In practice, even poorly written programs usually keep their pointers and the memory allocated for them organized well enough to make such checking feasible.

For example, the 68000 assembler may allocate memory for 753 symbol names, but it does not use 753 global variables to keep track of them; that would be rather silly. Instead it uses arrays, hash tables, trees, or simple linked lists. So although there may be 753 symbol names, a loop can walk these data structures easily, and doing so takes very little code. To compare the pointer table with the debug information, I define three functions. They work together with the information-gathering routines from the previous section (the reader can find their implementations in Appendix B):

    /* Mark all memory blocks as "not referenced" */
    void ClearMemoryRefs(void);

    /* Mark the memory block that pv points to as "referenced" */
    void NoteMemoryRef(void *pv);

    /* Scan the reference flags to find lost memory blocks */
    void CheckMemoryRefs(void);

Using these three routines is simple. First, call ClearMemoryRefs to reset the debug information to its initial state. Next, scan the program's global data structures, calling NoteMemoryRef on each pointer to validate it and to mark the block it points to as "referenced". Once every pointer in the program has been noted, every pointer should have been validated and every allocated block should carry a reference mark. Finally, call CheckMemoryRefs to verify that no block lacks a reference mark. If one does, an assertion fires, warning that the block in question is a lost memory block.
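The clear, note, and check sequence can be sketched in a few lines. This is not the Appendix B code: the sketch assumes a fixed-size table instead of the allocation log, recognizes only pointers to the start of a block, and adds three hypothetical helpers (fLogBlock, fNoteRef, and CountLostBlocks) so that the result can be inspected without triggering an assertion.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 64   /* hypothetical fixed capacity for the sketch */

typedef struct {
    void *pv;           /* start of an allocated block */
    int   fReferenced;  /* set when some pointer to the block is noted */
} blockref;

static blockref rgbr[MAX_BLOCKS];
static size_t cbr = 0;

/* Hypothetical helper: enter a block in the table (stands in for the log). */
int fLogBlock(void *pv)
{
    if (cbr == MAX_BLOCKS)
        return 0;
    rgbr[cbr].pv = pv;
    rgbr[cbr].fReferenced = 0;
    cbr++;
    return 1;
}

/* Mark all memory blocks as "not referenced". */
void ClearMemoryRefs(void)
{
    size_t i;
    for (i = 0; i < cbr; i++)
        rgbr[i].fReferenced = 0;
}

/* Hypothetical helper: mark the block at pv; return 0 if pv is unknown. */
int fNoteRef(void *pv)
{
    size_t i;
    for (i = 0; i < cbr; i++) {
        if (rgbr[i].pv == pv) {
            rgbr[i].fReferenced = 1;
            return 1;
        }
    }
    return 0;
}

/* Validate pv and mark its block as "referenced". */
void NoteMemoryRef(void *pv)
{
    int fFound = fNoteRef(pv);
    assert(fFound && "dangling or illegal pointer");
    (void)fFound;
}

/* Hypothetical helper: count blocks that no noted pointer refers to. */
size_t CountLostBlocks(void)
{
    size_t i, cLost = 0;
    for (i = 0; i < cbr; i++)
        if (!rgbr[i].fReferenced)
            cLost++;
    return cLost;
}

/* Scan the reference flags and assert that no block has been lost. */
void CheckMemoryRefs(void)
{
    assert(CountLostBlocks() == 0 && "lost memory");
}
```

A pointer that is not in the table corresponds to the dangling-pointer case, and a block left unmarked after the scan corresponds to the lost-block case.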

Let's look at how to use these routines to validate the pointers used by the assembler described earlier in this chapter. For simplicity, assume that the assembler's symbol table is a binary tree whose nodes look like this:

    /* "symbol" is the node definition for a symbol name.
     * One node is allocated for each symbol defined
     * in the user's assembly source.
     */
    typedef struct SYMBOL
    {
        struct SYMBOL *psymRight;
        struct SYMBOL *psymLeft;
        char *strName;      /* text representation */
        ......
    } symbol;               /* naming: sym, *psym */

Only the three fields that contain pointers are shown. The first two are the node's right and left subtree pointers, and the third points to a zero-terminated string. After calling ClearMemoryRefs, we traverse the tree and note every pointer in it. The traversal is done by a debug-only function, NoteSymbolRefs:

    void NoteSymbolRefs(symbol *psym)
    {
        if (psym != NULL)
        {
            /* Validate the current node before going any further */
            NoteMemoryRef(psym);
            NoteMemoryRef(psym->strName);

            /* Now validate the subtrees of the current node */
            NoteSymbolRefs(psym->psymRight);
            NoteSymbolRefs(psym->psymLeft);
        }
    }

This function traverses the symbol table in preorder, noting every pointer in the tree. Symbol tables are usually stored as ordered trees and would normally be traversed in order. I use a preorder traversal because I want to validate psym before dereferencing it, and only preorder allows that. An inorder or postorder traversal would dereference psym before validating it, so with a bad pointer the program could crash after several levels of recursion. Of course, a crash would also reveal the error. But which would you rather track down: a random crash or an assertion failure?

Once "NoteRefs" routines have been written for the other data structures, they should be combined into a single routine so that other parts of the program can call them conveniently. For this assembler, that routine might look like this:

    #ifdef DEBUG

    void CheckMemoryIntegrity(void)
    {
        /* Mark all memory blocks as "not referenced" */
        ClearMemoryRefs();

        /* Note all known allocations */
        NoteSymbolRefs(psymRoot);
        NoteMacroRefs();
        ......
        NoteCacheRefs();
        NoteVariableRefs();

        /* Make sure nothing is wrong with any pointer or block */
        CheckMemoryRefs();
    }

    #endif

The last question is: "When should this routine be called?" Obviously it should be called as often as possible, but in practice that depends on your needs. At the very least, call it to run a consistency check before the corresponding subsystem is used. Better still, run the check while the program is waiting for the user to press a key, move the mouse, or flip a hardware switch. In short, take every opportunity to catch errors.

Build thorough subsystem checks and run them often

The uncertainty principle

I often explain the use of debugging checks to programmers. Sometimes, as I explain, he or she worries that the added debug code will interfere with the original code, and about how serious the consequences might be. This is another problem related to the "uncertainty principle" proposed by Heisenberg. If the reader is interested in this issue, please read on.

There is no doubt that added debug code makes the debug version of a program differ from the delivery version. But as long as the debug code is added carefully and does not change the essential behavior of the original program, the difference should cause no problems. For example, although fResizeMemory may move memory blocks very frequently, that does not change the basic behavior of the function. Similarly, although fNewMemory allocates more memory than the caller requested, that should have no effect on the calling program. (If you request 21 bytes and expect fNewMemory or malloc to allocate exactly 21 bytes, you will be in trouble with or without debug code, because memory managers often allocate more memory than requested in order to meet alignment requirements.)

Another concern is that debug code increases the size of the application, so that it needs more RAM. But remember that the purpose of a debug version is to catch errors, not to make the most of memory. It does not matter if the debug version cannot load the largest spreadsheet, edit the largest possible document, or do whatever else takes a lot of memory, as long as the delivery version can. The worst that can happen is that the debug version runs out of memory sooner than the delivery version, so that the program fails and executes its error-handling code. The best that can happen is that the debug version catches errors quickly, with little or no debugging time. Both extremes are valuable.

Take a little

Robert Cialdini points out in his book "Influence: How and Why People Agree to Things" that if you are a salesman and a customer comes into your menswear department, you should always show the suits first and the sweaters afterward. This increases sales, because after the customer has bought a $500 suit, an $80 sweater no longer seems so expensive. If you show the sweater first, the $80 price may seem unacceptable, and you may end up selling only a $30 sweater. Anyone who spends 30 seconds thinking about it will see why. Yet how many people take the time to think about it?

Similarly, some programmers may think that the choice of bGarbage does not matter, and that any number used in the past will do. Other programmers may think it does not matter whether the symbol table is traversed in preorder, inorder, or postorder. But as we pointed out earlier, some choices really are better than others.

If you are free to choose an implementation detail, stop for 30 seconds and examine all the possible choices before making one. For each choice, ask yourself: "Will this choice cause errors, or will it help find them?" Ask that question about the value of bGarbage and you will find that 0 would cause errors, while a value such as 0xA3 helps find them.
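The difference between the two fill values can be made concrete with a small sketch. The record type, the ShredRecord and fLooksInitialized functions, and the limit of 1000 items are all assumptions chosen for illustration; the point is only that memory filled with 0 passes a plausibility check, while memory filled with 0xA3 fails it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record whose fields a buggy caller forgets to initialize. */
typedef struct {
    uint32_t cItems;      /* number of items; assumed never to exceed 1000 */
    char     szName[8];   /* zero-terminated name */
} record;

/* Fill a record the way a debug allocator fills a new block. */
void ShredRecord(record *prec, unsigned char bFill)
{
    memset(prec, bFill, sizeof *prec);
}

/* A plausibility check of the kind an assertion might perform. */
int fLooksInitialized(const record *prec)
{
    return prec->cItems <= 1000 &&
           memchr(prec->szName, '\0', sizeof prec->szName) != NULL;
}
```

With a fill value of 0, the uninitialized record looks like a valid record with no items and an empty name, so the bug stays hidden. With 0xA3, cItems becomes 0xA3A3A3A3 and the name has no terminator, so even a simple check catches the uninitialized data.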

Design your test code carefully; no choice should be arbitrary

No need to know

When testing a subsystem, you may have run into test routines that you can use only if you know about them. fValidPointer is one example: if you don't know the function exists, you will never use it. The best test code, however, is transparent: it works whether or not the programmer is aware of it.

Suppose an inexperienced programmer, or someone unfamiliar with the project, joins the team. Knowing nothing at all about the test code in fNewMemory, fResizeMemory, and FreeMemory, can he still use these functions freely in his code?

Suppose he does not realize that fResizeMemory can move memory blocks, and so introduces an error like the one in the assembler described earlier. What happens? Does he need to know anything about the consistency check for it to run and report an "illegal pointer" assertion?

What if he allocates a block and then loses it? The consistency check will still run and report a "lost memory" assertion. He may not even know what "lost memory" means. But the point is that he does not need to know for the check to work. Better still, by tracking down the error he can learn about lost memory without having to ask an experienced programmer. This is the advantage of carefully designed subsystem test code: when the test code corners an error, it seizes the error, sends it to the "broadcast room", and interrupts normal work. For programmers, this is excellent feedback.

Strive to make consistency checks transparent

You don't ship the debug version

In this chapter I have added a great deal of debug code to the memory manager. Some programmers may think: "Adding debug code to a program seems very useful, but adding all these checks, plus the handling of log information, is going too far." I must admit that I once felt the same way.

I used to object to loading a program with so much efficiency-sapping debug code, but I soon recognized my mistake. Adding such debug code to the delivery version of a program would ruin it in the marketplace, but we do not add any of this test code to the delivery version; it is used only in the debug version. Debug code does slow down the debug version. But which is worse: having your retail product fail in the user's hands, or having the debug version that helps you find errors run a little slowly? We should not worry about the efficiency of the debug version, because customers never use it.

It is important to distinguish between the debug version and the delivery version of a program. The debug version is for finding errors; the delivery version is for pleasing customers. The trade-offs you make when coding the two versions are therefore quite different.

Remember, as long as the delivery version meets the customer's size and speed requirements, you can do whatever you like with the debug version. If adding a log to the memory manager helps you find all sorts of hard-to-catch errors, everyone benefits: customers get a robust program, and you find errors without spending a great deal of time and energy.

Microsoft's programmers routinely add debug code to their programs. Excel, for example, contains tests for the memory subsystem (more thorough than the ones introduced here). It has a cell-table consistency checker; it has a mechanism that generates memory failures so that programmers can force the program to execute its "out of memory" error handlers; and it has many other checkers. This is not to say that delivery versions of Excel have never had errors. They have, but those errors rarely appear in code that is covered by thorough subsystem checks.
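A mechanism for generating memory failures can be sketched as a countdown inside the allocator. This is not Excel's mechanism, which the text does not describe in detail: the SetAllocFailAfter function, the cAllocsUntilFail counter, and this simplified fNewMemory (without the garbage fill and the log) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define DEBUG 1   /* assumption: the sketch is built as a debug version */

typedef int flag;

#ifdef DEBUG
/* Hypothetical debug-only state: the number of allocations that may still
 * succeed. A negative value turns failure injection off. */
static long cAllocsUntilFail = -1;

/* Let the next cSuccesses allocations succeed, then fail every one. */
void SetAllocFailAfter(long cSuccesses)
{
    cAllocsUntilFail = cSuccesses;
}
#endif

/* Simplified fNewMemory: returns 1 on success and 0 on failure. */
flag fNewMemory(void **ppv, size_t size)
{
    assert(ppv != NULL && size != 0);

#ifdef DEBUG
    if (cAllocsUntilFail == 0) {
        /* Simulate running out of memory without calling malloc. */
        *ppv = NULL;
        return 0;
    }
    if (cAllocsUntilFail > 0)
        cAllocsUntilFail--;
#endif

    *ppv = malloc(size);
    return *ppv != NULL;
}
```

Calling SetAllocFailAfter with increasing counts before repeating the same operation walks the simulated failure through each allocation in turn, which forces the rarely executed error handlers to run.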

Similarly, although we added a lot of code to the memory manager, all of it went into building fNewMemory, FreeMemory, and fResizeMemory. We added nothing to the callers of these functions, and nothing to the internal support code of malloc, free, and realloc (which can be quite important). Even the slowdown caused by the debug code is not as bad as one might imagine. If Microsoft's statistics are representative, the debug version of a program (full of assertions and subsystem tests) should run at about half the speed of the delivery version.

Don't apply the constraints of the delivery version to the debug version

Trade size and speed for error-checking capability

true

To find more errors, Microsoft used to give beta testers the debug versions of its applications. That stopped, at least for a time, when a "pre-release" review based on the beta debug version of a product appeared in a weekly publication and said that the program was very good but as slow as a slug. The lesson is not to send debug versions to test sites, or at least to strip out the internal debugging checks that affect performance before doing so.

Summary

In this chapter we introduced six ways to strengthen the memory subsystem. Although the methods were presented for the memory subsystem, the ideas apply equally to other subsystems. As you can imagine, once a program validates itself thoroughly, it is hard for errors to sneak in unnoticed. Had these debugging checks been in the assembler I described earlier, the realloc error, which would normally take years to find, would have been found automatically within hours or days of the code being written. Regardless of the programmer's skill or experience, these tests would have caught the error.

In fact, these tests catch all errors of this kind, and they catch them automatically, without relying on luck or skill.

This is how error-free code gets written.

Key points:

• Examine the subsystems you write and ask yourself: "In what ways are programmers likely to misuse this subsystem?" Add assertions and validation checks to the subsystem to catch both hard-to-find errors and common ones.

• If errors cannot be reproduced, they are hard to remove. Look for factors that could cause random behavior in the program and remove them from the debug version. Setting "undefined" memory to a constant value is one example: if the program references that memory before it has been properly set, the faulty code produces the same wrong result every time it executes.

• If your subsystem releases memory (or other resources) and thereby creates "garbage", scramble it so that it really looks like garbage. Otherwise the released data may continue to be used without anyone noticing.

• Similarly, if something in your subsystem happens only rarely, add debug code to the subsystem to make it happen often. This increases the chance of catching errors in code that is not normally executed.

• Do your best to make your test code work even for programmers who are unaware of it. The best test code is code that works without anyone having to know it exists.

• If possible, build test code into the subsystem you write instead of placing it in a layer outside the subsystem. Don't wait until the subsystem has been coded to think about how to validate it. At every step of the design, ask: "How can I thoroughly validate this implementation?" If a design is hard or impossible to test, seriously consider a different design, even if that means trading size or speed for testability.

• Think twice before discarding a validation test because it is too slow or uses too much memory. Remember that this code does not exist in the delivery version of the program. If you find yourself thinking "This test is too slow, too big", stop and ask yourself: "How can I keep this test and make it faster or smaller?"

Exercises

1) If you come across data made up of 0xA3 bytes while testing code, the data may be uninitialized, or it may have been released. How could you modify the debug code so that you can easily tell which of the two kinds of data you have found?

2) Programmers sometimes write code that fills memory beyond the end of an allocated block. Describe a memory subsystem check that would raise an alarm for this kind of error.

3) Although CheckMemoryIntegrity is meant to catch dangling pointers, there are cases in which it fails to do so. For example, suppose a function calls FreeMemory but, because of a bug in that function, a pointer is left dangling; that is, the block the pointer refers to has been released by FreeMemory. Now suppose further that, before the pointer is validated, some function calls fNewMemory and reallocates the block that was just freed. The dangling pointer now points to the newly allocated block. But this block is no longer the original block, so this is an error. Yet to CheckMemoryIntegrity everything looks normal. If this error were common in your project, how could you enhance the checks to catch it?

4) With NoteMemoryRef we can validate every pointer in the program. But how can we validate the sizes of allocated blocks? For example, suppose a pointer points to a string of 18 characters, but the allocated block is shorter than 18 bytes. Or, conversely, suppose the program believes an allocated block has 15 bytes while the log information shows that 18 bytes were allocated. Both situations are errors. How could you enhance the consistency checks to catch such problems?

5) NoteMemoryRef lets us mark a block as "referenced", but it cannot tell us whether more pointers reference the block than should. For example, each node of a doubly linked list should have exactly two references: a forward pointer and a backward pointer. In most cases, however, a block should be referenced by only one pointer, and if several pointers reference it, there must be an error in the program. How could you improve the consistency checks so that some blocks may have multiple references while others may not, with an assertion firing when the rule is broken?

6) Throughout, this chapter has discussed debug code that can be added to the memory system to help programmers catch errors. But could we also add code that helps testers? Testers know that programs often handle error conditions incorrectly, so how could you give testers the ability to simulate "out of memory" conditions?

Question:

Look at the major subsystems in your project and see what kinds of debugging checks you could implement to catch the common errors associated with them.

Question:

If you don't have a debug version of your operating system, get one if you can. If you can't buy one, write one yourself using shell functions. And if you want to help others, make your code available (in some way) to other developers.

Chapter 4 Step Through Your Code

As I have said, the best way to find bugs is to run the program. While it runs, we catch bugs either with our own eyes or with automatic tools such as assertions and subsystem consistency checks. Useful as assertions and subsystem checks are, they cannot protect against problems the programmer did not think to check for in advance. It is like a home security system. If you wire only the doors and windows, the alarm stays silent when a burglar comes in through the skylight or the basement entrance. If you put disturbance sensors on the items you expect a burglar to take, and he makes off with your Barry Manilow collection instead, he will probably escape undetected. This is a common problem with security systems. The only way to be sure nothing gets stolen is to be at home when the burglar arrives. The same is true of keeping bugs out of a program: we must watch most closely at the moment when bugs are most likely to appear.

So when are bugs most likely to appear? When you write or modify code? Exactly. Programmers know this, yet they don't always appreciate how important it is, and they don't always realize that the best way to write bug-free code is to test it thoroughly at the time they write it.

This chapter does not discuss why you should test code as you write it; it discusses only how to test it effectively as you write it.

Increase Your Confidence in the Code

Recently I was writing a feature for Microsoft's internal Macintosh development system, and while testing my code I found a bug. I tracked it down to new code written by another programmer. What puzzled me was that other programmers' code depended heavily on this code, and I couldn't see how it could ever have worked. I went to his office to ask.

"I think I've found a bug in some code you finished recently," I said. "Could you take a look?" He loaded the code into his editor, and I pointed to where I thought the problem was. He was surprised when he saw it.

"You're right, this code is definitely wrong. But why didn't my tests catch it?"

I was wondering the same thing. "How exactly did you test it?" I asked.

He explained his testing method, and it sounded as though it should have caught the bug. We were both puzzled. "Let's set a breakpoint on the function and step through it to see what really happens," I suggested.

We set a breakpoint on the function. But when we pressed the Run button, the test ran to completion without ever hitting the breakpoint. It didn't take long to see why the test never executed the function: several levels up the call chain, an optimization in one function let it skip the work below it in certain cases.

Do you remember the problem with black-box testing that I described in Chapter 1? The tester feeds the program a large amount of input and judges whether it is correct by examining the output. If the output looks right, the program is assumed to be right. The problem is that the tester has no way to find bugs other than supplying input and receiving output. The programmer above missed his bug because he tested his code as a black box. He supplied some input, got correct output, and concluded that the code was correct. He did not use the other tools that programmers have for testing their code.

Unlike most testers, programmers can set breakpoints in the code, step through it line by line, and watch input turn into output. Strangely, though, few programmers are in the habit of stepping through their code when they test it. Many can't even be bothered to set a breakpoint to confirm that the code is executed at all.

Let's return to the point this chapter started with: the best way to catch bugs is to look for them when you write or modify code. So what is the best way for programmers to test their code? Step through it, line by line, and look hard at the results. I don't know many programmers who consistently write bug-free code, but the few I do know all make a habit of stepping through their code. It is like being at home when the burglar arrives: unless you are asleep, you can hardly fail to notice the trouble.

As a project lead, I have always urged programmers to step through their code when they test it, and they always look at me in dismay. It isn't that they disagree; it's that stepping through code sounds too time-consuming. They can barely keep up with the schedule as it is, so where would they find the time to step through their code? Fortunately, that gut feeling is wrong. Yes, stepping through code takes time, but only a fraction of the time it takes to write the code. After all, to implement a new function you must design its external interface, choose an algorithm, and type all of the source into the computer. By comparison, how much time does it take to run the program, set a breakpoint, and press the "Step" button to check each line? Not much, especially once it has become a habit.

It is like learning to drive a car with a manual transmission. At first it seems impossible, but after a few days of practice you shift gears without even thinking about it. In the same way, once stepping through code becomes a habit, you set breakpoints and step through without giving it a thought. It becomes a natural part of the process, and bugs get caught as a result.

Don't wait until you have a bug to step through your code.

Branches in the Code

Of course, some techniques make stepping through code more effective. But stepping through only part of the code will not do much good. For example, every programmer knows that error-handling code is often buggy, because it is rarely tested; unless you test it specifically, its bugs go undiscovered. To find bugs in error handlers, you can either create test cases that force the error conditions or simulate the errors while stepping through the code. The second approach usually takes less time. For example, consider the following code fragment:

    pbBlock = (byte *)malloc(32);
    if (pbBlock == NULL)
    {
        handle the error;
        ......
    }
    ......

Normally, when you step through this code, malloc allocates a 32-byte block and returns a non-NULL pointer, so the error-handling code is skipped. To test the error-handling code, step through the fragment again, and right after executing the line below, use the debugger to set pbBlock to NULL:

    pbBlock = (byte *)malloc(32);

Although malloc may have succeeded, setting pbBlock to NULL is equivalent to malloc failing, and it lets you step into the error-handling code. (Note: the block that malloc allocated is lost once you change pbBlock, but remember that this is only a test!)

Besides stepping through error handlers, you should step through every possible path in the program. The obvious places where code has multiple paths are if and switch statements, but there are others: the &&, ||, and ?: operators each have two paths. To verify that the code is correct, you must step through every instruction at least once. Having done that, you can have much more confidence in the code; at the very least you know that for some inputs it does not fail. And if you choose your test cases well, stepping through the code pays off even more.

Step through every code path.
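Forcing pbBlock to NULL in the debugger works well for a one-off check. As a complement, and not something this chapter prescribes, the failure can also be simulated in code so that the error path is easy to step through repeatedly. The sketch below assumes a hypothetical debug-only wrapper, DebugMalloc, a hypothetical counter, cAllocsUntilFailure, and a hypothetical caller, fMakeBlock; none of these names come from the book.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical debug aid: when the counter is N > 0, the Nth call from now
 * to DebugMalloc fails. A value of 0 disables failure injection. */
unsigned cAllocsUntilFailure = 0;

void *DebugMalloc(size_t size)
{
    assert(size != 0);              /* zero-length requests are suspicious */

    if (cAllocsUntilFailure != 0 && --cAllocsUntilFailure == 0)
        return NULL;                /* simulated out-of-memory */

    return malloc(size);
}

/* Example caller whose error path we want to step through.
 * Returns 1 on success, 0 if the allocation failed. */
int fMakeBlock(unsigned char **ppbBlock)
{
    unsigned char *pbBlock = (unsigned char *)DebugMalloc(32);

    if (pbBlock == NULL)
        return 0;                   /* error path: set a breakpoint here */

    *ppbBlock = pbBlock;
    return 1;
}
```

Setting cAllocsUntilFailure to 1 before calling fMakeBlock makes the next allocation fail, so a breakpoint on the error path is hit without editing variables by hand. Unlike the debugger trick, no real block is allocated and lost.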

What About Sweeping Changes?

Programmers have asked me, "What if my change touches code in many places? Won't stepping through all of the new code take too long?" If that is what you are thinking, let me ask you another question: "If you make changes that large, is it possible to make them without introducing any bugs?"

The habit of stepping through code sets up an interesting negative feedback loop. For example, programmers who step through their code quickly learn to write smaller functions, because stepping through large ones is painful. (Testing one 10-page function is harder than testing ten 1-page functions.) They also spend more time thinking about how to keep necessary large changes localized so that they are easier to test. Isn't that exactly what we want? No project lead likes programmers to make sweeping changes, because they destabilize the whole project. And no project lead likes large, unwieldy functions, because they are often unmaintainable.

If you find that you must make a sweeping change, examine the change and use your judgment. And remember that in most cases, stepping through code takes far less time than implementing it.

Data Flow: the Lifeblood of the Program

Before I wrote the fast memset function presented in Chapter 2, I wrote the following version (shown without assertions):

    void *memset(void *pv, byte b, size_t size)
    {
        byte *pb = (byte *)pv;

        if (size >= sizeThreshold)
        {
            unsigned long l;

            /* Pack four copies of the byte into a long */
            l = (b << 24) | (b << 16) | (b << 8) | b;

            pb = (byte *)longfill((long *)pb, l, size / 4);
            size = size % 4;
        }

        while (size-- > 0)
            *pb++ = b;

        return (pv);
    }

This code looks correct, but it has a subtle bug. After writing it, I used it in an existing application, and it worked without any problem. Still, to confirm that the function really worked, I set a breakpoint on it and reran the application. Once in the debugger, I checked the function's arguments: the pointer looked fine, so did the size, and the byte value was zero. It struck me that zero was a poor value to test with, because it would hide many kinds of bugs, so I immediately changed the byte argument to a stranger value, 0x4E. I first tested the case in which size was smaller than sizeThreshold, and that path was fine. Then I tested the case in which size was greater than or equal to sizeThreshold. I didn't expect any problems, but when I executed the statement

    l = (b << 24) | (b << 16) | (b << 8) | b;

I found that l had been set to 0x00004E4E, not the 0x4E4E4E4E I expected. A quick look at the function's assembly language showed me the bug and explained why the application had worked in spite of it.

The compiler I had used to compile the function had 16-bit integers. With 16-bit integers, what is the result of b << 24? Zero. Likewise, b << 16 is zero. Although the code is logically correct, its implementation is wrong. The function worked in the application because the application used memset to fill memory with zeros, and 0 << 24 is still 0, so the result was correct.
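The failure can be reproduced on a machine with wider integers by truncating each shift to 16 bits, and one possible fix is to widen the byte before shifting. Both functions below are illustrative sketches: PackSim16 and PackWidened are hypothetical names, and the truncation merely imitates the result reported in the text rather than the behavior of any particular compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Imitates the arithmetic the text reports for 16-bit integers:
 * every intermediate result keeps only its low 16 bits. */
unsigned long PackSim16(unsigned char b)
{
    uint16_t hi3 = (uint16_t)((uint32_t)b << 24);   /* low 16 bits are 0 */
    uint16_t hi2 = (uint16_t)((uint32_t)b << 16);   /* low 16 bits are 0 */
    uint16_t hi1 = (uint16_t)((uint32_t)b << 8);

    return (unsigned long)(hi3 | hi2 | hi1 | b);
}

/* Widening b to unsigned long before shifting keeps all 32 bits,
 * whatever the width of int. */
unsigned long PackWidened(unsigned char b)
{
    unsigned long l = b;
    unsigned long lPacked = (l << 24) | (l << 16) | (l << 8) | l;

    assert((lPacked & 0xFFUL) == b);    /* low byte must survive packing */
    return lPacked;
}
```

With a byte value of zero the two functions agree, which is why a zero-filling application could not expose the bug; any nonzero byte makes them differ.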

I caught this bug almost immediately because I took a little extra time to step through the code. True, the bug was serious enough that it would have been found eventually. But remember, our goal is to find bugs as early as possible, and stepping through code helps us reach that goal.

The real power of stepping through code is that it lets you watch data flow through a function. If you keep an eye on the data flow as you step, think of how many kinds of bugs it will help you find:

- Overflow and underflow bugs
- Data conversion bugs
- Off-by-one bugs
- NULL pointer bugs
- Bugs that use garbage memory (0xA3 bugs)
- Assignment bugs in which = is used instead of ==
- Precedence bugs
- Logic bugs

Would you find all of these bugs if you weren't watching the data flow? The value of focusing on data flow is that it gives you a very different view of your code. You might not notice the assignment bug in the following code:

    if (ch = '\t')
        ExpandTab();

But when you step through it and watch the data flow, you will see the contents of ch get clobbered.

As you step through code, focus on data flow.

Why didn't the compiler warn about this bug?

I tested the code in this book with five compilers, and even with the warning level of each set to its maximum, not one warned about b << 24. The code is legal ANSI C, but I can't imagine a case in which it does what the programmer actually intended. So why not issue a warning?

When you run into a bug like this, tell your compiler vendor so that a future version of the compiler can warn about it. Don't underestimate your power as a paying customer.

What Are You Missing?

One problem with source-level debuggers is that stepping a line at a time can hide important details. For example, suppose you mistyped && as & in the code below:

    /* If the symbol exists and it has a text name,
     * release the name.
     */
    if (psym != NULL & psym->strname != NULL)
    {
        FreeMemory(psym->strname);
        psym->strname = NULL;
    }

This code is legal, but it is wrong. The purpose of the if statement is to avoid dereferencing a NULL psym to reach the strname member of the symbol, but the code above doesn't do that. Instead, it always references the strname field, whether or not psym is NULL.
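The difference between the two operators is easy to observe by counting how often the right operand runs. The sketch below is only an illustration; fRightSide, cRightSideEvals, and the two counting functions are hypothetical names invented for it.

```c
#include <assert.h>

int cRightSideEvals = 0;        /* how many times the right operand ran */

int fRightSide(void)
{
    cRightSideEvals++;
    return 1;
}

/* && short-circuits: the right operand is skipped when fLeft is 0. */
int CountEvalsLogicalAnd(int fLeft)
{
    int fResult;

    assert(fLeft == 0 || fLeft == 1);
    cRightSideEvals = 0;
    fResult = fLeft && fRightSide();
    (void)fResult;
    return cRightSideEvals;
}

/* & does not short-circuit: both operands are always evaluated. */
int CountEvalsBitwiseAnd(int fLeft)
{
    int fResult;

    assert(fLeft == 0 || fLeft == 1);
    cRightSideEvals = 0;
    fResult = fLeft & fRightSide();
    (void)fResult;
    return cRightSideEvals;
}
```

When fLeft is 0, both expressions yield 0, so the result alone gives no hint of the bug; only the evaluation count differs. In the psym example, that extra evaluation is the dereference of a NULL pointer.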

If you step through the code with a source-level debugger and press the "Step" button when you reach the if statement, the debugger executes the entire if statement as one step. Only if you look inside the statement will you notice that the right side of the expression is evaluated even when the left side is false. (Or, if you're lucky, the system will fault when the program dereferences the NULL pointer. But few desktop computers do that, at least not at present.)

Recall what I said earlier: the &&, ||, and ?: operators each have two paths, so to find bugs you must step through each path. The trouble with source-level debuggers is that a single step crosses both paths of &&, ||, and ?:. There are two practical ways around the problem.

The first is to stop whenever you reach a compound condition that uses the && or || operators, scan its parts for typing mistakes, and then use the debugger to display the result of each comparison in the condition. Doing so helps you find cases in which the expression as a whole yields the right result even though it contains a bug. For example, suppose you expect the first part of an || expression to be TRUE and the second part FALSE, but it is the other way around. The result of the whole expression is correct, yet the expression is wrong. Examining its parts reveals the problem.

The second, more thorough way is to step through compound conditions and ?: operators at the assembly language level. Yes, this takes more work, but for critical code it is important to step through the intermediate calculations and see what actually happens. As with stepping through C statements, once you get used to it, stepping through assembly language instructions is fast; it just takes practice.

Source-level debuggers can hide execution details.

Step through critical code at the instruction level.

Turn off optimization?

If your compiler optimizes heavily, stepping through code can be an interesting exercise. When the compiler generates optimized code, it may interleave the machine code for adjacent source statements. With such compilers it is not unusual for one "Step" command to skip three lines of code, or for you to step over a source line that moves data from one place to another only to find that the data has not been moved yet.

To make stepping through code easier, consider turning off unnecessary compiler optimizations when you build debug versions. Such optimizations do nothing for a debug build except scramble the generated machine code. I have heard programmers object to turning off optimizations because it creates unnecessary differences between the debug and ship versions of the program. If you are worried about code-generation bugs in the compiler, that view has some merit. But remember that the purpose of a debug version is to find bugs in the program, and if turning off optimizations helps with that, it is worth considering. The best approach is to try stepping through the optimized code first to see how hard it is, and then turn off only those optimizations you must in order to step through the code effectively.

Summary

I wish I knew a way to persuade programmers to step through their code, or at least to try it for a month. But I have found that programmers generally cannot get past the idea that "it takes too much time." One advantage of being a project lead is that you can simply insist on the practice until programmers realize that it doesn't take much time and that it is worth doing because their bug rates drop noticeably.

If you don't already step through your code, will you start? Only you know the answer. But my guess is that you picked up this book because you are under pressure to reduce the number of bugs in your code or in the code of the programmers you lead. It comes down to this question: would you rather spend a little time verifying your code by stepping through it, or let bugs slip into the master sources and hope the testers notice them so that you can fix them later? The choice is yours.

Important:

- Bugs don't grow in code on their own; they are the product of programmers writing new code or changing existing code. If you want to find the bugs in your code, there is no better way than to step through it when you write it.

- Although your intuition may say that stepping through code takes a lot of time, your intuition is wrong. It takes a little extra time at first, but once stepping through code becomes second nature, it goes quickly.

- Be sure to step through every code path at least once, especially the error-handling paths. Don't forget that the &&, ||, and ?: operators each have two code paths that need testing.

- In some cases you may need to step through code at the assembly language level. You won't need to do it often, but don't avoid it when it is necessary.

Question:

If you look back at the exercises for Chapter 1, you will find that they deal with common bugs that the compiler can catch for you automatically. Review those exercises, and this time ask yourself: would you have missed those bugs if you had stepped through the code with a debugger?

Question:

Look at the bugs that have been reported against your program, and determine how many of them you would have caught if you had stepped through the code when you wrote it.

Chapter 5 Candy Machine Interface

One of the benefits Microsoft gives its employees is free soft drinks: flavored seltzer water, chocolate milk, juice boxes, and so on. Annoyingly, though, if you want candy you have to pay for it yourself. So now and then I would stroll down to the vending machine. Once I dropped in a few quarters and pressed 4 and 5 on the selection keypad. I was stunned when the machine spat out jasmine-flavored bubble gum instead of the Grandma's peanut butter cookie I wanted. Naturally, the machine was right and I was wrong: number 45 was the bubble gum. A glance at the little label for the peanut butter cookies confirmed my mistake. It read: peanut butter cookies, No. 21, 45 cents. The incident has always bothered me, because if the designers of the vending machine had spent 30 more seconds thinking about their design, they could have spared me and countless other people from buying something we didn't want. Suppose they had thought, "Hmm, people will be thinking about 45 cents as they reach for the keypad, and I'll bet they often punch in the price instead of the selection number. We should use letter keys instead of number keys to avoid that."

Designing the machine that way would not have increased its cost or changed its basic design, but whenever someone tried to key in 45 cents, the machine would refuse the input and remind them to enter the letter code instead. Such a design guides people into doing the right thing.

The same issue arises when we design function interfaces. Unfortunately, programmers seldom consider how other programmers will use the functions they design. As with the candy machine, a subtle difference in design can make it very easy to cause bugs or very easy to avoid them. It isn't enough to design functions that are bug-free; they must also be safe to use.

Naturally, getchar() Gets an int

The standard C library functions, and thousands of other functions patterned after them, have candy-machine interfaces that make them easy to misuse. Take getchar. We have good reason to call this function risky; the most serious problem is that its design invites programmers to write buggy code. On this point, here is what Brian Kernighan and Dennis Ritchie say in The C Programming Language:

Consider the following code:

    char c;

    c = getchar();
    if (c == EOF)
        ......

On a machine that does not do sign extension, c is always positive because it is a char, yet EOF is negative, so the test always fails. To avoid this, we must use int instead of char for any variable that holds a value returned by getchar.

Doesn't that show that even experienced programmers must use the function with care? Given the name getchar, it is natural to define c as a character, and that is why programmers run into this bug. But does getchar have to be so hazardous? What it does isn't complicated: it reads a character from a device and returns a possible error.
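The failure Kernighan and Ritchie describe can be imitated on any machine by storing the value in an unsigned char, which is what a char without sign extension amounts to. This is a sketch for illustration only; both function names are hypothetical, and the functions take the value getchar would have returned rather than reading input themselves.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Stores the value in an unsigned char first, as on a machine whose char
 * is not sign-extended. The stored value is never negative, so it can
 * never compare equal to EOF. */
int fSeesEofViaUnsignedChar(int chRead)
{
    unsigned char c;

    /* getchar returns either EOF or a value in unsigned char range */
    assert(chRead == EOF || (chRead >= 0 && chRead <= UCHAR_MAX));

    c = (unsigned char)chRead;
    return c == EOF;
}

/* Keeps the full int, as Kernighan and Ritchie recommend. */
int fSeesEofViaInt(int chRead)
{
    int c = chRead;

    return c == EOF;
}
```

Passing EOF to the first function reports "not EOF", so a read loop written that way would never terminate; the int version reports it correctly. Some compilers warn that the comparison in the first function is always false, which is exactly the point.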

The following code shows another common problem:

    /* strdup - create a duplicate of a string */

    char *strdup(char *str)
    {
        char *strNew;

        strNew = (char *)malloc(strlen(str) + 1);
        strcpy(strNew, str);

        return (strNew);
    }

This function works fine in general, unless memory runs out and malloc fails. In that case malloc returns a NULL pointer, which points to no memory at all, and heaven knows what strcpy will do when its destination pointer strNew is NULL. Whether strcpy crashes or quietly trashes memory, it isn't what the programmer intended. Programmers get into trouble with getchar and malloc because they can write code that is flawed but appears to work. Only weeks or even months later does some unlikely chain of events make the code fail, much like the chain of events that sank the Titanic. getchar and malloc do not lead programmers to write correct code; they make it easy to overlook the error case.
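For comparison, here is a minimal sketch of the same routine with the missing check added. StrDupChecked is a hypothetical name, chosen to avoid clashing with a library strdup. The sketch only keeps strcpy from being handed a NULL destination; it still reports failure through the pointer it returns, so callers can overlook the error just as easily as they can with malloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* StrDupChecked - duplicate a string; returns NULL if memory runs out. */
char *StrDupChecked(const char *str)
{
    char *strNew;

    assert(str != NULL);

    strNew = (char *)malloc(strlen(str) + 1);
    if (strNew != NULL)             /* only copy into a real block */
        strcpy(strNew, str);

    return strNew;
}
```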

The problem with getchar and malloc is that their return values are imprecise. Sometimes they return the valid data you asked for, and sometimes they return magic error values.

If getchar didn't return the strange EOF value, declaring c as a character would be correct, and programmers wouldn't run into the bug that Kernighan and Ritchie describe. Likewise, if malloc didn't return NULL disguised as a pointer to memory, programmers wouldn't forget to handle the error. The trouble isn't that these functions return errors; it is that they bury the errors in normal return values, where programmers can all too easily overlook them.

What if we redesigned getchar to return two separate outputs? It could return TRUE or FALSE depending on whether it successfully read a new character, and return the character itself through a variable passed by reference:

    flag fGetChar(char *pch);

With this interface, it is natural to write

    char ch;

    if (fGetChar(&ch))
        ch holds the next character;
    else
        hit EOF; ch contains garbage;

That settles the "char or int" question. And it is unlikely that any programmer, however green, will forget to test the error return. Compare the return values of getchar and fGetChar: do you see that getchar emphasizes the character being returned, while fGetChar emphasizes the error condition? If your goal is to write bug-free code, which do you think should be emphasized?
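The text gives only the prototype of fGetChar. One way it might be implemented on top of the standard library is sketched below. The flag typedef and the helper fGetCharFrom, which takes an explicit stream so that the routine can be exercised without standard input, are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdio.h>

typedef int flag;               /* assumed Boolean type */

/* Reads one character from pfile into *pch.
 * Returns 1 on success, 0 at end of file or on a read error,
 * in which case *pch is left untouched. */
flag fGetCharFrom(FILE *pfile, char *pch)
{
    int ch;                     /* int, so that EOF stays distinguishable */

    assert(pfile != NULL && pch != NULL);

    ch = fgetc(pfile);
    if (ch == EOF)
        return 0;

    *pch = (char)ch;
    return 1;
}

/* The interface from the text, reading from standard input. */
flag fGetChar(char *pch)
{
    return fGetCharFrom(stdin, pch);
}
```

The int that must hold EOF is now confined to one small function, and callers see only a flag and a char.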

True, you lose the flexibility to write code such as

    putchar(getchar());

But how often are you certain that getchar won't fail? In almost all cases, the code above is a bug.

Some programmers may be thinking, "Sure, fGetChar is safer, but it wastes code, because you have to pass an extra argument every time you call it. And what if a programmer passes ch instead of &ch? Forgetting the & when calling scanf has long been a common source of bugs."

Good questions.

Whether the compiler generates more or less code depends on the particular compiler. Some compilers will generate slightly more code, and some slightly less, because you no longer have to compare the return value with EOF after every call. Either way, given how far disk and memory prices have fallen while programs have grown more complex and bug rates have risen, a small difference in code size is probably not worth worrying about.

As for the second concern, passing a character to fGetChar instead of a pointer to a character, there is nothing to worry about if you use function prototypes as suggested in Chapter 1. If you pass fGetChar anything other than a character pointer, the compiler will automatically generate an error and show you your mistake.

In fact, the practice of combining mutually exclusive outputs into a single return value is a holdover from assembly language, where only a limited number of machine registers are available for manipulating and passing data. In that environment, using one register to return two mutually exclusive values is both efficient and necessary. Programming in C is another matter: although C lets us get "close to the machine," that doesn't mean we should treat it as a high-level assembly language. When you design function interfaces, choose designs that lead programmers to write correct code. Don't use confusing dual-purpose return values; each output should represent exactly one kind of data. Make these points explicit in the design so that users find it hard to overlook the important details.

Make it hard to ignore error conditions.

Don't bury error codes in return values.

Just Think a Little More

Programmers always know when they are combining multiple outputs into a single return value, so following the advice above is easy: they just don't do it. In other cases, though, an interface can look fine and still harbor a hidden danger, like a Trojan horse. Look at the following code, which changes the size of a memory block:

    pbBuf = (byte *)realloc(pbBuf, sizeNew);
    if (pbBuf != NULL)
        use or initialize the larger buffer

Do you see the bug? If you don't, don't worry: the bug is serious but subtle, and few people spot it without a hint. So here is a hint: if pbBuf is the only pointer to the block being resized, what happens when realloc fails? The answer is that realloc returns NULL, which is stored in pbBuf, wiping out the only pointer to the original block. In short, the code above loses memory blocks.

How often, when you resize a block, do you want to store the pointer to the new block in a different variable? About as often as you find a quarter lying in the street, I'd guess. Usually when people resize a block, they want the original variable to point to the new block, and that is why programmers fall into the trap and write code like the above.

Note that programmers who routinely return error values and valid data together will habitually design interfaces like realloc's. Ideally, realloc would return an error code and, whether or not the block was resized, a pointer to the block. Those are two separate outputs. Let's look again at fResizeMemory, the shell function for realloc introduced in Chapter 3. With the debug code removed, it looks like this:

    flag fResizeMemory(void **ppv, size_t sizeNew)
    {
        byte **ppb = (byte **)ppv;
        byte *pbResize;

        pbResize = (byte *)realloc(*ppb, sizeNew);
        if (pbResize != NULL)
            *ppb = pbResize;

        return (pbResize != NULL);
    }

The if statement in this code guarantees that the original pointer is never destroyed. If you rewrite the realloc code from this section to use fResizeMemory, you get

    if (fResizeMemory(&pbBuf, sizeNew))
        use or initialize the larger buffer

If FresizeMemory fails, PBBUF will not be set to NULL. It still points to the original memory block, as we expect. So we can ask: "Use FresizeMemory, is the programmer may lose memory block?" We can also ask: "Is the programmer to deal with the error of FresizeMemory?" Another interesting question that needs to be explained is: consciously follow The first suggestion given in this chapter ("Do not hide the error in the return value"). Never design an interface such as Realloc. At the beginning, they would make more images such as FresizeMemory, so there is no problem with realloc's loss of memory block issues. All the arguments of this book are based on interactions, they will have unexpected effects. This is an example.

However, separation of the output of the function does not always enable us to avoid the interface of hidden traps, I really hope to give a better advice, but I think the only way to find out these hidden traps is to stop thinking. the design of. The best way to do this is to check the various possible combinations of input and output, find side effects that may cause problems. I know that this is sometimes very boring, but I have to remember: This is more cost-effective than this problem than it takes more than the time. The worst case is this step, so the day, how many other programmers have to track the mistakes caused by the bad interface of the design. Just think about the error caused by the incorrect of the trap of the trap by getchar, malloc, and realloc, the programmer of the world should waste how much time, we have written other functions to all according to this mode. Say. This is really terrible! In fact, as long as you consider a little more, you can completely avoid this phenomenon.

Always look for, and eliminate, flaws in your function interfaces

The One-Function Memory Manager

Although we spent a lot of time on realloc in Chapter 3, we did not touch on many of its stranger aspects. If you pull out a C run-time library manual and look up the full description of realloc, you will find something like this:

    void *realloc(void *pv, size_t size);

realloc changes the size of a previously allocated memory block. The original contents of the block are preserved from the start of the block up to the smaller of the old and new sizes.

• If the new size is smaller than the old size, realloc releases the unwanted memory at the tail of the block, and pv is returned unchanged.

• If the new size is larger than the old size, the expanded block may be allocated at a new address, with the original contents copied to the new location. The returned pointer points to the expanded block, and the contents of the expanded part are uninitialized.

• If the request to expand the block cannot be satisfied, realloc returns NULL. Shrinking a block always succeeds.

• If pv is NULL, realloc behaves as though you had called malloc(size): it returns a pointer to a newly allocated block, or NULL if the request cannot be satisfied.

• If pv is not NULL but the new size is zero, realloc behaves as though you had called free(pv), and it always returns NULL.

• If pv is NULL and the new size is zero, the result is undefined.

Whew! realloc is a prime example of implementation overkill: it does all the work of a memory manager in a single function. Who needs malloc? Who needs free? realloc does it all.

There are several good reasons not to design a function this way. First, how can you expect programmers to use such a function safely? It has so many details that even experienced programmers do not know them all. If you doubt that, take a survey: how many programmers know that passing a NULL pointer to realloc is the same as calling malloc? How many know that passing a block size of zero is the same as calling free? Granted, these features are fairly obscure, so ask about something programmers must know to avoid bugs: do they know that when they call realloc to expand a block, the block may move?

Another problem with realloc is that we know garbage may be passed to it, but because its definition is so general, it is hard to guard against invalid arguments. If you mistakenly pass a NULL pointer, that is legal; if you mistakenly pass a size of zero, that is legal too. Worse, you meant to resize a block, and instead realloc mallocs a new block or frees the current one. If practically every argument is legal, how can we use assertions to validate realloc's arguments? Whatever arguments you supply, realloc handles them, even at the extremes. At one extreme it frees a block; at the other it mallocs one. Those are exactly opposite behaviors.

In fairness, programmers do not usually sit down and think, "I am going to design an entire subsystem in a single function." Functions like realloc almost always arise in one of two ways. Either they evolve gradually, picking up features over time, or a particular implementation happens to provide extra behavior (such as acting like free and malloc), and the programmer extends the formal description to include these so-called "lucky" features.

Whatever the reason a multipurpose function came to be written, break it up into separate functions. For realloc, that means separating out expanding a block, shrinking a block, allocating a block, and releasing a block. By breaking realloc into four distinct functions, we can make the error checking much stronger. For example, if we want to shrink a block, we know that the pointer must point to a valid block and that the new size must be less than (or equal to) the current size. Anything else is an error. With a separate ShrinkMemory function, we can validate these arguments with assertions.

In some cases, though, we really do want a function to do more than one thing. For example, when you call realloc, do you usually know whether the new size is larger or smaller than the current size? It depends on the program, but I usually do not know (although I can often derive that information). For me it is better to have one function that both expands and shrinks blocks, so that I do not have to write an if statement every time I need to resize a block. I give up some extra argument checking, but that is offset by not having to write all those if statements (which could themselves be botched). We always know when we are allocating memory and when we are releasing it, however, so those operations should be pulled out of realloc and made into separate functions. Chapter 3's fNewMemory, FreeMemory, and fResizeMemory are three such well-defined functions.

But if I were writing a program in which I usually knew whether I was expanding or shrinking a block, I would definitely split the two operations apart and create two new functions:

    flag fGrowMemory(void **ppv, size_t sizeLarger);
    void ShrinkMemory(void *pv, size_t sizeSmaller);

Not only would this let me validate the pointer and size arguments thoroughly, but calling ShrinkMemory would also be less risky, because it guarantees that the block is always shrunk and never moved. So instead of writing:

    ASSERT(sizeNew <= sizeofBlock(pb));    /* validate pb and sizeNew */
    (void)realloc(pb, sizeNew);            /* shrinking can't fail */

you simply write:

    ShrinkMemory(pb, sizeNew);

and the validation is done for you. The simplest reason to use ShrinkMemory instead of realloc is that the resulting code is much clearer. With ShrinkMemory there is no need for a comment explaining that the call cannot fail, no need for a void cast to discard the useless return value, and no need to validate pb and sizeNew, because ShrinkMemory does all of that for us. If you do use realloc, I think you should even add an assertion to check that the pointer it returns is identical to pb.
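The following is a rough sketch, under stated assumptions, of how the two prototypes above could be given strict assertions. It is not the book's implementation. The blockheader type, PbhFromPv, and this version of sizeofBlock are hypothetical helpers that record each block's size in front of the user data, and ShrinkMemory is simplified so that it only records the smaller size and never returns memory to the heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int flag;                      /* assumed Boolean type */

/* Hypothetical header kept in front of each block so sizes can be checked.
 * The union pads the header so user data stays suitably aligned. */
typedef union {
    size_t      size;
    long double ldAlign;
    void       *pvAlign;
} blockheader;

static blockheader *PbhFromPv(void *pv)
{
    return (blockheader *)pv - 1;
}

size_t sizeofBlock(void *pv)
{
    assert(pv != NULL);
    return PbhFromPv(pv)->size;
}

flag fNewMemory(void **ppv, size_t size)
{
    blockheader *pbh;

    assert(ppv != NULL && size != 0);
    pbh = malloc(sizeof(blockheader) + size);
    if (pbh == NULL) {
        *ppv = NULL;
        return 0;
    }
    pbh->size = size;
    *ppv = pbh + 1;
    return 1;
}

void FreeMemory(void *pv)
{
    assert(pv != NULL);
    free(PbhFromPv(pv));
}

flag fGrowMemory(void **ppv, size_t sizeLarger)
{
    blockheader *pbh;

    assert(ppv != NULL && *ppv != NULL);
    assert(sizeLarger >= sizeofBlock(*ppv));   /* growing only */

    pbh = realloc(PbhFromPv(*ppv), sizeof(blockheader) + sizeLarger);
    if (pbh == NULL)
        return 0;                      /* *ppv still points to the old block */
    pbh->size = sizeLarger;
    *ppv = pbh + 1;
    return 1;
}

void ShrinkMemory(void *pv, size_t sizeSmaller)
{
    assert(pv != NULL);
    assert(sizeSmaller != 0 && sizeSmaller <= sizeofBlock(pv));

    /* Simplification: only the recorded size shrinks, so the block can
     * neither move nor fail. No memory is returned to the heap. */
    PbhFromPv(pv)->size = sizeSmaller;
}
```

With this split, a call such as ShrinkMemory(pb, sizeNew) that accidentally passes a larger size trips an assertion at once, whereas the same mistake passed to realloc would be quietly treated as a request to expand the block.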

Don't write multipurpose functions

Write separate functions so that arguments can be validated more strictly

Ambiguous Inputs

As we said earlier, a function's outputs should be explicit so that they do not confuse programmers. If you apply the same advice to a function's inputs, you naturally avoid writing a function like realloc. realloc takes a pointer to a memory block, but sometimes that pointer may take the magic value NULL, which turns the call into a malloc. realloc takes a block size, but the size may take the magic value zero, which turns the call into a free. These magic argument values may seem harmless, but they actually hurt the understandability of the program. Can you tell whether the following code resizes, allocates, or releases a memory block?

    pbNew = realloc(pb, size);

You can't tell. It could be any of the three, depending on the values of pb and size. But if we know that pb points to a valid memory block and that size is a legal block size, we know immediately that the call resizes the block. Just as explicit outputs make it easy to see what a function returns, explicit inputs make it easy to see what a function is meant to do. That is valuable to the maintenance programmers who must read and understand other people's code.

Sometimes ambiguous inputs are not as easy to spot as they are in realloc. Take a look at the special-purpose string copy routine below. It takes the first size characters starting at strFrom and stores them as a string starting at strTo:

    char *CopySubStr(char *strTo, char *strFrom, size_t size)
    {
        char *strStart = strTo;

        /* stop early if strFrom ends before size characters are copied */
        while (size-- > 0 && *strFrom != '\0')
            *strTo++ = *strFrom++;
        *strTo = '\0';

        return (strStart);
    }

CopySubStr is similar to the standard function strncpy, except that it guarantees that the string starting at strTo is a properly zero-terminated C string. A typical use of the function is to extract a substring from a larger string, for example from a packed string of day names:

    static char *strDayNames = "SunMonTueWedThuFriSat";
    ......
    ASSERT(day >= 0 && day <= 6);
    CopySubStr(strDay, strDayNames + day*3, 3);

Now that we understand how CopySubStr works, can you see the problem with its inputs? It becomes easy to spot as soon as you try to write assertions that validate the function's arguments. The assertion for strTo and strFrom could be:

    ASSERT(strTo != NULL && strFrom != NULL);

But how would we validate size? Is a size of zero legal? What if size is larger than the length of strFrom? If you look at the implementation, you will see that both cases are handled. If size is zero on entry, the while loop never executes; if size is larger than the length of strFrom, the while loop copies all of strFrom and strTo is then terminated. To make this clear, the function's description has to say so:

    /* CopySubStr -- extract a substring from a string.
     *
     * Transfer the first "size" characters of strFrom to the
     * string starting at strTo. If strFrom has fewer than
     * "size" characters, all of the characters in strFrom are
     * copied to strTo. If size is zero, strTo is set to the
     * empty string.
     */
    char *CopySubStr(char *strTo, char *strFrom, size_t size)
    {
        ......

Sounds familiar, doesn't it? Functions like this are as common as dust on a light bulb. But is this the best way to handle the size argument? The answer is no, at least not from the standpoint of writing bug-free code.

For example, suppose the programmer mistypes "3" as "33" when calling CopySubStr:

    CopySubStr(strDay, strDayNames + day*3, 33);

This is certainly a bug, but by CopySubStr's definition it is perfectly legal. Yes, you might catch the bug before the code ships, but it will not be detected automatically; somebody has to track it down. Don't forget that an assertion firing close to the mistake leads you to the bug much faster than working backward from bad output.

From a bug-free standpoint, if a function argument is out of range or meaningless, it should be treated as illegal input even if the function could handle it, because quietly accepting strange input values hides bugs rather than exposing them. In a sense, it is defensive programming that permits this kind of sloppy input. If you want the program to be robust, put the defensive code inside the function, but do not make it part of the interface:

    /* CopySubStr -- extract a substring from a string.
     *
     * Transfer the first "size" characters of strFrom to the
     * string starting at strTo. strFrom must contain at least
     * "size" characters.
     */
    char *CopySubStr(char *strTo, char *strFrom, size_t size)
    {
        char *strStart = strTo;

        ASSERT(strTo != NULL && strFrom != NULL);
        ASSERT(size <= strlen(strFrom));

        while (size-- > 0)
            *strTo++ = *strFrom++;
        *strTo = '\0';

        return (strStart);
    }
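As a compilable sketch of the stricter version, the block below uses the standard assert macro in place of the book's ASSERT and wraps the day-name lookup in a hypothetical helper, DayName, which is not part of the original text. The const qualifier on strFrom is also an addition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* CopySubStr -- copy the first "size" characters of strFrom to strTo and
 * zero-terminate the result. strFrom must hold at least "size" characters.
 */
char *CopySubStr(char *strTo, const char *strFrom, size_t size)
{
    char *strStart = strTo;

    assert(strTo != NULL && strFrom != NULL);
    assert(size <= strlen(strFrom));   /* catches a typo such as 33 */

    while (size-- > 0)
        *strTo++ = *strFrom++;
    *strTo = '\0';

    return strStart;
}

/* DayName -- hypothetical helper: copy the 3-letter name of day (0..6)
 * into strDay, which must have room for at least 4 characters.
 */
char *DayName(char *strDay, int day)
{
    static const char strDayNames[] = "SunMonTueWedThuFriSat";

    assert(day >= 0 && day <= 6);
    return CopySubStr(strDay, strDayNames + day * 3, 3);
}
```

With this definition, the mistyped call that passes 33 instead of 3 fails the second assertion for every value of day, so the mistake surfaces the first time the code runs in a debug build.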

Sometimes it is worthwhile to let a function accept meaningless arguments, such as a size of 0, because doing so saves callers from unnecessary tests. For example, because memset allows its size argument to be zero, the if statement in the following code is unnecessary:

    if (strlen(str) != 0)    /* fill str with spaces */
        memset(str, chSpace, strlen(str));

Be careful, though, when you allow a size of 0. Programmers usually handle a size (or count) of 0 because they can, not because they should. If you are writing a function that takes a size argument, you do not have to handle a size of 0. Instead, ask yourself, "How often will programmers call this function with a size of 0?" If the answer is never or almost never, do not handle 0; add an assertion instead. Remember that every restriction you remove takes away a chance to catch the corresponding bug. A good guideline is to define a function's inputs strictly at first and to assert as much as you can. Then, if a restriction proves too harsh, you can remove it without affecting the rest of the program.

The NULL pointer check in Chapter 3's FreeMemory follows this principle. Because I never call FreeMemory with a NULL pointer, I chose to assert against it and strengthen the error checking. You may see it differently. There is no right or wrong here, as long as you make a conscious choice rather than simply following habit.

Don't be ambiguous; define function arguments explicitly

Don't Fail Me Now

Microsoft's hiring policy is to ask candidates technical questions during interviews. For programmers, that means programming problems. I often begin by asking the candidate to write the standard tolower function. I hand the candidate an ASCII table and ask, "How would you write a function that converts an uppercase letter to its lowercase equivalent?" I am intentionally vague about how to handle symbols and lowercase letters, mainly to see how the candidate deals with those cases. Will those characters be returned unchanged? Will the function assert against them? Will they be ignored? More than half the programmers write this:

    char tolower(char ch)
    {
        return (ch + 'a' - 'A');
    }

This works as long as ch is an uppercase letter, but it breaks if ch is anything else. When I point this out to candidates, sometimes they say, "I assumed that ch had to be an uppercase letter. If it isn't, I could return it unchanged." That solution is reasonable; others are less so. More commonly, candidates say, "I didn't think of that. I can fix it by returning an error code when ch isn't an uppercase letter." Sometimes they return NULL, sometimes the null character, but for some reason -1 is the clear favorite:

    char tolower(char ch)
    {
        if (ch >= 'A' && ch <= 'Z')
            return (ch + 'a' - 'A');
        else
            return (-1);
    }

These solutions violate the earlier advice because they mix an error value in with real data. But the real problem is not that the candidates failed to follow advice they had never heard; it is that they returned an error code when there was no need to.

This raises another point: if a function returns an error code, every caller of the function must handle that error. If tolower can return -1, you can't simply write:

    ch = tolower(ch);

Instead you must write:

    int chNew;    /* must be an int to hold -1 */

    if ((chNew = tolower(ch)) != -1)
        ch = chNew;

This ties in with the previous section. Once you see how tolower would have to be called, you realize that returning an error code is not the best way to define the function.

If you find yourself about to return an error code, stop and ask whether some other design would remove the need to return an error at all. Instead of defining tolower to "return the lowercase equivalent of an uppercase letter," define it to "return the lowercase equivalent of ch if ch is an uppercase letter; otherwise return ch unchanged."

If you find that you cannot eliminate the error condition, consider simply disallowing the problem cases, that is, validating the function's inputs with an assertion. Applied to tolower, that gives:

    char tolower(char ch)
    {
        ASSERT(ch >= 'A' && ch <= 'Z');
        return (ch + 'a' - 'A');
    }

Either approach frees callers from doing run-time error checks, which means smaller code and fewer bugs.

Write functions that cannot fail when given valid inputs

Read Between the Lines

I can't overemphasize how important it is to examine your function interfaces from the caller's point of view. When you consider that a function is defined once but called from many places in a program, it seems foolish not to think about how it will be called. The tolower examples we have seen show this: the error-returning version complicated every piece of calling code. But mixing outputs and returning unnecessary error codes are not the only things that complicate code. Sometimes code becomes complicated simply because nobody paid attention to how the call "reads."

For example, suppose that while improving the disk-handling portion of your application you come across a file seek call like this:

    if (fseek(fpDocument, offset, 1) == 0)

You can tell that some kind of seek is taking place, and you can see that the error condition is being handled, but how readable is the call? What kind of seek is it: from the start of the file, from the current position, or from the end of the file? And if the call returns 0, does that indicate success or failure?

Suppose instead that the programmer had used predefined names in the call:

    #include <stdio.h>    /* brings in the definition of SEEK_CUR */

    #define ERR_NONE 0
    ......
    if (fseek(fpDocument, offset, SEEK_CUR) == ERR_NONE)
    ......

Isn't that clearer? Of course it is. But this should come as no surprise; programmers have long known to avoid magic numbers in their programs. Named constants not only make code more readable, they also make it more portable (consider that SEEK_CUR may not be 1 on other systems).

What I want to point out is that although many programmers treat NULL, TRUE, and FALSE as named constants, they are not really named constants; they are merely textual representations of magic numbers. For example, what do the following calls do?

    UnsignedToStr(u, str, TRUE);
    UnsignedToStr(u, str, FALSE);

You might guess that these calls convert an unsigned value to its text representation. But what effect does the Boolean argument have? It would be clearer if I wrote the calls like this:

    #define BASE10 1
    #define BASE16 0
    ......
    UnsignedToStr(u, str, BASE10);
    UnsignedToStr(u, str, BASE16);

When a programmer sits down to write such a function, the meaning of the Boolean argument seems perfectly clear. The programmer first writes the function's description and then implements it:

    /* UnsignedToStr
     *
     * This function converts an unsigned value to its
     * text representation. If fDecimal is TRUE, u is
     * converted to its decimal representation; otherwise,
     * it is converted to its hexadecimal representation.
     */
    void UnsignedToStr(unsigned u, char *strResult, flag fDecimal)
    {
        ......

What could be clearer than that?

But in fact, Boolean arguments often indicate that the designer did not think things through. Either the function does two different things and the Boolean argument selects which one, or the function is flexible enough to support more than two behaviors but the programmer used a Boolean to specify the only two cases of interest. Often both are true.

For example, if we view UnsignedToStr as a function that does two different things, it should be split into two functions:

    void UnsignedToDecStr(unsigned u, char *str);
    void UnsignedToHexStr(unsigned u, char *str);

But in this case a better solution is to replace the Boolean argument with a general one, which makes UnsignedToStr more flexible. Instead of passing TRUE or FALSE, the programmer passes the conversion base:

    void UnsignedToStr(unsigned u, char *str, unsigned base);

This gives us a clean, flexible design in which the calling code is easy to understand and the function is more useful.

This suggestion may seem to contradict the earlier advice to define function arguments strictly: we have turned a specific TRUE or FALSE input into a general one, most of whose values will probably never be used. But remember that even though the argument has become general, we can always put an assertion in the function to verify that base is either 10 or 16. Then, if you later need binary or octal conversions, you can relax the assertion to allow base values of 2 and 8.

That is much better than functions I have seen whose "Boolean" arguments take the values TRUE, FALSE, 2, and -1. Because the value domain of a Boolean argument is not easily extended, you end up either living with such meaningless argument values or modifying every call.
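To show the general-but-asserted design in one place, here is one possible sketch of UnsignedToStr taking a base argument. The body is an assumption made for illustration (the original gives only the prototype), and it uses the standard assert macro in place of the book's ASSERT.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* UnsignedToStr -- convert u to its text representation in the given
 * base. Only bases 10 and 16 are legal for now. str must have room for
 * the digits plus a terminating zero.
 */
void UnsignedToStr(unsigned u, char *str, unsigned base)
{
    static const char digits[] = "0123456789ABCDEF";
    char rev[sizeof(unsigned) * CHAR_BIT];  /* enough even for base 2 */
    size_t cch = 0;

    assert(str != NULL);
    assert(base == 10 || base == 16);  /* relax this to allow 2 and 8 */

    /* Generate the digits in reverse order; a do loop so 0 yields "0". */
    do {
        rev[cch++] = digits[u % base];
        u /= base;
    } while (u != 0);

    /* Copy them out most significant digit first. */
    while (cch > 0)
        *str++ = rev[--cch];
    *str = '\0';
}
```

A call such as UnsignedToStr(u, str, 16) reads clearly at the point of call, and a stray UnsignedToStr(u, str, TRUE) left over from the old interface fails the assertion, since TRUE is neither 10 nor 16.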

Make the code intelligible at the point of call; avoid Boolean arguments

Warn the Reader

As a last line of defense against bugs, you can write comments in a function that emphasize its potential hazards and show how the function should be used, which helps other programmers use it without making mistakes. For example, the comment for getchar should not look like this:

    /* getchar -- same as getc(stdin) */
    int getchar(void)
    ......

That is no help to programmers. It should read something like this:

    /* getchar -- equivalent to getc(stdin)
     *
     * getchar returns the next character from stdin. When an
     * error occurs, it returns the int value EOF. A typical
     * use of this function is:
     *
     *     int ch;    // ch must be an int to hold EOF
     *
     *     if ((ch = getchar()) != EOF)
     *         success: ch is the next character
     *     else
     *         failure: ferror(stdin) gives the type of error
     */
    int getchar(void)
    ......

If you gave these two descriptions to a programmer new to the C library, which do you think would do a better job of conveying the danger in using getchar? What difference would the two descriptions make when that programmer first uses getchar? Do you think he would write new code, or would he copy the example from your comment and modify it as needed?

Another benefit of commenting functions this way is that it forces less careful programmers to stop and think about how other programmers will use their functions. If a programmer has written a clumsy interface, he should notice the clumsiness when he writes the typical-use example. And even if he does not notice the problem, it does not much matter as long as the example is detailed and correct. For example, realloc would cause far fewer problems if it were commented like this:

    /* realloc(pv, size)
     * ...
     * A typical use of this function is:
     *
     *     void *pvNew;    // protects pv in case realloc fails
     *
     *     pvNew = realloc(pv, sizeNew);
     *     if (pvNew != NULL)
     *     {
     *         success: update pv
     *         pv = pvNew;
     *     }
     *     else
     *         failure: don't destroy pv with the NULL pvNew
     */
    void *realloc(void *pv, size_t size)

By copying an example like this, even a less careful programmer is likely to avoid the lost-memory problem described in this chapter. The example will not work for every programmer, but like the warning labels on medicine packages it will influence some people, and every little bit helps.

Don't, however, use examples as a substitute for a good interface. The getchar and realloc interfaces lead programmers into mistakes; those hazards should be eliminated, not just documented.

Write comments that emphasize potential hazards

Summary

Designing bug-resistant interfaces is not hard, but it does take extra thought and a willingness to give up deep-rooted coding habits. The suggestions in this chapter require nothing more than changing a function's interface, yet they let programmers write correct code without having to think about the rest of the code. The theme running through this chapter is "make everything explicit." If programmers understood and remembered every detail, they would not make mistakes; they make mistakes because they forget, or never knew, these important details. So design bug-resistant interfaces that make those details hard to overlook.

Key points:

• The easiest function interfaces to use and understand are those in which every input and output argument represents exactly one type of data. Mixing error values and other special values into a function's inputs and outputs only clutters the interface.

• Design function interfaces that force programmers to think about all the important details (such as handling error conditions); don't make it easy for them to overlook or forget those details.

• Think about how programmers must call your functions, and look for interface flaws that could lead them to introduce bugs unintentionally. It is especially important to strive for functions that always succeed, so that callers do not have to do any error handling.

• To make programs more understandable, make sure that calls to your functions read clearly to the programmers who must read them. Magic numbers and Boolean arguments work against this goal, so eliminate them.

• Break up multipurpose functions. More specialized function names (ShrinkMemory rather than realloc, for example) not only make the program easier to understand but also let you use stricter assertions to catch bad calls automatically.

• To show programmers the proper way to call your functions, document their interfaces in detail, and emphasize the hazards.

Exercises:

1) The strdup function shown at the start of this chapter allocates a copy of a string, but returns NULL if it fails. What would be a more bug-resistant interface for strdup?

2) I said that Boolean inputs often indicate that a better function interface exists. But what about Boolean outputs? For example, if fGetChar fails, it returns FALSE and requires the programmer to call ferror(stdin) to determine the cause of the error. What would be a better getchar interface?

3) Why is ANSI's strncpy function bound to trip up unwary programmers?

4) If you are familiar with C++'s inline specifier, describe its value in writing bug-resistant function interfaces.

5) C++ has & reference arguments, which are similar to VAR parameters in Pascal. So instead of writing:

    flag fGetChar(char *pch);    /* prototype */
    ......
    if (fGetChar(&ch))
        ch contains the new character ...

you can write:

    flag fGetChar(char &ch);    /* prototype */
    ......
    if (fGetChar(ch))    /* &ch is passed automatically */
        ch contains the new character ...

On the surface this looks like an improvement, because programmers cannot "forget" the explicit & that traditional C requires. But why does using this feature produce bug-prone interfaces rather than bug-resistant ones?

6) The standard strcmp function takes two strings and compares them character by character. If the two strings are equal, strcmp returns 0; if the first string is less than the second, it returns a negative number; if the first string is greater than the second, it returns a positive number. Code that calls strcmp therefore usually looks like this:

    if (strcmp(str1, str2) rel_op 0)
    ......

Here rel_op is one of ==, !=, >, >=, <, or <=.

Project:

Examine the standard C library functions and redesign their interfaces to make them more bug-resistant. What are the advantages and disadvantages of the redesigned functions?

Project:

Search a large body of code for calls to memset, memmove, memcpy, and the strn family of functions (strncpy and so on). Of the calls you find, how many actually require the function to accept a count of zero? Is that convenience enough to justify allowing the functions to accept a zero count?

Chapter 6 Risky Business

If you put a programmer at the edge of a cliff, how would he get down? Would he climb down a rope? Use a hang glider? Or simply jump? We can't be sure whether he would choose the rope or the glider, but we can be sure he would not jump, because that is far too dangerous. Yet when programmers have several possible implementations to choose from, they often consider only space and speed and ignore risk completely. What would happen if the programmer on the cliff ignored risk the same way and considered only the most efficient way to reach the bottom?

Programmers ignore risk for at least two reasons.

First, they blindly assume that however they implement the code, it will have no bugs. No programmer ever says, "I am going to write a quicksort routine, and I plan to have three bugs in it." Programmers do not plan on bugs, yet when bugs show up later, they are not particularly surprised.

The second reason, and I think the main one, is that nobody ever taught them to ask questions such as "How risky is this design? How risky is this implementation? Is there a safer way to write this expression? Can this design be tested?" To ask these questions, you must first give up the belief that whichever choice you make, you will end up with bug-free code. Even if that belief were correct, when would you have the bug-free code? Within days or weeks, because you used safe coding practices? Or only after months of debugging and fixing the many bugs that came from ignoring risk?

This chapter therefore discusses the risks in some common coding practices and how to reduce or even eliminate them.

How Long Is a long?

The American National Standards Institute (ANSI) committee examined the C implementations running on many platforms. They found that although C was thought of as a portable language, in practice it was not. Not only did implementations differ from system to system, but the preprocessor and the language itself differed in many important respects. The ANSI committee standardized most of these areas so that programmers could write portable code, but the ANSI standard left one very important area alone: it does not fully define intrinsic data types such as char, int, and long. The standard leaves these important implementation details to the compiler writers.

For example, one ANSI-conforming compiler may have 32-bit ints and chars that are signed by default; another ANSI-conforming compiler may have 16-bit ints and chars that are unsigned by default. Despite these differences, both compilers can conform strictly to the ANSI standard.

Please see the following code:

CHAR CH;

......

CH = 0xFF;

IF (CH == 0xFF)

......

My problem is that the expression in the IF statement is true or for fake?

The correct answer is: I don't know. Because this is completely dependent on the compiler. If the characters are unsigned in the default, the expression is definitely true. However, for characters, such as compilers, such as 80x86 and 680x0, each test will fail, which is determined by the C language expansion rule. In the above code, the character ch is compared to the integer 0xFF. According to the C language expansion rules, the compiler must first convert CH to integer int, and then compare again. The key is that if the ch is symbol, the symbol bit is expanded in the conversion, and its value will be extended from 0xFF to 0xfff (assuming INT is 16 bits). This is the reason for the test failure.

The example above is contrived to make my point, and you may object that it is not realistic code. But the same problem lurks in the following common code:

char *pch;
......
if (*pch == 0xFF)
    ......

The char type is not the only one with this loose definition; bit fields have the same flaw. For example, what is the range of the following bit field?

int reg: 3;

Again, you can't tell. Even though reg is declared as an int, which suggests that it is signed, reg can be either signed or unsigned depending on the compiler. If you want reg to be definitely signed or definitely unsigned, you must declare it as signed int or unsigned int.
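
As a small illustration, here are bit fields whose signedness is stated explicitly. The struct and function names are hypothetical. The values used stay inside the portable ranges for a 3-bit field, -3 to 3 for signed and 0 to 7 for unsigned.

```c
/* Bit fields with explicit signedness; nothing is left to the compiler's default. */
struct REGS
{
    signed int   sreg : 3;      /* portable range: -3 to 3 */
    unsigned int ureg : 3;      /* portable range:  0 to 7 */
};

/* Store a value in the signed field and read it back. */
int StoreSigned(int value)
{
    struct REGS regs = { 0, 0 };

    regs.sreg = value;
    return regs.sreg;
}

/* Store a value in the unsigned field and read it back. */
int StoreUnsigned(unsigned value)
{
    struct REGS regs = { 0, 0 };

    regs.ureg = value;
    return (int)regs.ureg;
}
```

Values inside those ranges should come back unchanged under any conforming compiler. What a plain "int reg: 3" would return for -3 or 7 is exactly what you cannot know.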

How big are short, int, and long? The ANSI standard does not say; that decision too is left to the compiler implementors.

The members of the ANSI committee did not overlook the problem of loosely defined data types. In fact, they examined a large number of C implementations and concluded that since the type definitions varied so much from compiler to compiler, defining strict standards would invalidate a large amount of existing code. That would have violated one of their guiding principles: "Existing code is important." Their aim was not to create a better language but to standardize the existing one, and wherever possible they protected existing code.

Restricting the types would also have violated another of the committee's guiding principles: "Keep the spirit of C. Make it fast, even if it is not guaranteed to be portable." So if an implementor feels that signed characters are more efficient on a given machine, characters are signed. Similarly, depending on the hardware, an implementor may choose a 16-bit int, a 32-bit int, or some other size, and by default you do not know whether a bit field is signed or unsigned.

The intrinsic types are underspecified, and the weakness shows up when you upgrade or change compilers, when you move to a new target environment, or whenever the compiler you are using turns out to follow different rules.

This does not mean that you cannot use the intrinsic types safely. As long as you make no assumptions about a type that the ANSI standard does not explicitly guarantee, you are safe.

For example, you can use a plain char as long as you rely only on values from 0 to 127, the range where signed and unsigned characters overlap. Consider this code:

char *strcpy(char *pchTo, char *pchFrom)
{
    char *pchStart = pchTo;

    while ((*pchTo++ = *pchFrom++) != '\0')
        NULL;

    return (pchStart);
}

It works with any compiler because it makes no assumptions about the range of char. The following code does not:

/* strcmp -- compare two strings
 *
 * If strLeft < strRight, return a negative value
 * If the strings match, return 0
 * If strLeft > strRight, return a positive value
 */
int strcmp(const char *strLeft, const char *strRight)
{
    for (NULL; *strLeft == *strRight; strLeft++, strRight++)
    {
        if (*strLeft == '\0')    /* matched through the terminating character? */
            return (0);
    }

    return ((*strLeft < *strRight) ? -1 : 1);
}

This code loses portability because of the comparison in the last line. Whenever you use "<" or any other operator that depends on sign information, you force the compiler to generate nonportable code. Fixing strcmp is easy: either declare strLeft and strRight as pointers to unsigned char, or put the casts directly into the comparison:

(*(unsigned char *)strLeft < *(unsigned char *)strRight)
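
Put together, the first of the two fixes might look like the sketch below. The name StrCmpPortable is invented here only to avoid colliding with the library strcmp, and the local pointer names are likewise assumptions.

```c
/* Compare two strings as sequences of unsigned char, so that the result
 * does not depend on whether plain char is signed.
 * Returns a negative value, 0, or a positive value. */
int StrCmpPortable(const char *strLeft, const char *strRight)
{
    const unsigned char *pchLeft  = (const unsigned char *)strLeft;
    const unsigned char *pchRight = (const unsigned char *)strRight;

    for ( ; *pchLeft == *pchRight; pchLeft++, pchRight++)
    {
        if (*pchLeft == '\0')       /* matched through the terminator? */
            return 0;
    }

    return (*pchLeft < *pchRight) ? -1 : 1;
}
```

A character such as '\xFF' now always compares greater than '\x01', regardless of how the compiler treats plain char.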

Remember the principle: do not use "plain" chars in expressions. Because bit fields have the same problem, there is a similar principle: do not use "plain" bit fields.

If you read the ANSI standard carefully, you can derive definitions for a set of portable types. These portable types work across many compilers and many machines.

char                 0 to 127
signed char          -127 (not -128) to 127
unsigned char        0 to 255
                     Size unknown, but at least 8 bits

short                -32767 (not -32768) to 32767
signed short         -32767 to 32767
unsigned short       0 to 65535
                     Size unknown, but at least 16 bits

long                 -2147483647 (not -2147483648) to 2147483647
signed long          -2147483647 to 2147483647
unsigned long        0 to 4294967295
                     Size unknown, but at least 32 bits

int i:n              0 to 2^(n-1)-1
signed int i:n       -(2^(n-1)-1) to 2^(n-1)-1
unsigned int i:n     0 to 2^n-1
                     Size unknown, but at least n bits

These types are the most portable because they allow for the three most common number representations: ones' complement, two's complement, and sign-magnitude.

Don't fret about writing portable code. Treat it the way people treat choosing tiles for their kitchen counters: most people pick something they like that future home buyers could also live with, so that they will not have to rip out and replace the tile in order to sell the house. Think about portable code the same way. In most cases, writing portable code is as easy as writing nonportable code, so to save yourself rework later, write portable code.

Use well-defined data types

Always use portable data types

Some programmers may feel that using portable types is less efficient than using the "natural" type. For example, suppose that int is defined to be the most efficient size for the target hardware. That means the "natural" size may be larger than 16 bits and may hold values greater than 32767. Now suppose your compiler uses 32-bit ints and your task needs a range of 0 to 40,000. Should you use an int, since the machine can handle 40,000 efficiently in an int, or should you stick with portable types and use a long?

The answer is that if the machine uses 32-bit ints, it almost certainly has 32-bit longs as well, and the two generate similar if not identical code, so use long. Even if you worry that long may be less efficient on some machine you must support in the future, you should still stick with the portable type.

Data overflow or underflow

Some code looks correct on the surface but fails because of a subtle problem in the implementation, and that is the worst kind of bug. The "plain char" bug is of this nature. The following code, which initializes the lookup table for a standard tolower macro, has such a bug too.

char chToLower[UCHAR_MAX+1];

void BuildToLowerTable(void)    /* ASCII version */
{
    unsigned char ch;

    /* First, set every character to itself */
    for (ch = 0; ch <= UCHAR_MAX; ch++)
        chToLower[ch] = ch;

    /* Now put the lowercase letters into the uppercase slots */
    for (ch = 'A'; ch <= 'Z'; ch++)
        chToLower[ch] = ch + 'a' - 'A';
}

......

#define tolower(ch) (chToLower[(unsigned char)(ch)])

Although the code looks solid, BuildToLowerTable is quite likely to hang the system. Look at the first loop: when will ch be greater than UCHAR_MAX? If you answered "never," you are right. If not, read the explanation below.

Suppose ch equals UCHAR_MAX, so that the loop should be executing for the last time. Just before the final test, ch is incremented to UCHAR_MAX+1, which makes ch overflow to 0. Consequently ch is always less than or equal to UCHAR_MAX, and the machine loops forever.
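
One possible repair, offered only as a sketch and not necessarily the author's, is to give the loop counter a type that can actually hold UCHAR_MAX+1. The version below uses an int counter, which assumes INT_MAX is larger than UCHAR_MAX (true on ordinary machines with 8-bit characters). The macro is renamed TOLOWER here so that it does not clash with the library's tolower.

```c
#include <limits.h>

static char chToLower[UCHAR_MAX + 1];

void BuildToLowerTable(void)    /* ASCII version */
{
    int ch;     /* an int can reach UCHAR_MAX + 1, so the loop test can fail */

    /* First, set every character to itself. Converting values above
     * CHAR_MAX to char is implementation-defined, as in the original. */
    for (ch = 0; ch <= UCHAR_MAX; ch++)
        chToLower[ch] = (char)ch;

    /* Now put the lowercase letters into the uppercase slots */
    for (ch = 'A'; ch <= 'Z'; ch++)
        chToLower[ch] = (char)(ch + 'a' - 'A');
}

#define TOLOWER(ch) (chToLower[(unsigned char)(ch)])
```

After BuildToLowerTable() has run, TOLOWER('A') yields 'a' and characters that are not uppercase letters map to themselves.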

Is that problem obvious from looking at the code?

Variables can also underflow, which creates the same predicament. Below is an implementation of the memchr function, which searches a block of memory for the first occurrence of a character. If the character is found, memchr returns a pointer to it; otherwise it returns the NULL pointer. Like BuildToLowerTable above, this memchr looks correct but is not.

void *memchr(void *pv, unsigned char ch, size_t size)
{
    unsigned char *pch = (unsigned char *)pv;

    while (--size >= 0)
    {
        if (*pch == ch)
            return (pch);
        pch++;
    }

    return (NULL);
}

When does the loop terminate? Only when size is less than 0. But can size be less than 0? No, because size is an unsigned value: when it is 0, the expression --size underflows and becomes the largest unsigned value that size_t can hold. This underflow bug is worse than the one in BuildToLowerTable. If memchr finds the character in the memory block it works correctly, and even if the character is not there the system does not hang: memchr simply keeps searching until it finds the character somewhere and returns a pointer to it. In some applications that can cause a very serious error.

We would like compilers to warn about the "plain char" bug and the two bugs above, but almost no compiler warns about any of them. Programmers therefore have to find overflow and underflow bugs themselves.

However, if you follow the advice in Chapter 4 and step through the code in a debugger, you will catch all three bugs. You will see *pch converted to 0xFFFF before it is compared with 0xFF, ch overflow to 0, and size underflow to its maximum value. Because these bugs are so subtle, you could read the code carefully for hours without spotting them, yet they are easy to see when you watch the data flow while debugging.

Always ask: "Can this variable or expression overflow or underflow?"

Skilled body

Another common overflow example appears in the code below, which converts an integer to its ASCII representation:

void IntToStr(int i, char *str)
{
    char *strDigits;

    if (i < 0)
    {
        *str++ = '-';
        i = -i;    /* make i positive */
    }

    /* Generate the digits in reverse order */
    strDigits = str;
    do
        *str++ = i % 10 + '0';
    while ((i /= 10) > 0);
    *str = '\0';

    ReverseStr(strDigits);    /* put the digits in the right order */
}

If this code runs on a two's-complement machine, it fails when i equals the smallest negative number (for example, -32768 on a 16-bit machine). The reason usually given is that -i in the expression i = -i overflows the range of the int type. The real bug, though, lies in how the programmer implemented the code: the programmer did not implement the design exactly, only approximately.

The design says: "If i is negative, output a minus sign, then convert the unsigned portion of i to ASCII." The code above does not do that. What it actually does is: "If i is negative, output a minus sign, then convert the positive, signed value of i to ASCII." It is the signed arithmetic that causes all the trouble. If the code follows the algorithm exactly and uses unsigned numbers, it works fine. It is also useful to split the code into two functions:

void IntToStr(int i, char *str)
{
    if (i < 0)
    {
        *str++ = '-';
        i = -i;
    }

    UnsToStr((unsigned)i, str);
}

void UnsToStr(unsigned u, char *str)
{
    char *strStart = str;

    do
        *str++ = (u % 10) + '0';
    while ((u /= 10) > 0);
    *str = '\0';

    ReverseStr(strStart);
}

You may wonder why this code works when i is still negated exactly as in the earlier example. The reason is that if i is the smallest negative number, -32768, its two's-complement representation is 0x8000. Negating it by flipping all the bits (0 becomes 1 and 1 becomes 0) and adding 1 gives 0x8000 again, which means -32768 when read as a signed number and 32768 when read as an unsigned number. By definition, you obtain the negative of any two's-complement number by flipping every bit and adding 1. So 0x8000 is the correct negation of -32768, namely 32768, provided it is interpreted as an unsigned number.
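
A variant worth considering, shown here as a sketch rather than as the book's code, does the negation in unsigned arithmetic, so the signed expression -i is never evaluated at all. The ReverseStr helper below is a hypothetical stand-in for the one the text refers to, and the sketch assumes the usual case where unsigned has one more value bit than int.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical helper: reverse a string in place. */
static void ReverseStr(char *str)
{
    size_t left = 0;
    size_t right = strlen(str);     /* one past the last character */

    while (left + 1 < right)
    {
        char chTemp;

        right--;
        chTemp = str[left];
        str[left] = str[right];
        str[right] = chTemp;
        left++;
    }
}

/* Convert an unsigned value to decimal ASCII. */
void UnsToStr(unsigned u, char *str)
{
    char *strStart = str;

    do
        *str++ = (char)((u % 10) + '0');
    while ((u /= 10) > 0);
    *str = '\0';

    ReverseStr(strStart);
}

/* Convert a signed value; the magnitude is computed as an unsigned number. */
void IntToStr(int i, char *str)
{
    unsigned u = (unsigned)i;       /* well defined: reduced modulo UINT_MAX + 1 */

    if (i < 0)
    {
        *str++ = '-';
        u = 0u - u;                 /* unsigned negation wraps; it cannot overflow */
    }

    UnsToStr(u, str);
}
```

Called with a buffer of sufficient size, IntToStr(-123, buf) leaves "-123" in buf, and on a machine with 32-bit ints IntToStr(INT_MIN, buf) leaves "-2147483648".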

At this point the code is correct, but it is not pretty, and it can easily give a false impression. According to the portable types, -32768 is not a valid portable int value, so you can remove all the confusion by putting an assertion at the appropriate place in IntToStr:

void IntToStr(int i, char *str)
{
    /* i out of range? Use LongToStr ... */
    ASSERT(i >= -32767 && i <= 32767);

This assertion avoids the problems tied to particular number representations and nudges other programmers toward writing more portable code. In any case, remember:

Implement your design as accurately as possible; an approximate implementation is likely to be buggy.

Each function should do its one task

I once reviewed the code for Character Windows, a window-like library designed for Microsoft's character-based DOS applications. I did so because the PC-Word and PC-Works groups, which both used the library, felt that its code was bulky, slow, and unstable. As soon as I started reviewing the code, I found that the programmers had not implemented their original design, and that violated yet another guideline for writing bug-free code.

First, some background.

The basic design of Character Windows is simple: the user views the video display as a collection of windows, each of which can have its own subwindows. The design has a root window that represents the entire display and that has subwindows such as the menu bar, pull-down menus, application document windows, dialogs, and so on. Each of those subwindows may have subwindows of its own. For example, a dialog window may have subwindows for its OK and Cancel buttons, and it may contain a list box window, which in turn may have a subwindow for its scroll bar.

To represent the window hierarchy, Character Windows uses a binary tree. One branch points to subwindows, called "children"; the other branch points to windows with the same parent, called "siblings":

typedef struct WINDOW
{
    struct WINDOW *pwndChild;      /* NULL if there are no children */
    struct WINDOW *pwndSibling;    /* NULL if there are no siblings */
    char *strWndTitle;
    /* ... */
} window;    /* Naming: wnd, *pwnd */

You can open any algorithms book and find efficient ways to handle binary trees, so I was a bit surprised when I read the Character Windows code that inserts a child window into the tree. The code looked like this:

/* pwndRootChildren points to the list of top-level windows,
 * such as the menu bar and the main document windows
 */
static window *pwndRootChildren = NULL;

void AddChild(window *pwndParent, window *pwndNewBorn)
{
    /* New windows may have children but not siblings ... */
    ASSERT(pwndNewBorn->pwndSibling == NULL);

    if (pwndParent == NULL)
    {
        /* Add the window to the top-level list */
        pwndNewBorn->pwndSibling = pwndRootChildren;
        pwndRootChildren = pwndNewBorn;
    }
    else
    {
        /* If it is the parent's first child, start a chain;
         * otherwise add it to the end of the existing sibling chain
         */
        if (pwndParent->pwndChild == NULL)
            pwndParent->pwndChild = pwndNewBorn;
        else
        {
            window *pwnd = pwndParent->pwndChild;

            while (pwnd->pwndSibling != NULL)
                pwnd = pwnd->pwndSibling;
            pwnd->pwndSibling = pwndNewBorn;
        }
    }
}

Although the window structure was designed as a binary tree, it was not implemented that way. Since the root window (the one representing the entire display) has no title and is never moved, hidden, or deleted, the only meaningful field in its window structure is pwndChild, which points to the menu bar and the application windows. So somebody decided that a full window structure was a waste and replaced the wndRoot structure with a simple pointer to the top-level windows.

Replacing wndRoot with a pointer may save a few bytes of data space, but the cost in code is huge. Routines such as AddChild, instead of handling a simple binary tree, must handle two different data structures: the linked list of window trees at the root, and the window trees themselves. Worse, every routine that takes a window pointer as a parameter has to check for the special NULL pointer that stands for the display window, and there are many such routines. No wonder the PC-Word and PC-Works groups were worried about the bulk of the code.

I bring up AddChild not to discuss design but to point out that this implementation violates at least three guidelines for writing bug-free code. Two have already been covered: "Don't accept parameters with special meanings," such as the NULL pointer, and "Implement your design, not something that approximates it." The third, new one is: make each function carry out its task just once.

What does this new guideline mean? AddChild has one task, to add a child window to an existing window, yet the code above contains three separate insertion routines. Common sense says that if three pieces of code do a job that one could do, bugs are more likely. It is like brain surgery: if one operation will do, you don't want three. The same goes for code. If you find that a function you are writing performs its task more than once, stop and ask yourself whether a single piece of code could do the job.

Sometimes a function really does need to perform its task twice, as in the fast version of memset in Chapter 2 (look back at it: it has two separate fill loops, one fast and one slow). If you can justify your reasons, you may break this guideline. The first step in improving AddChild is easy: remove the "optimization" and implement the original design. Replace pwndRootChildren with a pointer, pwndDisplay, to a window structure that represents the display, and pass pwndDisplay to AddChild instead of NULL. The special code for the root window is then unnecessary:

/* The root window is allocated during program initialization,
 * and pwndDisplay is set to point to it
 */
window *pwndDisplay = NULL;

void AddChild(window *pwndParent, window *pwndNewBorn)
{
    /* New windows may have children but not siblings ... */
    ASSERT(pwndNewBorn->pwndSibling == NULL);

    /* If it is the parent's first child, start a chain;
     * otherwise add it to the end of the existing sibling chain
     */
    if (pwndParent->pwndChild == NULL)
        pwndParent->pwndChild = pwndNewBorn;
    else
    {
        window *pwnd = pwndParent->pwndChild;

        while (pwnd->pwndSibling != NULL)
            pwnd = pwnd->pwndSibling;
        pwnd->pwndSibling = pwndNewBorn;
    }
}

This code not only improves AddChild (and every other function that works with the tree), it also fixes a bug in the original version, which inserted root-level windows at the front of their list rather than at the end.

Implement "the task" just once

Why are windows hierarchical?

One of the most important reasons for a hierarchical window system is that it simplifies moving, hiding, and deleting windows. If you moved a dialog window, would you want its OK and Cancel buttons to stay where they were? If you hid a window, would you want its subwindows to remain visible? Obviously not. With support for subwindows, you can say "move this window" and all of its related windows follow along.

Avoid needless ifs, &&s, and "but" branches

The last version of AddChild is better than the one before, but it still has two pieces of code doing the same job. The presence of an if statement signals that the same task may be implemented twice, albeit in different ways, and it should set off an alarm in your mind. True, there are legitimate uses of if statements to perform conditional operations, but in many cases an if is the result of a sloppy design or a careless implementation, because it is easier to pile exceptions onto a design than to stop and derive a model that needs none.

For example, there are two ways to walk the sibling chain of a window structure. One is to loop with a pointer to a window structure, moving from window to window; that is a window-centered algorithm. The other is to loop with a pointer to a pointer, moving from pointer to pointer; that is a pointer-centered algorithm. The AddChild above uses the window-centered algorithm.

Because the current version of AddChild scans the sibling list in order to append the new window, it needs special handling for the first pointer. Appending a window really means storing a pointer to the new window in the "next window pointer" field of the previous window. Notice that as you move from window to window, the pointer in the previous window may be either a sibling pointer or a parent's child pointer; the special handling makes sure the right one is modified. With the pointer-centered approach, however, you always point at the "next window pointer" itself. Whether that pointer is a child pointer or a sibling pointer does not matter, so no special handling is needed. To make this concrete, look at the following code.

void AddChild(window *pwndParent, window *pwndNewBorn)
{
    window **ppwndNext;

    /* New windows may have children but not siblings ... */
    ASSERT(pwndNewBorn->pwndSibling == NULL);

    /* Use a pointer-centered algorithm: set ppwndNext to point
     * at pwndParent->pwndChild, since pwndParent->pwndChild is
     * the first "next sibling pointer" in the chain
     */
    ppwndNext = &pwndParent->pwndChild;
    while (*ppwndNext != NULL)
        ppwndNext = &(*ppwndNext)->pwndSibling;

    *ppwndNext = pwndNewBorn;
}

This code is a variation of the classic dummy-header linked-list algorithm, which is well known because it needs no special code to handle empty lists.
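
The same pointer-centered idea also removes the special cases from unlinking a window. The RemoveChild function below is hypothetical and is not part of the library described in the text. The window structure is cut down to the two link fields, and the standard assert stands in for the author's ASSERT macro.

```c
#include <assert.h>
#include <stddef.h>

typedef struct WINDOW
{
    struct WINDOW *pwndChild;      /* NULL if there are no children */
    struct WINDOW *pwndSibling;    /* NULL if there are no siblings */
} window;

/* Append pwndNewBorn to the end of pwndParent's chain of children. */
void AddChild(window *pwndParent, window *pwndNewBorn)
{
    window **ppwndNext = &pwndParent->pwndChild;

    assert(pwndNewBorn->pwndSibling == NULL);

    while (*ppwndNext != NULL)
        ppwndNext = &(*ppwndNext)->pwndSibling;

    *ppwndNext = pwndNewBorn;
}

/* Unlink pwndGone from pwndParent's chain of children.
 * Returns 1 if the window was found and removed, 0 otherwise. */
int RemoveChild(window *pwndParent, window *pwndGone)
{
    window **ppwndNext = &pwndParent->pwndChild;

    /* Walk the "next window pointers" until one points at pwndGone */
    while (*ppwndNext != NULL && *ppwndNext != pwndGone)
        ppwndNext = &(*ppwndNext)->pwndSibling;

    if (*ppwndNext == NULL)
        return 0;

    /* First child or later sibling, the same assignment unlinks it */
    *ppwndNext = pwndGone->pwndSibling;
    pwndGone->pwndSibling = NULL;
    return 1;
}
```

Removing the first child and removing the last sibling go through exactly the same code path, which is the point of walking pointers instead of windows.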

Don't worry that this AddChild violates the earlier guideline about implementing the design exactly rather than approximately. The code may not follow the design the way people usually picture it, but it does implement the design. It is like looking at an eyeglass lens: is it convex or concave? Either answer can be right, depending on how you look at it. For AddChild, looking at it with a pointer-centered algorithm means you don't have to write code for special cases.

Don't worry about the efficiency of this last version of AddChild either. It generates much less code than any of the earlier versions, and even the loop itself may well compile to better code than before. Don't assume the loop is more expensive just because it looks more complicated (compile both versions and see).

Get rid of extraneous if statements

The "?:" operator is an if statement too

C programmers must face it: the "?:" operator is just another way of writing an if-else statement. We often see programmers use "?:" where they would never explicitly write the equivalent if-else. One example comes from Excel's dialog handling code, which contains the following function for determining the next state of a check box:

/* uCycleCheckBox -- return the next state for a check box
 *
 * Given the current setting, uCur, return what the next state
 * of the check box should be. This function handles both
 * two-state check boxes that toggle between 0 and 1 and
 * three-state check boxes that cycle through 2, 3, 4, 2 ...
 */
unsigned uCycleCheckBox(unsigned uCur)
{
    return ((uCur <= 1) ? (1 - uCur) : (uCur == 4) ? 2 : (uCur + 1));
}

I have worked with programmers who would not think twice about writing uCycleCheckBox with nested "?:" operators but who would switch to COBOL before signing their names to the explicit if version below, even though all but the very best compilers generate almost identical code for the two functions.

unsigned uCycleCheckBox(unsigned uCur)
{
    unsigned uRet;

    if (uCur <= 1)
    {
        if (uCur != 0)    /* handle the 0, 1, 0 ... cycle */
            uRet = 0;
        else
            uRet = 1;
    }
    else
    {
        if (uCur == 4)    /* handle the 2, 3, 4, 2 ... cycle */
            uRet = 2;
        else
            uRet = uCur + 1;
    }

    return (uRet);
}

Some compilers may indeed generate slightly better code for the nested "?:" version, but not much better. If the compiler for your target is a good one, try the code below and compare.

unsigned uCycleCheckBox(unsigned uCur)
{
    unsigned uRet;

    if (uCur <= 1)
    {
        uRet = 0;    /* handle the 0, 1, 0 ... cycle */
        if (uCur == 0)
            uRet = 1;
    }
    else
    {
        uRet = 2;    /* handle the 2, 3, 4, 2 ... cycle */
        if (uCur != 4)
            uRet = uCur + 1;
    }

    return (uRet);
}

Look closely at the three versions of uCycleCheckBox. You know what they are supposed to do, but is it obvious? If the input is 3, what is returned? Can you easily tell that the answer is 4? I can't. For functions that implement two simple cycles, which ought to be perfectly clear, they are surprisingly hard to follow.

The problem with "?:" is that it is so short and easy to use, and looks so much like the ideal way to generate efficient code, that programmers stop looking for better solutions. Worse, programmers convert if versions into "?:" versions to get a "better" solution when the "?:" version is actually no better. It is like changing a 100-dollar bill into 10,000 pennies in the hope of having more money: you have no more than before, and it is less convenient to use. If programmers spent the time looking for an alternative algorithm rather than implementing the same algorithm in a slightly different way, they might come up with this more direct implementation:

unsigned uCycleCheckBox(unsigned uCur)
{
    ASSERT(uCur >= 0 && uCur <= 4);

    if (uCur == 1)    /* restart the first cycle? */
        return (0);

    if (uCur == 4)    /* restart the second cycle? */
        return (2);

    return (uCur + 1);    /* no special handling needed */
}

Or someone might propose the following table-driven solution:

unsigned uCycleCheckBox(unsigned uCur)
{
    static const unsigned uNextState[] = { 1, 0, 3, 4, 2 };

    ASSERT(uCur >= 0 && uCur <= 4);

    return (uNextState[uCur]);
}

By avoiding "?:", you end up with a better algorithm, not merely a better-looking implementation. Compare the table version with the earlier versions. Which is easiest to understand? Which generates the best code? Which is most likely to be right the first time? That should tell you something.
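
Because the function has only five legal inputs, the claim that the direct version and the table version are equivalent can be checked exhaustively. The sketch below gives the two versions distinct, made-up names (uCycleIf and uCycleTable) so that they can live in one file, and leaves out the assertions for brevity.

```c
/* Direct version: two special cases, otherwise step to the next state. */
unsigned uCycleIf(unsigned uCur)
{
    if (uCur == 1)      /* restart the first cycle? */
        return 0;

    if (uCur == 4)      /* restart the second cycle? */
        return 2;

    return uCur + 1;
}

/* Table-driven version: the next state is looked up directly. */
unsigned uCycleTable(unsigned uCur)
{
    static const unsigned uNextState[] = { 1, 0, 3, 4, 2 };

    return uNextState[uCur];
}

/* Returns 1 if the two versions agree for every legal state, 0 otherwise. */
int VersionsAgree(void)
{
    unsigned uCur;

    for (uCur = 0; uCur <= 4; uCur++)
    {
        if (uCycleIf(uCur) != uCycleTable(uCur))
            return 0;
    }

    return 1;
}
```

With the table version, answering "what follows state 3?" is a matter of reading the fourth entry of uNextState.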

Avoid using nested "?:" Operators

Eliminate redundancy in the code

Sometimes an implementation must support a special case. To avoid bugs, isolate the code that handles the special case so that it is not scattered throughout the function; then maintenance programmers will not introduce bugs by accident when they change the code later.

Two versions of IntToStr have already been given. Here is the IntToStr code that often appears in C programming textbooks (where it is called itoa):

void IntToStr(int i, char *str)
{
    int iOriginal = i;
    char *pch;

    if (iOriginal < 0)
        i = -i;    /* make i positive */

    /* Build the string in reverse order */
    pch = str;
    do
        *pch++ = i % 10 + '0';
    while ((i /= 10) > 0);

    if (iOriginal < 0)    /* don't forget the minus sign */
        *pch++ = '-';

    *pch = '\0';

    ReverseStr(str);    /* reverse the string into the right order */
}

Notice that the code contains two if statements that test for the same special case. Since it is easy to combine the two bodies under a single if statement, we have to ask, "Why wasn't it done that way?"

Sometimes the repeated test appears not in an if statement but in the condition of a for or while loop. For example, here is another implementation of memchr:

void *memchr(void *pv, unsigned char ch, size_t size)
{
    unsigned char *pch = (unsigned char *)pv;
    unsigned char *pchEnd = pch + size;

    while (pch < pchEnd && *pch != ch)
        pch++;

    return ((pch < pchEnd) ? pch : NULL);
}

Compare it with the following version of memchr:

void *memchr(void *pv, unsigned char ch, size_t size)
{
    unsigned char *pch = (unsigned char *)pv;
    unsigned char *pchEnd = pch + size;

    while (pch < pchEnd)
    {
        if (*pch == ch)
            return (pch);
        pch++;
    }

    return (NULL);
}

Which one looks better? The first version compares pch with pchEnd twice; the second compares them only once. Which is clearer? And more important, which is more likely to be correct?

Since the second version performs the range check only in the while condition, it is easier to understand and it implements exactly what the function is meant to do. The only advantage of the first version is that it saves some paper when you print the program.

Handle each special case just once

Don't take needless risks

Is the memchr given above correct? Did you notice that both versions have the same subtle bug? Hint: if pv points to the last 72 bytes of memory and size is 72, what range of memory does memchr search? If your answer was "all of memory, over and over," you are right. Because it uses a risky idiom, memchr falls into an infinite loop.

A risky idiom is a phrase or expression that appears to work correctly but in fact fails in certain special cases. C is a language with many such idioms, and the best policy is to avoid them whenever possible. The risky idiom in memchr is:

pchEnd = pch + size;
while (pch < pchEnd)
    ...

Here pchEnd points to the location just past the last character to be searched, which makes it convenient to use in the while expression. That works fine as long as such a location exists, but if the block being searched ends at the very end of memory, the location does not exist. (One exception: in ANSI C you can always compute the address of the first element past the end of an array; ANSI C guarantees this.)

As a first attempt at fixing the bug, you might rewrite the code as follows:

pchEnd = pch + size - 1;
while (pch <= pchEnd)
    ...

But this still does not work. Remember the UCHAR_MAX overflow in BuildToLowerTable earlier? The same bug is here. pchEnd may now point to a valid memory location, but since pch overflows when it is incremented past pchEnd, the loop never terminates.

When you can use a counter instead of a pointer, using the counter as the control expression is a safe way to cover a range:

void *memchr(void *pv, unsigned char ch, size_t size)
{
    unsigned char *pch = (unsigned char *)pv;

    while (size-- > 0)
    {
        if (*pch == ch)
            return (pch);
        pch++;
    }

    return (NULL);
}

This code is not only correct, it may also generate better code than the earlier versions because it does not have to initialize pchEnd. Because size must be copied before it is decremented so that the copy can be compared with 0, people usually assume that the size-- version is larger and slower than the pchEnd version. In fact, with many compilers the size-- version turns out slightly smaller and faster. It depends on how the compiler uses registers and, for 80x86 compilers, on which memory model is used. In any case, the differences in size and speed between the size-- version and the pchEnd version are tiny and not worth worrying about.

Here is another idiom, one that has in fact already been mentioned. Some programmers may be tempted to rewrite the loop expression to use --size instead:

while (--size >= 0)
    ......

The rationale for this change is that the expression above never produces worse code than the previous expression, and in some cases it produces slightly better code. Its only problem is that, if you use it blindly, bugs will swarm to your code like flies to livestock.

Why?

If size is an unsigned value (as in memchr), then by definition it is always greater than or equal to 0, so the loop runs forever and the expression cannot work.

If size is a signed value, the expression still does not work properly. If size is an int and enters the loop with the smallest negative value, INT_MIN, it is decremented first and underflows, so the loop executes a huge number of times.

By contrast, "size-- > 0" works correctly no matter how size is declared. The difference is small, but it matters.
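The difference between the two control expressions can be seen in a small sketch. The helper names and the iteration cap below are hypothetical, introduced only so that the never-ending loop can be observed safely; many compilers will warn that the second comparison is always true, which is exactly the point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the iterations of "while (size-- > 0)". */
unsigned long CountPostDecrement(size_t size)
{
    unsigned long cIterations = 0;

    while (size-- > 0)
        cIterations++;
    return cIterations;
}

/* Hypothetical helper: counts the iterations of "while (--size >= 0)".
 * For an unsigned size the test is always true, so the loop is cut off
 * after cCap iterations purely to let the demonstration terminate.
 */
unsigned long CountPreDecrement(size_t size, unsigned long cCap)
{
    unsigned long cIterations = 0;

    assert(cCap > 0);
    while (--size >= 0)
    {
        if (++cIterations >= cCap)
            break;
    }
    return cIterations;
}
```

With a size of 3, the first helper iterates three times. The second helper stops only because of the cap, whatever size it is given.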

The only reason programmers use "--size >= 0" is speed. Look at it closely: if you really have a speed problem, this improvement is like mowing the lawn with nail clippers. It can be done, but it achieves almost nothing. And if you do not have a speed problem, why take the risk? Just as there is no need for every blade of grass in the lawn to be the same length, there is no need for every line of code to be optimally efficient. What matters is the overall effect.

To some programmers, passing up any chance to gain efficiency seems almost criminal. But as you read this book you will come to this view: even if efficiency is slightly lower, you should use safe designs and implementations to systematically reduce risk. Users will not notice whether a loop somewhere takes a few extra cycles, but they will notice the bugs you accidentally introduce while trying to save those cycles. In investment terms: the return does not justify the risk.

Using shifts to multiply, divide, or take a remainder by a power of 2 is another risky idiom. It belongs to the "wasted efficiency" class. For example, the fast version of memset given in Chapter 2 has the following lines:

    pb = (byte *)longfill((long *)pb, l, size / 4);
    size = size % 4;

You can be sure that some programmers, reading the code above, will think "How inefficient!" They might write it this way instead:

    pb = (byte *)longfill((long *)pb, l, size >> 2);
    size = size & 3;

Shifts are faster than division or remainder on most machines, and that much is true. However, dividing an unsigned value (such as size) by a power of 2, or taking its remainder, is something compilers already optimize, even ordinary commercial compilers, so there is no need to hand-optimize these unsigned expressions.

So what about signed expressions? Is explicit optimization worth it there? Yes and no.

Suppose you have a signed expression:

    midpoint = (upper + lower) / 2;

When a signed value is negative, shifting it gives a different result from signed division, so a compiler for a two's-complement machine will not optimize the division into a shift. If you know that upper + lower in the expression above is always positive, you can rewrite the expression with a shift to get better code:

    midpoint = (upper + lower) >> 1;

So optimizing a signed expression can be worthwhile. But is a shift the best way to do it? No. The cast shown below works just as well, and it is much safer than the shift. Try it on your compiler:

    midpoint = (unsigned)(upper + lower) / 2;

The code above does not tell the compiler what to do; it gives the compiler the information it needs. By telling the compiler that the result is unsigned, you allow it to use a shift. Now compare the two optimizations. Which is easier to understand? Which is more portable? Which is more likely to be right the first time?
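As a rough illustration of the cast approach, the sketch below wraps it in a function. The names MidpointCast and HalfSigned are hypothetical, and the sketch assumes that both bounds are non-negative and that their sum fits in an int.

```c
#include <assert.h>

/* Hypothetical helper: midpoint of two non-negative values.
 * The cast states that the sum is unsigned, which leaves the compiler
 * free to choose a shift on its own.
 */
int MidpointCast(int lower, int upper)
{
    assert(lower >= 0 && upper >= lower);
    return (int)((unsigned)(upper + lower) / 2);
}

/* Hypothetical helper: signed division truncates toward zero in C99
 * and later, which is why a compiler cannot blindly replace it with
 * a right shift when the value may be negative.
 */
int HalfSigned(int value)
{
    return value / 2;
}
```

The division stays readable as a division, and the assertion documents the assumption that makes the cast safe.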

Over the years I have found many bugs in which a programmer used a shift to divide a signed value that was not guaranteed to be positive. I have found shifts in the wrong direction, shifts by the wrong number of bits, and even precedence bugs introduced by carelessly converting "a = b + c / 4" to "a = b + c >> 2". But I have never found a bug caused by using '/' and '4' to divide by 4.

C has many other risky idioms. The best way to find the ones you habitually use is to examine every bug you have had and ask yourself, "How could I have avoided this bug?" Then build a personal list of risky idioms to avoid.

Avoid risky language idioms

Don't Overestimate the Cost

In 1984, when Apple introduced the Macintosh, Microsoft was one of the few companies with products for it. Having products ready for another company's machine was clearly both a benefit and a burden for Microsoft. It meant that as the Macintosh itself evolved, Microsoft had to keep developing its products to match, so Microsoft's programmers often had to use work-arounds to get their products running on the Macintosh. Then Apple upgraded the Macintosh operating system for the first time, and early testing showed that Microsoft's products did not work properly with the new version. To simplify matters, Apple asked Microsoft to remove the outdated work-arounds so that the products would conform to the latest operating system.

However, removing the work-around in Excel meant rewriting a hand-optimized assembly language routine and adding 12 cycles to the code. Because the routine was critical, there was a debate over whether to rewrite it. Some people felt that Excel should conform to Apple's request; others wanted to keep the speed.

Finally, a programmer put a counter in the function, ran Excel for three hours, and observed how often the function was called. It was called about 76,000 times, so the change would stretch the three hours to only three hours and 0.1 seconds.

The lesson of this example: local efficiency is not worth the fight. If you care about efficiency, concentrate on global efficiency and on algorithms, where you will actually see the results of your effort.
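The arithmetic behind the 0.1 second figure can be checked with a short sketch. The sidebar does not state the processor speed, so the clock rate passed in below is an assumption (roughly 7.8 MHz, the clock of the original Macintosh's 68000), and ExtraSeconds is a hypothetical helper.

```c
#include <assert.h>

/* Hypothetical helper: seconds added when every call costs extra cycles. */
double ExtraSeconds(unsigned long cCalls, unsigned long cCyclesPerCall,
                    double clockHz)
{
    assert(clockHz > 0.0);
    return (double)cCalls * (double)cCyclesPerCall / clockHz;
}
```

Calling ExtraSeconds(76000UL, 12UL, 7833600.0) gives a little over a tenth of a second: 76,000 calls at 12 cycles each is 912,000 cycles in total.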

Inconsistency Is an Obstacle to Correct Code

Look at the following code, which contains the simplest kind of bug, a precedence error:

    word = high << 8 + low;

The code was meant to combine two 8-bit bytes into a 16-bit word, but because the + operator has higher precedence than the shift operator, the code actually shifts high by 8 + low bits. Programmers do not normally mix shift operators with arithmetic operators. If the job can be done using only bitwise operators or only arithmetic operators, why mix a shift operator with an arithmetic operator?

    word = high << 8 | low;     /* shift solution */

    word = high * 256 + low;    /* arithmetic solution */

Are these forms hard to understand? Are they inefficient? Of course not. The two solutions look very different, but both are correct.
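The three forms can be compared directly. In this sketch the function names are hypothetical, and the buggy form is written with explicit parentheses to show how the compiler actually parses "high << 8 + low"; the assertion keeps the shift count small enough to be well defined.

```c
#include <assert.h>

/* Shift solution: bitwise operators only. */
unsigned CombineShift(unsigned high, unsigned low)
{
    return high << 8 | low;
}

/* Arithmetic solution: arithmetic operators only. */
unsigned CombineArith(unsigned high, unsigned low)
{
    return high * 256 + low;
}

/* What "high << 8 + low" really means: the addition happens first. */
unsigned CombineAsParsed(unsigned high, unsigned low)
{
    assert(low < 8);    /* keep the shift count in range for this sketch */
    return high << (8 + low);
}
```

For high = 0x12 and low = 0x34, both correct forms give 0x1234. For high = 0x12 and low = 1, the mis-parsed form gives 0x2400 rather than 0x1201.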

If a programmer uses only one class of operator in an expression, the chance of getting it wrong is small, because the precedence of operators within one class is easy to master. There are exceptions, of course, but as a rule it holds. How many programmers have meant to add first and then divide, yet written the following expression?

    midpoint = upper + lower / 2;

Since programmers have taken courses in logic and are familiar with functions such as f(A,B,C) = AB + C, remembering the precedence of the bitwise operators presents no difficulty. Most programmers know that the order, from high to low, is "~", "&", "|". It is easy to place the shift operators between "~" and "&", because a shift does not bind as tightly as "~" (think of ~a << 2) but has higher precedence than "&" (think of a shift as exponentiation and "&" as multiplication).

Programmers may know the precedence of every operator perfectly well and still get into trouble when they mix operators of different classes. So the first guideline is: if possible, do not mix operators of different classes. The second guideline is: if you must mix them, use parentheses to isolate them.

You have seen how the first guideline protects code from bugs. Now look at the while loop below, an example given earlier, to see how the second guideline prevents a bug:

    while (ch = getchar() != EOF)
        ...;

The code above mixes the assignment operator with a comparison operator and so introduces a precedence bug. You could fix the bug by rewriting the loop so that it does not mix the operators, but the result is ugly:

    do
    {
        ch = getchar();
        if (ch == EOF)
            break;
        ...;
    } while (TRUE);

In this case it is better to break the first guideline and apply the second, separating the operators with parentheses:

    while ((ch = getchar()) != EOF)
        ...;

Don't mix operators of different classes unless you must; if you must mix them, use parentheses to isolate them
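The effect of the missing parentheses in the getchar loop can be reproduced without reading real input. In the sketch below, NextChar and END_OF_INPUT are hypothetical stand-ins for getchar and EOF that read from a string, so the result is repeatable; the second function spells out, with explicit parentheses, what the unparenthesized loop actually computes.

```c
#include <assert.h>
#include <stddef.h>

#define END_OF_INPUT (-1)    /* hypothetical stand-in for EOF */

/* Hypothetical reader: returns the next character of *ps, or END_OF_INPUT. */
int NextChar(const char **ps)
{
    assert(ps != NULL && *ps != NULL);
    if (**ps == '\0')
        return END_OF_INPUT;
    return (unsigned char)*(*ps)++;
}

/* Parenthesized form: ch receives each character in turn. */
long SumParenthesized(const char *s)
{
    long sum = 0;
    int ch;

    while ((ch = NextChar(&s)) != END_OF_INPUT)
        sum += ch;
    return sum;
}

/* How the unparenthesized form is parsed: ch receives the result of
 * the comparison, so it is 1 for every character read.
 */
long SumAsParsed(const char *s)
{
    long sum = 0;
    int ch;

    while ((ch = (NextChar(&s) != END_OF_INPUT)) != 0)
        sum += ch;
    return sum;
}
```

For the input "AB", the parenthesized loop sums the two character codes, while the mis-parsed loop merely counts to 2.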

Don't Consult the Precedence Table

When deciding whether to insert parentheses, some programmers first consult a precedence table and add the parentheses only if the table says they are necessary. Remind such a programmer: "If you have to look it up to determine the precedence, the expression is too complicated; simplify it." That means inserting parentheses even where they are not strictly needed. The result is not only correct but obviously correct, so anyone can judge the precedence without consulting a table.

Avoid Associating with Failures

In the previous chapter we discussed designing functions so that they do not return error values, to keep programmers from mishandling or ignoring those values (for example tolower, which returned -1 when ch was not an uppercase character). This chapter turns the topic around: don't call functions that return errors, and you can never mishandle or overlook an error condition returned by someone else's function. Sometimes you have no choice but to call such a function; in that case, make sure the error handling code is exercised under your debugging system so that you know it works correctly.

It is worth stressing, though, that if you handle the same error condition over and over throughout your code, you should isolate the error handling. The simplest method, one every programmer knows, is to put the error handling in a single subroutine, and that works well. But in some cases you can do better.

For example, Character Windows has code in six or seven places that renames a window, as follows:

    if (fResizeMemory(&pwnd->strWndTitle, strlen(strNewTitle) + 1))
        strcpy(pwnd->strWndTitle, strNewTitle);
    else
        /* Unable to allocate space for the window title... */

If there is enough memory to hold the new title, the code above renames the window. If there is not, it keeps the window's current title and tries to handle the error. The question is, how should the error be handled? Alert the user? Silently keep the old title? Copy as much of the new title as will fit over the current one? None of these solutions is ideal, especially when the code is part of a general-purpose subroutine.

This is just one of many situations in which you do not want code to fail. Renaming a window should always succeed.

The problem above is that there might not be enough memory to hold the new window title. But the problem is easy to solve if you are willing to over-allocate the title storage. For example, in a typical Character Windows application only a few windows ever need to be renamed, and their titles do not take much memory even at maximum length, so you can allocate enough space for the longest possible title instead of only the space required by the current title. Renaming a window then becomes this simple code:

    strcpy(pwnd->strWndTitle, strNewTitle);

Better still, you can hide this in a RenameWindow function and use assertions to verify that the allocated title storage is large enough to hold any possible title:

    void RenameWindow(window *pwnd, char *strNewTitle)
    {
        ASSERT(fValidWindow(pwnd));
        ASSERT(strNewTitle != NULL);
        ASSERT(fValidPointer(pwnd->strWndTitle, strlen(strNewTitle) + 1));

        strcpy(pwnd->strWndTitle, strNewTitle);
    }

The drawback of this method is that memory is wasted when the title storage is over-allocated. At the same time, because no error handling code is needed, code space is saved. The question now is to weigh data space against code space and decide which matters more in your particular situation. If your program has thousands of windows that can be renamed, you probably would not want to over-allocate the window title storage.

Avoid calling functions that return errors
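A self-contained sketch of the over-allocation idea might look like the following. The window structure and the MAX_TITLE_LEN limit are assumptions made for illustration (the text does not give the real structure), and standard assert stands in for the book's ASSERT macro.

```c
#include <assert.h>
#include <string.h>

#define MAX_TITLE_LEN 63    /* hypothetical longest title, in characters */

/* Hypothetical window record: the title storage is always large enough
 * for the longest title, so renaming never needs to allocate.
 */
typedef struct
{
    char strWndTitle[MAX_TITLE_LEN + 1];
} window;

void RenameWindow(window *pwnd, const char *strNewTitle)
{
    assert(pwnd != NULL);
    assert(strNewTitle != NULL);
    assert(strlen(strNewTitle) <= MAX_TITLE_LEN);

    strcpy(pwnd->strWndTitle, strNewTitle);
}
```

Because nothing in RenameWindow can fail, its callers need no error handling at all. The price is the unused bytes in every window whose title is shorter than the maximum.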

Summary

By this point the chapter's message, that programming is a "risky business", should be clear. All of the points in this chapter focus on how to convert risky coding practices into ones that are comparable in size and speed but less error-prone.

But don't stop at the points given in this chapter. Keep developing your own ideas from practice, and hold strictly to these principles when you code. Have you thought about every one of your coding habits? Do you use a particular habit only because you have seen other programmers use it? Novice programmers often regard dividing with a shift as a "trick", while experienced programmers consider it perfectly obvious and beyond question. Which of them is right?

Key points:

• Choose data types carefully. Although the ANSI standard requires every implementation to support char, int, long, and so on, it does not define these types precisely. To avoid bugs, rely only on the properties that the ANSI standard guarantees.

• Because code may run on less-than-ideal hardware, an algorithm can be correct and still give wrong results. Always check whether your calculations and tests can overflow or underflow the range of their data types.

• When you implement a design, implement it exactly. An implementation that only approximates the design is an easy way to introduce bugs.

• Every function should have a single, strictly defined task, and beyond that, only one way of accomplishing the task. If the same code executes regardless of the input, the chance of having undiscovered bugs drops sharply.

• An if statement is a warning sign that you may be doing more work than necessary. Strive to eliminate every unnecessary if statement in your code by asking yourself, "How can I change the design to remove this special case?" Sometimes you must change the data structures, and sometimes you must change the way you look at the problem, just as a lens can be seen as either convex or concave.

• Remember that if statements are sometimes hidden in the control expressions of while and for loops. The "?:" operator is another form of if statement.

• Beware of risky language idioms, and look for similar but safer ones. Be especially wary of idioms that appear to produce better code, because such implementations rarely have a noticeable effect on overall efficiency but do add risk.

• When you write an expression, try not to mix operators of different classes. If you must mix them, use parentheses to separate them.

• The special case among special cases is the error condition. If possible, avoid calling functions that can fail. If you must call a function that can return an error, localize the error handling; that increases your chance of finding bugs in the error handling code.

• In some cases you can eliminate general error handling code by guaranteeing that the operation cannot fail. That may mean handling the error once, during initialization, or fundamentally changing the design.

Exercises:

1) What is the portable range of a "plain" 1-bit bit field?

2) If Boolean values are represented with "plain" 1-bit bit fields, what does that imply for functions that return a Boolean value?

3) From the second version of AddChild through the final version, the code uses the global variable pwndDisplay to point to an allocated window structure that represents the entire display. You could instead declare a global window structure, wndDisplay. That would work, so why not do it?

4) Suppose a programmer proposes that, to improve efficiency, the following loop

    while (expression)
    {
        A;
        if (f)
            B;
        else
            C;
        D;
    }

be rewritten as:

    if (f)
        while (expression)
        {
            A;
            B;
            D;
        }
    else
        while (expression)
        {
            A;
            C;
            D;
        }

A and D above stand for groups of statements. The second version is faster, but what risk does it carry compared with the first?

5) If you read the ANSI standard, you will find functions that take several parameters of nearly identical form. For example:

    int strcmp(const char *s1, const char *s2);

Why is such a function risky? How can the risk be eliminated?

6) We have seen that the following loop condition is risky:

    while (pch <= pchEnd)

Why is the similar countdown loop below risky as well?

    while (pch-- >= pchStart)

7) Some programmers use the following shortcuts to improve efficiency or brevity. Why should you avoid them?

a) Using printf(str) instead of printf("%s", str);

b) Using f = 1 - f instead of f = !f;

c) Using

    int ch;    /* ch must be an int */
    ...
    ch = *str++ = getchar();
    ...

instead of two separate assignment statements.

8) uCycleCheckBox, tolower, and the disassembler in Chapter 2 all use table-driven algorithms. What are the advantages and the risks of using tables?

9) Suppose your compiler does not automatically use shifts for unsigned arithmetic by powers of 2. Apart from the risk and portability problems, why should you still avoid using shifts and "and" operations explicitly?

10) An important principle in programming is: never lose the user's data. Suppose you are working on the WordSmasher project and need to write the "save file" routine, which must allocate a temporary data buffer in order to save the user's file. The trouble is that if you cannot allocate the buffer, you cannot save the file, and that violates the principle. How can you guarantee that the user's file is saved?

Project:

List all the risky language idioms you can think of in a table, together with the advantages and disadvantages of each. Then, for each item in the table, decide under what circumstances you would be willing to take the risk and use the idiom.

Chapter 7 Treacheries of the Trade

When you write a novel, you want every page to grip your readers: to excite them, surprise them, and keep them in suspense. You never want to bore them, so you pepper every page with vivid scenes. If the novel says "The criminal walked up to Joe and stabbed him," readers will fall asleep. To hold their interest, you have to make them feel Joe's fear as he hears the footsteps, thump, thump, coming up behind him; feel the sweat on his palms; feel his panic as the footsteps quicken. Above all you keep the readers in suspense: can Joe get away?

Surprise and suspense are essential in a novel. Put them in your code, though, and they are harmful. When you write code, the "plot" should be so obvious that other programmers know everything that is going to happen. If your code has to express that the criminal walks up to Joe and stabs him, then "The criminal walked up to Joe and stabbed him" is exactly what you want. It is short, it is clear, and it tells everything that happens. Yet for some reason programmers resist writing simple, clear code and prefer tricky, terse, exotic techniques. It is best not to do this.

Straightforward code does not necessarily mean simple code, however. Straightforward code takes you from point A to point B along a clear path. It can be complex when it has to be and still be straightforward.

This chapter therefore examines programming styles that produce code that is not straightforward. The examples are clever and tricky but far from obvious, and naturally such code leads to subtle bugs.

What Should You Be Paying Attention To?

The following code is the bug-free version of memchr given in the previous chapter:

    void *memchr(void *pv, unsigned char ch, size_t size)
    {
        unsigned char *pch = (unsigned char *)pv;

        while (size-- > 0)
        {
            if (*pch == ch)
                return (pch);
            pch++;
        }

        return (NULL);
    }

A game most programmers play is "How can I make this code faster?" It is not a bad game, but as we have seen throughout this book, playing it too zealously is a bad thing.

For example, if you play the game with the code above, you will ask yourself, "How can I speed up the loop?" There are only three possibilities: remove the range check, remove the character test, or remove the pointer increment. It does not look as though any of these steps can be removed, but if you are willing to abandon traditional coding practice, one of them can.

Look at the range check. The only reason for the check is to return NULL when ch is not found within the first size bytes. To remove the check, you only need to guarantee that ch will always be found. You can do that by storing ch in the first byte past the end of the memory being searched. Then, if ch is not in the memory being searched, the search will find the ch you planted:

    void *memchr(void *pv, unsigned char ch, size_t size)
    {
        unsigned char *pch = (unsigned char *)pv;
        unsigned char *pchPlant;
        unsigned char chSave;

        /* pchPlant points to the first byte past the memory being
         * searched. Plant ch in that byte so that memchr is
         * guaranteed to find ch.
         */
        pchPlant = pch + size;
        chSave = *pchPlant;
        *pchPlant = ch;

        while (*pch != ch)
            pch++;

        *pchPlant = chSave;

        return ((pch == pchPlant) ? NULL : pch);
    }

Clever, isn't it? By overwriting the byte that pchPlant points to with ch, you guarantee that memchr will always find ch, which lets you remove the range check and nearly double the speed of the loop. But is it solid? Is it reliable?

This new version of memchr may look solid, particularly since it carefully saves and restores the character that pchPlant overwrites, but it still has problems. For starters, consider the following:

• If pchPlant points to read-only memory, storing ch at *pchPlant has no effect, so the function returns an invalid pointer whenever ch is not found within the first size + 1 bytes.

• If pchPlant points to memory-mapped I/O, there is no telling what storing ch at *pchPlant will do, from stopping (or starting) a floppy disk drive to making an industrial robot wave its welding torch wildly.

• If pch points to the last size bytes of memory, pch and size are both legal, but pchPlant points to nonexistent or write-protected memory. Storing ch at *pchPlant may cause a memory fault, or it may do nothing at all, in which case the function fails if ch is not found within the first size + 1 characters.

• If pchPlant points to data shared with a concurrent process, then when one process stores ch at *pchPlant, it may trash memory that the other process is about to reference.

The last point is especially nasty because there are so many ways it can crash the system. What if you call memchr to search allocated memory and in doing so corrupt one of the memory manager's data structures? If the concurrent process, a code thread or an interrupt routine, then calls the memory manager, the system may crash. What if you call memchr to scan a global array and step on an adjacent variable that another task is using? What happens when two instances of the program search shared data in parallel? There are many scenarios that would kill the program.

Of course, you may never experience the subtle bugs that this memchr causes, because it works well as long as it does not modify critical memory. But once a function like this memchr does cause a bug, isolating the bug is like finding a needle in the ocean. The process executing memchr works fine, while some other process crashes because its memory was corrupted, and there is no reason to suspect that memchr is the cause. Such bugs are very hard to find.

Now you know why people buy in-circuit emulators worth $50,000: they record every cycle, every instruction, and every piece of data the computer references from startup until the crash. It may take days to wade through the emulator's output, but if you persist and work through the output methodically, you should find the bug.

Why put yourself through all that? There has long been a simple rule: don't reference memory that does not belong to you. Note that "reference" means reading as well as writing. Reading unknown memory may not cause strange interactions with other processes, but if the memory is protected, nonexistent, or mapped to I/O, your program can die quickly.

Reference only memory that belongs to you
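One way to make this rule partly checkable by the compiler is to declare the searched memory const, so that a planted-byte trick would not even compile without a cast. The sketch below is an illustration only: FindByte is a hypothetical name, not a standard library function, and it uses the count-controlled loop from the bug-free memchr.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical const-correct search: reads only the size bytes it was
 * given and never writes to them.
 */
const void *FindByte(const void *pv, unsigned char ch, size_t size)
{
    const unsigned char *pch = (const unsigned char *)pv;

    assert(pv != NULL);
    while (size-- > 0)
    {
        if (*pch == ch)
            return pch;
        pch++;
    }
    return NULL;
}
```

Because pch points to const data, an assignment such as *pch = ch is rejected at compile time, and the loop never touches the byte just past the searched range.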

A Thief with the Keys Is Still a Thief

Strangely, some programmers who would never reference memory they do not own nevertheless believe that code like the FreeWindowTree routine below is correct:

    void FreeWindowTree(window *pwndRoot)
    {
        if (pwndRoot != NULL)
        {
            window *pwnd;

            /* Free the children of pwndRoot... */
            for (pwnd = pwndRoot->pwndChild; pwnd != NULL; pwnd = pwnd->pwndSibling)
                FreeWindowTree(pwnd);

            if (pwndRoot->strWndTitle != NULL)
                FreeMemory(pwndRoot->strWndTitle);

            FreeMemory(pwndRoot);
        }
    }

Look at the for loop. What is wrong with it? As FreeWindowTree frees each child window in the pwndSibling list, it frees pwnd first, and then the for loop references the freed block in its control assignment:

    pwnd = pwnd->pwndSibling;

But once pwnd has been freed, what is the value of pwnd->pwndSibling? Garbage, of course. Yet some programmers do not accept this. They reason that the memory was fine a moment ago and nothing has happened to affect it, so it should still be valid. In other words, nothing happened except that pwnd was freed.

I have never understood why some programmers think it is acceptable to reference freed memory. How is it different from using an old key to enter an apartment you once lived in, or to drive off in a car you once owned? You cannot safely reference freed memory because, as we discussed in Chapter 3, the memory manager may have linked the freed block into its free chain or used it for other private information.

Only the system owns free memory; programmers do not
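One common way to repair the FreeWindowTree loop is to read the sibling link before the node is freed. The sketch below is a self-contained illustration rather than the book's own fix: the window structure, the NewWindow helper, and the cWindowsFreed counter are assumptions, and standard malloc and free stand in for the book's memory routines.

```c
#include <stdlib.h>

typedef struct WINDOW
{
    struct WINDOW *pwndChild;      /* first child, or NULL */
    struct WINDOW *pwndSibling;    /* next sibling, or NULL */
    char *strWndTitle;             /* allocated title, or NULL */
} window;

int cWindowsFreed = 0;    /* hypothetical counter, for demonstration only */

void FreeWindowTree(window *pwndRoot)
{
    if (pwndRoot != NULL)
    {
        window *pwnd;
        window *pwndNext;

        /* Free the children of pwndRoot... */
        for (pwnd = pwndRoot->pwndChild; pwnd != NULL; pwnd = pwndNext)
        {
            pwndNext = pwnd->pwndSibling;    /* read the link before freeing */
            FreeWindowTree(pwnd);
        }

        free(pwndRoot->strWndTitle);         /* free(NULL) does nothing */
        free(pwndRoot);
        cWindowsFreed++;
    }
}

/* Hypothetical helper: allocates a window with no title and no links. */
window *NewWindow(void)
{
    window *pwnd = malloc(sizeof(window));

    if (pwnd == NULL)
        abort();    /* placeholder for real allocation-failure handling */
    pwnd->pwndChild = NULL;
    pwnd->pwndSibling = NULL;
    pwnd->strWndTitle = NULL;
    return pwnd;
}
```

The loop's control assignment now uses pwndNext, which was read while pwnd was still owned by the program, so no freed block is ever referenced.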

Data Permissions

You probably won't find this in any programming manual, but every piece of data in your code has implicit read and write permissions associated with it. These permissions are not stored in any field and are not declared explicitly with the variable; they are implied by the design of the subsystem and function interfaces.

For example, there is an implicit agreement between the programmer who calls a function and the programmer who wrote it:

Suppose I am the caller and you are the function being called. If I pass you a pointer to an input, you agree to treat that input as a constant and promise not to write to it. Similarly, if I pass you a pointer to an output, you agree to treat it as a write-only object and promise not to read from it. Finally, whether the pointer points to an input or to an output, you agree to restrict your references strictly to the memory needed to hold those inputs and outputs.

In return, I, the caller, agree to treat read-only outputs as constants and promise not to write to them. I also agree to restrict my references strictly to the memory needed to hold those outputs.

In other words: "You don't mess with my stuff, and I won't mess with yours." Keep in mind that any time you violate these implicit read and write permissions, you risk breaking code, because the programmers who wrote that code firmly believed that everyone would honor the conventions. A programmer who calls a function like memchr should not have to worry that memchr will misbehave in some special case.

Take Only What You Need

In the previous chapter we gave this implementation of the UnsToStr function:

    /* UnsToStr -- convert an unsigned value to a string. */

    void UnsToStr(unsigned u, char *str)
    {
        char *strStart = str;

        do
            *str++ = (u % 10) + '0';
        while ((u /= 10) > 0);

        *str = '\0';

        ReverseStr(strStart);
    }

The code above is a straightforward implementation of UnsToStr, but some programmers are uncomfortable with it because the code derives the digits in reverse order yet has to build the string in forward order, so it must call ReverseStr to put the digits right. That seems wasteful. If you are going to derive the digits backwards, why not build the string backwards and eliminate the call to ReverseStr? Why not do this:

    void UnsToStr(unsigned u, char *str)
    {
        char *pch;

        /* u out of range? Use UlongToStr... */
        ASSERT(u <= 65535);

        /* Store the digits in str from back to front. The string
         * is assumed to be large enough to hold the largest
         * possible value of u.
         */
        pch = &str[5];
        *pch = '\0';

        do
            *--pch = u % 10 + '0';
        while ((u /= 10) > 0);

        strcpy(str, pch);
    }

Some programmers are pleased with the code above because it is both more efficient and easier to understand. It is more efficient because strcpy is faster than ReverseStr, especially with compilers that can generate the "call" as inline instructions. It is easier to understand because C programmers are more familiar with strcpy; when programmers see ReverseStr, they hesitate the way people do on hearing that a friend has been taken to the hospital.

So what is the point? If UnsToStr is so perfect, why am I going on about it? Because it is not perfect. In fact, UnsToStr has a serious flaw.

Tell me, how large is the memory that str points to? You don't know. That is not unusual for C interfaces. There is an unspoken agreement between the caller and the implementer that str points to memory large enough to hold the text form of u. This UnsToStr assumes that str points to enough memory for the largest possible value of u, but u is not often the largest value. So a caller might write:

    DisplayScore()
    {
        char strScore[3];

        UnsToStr(UserScore, strScore);
    }

Since UserScore never produces a string longer than three characters (two digits plus the null character), the programmer is perfectly reasonable in defining strScore as three characters. UnsToStr, however, assumes that strScore is a six-character array and destroys the three bytes that follow strScore in memory. In the example above, on a machine whose stack grows downward, UnsToStr will trash the frame's back pointer, the return address to DisplayScore's caller, or both. The machine will probably crash, so at least you will notice the problem. But if strScore is not a local variable, you might not notice that UnsToStr is destroying whatever variable follows strScore.

I'm sure some programmers will argue that defining strScore to hold only the largest string expected is risky. It is risky, but only when programmers write code like the last version of UnsToStr. In fact there is no need for such tricks, because UnsToStr can be implemented safely by building the string in a local buffer and then copying the finished result to str:

    void UnsToStr(unsigned u, char *str)
    {
        char strDigits[6];    /* 5 digits + '\0' */
        char *pch;

        /* u out of range? Use UlongToStr... */
        ASSERT(u <= 65535);

        pch = &strDigits[5];
        *pch = '\0';

        do
            *--pch = u % 10 + '0';
        while ((u /= 10) > 0);

        strcpy(str, pch);
    }

The thing to remember is that a pointer such as str does not point to memory you may use as a workspace buffer unless the interface defines it that way. Pointers such as str exist so that output can be passed by reference rather than by value, for efficiency.

Do not use the memory behind an output pointer as a workspace buffer

Private data

Of course, some programmers will think that the strcpy in UnsToStr is too inefficient. After all, UnsToStr exists to create an output string, so why not save some cycles by returning a pointer to the string you have already built instead of copying it to another buffer?

char *StrFromUns(unsigned u)
{
    static char strDigits[] = "?????";    /* 5 characters + '\0' */
    char *pch;

    /* u is out of range? Use UlongToStr ... */
    ASSERT(u <= 65535);

    /* Store the digits in strDigits from back to front */
    pch = &strDigits[5];
    ASSERT(*pch == '\0');
    do
        *--pch = u % 10 + '0';
    while ((u /= 10) > 0);

    return (pch);
}

This code is almost identical to the code in the previous section. The only difference is that strDigits is declared static, so the memory allocated to strDigits is preserved even after StrFromUns returns.

Now imagine that you need to convert two unsigned values to strings. You might write:

    strHighScore = StrFromUns(HighScore);
    ...
    strThisScore = StrFromUns(Score);

What is wrong with this? Can you see that the call to StrFromUns that converts Score destroys the string that strHighScore points to? You may argue that the bug is in the code above, not in StrFromUns. But remember what we said in Chapter 5: it is not enough for functions to work correctly; they must also keep programmers from making obvious mistakes. Since you and I both know that some programmers will make mistakes like the one above, I would argue that StrFromUns has an interface bug.
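
The clobbering described above can be reproduced in a few lines. The following is only a sketch: StrFromUnsSketch mirrors the static-buffer function from the text, using the standard assert macro in place of ASSERT, and FirstStringSurvives is a hypothetical helper added purely for the demonstration.

```c
#include <assert.h>
#include <string.h>

/* Sketch of the static-buffer interface discussed above. */
char *StrFromUnsSketch(unsigned u)
{
    static char strDigits[] = "?????";    /* 5 characters + '\0' */
    char *pch = &strDigits[5];

    assert(u <= 65535);
    do
        *--pch = (char)(u % 10 + '0');
    while ((u /= 10) > 0);

    return pch;
}

/* Hypothetical helper: returns 1 if the string from the first call is
 * unchanged after a second call, 0 if the second call clobbered it. */
int FirstStringSurvives(unsigned uFirst, unsigned uSecond)
{
    char strBefore[6];
    char *strFirst = StrFromUnsSketch(uFirst);

    strcpy(strBefore, strFirst);        /* remember what we were given */
    (void)StrFromUnsSketch(uSecond);    /* reuses the same static buffer */

    return strcmp(strFirst, strBefore) == 0;
}
```

With 12345 followed by 42, the pointer from the first call ends up reading "12342": the second call rewrites the last two digits of the shared buffer, so FirstStringSurvives(12345, 42) returns 0.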

Even programmers who realize that the strings returned by StrFromUns are fragile can introduce bugs without knowing it. Suppose they call StrFromUns and then call another function that, unknown to them, also calls StrFromUns and so destroys their string. Or suppose there are multiple threads of execution: if one thread calls StrFromUns, it may wipe out a string that another thread is still using.

Even if these problems seem minor, they will certainly show up, and as the project grows they may show up many times. So whenever you decide to insert a call to StrFromUns in a function, you must do two things:

• Make sure that none of your callers (or your callers' callers, and so on) is still using a string returned by StrFromUns. In other words, you must verify that no function in any call chain leading to your function assumes that StrFromUns's private buffer is preserved.

• Make sure that you do not call any function that calls StrFromUns and so destroys a string you still need. Of course, this means you cannot call any function that calls StrFromUns, directly or indirectly.

If you insert a call to StrFromUns without making these two checks, you risk introducing a bug. But think how hard it is to keep these two conditions true as programmers fix bugs and add new features. Every time they change a call chain that leads to your function, or modify a function that your code calls, those maintenance programmers must verify the conditions again. Do you think they will? Hardly. They will not even realize that they should. After all, they are just fixing bugs, restructuring code, and adding features; what do they know about StrFromUns? They may never have used it, or even seen it.

Because this design makes it so easy to introduce bugs during maintenance, StrFromUns is likely to cause bugs again and again. And when programmers isolate such a bug, the fault will not lie inside StrFromUns but in the code that uses it incorrectly. So blaming StrFromUns alone does not get the real problem fixed: programmers correct the particular bug in the calling code, and the fragile interface of StrFromUns remains.

Do not pass data in static (or global) memory

Global problem

The StrFromUns example shows the danger you face when a function returns data through a pointer to static memory. What the example does not show is that the same danger exists whenever you pass data in a non-local buffer. You could rewrite StrFromUns to build its string in a global buffer, or even in a permanent buffer (one typically allocated with malloc when the program starts), but nothing would change: a programmer could still call StrFromUns twice in a row, and the second call would destroy the string returned by the first.

So the rule of thumb is: do not pass data in global buffers unless you absolutely must. And do not forget that a local static buffer is no different from a global buffer.

When you design a function, the safe approach is to have the caller allocate a local (non-static) buffer. If you require the caller to supply a pointer to the output buffer, you avoid all of these problems.
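
As a minimal sketch of the caller-supplied-buffer approach, assuming the 65535 limit used earlier: UnsToStrSketch follows the local-buffer UnsToStr from the text, while cchUnsMax and BothScoresIntact are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define cchUnsMax 6    /* hypothetical: 5 digits + '\0' for u <= 65535 */

/* Builds the digits in a private local buffer, then copies the result
 * into memory that the caller owns. */
void UnsToStrSketch(unsigned u, char *str)
{
    char strDigits[cchUnsMax];
    char *pch = &strDigits[cchUnsMax - 1];

    assert(u <= 65535);
    *pch = '\0';
    do
        *--pch = (char)(u % 10 + '0');
    while ((u /= 10) > 0);

    strcpy(str, pch);
}

/* Hypothetical caller: converts two scores into two local buffers and
 * returns 1 if both strings are still correct afterwards. */
int BothScoresIntact(unsigned uHigh, unsigned uThis)
{
    char strHigh[cchUnsMax], strThis[cchUnsMax];
    char strWantHigh[cchUnsMax], strWantThis[cchUnsMax];

    UnsToStrSketch(uHigh, strHigh);
    UnsToStrSketch(uThis, strThis);    /* cannot touch strHigh */

    sprintf(strWantHigh, "%u", uHigh);
    sprintf(strWantThis, "%u", uThis);

    return strcmp(strHigh, strWantHigh) == 0 &&
           strcmp(strThis, strWantThis) == 0;
}
```

Because each caller owns its buffers, a second conversion has no way to disturb the first, whatever else sits on the call chain.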

Parasitic functions

Passing data in public buffers is dangerous, but with care and some luck you may get away with it. Writing a parasitic function that depends on the internal workings of another function, however, is not only dangerous but irresponsible: if the host function changes, the parasite breaks.

The best example of a parasitic function that I know of comes from a standard implementation of the Forth programming language. In the late 1970s, FIG (the Forth Interest Group) tried to stimulate interest in Forth by providing public implementations of the Forth-77 standard. Those implementations defined three standard functions, among them Fill, which fills a block of memory with a byte value, and CMOVE, which copies memory using a head-to-head algorithm.

In those Forth implementations, CMOVE was written in optimized assembly language, but for portability Fill was written in Forth itself. The code for CMOVE looked much as you would expect; translated to C it is as follows:

/* CMOVE - move memory using a head-to-head copy */

void CMOVE(byte *pbFrom, byte *pbTo, size_t size)
{
    while (size-- > 0)
        *pbTo++ = *pbFrom++;
}

The implementation of Fill, however, is surprising:

/* Fill - fill a range of memory */

void Fill(byte *pb, size_t size, byte b)
{
    if (size > 0)
    {
        *pb = b;
        CMOVE(pb, pb + 1, size - 1);
    }
}

Fill calls CMOVE to do its work, which is a bit startling until you see how it works. Depending on your point of view, this implementation is either "clever" or "crude". If you think Fill is clever, consider this: Forth may require CMOVE to be implemented as a head-to-head move, but what happens if, for efficiency, someone rewrites CMOVE to move memory in longs (long words) instead of bytes? In my opinion, this Fill is crude, not clever.

But suppose you never intend to change CMOVE. You could even put a comment in CMOVE warning other programmers that Fill relies on its internal workings. That solves only half the problem.

Suppose you are writing the control code for a factory robot with four degrees of freedom, each of which has 256 positions. Such a robot could be designed to use four bytes of memory-mapped I/O, with each byte controlling one degree of freedom.

To position a degree of freedom, you write a value from 0 to 255 to the corresponding memory location. To retrieve the current position of a degree of freedom (which is especially useful while it is moving to a new position), you read the value from the corresponding memory location.

Suppose you want to reset all four degrees of freedom to the home position (0, 0, 0, 0). In theory, you could write:

    Fill(pbRobotDevice, 4, 0);    /* Reset the robot to its home position */

With Fill defined as above, this code does not work. Fill writes 0 to the first degree of freedom but fills the other three with garbage, sending the robot out of control. Why? If you look at how Fill is designed, you will see that it fills memory by copying the previously stored byte into the current byte. When Fill reads the first byte back, it expects to get 0. But what it actually reads is the current position of the first degree of freedom, which probably is not 0, because the first degree of freedom could hardly have moved to position 0 in the fraction of a second between the write and the read. The value read back could be anything, so the second degree of freedom is sent to an unpredictable position. The third and fourth are likewise sent to unknown positions.

For Fill to work properly, it must be able to read back from memory the value it has just written. For memory-mapped I/O, ROM, write-protected memory, or nonexistent memory, that cannot be guaranteed.

My point is that Fill is buggy because it pries into the private details of another function and abuses that knowledge. That Fill fails on anything but RAM is secondary; more important, it demonstrates that you invite trouble whenever you write code that is not straightforward.
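
For contrast, here is one way a Fill that depends on no other function might look. This is a sketch rather than the code from any Forth implementation; the byte typedef is assumed to match the one used in the text, and FillSetsEveryByte is a hypothetical check added only to exercise it.

```c
#include <stddef.h>

typedef unsigned char byte;

/* Writes b to each byte in turn and never reads memory back, so it
 * does not rely on how any copy routine happens to be implemented. */
void FillSketch(byte *pb, size_t size, byte b)
{
    while (size-- > 0)
        *pb++ = b;
}

/* Hypothetical check: fills the first `size` bytes of a 16-byte buffer
 * and verifies that exactly those bytes changed. */
int FillSetsEveryByte(size_t size, byte b)
{
    byte rgb[16];
    size_t i;

    if (size > sizeof(rgb))
        return 0;

    for (i = 0; i < sizeof(rgb); i++)
        rgb[i] = (byte)~b;            /* anything other than b */

    FillSketch(rgb, size, b);

    for (i = 0; i < size; i++)
        if (rgb[i] != b)
            return 0;
    for (; i < sizeof(rgb); i++)
        if (rgb[i] != (byte)~b)
            return 0;

    return 1;
}
```

Because this version only writes, a call such as the robot reset above would store 0 in all four locations regardless of what a read of those locations returns.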

Do not write parasitic functions

Assertions keep programmers honest

If CMOVE had used an assertion to validate its arguments (that is, to verify that the source memory is not destroyed before it is copied to the destination), the programmer who wrote Fill would have hit the assertion the first time the code was tested. That programmer would then have had two choices: rewrite Fill with a reasonable algorithm, or remove the assertion from CMOVE. Fortunately, few programmers would remove an assertion from CMOVE just to make a shoddy Fill work.
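
Such an assertion might look like the following sketch. The exact condition is an assumption about what the check should cover: a head-to-head copy destroys source bytes before reading them only when the destination starts inside the source range, past its first byte. The pointers are compared as integers through uintptr_t, and CMoveChecked and ShiftDownExample are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef unsigned char byte;

/* Head-to-head copy that rejects overlaps it cannot handle. */
void CMoveChecked(byte *pbFrom, byte *pbTo, size_t size)
{
    assert(!((uintptr_t)pbTo > (uintptr_t)pbFrom &&
             (uintptr_t)pbTo < (uintptr_t)pbFrom + size));

    while (size-- > 0)
        *pbTo++ = *pbFrom++;
}

/* Usage: shifting a buffer down by one byte is a legal overlapping
 * move, because the destination starts before the source. */
int ShiftDownExample(void)
{
    byte rgb[5] = { 'a', 'b', 'c', 'd', 'e' };

    CMoveChecked(rgb + 1, rgb, 4);

    return rgb[0] == 'b' && rgb[3] == 'e' && rgb[4] == 'e';
}
```

A call shaped like the one Fill makes, CMoveChecked(pb, pb + 1, size - 1) with two or more bytes left to copy, would fail this assertion at once.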

Assertions would also have kept the free-memory bug in FreeWindowTree from getting into the project's master sources. With assertions and the debug code from Chapter 3, the corresponding assertion would have fired the first time FreeWindowTree was tested on a window that had child windows. Unless they wanted to remove the assertion, most programmers would have fixed FreeWindowTree itself to eliminate the assertion failure.

Using things for the wrong purpose

Prying the lid off a paint can with a screwdriver and then stirring the paint with that same screwdriver is one of the most familiar moves in home maintenance; I have a collection of multicolored screwdrivers to prove it. But why do people stir paint with a screwdriver when they know it ruins the screwdriver and that they should not? Because it is convenient at that moment and it solves the problem. Some programming practices are just as convenient and are guaranteed to work, but, like the screwdriver, they are not being used for their intended purpose.

Take the following code, for example, which uses the result of a comparison as part of an arithmetic expression.

unsigned atou(char *str);    /* unsigned version of atoi */

/* atoi converts an ASCII string to an integer value */

int atoi(char *str)
{
    /* str has the format "[whitespace][+/-]digits" */

    while (isspace(*str))
        str++;

    if (*str == '-')
        return (-(int)atou(str + 1));

    return ((int)atou(str + (*str == '+')));
}

The code above adds the result of the test (*str == '+') to the string pointer in order to skip an optional leading '+'. You can write code like this because the ANSI standard guarantees that the result of any relational operation is either 0 or 1. But some programmers fail to realize that the ANSI standard is not a rule book that tells you what you should and should not do. You can write code that conforms to the standard yet violates its intent. So do not write code like this simply because you are allowed to; ask whether it should be written.

The real problem, though, has less to do with the code than with the programmer. If a programmer feels comfortable using logical results in arithmetic expressions, what other shortcuts of questionable safety is that programmer willing to take?

Do not abuse your programming language

Standards change too

When the Forth-83 standard was released, some Forth programmers found that their code no longer worked. The reason: in the Forth-77 standard, Boolean results were defined to be 0 and 1. For various reasons, the Forth-83 standard changed them to 0 and -1. Of course, the change broke only code that relied on "true" being 1.

Forth programmers were not the only ones to run into this.

In the late 1970s, UCSD Pascal was very popular; if you used Pascal on a microcomputer, it was most likely UCSD Pascal. But then many UCSD Pascal programmers received a new version of the compiler and found that their code no longer worked with it. The reason: the compiler writers had, for various reasons, changed the value of "true", breaking every program that relied on the original value.

Who can say that C will not change in the future? And even if C does not change, what about C++ or other derived languages?

Programming language syndrome

Programmers who do not understand how C code is translated into machine code often try to improve the quality of the machine code by writing terse C. They believe that the fewest C statements should yield the least machine code. There is a rough relationship between the amount of C code and the amount of corresponding machine code, but that relationship breaks down when you apply it to a single line of code.

Do you remember the uCycleCheckBox function from Chapter 6?

unsigned uCycleCheckBox(unsigned uCur)
{
    return ((uCur <= 1) ? (uCur ? 0 : 1) : (uCur == 4) ? 2 : (uCur + 1));
}

uCycleCheckBox may be terse C code, but as I pointed out, it produces poor machine code. Now consider the return statement from the previous section:

    return ((int)atou(str + (*str == '+')));

If you are using a good compiler, and your target machine can produce a 0/1 result without any branch instructions, then adding the result of the comparison to the pointer yields fairly good code. If those conditions do not hold, the compiler will probably expand the comparison internally into a ?: operation, generating code as though you had written the following C:

    return ((int)atou(str + ((*str == '+') ? 1 : 0)));

Because the ?: operator is just an if-else statement in disguise, the resulting code may well be worse than the code generated for the following straightforward version:

    if (*str == '+')    /* Skip optional '+' */
        str++;

    return ((int)atou(str));

Of course, there are other ways to "optimize" this code. I have seen programmers use the || operator to "improve" it by squeezing it onto one line:

    (*str != '+') || str++;    /* Skip optional '+' */

    return ((int)atou(str));

This code works because C has a short-circuit evaluation rule, but putting the code on one line does not guarantee better machine code than the if statement produces; if the compiler actually generates the 0 or 1 result of the || expression, the code may even be worse.

Remember this simple rule: use || for logical expressions, use ?: for conditional expressions, and use if for conditional statements. That is not to say you can never mix them, but keeping to the rule usually yields code that is both efficient and maintainable.
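
Applied to the conversion routine from the previous section, the rule gives something like the sketch below. AtoiSketch and AtouSketch are hypothetical names chosen to avoid clashing with the library's atoi, and AtouSketch is a simple stand-in for atou with no overflow checking.

```c
#include <ctype.h>

/* Hypothetical stand-in for atou: converts leading decimal digits. */
static unsigned AtouSketch(const char *str)
{
    unsigned u = 0;

    while (isdigit((unsigned char)*str))
        u = u * 10 + (unsigned)(*str++ - '0');

    return u;
}

/* str has the format "[whitespace][+/-]digits" */
int AtoiSketch(const char *str)
{
    while (isspace((unsigned char)*str))
        str++;

    if (*str == '-')
        return -(int)AtouSketch(str + 1);

    if (*str == '+')    /* skip optional '+': a conditional statement */
        str++;

    return (int)AtouSketch(str);
}
```

Each decision is an ordinary if statement, so a reader does not need to know that a comparison yields 0 or 1 to follow the code.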

My point is this: if you find yourself using clever expressions so that you can fit C code on a single source line and reach some blissful, yoga-like state, you are probably suffering from the dreaded "one-line" disease (also known as programming language syndrome). Take a deep breath and repeat to yourself: "Multiple lines of source code can produce efficient machine code. Multiple lines of source code can produce efficient machine code..."

Compact C code does not guarantee efficient machine code

Don't show off

The most annoying books in the world are those written by experts and stuffed with unnecessary technical jargon. They do not say "this bug may cause your system to hang or crash"; they say "such a programming defect may result in loss of control of the system or cause system termination." They also use terms such as "axiomatizing programs" and "defect classification", as if programmers used them every day. Such authors bury the information readers need under vaguely understood terminology, which does not help readers but confuses them further. Authors are not the only ones: some programmers are just as keen on writing obscure code, believing that only obscure code impresses people.

For example, see whether you can tell how the following function works:

void *memmove(void *pvTo, void *pvFrom, size_t size)
{
    byte *pbTo = (byte *)pvTo;
    byte *pbFrom = (byte *)pvFrom;

    ((pbTo

    return (pvTo);
}

Is the function any better if I rewrite it as follows?

void *memmove(void *pvTo, void *pvFrom, size_t size)
{
    byte *pbTo = (byte *)pvTo;
    byte *pbFrom = (byte *)pvFrom;

    if (pvTo
        tailmove(pbTo, pbFrom, size);
    else
        headmove(pbTo, pbFrom, size);

    return (pbTo);
}

The first example does not look like legal C, but it is. It even compares well: the first example compiles into much less code than the second. Even so, how many programmers can understand how the first function works? What will it be like to maintain that code? If you write correct code that nobody can understand, what is the point? If you do not intend for others to understand it, you might as well write the function in hand-optimized assembly language.

The following code is another example that puzzles many programmers:

while (expression)
{
    int i = 33;
    char str[20];

    ... other code ...
}

Quick: is i initialized on every iteration, or only once when the loop is entered? Do you know the right answer without thinking about it? If you are not sure, you are in good company, because even expert C programmers usually have to run through the rules of C in their heads to answer this question.

Now suppose the code is changed slightly, as shown below:

while (expression)
{
    int i;
    char str[20];

    i = 33;
    ... other code ...
}

Is there any doubt that i is set to 33 on every iteration? Would any programmer on your team be puzzled by this? Of course not.
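
For the record, C runs the initializer of a block-scope variable every time the block is entered, so the first form also sets i to 33 on each iteration. The small sketch below, in which CountFreshInits is a hypothetical name, makes that observable.

```c
/* Counts how many iterations begin with i equal to 33. */
int CountFreshInits(int cIterations)
{
    int cSaw33 = 0;
    int n = 0;

    while (n < cIterations)
    {
        int i = 33;    /* runs each time the loop body is entered */

        if (i == 33)
            cSaw33++;

        i = 0;         /* does not carry over to the next iteration */
        n++;
    }

    return cSaw33;
}
```

Every iteration sees 33 even though the body overwrites i before it ends. That the answer needs a demonstration at all is the argument for the explicit assignment.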

Unlike novelists, who have only one kind of reader, programmers have two: the users who use the code and the maintenance programmers who must update it. Programmers often forget this. I know few programmers forget their users, but judging by the programs I have read over the years, programmers seem to forget their second kind of reader: the maintenance programmer.

The idea that you should write maintainable code is not new, and programmers know they should. What they often fail to realize is that although they write "maintainable" code all day long, if they write it in a dialect that only C experts can understand, the code is in fact unmaintainable. By definition, maintainable code is code that maintenance programmers can easily understand and can modify without introducing bugs. And, like it or not, maintenance programmers are usually newcomers to a project rather than experts.

So when you consider your readers, consider the maintenance programmers too. The next time you are about to write code such as this:

    strncpy(strDay, &"sunmontuewedthufrisat"[day * 3], 3);

stop yourself and write the code in a way that will not surprise anyone and is easy to understand:

    static char strDayNames[] = "sunmontuewedthufrisat";
    ...
    strncpy(strDay, &strDayNames[day * 3], 3);
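
One detail worth spelling out is that strncpy with a count of 3 copies exactly three characters here and does not write a terminating null. A self-contained sketch of the readable version might therefore look like this; GetDayName and DayNameIs are hypothetical names, and the 0 to 6 range for day is an assumption.

```c
#include <assert.h>
#include <string.h>

static const char strDayNames[] = "sunmontuewedthufrisat";

/* Copies the three-letter name for day (0..6) into strDay, which must
 * have room for four characters. */
void GetDayName(int day, char strDay[4])
{
    assert(day >= 0 && day <= 6);

    strncpy(strDay, &strDayNames[day * 3], 3);
    strDay[3] = '\0';    /* strncpy adds no terminator in this case */
}

/* Hypothetical check used to exercise GetDayName. */
int DayNameIs(int day, const char *strWant)
{
    char strDay[4];

    GetDayName(day, strDay);

    return strcmp(strDay, strWant) == 0;
}
```

Naming the table and wrapping the copy in a function also gives the range check on day an obvious home.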

Write code for the average programmer

Who maintains the programs

At Microsoft, the amount of new code a programmer writes depends on how familiar he or she is with the product under development: the more familiar with the product, the more new code written and the less maintenance programming done. Of course, if you know very little about the project, you spend most of your time reading code written by others, fixing other people's bugs, and making small, localized additions to existing features. Intuitively this arrangement makes sense: if you do not know how the system is written, you cannot add significant functionality to it.

In short, the arrangement is this: in general, experienced programmers write the code and novices maintain it. I am not saying this is how it should be; it is simply a practical arrangement, and it is how things are. But it works only if experienced programmers recognize that they are responsible for writing code that maintenance programmers and programming novices can maintain. Do not misunderstand me: I am not saying you should write beginner-level C so that novices can understand your code; that would be as foolish as always writing expert-level C. What I am saying is that you should avoid difficult or arcane C when you can express the same thing in plain language. If your code is easy to understand, novices are less likely to introduce bugs when they maintain it, and you will not have to keep explaining how it works.

Summary

We have examined a number of questionable coding practices, most of which look fine at first glance. But as we have seen, you can look at clever code once, or even five times, and still miss its subtle side effects. So my advice is this: if you find yourself writing code that uses a trick, stop and look for another solution. Your trick may be a good one, but if it really feels tricky, your intuition is telling you that something is not right. Listen to it. If you think of your code as tricky, you are really telling yourself that the algorithm produces correct results even though it is not as straightforward as it should be. And then the bugs in that algorithm will not be obvious either.

So writing straightforward code is what the truly smart programmer does.

Key points:

• If the data you work with is not yours, do not write to it, even temporarily. And although you might think that reading data is always safe, remember that reading from memory-mapped I/O may harm the hardware.

• Whenever memory is released, it is tempting to keep referencing it, but restrain yourself. Referencing freed memory is an easy way to cause bugs.

• It is tempting, for efficiency, to pass data in global or static buffers, but this is a shortcut fraught with risk. If you write a function that creates data for its caller, return the data to the caller, or otherwise guarantee that the data cannot be changed unexpectedly.

• Do not write functions that rely on a particular implementation of their support functions. As we saw, the Fill routine should not call CMOVE the way it does; it serves only as an example of bad programming.

• When you program, write code that follows the original intent of the programming language. Avoid questionable programming idioms, even when the language standard guarantees that they work. Remember, standards change too.

• It seems logical that if you can express a concept efficiently in C, the corresponding machine code should be efficient too. In practice this is not so. Before you compress multiple lines of C code into a single line, make sure that doing so really yields better machine code.

• Finally, do not write code the way lawyers write contracts. If the average programmer cannot read and understand your code, it is too complicated; use simpler language.

Exercises:

1) C programmers often modify the arguments passed to a function. Why does this practice not violate the rule against writing to input data?

2) The main flaw of the StrFromUns function below was covered earlier (to review, it returns data in an unprotected buffer). Apart from that, what is wrong with the declaration of strDigits?

char *StrFromUns(unsigned u)
{
    static char *strDigits = "?????";    /* 5 characters + '\0' */
    char *pch;

    /* u is out of range? Use UlongToStr ... */
    ASSERT(u <= 65535);

    /* Store the digits in strDigits from back to front */
    pch = &strDigits[5];
    ASSERT(*pch == '\0');
    do
        *--pch = u % 10 + '0';
    while ((u /= 10) > 0);

    return (pch);
}

3) While reading code in a magazine, I noticed a function that used memset to set three local variables to 0, as shown below:

void DoSomething(...)
{
    int i;
    int j;
    int k;

    memset(&k, 0, 3 * sizeof(int));
    ......

Code like this works with some compilers, but why should you avoid this trick?

4) Although the computer contains parts of the operating system in read-only memory, why is it risky to bypass the system interface and call the ROM routines directly in order to avoid unnecessary overhead?

5) Traditionally, C has allowed a function to be called with fewer arguments than it declares. Some programmers use this feature to optimize calls that do not need all the arguments. For example:

    ...
    DoOperation(opNegAcc);    /* no need to pass val */
    ...

void DoOperation(operation op, int val)
{
    switch (op)
    {
    case opNegAcc:
        accumulator = -accumulator;
        break;

    case opAddVal:
        accumulator += val;
        break;
    ...

Although this optimization works, why should you avoid it?

6) The following assertion is correct, but why should you rewrite it?

    ASSERT((f & 1) == f);

7) Consider another version of memmove that uses the following code:

    ((pbTo

How would you rewrite memmove so that it keeps the efficiency of the code above but is easier to understand?

8) The following assembly language code shows a common shortcut for calling a function. If you use this code, why are you asking for trouble?

        move    r0,#PRINTER
        call    Print+4
        ...

Print:  move    r0,#DISPLAY         ; (4-byte instruction)
        ...                         ; r0 == device ID

9) The following assembly code shows another trick. It has the same problem as the code in the previous exercise. Apart from that problem, why should you avoid this trick?

instClearR0 = 0x36A2                ; "clear r0" is a 16-bit instruction
        ...
        call    Print+2             ; output to PRINTER
        ...

Print:  move    r0,#instClearR0     ; (4-byte instruction)
        comp    r0,#0               ; 0 == PRINTER, non-0 == DISPLAY

Chapter 8: The rest is attitude

The techniques discussed in this book can be used to detect and prevent bugs, but they cannot guarantee bug-free code, any more than a roster of skilled players guarantees a winning team. Good habits and the right attitude matter just as much.

Would a team have much chance of winning if its members spent all day grumbling about having to train? What if the players were constantly complaining, or worrying about being replaced or cut? These issues have nothing directly to do with playing the game, yet they affect how well the players perform.

In the same way, you can use every recommendation in this book, but if you have a poor attitude or bad coding habits, you will find it hard to write bug-free code. A programmer, like a team, needs the right attitude and good habits; without them, the same troubles follow.

This chapter therefore presents some of the main obstacles to writing bug-free code. Once you are aware of these obstacles, they are easy to correct.

I still have one more trick

How many times have you asked a programmer about a bug and heard the answer, "Oh, that bug went away"? Years ago I said exactly that to my first manager. We were developing a database product for the Apple II at the time. When he asked whether I had tracked down the bug, I said, "Oh, that bug went away." The manager paused and then invited me to sit down in his office.

"What do you mean, 'the bug went away'?"

"Oh, you know, I went through the bug report carefully."

The manager leaned back in his chair. "Where do you think the bug went?"

"I don't know," I said. "I suppose it has been fixed."

"And you don't know who fixed it, do you?"

"No, I don't," I answered frankly.

"Well, don't you think you should find out what really happened?" he said. "After all, you are working with a computer, and bugs do not fix themselves."

The manager went on to explain that there are three reasons a bug disappears: the bug report was wrong; another programmer has fixed the bug; or the bug still exists but is not showing itself. In other words, one of the responsibilities of a professional programmer is to determine which of the three cases applies and to act accordingly, never simply ignoring a bug because it no longer appears.

I first heard this advice in the days of CP/M and the Apple II, and it was valuable then. It has been valuable in all the years since, and it still is. But it was not until I became a project lead, and found that programmers were all too happy to assume that the tester had made a mistake or that some other programmer had already fixed the bug, that I realized how sound the advice was.

Bugs often disappear simply because the programmer and the tester are using different versions of the program. If the bug does not show up in the code the programmer is using, try the version of the program the tester used; if the bug still does not appear, notify the testing group. If the bug does appear, however, track it down in that earlier version of the sources, decide how it should be fixed, and then find out why it is invisible in the current sources. Usually the bug is still there and changed circumstances are merely hiding it. Whatever the reason, you must understand why the bug disappeared in order to take the appropriate steps to correct it.

Bugs almost never just "disappear"

Wasted energy?

When I asked the programmer to find the reported error on the old version of the source, they often have to make complapous, which seems to be a waste time. If you think so, you have to understand that you don't want you to restore the old version of the source, but you want to check these source programs to provide you with a bad machine, and this is also the most effective way to find an error.

Even if you find that the error has been corrected, the error separated in the old version of the source is also worth it. Is it the end of "corrected" to "correct it"? Still give the error sign to "no longer generated" and send it to the test group? What will the tester do it? They will definitely not think this error has been corrected. They only have two options. One is to spend time to try to make another group of use cases that can produce an error; the other is to drop this error, labeled it " Will be generated "and hope that the error has been corrected. Compared with the error in the old version of the source code and the end of "corrected". The two options are not good.

Fix it promptly for half the effort

When I first joined the Excel team, we postponed all bug fixing until the end of the project. Nobody held a gun to our backs and said, "Don't fix bugs until all the features are implemented," but there was always pressure to keep to the schedule and implement features, and very little pressure to fix bugs. I once said, "Unless the bug crashes the system or stops the test group from working, don't rush to fix it; we'll have time to fix bugs later." In short, fixing bugs was not a priority.

I am sure that today's Microsoft programmers find that attitude backwards, because projects are no longer developed this way. The approach had too many problems. The worst was that it was impossible to predict when the product would be finished. How do you estimate the time needed to fix 1742 bugs? And of course there are more than 1742 bugs to fix, because programmers introduce new bugs while fixing old ones. Worse still, fixing one bug may expose others that the test group could not find because the first bug was in the way.

But this is not the only problem.

Because the planned features were finished without the bugs being fixed, the product looked much further along than it really was. Important people in the company would try an internal release, find that apart from the occasional bug the product worked very well, and be amazed that after only six months of development it was nearly a finished product. They did not see the out-of-memory bugs or the bugs in features they never tried. They knew only that the code was "feature complete" and basically worked.

Spending the last several months doing nothing but fixing bugs is hard on morale. Programmers like to write programs, not fix bugs, yet at the end of every project they spent months doing nothing else. And because everyone outside the development group "knew" the code was nearly finished, there was often a great deal of pressure while the bugs were being fixed.

Isn't this asking for trouble?

From Macintosh Excel 1.03 to the cancellation of an unannounced Windows product (because of a runaway bug list), Microsoft kept shipping buggy products, and this forced Microsoft to take a hard look at how it developed products. The conclusions were not surprising:

• You do not save time by putting off bug fixing until the final phase of the product development cycle. Fixing code written a year ago is harder than fixing code written a few days ago, so deferring actually wastes time.

• Fixing bugs all at once at the end causes many problems, because the earlier you learn of a bug, the less likely you are to repeat the mistake.

• Bugs are a form of negative feedback that keeps fast but sloppy programmers in check. If the rule is that new features may be added only after all known bugs are fixed, sloppiness is contained, because those programmers will be busy fixing bugs. Conversely, if programmers are allowed to let their bugs slide, management loses control.

• If the bug count is kept near zero, it is much easier to predict when the product will be finished. You need only estimate the time to complete 32 features, not the time to complete 32 features plus the time to fix 1742 bugs. Better still, you are always in a position to ship the features completed so far.

These observations apply not only to development at Microsoft but to any software development. If you do not fix bugs as soon as you find them, let Microsoft's bad experience be your cautionary lesson. You can learn from your own hard experience or from the painful lessons of others. Which will it be?

Fix bugs immediately; do not put them off until the end

Monk

Anthony Robbins, in his book "Awaken the Giant Within", tells the story of a doctor. One day the doctor was walking beside a raging river when she suddenly heard the cries of a drowning man. She looked around, saw that nobody else could help, jumped into the water, and swam toward him. She pulled him to shore and gave him mouth-to-mouth resuscitation. Just as he began breathing again, cries came from two more people drowning in the river. She jumped in again and brought them to shore, and just as she had them settled she heard four more people calling for help, and then eight more... The problem is that the doctor was so busy rescuing people that she never had time to find out who was throwing them into the water.

Like this doctor, programmers are often so busy "curing" bugs that they never stop to determine what is causing them. Take the function strFromUns discussed in an earlier chapter, which causes bugs because it forces programmers to use unprotected data. The bugs always show up in the functions that call strFromUns, not in strFromUns itself. So which should you fix: strFromUns, the true source of the bugs, or the functions that call it and fail?

A similar problem came up when I was porting a feature from Windows Excel to Macintosh Excel. After porting the feature I began testing the code and found a function that was being passed an unexpected NULL pointer. I looked at the code, but was the bug in the calling function (for passing NULL) or in the function itself (for not handling NULL)? I found the original programmer and asked him to explain. He immediately opened the function in his editor and said, "Oh, this function doesn't handle NULL pointers." Then, while I stood there, he "fixed" the bug by inserting the following code so that the function would bail out quickly when the pointer was NULL:

    if (pb == NULL)
        return (FALSE);

I suggested that this function should never receive a NULL pointer, and that the bug was in the caller, not in this function. He replied, "I know this code; this will fix the bug." But it seemed to me that this solution treated only the symptom of the bug and not its cause, so I went back to my office and spent 10 minutes tracking down the source of the NULL pointer. It turned out that the NULL pointer was the real cause not only of this bug but of two other known bugs as well.

At other times I have tracked a bug to its source and thought, "Wait, this can't be right. If this function were wrong, that function over there would be failing too, and it isn't." I am sure you can guess why the other function works: some programmer has already patched, in that one place, a symptom of the more general bug.

Fix the root cause of a bug, not its symptom
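
The difference between the two kinds of fix can be sketched in a few lines. This is only an illustration under stated assumptions: fSumBlockQuiet and fSumBlock are hypothetical functions invented for the example, not code from Excel. The first swallows the NULL the way the inserted early return did; the second states the contract with an assertion so that a debug build stops at the first caller that breaks it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example. Symptom fix: quietly tolerate a NULL that the
 * caller should never pass. The caller's bug stays hidden, and all the
 * function can do is report a generic failure. */
int fSumBlockQuiet(const unsigned char *pb, size_t cb, unsigned long *pulSum)
{
    size_t i;

    if (pb == NULL)
        return 0;

    *pulSum = 0;
    for (i = 0; i < cb; i++)
        *pulSum += pb[i];
    return 1;
}

/* Hypothetical example. Cause-oriented version: the contract says pb is
 * never NULL, and the assertion points straight at any caller that
 * violates it (assert is compiled out when NDEBUG is defined). */
int fSumBlock(const unsigned char *pb, size_t cb, unsigned long *pulSum)
{
    size_t i;

    assert(pb != NULL);
    assert(pulSum != NULL);

    *pulSum = 0;
    for (i = 0; i < cb; i++)
        *pulSum += pb[i];
    return 1;
}
```

With the quiet version, a call such as fSumBlockQuiet(NULL, 3, &ul) simply returns 0 and leaves ul untouched, so nothing prompts anyone to ask where the NULL came from. With the asserting version, the same call stops in the debugger at the moment the bad pointer arrives.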

Do you clean up code that isn't broken?

"If nothing breaks, change it however you like" seems to be the motto of some programmers. Whether or not the code works, some programmers always feel compelled to leave their own mark on it. If you have ever worked with a programmer who reformats entire files to suit his own taste, you know what I mean. Although most programmers are quite cautious about "cleaning up" code, it seems that every programmer has done it at some time.

The problem with cleaning up code is that programmers rarely treat the improved code as new code. For example, a programmer browsing a file might see the code below and change the test against 0 into a test against '\0' (another programmer might want to delete the test altogether).

    char *strcpy(char *pchTo, char *pchFrom)
    {
        char *pchStart = pchTo;

        while ((*pchTo++ = *pchFrom++) != 0)
            NULL;

        return (pchStart);
    }

The trouble with changing 0 to the nul character is that it is easy to mistype '\0' as '0', and how many programmers would bother to test strcpy after such a simple change? Which raises the question: when you make a simple change like this, do you test the code as thoroughly as you would newly written code? If not, these unnecessary changes run the risk of introducing bugs.
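
To make the point concrete, here is a minimal sketch of the kind of check that could be rerun after even a "trivial" cleanup of a copy routine. StrCopy and fTestStrCopy are hypothetical names chosen for the illustration; StrCopy stands in for the strcpy shown above, written with the '\0' spelling. The test string deliberately contains the digit '0', so a loop that compared against '0' instead of '\0' would stop copying too early and fail the comparison.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical copy routine standing in for the strcpy shown above. */
char *StrCopy(char *pchTo, const char *pchFrom)
{
    char *pchStart = pchTo;

    while ((*pchTo++ = *pchFrom++) != '\0')
        ;   /* empty body: all the work happens in the condition */

    return pchStart;
}

/* Minimal regression check. Returns 1 if every case passes, 0 otherwise. */
int fTestStrCopy(void)
{
    char sz[8];

    /* Empty string: only the terminator is written. */
    memset(sz, 'x', sizeof(sz));
    if (StrCopy(sz, "") != sz || sz[0] != '\0' || sz[1] != 'x')
        return 0;

    /* A string containing the digit '0', which must not end the copy. */
    memset(sz, 'x', sizeof(sz));
    if (StrCopy(sz, "a0b") != sz || strcmp(sz, "a0b") != 0)
        return 0;

    /* The byte after the terminator must be untouched. */
    if (sz[4] != 'x')
        return 0;

    return 1;
}
```

A check this small takes seconds to run, which removes most of the excuse for skipping it after a one-character edit.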

You may think that as long as the modified code still compiles, such changes cannot be wrong. How could renaming a local variable cause a problem, for example? But it can. I once tracked a bug to a function that had a local variable named hPrint, which conflicted with a global variable of the same name. Since the function had worked before, I looked at an older version of the sources to see what had changed and to verify that my fix would not reintroduce an earlier bug. What I found was a cleanup. The old version had a local variable named hPrint1, but there was no hPrint2 or hPrint3 to explain the '1' in the name. The programmer who deleted the '1' must have thought that hPrint1 was needlessly redundant and cleaned it up to hPrint, creating the name conflict and the bug.

To avoid mistakes like this, remind yourself regularly: the programmers I work with are not idiots. When you come across code that seems obviously wrong or clearly unnecessary, that motto should warn you to be careful. Code that looks wrong may turn out to be written that way for a good reason that just isn't obvious. I have seen ridiculous-looking code whose sole purpose was to work around a compiler code-generation bug (this is extremely rare; translator's note). Clean up that code and you reintroduce the bug. Of course such code should carry a comment explaining what it is for, but not every programmer thinks to write one.

So if you find code like this:

    char chGetNext(void)
    {
        int ch;     /* ch "must" be an int */

        ch = getchar();
        return (chRemapChar(ch));
    }

don't rush to delete ch as "unnecessary" and tidy the function into this:

    char chGetNext(void)
    {
        return (chRemapChar(getchar()));
    }

After this cleanup, if chRemapChar is a macro, it may evaluate its argument more than once and so call getchar several times, which can introduce a bug. So keep the "unnecessary" local variable and avoid an unnecessary bug.

Don't clean up code unless the cleanup is critical to the product's success
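
The macro hazard in chGetNext is easy to reproduce. In the sketch below everything is an assumption made for illustration: the definition of chRemapChar (the text never shows it), the in-memory input stream, and the names SetInput, NextRaw, chGetNextSafe and chGetNextCleaned. NextRaw plays the role of getchar so that the effect can be observed without real input.

```c
#include <assert.h>

/* Hypothetical macro: maps carriage return to newline.
 * Note that its parameter appears twice in the expansion. */
#define chRemapChar(ch) ((ch) == '\r' ? '\n' : (ch))

static const char *pchInput;    /* hypothetical in-memory input stream */

void SetInput(const char *sz)
{
    pchInput = sz;
}

/* Stand-in for getchar: each call consumes one character. */
int NextRaw(void)
{
    return (unsigned char)*pchInput++;
}

/* Original form: the local variable means the input is read exactly once. */
char chGetNextSafe(void)
{
    int ch;

    ch = NextRaw();
    return (char)chRemapChar(ch);
}

/* "Cleaned up" form: the macro expands to two NextRaw() calls, so for any
 * character other than '\r' a second character is read and returned. */
char chGetNextCleaned(void)
{
    return (char)chRemapChar(NextRaw());
}
```

With the input "ab", chGetNextSafe returns 'a', while chGetNextCleaned reads 'a' for the comparison, reads again for the result, and returns 'b'. Both versions compile without complaint, which is exactly why the cleanup looks harmless.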

Put "cool" features in cold storage

Avoiding code cleanup is just a special case of a general principle for writing bug-free code: don't write (or modify) code unless you have to. This advice may seem strange, but if you regularly ask, "How important is this to the success of the product?" and drop the features that are not, you will stay out of trouble.

Some features add nothing to the product and exist only to fill out the feature set. Some exist because large corporate customers asked for them. Others exist because competitors have them and reviewers decided to include them in their feature comparison charts. If you have good marketing and product planning groups, these unnecessary features should not make it into the product. But as a programmer you may not only go along with such features, you may even be the source of some unnecessary ones.

Have you ever heard a programmer say something like this? "If WordSmasher could do ..., it would be really cool." Is the feature "cool" because it would improve the product, or because it is technically challenging? If it would improve the product, postpone it to the next version, where it can be properly evaluated and scheduled. If it is merely a technical challenge, veto it. My advice is not meant to stifle creativity but to curb the development of unnecessary features and the bugs that come with them.

Sometimes a technically challenging feature improves the product, and sometimes it does not. Choose carefully.

Don't implement features that have no strategic value

No free lunch

"Free" features are another source of unnecessary bugs. On the surface a free feature looks worthwhile, because it falls out of the existing design with little or no effort. What could be better than that? But free features bring plenty of problems even though they are almost never critical to the product's success. As I said in the previous section, you should regard any noncritical feature as a source of bugs. Programmers add free features because they can, not because they must. If it costs nothing, why not add a feature?

Ah, but that is a fallacy. For the programmer, adding a free feature may take no work, but a feature is more than code: someone has to document it, and someone has to test it. And don't forget that someone has to fix any bugs that show up in it.

When I hear a programmer say that a feature is free, I know he has not taken the time to consider what the feature really costs.

There are no free features

Flexibility breeds bugs

Another strategy for avoiding bugs is to strip unnecessary flexibility from your designs. This principle runs throughout the book. In Chapter 1, I used optional compiler warnings to disallow redundant and risky C idioms. In Chapter 2, I defined ASSERT as a statement to prevent the macro from being mistakenly used in expressions. In Chapter 3, I used an assertion to catch NULL pointers passed to FreeMemory, even though it is legal to call the free function with a NULL pointer. I could list examples from every chapter.

The trouble with flexible designs is that the more flexible they are, the harder it is to detect bugs. Remember the points I stressed about realloc in Chapter 5? You can throw almost any set of inputs at realloc and it will carry on, though it may not do what you intended. Worse, because the function is so flexible, you cannot insert meaningful assertions to validate its inputs. If realloc were split into four specialized functions that grow, shrink, allocate, and free a memory block, validating the arguments would be much easier.
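
As a rough sketch of what such a split might look like, the four functions below each accept one narrow case and assert it. The names (fNewBlock, fGrowBlock, fShrinkBlock, FreeBlock) and the convention of passing the old size explicitly are assumptions made for this illustration, not the interfaces developed earlier in the book.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocate a new block. A zero-byte request is treated as a caller bug. */
int fNewBlock(void **ppv, size_t cb)
{
    assert(ppv != NULL && cb != 0);

    *ppv = malloc(cb);
    return *ppv != NULL;
}

/* Grow an existing block. The caller states the old size so that
 * "this really is a grow" can be asserted. On failure the original
 * block is left untouched. */
int fGrowBlock(void **ppv, size_t cbOld, size_t cbNew)
{
    void *pvNew;

    assert(ppv != NULL && *ppv != NULL);
    assert(cbNew > cbOld);

    pvNew = realloc(*ppv, cbNew);
    if (pvNew == NULL)
        return 0;
    *ppv = pvNew;
    return 1;
}

/* Shrink an existing block to a smaller, nonzero size. */
int fShrinkBlock(void **ppv, size_t cbOld, size_t cbNew)
{
    void *pvNew;

    assert(ppv != NULL && *ppv != NULL);
    assert(cbNew != 0 && cbNew < cbOld);

    pvNew = realloc(*ppv, cbNew);
    if (pvNew == NULL)
        return 0;
    *ppv = pvNew;
    return 1;
}

/* Free a block and clear the caller's pointer. NULL is not accepted. */
void FreeBlock(void **ppv)
{
    assert(ppv != NULL && *ppv != NULL);

    free(*ppv);
    *ppv = NULL;
}
```

Each assertion here would be impossible to write inside a single do-everything realloc, because there every one of these argument combinations is legal.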

Besides overly flexible functions, watch out for overly flexible features. Flexible features can allow "legal" cases that nobody anticipated, cases you might not think need testing or might not even realize are legal, so flexible features are just as troublesome.

For example, when I was adding color support to Macintosh Excel for Apple's then new Macintosh II, I ported a piece of code from Windows Excel that let users specify the color of the text displayed in a spreadsheet cell. To add color to a cell, you would take an existing cell format such as the one shown below (which displays 1234.5678 as $1,234.57):

    $#,##0.00

and put a color declaration in front of it. To display the value in blue, the user would change the format to:

    [Blue]$#,##0.00

If the user wrote [Red], the data would be displayed in red, and so on.

Excel's product specification was quite clear that the color declaration belonged at the start of the format, but when I ported the feature and began testing the code, I found that all of the following formats worked too:

    $#,##0.00[Blue]

    $#,##[Blue]0.00

    $[Blue]#,##0.00

You could put [Blue] anywhere. When I asked the original programmer whether this was a bug or a feature, he said that the color declaration could go anywhere because it "just fell out of the parsing loop." He did not think a little extra flexibility was a bug, and at the time neither did I, so the code stayed as it was. In hindsight, we should not have allowed the extra flexibility. Before long the test group found six subtle bugs, all of which were ultimately traced to the format parser, which did not expect color declarations in the middle of a format. But instead of fixing them by removing the unnecessary flexibility (which would have taken one simple if statement), we fixed only those specific bugs, that is, the symptoms, and so kept flexibility that nobody needed. To this day, Excel lets you put the color declaration wherever you like.

So remember when you implement features: don't make them unnecessarily flexible, but do make them easy to use. The two are not the same.

Don't allow unnecessary flexibility
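
To show how small the restriction in the color-format story could have been, here is a toy sketch, not Excel's parser and not its real format grammar. The function name pchParseColorPrefix and its interface are hypothetical. It accepts an optional bracketed color name at the very start of the format and rejects a bracket anywhere later, which is the "one simple if statement" idea in isolation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper. On success, copies the color name (or an empty
 * string if there is none) into szColor and returns a pointer to the
 * rest of the format. Returns NULL if the declaration is malformed,
 * too long for szColor, or appears anywhere but the start. */
const char *pchParseColorPrefix(const char *szFormat,
                                char *szColor, size_t cbColor)
{
    const char *pchRest = szFormat;

    assert(szFormat != NULL && szColor != NULL && cbColor > 0);
    szColor[0] = '\0';

    if (*szFormat == '[') {
        const char *pchClose = strchr(szFormat, ']');
        size_t cch;

        if (pchClose == NULL)
            return NULL;                    /* missing closing bracket */
        cch = (size_t)(pchClose - szFormat - 1);
        if (cch == 0 || cch >= cbColor)
            return NULL;                    /* empty or oversized name */
        memcpy(szColor, szFormat + 1, cch);
        szColor[cch] = '\0';
        pchRest = pchClose + 1;
    }

    /* The restriction itself: no color declaration past the start. */
    if (strchr(pchRest, '[') != NULL)
        return NULL;

    return pchRest;
}
```

Under these assumptions "[Blue]$#,##0.00" yields the color "Blue" and the remainder "$#,##0.00", while "$#,##[Blue]0.00" and "$#,##0.00[Blue]" are rejected outright instead of reaching code that never expected them.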

Ported code is new code too

Porting Windows Excel code to Macintosh Excel taught me a lesson: people tend to do little checking of ported code, since it has, after all, already been tested in the original product. I should have caught all the bugs in the Excel number-formatting code before handing the port to the test group, but I did not. I simply copied the code into Macintosh Excel, made the changes needed to fit it into the project, and then ran a quick test to verify that it was hooked up correctly. I did not thoroughly test the feature itself because I assumed that had already been done. That was careless, all the more so because Windows Excel was itself still under development at the time, and that was the period when Microsoft teams deferred bug fixing to the end of the product cycle.

In fact, no matter how you implement a feature, whether you design it from scratch or base it on existing code, you are responsible for removing the bugs in the code you add to the project. Is it acceptable for Macintosh Excel to have the same bugs as Windows Excel? Of course not, because that does not make the bugs any less severe. The bugs appeared because I got lazy.

"Try it" is taboo

You have probably said many times, "I don't know how to ...", and had another programmer answer, "Have you tried ...?" You can hear conversations like this in almost any newsgroup. One programmer posts a message asking, "How can I hide the cursor?" One person replies, "Try moving the cursor off the screen." Another suggests, "Set the cursor mask to 0 so that none of the cursor's pixels are visible." A third says, "The cursor is just a bitmap, so try setting its width and height to zero." Try, try, try...

I admit this is a silly example, but I am sure you have heard similar conversations. Usually, none of the suggestions offered as something to "try" is a proper solution. When someone tells you to try something, they are telling you that what they offer is not a definite answer to the question.

What is wrong with trying things? If what you try is clearly defined by the system, nothing. But often that is not the case. When programmers start trying things, they tend to leave the part of the system they understand and wander into territory where they grab at any answer, and the solutions they find there are likely to have unknown side effects and to break when the system changes. Some programmers have a bad habit of knowingly reading from memory they have freed. What do you think of that? The definition of free certainly says nothing about the contents of freed memory, but some programmers felt they needed to reference freed memory, tried it, found that it worked, and came to rely on free behaving that way. So listen carefully to suggestions that begin "You can try ...": you will find that most of them rely on undefined or ill-defined behavior. If the programmers making the suggestion knew the proper solution, they would not tell you to "try" something. They would tell you, for example, "Use the SetCursorState(INVISIBLE) system call."

Don't just "try" things until something works; take the time to find the correct solution

Try more

For several years, Macintosh programmers at Microsoft received read-only digests of the Macintosh newsgroups. The digests were interesting but not very useful, and they often failed to answer the questions that were asked. There were always programmers asking questions whose answers were clearly spelled out in Apple's Macintosh manuals, yet the answers they received were usually something to try rather than the clear solution given in the manual. Fortunately, there were always a few Macintosh internals experts who gave definite answers such as "See the Macintosh internals manual, Chapter 4, page 32, which says you should ...".

My point is this: if you find yourself experimenting to get an answer, stop, get out your manual, and read it carefully. It is not as much fun as playing with code, and not as easy as asking someone else what to try, but you will learn a great deal about the operating system and how to program for it.

The "sacred" schedule

When they have a sizable feature to implement, some programmers spend two weeks at the keyboard writing code without ever stopping to test it. Others stop to check their code only after finishing ten or so small features. There would be nothing wrong with either approach if the programmers still tested their code thoroughly. But is that realistic?

Consider this situation: a programmer has 5 days to implement five features. He has two choices: implement and test the features one at a time, or implement all five and then test them together. Which approach do you think produces more robust code in practice? I have watched both coding styles over the years, and in most cases programmers who test as they write have fewer bugs. I can even tell you why.

Suppose the programmer uses all 5 days to implement the five features and then realizes the schedule leaves little time to test the code thoroughly. Do you think he will take an extra day or two to test it thoroughly? Or will he play with the code a bit, verify that it seems to work, and move on to the next feature? The answer depends on the programmer and the work environment, of course. But the choice comes down to slipping the schedule or cutting back on testing. Slipping the schedule displeases most companies, while cutting back on testing removes the negative feedback, so the programmer is more likely to protect the schedule.

Even when programmers implement and test features one at a time rather than in batches, they may still cut back on testing. But the effect is more pronounced when features are done in batches: a single difficult feature in the batch can eat up the testing time for all of them. The drawback of working to a schedule is that programmers give speed priority over testing. In effect, the schedule takes priority over correct code. In my experience, a programmer who writes a feature to a schedule will "finish" it on time even if testing has to be cut. He thinks, "If there are unknown bugs in the code, the test group will tell me."

To avoid this trap, write and test code in small chunks, and do not use the schedule as an excuse to skip testing.

Write and test code in small chunks.

Always test your code, even if that means slipping the schedule.

A misleading name

Chapter 5 explained that the name getchar often leads programmers to believe the function returns a character when it actually returns an int. In the same way, programmers believe that the test group exists to test their code. That is its job; what else would it be doing? In fact this view is wrong. Whatever programmers may believe, the test group does not exist to test the code programmers write. It exists to protect the company, and ultimately its customers, from a poor product.

The role of testing is easier to understand by comparison with house building. In house building, builders construct the house and inspectors check it. But the inspectors do not "test" the house. An electrician would never wire a house and then leave it without turning on the power, testing the fuse box, and checking every outlet with a meter. He would never think, "I don't need to run these tests. If there is a problem, the inspector will tell me." Electricians who think that way soon find it hard to get work.

As with the housing inspector, the main reason testers are not responsible for testing the programmer's code is that they do not have the necessary tools and skills. There are exceptions, but that is the rule. Contrary to computer folklore, testers cannot test your code better than you can. Can a tester add assertions to catch bad data flowing through the code? Can a tester test a subsystem the way the memory manager was tested in Chapter 3? Can a tester use a debugger to step through the code instruction by instruction and check that every code path works as designed? The reality is that although programmers can test their own code far more effectively than testers can, they do not, because of this piece of computer folklore.

But do not misunderstand me. The test group plays an important role in the development process, just not the role programmers imagine. Testers examine the product, look for defects that make the program fail, confirm that the product is compatible with earlier versions, alert development to where performance should improve, and use the product to confirm that its features hold up in actual use. None of this is about testing the programmer's code; it is about building quality into the product.

So remember what Chapter 2 said: if you want to keep writing bug-free code, you must take control of the process yourself. Do not rely on the test group to find your bugs, because that is not their job.

Testing the code is the programmer's responsibility, not the tester's

Duplicated effort?

If programmers are responsible for testing their code, the natural question is, "Aren't programmers and testers duplicating each other's work?" Perhaps. But ask again and the answer is no. Programmers test code from the inside out, and testers test it from the outside in.

For example, when programmers test code, they start from the inside: they step through each instruction (or line) of every code path, verifying the code and the data flow, and then move outward to the subsystem to confirm that the function works correctly with the other functions in it. Finally they use unit tests to verify that the separate subsystems cooperate correctly. Unit tests can also check the state of internal data structures. Testers, on the other hand, treat the code as a black box and throw all kinds of input at the program to find bugs. They may also use regression tests to confirm that all reported bugs have been fixed. Testers then work their way inward, using code coverage tools to see how much of the internal code the global tests execute and using that information to create new tests that exercise the uncovered code.

This is an example of testing a program with two different "algorithms". It works because programmers emphasize the code while testers emphasize the features. Coming at the product from two directions increases the chance of finding unknown bugs.

Resenting the tester

Have you noticed how relieved some programmers are when the test group finds a bug? They say, "Whew! I'm glad testing caught that before we shipped." Other programmers, however, resent it when testers report bugs in their code, especially when the testers report several bugs in one piece of code. I have seen programmers fume with anger, and I have heard project leads say things like, "Why won't the testers leave me in peace? This is a testing mistake, because we have already deleted this data." Once I even had to break up a fistfight between a project lead and a test lead, because the project lead was already under enormous pressure over a slipped ship date and the test group's steady stream of bug reports was getting to him.

Is this silly? It certainly is. When we are not facing the difficulties and pressure that come before shipping a product, it is easy to see how ridiculous it is. But when you are surrounded by bugs and the ship date passed months ago, it is easy to see the testers as the bad guys.

Whenever I see programmers turning on testers, I pull them aside and ask: why hold the testers responsible for mistakes that programmers made? The testers are not to blame; they are only the messengers.

Whenever a tester reports a bug to you, your first reaction should be shock and disbelief, since you should never expect testers to find bugs in your code. Your second reaction should be gratitude, because the tester has saved you from shipping the bug.

Don't blame testers for finding your bugs

There are no silly bugs

Sometimes you will hear a programmer complain that a bug is too silly to bother with, or that a tester keeps reporting stupid bugs. When you hear such complaints, stop and remind him that it is not the tester's job to judge how serious a bug is or whether it is worth fixing. Testers must report every bug, silly or not, because for all the tester knows a silly-looking bug may be a side effect of a serious problem.

But the real question is why the programmer did not catch these bugs when testing the code. Even if a bug is minor and not worth fixing, it is still important to find its cause so that similar bugs can be avoided in the future.

A bug may be minor, but the fact that it exists is serious

Establish your priorities

If you flip back through the earlier pages, you may be surprised to find that some of the advice seems contradictory. Think it over carefully, though, and you may not find it so. After all, programmers constantly have to deal with conflicting goals, such as fast code versus compact code.

So the question is, which do you choose when there are two possible implementations? Suppose you must choose not between a fast algorithm and a small one, but between a fast algorithm and a maintainable one, or between an algorithm that is small but risky and one that is larger but easy to test. Which do you pick? Some programmers will answer without thinking, but others cannot decide, and if you ask them the same question a few weeks later they will give a different answer. These programmers are unsure because, beyond very ordinary priorities such as size and speed, they do not know what their priorities are. Programming without clear priorities is like traveling blind: at every turn you have to stop and ask yourself, "What should I do now?" And then you often make the wrong choice.

Other programmers know their priorities perfectly well, but because their priorities are wrong or conflicting, they neglect what really matters and keep making mistakes. For example, many experienced programmers are still guided by the priorities that prevailed in the late 1970s, when memory was scarce and microcomputers were slow, and you had to trade maintainability for size and speed to write a useful program. Today programs keep growing, RAM keeps getting larger, and computers keep getting faster, so most tasks complete in time anyway. Priorities are different now: you should no longer trade maintainability for size and speed, or you will get a program that is not noticeably faster but is unmaintainable. Yet some programmers still treat size and speed as gods, as the key to a product's success or failure. Because these programmers have their priorities out of order, they keep making the wrong choices.

So if you have never thought about your priorities, sit down and write a priority list for yourself (or, if you lead the project, for your team) so that you can consistently make the best choices for meeting the project's goals. Note that I said "the project's goals." Your priority list should reflect not what you want to do but what you should do. For example, some programmers might rank "personal expression" as their highest priority. Does that benefit the programmer or the product? Would such programmers reject naming standards, refuse to follow the agreed {} placement style, or simply do things their own way?

Note that there is no single "correct" way to rank your priorities, but the ranking you choose determines the style and quality of your code. Consider the priority lists of two programmers, York and Gil:

York's priority list          Gil's priority list

Correctness                   Correctness
Global efficiency             Testability
Size                          Global efficiency
Local efficiency              Maintainability/clarity
Personal convenience          Consistency
Maintainability/clarity       Size
Personal expression           Local efficiency
Testability                   Personal expression
Consistency                   Personal convenience

How will these priorities affect York's and Gil's code? Both put correct code first, but that is the only item their lists have in common. York clearly values size and speed highly, gives only moderate weight to writing clear code, and hardly considers whether the code is easy to test. Gil puts more emphasis on correct code and treats size and speed as a concern only when they actually hurt the product. Gil believes that unless code is easy to test, there is no way to verify that it is correct, so testability sits near the top of her list.

Which of these two programmers do you think is more likely to:

- Enable every optional compiler warning so that errors are caught automatically, even though working in such a safe environment may take some extra effort?

- Use assertions and subsystem integrity checks?

- Step through every code path to verify all newly written code?

- Replace risky function interfaces with safe ones, even though each call site may cost an extra instruction or two?

- Use portable types, and divide rather than shift (for example, / 4 instead of >> 2)?

- Avoid the efficiency tricks described in Chapter 7?

Is that a loaded question? I don't think so. Let me ask another: which of the two, Gil or York, do you think is more likely to read this book and act on its recommendations? Which is more likely to read "The Elements of Programming Style" or other guidebooks and follow their advice?

Notice that York's priorities focus his attention on things that do not help the product. He spends his time trying to make every line of code as fast and as small as possible, and he rarely thinks about the product as a whole. Gil's priorities, in contrast, focus her on the product rather than the code; unless it has been proven (or is obvious) that size and speed need attention, she does not worry about them.

Now think about it: which matters more to your company? And so what should your priorities be?

Establish your own priority list and stick to it.

Speak

Have you ever looked at someone else's code and wondered why it was written that way? Have you ever asked the author about it, only to hear, "Oh, I don't know why I wrote it like that. I guess it felt right at the time"?

I review code regularly, looking for ways to help programmers improve their skills, and I find that "Oh, I don't know" is a common answer. I have also found that the programmers who give this answer have not established clear priorities, so their decisions seem arbitrary. In contrast, programmers with clear priorities know exactly why they chose an implementation, and they can tell you when asked.

Summary

There is one very important point that this chapter has not yet stated outright: you must get into the habit of regularly questioning how you write code. This book is the result of persistently asking a few simple questions over a long period:

- How could I have detected this bug automatically?

- How could I have prevented this bug?

- Does this belief or habit help me write bug-free code, or does it hinder me?

Every point in this chapter comes from asking the last question. Examining your own beliefs matters because they reflect your personal priorities. If you believe the test group exists to test your code, you will keep having trouble writing bug-free code, because that belief tells you that, to some degree, it is acceptable to be lax about testing. If you don't set out to write bug-free code, how will you ever manage to write it? If you want to write bug-free code, you should get rid of the beliefs that stand in the way. The way to find them is to ask yourself whether each belief helps or hurts that goal.

Important:

- Bugs neither appear nor fix themselves on their own. If you get a bug report and the bug no longer shows up, don't assume the tester was imagining things. Track the bug down, even if that means going back to older versions of the program.

- Don't put off fixing bugs until "later." That is the common lesson of many cancelled products. If you fix bugs as soon as you find them, your project will not meet that devastating fate. How can you build up a backlog of bugs if your project always stays at roughly zero bugs?

- When you track down a bug, always ask yourself whether it is a symptom of a larger bug. Fixing the symptom you have just tracked down is easy, but you must find the true cause.

- Don't write unnecessary code. Let your competitors clean up code, implement "cool" but worthless features, and implement "free" features. Let them spend their time fixing all the needless bugs that such code causes.

- Remember that flexible and easy to use are not the same thing. When you design functions and features, focus on making them easy to use. If they are merely flexible, like the realloc function and the color format feature in Excel, they do not make the code more useful; on the contrary, they make bugs harder to find.

- Resist the urge to "try" things until one happens to give the result you want. Spend that time finding the correct solution instead. If necessary, contact the company responsible for your operating system; that is better than settling for a quirky implementation that may break in the future.

- Write code in chunks small enough to test thoroughly, and don't skimp on testing. Remember, if you don't test your code, perhaps nobody will. Whatever you do, don't expect the test group to test your code for you.

- Finally, determine your team's priorities and follow them. If you are a York and the project needs a Gil, you must change your habits, at least at work.

Question:

Convince your programming group to create or adopt a priority list. If your company has programmers at different levels (for example, junior programmers, programmers, senior programmers, and programming analysts), you may want to consider different priority lists for different levels. Why?

Appendix A Coding Checklists

The checklists in this appendix summarize all the points made in this book. The best way to use them is to spend two weeks reviewing your designs and implementations against them, taking a few minutes each time to read through the questions. Once you know the questions well, you will apply them naturally as you write code, and you can set the lists aside.

General

─ Do you maintain a debug version of your program?

─ Do you fix bugs as soon as you find them?

─ Do you insist on testing code thoroughly, even when the schedule is slipping?

─ Do you rely on the test group to test your code?

─ Do you know your coding priorities?

─ Does your compiler have optional warnings?

When merging changes into the main sources

─ Have you enabled all compiler warnings (including the optional ones)?

─ Is your code free of lint warnings?

─ Has your code passed its unit tests?

─ Have you stepped through every code path, watching the data flow?

─ Have you stepped through all critical code at the assembly-language level?

─ Have you cleaned up any code? If so, have the changes been thoroughly tested?

─ Does the documentation point out the hazards of using your code?

─ Can maintenance programmers understand your code?

Whenever you implement a function or subsystem

─ Do assertions validate the function's arguments?

─ Is there any undefined or meaningless code?

─ Can the code create undefined data?

─ Are there assertions that are hard to understand? Are they commented?

─ Does your code make any assumptions?

─ Do assertions warn you of conditions that should never occur?

─ Is there defensive programming? Does it hide bugs?

─ Is there a second algorithm that validates the first?

─ Are there startup checks that validate code or data?

─ Does the code contain random behavior? Can you eliminate it?

─ If your code produces garbage output, do you force it to be garbage in the debug code?

─ Is there rare or strange behavior in the code?

─ If the code is part of a subsystem, have you written a subsystem check?

─ Are there arbitrary cases in your design and code?

─ Do integrity checks run even without the programmer being aware of them?

─ Have you thrown out valuable debug tests because they were too big or too slow?

─ Do you use non-portable data types?

─ Can any variable or expression overflow or underflow?

─ Have you implemented your design exactly, or only something very close to it?

─ Does the code solve the same problem more than once?

─ Have you tried to eliminate every if statement from your code?

─ Do you use nested ?: operators?

─ Is special-purpose code isolated?

─ Do you use risky language idioms?

─ Do you unnecessarily mix different types of operators?

─ Do you call functions that return errors? Can you eliminate those calls?

─ Do you reference memory that you have not allocated?

─ Do you reference memory that has already been released?

─ Do you unnecessarily use output buffers as workspace?

─ Do you pass data in static or global buffers?

─ Does your function depend on the internal details of another function?

─ Do you use weird or questionable C idioms?

─ Is there anything in the code that looks like a problem?

─ Is there unnecessary flexibility in the code? Can you eliminate it?

─ Is your code the result of repeatedly "trying" things until something worked?

─ Are your functions small and easy to test?

Whenever you design a function or subsystem

─ Does the feature fit the product's market strategy?

─ Are error codes hidden as special cases of normal return values?

─ Have you reviewed your interfaces? Are they hard to misuse?

─ Are there multi-purpose functions?

─ Do your functions have overly flexible ("hollow") parameters?

─ Do your functions return error conditions when that is no longer necessary?

─ Are your functions easy to read at the call site?

─ Do your functions take Boolean arguments?

Fixing bugs

─ Bugs don't just go away. Have you found the root of the bug?

─ Are you fixing the real cause of the bug, or only a symptom?

Appendix B Memory Logging Routines

The code in this appendix is a simple linked-list implementation of the memory logging routines discussed in Chapter 3. The code is deliberately kept simple so that it is easy to understand, but that does not mean it cannot be used in applications that make heavy use of the memory manager. Before you spend time rewriting it around a faster data structure, try the code as it stands and see whether it is really too slow for practical use. You may find that it works well, particularly if your application does not keep many globally shared memory blocks allocated.

The implementation in this file is straightforward: whenever a memory block is allocated, the routines allocate an additional small piece of memory to hold a blockinfo (block information) structure containing the log information (see the definition below). When a new blockinfo structure is created, its fields are filled in and it is placed at the head of the linked list. The list is in no particular order. Again, this implementation was chosen because it is simple and easy to understand.

Block.h:

#ifdef DEBUG

#include <stddef.h>     /* size_t */

/* --------------------------------------------------------------------
 * blockinfo is a structure that holds the memory log information for
 * one allocated memory block. Every allocated memory block has a
 * corresponding blockinfo structure in the memory log.
 *
 * The byte and flag types are assumed to come from the project's
 * common debug header.
 */
typedef struct BLOCKINFO
{
    struct BLOCKINFO *pbiNext;
    byte  *pb;              /* Start of the memory block */
    size_t size;            /* Length of the memory block */
    flag   fReferenced;     /* Ever referenced? */
} blockinfo;                /* Naming: bi, *pbi */

flag fCreateBlockInfo(byte *pbNew, size_t sizeNew);
void FreeBlockInfo(byte *pbToFree);
void UpdateBlockInfo(byte *pbOld, byte *pbNew, size_t sizeNew);
size_t sizeofBlock(byte *pb);

void ClearMemoryRefs(void);
void NoteMemoryRef(void *pv);
void CheckMemoryRefs(void);
flag fValidPointer(void *pv, size_t size);

#endif

Block.c:

#ifdef DEBUG

#include <stdlib.h>     /* malloc, free */
#include <string.h>     /* memset */

/* --------------------------------------------------------------------
 * The functions in this file must compare arbitrary pointers, an
 * operation that the ANSI standard does not guarantee to be portable.
 *
 * The macros below isolate the pointer comparisons this file needs.
 * The implementations assume "flat" pointers, for which direct
 * comparisons always work. The definitions below do NOT work for
 * some of the common 80x86 memory models.
 */
#define fPtrLess(pLeft, pRight)    ((pLeft) <  (pRight))
#define fPtrGrtr(pLeft, pRight)    ((pLeft) >  (pRight))
#define fPtrEqual(pLeft, pRight)   ((pLeft) == (pRight))
#define fPtrLessEq(pLeft, pRight)  ((pLeft) <= (pRight))
#define fPtrGrtrEq(pLeft, pRight)  ((pLeft) >= (pRight))

/* -------------------------------------------------------------------- */
/*             * * * * *  Private data/functions  * * * * *             */
/* -------------------------------------------------------------------- */

/* --------------------------------------------------------------------
 * pbiHead points to a singly linked list of debugging information
 * for the memory manager.
 */
static blockinfo *pbiHead = NULL;

/* --------------------------------------------------------------------
 * pbiGetBlockInfo(pb)
 *
 * pbiGetBlockInfo searches the memory log for the block that pb points
 * into and returns a pointer to the corresponding blockinfo structure.
 * Note: pb must point into an allocated block or you will get an
 * assertion failure; the function either asserts or succeeds, it
 * never returns an error.
 *
 *     blockinfo *pbi;
 *     ...
 *     pbi = pbiGetBlockInfo(pb);
 *     // pbi->pb points to the start of pb's block
 *     // pbi->size is the size of the block that pb points into
 */
static blockinfo *pbiGetBlockInfo(byte *pb)
{
    blockinfo *pbi;

    for (pbi = pbiHead; pbi != NULL; pbi = pbi->pbiNext)
    {
        byte *pbStart = pbi->pb;                    /* For readability */
        byte *pbEnd   = pbi->pb + pbi->size - 1;

        if (fPtrGrtrEq(pb, pbStart) && fPtrLessEq(pb, pbEnd))
            break;
    }

    /* Couldn't find the pointer? Is it (a) garbage, (b) pointing to a
     * block that has been freed, or (c) pointing to a block that was
     * moved by fResizeMemory?
     */
    ASSERT(pbi != NULL);

    return (pbi);
}

/* -------------------------------------------------------------------- */
/*               * * * * *  Public functions  * * * * *                 */
/* -------------------------------------------------------------------- */

/* --------------------------------------------------------------------
 * fCreateBlockInfo(pbNew, sizeNew)
 *
 * This function creates a log entry for the memory block defined by
 * pbNew:sizeNew. The function returns TRUE if it successfully creates
 * the log entry; otherwise it returns FALSE.
 *
 *     if (fCreateBlockInfo(pbNew, sizeNew))
 *         success -- the memory log has a pbNew:sizeNew entry
 *     else
 *         failure -- no entry was created, so pbNew should be released
 */
flag fCreateBlockInfo(byte *pbNew, size_t sizeNew)
{
    blockinfo *pbi;

    ASSERT(pbNew != NULL && sizeNew != 0);

    pbi = (blockinfo *)malloc(sizeof(blockinfo));
    if (pbi != NULL)
    {
        pbi->pb = pbNew;
        pbi->size = sizeNew;
        pbi->pbiNext = pbiHead;     /* Link in at the head of the list */
        pbiHead = pbi;
    }

    return (flag)(pbi != NULL);
}

/* --------------------------------------------------------------------
 * FreeBlockInfo(pbToFree)
 *
 * This function destroys the log entry for the memory block that
 * pbToFree points to. pbToFree must point to the start of an
 * allocated block; otherwise you will get an assertion failure.
 */
void FreeBlockInfo(byte *pbToFree)
{
    blockinfo *pbi, *pbiPrev;

    pbiPrev = NULL;     /* No previous entry while pbi is the head */
    for (pbi = pbiHead; pbi != NULL; pbi = pbi->pbiNext)
    {
        if (fPtrEqual(pbi->pb, pbToFree))
        {
            /* Unlink the entry from the list. */
            if (pbiPrev == NULL)
                pbiHead = pbi->pbiNext;
            else
                pbiPrev->pbiNext = pbi->pbiNext;
            break;
        }
        pbiPrev = pbi;
    }

    /* If pbi is NULL, then pbToFree is invalid. */
    ASSERT(pbi != NULL);

    /* Destroy the contents of *pbi before freeing it. */
    memset(pbi, bGarbage, sizeof(blockinfo));

    free(pbi);
}

/* --------------------------------------------------------------------
 * UpdateBlockInfo(pbOld, pbNew, sizeNew)
 *
 * UpdateBlockInfo looks up the log information for the memory block
 * that pbOld points to. The function then updates the information to
 * reflect the block's new location (pbNew) and its new length in
 * bytes (sizeNew). pbOld must point to the start of an allocated
 * block; otherwise you will get an assertion failure.
 */
void UpdateBlockInfo(byte *pbOld, byte *pbNew, size_t sizeNew)
{
    blockinfo *pbi;

    ASSERT(pbNew != NULL && sizeNew != 0);

    pbi = pbiGetBlockInfo(pbOld);
    ASSERT(pbOld == pbi->pb);       /* Must point to the start of a block */

    pbi->pb = pbNew;
    pbi->size = sizeNew;
}

/* --------------------------------------------------------------------
 * sizeofBlock(pb)
 *
 * sizeofBlock returns the size of the memory block that pb points to.
 * pb must point to the start of an allocated block; otherwise you
 * will get an assertion failure.
 */
size_t sizeofBlock(byte *pb)
{
    blockinfo *pbi;

    pbi = pbiGetBlockInfo(pb);
    ASSERT(pb == pbi->pb);          /* Must point to the start of a block */

    return (pbi->size);
}

/* -------------------------------------------------------------------- */
/*  The following routines are used to find lost memory blocks and      */
/*  dangling pointers. See Chapter 3.                                   */
/* -------------------------------------------------------------------- */

/* --------------------------------------------------------------------
 * ClearMemoryRefs(void)
 *
 * ClearMemoryRefs marks every block in the memory log as unreferenced.
 */
void ClearMemoryRefs(void)
{
    blockinfo *pbi;

    for (pbi = pbiHead; pbi != NULL; pbi = pbi->pbiNext)
        pbi->fReferenced = FALSE;
}

/* --------------------------------------------------------------------
 * NoteMemoryRef(pv)
 *
 * NoteMemoryRef marks the memory block that pv points into as
 * referenced. Note: pv does not have to point to the start of a
 * block; it may point anywhere within an allocated block.
 */
void NoteMemoryRef(void *pv)
{
    blockinfo *pbi;

    pbi = pbiGetBlockInfo((byte *)pv);
    pbi->fReferenced = TRUE;
}

/* --------------------------------------------------------------------
 * CheckMemoryRefs(void)
 *
 * CheckMemoryRefs scans the memory log looking for blocks that have
 * not been marked with a call to NoteMemoryRef. If the function finds
 * an unmarked block, it asserts.
 */
void CheckMemoryRefs(void)
{
    blockinfo *pbi;

    for (pbi = pbiHead; pbi != NULL; pbi = pbi->pbiNext)
    {
        /* A simple check for block integrity. If this assertion fires,
         * either the debug code that manages blockinfo has a bug or a
         * wild memory store has trashed the data structure. Either
         * way, there is a bug.
         */
        ASSERT(pbi->pb != NULL && pbi->size != 0);

        /* A check for lost or leaked memory. If this assertion fires,
         * the application has either lost track of this block or has
         * not accounted for all global pointers with NoteMemoryRef.
         */
        ASSERT(pbi->fReferenced);
    }
}

/* --------------------------------------------------------------------
 * fValidPointer(pv, size)
 *
 * fValidPointer verifies that pv points into an allocated memory block
 * and that there are at least "size" allocated bytes from pv to the
 * end of the block. If either condition is not met, fValidPointer
 * asserts; the function never returns FALSE. fValidPointer returns a
 * flag (always TRUE) only so that it can be called from within an
 * ASSERT macro. While this is not the most efficient approach, it
 * handles the debug-versus-ship distinction with ASSERT alone, without
 * #ifdef DEBUG and without introducing other ASSERT-like macros.
 *
 *     ASSERT(fValidPointer(pb, size));
 */
flag fValidPointer(void *pv, size_t size)
{
    blockinfo *pbi;
    byte *pb = (byte *)pv;

    pbi = pbiGetBlockInfo(pb);              /* This validates pv */

    ASSERT(pv != NULL && size != 0);

    /* pv is valid, but is size? (size is not valid if pb + size
     * overflows the block.)
     */
    ASSERT(fPtrLessEq(pb + size, pbi->pb + pbi->size));

    return (TRUE);
}

#endif
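To show how an allocator wrapper might sit on top of a log like this, here is a minimal, self-contained sketch. It is not the book's code: the names new_logged, logged_size, and free_logged are hypothetical, the standard assert macro stands in for ASSERT, and the entry is unlinked through a pointer-to-pointer instead of a "previous" pointer.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned char byte;     /* assumed definition for this sketch */

typedef struct BLOCKINFO
{
    struct BLOCKINFO *pbiNext;
    byte  *pb;                  /* start of the logged block */
    size_t size;                /* length of the logged block */
} blockinfo;

static blockinfo *pbiHead = NULL;

/* Allocate a block and record it at the head of the log. If the log
 * entry cannot be created, the block is released again. */
void *new_logged(size_t size)
{
    byte *pb;
    blockinfo *pbi;

    assert(size != 0);

    pb = malloc(size);
    if (pb == NULL)
        return NULL;

    pbi = malloc(sizeof(blockinfo));
    if (pbi == NULL)
    {
        free(pb);
        return NULL;
    }

    pbi->pb = pb;
    pbi->size = size;
    pbi->pbiNext = pbiHead;
    pbiHead = pbi;
    return pb;
}

/* Return the logged size of the block that starts at pv, or 0 if no
 * such block is in the log. */
size_t logged_size(const void *pv)
{
    blockinfo *pbi;

    for (pbi = pbiHead; pbi != NULL; pbi = pbi->pbiNext)
        if (pbi->pb == (const byte *)pv)
            return pbi->size;
    return 0;
}

/* Unlink the log entry for pv, then free the entry and the block. */
void free_logged(void *pv)
{
    blockinfo **ppbi;

    for (ppbi = &pbiHead; *ppbi != NULL; ppbi = &(*ppbi)->pbiNext)
    {
        if ((*ppbi)->pb == (byte *)pv)
        {
            blockinfo *pbi = *ppbi;

            *ppbi = pbi->pbiNext;
            free(pbi);
            free(pv);
            return;
        }
    }
    assert(!"free_logged: pointer is not in the log");
}
```

A caller would pair new_logged with free_logged in the same way it pairs malloc with free; logged_size lets debug code confirm that a pointer really is the start of a live block.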

Appendix C Answers to the Exercises

This appendix gives the answers to all exercises in this book.

Chapter 1

1) The compiler would catch the precedence error, because it interprets the expression as:

    while (ch = (getchar() != EOF))

In other words, the compiler sees the result of an expression being assigned to ch, assumes that you mistyped "==" as "=", and warns of a possible assignment error.
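The difference between the two parses can be seen in a small sketch. The helper names below are hypothetical and take the character as a parameter instead of calling getchar(), so that the two readings can be compared directly.

```c
#include <assert.h>
#include <stdio.h>

/* What the compiler sees: ch = (c != EOF), so ch becomes 0 or 1. */
int ch_as_parsed(int c)
{
    int ch;

    ch = c != EOF;
    return ch;
}

/* What the programmer meant: assign first, then compare with EOF. */
int ch_as_intended(int c)
{
    int ch;
    int fMore = ((ch = c) != EOF);

    (void)fMore;        /* the loop condition; unused in this sketch */
    return ch;
}

void demo_assignment_precedence(void)
{
    assert(ch_as_parsed('A') == 1);
    assert(ch_as_parsed(EOF) == 0);
    assert(ch_as_intended('A') == 'A');
}
```

With the unparenthesized form, every character read ends up stored as 1, which is why a warning on the suspicious assignment is so useful.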

2a) The simplest way to catch accidental octal bugs is to provide an optional compiler switch that makes the compiler report an error whenever it encounters an octal constant. Programmers would then use decimal or hexadecimal constants instead.

2b) To catch cases where the programmer typed "&" for "&&" (or "|" for "||"), the compiler could apply the same test it uses to catch "=" mistyped for "==". The compiler would issue a warning whenever "&" (or "|") is used in an if statement or a compound condition and the result is not explicitly compared with 0. So this statement would generate a warning:

    if (u & 1)              /* Is u odd? */

The following statement would not:

    if ((u & 1) != 0)       /* Is u odd? */

2c) The easiest way to warn of unintended comments is for the compiler to issue a warning whenever the first character of a comment is a letter or "(". This test catches both of the suspicious cases below:

    quot = numer/*pdenom;

    quot = numer/*(pointer expression);

To avoid the warning, make your intent clear by separating "/" and "*" with a space or with parentheses:

    quot = numer / *pdenom;

    quot = numer/(*pdenom);

    /*Note: this comment would generate a warning */

    /* This comment would not generate a warning */

    /*------------------ Nor would this one -----------------*/

2d) The compiler could warn of possible precedence errors by looking for "troublesome operator pairs" in the same unparenthesized expression. For example, programmers often get the precedence wrong when they use the "<<" and "+" operators together, so the compiler could issue a warning for the following code:

    word = bHigh<<8 + bLow;

Because the following statements contain parentheses, the compiler would not warn about them:

    word = (bHigh<<8) + bLow;

    word = bHigh << (8 + bLow);

If you had no specific operator pairs in mind, you could state the heuristic as "if two operators have different precedence and are not separated by parentheses, issue a warning." That is too simplistic, but you get the idea. Developing a good heuristic means running a lot of code through the compiler until the warnings it produces are useful. You certainly would not want warnings for these common idioms:

    word = bHigh*256 + bLow;

    if (ch == ' ' || ch == '\t' || ch == '\n')
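The effect of the "<<" and "+" precedence can be checked with a short sketch. The function names are hypothetical; the first function is written with explicit parentheses that reproduce how the compiler groups the unparenthesized expression, so the sketch itself compiles without precedence warnings.

```c
#include <assert.h>

/* How the compiler groups  bHigh << 8 + bLow : "+" binds more tightly
 * than "<<", so the shift count becomes 8 + bLow. */
unsigned combine_as_parsed(unsigned bHigh, unsigned bLow)
{
    return bHigh << (8 + bLow);
}

/* What the programmer meant: shift first, then add the low byte. */
unsigned combine_as_intended(unsigned bHigh, unsigned bLow)
{
    return (bHigh << 8) + bLow;
}

void demo_shift_precedence(void)
{
    assert(combine_as_intended(0x12u, 0x34u) == 0x1234u);
    assert(combine_as_parsed(1u, 2u) == 1024u);     /* 1 << (8 + 2) */
    assert(combine_as_intended(1u, 2u) == 258u);    /* 256 + 2 */
}
```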

3) The compiler could warn of a possible dangling else whenever it finds two consecutive if statements followed by an else:

    if (expression1)
        if (expression2)
            ......
    else
        ......

    if (expression1)
        if (expression2)
            ......
        else
            ......

To avoid the warning, enclose the inner if statement in braces:

    if (expression1)
    {
        if (expression2)
            ......
    }
    else
        ......

    if (expression1)
    {
        if (expression2)
            ......
        else
            ......
    }
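The two pairings can be compared in a sketch like the following. The function names and return codes are hypothetical. Both functions use braces, so they compile cleanly; the first reproduces the pairing the compiler applies to the unbraced form, and the second the pairing that the misleading indentation suggests.

```c
#include <assert.h>

/* The pairing the compiler applies to the unbraced form: the else
 * belongs to the nearest if, whatever the indentation suggests. */
int branch_as_parsed(int fOuter, int fInner)
{
    if (fOuter)
    {
        if (fInner)
            return 1;
        else
            return 2;
    }
    return 0;
}

/* The pairing the first layout implies: the else belongs to the
 * outer if. */
int branch_as_intended(int fOuter, int fInner)
{
    if (fOuter)
    {
        if (fInner)
            return 1;
    }
    else
        return 2;
    return 0;
}

void demo_dangling_else(void)
{
    assert(branch_as_parsed(0, 1) == 0);
    assert(branch_as_intended(0, 1) == 2);
    assert(branch_as_parsed(1, 0) == 2);
    assert(branch_as_intended(1, 0) == 0);
}
```

The two functions agree only when both conditions are true, which is why an unbraced nested if with an else can hide a bug for a long time.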

4) Putting constants and expressions on the left side of comparisons is worthwhile because it gives you one more way to catch errors automatically. But the technique works only when one operand is a constant or an expression; if both operands are variables, it does not help. Note also that programmers must learn the technique and remember to use it as they write code.

With a compiler switch, by contrast, the compiler warns about every possible assignment error. The switch is especially valuable for inexperienced programmers.

If your compiler has such a switch, use it; if not, put constants and expressions on the left side of comparisons.

5) To keep undefined preprocessor macros from being misused, the compiler (actually the preprocessor) should have a switch that lets programmers treat the use of an undefined macro as an error. Since ANSI compilers support both the old #ifdef directive and the new defined unary operator, there is almost never a need for an undefined macro to "default" to 0. With the switch on, the following code would generate errors:

    /* Set up target equates */

    #if INTEL8080
    ......
    #elif INTEL80x86
    ......
    #elif MC6809
    ......
    #elif MC680x0
    ......
    #endif

It should therefore be written like this:

    /* Set up target equates */

    #if defined(INTEL8080)
    ......
    #elif defined(INTEL80x86)
    ......
    #elif defined(MC6809)
    ......
    #elif defined(MC680x0)
    ......
    #endif

The switch would not warn when an undefined macro is used in an #ifdef statement, because that use is intentional.
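The risk of the silent default can be seen in a sketch in which a macro name is misspelled. TARGET_MC6809 and the function names are hypothetical. (GCC's -Wundef option is one existing switch of this kind; it warns when an undefined identifier is evaluated in an #if directive.)

```c
#include <assert.h>

#define TARGET_MC6809 1     /* hypothetical target macro */

int typo_selects_branch(void)
{
#if TARGET_MC6908           /* misspelled: undefined, so #if treats it as 0 */
    return 1;
#else
    return 0;
#endif
}

int correct_name_selects_branch(void)
{
#if defined(TARGET_MC6809)  /* explicit test of whether the macro exists */
    return 1;
#else
    return 0;
#endif
}

void demo_undefined_macro(void)
{
    assert(typo_selects_branch() == 0);
    assert(correct_name_selects_branch() == 1);
}
```

Without a warning, the misspelled test quietly compiles the wrong branch, and nothing points the programmer to the typo.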

Chapter 2

1) One possible implementation of the ASSERTMSG macro takes two arguments: the expression to validate and a string to display if the assertion fails. For example, to print a message for memcpy, you would call ASSERTMSG like this:

    ASSERTMSG(pbTo >= pbFrom + size || pbFrom >= pbTo + size,
              "memcpy: the blocks overlap");

Below is an implementation of the ASSERTMSG macro. Put the ASSERTMSG definition in a header file and the _AssertMsg routine in a convenient source file.

    #ifdef DEBUG
        void _AssertMsg(char *strMessage);      /* prototype */

        #define ASSERTMSG(f, str)   \
            if (f)                  \
                NULL;               \
            else                    \
                _AssertMsg(str)
    #else
        #define ASSERTMSG(f, str)   NULL
    #endif

In another file:

    #ifdef DEBUG
        #include <stdio.h>      /* fflush, fprintf */
        #include <stdlib.h>     /* abort */

        void _AssertMsg(char *strMessage)
        {
            fflush(stdout);
            fprintf(stderr, "\n\nAssertion failure in %s\n", strMessage);
            fflush(stderr);
            abort();
        }
    #endif

2) If your compiler supports a switch that tells it to allocate all identical strings at the same location, the simplest approach is to turn that switch on. With the option enabled, even if your program contains 73 copies of the same filename, the compiler allocates only one string. The drawback of this approach is that it overlays not only the assertion strings but all identical strings in the source file, which may be unwanted behavior.

Another approach is to change the implementation of the ASSERT macro so that it deliberately references a single copy of the filename string throughout the file. The only difficulty is how to create that string, and even that is not a problem: hide the implementation details in a new ASSERTFILE macro that is used once at the start of each source file:

    #include
    ......
    #include

    ASSERTFILE(__FILE__)        /* added */
    ......

    void *memcpy(void *pvTo, void *pvFrom, size_t size)
    {
        byte *pbTo = (byte *)pvTo;
        byte *pbFrom = (byte *)pvFrom;

        ASSERT(pvTo != NULL && pvFrom != NULL);     /* unchanged */
        ......

Below is the code that implements the ASSERTFILE macro and the corresponding version of ASSERT.

    #ifdef DEBUG
        #define ASSERTFILE(str)     static char strAssertFile[] = str;

        #define ASSERT(f)           \
            if (f)                  \
                NULL;               \
            else                    \
                _Assert(strAssertFile, __LINE__)
    #else
        #define ASSERTFILE(str)
        #define ASSERT(f)           NULL
    #endif

This version of ASSERT can save a good deal of storage. For example, the test application for this book is small, yet using the code above saved it 3K of data space.

3) The problem with this assertion is that the test contains code that must also remain in the non-debug version of the function. Without it, the non-debug code enters an infinite loop unless ch happens to equal the newline character. The function should therefore be written as follows:

void getline(char *pch)
{
    int ch;    /* ch must be an int */

    do
    {
        ch = getchar();
        ASSERT(ch != EOF);
    }
    while ((*pch++ = ch) != '\n');
}

4) A very simple way to catch the errors that creep in when a switch statement is modified is to put an assertion in the default case, verifying that the default case handles only the cases it is supposed to handle. In some switch statements the default case should never be reached, because every possible case is handled explicitly. When that is so, use the following code:

    ......
    default:
        ASSERT(FALSE);    /* We should never get here. */
        break;
}
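As a small, complete illustration of that default-case assertion, consider the sketch below. The `shape` type and the `cCorners` function are invented for this example, and the standard `assert` stands in for the book's ASSERT macro.

```c
#include <assert.h>

/* Hypothetical enumeration used only for this illustration. */
typedef enum { shapeCircle, shapeSquare, shapeTriangle } shape;

/* Returns the number of corners of a shape. Every shape is handled
 * explicitly, so the default case should be unreachable. */
int cCorners(shape s)
{
    switch (s)
    {
    case shapeCircle:
        return 0;
    case shapeSquare:
        return 4;
    case shapeTriangle:
        return 3;
    default:
        assert(0);      /* We should never get here. */
        break;
    }
    return -1;          /* reached only if a new shape was never handled */
}
```

If someone later adds a new enumerator without updating the switch, the assertion fires the first time the new value is passed in a debug build.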

5) The mask and the corresponding pattern in the table are related: the pattern must always be a subset of the mask, because otherwise no instruction could match the pattern once it has been masked. The following CheckIdInst routine verifies each pattern against its mask:

void CheckIdInst(void)
{
    identity *pid, *pidEarlier;
    instruction inst;

    for (pid = &idInst[0]; pid->mask != 0; pid++)
    {
        /* The pattern must be a subset of the mask. */
        ASSERT((pid->pat & pid->mask) == pid->pat);
        ......

6) Use assertions to verify that inst has none of the questionable settings:

instruction *pcDecodeEOR(instruction inst, instruction *pc, opcode *popc)
{
    /* Did we get a CMPM or CMPA.L instruction by mistake? */
    ASSERT(eamode(inst) != 1 && mode(inst) != 3);

    /* For a non-register mode, allow only the absolute word and long modes. */
    ASSERT(eamode(inst) != 7 || (eareg(inst) == 0 || eareg(inst) == 1));
    ......

7) The key to choosing a backup algorithm is to choose a different algorithm. For example, to verify that qsort works, scan the data after sorting and check that it is in order (a scan is not a sort, so it counts as a different algorithm). To verify that a binary search works, follow it with a linear scan and check that the two searches give the same result. Finally, to verify itoa, convert the string it returns back into an integer and compare that with the integer originally passed to itoa; the two should be equal.

Of course, unless you are writing code for aircraft, radiation equipment, or some other setting where an error could threaten lives, you probably will not want a backup algorithm for every piece of code you write. You should, however, use backup algorithms for all of the more important parts of an application.
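As a rough sketch of the backup-algorithm idea, the binary search below is cross-checked against a linear scan in debug builds. The names `iBinarySearch` and `iLinearSearch` are made up for this illustration, the array is assumed to be sorted with no duplicate values, and the standard `assert` stands in for the book's ASSERT macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backup algorithm: a plain linear scan.
 * Returns the index of key in a[0..n-1], or -1 if key is absent. */
static int iLinearSearch(const int *a, size_t n, int key)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (a[i] == key)
            return (int)i;
    return -1;
}

/* Binary search over a sorted array that holds no duplicate values.
 * Returns the index of key, or -1 if key is absent. */
int iBinarySearch(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;          /* search the half-open range [lo, hi) */
    int iFound = -1;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (a[mid] == key)
        {
            iFound = (int)mid;
            break;
        }
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* Debug-only cross-check; compiled out when NDEBUG is defined. */
    assert(iFound == iLinearSearch(a, n, key));
    return iFound;
}
```

The linear scan is slow, but it is so simple that it is unlikely to share a bug with the binary search, which is the point of picking a different algorithm.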

Chapter 3

1) If the two kinds of memory are shredded with different debug values, it is easy to tell whether a program is using uninitialized data or is continuing to use released data. For example, fNewMemory could shred new, uninitialized memory with bNewGarbage, and FreeMemory could shred released memory with bFreeGarbage:

#define bNewGarbage     0xA3
#define bFreeGarbage    0xA5

fResizeMemory creates both kinds of garbage; it can use the two values above, or you can define two more.

2) One way to catch overflows is to check periodically that the bytes following each allocated block have not been modified. The test sounds straightforward, but it requires you to remember what all of those bytes contained, and it ignores a potential problem: you may not be allowed to access memory that was never allocated to you. Fortunately there is a simple way to implement the test, provided you are willing to allocate one extra byte for every block.

For example, when fNewMemory is called to allocate 36 bytes, actually allocate 37 and store a known "debug byte" in the extra location. Similarly, when fResizeMemory calls realloc, allocate and set an extra byte. To catch overflows, add assertions to sizeofBlock, fValidPointer, FreeBlockInfo, NoteMemoryRef and CheckMemoryRefs that verify the debug byte has not been touched.

Below is one way to implement this. First, define bDebugByte and sizeofDebugByte:

/* bDebugByte is a magic value stored at the end of every allocated
 * block in DEBUG versions of the program. sizeofDebugByte is added to
 * the sizes passed to malloc and realloc so that the correct amount
 * of space is allocated.
 */

#define bDebugByte    0xE1

#ifdef DEBUG
#define sizeofDebugByte    1
#else
#define sizeofDebugByte    0
#endif

Next, use sizeofDebugByte in fNewMemory and fResizeMemory to adjust the calls to malloc and realloc. If the allocation succeeds, fill the extra byte with bDebugByte:

flag fNewMemory(void **ppv, size_t size)
{
    byte **ppb = (byte **)ppv;

    ASSERT(ppv != NULL && size != 0);

    *ppb = (byte *)malloc(size + sizeofDebugByte);    /* changed */

    #ifdef DEBUG
    {
        if (*ppb != NULL)
        {
            *(*ppb + size) = bDebugByte;    /* new */
            memset(*ppb, bGarbage, size);
            ......

flag fResizeMemory(void **ppv, size_t sizeNew)
{
    byte **ppb = (byte **)ppv;
    byte *pbResize;
    ......

    pbResize = (byte *)realloc(*ppb, sizeNew + sizeofDebugByte);    /* changed */

    if (pbResize != NULL)
    {
        #ifdef DEBUG
        {
            *(pbResize + sizeNew) = bDebugByte;    /* new */
            UpdateBlockInfo(*ppb, pbResize, sizeNew);
            ......

Finally, insert the following assertion into the sizeofBlock, fValidPointer, FreeBlockInfo, NoteMemoryRef and CheckMemoryRefs routines given in Appendix B:

/* Verify that nothing wrote past the end of the block. */
ASSERT(*(pbi->pb + pbi->size) == bDebugByte);

With these changes, the memory subsystem catches overflow errors that write just past the end of an allocated block.
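The debug-byte idea can be tried in isolation with the sketch below. It leaves out the block-info bookkeeping of Appendix B and takes the block size as an explicit argument; `pvNewChecked` and `fBlockIntact` are hypothetical names used only here, and for simplicity the extra byte is always allocated rather than only in DEBUG builds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define bDebugByte 0xE1     /* sentinel stored just past the caller's bytes */
#define bGarbage   0xA3     /* fill value for new, uninitialized memory */

/* Hypothetical helper: allocate size bytes plus one trailing sentinel byte. */
void *pvNewChecked(size_t size)
{
    unsigned char *pb;

    assert(size != 0);
    pb = (unsigned char *)malloc(size + 1);
    if (pb != NULL)
    {
        memset(pb, bGarbage, size);   /* shred the caller's bytes */
        pb[size] = bDebugByte;        /* plant the sentinel */
    }
    return pb;
}

/* Hypothetical helper: returns nonzero if the sentinel is still intact. */
int fBlockIntact(const void *pv, size_t size)
{
    return ((const unsigned char *)pv)[size] == bDebugByte;
}
```

Writing exactly `size` bytes leaves the sentinel alone, while writing one byte too many overwrites it, so a later call to `fBlockIntact` reports the overflow.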

3) There are many ways to catch dangling pointers. One possible solution is to change the debug version of FreeMemory so that it does not actually release blocks but instead links them into a chain of "freed" blocks. (As far as the system is concerned these blocks are still allocated; as far as the program is concerned they have been released.) Changing FreeMemory in this way keeps a "freed" block from being reallocated before CheckMemoryRefs is called to validate the subsystem. CheckMemoryRefs validates the memory system, then takes FreeMemory's chain of "freed" blocks and really releases all of them.

Although this approach catches dangling pointers, you should generally not use it unless your program actually suffers from such errors, because it violates the principle that debug code should be extra code, not different code.

4) To validate the size of the object a pointer refers to, you have to consider two cases: the pointer may refer to an entire block, or it may refer to a suballocation within a block. For the first case you can apply the strictest test: verify that the pointer points to the start of a block and that the size matches the value returned by sizeofBlock. For the second case the test has to be weaker: the pointer only has to point somewhere inside a block, and the size must not run past the end of that block. So rather than using one NoteMemoryRef routine for both suballocations and entire blocks, use two functions. Add a size parameter to the existing NoteMemoryRef, which then notes suballocations, and create a new function, NoteMemoryBlock, to note entire blocks, as shown below:

/* NoteMemoryRef(pv, size)
 *
 * NoteMemoryRef marks the block that pv points into as being
 * referenced. Note: pv does not have to point to the start of a
 * block; it may point anywhere within an allocated block, but there
 * must be at least "size" bytes left in the block. Note: use
 * NoteMemoryBlock if possible. It is more reliable.
 */

void NoteMemoryRef(void *pv, size_t size);

/* NoteMemoryBlock(pv, size)
 *
 * NoteMemoryBlock marks the block that pv points to as being
 * referenced. Note: pv must point to the start of a block that is
 * exactly "size" bytes long.
 */

void NoteMemoryBlock(void *pv, size_t size);

These functions catch the errors given in the exercise.

5) To improve the integrity checks in Appendix B, first change the reference flag in the blockinfo structure to a reference count, then change ClearMemoryRefs and NoteMemoryRef to handle the counter. That much is obvious. The harder question is how to modify CheckMemoryRefs so that it accepts multiple references for some blocks but still asserts for all the others.

One way to solve the problem is to enhance NoteMemoryRef so that it takes, in addition to a pointer to the block, a tag that identifies the kind of block. NoteMemoryRef saves the tag in the blockinfo structure, and CheckMemoryRefs later uses the tag to validate the reference count. The code after these changes is shown below. See the original functions in Appendix B for the leading comments:

/* blocktag lists the kinds of blocks that are noted by reference. */

typedef enum
{
    tagNone,        /* ClearMemoryRefs sets every block to tagNone. */
    tagSymName,
    tagSymStruct,
    tagListNode,    /* These blocks must have two references. */
    ......
} blocktag;

void ClearMemoryRefs(void)
{
    blockinfo *pbi;

    for (pbi = pbiHead; pbi != NULL; pbi = pbi->pbiNext)
    {
        pbi->nReferenced = 0;
        pbi->tag = tagNone;
    }
}

void NoteMemoryRef(void *pv, blocktag tag)
{
    blockinfo *pbi;

    pbi = pbiGetBlockInfo((byte *)pv);

    pbi->nReferenced++;

    ASSERT(pbi->tag == tagNone || pbi->tag == tag);
    pbi->tag = tag;
}

void CheckMemoryRefs(void)
{
    blockinfo *pbi;

    for (pbi = pbiHead; pbi != NULL; pbi = pbi->pbiNext)
    {
        /* A simple check for block integrity. If this assertion
         * fires, either the debug code that manages the block
         * information is wrong, or a wild store has trashed the data
         * structure. Either way there is a bug.
         */
        ASSERT(pbi->pb != NULL && pbi->size != 0);

        /* Check for lost or leaked memory. If an assertion below
         * fires, either the application has lost track of this block
         * or not every global pointer has been accounted for with
         * NoteMemoryRef. Some kinds of blocks may have more than one
         * reference to them.
         */
        switch (pbi->tag)
        {
        default:
            ASSERT(pbi->nReferenced == 1);
            break;

        case tagListNode:
            ASSERT(pbi->nReferenced == 2);
            break;
        ......
        }
    }
}

6) Developers for DOS, Windows and the Macintosh usually test out-of-memory conditions by running a tool that consumes memory until the application's own requests start to fail. Although this works, it is imprecise: it makes an allocation fail somewhere in the program, not at a place you choose. If you want to test one isolated feature, the technique is not very useful. A better approach is to build an out-of-memory simulator into the memory manager itself.

Note, however, that running out of memory is only one kind of resource failure. There are also disk errors, busy telephone lines, and other failures. What is needed is a general tool that deliberately fakes a shortage of any resource.

One solution is to define a failureinfo structure that tells the failure mechanism what to do. Programmers and testers fill in the failureinfo structure from an external test and then exercise the feature. (Microsoft applications often include debug-only features that let testers use such systems; applications that have a macro language, as Excel does, have debug-only macros that let testers automate the process.)

To declare the failure structure for the memory manager, use the following code:

failureinfo fiMemory;

To simulate out-of-memory failures in fNewMemory and fResizeMemory, add four lines of debug code to each function:

flag fNewMemory(void **ppv, size_t size)
{
    byte **ppb = (byte **)ppv;

    #ifdef DEBUG
    if (fFakeFailure(&fiMemory))
        return (FALSE);
    #endif
    ......

flag fResizeMemory(void **ppv, size_t sizeNew)
{
    byte **ppb = (byte **)ppv;
    byte *pbResize;

    #ifdef DEBUG
    if (fFakeFailure(&fiMemory))
        return (FALSE);
    #endif
    ......

This installs the failure mechanism in the code. To make it work, call the SetFailures function to initialize the failureinfo structure:

SetFailures(&fiMemory, 5, 7);

Calling SetFailures with 5 and 7 tells the failure system to let 5 calls succeed and then to fake seven consecutive failures. Two common calls to SetFailures are:

SetFailures(&fiMemory, UINT_MAX, 0);    /* Never fake a failure. */
SetFailures(&fiMemory, 0, UINT_MAX);    /* Always fake a failure. */

With SetFailures you can write a unit test that runs the same piece of code over and over, calling SetFailures with different values each time to simulate every possible failure pattern. The second ("fail") value is usually kept at UINT_MAX, while the first ("succeed") value counts up from 0 to a number large enough to exercise every out-of-memory condition.

Finally, there are times when you call the memory (or disk, etc.) system and know that the call must not fail; this is often the case when you allocate resources in debug code. The following two functions let you temporarily turn the failure mechanism off and back on:

DisableFailures(&fiMemory);

... do allocation ...

EnableFailures(&fiMemory);

The following code implements the four functions of the failure mechanism:

typedef struct
{
    unsigned nSucceed;    /* # of times to succeed before failing */
    unsigned nFail;       /* # of times to fail */
    unsigned nTries;      /* # of times already called */
    int lock;             /* if lock > 0, the mechanism is disabled */
} failureinfo;

void SetFailures(failureinfo *pfi, unsigned nSucceed, unsigned nFail)
{
    /* If nFail is 0, require that nSucceed be UINT_MAX. */
    ASSERT(nFail != 0 || nSucceed == UINT_MAX);

    pfi->nSucceed = nSucceed;
    pfi->nFail = nFail;
    pfi->nTries = 0;
    pfi->lock = 0;
}

void EnableFailures(failureinfo *pfi)
{
    ASSERT(pfi->lock > 0);
    pfi->lock--;
}

void DisableFailures(failureinfo *pfi)
{
    ASSERT(pfi->lock >= 0 && pfi->lock < INT_MAX);
    pfi->lock++;
}

flag fFakeFailure(failureinfo *pfi)
{
    ASSERT(pfi != NULL);

    if (pfi->lock > 0)
        return (FALSE);

    if (pfi->nTries != UINT_MAX)    /* Don't let nTries overflow. */
        pfi->nTries++;

    if (pfi->nTries <= pfi->nSucceed)
        return (FALSE);

    if (pfi->nTries - pfi->nSucceed <= pfi->nFail)
        return (TRUE);

    return (FALSE);
}
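To show how SetFailures can drive a unit test, the sketch below repeats the failure mechanism in compact form (without the lock and without the assertions) and routes malloc through it. The wrappers `pvAlloc` and `FreeBlock`, the counter `cBlocksLive`, the function under test `fBuildPair`, and the test driver `cTestBuildPair` are all invented for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

typedef struct
{
    unsigned nSucceed;   /* calls that succeed before failures begin */
    unsigned nFail;      /* number of consecutive failures to fake */
    unsigned nTries;     /* calls seen so far */
} failureinfo;

static failureinfo fiMemory;
static int cBlocksLive;             /* allocations not yet freed */

void SetFailures(failureinfo *pfi, unsigned nSucceed, unsigned nFail)
{
    pfi->nSucceed = nSucceed;
    pfi->nFail = nFail;
    pfi->nTries = 0;
}

int fFakeFailure(failureinfo *pfi)
{
    if (pfi->nTries != UINT_MAX)    /* don't let nTries overflow */
        pfi->nTries++;
    if (pfi->nTries <= pfi->nSucceed)
        return 0;
    return pfi->nTries - pfi->nSucceed <= pfi->nFail;
}

/* Hypothetical allocator wrappers that honor the failure mechanism. */
static void *pvAlloc(size_t size)
{
    void *pv = fFakeFailure(&fiMemory) ? NULL : malloc(size);

    if (pv != NULL)
        cBlocksLive++;
    return pv;
}

static void FreeBlock(void *pv)
{
    free(pv);
    cBlocksLive--;
}

/* Hypothetical code under test: it needs two blocks, or none at all. */
int fBuildPair(void **ppvA, void **ppvB)
{
    *ppvA = pvAlloc(16);
    if (*ppvA == NULL)
        return 0;
    *ppvB = pvAlloc(16);
    if (*ppvB == NULL)
    {
        FreeBlock(*ppvA);           /* undo the first allocation */
        *ppvA = NULL;
        return 0;
    }
    return 1;
}

/* Fail the 1st allocation, then the 2nd, and so on. Returns the number
 * of runs that ended in failure and asserts that no run leaks a block. */
int cTestBuildPair(void)
{
    unsigned nSucceed;
    int cFailed = 0;
    void *pvA, *pvB;

    for (nSucceed = 0; nSucceed < 4; nSucceed++)
    {
        SetFailures(&fiMemory, nSucceed, UINT_MAX);
        if (fBuildPair(&pvA, &pvB))
        {
            FreeBlock(pvA);
            FreeBlock(pvB);
        }
        else
            cFailed++;
        assert(cBlocksLive == 0);   /* nothing leaked on any path */
    }
    return cFailed;
}
```

Because `fBuildPair` makes two allocations, only the runs with a "succeed" count of 0 or 1 hit a faked failure; counting a little past that point confirms that the success path has been reached as well.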

Chapter 4

Chapter 4 has no exercises.

Chapter 5

1) Like malloc, strdup has a dangerous interface: its error return value is a NULL pointer disguised as an ordinary result, which is easy to overlook. A more robust interface separates the error condition from the pointer output so that the error condition is explicit. The following code shows such an interface:

char *strDup;    /* pointer to the duplicated string */

if (fStrDup(&strDup, strToCopy))
    success -- strDup points to the new string
else
    failure -- strDup is NULL
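One way such an fStrDup could be written is sketched below. The body is an assumption made for illustration, not the book's implementation: it uses plain malloc rather than the book's memory subsystem, and the standard `assert` rather than ASSERT.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int flag;

/* Sketch of fStrDup: the return value reports success or failure, and
 * the duplicate (or NULL on failure) comes back through *pstrDup. */
flag fStrDup(char **pstrDup, const char *strToCopy)
{
    size_t size;

    assert(pstrDup != NULL && strToCopy != NULL);

    size = strlen(strToCopy) + 1;          /* include the terminating '\0' */
    *pstrDup = (char *)malloc(size);
    if (*pstrDup == NULL)
        return 0;                          /* failure: *pstrDup is NULL */

    memcpy(*pstrDup, strToCopy, size);
    return 1;                              /* success */
}
```

Because the pointer is an output argument, a caller cannot use it without also receiving the success flag, which makes the failure case harder to ignore.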

2) An interface that improves on fGetChar is one that returns an error code instead of a TRUE or FALSE "success" value. For example:

/* Errors that errGetChar may return. */

typedef enum
{
    errNone = 0,
    errEOF,
    errBadRead,
    ......
} error;

void ReadSomeStuff(void)
{
    char ch;
    error err;

    if ((err = errGetChar(&ch)) == errNone)
        success -- ch holds the next character
    else
        failure -- err holds the type of error
    ......

This interface is better than fGetChar because it lets errGetChar return several error conditions (and, correspondingly, several success conditions). If you do not care which error was returned, you can drop the local variable err and get back to an interface like fGetChar's:

if (errGetChar(&ch) == errNone)
    success -- ch holds the next character
else
    failure -- don't care what type of error

3) The trouble with strncpy is that its behavior is inconsistent: sometimes strncpy terminates the destination string with a nul character, and sometimes it does not. Because strncpy is listed alongside the other general-purpose string functions, programmers may wrongly conclude that strncpy is itself a general-purpose function, when in fact it is not. Given its unusual behavior, strncpy arguably should not be in the ANSI standard at all, but it was included because it was so widely used in pre-ANSI implementations of C.
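The inconsistency is easy to demonstrate with the standard library alone. In the sketch below, `fTerminated` and `fStrncpyTerminates` are hypothetical helper names; the 4-byte buffer size is arbitrary.

```c
#include <assert.h>
#include <string.h>

/* Returns nonzero if buf[0..size-1] contains a terminating '\0'. */
int fTerminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Copies src into a 4-byte buffer pre-filled with 'x' and reports
 * whether strncpy left the result nul-terminated. */
int fStrncpyTerminates(const char *src)
{
    char buf[4];

    memset(buf, 'x', sizeof(buf));
    strncpy(buf, src, sizeof(buf));
    return fTerminated(buf, sizeof(buf));
}
```

A source string shorter than the buffer is copied and padded with nul characters, but a source of four or more characters fills the buffer completely and leaves it unterminated.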

4) The C++ inline function specifier is valuable because it lets you define functions that are as efficient as macros, but without the troublesome side effects that macro "functions" have when they evaluate their arguments.

5) The new & reference arguments in C++ have a serious problem: they hide the fact that a variable is being passed by reference rather than by value, which can cause confusion. For example, suppose you redefine the fResizeMemory function to use a reference argument. Programmers can then write:

if (fResizeMemory(pb, sizeNew))
    resize was successful

Notice, however, that a programmer who is not familiar with the function has no reason to think that pb might change during the call. How do you think that will affect program maintenance?

On a related point, C programmers often modify the formal parameters inside their functions, because they know that those parameters are passed by value, not by reference. Now consider maintenance programmers who are fixing errors in functions they did not write. If they fail to notice the & in the declaration, they may modify the parameter without realizing that the change is not local to the function.

6) The problem with the strcmp interface is that its return value leads to code that is hard to understand. To improve on strcmp, design the interface so that the return value is easy to understand even for programmers who are not familiar with the function.

One possible interface makes only a small change to the existing strcmp. Instead of returning any positive or negative value for unequal strings, and so forcing programmers to write every comparison against 0, modify strcmp to return one of three well-named constants:

if (strcmp(strLeft, strRight) == STR_LESS)
if (strcmp(strLeft, strRight) == STR_GREATER)
if (strcmp(strLeft, strRight) == STR_EQUAL)

Another possible interface uses a separate function for each kind of comparison:

if (fStrLess(strLeft, strRight))
if (fStrGreater(strLeft, strRight))
if (fStrEqual(strLeft, strRight))

The advantage of the second interface is that it can be implemented with macros on top of the existing strcmp function. Defining macros for the <= and >= comparisons as well improves readability considerably. The result is more readable code with no loss in either size or speed.

#define fStrLess(strLeft, strRight)       (strcmp(strLeft, strRight) <  0)
#define fStrGreater(strLeft, strRight)    (strcmp(strLeft, strRight) >  0)
#define fStrEqual(strLeft, strRight)      (strcmp(strLeft, strRight) == 0)
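The <= and >= comparisons mentioned above can be covered in the same way. The macro names `fStrLessEq` and `fStrGreaterEq` and the helper `strFirst` below are hypothetical, chosen only to match the naming style of the other macros.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for the <= and >= comparisons. */
#define fStrLessEq(strLeft, strRight)     (strcmp(strLeft, strRight) <= 0)
#define fStrGreaterEq(strLeft, strRight)  (strcmp(strLeft, strRight) >= 0)

/* Example use: returns whichever of the two strings sorts first. */
const char *strFirst(const char *strLeft, const char *strRight)
{
    return fStrLessEq(strLeft, strRight) ? strLeft : strRight;
}
```

A call such as `fStrLessEq(strA, strB)` reads the way the comparison is spoken, which is the readability gain the answer describes.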

Chapter 6

1) The portable range of a "plain" 1-bit bit field is 0, which is not very useful. The bit field does have a nonzero state, but you cannot know what its value is: it may be -1 or 1, depending on whether the compiler treats plain bit fields as signed or unsigned by default. If you restrict every comparison to 0, you can safely use both states of the bit field. If psw.carry is a plain 1-bit field, you can safely write the following code:

if (psw.carry == 0)        if (!psw.carry)
if (psw.carry != 0)        if (psw.carry)

The following statements, however, are risky because they depend on the compiler you use:

if (psw.carry == 1)        if (psw.carry == -1)
if (psw.carry != 1)        if (psw.carry != -1)

2) Functions that return Boolean values are like "plain" 1-bit fields: you cannot say for certain what value will be returned for "TRUE". You can rely on FALSE being 0, but programmers often return any convenient nonzero value for "TRUE", and such a value is of course not equal to the constant TRUE. If fNewMemory returns a Boolean value, you can safely write the following code:

if (fNewMemory(...) == FALSE)
if (fNewMemory(...) != FALSE)

Even better:

if (fNewMemory(...))
if (!fNewMemory(...))

The following code, however, is risky because it assumes that fNewMemory never returns any nonzero value other than TRUE:

if (fNewMemory(...) == TRUE)    /* Risky! */

A good rule to remember: never compare a Boolean value to TRUE.
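The rule is easy to see with a Boolean function that returns whatever nonzero value it has at hand. In the sketch below, `fIsFlagSet` and `cStylesThatWork` are made-up names used only for this illustration.

```c
#include <assert.h>

#define FALSE 0
#define TRUE  1

typedef int flag;

/* Hypothetical Boolean function: "true" is whatever bit survives the
 * mask, so the result is nonzero but not necessarily equal to TRUE. */
flag fIsFlagSet(unsigned flags, unsigned mask)
{
    return (flag)(flags & mask);
}

/* Counts how many of three common test styles notice that bit 0x08
 * is set in 0x0C. */
int cStylesThatWork(void)
{
    int c = 0;

    if (fIsFlagSet(0x0C, 0x08))           c++;  /* safe */
    if (fIsFlagSet(0x0C, 0x08) != FALSE)  c++;  /* safe */
    if (fIsFlagSet(0x0C, 0x08) == TRUE)   c++;  /* risky: 8 is not 1 */
    return c;
}
```

Only the first two styles see the flag; the comparison against TRUE silently takes the wrong branch because the function returned 8.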

3) If you declare wndDisplay as a global window structure, you give it a special attribute that no other window structure has: it is global. This looks like a minor detail, but it can introduce unexpected errors. For example, suppose you want to write a routine that frees a window and all of its subwindows. The following function does that:

void FreeWindowTree(window *pwndRoot)
{
    if (pwndRoot != NULL)
    {
        window *pwnd, *pwndNext;

        ASSERT(fValidWindow(pwndRoot));

        for (pwnd = pwndRoot->pwndChild; pwnd != NULL; pwnd = pwndNext)
        {
            pwndNext = pwnd->pwndSibling;
            FreeWindowTree(pwnd);
        }

        if (pwndRoot->strWndTitle != NULL)
            FreeMemory(pwndRoot->strWndTitle);

        FreeMemory(pwndRoot);
    }
}

Notice that if you want to free every window, you can safely pass pwndDisplay, because it points to an allocated window structure. You cannot pass &wndDisplay, however, because the code would then try to free wndDisplay, which is impossible: wndDisplay is a global window structure. To make the code work with &wndDisplay, you would have to insert the following test before the final call to FreeMemory:

if (pwndRoot != &wndDisplay)

If you do that, the code comes to depend on a global data structure. Ugh!

If you want code that is free of errors, one of the best approaches is to avoid odd design quirks in the implementation.

4) The second version of the code is riskier than the first, for several reasons. In the first version A, D and the expression are common code, so they are executed and tested no matter what the value of f is. In the second version each copy of A and D has to be tested separately; unless the copies are identical, you risk missing an error in one of the branches. (The two A's and the two D's would differ if, for example, each had been optimized specifically for B or for C.)

The second version has another problem: it is hard to keep the two A's and the two D's synchronized as programmers fix errors or improve the code, particularly when the copies are not identical. So use the first version unless computing f is so expensive that the cost would be noticeable. Here is another very useful rule to remember: minimize code differences by maximizing the amount of common code.

5) Using similar names such as s1 and s2 is dangerous, because it is easy to type s1 when you mean s2. Worse, the compiler may not catch the error. Similar names also make it hard to spot places where the names have been reversed:

int strcmp(const char *s1, const char *s2)
{
    for (NULL; *s1 == *s2; s1++, s2++)
    {
        if (*s1 == '\0')    /* Match the end? */
            return (0);
    }

    return ((*(unsigned char *)s2 < *(unsigned char *)s1) ? -1 : 1);
}

The code above is wrong: the test on the last line is reversed. Because the names themselves carry no meaning, the error is hard to spot. With descriptive, distinct names such as sLeft and sRight, both kinds of error become less likely and the code is easier to read.

6) The ANSI standard guarantees that you can address the byte just past the end of any declared data object, but it does not guarantee that you can address the byte just before the start of one; nor does the standard guarantee that you can address the byte in front of a memory block allocated by malloc.

For example, pointers in some 80x86 memory models are implemented as base:offset pairs in which only the unsigned offset is manipulated. If pchStart points to the start of an allocated memory block, its offset is 0. If pch starts at pchStart + size and is stepped downward, it can never become less than pchStart, because its offset can never be less than pchStart's offset of 0.

7a) If str contains any % characters, using printf(str) instead of printf("%s", str) is an error, because printf will misinterpret the % characters in str as format specifications. The trouble with printf("%s", str) is that it can so "obviously" be optimized to printf(str) that a careless programmer may introduce the error while cleaning up the code.

7b) Using f = 1 - f instead of f = !f is risky because it assumes that f is either 0 or 1. Using !f makes it clear that a flag is being flipped, and it works for every value of f. The only reason to use 1 - f is that it may generate more efficient code than !f, but remember that local efficiency gains rarely have any effect on the overall performance of a program. Using 1 - f only increases the risk of errors.
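The difference between the two ways of flipping a flag shows up as soon as the flag holds a nonzero value other than 1. The function names in this small sketch are hypothetical.

```c
#include <assert.h>

/* Flip a flag with logical negation: any nonzero value becomes 0,
 * and 0 becomes 1. */
int fFlipNot(int f)
{
    return !f;
}

/* Flip a flag arithmetically: correct only when f is exactly 0 or 1. */
int fFlipSub(int f)
{
    return 1 - f;
}
```

For a "true" flag stored as 8, `fFlipNot` yields 0 as intended, while `fFlipSub` yields -7, which is still nonzero and therefore still "true".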

7c) The risk of using multiple assignments in one statement is that they can cause unwanted data type conversions. In the example given, the programmer carefully declared ch as an int so that it could properly hold the EOF value that getchar may return. But the value returned by getchar is first stored into a string, which converts it to a char, and it is that char which is assigned to ch, not the int that getchar returned. If EOF is negative on your system and your compiler defaults to unsigned characters, the error shows up quickly. If your compiler defaults to signed characters, however, EOF may be truncated to a character and then, when it is widened back to an int, happen to equal EOF again. That does not mean the code works correctly. The EOF problem is no longer visible, but you have lost the ability to distinguish EOF from the character that has the same value as the truncated EOF.

8) In the typical case, tables make code smaller and faster, and they can be used to simplify code, which increases the probability that it is correct. When you take the data in the table into account, however, you may reach the opposite conclusion. First, the code may be smaller, but the table itself takes up memory; overall, a table-driven solution may use more memory than an implementation without tables. The other problem with tables is risk: you must make sure that the data in the table is completely correct. Sometimes that is easy, as with the tolower and uCycleCheckBox tables. For tables such as the ones in the disassembler in Chapter 2, though, it is hard to be sure that the data is completely correct, because errors are easy to introduce. This gives a rule: do not use tables unless you can validate the data.

9) If your compiler does not perform such basic optimizations as turning multiplications and divisions into shifts (where appropriate), it must have far worse code-generation problems for you to worry about than the small improvement of replacing a division with a shift. Do not struggle to overcome the limitations of a poor compiler with efficiency tricks. Instead, keep your code clear and find a better compiler.

10) To guarantee that the user's file can always be saved, allocate the buffer before the user changes the file. If every file needs its own buffer, allocate one each time a file is opened. If the allocation fails, either do not open the file or open it as read-only. If a single buffer can handle all open files, however, you can allocate it during initialization. And do not worry about "wasting" memory when the buffer sits idle most of the time. It is better to "waste" the memory and be sure you can save the user's data than to let the user work for 5 hours and then fail to save the data for lack of a buffer.

Chapter 7

1) The following code modifies both of the function's input arguments, pchTo and pchFrom:

char *strcpy(char *pchTo, char *pchFrom)
{
    char *pchStart = pchTo;

    while (*pchTo++ = *pchFrom++)
        NULL;

    return (pchStart);
}

Modifying pchTo and pchFrom does not violate the write permissions associated with these two arguments, because they are passed by value: strcpy receives copies of the inputs and is therefore allowed to modify them. Note, however, that not all programming languages (FORTRAN, for example) pass arguments by value, so although this practice is safe in C, it could be quite dangerous in other languages.

2) The problem with strDigits is that it is declared as a static pointer rather than as a static buffer. That declaration causes a subtle problem if your compiler has an option that tells it to treat all string literals as constants. Compilers that support such a "constant string literals" option take all string literals and store them with the other constants in the program. Since constants do not change, these compilers usually scan all of the constant strings and remove the duplicates. In other words, if strFromUns and strFromInt both declare static pointers to a string such as "?????", the compiler may allocate one copy of the string rather than two. Some compilers are even more thorough and combine strings whenever one string matches the tail of another (for example, "her" matches the tail of "mother"). Changing one string then changes the other.

The way to guard against this problem is to treat all string literals as constants and to restrict the code to reading from them. If you want to modify a string, declare a character buffer instead of a string pointer:

char *strFromUns(unsigned u)
{
    static char strDigits[] = "?????";    /* 5 characters + '\0' */
    ......

This is still risky, though, because it relies on the correct number of "?" characters having been typed, and it assumes that the trailing nul character will not be destroyed. Is it even a good idea to use "?" as the placeholder? Does that string really contain five "?" characters? If you cannot be sure, you can see why distinct characters would be a better choice.

A safer implementation declares the size of the buffer:

char *strFromUns(unsigned u)
{
    static char strDigits[6];    /* 5 characters + '\0' */
    ......

    pch = &strDigits[5];
    *pch = '\0';    /* Replaces the ASSERT. */
    ......
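For reference, a complete function along these lines might look like the sketch below. It is an illustration rather than the book's listing: instead of a fixed six-character buffer, it sizes the buffer from the width of unsigned through a hypothetical `cchMaxUns` macro, so that it also works where unsigned is wider than 16 bits.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Room for any unsigned value in decimal: every 3 bits yield at most
 * one digit, plus one digit of slack and the terminating '\0'.
 * (Hypothetical macro, not from the book.) */
#define cchMaxUns (sizeof(unsigned) * CHAR_BIT / 3 + 2)

/* Converts u to a decimal string held in a static buffer. The result
 * is overwritten by the next call. */
char *strFromUns(unsigned u)
{
    static char strDigits[cchMaxUns];
    char *pch;

    pch = &strDigits[cchMaxUns - 1];
    *pch = '\0';                    /* set the terminator explicitly */
    do
        *--pch = (char)(u % 10 + '0');
    while ((u /= 10) > 0);

    return pch;
}
```

Because the terminator is written on every call and the digits are filled in from the end, the function never depends on the initial contents of the buffer.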

3) Using memset to initialize adjacent variables is both very risky and very inefficient compared with simple assignment:

i = 0;    /* Set i, j and k to zero. */
j = 0;
k = 0;

Or more simply:

i = j = k = 0;    /* Set i, j and k to zero. */

This code is portable and efficient, and it is so obvious that it hardly needs a comment. The memset version is another matter.

I cannot say for certain what the original programmer hoped to gain, but I am sure that nothing was gained. With all but the best compilers, the call to memset is more expensive than explicitly assigning to i, j and k. But suppose the programmer is using an excellent compiler, one that expands small fills inline whenever the length is known at compile time, so that the "call" turns into three stores of sizeof(int) bytes each. That does not improve the situation: the code still assumes that the compiler allocates i, j and k on the stack with k lowest in memory, and it still assumes that i, j and k are adjacent, with no extra "pad" bytes between them to align the variables for efficient access. And who says the variables are in main memory at all? A good compiler performs lifetime analysis and keeps variables in registers for their entire lifetimes. For example, i and j might always be kept in registers and never exist in main memory. On the other hand, k must be allocated in main memory because its address is passed to memset (you cannot take the address of a register). In that case i and j are never initialized, and the 2 * sizeof(int) bytes that follow k are incorrectly set to zero.

4) When you call or jump to a fixed address in the machine's ROM, you face two hazards. The first is that although the ROM in your own machine may never change, the ROM in new models of the hardware almost certainly will. The second is that even if the ROM routines stay where they are, hardware vendors sometimes fix errors in the ROM with software patches that reside in RAM, and those patches are invoked through the system interfaces. If you bypass the interfaces, you bypass the patches as well.
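The patching mechanism can be pictured with a small model. Everything in the sketch below is hypothetical (the table, the routine names, and the deliberately buggy "ROM" routine), and real systems differ in detail. The shape is the point: the system interface goes through a table in RAM that the vendor can repoint, and a direct call skips that table.

```c
#include <assert.h>

typedef int (*SysRoutine)(int);

/* Hypothetical "ROM" routine with a bug: it is supposed to double n. */
static int RomDouble(int n) { return n + n + 1; }

/* Hypothetical RAM-resident patch that fixes the ROM bug. */
static int PatchedDouble(int n) { return n + n; }

/* Hypothetical system interface: a dispatch table that lives in RAM. */
enum { sysDouble, sysMax };
static SysRoutine sysTable[sysMax] = { RomDouble };

/* What the vendor's patch installer would do at startup. */
void InstallPatches(void)
{
    sysTable[sysDouble] = PatchedDouble;
}

/* Going through the interface picks up whatever patch is installed. */
int CallSystem(int id, int arg)
{
    assert(id >= 0 && id < sysMax);
    return sysTable[id](arg);
}

/* Jumping straight to the "ROM" routine bypasses the patch. */
int CallRomDirectly(int arg)
{
    return RomDouble(arg);
}
```

After InstallPatches runs, CallSystem reaches the corrected routine while CallRomDirectly still executes the buggy one, which is the second hazard in miniature.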

5) The problem with not passing val when it isn't needed is that the caller is then making assumptions about how DoOperation works internally, just as FILL did about CMOVE. Suppose some programmer later improves the code by rewriting it as follows, so that val is always referenced:

void DoOperation(operation op, int val)
{
    if (op ......)
        DoPrimaryOps(op, val);
    else if (op ......)
        DoFloatOps(op, val);
    else
        ......
}

What happens when DoOperation references a val that was never passed? That depends on your operating system. If val falls in a protected region of the stack frame, the program may terminate abnormally.

You can make it hard for programmers to play such tricks with your functions by requiring them to pass a placeholder for any argument that is not used. For example, the documentation could say, "Whenever you call DoOperation with opNegAcc, pass 0 for val." An assertion on the placeholder keeps programmers honest:

case opNegAcc:
    ASSERT(val == 0);    /* pass 0 for val */
    accumulator = -accumulator;
    break;
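Put in context, the placeholder check might sit in a function like the sketch below. The opAddAcc operation, the GetAccumulator helper, and the standard assert macro (standing in for ASSERT) are assumptions added so that the example compiles on its own.

```c
#include <assert.h>

/* Hypothetical operation codes; only opNegAcc appears in the text. */
typedef enum { opAddAcc, opNegAcc } operation;

static int accumulator = 0;

/* Hypothetical accessor, added so that callers can inspect the result. */
int GetAccumulator(void)
{
    return accumulator;
}

void DoOperation(operation op, int val)
{
    switch (op) {
    case opAddAcc:                  /* hypothetical operation that uses val */
        accumulator += val;
        break;

    case opNegAcc:
        assert(val == 0);           /* callers must pass 0 for val */
        accumulator = -accumulator;
        break;

    default:
        assert(0 && "unknown operation");
        break;
    }
}
```

A caller that passes a stray value with opNegAcc trips the assertion in a debug build, so the rule in the documentation is enforced rather than merely stated.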

6) The assertion verifies that f is either TRUE or FALSE. Not only is it unclear, but more to the point, there is no reason to be that clever in debug code, which is stripped from the ship version anyway. The assertion would be better written as:

ASSERT(f == TRUE || f == FALSE);

7) Rather than doing all the work in one line of code, declare a function pointer and split the line in two, as shown below:

void *memmove(void *pvTo, void *pvFrom, size_t size)
{
    void (*pfnMove)(byte *, byte *, size_t);
    byte *pbTo = (byte *)pvTo;
    byte *pbFrom = (byte *)pvFrom;

    pfnMove = (pbTo ......
    (*pfnMove)(pbTo, pbFrom, size);
    return (pvTo);
}
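The selection expression is incomplete in the listing above, so the sketch below fills it in with the conventional overlap rule as an assumption: copy forward when the destination lies below the source, and backward otherwise. The helper names CopyFromHead and CopyFromTail are hypothetical, and the function is called MemMoveSketch so that it does not collide with the library memmove.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char byte;

/* Hypothetical helper: copy forward, starting at the first byte. */
static void CopyFromHead(byte *pbTo, byte *pbFrom, size_t size)
{
    while (size-- > 0)
        *pbTo++ = *pbFrom++;
}

/* Hypothetical helper: copy backward, starting at the last byte. */
static void CopyFromTail(byte *pbTo, byte *pbFrom, size_t size)
{
    pbTo += size;
    pbFrom += size;
    while (size-- > 0)
        *--pbTo = *--pbFrom;
}

void *MemMoveSketch(void *pvTo, void *pvFrom, size_t size)
{
    void (*pfnMove)(byte *, byte *, size_t);
    byte *pbTo = (byte *)pvTo;
    byte *pbFrom = (byte *)pvFrom;

    assert(pvTo != NULL && pvFrom != NULL);

    /* Line 1: pick the direction that is safe when the blocks overlap. */
    pfnMove = (pbTo < pbFrom) ? CopyFromHead : CopyFromTail;

    /* Line 2: make the call. */
    (*pfnMove)(pbTo, pbFrom, size);

    return pvTo;
}
```

Splitting the decision from the call gives each line one job, and a debugger can show which routine was chosen before it runs.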

8) The problem is that code calling Print+4 depends on the internal implementation of Print. If a programmer changes Print without realizing that other code enters the routine 4 bytes past its entry point, the change can break every caller of Print+4. If you find code like this, rewrite it so that the entry point is declared explicitly. Even if the entry point has to sit in the middle of the routine, that at least puts it in plain view of the maintenance programmer:

              move  r0, #printer
              call  PrintDevice
              ......
PrintDisplay: move  r0, #display
PrintDevice:  ......            ; r0 == device ID

9) This kind of trick was popular when microcomputers had very little memory, because every byte was precious and the trick usually saved a byte or two. It was a bad habit then, and it is a worse one now. If anyone in your group still writes code like this, have them break the habit or have them leave the group. You don't need the trouble such code brings.

Chapter 8

Chapter 8 has no exercises.

Epilogue

Our discussion is over, and you may still be asking skeptically, "Do you really believe it is possible to write bug-free programs?" Of course, nobody can guarantee that absolutely, 100 percent. But I do believe that if you keep at it, you can write programs that come very close to being bug-free. It is like painting a room: you can do it without getting a drop on the carpet, but only if you lay down drop cloths, mask the trim, paint carefully, and stay determined to keep it that way. In the same way, you have to work to keep errors out of your code, and the only way to do that is to persist in the right direction.

Even if you make writing bug-free code your first priority, the techniques in this book alone will not get you all the way there. In fact, no list of guidelines can guarantee code that is free of every error. The most important thing, therefore, is to keep building your own list, recording the mistakes you find so that you do not repeat them. Some of the items on that list may surprise you.

For example, I once introduced an annoying little bug into Excel: while browsing through a file, I accidentally deleted a line. I didn't notice the mistake, and I merged the change into the master sources along with my other files. Someone else later found the bug and traced it back to me. I asked myself how I could detect and prevent errors like this one. The answer was clear: before merging changed code into the master sources, use the source code control system to list the changes. That extra step takes very little effort, and over the next five years it helped me catch three serious bugs and a number of smaller unintended changes. Was it worth it for three bugs in five years? To me it was, because for very little effort I know that I am not merging unintended changes into the master sources. Once again, it comes down to making bug prevention a priority.

You may find that code reviews solve a particular problem, or that better documentation helps the people inside the company who develop the product. If you are not using unit tests, you may decide that you should have started long ago. You may even find that the debug code described in Chapter 3, Exercise 6, is a great help to testers. Sometimes none of these solutions is practical, and then the best approach is to avoid the algorithms and implementations that cause the errors. You may never eliminate errors completely, but with persistent effort you can lengthen the time before the same mistake recurs. To help with that, Appendix A contains a checklist that pulls together all the points made in this book.

In short, the key to writing bug-free code comes down to one general principle:

Never allow the same error to appear twice.

