Chapter 7 Lexical Conventions

xiaoxiao2021-03-06  16


This chapter describes the basic elements of a C++ program. You use these elements, called "lexical elements" or "tokens," to construct statements, definitions, declarations, and so on, which in turn make up complete programs. This chapter discusses the following lexical elements:

* Tokens

* Comments

* Identifiers

* Keywords

* Punctuators

* Operators

* Literals

This chapter also includes Table 1.1, which gives the precedence and associativity of the C++ operators (from highest to lowest precedence). See Chapter 4, "Expressions," for a complete discussion of operators.

Overview of File Translation

C++ programs, like C programs, consist of one or more files. Each file is translated in the following conceptual order (the actual order follows the "as if" rule: translation must occur as if these steps had been followed):

1. Lexical tokenizing: This translation phase performs character mapping and trigraph processing, line splicing, and tokenization.

2. Preprocessing: This translation phase brings in the auxiliary source files referenced by #include directives, handles "stringizing" and "charizing" directives, and performs token pasting and macro expansion (for more information, see "Preprocessor Directives" in the Preprocessor Reference). The result of the preprocessing phase is a single sequence of tokens that, taken together, define a "translation unit."

The preprocessor directives always begin with the number sign (#) (that is, the first nonwhite-space character on the line must be a number sign). Only one preprocessor directive can appear on a given line.

For example:

#include <iostream.h>  // Include the text of iostream.h in the translation unit.
#define NDEBUG         // Define NDEBUG (NDEBUG contains the empty text string).

3. Code generation: This translation phase uses the tokens generated in the preprocessing phase to generate object code. During this phase, syntactic and semantic checking of the source code is performed.

For more information, see "Phases of Translation" in the Preprocessor Reference. The C++ preprocessor is a strict superset of the ANSI C preprocessor, but the C++ preprocessor differs in certain instances.

The following list describes several differences between the ANSI C and C++ preprocessors:

* Single-line comments are supported. See "Comments" for details.

* One predefined macro, __cplusplus, is defined only for C++. For details, see "Predefined Macros" in the Preprocessor Reference.

* The C preprocessor does not recognize the C++ operators .*, ->*, and ::. For details on these operators, see "Operators" later in this chapter and Chapter 4, "Expressions."
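The second difference in the list above can be illustrated with a minimal sketch. The function name compiledAsCpp is hypothetical and chosen only for this example; the only thing taken from the text is the macro __cplusplus.

```cpp
// Hypothetical helper: returns 1 when this translation unit was compiled
// as C++, because the preprocessor predefines __cplusplus only in that case.
int compiledAsCpp()
{
#ifdef __cplusplus
    return 1;   // C++ compilation: the macro is predefined
#else
    return 0;   // C compilation: the macro is absent
#endif
}
```

The same test is commonly used in headers shared between C and C++ source files.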

Tokens

A token is the smallest element of a C++ program that is meaningful to the compiler. The C++ parser recognizes these kinds of tokens: identifiers, keywords, literals, operators, punctuators, and other separators. A stream of these tokens makes up a translation unit.

Tokens are usually separated by "white space." White space can be one or more of the following:

* Blanks

* Horizontal or vertical tabs

* New lines

* Form feeds

* Comments

Syntax

token:
    keyword
    identifier
    constant
    operator
    punctuator

preprocessing-token:
    header-name
    identifier
    pp-number
    character-constant
    string-literal
    operator
    punctuator
    each nonwhite-space character that cannot be one of the above

The parser separates tokens from the input stream by scanning the input characters from left to right and creating the longest token possible. Consider this code fragment:

a = i+++j;

The programmer who wrote the code might have intended either of these two statements:

a = i + (++j)

a = (i++) + j

Because the parser creates the longest token possible from the input stream, it chooses the second interpretation, making the tokens i++, +, and j.
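The longest-token rule can be observed directly. The following is a minimal sketch; the function name longestTokenDemo is hypothetical and exists only for this illustration.

```cpp
// Evaluates "a = i+++j" and returns a. The parameters are references so the
// caller can see which operand was incremented.
int longestTokenDemo(int &i, int &j)
{
    int a = i+++j;   // tokenized as i ++ + j, that is, (i++) + j
    return a;
}
```

Called with i equal to 1 and j equal to 2, the function returns 3 and leaves i at 2 and j at 2, which matches the second interpretation. Writing the parentheses explicitly is usually the clearer choice.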

Comments

A comment is text that the compiler ignores but that is useful for programmers. Comments are normally used to annotate code for future reference. The compiler treats them as white space. You can use comments during testing to make certain lines of code inactive; however, the #if/#endif preprocessor directives work better for this purpose, because they can surround code that contains comments, whereas comments cannot be nested.

A C++ comment is written in one of the following ways:

* The /* (slash, asterisk) characters, followed by any sequence of characters (including new lines), followed by the */ characters. This syntax is the same as ANSI C.

* The // (two slashes) characters, followed by any sequence of characters. A new line not immediately preceded by a backslash terminates this form of comment. Therefore, it is commonly called a "single-line comment."

The comment characters (/*, */, and //) have no special meaning within a character constant, string literal, or comment. Comments using the first syntax, therefore, cannot be nested. Consider this example:

/* Intent: Comment out this block of code.
   Problem: Nested comments on each line of code are illegal.

FileName = String( "hello.dat" );      /* Initialize file string */
cout << "File: " << FileName << "\n";  /* Print status message */
*/

The preceding code will not compile because the compiler scans the input stream from the first /* to the first */ and treats everything in between as a comment. In this case, the first */ occurs at the end of the "Initialize file string" comment. The last */ is then no longer paired with an opening /*.
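As the text notes, #if/#endif directives can disable code that already contains comments. The following is a minimal sketch of that approach; the function name activeLines and its body are hypothetical.

```cpp
// Counts the statements that are actually compiled in this function.
int activeLines()
{
    int count = 0;
    count += 1;      /* compiled */
#if 0
    count += 10;     /* disabled; the comments inside cause no problem */
    count += 100;    /* still disabled */
#endif
    count += 1;      /* compiled again */
    return count;
}
```

Because the preprocessor discards everything between #if 0 and #endif, the function returns 2, and the comments inside the disabled region never need to nest.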

Note: The single-line form (//) of a comment followed by the line-continuation token (\) can have surprising effects. Consider this code:

#include <stdio.h>

int main()
{
    printf( "This is a number %d", // \
    5 );
}

After preprocessing, the preceding code contains errors and appears as follows:

#include <stdio.h>

int main()
{
    printf( "This is a number %d",
}

Identifiers

An identifier is a sequence of characters used to denote one of the following:

* Object or variable name

* Class, structure, or union name

* Enumerated type name

* Member of a class, structure, union, or enumeration

* Function or class-member function

* typedef name

* Label name

* Macro name

* Macro parameter

Syntax

identifier:
    nondigit
    identifier nondigit
    identifier digit

nondigit: one of
    _ a b c d e f g h i j k l m n o p q r s t u v w x y z
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

digit: one of
    0 1 2 3 4 5 6 7 8 9

Microsoft Specific

Only the first 247 characters of Microsoft C++ identifiers are significant. This restriction is complicated by the fact that names for user-defined types are "decorated" by the compiler to preserve type information. The resulting name, including the type information, cannot be longer than 247 characters. (For more information, see "Decorated Names" in the Microsoft Visual C++ 6.0 Programmer's Guide.) Factors that can influence the length of a decorated identifier are:

* Whether the identifier denotes an object of user-defined type or a type derived from a user-defined type.

* Whether the identifier denotes a function or a type derived from a function.

* The number of arguments to a function.

END Microsoft Specific

The first character of an identifier must be an alphabetic character, either uppercase or lowercase, or an underscore (_). Because C++ identifiers are case sensitive, fileName is different from FileName.

Identifiers cannot have exactly the same spelling and case as keywords. Identifiers that contain keywords are legal. For example, Pint is a legal identifier, even though it contains int, which is a keyword.

Use of two sequential underscore characters (__) at the beginning of an identifier, or a single leading underscore followed by a capital letter, is reserved for C++ implementations in all scopes. You should avoid using a leading underscore followed by a lowercase letter for names with file scope, because such names may conflict with current or future reserved identifiers.
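The identifier rules above can be checked with a short sketch. The function name identifierDemo and the values are hypothetical; the identifiers fileName, FileName, and Pint come from the text.

```cpp
// Declares several legal identifiers and returns their sum.
int identifierDemo()
{
    int fileName = 1;   // distinct from FileName: identifiers are case sensitive
    int FileName = 2;
    int Pint = 3;       // legal: contains the keyword int but is not identical to it
    int _count = 4;     // leading underscore + lowercase letter: fine at block scope
    return fileName + FileName + Pint + _count;
}
```

The four declarations coexist in one scope, so the function returns 10. A name such as __count or _Count would fall into the reserved categories described above and is best avoided.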

Keywords

Keywords are predefined reserved identifiers that have special meanings. They cannot be used as identifiers in your program. The following keywords are reserved for C++:

Syntax

keyword: one of
    asm*  auto  bad_cast  bad_typeid  bool  break  case  catch
    char  class  const  const_cast  continue  default  delete  do
    double  dynamic_cast  else  enum  except  explicit  extern  false
    finally  float  for  int  goto  if  inline  long
    mutable  namespace  new  operator  private  protected  public  register
    reinterpret_cast  return  short  signed  sizeof  static  static_cast  struct
    switch  template  this  throw  true  try  type_info  typedef
    typeid  typename  union  unsigned  using  virtual  void  volatile
    while

* Reserved for compatibility with other C++ implementations, but not implemented. Use __asm.

Microsoft Specific

In Microsoft C++, identifiers with two leading underscores are reserved for compiler implementations. Therefore, the Microsoft convention is to precede Microsoft-specific keywords with double underscores. These words cannot be used as identifier names.

    allocate(3)     __inline                  property(3)
    __asm(1)        __int8                    selectany(3)
    __based(2)      __int16                   __single_inheritance
    __cdecl         __int32                   __stdcall
    __declspec      __int64                   thread(3)
    dllexport(3)    __leave                   __try
    dllimport(3)    __multiple_inheritance    uuid(3)
    __except        naked(3)                  __uuidof
    __fastcall      nothrow(3)                __virtual_inheritance
    __finally

1. Replaces the C++ asm syntax.

2. The __based keyword has limited uses for 32-bit target compilations.

3. These are special identifiers when used with __declspec; their use in other contexts is not restricted.

Microsoft extensions are enabled by default. To ensure that your programs are fully portable, you can disable Microsoft extensions by specifying the ANSI-compatibility /Za command-line option during compilation. When you do this, Microsoft-specific keywords are disabled.

When Microsoft extensions are enabled, you can use the previously listed keywords in your programs. For ANSI compliance, these keywords are prefaced by a double underscore. For backward compatibility, single-underscore versions of all the keywords except __except, __finally, __leave, and __try are supported. In addition, __cdecl is available with no leading underscore.

END Microsoft Specific

Punctuators

Punctuators in C++ have syntactic and semantic meaning to the compiler but do not, of themselves, specify an operation that yields a value. Some punctuators, either alone or in combination, can also be C++ operators or be significant to the preprocessor.

Syntax

punctuator: one of
    ! % ^ & * ( ) - + = { } | ~
    [ ] \ ; ' : " < > , . / #

The punctuators [ ], ( ), and { } must appear in pairs after translation phase 4.

Operators

Operators specify an evaluation to be performed on one of the following:

* One operand (unary operator)

* Two operands (binary operator)

* Three operands (ternary operator)

The C++ language includes all C operators and adds several new operators. Table 1.1 lists the operators available in Microsoft C++.

Operators follow a strict precedence, which defines the evaluation order of expressions that contain them. Operators associate with either the expression on their left or the expression on their right; this is called "associativity."

Operators in the same group have equal precedence and are evaluated from left to right in an expression unless parentheses force a different order. Table 1.1 shows the precedence and associativity of the C++ operators (from highest to lowest precedence).

Table 1.1 C++ Operator Precedence and Associativity

Operator            Name or Meaning                           Associativity
::                  Scope resolution                          None
::                  Global                                    None
[ ]                 Array subscript                           Left to right
( )                 Function call                             Left to right
( )                 Conversion                                None
->                  Member selection (pointer)                Left to right
.                   Member selection (object)                 Left to right
++                  Postfix increment                         None
--                  Postfix decrement                         None
new                 Allocate object                           None
delete              Deallocate object                         None
delete[ ]           Deallocate object                         None
++                  Prefix increment                          None
--                  Prefix decrement                          None
*                   Dereference                               None
&                   Address-of                                None
+                   Unary plus                                None
-                   Arithmetic negation (unary)               None
!                   Logical NOT                               None
~                   Bitwise complement                        None
sizeof              Size of object                            None
sizeof( )           Size of type                              None
typeid( )           Type name                                 None
(type)              Type cast (conversion)                    Right to left
const_cast          Type cast (conversion)                    None
dynamic_cast        Type cast (conversion)                    None
reinterpret_cast    Type cast (conversion)                    None
static_cast         Type cast (conversion)                    None
.*                  Apply pointer to class member (objects)   Left to right
->*                 Dereference pointer to class member       Left to right
*                   Multiplication                            Left to right
/                   Division                                  Left to right
%                   Remainder (modulus)                       Left to right
+                   Addition                                  Left to right
-                   Subtraction                               Left to right
<<                  Left shift                                Left to right
>>                  Right shift                               Left to right
<                   Less than                                 Left to right
>                   Greater than                              Left to right
<=                  Less than or equal to                     Left to right
>=                  Greater than or equal to                  Left to right
==                  Equality                                  Left to right
!=                  Inequality                                Left to right
&                   Bitwise AND                               Left to right
^                   Bitwise exclusive OR                      Left to right
|                   Bitwise OR                                Left to right
&&                  Logical AND                               Left to right
||                  Logical OR                                Left to right
e1 ? e2 : e3        Conditional                               Right to left
=                   Assignment                                Right to left
*=                  Multiplication assignment                 Right to left
/=                  Division assignment                       Right to left
%=                  Modulus assignment                        Right to left
+=                  Addition assignment                       Right to left
-=                  Subtraction assignment                    Right to left
<<=                 Left-shift assignment                     Right to left
>>=                 Right-shift assignment                    Right to left
&=                  Bitwise AND assignment                    Right to left
|=                  Bitwise inclusive OR assignment           Right to left
^=                  Bitwise exclusive OR assignment           Right to left
,                   Comma                                     Left to right
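A few rows of the table can be checked with a small sketch. The function names below are hypothetical and exist only for this illustration.

```cpp
// Multiplication has higher precedence than addition: 2 + (3 * 4).
int mulBeforeAdd()
{
    return 2 + 3 * 4;
}

// Subtraction associates left to right: (10 - 4) - 3.
int leftToRight()
{
    return 10 - 4 - 3;
}

// Assignment associates right to left: c = (d = 5).
int rightToLeft()
{
    int c = 0;
    int d = 0;
    c = d = 5;
    return c + d;
}
```

The three functions return 14, 3, and 10 respectively. Parentheses can always be added to override or simply document the grouping.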

Literals

Invariant program elements are called "literals" or "constants." The terms "literal" and "constant" are used interchangeably here.

Literals fall into four major categories: integer, character, floating-point, and string literals.

Syntax

literal:
    integer-constant
    character-constant
    floating-constant
    string-literal

Integer Constants

Integer constants are constant data elements that have no fractional part or exponent. They always begin with a digit. You can specify integer constants in decimal, octal, or hexadecimal form. They can specify signed or unsigned types and long or short types.

Syntax

integer-constant:
    decimal-constant integer-suffix(opt)
    octal-constant integer-suffix(opt)
    hexadecimal-constant integer-suffix(opt)
    'c-char-sequence'

decimal-constant:
    nonzero-digit
    decimal-constant digit

octal-constant:
    0
    octal-constant octal-digit

hexadecimal-constant:
    0x hexadecimal-digit
    0X hexadecimal-digit
    hexadecimal-constant hexadecimal-digit

nonzero-digit: one of
    1 2 3 4 5 6 7 8 9

octal-digit: one of
    0 1 2 3 4 5 6 7

hexadecimal-digit: one of
    0 1 2 3 4 5 6 7 8 9
    a b c d e f
    A B C D E F

integer-suffix:
    unsigned-suffix long-suffix(opt)
    long-suffix unsigned-suffix(opt)

unsigned-suffix: one of
    u U

long-suffix: one of
    l L

64-bit integer-suffix:
    i64

To specify integer constants using octal or hexadecimal notation, use a prefix that denotes the base. To specify an integer constant of a given integral type, use a suffix that denotes the type.

To specify a decimal constant, begin the specification with a nonzero digit. For example:

int i = 157;   // Decimal constant
int j = 0198;  // Not a decimal number; erroneous octal constant
int k = 0365;  // Leading zero specifies an octal constant, not a decimal number

To specify an octal constant, begin the specification with 0, followed by a sequence of digits in the range 0 through 7. The digits 8 and 9 are errors in an octal constant. For example:

int i = 0377;  // Octal constant
int j = 0397;  // Error: 9 is not an octal digit

To specify a hexadecimal constant, begin the specification with 0x or 0X (the case of the "x" does not matter), followed by a sequence of digits in the range 0 through 9 and a (or A) through f (or F). The hexadecimal digits a (or A) through f (or F) represent the values 10 through 15. For example:

int i = 0x3fff;  // Hexadecimal constant
int j = 0X3FFF;  // Equal to i

To specify an unsigned type, use the u or U suffix. To specify a long type, use the l or L suffix. For example:

unsigned uVal = 328u;             // Unsigned value
long lVal = 0x7FFFFL;             // Long value specified as a hexadecimal constant
unsigned long ulVal = 0776745ul;  // Unsigned long value
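The effect of the base prefixes can be verified with a minimal sketch. The function names are hypothetical; the constants are the ones used in the examples above.

```cpp
// Each function returns the value of one constant from the text.
int octal377()  { return 0377; }     // 3*64 + 7*8 + 7
int octal365()  { return 0365; }     // 3*64 + 6*8 + 5, not decimal 365
int hexLower()  { return 0x3fff; }   // lowercase prefix and digits
int hexUpper()  { return 0X3FFF; }   // uppercase prefix and digits, same value
```

Here octal377 returns 255, octal365 returns 245, and both hexadecimal functions return 16383, which shows that a leading zero changes the base rather than padding a decimal number.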

Character Constants

Character constants are one or more members of the "source character set," the character set in which a program is written, surrounded by single quotation marks ('). They are used to represent characters in the "execution character set," the character set on the machine where the program executes.

Microsoft Specific

For Microsoft C++, the source and execution character sets are both ASCII.

END Microsoft Specific

There are three kinds of character constants:

* Normal character constants

* Multicharacter constants

* Wide-character constants

Note: Use wide-character constants in place of multicharacter constants to ensure portability.

Character constants are specified as one or more characters enclosed in single quotation marks. For example:

char ch = 'x';         // Specify a normal character constant
int mbch = 'ab';       // Specify a system-dependent multicharacter constant
wchar_t wcch = L'ab';  // Specify a wide-character constant

Note that mbch is of type int. If it were declared as type char, the second byte would not be retained. A multicharacter constant has four meaningful characters; specifying more than four generates an error message.

Syntax

character-constant:
    'c-char-sequence'
    L'c-char-sequence'

c-char-sequence:
    c-char
    c-char-sequence c-char

c-char:
    any member of the source character set except the single quotation mark ('), backslash (\), or newline character
    escape-sequence

escape-sequence:
    simple-escape-sequence
    octal-escape-sequence
    hexadecimal-escape-sequence

simple-escape-sequence: one of
    \' \" \? \\ \a \b \f \n \r \t \v

octal-escape-sequence:
    \ octal-digit
    \ octal-digit octal-digit
    \ octal-digit octal-digit octal-digit

hexadecimal-escape-sequence:
    \x hexadecimal-digit
    hexadecimal-escape-sequence hexadecimal-digit

Microsoft C++ supports normal, multicharacter, and wide-character constants. Use wide-character constants to specify members of the extended execution character set (for example, to support an international application). Normal character constants have type char, multicharacter constants have type int, and wide-character constants have type wchar_t. (The type wchar_t is defined in the standard include files stddef.h, stdlib.h, and string.h. The wide-character functions, however, are prototyped only in stdlib.h.)

The only difference in specification between normal and wide-character constants is that wide-character constants are preceded by the letter L. For example:

char schar = 'x';             // Normal character constant
wchar_t wchar = L'\x81\x19';  // Wide-character constant

Table 1.2 shows reserved or nongraphic characters that are system dependent or not allowed within character constants. These characters should be represented with escape sequences.

Table 1.2 C++ Reserved or Nongraphic Characters

Character               ASCII Representation   ASCII Value   Escape Sequence
Newline                 NL (LF)                10 or 0x0a    \n
Horizontal tab          HT                     9             \t
Vertical tab            VT                     11 or 0x0b    \v
Backspace               BS                     8             \b
Carriage return         CR                     13 or 0x0d    \r
Formfeed                FF                     12 or 0x0c    \f
Alert                   BEL                    7             \a
Backslash               \                      92 or 0x5c    \\
Question mark           ?                      63 or 0x3f    \?
Single quotation mark   '                      39 or 0x27    \'
Double quotation mark   "                      34 or 0x22    \"
Octal number            ooo                    -             \ooo
Hexadecimal number      hhh                    -             \xhhh
Null character          NUL                    0             \0

If the character following the backslash does not specify a legal escape sequence, the result is implementation defined. In Microsoft C++, the character following the backslash is taken literally, as though the escape were not present, and a level 1 warning ("unrecognized character escape sequence") is issued. Octal escape sequences, specified in the form \ooo, consist of a backslash and one, two, or three octal digits. Hexadecimal escape sequences, specified in the form \xhhh, consist of the characters \x followed by a sequence of hexadecimal digits. Unlike octal escape sequences, hexadecimal escape sequences have no limit on the number of digits.

Octal escape sequences are terminated by the first character that is not an octal digit, or when three digits have been seen. For example:

wchar_t och = L'\076a';  // Sequence terminates at a
char ch = '\233';        // Sequence terminates after 3 characters

Similarly, hexadecimal escape sequences terminate at the first character that is not a hexadecimal digit.

Because hexadecimal digits include the letters a through f (and A through F), make sure the escape sequence terminates at the intended digit. Because the single quotation mark (') encloses character constants, use the escape sequence \' to represent an enclosed single quotation mark. The double quotation mark (") can be represented without an escape sequence. The backslash character (\) is a line-continuation character when placed at the end of a line. If you want a backslash character to appear within a character constant, you must type two backslashes in a row (\\). (For more information about line continuation, see "Phases of Translation" in the Preprocessor Reference.)
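The escape sequences described above can be checked against their ASCII values with a minimal sketch. The function names are hypothetical, and the expected values assume an ASCII execution character set, as the text states for Microsoft C++.

```cpp
// Each function returns the numeric value of one character constant.
int newlineValue()    { return '\n'; }    // simple escape sequence
int octalEscape()     { return '\101'; }  // octal 101
int hexEscape()       { return '\x41'; }  // hexadecimal 41
int quoteValue()      { return '\''; }    // escaped single quotation mark
int backslashValue()  { return '\\'; }    // two backslashes yield one
int doubleQuote()     { return '"'; }     // no escape needed in a character constant
```

Under the ASCII assumption, these return 10, 65, 65, 39, 92, and 34. The octal and hexadecimal forms both denote the letter A.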

Floating-Point Constants

Floating-point constants specify values that must have a fractional part. These values contain a decimal point (.) and can contain an exponent.

Syntax

floating-constant:
    fractional-constant exponent-part(opt) floating-suffix(opt)
    digit-sequence exponent-part floating-suffix(opt)

fractional-constant:
    digit-sequence(opt) . digit-sequence
    digit-sequence .

exponent-part:
    e sign(opt) digit-sequence
    E sign(opt) digit-sequence

sign: one of
    + -

digit-sequence:
    digit
    digit-sequence digit

floating-suffix: one of
    f l F L

Floating-point constants have a "mantissa," which specifies the value of the number, an "exponent," which specifies the magnitude of the number, and an optional suffix that specifies the constant's type. The mantissa is specified as a sequence of digits followed by a period, followed by an optional sequence of digits representing the fractional part of the number. For example:

18.46

38.

The exponent, if present, specifies the magnitude of the number as a power of 10, as shown in the following example:

18.46e0  // 18.46

18.46e1  // 184.6

If an exponent is present, the trailing decimal point is unnecessary in whole numbers such as 18E0. Floating-point constants default to type double. By using the suffix f or l (or F or L; the suffix is not case sensitive), the constant can be specified as float or long double, respectively.

Although long double and double have the same representation, they are not the same type. For example, you can have overloaded functions such as:

void func( double );

and

void func( long double );
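Overload resolution makes the type selected by each suffix visible. The sketch below is an illustration only: it changes the return type of func to std::string and adds a float overload, neither of which appears in the original declarations.

```cpp
#include <string>

// Each overload reports the type of the argument it received.
std::string func(float)       { return "float"; }
std::string func(double)      { return "double"; }
std::string func(long double) { return "long double"; }
```

Under these assumptions, func(18.46f) selects the float overload, func(18.46) selects the double overload, and func(18.46L) selects the long double overload, because each literal is an exact match for one parameter type.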

String Literals

A string literal consists of zero or more characters from the source character set surrounded by double quotation marks ("). A string literal represents a sequence of characters that, taken together, form a null-terminated string.

Syntax

string-literal:
    "s-char-sequence(opt)"
    L"s-char-sequence(opt)"

s-char-sequence:
    s-char
    s-char-sequence s-char

s-char:
    any member of the source character set except the double quotation mark ("), backslash (\), or newline character
    escape-sequence

C++ strings have these types:

* Array of char[n], where n is the length of the string (in characters) plus 1 for the terminating '\0' that marks the end of the string

* Array of wchar_t, for wide-character strings

The result of modifying a string constant is undefined. For example:

char *szStr = "1234";
szStr[2] = 'A';  // Results undefined

Microsoft Specific

In some cases, identical string literals can be "pooled" to save space in the executable file. In string-literal pooling, the compiler causes all references to a particular string literal to point to the same location in memory, instead of having each reference point to a separate instance of the string literal. The /Gf compiler option enables string pooling.

END Microsoft Specific

When specifying string literals, adjacent strings are concatenated. Therefore, this declaration:

char szStr[] = "12" "34";

is identical to this declaration:

char szStr[] = "1234";

This concatenation of adjacent strings makes it easy to specify long strings across multiple lines:

cout << "Four score and seven years "
        "ago, our forefathers brought forth "
        "upon this continent a new nation.";

In the preceding example, the entire string "Four score and seven years ago, our forefathers brought forth upon this continent a new nation." is spliced together. This string can also be specified using line splicing as follows:

cout << "Four score and seven years \
ago, our forefathers brought forth \
upon this continent a new nation.";

After all adjacent strings in the constant have been concatenated, the null character, '\0', is appended to provide an end-of-string marker for C string-handling functions.
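The equivalence of the two declarations, and the single terminating null character, can be checked with a minimal sketch. The names joined, sameAsSingleLiteral, and joinedSize are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Two adjacent literals form one array with a single terminating '\0'.
const char joined[] = "12" "34";

// True when the concatenated literal equals the single literal "1234".
bool sameAsSingleLiteral()
{
    return std::strcmp(joined, "1234") == 0;
}

// Four characters plus the terminating null character.
std::size_t joinedSize()
{
    return sizeof(joined);
}
```

Here sameAsSingleLiteral returns true and joinedSize returns 5, so no null character is inserted between the two parts.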

When the first string contains an escape character, string concatenation can yield surprising results. Consider the following two declarations:

char szStr1[] = "\01" "23";

char szStr2[] = "\0123";

Although it is natural to assume that szStr1 and szStr2 contain the same values, the values they actually contain are shown in Figure 1.1.

Figure 1.1 Escapes and String Concatenation
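The difference between the two arrays can also be inspected in code. This is a minimal sketch with hypothetical function names; it relies on escape sequences being converted before adjacent literals are concatenated.

```cpp
#include <cstddef>

const char szStr1[] = "\01" "23";  // '\01', '2', '3', '\0'
const char szStr2[] = "\0123";     // '\012', '3', '\0'

std::size_t size1() { return sizeof(szStr1); }
std::size_t size2() { return sizeof(szStr2); }

// First element of each array as an integer.
int first1() { return szStr1[0]; }
int first2() { return szStr2[0]; }
```

The first array has four elements and starts with the value 1. The second has three elements and starts with the value 10 (octal 012), because the octal escape absorbs up to three digits.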

Microsoft Specific

The maximum length of a string literal is approximately 2,048 bytes. This limit applies to strings of type char[] and wchar_t[]. If a string literal consists of parts enclosed in double quotation marks, the preprocessor concatenates the parts into a single string, and for each line concatenated, it adds an extra byte to the total number of bytes.

For example, suppose a string consists of 40 lines with 50 characters per line (2,000 characters) and one line with 7 characters, and each line is surrounded by double quotation marks. This adds up to 2,007 bytes plus one byte for the terminating null character, for a total of 2,008 bytes. On concatenation, an extra character is added to the total for each of the first 40 lines, which makes a total of 2,048 bytes. (The extra characters are not actually written to the string.) Note, however, that if line continuations (\) are used instead of double quotation marks, the preprocessor does not add an extra character for each line.

END Microsoft Specific

Determine the size of a string object by counting the number of characters and adding 1 for the terminating '\0', or 2 for type wchar_t.

Because the double quotation mark (") encloses strings, use the escape sequence (\") to represent an enclosed double quotation mark. The single quotation mark (') can be represented without an escape sequence. The backslash character (\) is a line-continuation character when placed at the end of a line. If you want a backslash character to appear within a string, you must type two backslashes (\\). (For more information about line continuation, see "Phases of Translation" in the Preprocessor Reference.)

To specify a string of type wide-character (wchar_t[]), precede the opening double quotation mark with the character L. For example:

wchar_t wszStr[] = L"1a1g";

All normal escape codes listed in "Character Constants" are valid in string constants. For example:

cout << "First line\nSecond line";

cout << "Error! Take corrective action\a";

Because an escape code terminates at the first character that is not a hexadecimal digit, specifying string constants with embedded hexadecimal escape codes can cause unexpected results. The following example is intended to create a string literal containing ASCII 5, followed by the characters five:

"\x05five"

The actual result is a hexadecimal 5F, which is the ASCII code for an underscore, followed by the characters ive.
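This behavior can be confirmed with a minimal sketch. The names greedy and split and the helper functions are hypothetical, the expected contents assume ASCII, and splitting the literal into two adjacent strings is an alternative shown here for illustration rather than something the original text proposes.

```cpp
#include <cstddef>
#include <cstring>

const char greedy[] = "\x05five";     // \x05f is consumed as one escape, then "ive"
const char split[]  = "\x05" "five";  // escape ends at the closing quotation mark

// True when the first literal really is an underscore followed by "ive".
bool greedyIsUnderscoreIve()
{
    return std::strcmp(greedy, "_ive") == 0;
}

std::size_t greedySize() { return sizeof(greedy); }  // '_', 'i', 'v', 'e', '\0'
std::size_t splitSize()  { return sizeof(split); }   // 5, 'f', 'i', 'v', 'e', '\0'
int splitFirst()         { return split[0]; }
```

The first array has 5 elements and the second has 6, with the value 5 in its first element.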

The following example produces the desired result:

"\005five"  // Use an octal constant.

