Original author: David A. Rusling. Translation: Banyan & Fifa (2001-04-27 13:54:15)
Chapter II Software Basics

A program is a set of computer instructions that performs a particular task. A program can be written in a variety of languages, from assembly language, a very low-level computer language, to a high-level, machine-independent language such as the C programming language. An operating system is a special program that allows users to run applications such as spreadsheets or word processors. This chapter introduces the basic principles of programming and gives an overview of the aims and functions of an operating system.

2.1 Computer Programming Languages

2.1.1 Assembly Language

The instructions that a CPU fetches from main memory and executes are not understandable to humans. They are machine codes that tell the computer precisely what to do. For example, the hexadecimal number 0x89E5 is an Intel 80486 instruction that copies the contents of the ESP register into the EBP register. One of the first tools invented for the earliest computers was the assembler, a program that takes a human-readable source file and assembles it into machine code. Assembly language handles registers and operations on data explicitly, and it is specific to a particular processor. For example, the assembly language of the Intel X86 microprocessors is very different from that of the Alpha AXP microprocessor. The following is a fragment of Alpha AXP assembly code:

    LDR R16, (R15)     ; line 1
    LDR R17, 4(R15)    ; line 2
    BEQ R16, R17, 100  ; line 3
    STR R17, (R15)     ; line 4
100:                   ; line 5

The statement on line 1 loads register 16 with the value at the address held in register 15. The next statement loads register 17 with the contents of the adjacent memory location. Line 3 compares the values in register 16 and register 17 and, if they are equal, branches to label 100. If they are not equal, execution continues with line 4, which stores the contents of register 17 into memory. If the registers are equal, nothing needs to be saved. Assembly-level programs are generally lengthy and difficult to write, and it is easy to make mistakes in them.
Only a small part of the Linux kernel is written in assembly language, and those parts are written that way either for efficiency or because they are specific to a particular microprocessor.

2.1.2 The C Programming Language and Compiler

Writing large programs in assembly language is difficult and time consuming. It is also error prone, and the resulting program is not portable: it can only run on one particular processor family. It is far better to use a machine-independent language such as C. C allows you to describe a program in terms of its logical algorithms and the data they operate on. A tool called a compiler converts the C program into assembly language and eventually generates machine-specific code. A good compiler can generate code that is very nearly as efficient as a hand-written assembly language program. Most of the Linux kernel is written in C. The following C fragment:

    if (x != y)
        x = y;

performs the same task as the assembly language example above. If the value of the variable x differs from that of y, the contents of y are copied to x. C code is organized into routines, each of which performs a task. A routine can return a value of any data type supported by C. A large program such as the Linux kernel is made up of many separate C source code modules, each with its own routines and data structures. These C source code modules group related functions together, for example the code that handles files.

C supports many types of variables. A variable is a memory location referenced by a symbolic name. In the fragment above, x and y refer to locations in memory. The programmer does not care where in memory the variables are placed; that work is done by the linker. Variables hold different kinds of data, such as integers and floating point numbers, and some are pointers.

A pointer is a variable that contains the memory location, or address, of other data. Suppose there is a variable x located at memory address 0x80010000.
A pointer variable px can point to x, in which case the value of px is 0x80010000.

C allows related variables to be combined into a data structure. For example:

    struct {
        int i;
        char b;
    } my_struct;

This is a data structure called my_struct that contains two elements: a 32-bit integer called i and an 8-bit character called b.
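The ideas in this section (routines, variables, pointers and structures) can be tried out with a short sketch. It is only an illustration, not code from the Linux kernel: the helper names assign_if_different and show_address are hypothetical, the structure is given the tag my_struct so that it can be reused, and int32_t is used to make the 32-bit width explicit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Mirrors the structure in the text: a 32-bit integer and an 8-bit character.
 * The tag name and the use of int32_t are assumptions made for this sketch. */
struct my_struct {
    int32_t i;
    char b;
};

/* Hypothetical helper: the "if (x != y) x = y;" fragment as a routine.
 * x is passed by pointer so that the caller's variable is the one changed. */
static void assign_if_different(int *px, int y)
{
    assert(px != NULL);   /* px must hold the address of a real variable */
    if (*px != y)
        *px = y;
}

/* Hypothetical helper: print where a variable lives. The address is chosen
 * by the toolchain and the operating system, not by the programmer. */
static void show_address(const char *name, const void *addr)
{
    printf("%s is stored at %p\n", name, addr);
}
```

A caller would declare `int x = 1; int *px = &x;`, after which `assign_if_different(px, 7)` leaves 7 in x, and `show_address("x", &x)` prints whatever address the system happened to pick.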
2.1.3 Linkers

A linker is a program that links several object modules and libraries together to form a single program. An object module is the machine code output of an assembler or compiler. It contains executable code and data, together with the information that allows the modules to be combined into a program. For example, one module might contain all of a program's database functions and another its command line argument handling. The linker fixes up the references between object modules, so that a routine or data item referenced in one module is resolved to the place where it actually exists in another module. The Linux kernel is a single, large program linked together from many object modules.

2.2 Operating System Concepts

Without software, a computer is just a pile of electronics that gives off heat. If the hardware is the heart of the computer, then the software is its soul. An operating system is a collection of system programs that allow the user to run applications. The operating system abstracts the system hardware and presents the system's users with a virtual machine. Most PCs can run one or more operating systems, each of which has a different look and feel. Linux consists of many functionally separate pieces. One of them is the Linux kernel, but even the kernel would be useless without library functions and a shell.

To begin to understand what an operating system is, consider what happens in the system when you type a simple command:

    $ ls
    Mail    c    images    perl
    docs    tcl
    $

The $ symbol is a prompt printed by the login shell (in this case bash). It indicates that the shell is waiting for the user to type a command. When you type ls, the keyboard driver first recognizes the characters that have been typed. The keyboard driver then passes them to the shell, which looks for an executable image with the same name (ls). It finds that image at /bin/ls. Kernel services are then called to read the ls executable image into virtual memory and start executing it.
The ls image calls the kernel's file subsystem to find out which files are available. The file system either uses cached file system information or calls the disk device driver to read the information from disk. ls may even cause a network driver to exchange information with a remote machine in order to learn about remote files that this system can access (file systems can be mounted remotely via the Network File System, or NFS). Once the information has been obtained, ls writes it out and the video driver displays it on the screen.

All of this sounds very complicated. What it shows is that even a very simple command reveals the operating system to be a set of cooperating functions that together give the user a consistent view of the system.

2.2.1 Memory Management

If resources such as memory were unlimited, much of what an operating system has to do would be redundant. One of the basic tricks of an operating system is to make a small amount of physical memory behave like a much larger amount of memory. This apparently large memory is known as virtual memory. The software running in the system is fooled into believing that a large amount of memory is available. The system divides memory into easily handled pages and swaps these pages out to the hard disk as the system runs. The software does not notice the real size of memory in the system because of another trick: multi-processing.

2.2.2 Processes

A process can be thought of as a program in execution; each process is a separate entity running a particular program. If you look at the processes on your Linux system, you will find far more of them than you might expect.
For example, running ps on my system gives the following:

    $ ps
      PID TTY STAT  TIME COMMAND
      158 pRe 1     0:00 -bash
      174 pRe 1     0:00 sh /usr/X11R6/bin/startx
      175 pRe 1     0:00 xinit /usr/X11R6/lib/X11/xinit/xinitrc --
      178 pRe 1 N   0:00 bowman
      182 pRe 1 N   0:01 rxvt -geometry 120x35 -fg white -bg black
      184 pRe 1 <   0:00 xclock -bg grey -geometry -1500-1500 -padding 0
      185 pRe 1 <   0:00 xload -bg grey -geometry -0-0 -label xload
      187 pp6 1     9:26 /bin/bash
      202 pRe 1 N   0:00 rxvt -geometry 120x35 -fg white -bg black
      203 ppc 2     0:00 /bin/bash
     1796 pRe 1 N   0:00 rxvt -geometry 120x35 -fg white -bg black
     1797 v06 1     0:00 /bin/bash
     3056 pp6 3 <   0:02 emacs intro/introduction.tex
     3270 pp6 3     0:00 ps
    $

If the system had many CPUs, each process could run on a different CPU. Unfortunately, most systems have only one CPU, so the operating system again resorts to a trick: it runs each process in turn for a short period, which creates the illusion that they are all running at the same time. This period is called a time slice. The same trick fools each process into believing that it is the only one running. Processes are protected from one another so that a process that crashes or malfunctions does not affect the others. The operating system achieves this by giving each process a separate address space.

2.2.3 Device Drivers

Device drivers make up the major part of the Linux kernel. Like other parts of the operating system, they run in a highly privileged environment, and an error in one of them can have catastrophic consequences. A device driver controls the interaction between the operating system and the hardware device it manages. For example, the file system uses a general block device interface when it writes data blocks, and the device driver takes care of all the device-specific details. A device driver is specific to a particular controller chip: if the system contains an NCR810 SCSI controller, it needs the NCR810 SCSI driver.
2.2.4 File Systems

In Linux, as in UNIX, the separate file systems in the system are not accessed by device identifiers. Instead they are combined into a single hierarchical tree structure. When Linux adds a new file system to the system, it mounts it on a directory, such as /mnt/cdrom. An important feature of Linux is its support for many different file systems. This makes it very flexible and allows it to coexist with other operating systems. The most common file system for Linux is EXT2, which is supported by most Linux distributions.

A file system gives the user a sensible view of the files and directories on the system's hard disks, regardless of the file system type and the characteristics of the underlying physical device. Linux transparently supports multiple file systems and integrates all currently mounted files and file systems into one virtual file system. As a result, users and processes generally do not need to know which kind of file system a file belongs to; they just use it.

The block device drivers hide the differences between physical block device types (such as IDE and SCSI). As far as each file system is concerned, a physical device is just a linear collection of data blocks. Block sizes vary between devices, for example from 512 bytes on floppy devices to 1024 bytes on IDE disks. These differences are hidden from the users of the system. An EXT2 file system looks the same no matter what type of device holds it.
2.3 Kernel Data Structures

The operating system must keep a great deal of information about the current state of the system. As the system changes, these data structures must be updated to reflect the new state. For example, a new process may be created when a user logs in to the system. The kernel must create a data structure representing the new process and link it to the data structures of the other processes in the system.

Most of these data structures live in physical memory and can only be accessed by the kernel and its subsystems. Data structures contain data and pointers, that is, the addresses of other data structures or of routines. Taken together, the data structures of the Linux kernel can look very confusing. However, every data structure has a specific purpose, even though some are used by several kernel subsystems.

Understanding the Linux kernel depends on understanding its data structures and how the various functions in the kernel use them. This book bases its description of the Linux kernel on its data structures: it discusses each kernel subsystem in terms of its algorithms, the way it gets its tasks done, and its use of the kernel data structures.

2.3.1 Linked Lists

Linux uses a number of software engineering techniques to link its data structures together. In many cases it uses linked, or chained, data structures. Each data structure describes one instance of something, such as a process or a network device, and the kernel must be able to find all of these instances. In a linked list, a root pointer holds the address of the first structure, and each structure contains a pointer to the next structure in the list. The next pointer of the last element is 0 or NULL to show that it is the end of the list. In a doubly linked list, each structure contains a pointer to both the previous and the next structure in the list. A doubly linked list makes it easier to add and remove elements in the middle of the list, but it requires more memory accesses. This is a typical operating system trade-off: memory accesses versus CPU cycles.
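The doubly linked list described above can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the list implementation used by the Linux kernel: the type struct node and the routines list_push_front, list_remove and list_length are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element type for illustration; not a kernel structure. */
struct node {
    int value;
    struct node *prev;  /* previous element, NULL at the head */
    struct node *next;  /* next element, NULL at the tail */
};

/* Insert n at the front of the list and return the new head (root pointer). */
static struct node *list_push_front(struct node *head, struct node *n)
{
    assert(n != NULL);
    n->prev = NULL;
    n->next = head;
    if (head != NULL)
        head->prev = n;
    return n;
}

/* Unlink n without walking the list; returns the (possibly new) head.
 * This is where the prev pointer pays off: no search for the predecessor. */
static struct node *list_remove(struct node *head, struct node *n)
{
    assert(n != NULL);
    if (n->prev != NULL)
        n->prev->next = n->next;
    else
        head = n->next;          /* n was the first element */
    if (n->next != NULL)
        n->next->prev = n->prev;
    n->prev = n->next = NULL;
    return head;
}

/* Walk from the head to the NULL terminator, counting elements. */
static size_t list_length(const struct node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}
```

Removing an element from the middle touches up to four pointers (two in the neighbours, two in the element itself), which is the extra memory traffic the text mentions. A singly linked list would write fewer pointers but would first have to walk the list to find the predecessor.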
2.3.2 Hash Tables

Linked lists are a handy way to tie data structures together, but operating on a linked list can be inefficient. To search for a particular element, we may have to traverse the entire list. Linux uses another technique, hash tables, to improve efficiency. A hash table is an array, or vector, of pointers; an array is simply a set of items stored one after another in memory. Each pointer in the hash table points to a separate list, and the index into the table is derived from the data being stored. If you had data structures describing the people in a village, you could use age as an index. To find someone's data, you would use their age as an index into the population hash table and then follow the pointer to the data structure containing that person's details. However, many people in the village are likely to share the same age, so each hash table pointer becomes a pointer to a linked list of the people with that age. Searching one of these short lists is still much faster than searching through all of the data structures.

Because a hash table speeds up access to data structures, Linux often uses hash tables to implement caches. A cache holds frequently accessed information and is usually a subset of all the information available. Data structures that the kernel uses often are placed in a cache. The disadvantage of caches is that they are more complicated to use and maintain than a simple linked list or hash table. If the data structure being looked for is found in the cache (this is called a cache hit), all is well. If it is not, it must be searched for elsewhere and then added to the cache. If the cache is already full, Linux must decide which structure to discard, and there is a risk that the discarded data is exactly what Linux needs next.

2.3.3 Abstract Interfaces

The Linux kernel often abstracts its interfaces. An interface is a set of routines and data structures that operate in a particular way. For example, all network device drivers must provide certain routines that operate on particular data structures.
In this way, generic layers of code can use the services of lower-level, device-specific code. For example, the network layer is generic, and it is supported by device-specific code that conforms to a standard interface. These lower layers usually register themselves with the higher layer when the system starts. Registration typically involves adding a data structure to a linked list. For example, each file system built into the kernel registers itself with the kernel at boot time. The file /proc/filesystems shows which file systems have registered with the kernel.
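The registration pattern described above can be sketched as follows. This is a simplified illustration, not the kernel's actual file system registration code: struct fs_type, register_fs, find_fs and demo_read_super are hypothetical names, and the function pointer member is an assumption standing in for whatever routines a real interface would require.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interface: each lower layer supplies a name and a routine. */
struct fs_type {
    const char *name;
    int (*read_super)(void);   /* illustrative operation provided by the lower layer */
    struct fs_type *next;      /* link in the registry list */
};

static struct fs_type *registered = NULL;   /* root pointer of the registry */

/* Generic code: look up a registered file system by name, NULL if absent. */
static struct fs_type *find_fs(const char *name)
{
    for (struct fs_type *fs = registered; fs != NULL; fs = fs->next)
        if (strcmp(fs->name, name) == 0)
            return fs;
    return NULL;
}

/* Registration adds the structure to the linked list.
 * Returns 0 on success, -1 if the name is already registered. */
static int register_fs(struct fs_type *fs)
{
    assert(fs != NULL && fs->name != NULL);
    if (find_fs(fs->name) != NULL)
        return -1;
    fs->next = registered;
    registered = fs;
    return 0;
}

/* Stand-in for a file-system-specific routine; always reports success. */
static int demo_read_super(void)
{
    return 0;
}
```

After `register_fs(&ext2)` for some `struct fs_type ext2 = { "ext2", demo_read_super, NULL }`, the generic layer can call `find_fs("ext2")->read_super()` without knowing anything about the code behind the pointer, which is the point of abstracting the interface.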