Data Structures and Algorithms

zhaozj, 2021-02-16

Note: A small part of the article below comes from a post on my former school's BBS. I did not keep a record of the source at the time, so I cannot name the original author of that content; I would like to express my thanks here.

(1) What is a data structure?

Data structure is a term used widely throughout computer science and technology. It describes the internal composition of a piece of data: which component data it consists of, how those components are put together, and what structure results. Data structures are divided into logical data structures and physical data structures. The logical data structure reflects the logical relationships among the component data, while the physical data structure reflects how the component data are arranged in the computer's storage. A data structure is the form in which data exists. It is a way of organizing information whose purpose is to make algorithms more efficient. It usually corresponds to a set of algorithms, and that set of algorithms operates on the data held in the structure.

What does the study of data structures mainly cover? As a subject, it studies the logical structures and storage structures of data, together with the operations performed on data. It therefore has three main aspects: the logical structure of the data, the physical storage structure of the data, and the operations (or algorithms) on the data. Typically, the design of an algorithm depends on the logical structure of the data, and the implementation of the algorithm depends on its physical storage structure.

What is a data structure, and what are logical structure and physical structure? Data is a collection of elements built from a finite set of symbols (such as "0" and "1"), with its own structure, operations, and corresponding semantics. Structure is the collection of relationships among the elements.
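The difference between logical and physical structure can be seen in a few lines of C++. This is only a minimal sketch: the helper names (logical_order, make_sequential, make_linked, is_contiguous) are made up for the illustration. The same logical sequence 10, 20, 30 is stored once in adjacent memory cells (std::vector) and once as linked nodes (std::forward_list); reading either one front to back yields the same logical order.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <vector>

// Hypothetical helper: reads any sequence front to back and returns the
// elements in their logical order, regardless of how they are stored.
template <typename Sequence>
std::vector<int> logical_order(const Sequence& seq) {
    std::vector<int> out;
    for (int value : seq) {
        out.push_back(value);
    }
    return out;
}

// Sequential storage: the elements sit in adjacent memory cells.
inline std::vector<int> make_sequential() {
    return {10, 20, 30};
}

// Linked storage: each node holds a value and a pointer to its successor.
inline std::forward_list<int> make_linked() {
    return {10, 20, 30};
}

// Checks that the first and last element of a vector are exactly
// size() - 1 cells apart, i.e. the storage is one consecutive block.
inline bool is_contiguous(const std::vector<int>& v) {
    assert(!v.empty());  // the check needs at least one element
    return &v.back() - &v.front() ==
           static_cast<std::ptrdiff_t>(v.size()) - 1;
}
```

Comparing logical_order(make_sequential()) with logical_order(make_linked()) gives equal results, although the two containers arrange their elements in memory quite differently.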
Generally, a data structure DS can be represented as a pair: DS = (D, S), i.e. Data-Structure = (Data-Part, Logic-Structure-Part). D is a collection of data elements (also called "nodes"; an element may in turn contain "data items" or "data fields"), and S is a collection of relations defined on D (or on other collections), S = { R | R : D × D × ... }, called the logical structure of the elements. There are four basic types of logical structure: set, linear, tree, and network (graph) structures. Lists and trees are the most commonly used efficient data structures, and many efficient algorithms can be designed with these two. A list is a linear structure (total order); trees (partial order, or hierarchical relationship) and graphs (weak / local order) are non-linear structures.

The physical structure of a data structure is the storage image of its logical structure. The physical structure P of a data structure DS is a mapping from the data elements of DS to a storage area M that preserves the logical structure S: P : (D, S) -> M.

Memory model: a memory M is a series of fixed-size storage units. Each unit U has a unique address A(U), and addresses are numbered consecutively. Each unit U has a unique successor unit U' = succ(U).

There are four basic mapping models for P: sequential, linked, indexed, and hashing. Combined with the four logical structures, this gives at least 4 × 4 possible physical data structures (not every combination is reasonable):

    mapping:            sequential   linked   indexed   hash
    logical structure:  sets         lists    trees     graphs

Operations on a data structure: every operation defined on DS must preserve the logical and physical structure of DS when it changes data elements (nodes) or their contents. Basic operations on DS: any other, higher-level operation on DS can be implemented in terms of these basic operations. It is best to treat DS and all of its basic operations as a single whole, called a module.
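As a rough sketch of what such a module might look like in C++ (the class name IntStack and the function reverse_in_place are hypothetical, chosen only for this example): the storage is hidden, a handful of basic operations is exposed, and a higher-level operation is written purely in terms of those basic operations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A stack treated as one module: the storage is private and only the
// basic operations are visible to callers.
class IntStack {
public:
    void push(int value) { items_.push_back(value); }

    // Removes and returns the most recently pushed value.
    int pop() {
        assert(!items_.empty());  // popping an empty stack is a caller error
        int top = items_.back();
        items_.pop_back();
        return top;
    }

    // Returns the most recently pushed value without removing it.
    int top() const {
        assert(!items_.empty());
        return items_.back();
    }

    bool empty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<int> items_;  // physical structure: sequential storage
};

// A higher-level operation built only from the basic operations above:
// push everything, then pop it back in reverse order.
inline void reverse_in_place(std::vector<int>& values) {
    IntStack stack;
    for (int v : values) {
        stack.push(v);
    }
    for (int& v : values) {
        v = stack.pop();
    }
}
```

Because callers never touch items_ directly, the sequential storage could later be swapped for a linked one without changing reverse_in_place.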

We can further abstract the module into a data type (the storage structure of the DS becomes private members, and the basic operations become public methods), called an ADT (abstract data type). As ADTs, stacks and queues are special lists: they offer a subset of the list's operations. Higher-level operations on an ADT can be designed as separate (unencapsulated) algorithms that process the DS through its basic operations.

Good and bad DS: if a DS can be converted into a linear DS (for example, a linear list) by some "linearization rule", it is called a good DS. A good DS usually corresponds to a good (efficient) algorithm. This follows from how computers work: a computer can only access logically consecutive memory cells, so any structure that is not linear has to be linearized logically before it can be processed. For example, to operate on a graph and visit all of its nodes, you must visit the nodes one by one in some order, so the graph must be converted into a linear structure in some way before it can be processed. A tree is a good DS: it has a very simple and efficient linearization rule, so many very efficient algorithms can be built on trees. Trees are simple to implement and use, yet they solve a large number of complex problems, which makes the tree the most important and most useful data structure in practical programming. A tree is recursive in nature: each leaf node can be replaced by a subtree, and vice versa. In fact, every recursive structure can be converted into (or is equivalent to) a tree structure.

Abstraction from machine language to high-level languages

We know that an algorithm is defined as a sequence of operations. All operations in this sequence are defined on a specific type of data model and serve to solve a specific class of problems. Such a sequence of operations should have the following four characteristics.
Finiteness: the number of items in the sequence is finite, and each operation can be completed in finite time. Determinism: each operation in the sequence has a clear definition, with no ambiguity. Output: the sequence may have no input operations, but it must have output operations. Feasibility: for any given legal input, the corresponding correct output can be obtained. These characteristics can be used to decide whether a given sequence of operations deserves to be called an algorithm. Our concern here, however, is not that decision. It is to take a sequence that already counts as an algorithm and review how we have expressed it in programming languages.

Expressing an algorithm as a program comes down to expressing the elements of the algorithm, because once each element has been expressed clearly in a program, expressing the whole algorithm is no longer a problem. As a sequence of operations, an algorithm has three elements: the data that the operations act on and produce; the operations themselves; and the transfer of control among the operations. These three elements are called data, operations, and control for short.

Because algorithms are endlessly varied, the data they act on and the results they produce are of many kinds. The simplest and most basic are Boolean, character, integer, and real data; slightly more complex are matrices, records, and similar data; more complex still are sets, trees, and graphs, as well as sound, graphics, and image data. For the same reason, the kinds of operations are richly varied. The most basic and elementary are assignment, arithmetic operations, logical operations, and relational operations; slightly more complex are arithmetic expressions and logical expressions; more complex still are function evaluation, vector operations, matrix operations, set operations, and so on. In addition, the operations above may be composed and nested. Control transfer is relatively simple. In serial computation there are only a few kinds: sequence, branch, loop, recursion, and unconditional transfer.

Let us review how, since the computer was introduced, the expression of these three elements in programs has evolved. The earliest programming language was machine language, that is, the instruction set of a specific computer.

At that time, every algorithm to be run on a computer had to be expressed directly in machine language before the computer could accept it. The algorithm's sequence of operations, including the operands and the results, had to be converted into a sequence of instructions. Each instruction appeared in encoded form (an operation code and an address code). This is a very long way from how the same algorithm reads in algorithm language. To anyone without training in programming, such a program is an unreadable "book from heaven" ("Tianshu"), and its readability is extremely poor.

Expressing the operations, data, and control of an algorithm in machine language is very laborious, because the instructions it provides are so primitive. Machine language accepts only arithmetic operations, bitwise logical operations, and numeric comparisons. Any slightly more complex operation must be decomposed step by step until it can be replaced by primitive instructions. The only data that can be expressed directly are the most primitive kinds: bits, bytes, and words. Even the simplest data in an algorithm, such as Booleans, characters, integers, and reals, must be mapped one by one onto bits, bytes, and words, and their storage units must be allocated one by one. Expressing structured data is far more troublesome still. The control-transfer instructions that machine language provides are likewise limited to the most basic kinds: unconditional jump, conditional jump, subroutine entry, and subroutine return. Building loops, branches, and function or procedure calls out of them takes a lot of preparation and many tricks.

Expressing algorithms directly in machine language has many disadvantages. The mass of tedious detail ties programmers down, leaving them little time and energy for creative work and for the tasks that matter more to them.
Chief among those tasks is ensuring that the program is correct and efficient. The programmer must keep the overall design of the program in view while also getting every implementation detail right. Even exceptionally capable programmers often attend to one and lose sight of the other, so programs written this way are unreliable and take a long time to develop. Because thinking about and expressing a program in machine language is so far from ordinary human habits, only programmers with long training are up to the task, which keeps programming the preserve of a few. Because the written form consists entirely of cryptic codes, readability is poor and communication and cooperation are inconvenient. Because it depends heavily on the specific computer, portability and reusability are poor. These drawbacks kept computer applications of the time from spreading quickly.

The way to overcome these drawbacks is to raise the level of abstraction of the programming language, bringing it as close as possible to algorithm language. To this end, people first turned to readability and portability, because they are relatively easy to improve by abstraction. Thus assembly language soon appeared. It is obtained from machine language by abstraction. First, each machine instruction is made symbolic: the operation code is replaced by a mnemonic and the address code by a symbolic address, so that the meaning is visible in the symbols rather than hidden in the encoding, and a reader can guess the meaning from the text. Second, the language is freed from the limits of one particular computer and can run on computers with different instruction sets, as long as the computer is equipped with an assembler. This was undoubtedly a step from machine language toward algorithm language.
However, assembly language is still too far from algorithm language: it does not relieve programmers of the work of decomposing an algorithm's data, operations, and control into primitives that the machine can express directly. From the mid-1950s on, high-level programming languages appeared, such as Fortran and Algol 60 and later PL/I and Pascal, and the expression of algorithms as programs took a big leap.

Admittedly, an algorithm must ultimately be expressed in the machine language of a specific computer before it can run there and produce the required results. But the experience of assembly language showed that the conversion to machine language need not be done in a single step; it can be done in two steps, like building a bridge to cross a river. That is, the algorithm is first expressed in an intermediate language, which is then converted into machine language. As an intermediate language, assembly language was not very successful because it is too far from algorithm language.

This led people to design a standardized language close to algorithm language, the so-called high-level language, in which programmers can conveniently express algorithms. The high-level language program is then "translated", according to the language specification, into machine language. Moreover, because both the high-level language and the machine language are formally specified, the "translation" can be fully mechanized and done by the computer, just as assembly language is translated into machine language, provided the computer is equipped with a compiler.

Of these two steps, the first is done by the programmer and the second by the compiler. Once it is clearly specified what each step does, the two are completely independent and do not interfere with each other. The first step only has to express the given algorithm correctly in the high-level language, producing a high-level language program; the second step only has to translate that program into a machine language program. How programmers express algorithms in a high-level language, and how compilers translate high-level language programs into machine language, are clearly unrelated matters.

This way of handling the complex path from algorithm language to machine language is abstraction. The appearance of assembly language and of high-level languages are both examples of it. The great success of high-level languages compared with assembly language lies in their introducing many concepts and tools close to algorithm language for expressing data, operations, and control, which greatly improves the ability to express algorithms abstractly. For operations, high-level languages such as Pascal provide the basic arithmetic operations, logical operations, relational operations, arithmetic expressions, and logical expressions, and in addition provide functions and procedures, powerful tools that users can define themselves.
The importance of this tool (functions and procedures) lies not only in shortening the program text but also in reflecting two levels of abstraction in a program. At the level of calling a function or procedure, people care only about what it does, not how it does it. Only when the function or procedure is defined do people give the details of how. Anyone who has used a high-level language knows that once the name, parameters, and purpose of a function or procedure are clearly specified, it can be called in the program, while its definition is given separately in the declaration part of the program. You can modify or even replace the body of a function or procedure without affecting the calls. If the name of the function or procedure is regarded as the name of an operation, and the parameters as its operands and results, then a call is no different from an elementary operation. Using functions and procedures, and their composition or nesting, one can naturally express any complex operation of algorithm language.

For data, high-level languages such as Pascal introduced the concept of the data type, that is, all data are classified. Each datum (including each expression) and each variable belongs to one class, and this class is called its data type. A data type is therefore a description of a class of data or variables that indicates all the values the data or variable may take. For unstructured data, Pascal provides the standard basic types Boolean, character, integer, and real, and also lets users define enumeration types, subrange types, and pointer types. The use of these types (except pointers) follows the habits of algorithm language. For structured data, Pascal provides four standard structured types: arrays, records, restricted sets, and files.
Of these, the array is an abstraction of the vectors and matrices of scientific computing; the record is an abstraction of the records used in business and management; the restricted set is an abstraction of the power set of a sufficiently small set in mathematics; and the file is an abstraction of external storage such as disks. Using the basic data types provided (both standard and user-defined) together with the array, record, restricted-set, and file constructors, one can build structured data. Users may also take structured types as building blocks and, by composition or nesting, construct more complex and higher-level structured data. This gives the data types of a high-level language a clear hierarchy. There is no limit on the number of levels, so they can express data of any complexity found in algorithm language.
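The article's examples are Pascal, but the same idea of composing types can be sketched in C++ (the language used for illustration in this text because the article later discusses C++ templates). The names Weekday, Reading, Week, DaySet, and warm_days are hypothetical. The sketch nests an enumeration inside a record, records inside an array, and uses a small bit set as a rough stand-in for Pascal's restricted set.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>

// Enumeration type: a user-defined unstructured type.
enum class Weekday { Mon, Tue, Wed, Thu, Fri, Sat, Sun };

// Record: named fields that may have different types.
struct Reading {
    Weekday day;
    double celsius;
};

// Array of records: a fixed number of components of one base type.
using Week = std::array<Reading, 7>;

// Restricted set over a small base type: one bit per weekday.
using DaySet = std::bitset<7>;

// Returns the set of days whose temperature is above the threshold.
inline DaySet warm_days(const Week& week, double threshold) {
    DaySet result;
    for (const Reading& r : week) {
        if (r.celsius > threshold) {
            // Enumerator values map to bit positions 0..6.
            std::size_t bit = static_cast<std::size_t>(r.day);
            assert(bit < result.size());
            result.set(bit);
        }
    }
    return result;
}
```

Each level here is built from the one below it, which is the hierarchy of data types the paragraph above describes.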

For control, high-level languages such as Pascal provide six ways of expressing the control transfer of an algorithm. (1) Default sequential control: ";". (2) Conditional (branch) control: "if expression (is true) then S1 else S2;". (3) Selection (case) control: "case expression of value1: S1; value2: S2; ...; valueN: Sn end". (4) Loop control: "while expression (is true) do S;" or "repeat S until expression (is true);" or "for variable := initial value to/downto final value do S;". (5) Calls to functions and procedures, including recursive functions and recursive procedures. (6) Unconditional transfer: goto. These six forms not only cover all the control expressions of algorithm language, they are also no longer as cumbersome and tricky as in machine language or assembly language. As seen above, they differ little from natural language.

The abstraction of programming languages from machine language to high-level language brings several benefits. High-level languages give programmers an environment and tools for structured programming, so the programs designed are readable, maintainable, and reliable. High-level languages are far from machine language and depend little on specific computer hardware, so the programs written are portable and highly reusable. Because the tedious chores are handed over to the compiler, automation is high and the development cycle is short; programmers are relieved and can concentrate their time and energy on the more important creative work of improving the quality of their programs.

Data structures, data types, and abstract data types

Data structure, data type, and abstract data type: these three terms look alike yet differ, which reflects that they are both distinct and related. Data structure is a term widely used throughout computer science and technology. It describes the internal composition of a piece of data: which component data it is made of, how they are put together, and what structure results.
Data structures are divided into logical data structures and physical data structures. The logical data structure reflects the logical relationships among the component data, and the physical data structure reflects how the component data are arranged in the computer's storage. A data structure is the form in which data exists. Data are classified by data structure: data with the same data structure belong to the same class. The whole of one class of data is called a data type. In high-level programming languages, the data type states which class a datum belongs to. It is an attribute of the data, and it bounds the range of values the datum may take. To meet the needs of problem solving, high-level languages define a series of data types according to the kinds of data structure. The data types defined differ from language to language. Pascal, for example, defines simple types, structured types, and pointer types. Simple data types correspond to simple data structures. Structured data types correspond to complex data structures; in a complex data structure the components may themselves have complex structure, so structured data types may be composed and nested. Pointer types correspond to the relationships among the components of a data structure. On the surface a pointer is a simple type, but it actually points to component data of complex, structured type, so it is classed neither with the simple types nor with the structured types and is placed in a class of its own.

A data structure reflects how the components inside a datum are put together, and it is often drawn as a structure diagram: each component is drawn as a node, shown as a box or circle, and the relationships among components are drawn as arrows between the corresponding nodes. If a component itself has structure, the structure is nested.
Nesting here may also be recursive. The introduction of pointer data makes it possible to construct all kinds of complex data structures.
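A small sketch of how pointers allow recursive nesting: each node of a binary tree points to two subtrees of the same type. The names Node, bst_insert, and in_order are hypothetical, and the binary-search-tree ordering is an assumption made only to have something concrete to store. The in-order traversal at the end is one simple rule that flattens the tree into a linear sequence, in the spirit of the linearization rule for trees mentioned earlier.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// A binary tree node. The child pointers refer to nodes of the same
// type, which is what makes the structure recursively nested.
struct Node {
    int key;
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
    explicit Node(int k) : key(k) {}
};

// Inserts key into the binary search tree rooted at root.
// Keys already present are ignored.
inline void bst_insert(std::unique_ptr<Node>& root, int key) {
    if (!root) {
        root.reset(new Node(key));
    } else if (key < root->key) {
        bst_insert(root->left, key);
    } else if (key > root->key) {
        bst_insert(root->right, key);
    }
    assert(root != nullptr);  // the tree is never empty after an insert
}

// In-order traversal: left subtree, then the node, then right subtree.
// Appends the keys to out, turning the tree into a linear sequence.
inline void in_order(const Node* node, std::vector<int>& out) {
    if (node == nullptr) {
        return;
    }
    in_order(node->left.get(), out);
    out.push_back(node->key);
    in_order(node->right.get(), out);
}
```

Inserting 5, 2, 8, 1, 9 and then calling in_order on the root yields the keys in ascending order, because of the ordering rule assumed in bst_insert.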

According to the relationships among their components, data structures are divided into linear and nonlinear; nonlinear data structures are further divided into hierarchical and network structures. Since data types are divided according to data structure, each class of data structure corresponds to a data type. Data types are accordingly divided into linear and nonlinear, and into hierarchical and network types. The type declared for a variable in a high-level language must be the data type that corresponds to the data structure of the data the variable holds.

The most commonly used data structures are the array structure and the record structure. The array structure has these characteristics: the number of components is fixed, and the logical relationship among them is expressed by their sequence numbers (the array subscripts); the components are arranged in order of sequence number. Every component has the same structure (which may be simple or complex) and therefore the same data type (correspondingly a simple or a structured data type). This common data type is called the base type. All components are stored in order in consecutive storage units. In short, the array structure is linear and uniform, and its components can be randomly accessed. Because of these good properties it is the structure used most often. In high-level languages the corresponding data type is the array type: a variable with array structure must be declared as array [I] of T0, where I is the subscript type of the array and T0 is its base type.

The record structure is another commonly used data structure. Like the array structure, it has a fixed number of components. However, there is no natural order among the components; they stand on an equal footing.
Each component is called a field and is given a field name. Different fields have different field names. Different fields may have different structures and therefore different data types. As with the array structure, the components can be randomly accessed, but the route of access is the field name. The data type corresponding to the record structure in high-level languages is the record type, and a variable holding data with record structure must be declared as a record type.

The meaning of abstract data type has been described earlier. It can be understood as a further abstraction of the data type: it bundles a data type together with the operations on that data type and encapsulates them. The purpose of introducing abstract data types is to separate the representation of a data type and the implementation of its operations from the places in the program where the data type and its operations are used, so that the two are independent of each other. A description of an abstract data type must describe not only the data structure but also the operations (procedures or functions) defined on it. The procedures and functions defined on an abstract data type are based on the data structure of that abstract data type.

(2) Generic design and data structures and algorithms

Next I want to talk about the generic programming model, much promoted lately, in relation to data structures and algorithms. Generic thinking has raised the basic ideas of data structures and algorithms to an unprecedented height. Many programming languages now support generic design, such as Ada and C++, and full support for generic design is said to be coming in the next version of Java and in C#.

Let's start with the basic idea of generic design. Generic programming (abbreviated GP below) is a new programming idea. Compared with OO (object-oriented), OB (object-based), and PO (procedure-oriented), the programming ideas most people already know, GP has a higher degree of abstraction. Components designed with GP are loosely coupled and do not rely on inheritance, so their interoperability and extensibility are very high.

We all know that any algorithm acts on a particular data structure. The simplest example is quicksort. Its most basic precondition is that the objects to be sorted are stored in an array, because quicksort relies on the random-access property of arrays: any two objects, not just two adjacent ones, can be swapped in unit time. If the objects are stored in a linked list, reaching an object takes linear time, O(n), and quicksort loses its speed. In other words, when we design an algorithm we always consider the data structure it is applied to. Searching an array, a linked list, a tree, or a graph is at its core the same thing, searching, but it takes different concrete forms depending on the data structure. This close relationship between data structures and algorithms has always been our understanding.

The fundamental idea of generic design is to separate the algorithm from the data structure it acts on. That is, when designing an algorithm we do not consider which data structure it will act on. The ideal of generic design is a single search algorithm that works on arrays, linked lists, trees, graphs, and so on: a universal, generic algorithm. Isn't that ideal tempting?
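A minimal sketch of that ideal for the simplest case, linear search, using C++ templates. The names find_value, contains, and demo_find are hypothetical; the standard library already offers the same idea as std::find. The function only assumes that its iterators can be compared, advanced with ++, and dereferenced, so the same code serves an array-like container, a linked list, and a std::set (which is typically implemented as a balanced tree).

```cpp
#include <cassert>
#include <list>
#include <set>
#include <vector>

// Generic linear search over the range [first, last).
// Returns an iterator to the first element equal to target,
// or last if no element matches.
template <typename Iterator, typename T>
Iterator find_value(Iterator first, Iterator last, const T& target) {
    for (; first != last; ++first) {
        if (*first == target) {
            return first;
        }
    }
    return last;
}

// Convenience wrapper: true if the container holds target.
template <typename Container, typename T>
bool contains(const Container& c, const T& target) {
    return find_value(c.begin(), c.end(), target) != c.end();
}

// Usage sketch: one function template, three different data structures.
inline void demo_find() {
    const std::vector<int> array_like{4, 5, 6};
    const std::list<int> linked{4, 5, 6};
    const std::set<int> tree_based{4, 5, 6};
    assert(contains(array_like, 5));
    assert(contains(linked, 5));
    assert(contains(tree_based, 5));
    assert(!contains(linked, 7));
}
```

Note that this sketch walks every structure element by element; it shows that one algorithm text can serve several structures, not that it is the fastest search for each of them.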
The degree of abstraction in generic programming is unprecedented. Unlike OO, GP does not require you to call functions through an extra layer of indirection: it lets you write fully generalized, reusable algorithms whose efficiency equals that of algorithms designed for one particular data structure. We know that data structures can be represented by user-defined types in C++, and that C++ templates take types as parameters. So we can imagine using templates to realize the GP idea described above: a function template can work with whatever types are passed to it, and those types can be the data structures we have defined.

A generic algorithm is detached from specific types and specific data structures so that it fits as many types as possible. The algorithm itself only expresses the logic of the algorithm and is not disturbed by the implementation details of the various data structures. This means that a generic algorithm actually has two parts: 1. the actual instructions that describe the steps of the algorithm; 2. a set of requirements that correctly specifies the properties its parameter types must satisfy.

By now many readers may be confused; that does not matter. GP is, after all, a highly abstract programming idea. Its core is that abstract requirements become the center of the programming process, replacing the central role that types play in OO. Because the type is no longer the focus and has itself become an abstract condition, we call this way of programming generic: types are generalized.
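The two parts of a generic algorithm can be made visible in code. The following is a sketch under stated assumptions, not the way any particular library does it: is_random_access and quick_sort are hypothetical names, and the quicksort variant (two passes of std::partition around a middle pivot) is just one simple formulation. The static_assert states the requirement on the parameter type, namely random access, which is the property the quicksort discussion above depends on; the function body holds the instructions.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Part 2, the requirement: true if Iterator supports random access.
template <typename Iterator>
struct is_random_access
    : std::is_base_of<
          std::random_access_iterator_tag,
          typename std::iterator_traits<Iterator>::iterator_category> {};

// Vector iterators and raw pointers meet the requirement; list iterators
// do not, so quick_sort on a std::list fails to compile with a clear message.
static_assert(is_random_access<std::vector<int>::iterator>::value,
              "vector iterators are random access");
static_assert(is_random_access<int*>::value,
              "raw pointers are random access");
static_assert(!is_random_access<std::list<int>::iterator>::value,
              "list iterators are not random access");

// Part 1, the instructions: quicksort over the range [first, last).
template <typename Iterator>
void quick_sort(Iterator first, Iterator last) {
    static_assert(is_random_access<Iterator>::value,
                  "quick_sort requires random-access iterators");
    assert(!(last < first));  // the range must not be reversed
    if (last - first < 2) {
        return;
    }
    // Copy the middle element as the pivot.
    auto pivot = *(first + (last - first) / 2);
    // Elements smaller than the pivot go to the front...
    Iterator middle1 = std::partition(
        first, last,
        [&pivot](const decltype(pivot)& x) { return x < pivot; });
    // ...followed by the elements equal to the pivot.
    Iterator middle2 = std::partition(
        middle1, last,
        [&pivot](const decltype(pivot)& x) { return !(pivot < x); });
    quick_sort(first, middle1);
    quick_sort(middle2, last);
}
```

The element type is only required to support operator<, so the same function sorts a std::vector<int>, a plain int array, or a vector of any other comparable type.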

(3) Personal learning experience

On how to learn data structures, I personally think the right approach is first to understand the nature of data structures, the relationship between data structures and algorithms, and how data structures are applied. Otherwise we easily get lost in the complicated features of the individual data structures without knowing what a data structure essentially is, and study for a long time without understanding anything. Here are some of my personal views on the nature of data structures.

The most important thing in learning data structures is understanding programming methodology and the concepts and implementation of programming languages. int i; int i[]; struct i {}; ADT i {}; what is the difference among these? That is a question of how the programming language implements them. Defining a data type means defining a class of operations. In int i, j; i = j + 1; the operation is implemented by the language itself, that is, you do not need to care how it is carried out. So int is a basic data type. Abstraction built on the basic data types is a data structure. When you define ADT P {}; and P is, say, a list, you have to implement its operations yourself. Every operation that P can perform must be implemented by you, and the foundation you build on is the basic data types: char, int, float, *p, and so on. That is an abstract data type. Once the type definitions are complete, what remains is the algorithm, which controls the flow of the program. Hence: data structures + algorithms = programs.

I also want to say that design ideas, programming languages, and data structures have always been the greatest driving forces in the development of computer software science. Generally speaking, a design idea comes with languages that support it and data types that embody it. Specifically:

machine-oriented programming: machine language, assembly language
procedure-oriented programming: C and similar languages
object-oriented programming: Java, C++

Of course, an idea is only an idea, and you can realize it in different languages. It must be said, however, that only a language at the matching level supports it directly. Take machine language: the language itself is so low-level that even an operation as simple as assigning a value to an object must be handled by the programmer. Nothing at that level is supported by the language. Why is C not an object-oriented language? The key is C's support for ADTs. In C you can use an elaborate ADT in place of the class, which in C++ has become a basic type, but to achieve OO the programmer must implement (define) that ADT personally. So C is not well suited to developing OO software. C++, by contrast, makes the class a basic type and so completes the abstraction at the OO level. Of course, because C++ is essentially a superset of C, it also fully supports procedure-oriented design.

I have said all this in order to discuss the relationship between data structures, programming languages, and design ideas. When we connect the study of data structures with programming languages and design ideas, I think we gain a deeper understanding of data structures themselves. We no longer panic when we meet a new data structure, because we know that every data structure has both common features and special features, and each one is put to use in some field. Seen this way, data structures themselves have an inheritance character: we can represent the whole system of data structures as an inheritance tree in which each data structure is a child node.

Finally, here I recommend several books on data structures that I find excellent:

When reposting, please cite the original address: https://www.9cbs.com/read-22748.html
