For beginners, many assembly instructions are too complicated. People often study for a long time and still cannot write a decent program, which dampens their interest in assembly, and many give up. So my personal view is that learning assembly does not necessarily mean writing programs; writing programs is really not assembly's strong point. You may as well play with DEBUG: sometimes cracking a small piece of software brings more sense of achievement than finishing a program (just as people learning computers start by playing games). Some of the more obscure instructions are really only for experienced assembly programmers and are beyond us for now. To make assembly language approachable, first set aside the fancy, complicated instructions and concentrate on the most important ones (CMP, LOOP, MOV, JNZ ...). Doing that with the long-winded textbooks is easier said than done, so I have put together this super-concentrated tutorial (as if compressed with WinZip, WinRAR ... one after another, heh!). I am not ashamed to boast: once you have worked through this text, you can sit among your seniors or juniors with an air of indifference and feel very accomplished. Try it! So, what next? Here we go! (If something is unclear as you read, don't worry: it will be broken down further below.)
Because assembly talks to the hardware through the CPU and memory, we must first understand the CPU and memory. (Number systems are not covered here.)
The CPU is a chip that performs all of the computer's arithmetic and logical operations and the basic I/O control functions. An assembly language can only be used with a specific CPU; in other words, different CPUs have different assembly-language syntax. The personal computer was launched in 1981, and its CPU line developed as 8086 → 80286 → 80386 → 80486 → Pentium → ..., alongside compatible chips from AMD, Cyrix and others. Each later CPU is compatible with the functions of the earlier ones and only adds instructions (such as the Pentium's MMX instruction set), widens registers (such as EAX on the 386) and adds registers (such as FS, which first appeared on the 386). To make sure an assembly program runs on every model, it is recommended to use 8086 assembly language, which has the best compatibility. This article uses 8086 assembly language throughout.

A register is a component inside the CPU, so transferring data between registers is very fast. Uses: 1. Arithmetic and logical operations can be performed on data held in registers. 2. An address stored in a register can point to a location in memory, that is, addressing. 3. Registers can be used to read data from and write data to the computer's peripheral devices.

The 8086 has eight 8-bit data registers, which pair up into 16-bit registers:
    AH & AL = AX: accumulator register, often used for arithmetic;
    BH & BL = BX: base register, often used for address indexing;
    CH & CL = CX: count register, often used for counting;
    DH & DL = DX: data register, often used for passing data.

In order to use the whole memory space, the 8086 has four segment registers, dedicated to holding segment addresses:
    CS (Code Segment): code segment register;
    DS (Data Segment): data segment register;
    SS (Stack Segment): stack segment register;
    ES (Extra Segment): extra segment register.
When a program is to be executed, it must be decided where in memory the program code, the data and the stack are to be placed, and the segment registers CS, DS and SS are set to point to these starting positions.
Usually DS is kept fixed and CS is changed as needed, so a program can be written even when the space that can be addressed directly is less than 64K. The combined size of a program and its data is therefore limited to the 64K that DS points to, which is why a COM file must not be larger than 64K. The 8086 uses memory as its battlefield and the registers as its military bases to speed up the work.
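The way two 8-bit halves form one 16-bit data register (AH & AL = AX, and so on) can be pictured with a few lines of Python. This is only an illustrative sketch: the helper names `join_halves` and `split_word` are made up for this example and do not belong to any 8086 tool.

```python
def join_halves(high: int, low: int) -> int:
    """Combine two 8-bit halves (e.g. AH and AL) into one 16-bit value (AX)."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def split_word(word: int) -> tuple:
    """Split a 16-bit value (e.g. AX) into its high and low 8-bit halves."""
    return (word >> 8) & 0xFF, word & 0xFF


ax = join_halves(0x02, 0x41)   # AH = 02H, AL = 41H
print(f"AX = {ax:04X}")        # AX = 0241
print(split_word(0x1FED))      # (31, 237), i.e. 1FH and EDH
```

Writing to AH or AL therefore changes only one half of AX, which is why the programs later in this tutorial can load a function number into AH without disturbing AL.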
In addition to the registers mentioned above, there are some special-purpose registers:
    IP (Instruction Pointer): instruction pointer register; together with CS it tracks the execution of the program;
    SP (Stack Pointer): stack pointer; used together with SS, it points to the current position in the stack;
    BP (Base Pointer): base pointer register; it can be used as a relative base position within SS;
    SI (Source Index): source index register; it can hold a source index pointer relative to the DS segment;
    DI (Destination Index): destination index register; it can hold a destination index pointer relative to the ES segment.
There is also a flag register, FR (Flag Register), with nine meaningful flags, which will be described in detail below.

Memory is a key part of the computer's operation; it is where the computer stores information while it works. Memory is organized as many storage locations, each identified by a number called its "address". The 8086 has 20 address lines, so the CPU has a 1M address space, which is also the range that DOS can effectively control. But the 8086 can only operate on 16-bit data, that is, only on 0 to 64K, so segmented addressing is needed to reach the whole of memory. A complete 20-bit address is split into two parts: 1. Segment base: a 16-bit binary number with four binary 0s appended (that is, one hexadecimal 0) becomes a 20-bit binary number, which can select any 64K segment within the 1M; it is usually written as the 16-bit number. 2. Offset: a 16-bit binary number used directly, pointing to any location within the segment. For example, for 2222 (segment base):3333 (offset), the actual 20-bit address is 22220 + 3333 = 25553 (hexadecimal).

Besides the background above, you must also know about DOS and BIOS function calls. Simply put, function calls are similar to the Win95 API: they are equivalent to subroutines. Writing assembly is hard enough as it is; without the subroutines supplied by MS and IBM, life would be truly unbearable (see "Computer Lovers" 98-12).
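The segment:offset arithmetic can be checked with a short Python sketch. The function name `physical_address` is an assumption made for this illustration; the calculation itself is the one described in the text (append one hexadecimal 0 to the segment base, then add the offset).

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment base and a 16-bit offset into a 20-bit address."""
    # Shifting left by 4 bits appends four binary 0s (one hexadecimal 0).
    # The mask keeps only the 20 bits that the 8086 address lines can carry.
    return ((segment << 4) + offset) & 0xFFFFF


print(f"{physical_address(0x2222, 0x3333):05X}")   # 25553
print(f"{physical_address(0x1FED, 0x0100):05X}")   # 1FFD0
```

Note that different segment:offset pairs can name the same location: 2222:3333 and 2555:0003 both give 25553.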
There are two main ways to write assembly language: 1. use an assembler such as MASM or TASM; 2. use the debugging program DEBUG.COM. DEBUG is not really an assembler; its main purpose is debugging, that is, correcting errors in assembly programs. However, it can also be used to write short assembly programs, and for beginners in particular DEBUG is the best entry-level tool. DEBUG is easy to operate: type DEBUG and press Enter, then type A and press Enter, and you can start assembling. With an assembler, on the other hand, you need a text editor, the assembler itself, LINK and EXE2BIN, each of which needs a series of fairly complicated commands; and for the assembler to process the source program you must add directive statements, which have nothing to do with the instruction statements, so that the assembler can recognize it. Using DEBUG avoids many hard-to-understand steps at the start. Besides assembling, DEBUG can also be used to examine and modify memory locations, to load, store and execute programs, and to examine and modify registers. In other words, DEBUG was designed to put us in touch with the hardware. (The usage of the common 8086 instructions is explained within each assembly program; for reasons of space it is not possible to list every instruction.)

DEBUG's A command can assemble a simple COM file, so a program written with DEBUG must start at address 100H to be valid (a COM file requirement). Follow me, step by step (pressing Enter after each step):
1. Enter A100 ; assemble starting at DS:100
2. Enter MOV DL,1 ; load the value 01H into the DL register
3. Enter MOV AH,2 ; load the value 02H into the AH register
4. Enter INT 21 ; call function 2 of DOS interrupt 21H, which displays the character whose code is in DL
5. Enter INT 20 ; call DOS interrupt 20H to terminate the program and hand control back
6. Press the Enter key (an empty line ends assembly mode)
7. The assembly language program is now in memory; enter G (run)
8. Result: one symbol is output (the character with code 01H, a smiling face).
   ㄖ ← This is not actually the real output: Word 97 cannot display the original result, so a stand-in has to do.
   Program terminated normally
The U command disassembles the hexadecimal machine code back into assembly instructions. You will see that the assembly instruction on each line has been assembled into the corresponding machine code, and that what the 8086 actually executes is the machine code.
1. Enter U100,106
    1FED:0100 B201    MOV DL,01
    1FED:0102 B402    MOV AH,02
    1FED:0104 CD21    INT 21
    1FED:0106 CD20    INT 20

In DEBUG, the R command lets you view and change register contents. The CS:IP registers hold the address of the instruction to be executed.
1. Enter R
    AX=0000 BX=0000 CX=0000 DX=0000 SP=FFEE BP=0000 SI=0000 DI=0000
    DS=1FED ES=1FED SS=1FED CS=1FED IP=0100 NV UP EI PL NZ NA PO NC
    1FED:0100 B201    MOV DL,01
When a program is run from DS:100, DEBUG automatically resets IP to 100 when the program terminates. To turn this program into an independent executable file, name it with the N command. It must be a COM file, otherwise it cannot be loaded in DEBUG.
1. Enter N SMILE.COM ; we must tell DEBUG how long the program is: it starts at 100, and its last instruction, the two-byte INT 20 at 106, ends at 107, so the program occupies 8 bytes. BX holds the high part of the length and CX the low part.
2. Enter RBX ; view the contents of the BX register; this program is only 8 bytes long, so this step can be skipped
3. Enter RCX ; view the contents of the CX register
4. Enter 8 ; the number of bytes in the program
5. Enter W ; the W (write) command writes the program to disk

With this, we have really come into contact with 8086 assembly instructions. When we write an assembly language program we usually do not put machine code into memory directly; instead we type a series of mnemonics, which are easier to remember than hexadecimal machine code. These are the assembly instructions. A mnemonic tells the CPU which operation to perform. In other words, assembly language, which is made of mnemonics, is designed for people, while machine language is designed for the PC.
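The byte count that goes into CX can be double-checked by adding up the machine code shown in the U listing: the last instruction, INT 20, sits at 0106 and is two bytes long, so the image runs from 0100 through 0107. The Python sketch below does that count; `LISTING`, `program_image` and `length_registers` are names invented for this illustration.

```python
# (address, machine code) pairs copied from the U listing of the first program
LISTING = [
    (0x0100, "B201"),  # MOV DL,01
    (0x0102, "B402"),  # MOV AH,02
    (0x0104, "CD21"),  # INT 21
    (0x0106, "CD20"),  # INT 20
]


def program_image(listing):
    """Concatenate the machine code bytes in listing order."""
    return b"".join(bytes.fromhex(code) for _, code in listing)


def length_registers(length):
    """Split a byte count into the (BX, CX) pair that holds the file length."""
    return (length >> 16) & 0xFFFF, length & 0xFFFF


image = program_image(LISTING)
bx, cx = length_registers(len(image))
print(len(image), f"BX={bx:04X} CX={cx:04X}")   # 8 BX=0000 CX=0008
```

The same count falls out of the addresses alone: the first free address after the program is 0108, and 0108 - 0100 = 8.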
Now let's analyze a program that displays all the ASCII codes.
1. Enter DEBUG
2. Enter A100
3. Enter
    MOV CX,0100 ; load the loop count
    MOV DL,00   ; load the first ASCII code; a new code is loaded on each pass
    MOV AH,02
    INT 21
    INC DL      ; INC: increment instruction; adds 1 to the value in data register DL each time
    LOOP 0105   ; LOOP: loop instruction; each time it executes, CX is decremented by 1 and
                ; control jumps to the loop's start address 105, until CX is 0 and the loop stops
    INT 20
4. Enter G to display all the ASCII codes

When we want to display a string, such as "understand", we can use function 9H of DOS interrupt 21H. Enter the following program, save it and run it:
1. Enter A100
    MOV DX,109  ; DS:DX = start address of the string
    MOV AH,9    ; DOS function 09H
    INT 21      ; output the string
    INT 20
    DB 'understand$' ; define the string
Assembly language has two different kinds of instructions: 1. Regular instructions, such as MOV: these belong to the CPU and tell the CPU what to do when the program runs, so they are stored in memory as operation codes (op-codes). 2. Pseudo-instructions, such as DB: these belong to an assembler such as DEBUG and tell the assembler what to do while assembling. The DB (Define Byte) pseudo-instruction tells DEBUG to put the ASCII codes of all the characters inside the single quotes into memory. A string used with function 9H must end with the symbol $. Use the D command to see what the DB pseudo-instruction put into memory.
6. Enter D100
    1975:0100  BA 09 01 B4 09 CD 21 CD-20 75 6E 64 65 72 73 74   ......!. underst
    1975:0110  61 6E 64 24 8B 46 F8 89-45 04 8B 46 34 00 64 19   and$.F..E..F4.d.
    1975:0120  89 45 02 33 C0 5E 5F C9-C3 00 C8 04 00 00 57 56   .E.3.^_.......WV
    1975:0130  6B F8 0E 81 C7 FE 53 8B-DF 8B C2 E8 32 FE 0B C0   k.....S.....2...
    1975:0140  74 05 33 C0 99 EB 17 8B-45 0C E8 D4 97 8B F0 89   t.3.....E.......
    1975:0150  56 FE 0B D0 74 EC 8B 45-08 03 C6 8B 56 FE 5E 5F   V...t..E....V.^_
    1975:0160  C9 C3 C8 02 00 00 6B D8-0E 81 C3 FE 53 89 5E FE   ......k.....S.^.
    1975:0170  8B C2 E8 FB FD 0B C0 75-09 8B 5E FE 8B 47 0C E8   .......u..^..G..

Now let's analyze another program: it reads any string from the keyboard and then displays it. DB 20 places the byte 20H at the start of the buffer; function 0AH reads that first byte as the size of the buffer, so this sets aside 20H bytes of unused memory for buffer use.
1. Enter A100
    MOV DX,0116 ; DS:DX = buffer address
    MOV AH,0A   ; function 0AH
    INT 21      ; buffered keyboard input
    MOV DL,0A   ; function 0AH ends every string with a carriage return (0DH, produced
    MOV AH,02   ; by the Enter key), so the cursor returns to the start of the input
    INT 21      ; line; to keep the new output string from covering the original input,
                ; function 2H outputs a line feed (0AH), which moves the cursor to the next row
    MOV DX,0118 ; load the start position of the string
    MOV AH,09   ; function 9H stops output when it meets the symbol $, so the string must
    INT 21      ; end with $, otherwise function 9H goes on displaying useless data from memory
    INT 20
    DB 20       ; define the buffer

A word of advice: when learning assembly, impatience is taboo.
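Why the string is printed from 0118 when the buffer starts at 0116 becomes clearer with a small model of the buffer. The Python sketch below is a simplified picture, not DOS itself: it assumes the usual layout for function 0AH (first byte = buffer size, second byte = number of characters actually typed, then the characters followed by the carriage return), and the helper names are invented for this illustration.

```python
def dos_buffered_input(max_chars: int, typed: str) -> bytes:
    """Model the buffer contents after INT 21H function 0AH (simplified)."""
    # Assumption of this sketch: room is kept for the final carriage return,
    # so at most max_chars - 1 typed characters are stored.
    text = typed.encode("ascii")[: max_chars - 1]
    return bytes([max_chars, len(text)]) + text + b"\r"


def dos_print_string(memory: bytes, start: int) -> str:
    """Model function 09H: output characters from start up to, not including, '$'."""
    end = memory.index(b"$", start)
    return memory[start:end].decode("ascii")


buffer = dos_buffered_input(0x20, "hello$")   # the user types hello$ and presses Enter
print(buffer[0], buffer[1])                   # 32 6
print(dos_print_string(buffer, 2))            # hello
```

The two header bytes are the reason for the offset: with the buffer at 0116, the typed text begins two bytes later, at 0118. In this model the user has to type the $ as part of the input, because function 0AH ends the string with a carriage return, not with $.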
Enough pleasantries. To do a good job, one must first sharpen one's tools. Rather than calling DEBUG a compiler, it is better to call it a "direct translator": DEBUG's A command can only convert assembly instructions into machine language one line at a time, and it does so immediately. A true assembler (MASM) works differently: you use a text editor (EDIT, etc.) to create an independent text file with the extension .ASM, called the source program, which is the input to the MASM program. MASM assembles the ASM file into an .OBJ file, called the object program. The OBJ file contains only information about where the program is to be loaded and how it is to be combined with other programs; it cannot be loaded into memory and executed directly. The linker LINK converts the OBJ file into an EXE file that can be loaded into memory and executed. You can also use EXE2BIN to convert a qualifying EXE file into a COM file (COM files not only take up the least memory but also run the fastest).

Below we use MASM to write a program with the same function as the first program written with DEBUG. Use EDIT to create a source file called SMILE.ASM.

    Source program              DEBUG program
    PROGNAM SEGMENT
    ASSUME CS:PROGNAM
    ORG 100H                    A100
    MOV DL,1                    MOV DL,1
    MOV AH,2                    MOV AH,2
    INT 21H                     INT 21
    INT 20H                     INT 20
    PROGNAM ENDS
    END
Comparison: 1. MASM assumes that all values are decimal, while DEBUG uses only hexadecimal, so in the source program a letter must be added after a number to show its base: H for hexadecimal, D for decimal. A hexadecimal number that begins with a letter must also have a 0 in front to show that it is a number, for example 0AH. 2. The source program adds five directive statements. PROGNAM SEGMENT and PROGNAM ENDS come as a pair and tell MASM and LINK that the program is placed in a segment called PROGNAM (PROGram NAMe); the name can be chosen freely, but its position is fixed. ASSUME CS:PROGNAM must appear at the beginning of the program; it tells the assembler that this segment is addressed through the CS register. END tells MASM that the program ends here. ORG 100H is equivalent to DEBUG's A100: assembly starts from offset 100. Every source program for a COM file must contain these five lines, in the same order and positions; memorize this once, and it is the same every time.

Then we assemble SMILE.ASM with MASM.
    Enter MASM SMILE  ← no need to add the .ASM extension
    Microsoft (R) Macro Assembler Version 5.10
    Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.
    Object filename [SMILE.OBJ]:   ← change the output OBJ file name? If not, press Enter
    Source listing [NUL.LST]:      ← need a listing file (LST)? If not, press Enter
    Cross-reference [NUL.CRF]:     ← need a cross-reference file (CRF)? If not, press Enter
    50162 + 403867 Bytes symbol space free
    0 Warning Errors  ← a warning error means the assembler does not understand some statement, usually a typing error
    0 Severe Errors   ← a severe error makes the program impossible to execute, usually an error in syntax structure

If no errors are reported, an OBJ file is generated. The OBJ file contains the binary result of the assembly, but it cannot be loaded and executed by DOS as it is; it must first be linked.
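The number-suffix rule from point 1 of the comparison can be illustrated with a small Python sketch. It covers only the cases named in the text (suffix H, suffix D, no suffix), and `parse_masm_number` is a hypothetical helper written for this example, not part of MASM.

```python
def parse_masm_number(literal: str) -> int:
    """Read a number written in the MASM style described in the text."""
    text = literal.strip().upper()
    # A literal must start with a digit; otherwise it looks like a name
    # (AH without a leading 0 is the register AH, not the number 0AH).
    if not text or not text[0].isdigit():
        raise ValueError(f"{literal!r} does not start with a digit")
    if text.endswith("H"):
        return int(text[:-1], 16)   # hexadecimal
    if text.endswith("D"):
        return int(text[:-1], 10)   # explicit decimal
    return int(text, 10)            # no suffix: decimal by default


for sample in ("100H", "0AH", "10D", "21H", "20"):
    print(sample, "=", parse_masm_number(sample))
```

This is why INT 21 typed in DEBUG has to become INT 21H in the source file: without the H, the assembler would read 21 as a decimal number.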
Next, link the OBJ file (SMILE.OBJ) into an EXE file (SMILE.EXE).
1. Enter LINK SMILE  ← no need to add the OBJ extension
    Microsoft (R) Overlay Linker Version 3.64
    Copyright (C) Microsoft Corp 1981, 1988. All rights reserved.
    Run File [SMILE.EXE]:   ← change the output EXE file name? If not, press Enter
    List File [NUL.MAP]:    ← need a list file (MAP)? If not, press Enter
    Libraries [.LIB]:       ← need library files? If so, type the file names; if not, press Enter
    LINK : warning L4021: no stack segment
                            ← a COM file does not use a stack segment, so the message
                            ← "no stack segment" does not affect normal execution of the program
At this point the EXE file has been generated, and we still need EXE2BIN to convert the EXE file (SMILE.EXE) into a COM file. Entering EXE2BIN SMILE generates a BIN file (SMILE.BIN). The BIN file is in fact identical to a COM file, but because DOS only recognizes COM, EXE and BAT files as executable, the BIN file cannot be run as it is; either rename it, or enter EXE2BIN SMILE SMILE.COM directly. There should now be a SMILE.COM file on the disk. You can run it directly at the C:> prompt by entering the file name SMILE. Do you feel that producing a program with an assembler is much more trouble than with DEBUG? For small programs that is true, but with larger programs you will discover its advantages. Next we redo the ASCII program with the assembler to see what differs. First create an ASCII.ASM file with EDIT.COM.

    PROGNAM SEGMENT       ; define the segment
    ASSUME CS:PROGNAM     ; put the base address of the segment defined above into CS
          MOV CX,100H     ; load the loop count
          MOV DL,0        ; load the first ASCII code; a new code is loaded on each pass
    NEXT: MOV AH,2
          INT 21H
          INC DL          ; INC: increment instruction; adds 1 to the value in data register DL each time
          LOOP NEXT       ; loop instruction; each time it executes, CX is decremented by 1, until CX is 0 and the loop stops
          INT 20H
    PROGNAM ENDS          ; end of the segment
    END                   ; end of assembly

In an assembly language source program, each program line contains three elements:

    START:        MOV DL,1       ; load the first ASCII code; a new code is loaded each time
    identifier    expression     annotation
In the source file, annotations make the program easier to understand and easier to consult later. Every annotation begins with ";". The assembler ignores annotations, and they do not appear in the OBJ, EXE or COM file. Since we do not know the address of each program line when writing the source, we represent relative addresses with symbolic names, called "identifiers" (labels). We usually type the identifier at the appropriate position on the appropriate line. An identifier can be up to 31 bytes long, so try to use short, concise text as identifiers in a program. Now you can assemble this ASCII.ASM file into ASCII.COM: 1. MASM ASCII, 2. LINK ASCII, 3. EXE2BIN ASCII ASCII.COM.
Note: when you assemble a program you have designed, there will usually be typing errors, misspelled identifier names, hexadecimal numbers missing their H, logic errors and so on. The advice that assembly veterans often give newcomers is: it is best to make some mistakes along the way (so others have told me); and if a program gives the expected result the first time you run it, you had best check it again, because the result may be wrong. In principle, as long as the overall logic is sound, hunting down the errors in a program can be even more interesting than writing the program itself. When writing a large program, it is best to split it into many modules, so that each part has a simple purpose and is easy to write and to check for errors, the structure is clearer, and assembly time is saved. If there is something you do not follow while reading, it is best to write down the contents of the registers, memory and so on on paper and trace through them slowly; it will become clear.

Below we write a "big program" that reads a decimal value from the keyboard, converts it to hexadecimal and displays it on the screen. Foreword: to have the 8086 perform such a function, we must first break the problem into a series of steps; this is called program planning. First, use a flowchart to make sure that the logic of the whole program is sound (needless to say, this applies in any language). This modular planning method is called "top-down program planning". When actually writing the program, however, you start from the smallest unit modules (subroutines), and when each module is finished you combine them into the big program; this way of keeping the whole in view while starting with the small parts is called "bottom-up program design".

Our first module is BINIHEX; its main purpose is to take the binary number in the 8086's BX register and display it on the screen in hexadecimal. Note: it is normal if a subroutine cannot run on its own.
    BINIHEX SEGMENT
    ASSUME CS:BINIHEX
             MOV CH,4      ; number of hexadecimal digits in BX (four digits)
    ROTATE:  MOV CL,4      ; use CL as the counter: 4 bits are handled for each hexadecimal digit
             ROL BX,CL     ; rotate BX left by the number of bits in CL
             MOV AL,BL     ; copy BL, the low eight bits of BX, into AL
             AND AL,0FH    ; clear the bits that are not used
             ADD AL,30H    ; add 30H to AL and keep the result in AL
             CMP AL,3AH    ; compare with 3AH
             JL PRINTIT    ; jump if less than 3AH
             ADD AL,7H     ; add 7H to AL
    PRINTIT: MOV DL,AL     ; put the ASCII code into DL
             MOV AH,2
             INT 21H
             DEC CH        ; CH minus one; when it reaches zero, the zero flag is set to 1
             JNZ ROTATE    ; JNZ: jump to the specified address when the zero flag is not set,
                           ; that is: not zero yet, so jump
             INT 20H       ; return from the subroutine
    BINIHEX ENDS
    END
The rotate-left instruction ROL rotates register BX (the contents of BX will be supplied by the second subroutine) so that the four hexadecimal digits can be processed one after another: 1. CL is used as the counter that holds the number of bits to rotate. 2. Each rotation moves the leading hexadecimal digit of BX to the far right. The AND instruction (logical "and": a result bit is 1 only when both corresponding bits are 1, otherwise it is 0) clears the part that is not needed: the value of BL is copied into AL, and AND with 0FH (00001111) clears the left four bits of AL. The ASCII codes of 0 to 9 are 30H to 39H and those of A to F are 41H to 46H, with a gap of 7H in between; so if AL is less than 3AH after 30H has been added, adding 30H is all that is needed, otherwise 7H more is added to AL. The ADD instruction adds two expressions and stores the result in the left one.

The flag register (FR) is a separate 16-bit register containing nine flags. When certain assembly instructions execute (mostly those involving comparison, arithmetic or logical operations), the relevant flags are set to 1 or cleared to 0. The flags met most often are the zero flag (ZF), the sign flag (SF), the overflow flag (OF) and the carry flag (CF). The flags record the effect of executing an instruction; other instructions can then test the state of the flags and act on it. The CMP instruction is much like subtraction: it subtracts one expression from the other, but the contents of the register or memory are not changed; only the relevant flags change. Here, if AL is less than 3AH, the result of the subtraction is negative and the sign flag is set to 1; otherwise it is cleared to 0. The JL instruction can be read as: if less, jump to the specified location; if greater or equal, carry on with the next instruction.
CMP is used together with conditional jump instructions such as JG and JL to form the branch structures of a program; this is a common technique in writing assembly programs.
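The BINIHEX steps described above (rotate left by four bits, mask with 0FH, add 30H, add 7H more when the result is not below 3AH) can be traced in Python. This is a sketch of the same arithmetic, not a translation of the DOS program: the function name `binihex` is reused only for readability, and the characters are collected in a string instead of being printed through INT 21H.

```python
def binihex(bx: int) -> str:
    """Return the four hexadecimal digits of a 16-bit value, leftmost first."""
    digits = []
    for _ in range(4):                           # MOV CH,4 ... DEC CH / JNZ ROTATE
        bx = ((bx << 4) | (bx >> 12)) & 0xFFFF   # ROL BX,CL with CL = 4
        al = bx & 0x0F                           # MOV AL,BL / AND AL,0FH
        al += 0x30                               # ADD AL,30H
        if al >= 0x3A:                           # CMP AL,3AH: JL PRINTIT is not taken
            al += 0x07                           # ADD AL,7H skips the gap up to 'A'
        digits.append(chr(al))                   # the program prints this character
    return "".join(digits)


print(binihex(0x1A2F))   # 1A2F
print(binihex(255))      # 00FF
```

After four rotations of four bits each, BX is back to its original value, so the routine leaves BX unchanged.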
The second module, DECIBIN, receives a decimal number typed on the keyboard and converts it to a binary number in the BX register, for use by module 1, BINIHEX.

    DECIBIN SEGMENT
    ASSUME CS:DECIBIN
             MOV BX,0      ; clear BX
    NEWCHAR: MOV AH,1
             INT 21H       ; read one character typed on the keyboard into AL and display it
             SUB AL,30H    ; AL minus 30H, result kept in AL: converts the ASCII code to binary
             JL EXIT       ; less than 0: jump
             CMP AL,9D
             JG EXIT       ; greater than 9: jump
             CBW           ; convert the 8-bit AL into the 16-bit AX
             XCHG AX,BX    ; exchange the data in AX and BX
             MOV CX,10D    ; put decimal 10 into CX
             MUL CX        ; multiply the contents of AX by the expression (CX); the result is kept in AX
             XCHG AX,BX
             ADD BX,AX
             JMP NEWCHAR   ; unconditional jump
    EXIT:    INT 20H       ; return to the main program
    DECIBIN ENDS
    END

The actual effect of CBW is: if the value in AL is positive, AH is filled with 00H; if it is negative, AH is filled with FFH. XCHG is often used when the contents of a register need to be kept temporarily.

Of course, we also need a subroutine (CRLF) so that the hexadecimal number being displayed does not cover the decimal number that was typed in.

    CRLF SEGMENT
    ASSUME CS:CRLF
          MOV DL,0DH    ; put the carriage return ASCII code 0DH into DL
          MOV AH,2
          INT 21H
          MOV DL,0AH    ; put the line feed ASCII code 0AH into DL
          MOV AH,2
          INT 21H
          INT 20H       ; return to the main program
    CRLF ENDS
    END

Now we can merge the BINIHEX, DECIBIN and CRLF modules into one big program. First, we change these three module subroutines slightly; then we write a program that calls each subroutine.

    CRLF PROC NEAR
          MOV DL,0DH
          MOV AH,2
          INT 21H
          MOV DL,0AH
          MOV AH,2
          INT 21H
          RET
    CRLF ENDP
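Before moving on, the digit-accumulation loop in DECIBIN (multiply the running total by ten, then add the new digit) can be traced in Python. This is a sketch of the arithmetic only: the keystrokes are passed in as a string instead of being read through INT 21H, and the name `decibin` is reused purely for readability.

```python
def decibin(keys: str) -> int:
    """Convert typed decimal digits to a 16-bit value, stopping at the first non-digit."""
    bx = 0                                  # MOV BX,0
    for key in keys:                        # each pass models one INT 21H function 1 read
        al = ord(key) - 0x30                # SUB AL,30H
        if al < 0 or al > 9:                # JL EXIT / CMP AL,9D / JG EXIT
            break
        # XCHG, MUL CX, XCHG and ADD BX,AX together compute BX = BX * 10 + digit;
        # the mask models the 16-bit width of BX.
        bx = (bx * 10 + al) & 0xFFFF
    return bx


print(decibin("255\r"))   # 255
print(decibin("12a9"))    # 12
```

Any key that is not a digit, including Enter (0DH), ends the input, which is how the assembly routine knows that the number is complete.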
Like the pseudo-instruction pair SEGMENT and ENDS, PROC and ENDP also come as a pair, used to mark and define a procedure. In fact, the real role of PROC is only to tell the assembler whether the procedure being called is near (NEAR) or far (FAR). An ordinary program is started directly from DEBUG, so it returns with INT 20; a procedure called with the CALL instruction must use the return instruction RET instead. RET transfers control to the address at the top of the stack, and that address was put there by the CALL instruction that called the procedure. The modules are all ready; now we combine the subroutines, and the job is done:

    DECIHEX SEGMENT          ; main program
    ASSUME CS:DECIHEX
    ORG 100H
             MOV CX,4        ; load the loop count into CX; the subroutines also use CX,
                             ; so they must save it on the stack
    REPEAT:  CALL DECIBIN    ; call the decimal-to-binary subroutine
             CALL CRLF       ; call the carriage return and line feed subroutine
             CALL BINIHEX    ; call the subroutine that converts binary to hexadecimal and displays it
             CALL CRLF
             LOOP REPEAT     ; loop 4 times, for four conversions in a row
             MOV AH,4CH      ; call function 4CH of DOS interrupt 21H to exit the program; it works like INT 20H
             INT 21H         ; but is more widely applicable: when INT 20H cannot exit, try it
    DECIBIN PROC NEAR
             PUSH CX         ; push CX onto the stack
             ...

The CALL instruction transfers control to the address of the subroutine and at the same time pushes the address of the instruction following the CALL onto the stack as the return address. CALL comes in near (NEAR) and far (FAR) forms: 1. NEAR: the contents of IP are pushed onto the stack; used when the calling program and the procedure are in the same segment. 2. FAR: the contents of the CS and IP registers are pushed onto the stack; used when the calling program and the procedure are in different segments. PUSH and POP are another pair of instructions, used to push register contents onto the stack and to pop them off again; they are mostly used to protect register data, especially in subroutine calls. The stack works on the "first in, last out" principle, so a sequence such as PUSH AX, PUSH BX ... POP BX, POP AX restores the protected data correctly.
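The "first in, last out" behaviour of PUSH and POP can be sketched with a tiny stack model in Python. The class `Stack8086` is invented for this illustration; the starting value FFEE for SP is simply the one shown in the earlier R output, and the dictionary stands in for the memory of the stack segment.

```python
class Stack8086:
    """Tiny model of the 8086 stack: SP moves down on PUSH and back up on POP."""

    def __init__(self, sp: int = 0xFFEE):
        self.sp = sp        # starting SP, chosen arbitrarily for this sketch
        self.memory = {}    # offset in the stack segment -> 16-bit word

    def push(self, value: int) -> None:
        self.sp = (self.sp - 2) & 0xFFFF       # SP is decreased by 2 first
        self.memory[self.sp] = value & 0xFFFF  # then the word is stored at SS:SP

    def pop(self) -> int:
        value = self.memory[self.sp]           # read the word at SS:SP
        self.sp = (self.sp + 2) & 0xFFFF       # then SP is increased by 2
        return value


stack = Stack8086()
ax, bx = 0x1234, 0x5678
stack.push(ax)        # PUSH AX
stack.push(bx)        # PUSH BX
ax = bx = 0           # a subroutine overwrites both registers
bx = stack.pop()      # POP BX (pushed last, popped first)
ax = stack.pop()      # POP AX
print(f"AX={ax:04X} BX={bx:04X} SP={stack.sp:04X}")   # AX=1234 BX=5678 SP=FFEE
```

Popping in the same order as pushing (POP AX, POP BX) would swap the two values, which is why the POP sequence must mirror the PUSH sequence.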
This super-concentrated assembly language tutorial comes to a close here; I hope it has laid a foundation for your own independent designs. More and better techniques will come from what you accumulate in everyday practice. I wish you success!