Appendix Two: Programming Language Efficiency Analysis
The following test takes a 24 x 24 dot-matrix glyph, enlarges it to 48 x 48, and compares several languages on processing speed, space occupied, and time needed to write the program. To measure execution time accurately, the enlargement is repeated 10,000 times; the 24 x 24 glyph itself is assumed to be blank.
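As a reference point for the listings in this appendix, the task itself can be sketched in C. This is only an illustrative sketch, not one of the timed programs: the names double_bits and zoom24to48 are hypothetical, and the layout (3 bytes per source row, 6 bytes per destination row, leftmost dot in the most significant bit) is an assumption consistent with the 72-byte and 288-byte buffers used below.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SRC_ROWS = 24, SRC_COLS = 3, DST_COLS = 6 };

/* Spread the 8 bits of b over 16 bits, writing every source bit twice. */
static uint16_t double_bits(uint8_t b)
{
    uint16_t out = 0;
    for (int bit = 7; bit >= 0; bit--) {
        unsigned v = (b >> bit) & 1u;
        out = (uint16_t)((out << 2) | (v ? 3u : 0u));
    }
    return out;
}

/* Enlarge a 24 x 24 glyph (72 bytes) to 48 x 48 (288 bytes). */
static void zoom24to48(const uint8_t *src, uint8_t *dst)
{
    assert(src != NULL && dst != NULL);
    for (int row = 0; row < SRC_ROWS; row++) {
        uint8_t *top = dst + (2 * row) * DST_COLS;
        for (int col = 0; col < SRC_COLS; col++) {
            uint16_t w = double_bits(src[row * SRC_COLS + col]);
            top[2 * col]     = (uint8_t)(w >> 8);   /* left half  */
            top[2 * col + 1] = (uint8_t)(w & 0xFF); /* right half */
        }
        /* Vertical doubling: the row below is a copy of this row. */
        memcpy(top + DST_COLS, top, DST_COLS);
    }
}
```

Every method discussed below produces the same output; they differ only in how the doubling of each byte is carried out.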
First, Assembly
Assembly language allows endless variation. We begin with the most general approach, which shifts the glyph one dot at a time.

   1:         PAGE    60,132
   2: CG      SEGMENT
   3: BUFIN   DB      72 DUP (0)
   4: BUFOT   DB      72*4 DUP (0)
   5:         ASSUME  CS:CG,DS:CG,ES:CG
   6: START:
   7:         MOV     AX,CG
   8:         MOV     DS,AX
   9:         MOV     ES,AX
  10:         CLD
  11:         MOV     BP,10000            ; repeat 10,000 times
  12: S3:
  13:         SUB     CX,CX
  14:         MOV     BX,CX
  15:         MOV     DX,1803H            ; counters: DH = 24 rows, DL = 3 bytes
  16:         MOV     SI,OFFSET BUFIN     ; start of the 24 * 24 glyph
  17:         MOV     DI,OFFSET BUFOT     ; destination of the 48 * 48 glyph
  18: MVBYTE:
  19:         MOV     BH,DL               ; three bytes per row
  20: MVDB:
  21:         LODSB                       ; fetch one source byte
  22:         MOV     BL,AL
  23:         MOV     CL,8                ; eight bits
  24: MVDB1:
  25:         RCL     BL,1                ; shift left once, bit goes to carry
  26:         PUSHF                       ; save the flags
  27:         RCL     AX,1                ; shift the bit into the word
  28:         POPF                        ; restore the saved carry
  29:         RCL     AX,1                ; shift it in again: the bit is doubled
  30:         LOOP    MVDB1               ; loop eight times
  31:         STOSW                       ; store the word
  32:         MOV     [DI+4],AX           ; duplicate it on the row below
  33:         DEC     BH                  ; three bytes in all
  34:         JNZ     MVDB
  35:         ADD     DI,6                ; skip over the duplicated row
  36:         DEC     DH
  37:         JNZ     MVBYTE              ; 24 rows in all
  38:         DEC     BP                  ; 10,000 passes
  39:         JNZ     S3
  40:         MOV     AX,4C00H            ; finished
  41:         INT     21H
  42: CG      ENDS
  43:         END     START

This program took fifteen minutes to write. Once assembled, the executable occupies 934 bytes and runs in 14.5 seconds. Analysis of the listing shows that most of the execution time is spent in the loop at lines 23 to 30.
To gain speed we can spend space: remove the loop and write out the eight shift steps in line, as in this second version:

  23:         RCL     BL,1
  24:         RCL     AX,1
  25:         SHL     AX,1
  26:         ...                         ; lines 23 to 25 written eight times in all
  47:         MOV     CX,AX               ; AX holds each bit followed by a zero
  48:         SHR     CX,1                ; CX holds the same bits one place lower
  49:         OR      AX,CX               ; merged: every bit is now doubled

The program grows by 36 bytes, but the execution time falls to 7.1 seconds, twice as fast! Is there a still better way? There certainly is. For example, since every dot is doubled, the doubled patterns are known in advance. Build a table of them, look up the value that corresponds to each source pattern, and the dot-by-dot shifting can be dispensed with. From line 18 onward the program becomes:

  18: VT2:
  19:         CALL    MVBYTE              ; enlarge one row
  20:         SUB     SI,3                ; the row must be enlarged again vertically
  21:         CALL    MVBYTE
  22:         DEC     DH
  23:         JNZ     VT2                 ; next row
  24:         RET                         ; finished
  25: MVBYTE:
  26:         MOV     CL,DL               ; three bytes per row
  27: MVDB:
  28:         LODSB                       ; fetch one byte
  29:         MOV     AH,AL               ; copy it into both halves
  30:         AND     AX,0FF0H            ; AL keeps the high nibble, AH the low nibble
  31:         SHR     AL,1                ; shift right four times
  32:         SHR     AL,1
  33:         SHR     AL,1
  34:         SHR     AL,1
  35:         MOV     BL,AL
  36:         MOV     AL,BYTETB[BX]       ; table value for the left nibble
  37:         MOV     BL,AH
  38:         MOV     AH,BYTETB[BX]       ; table value for the right nibble
  39:         STOSW                       ; store the word
  40:         LOOP    MVDB                ; three times
  41:         RET
  42: ; conversion table
  43: BYTETB  DB      000H,003H,00CH,00FH,030H,033H,03CH,03FH
  44:         DB      0C0H,0C3H,0CCH,0CFH,0F0H,0F3H,0FCH,0FFH
  45: CG      ENDS
  46:         END     START
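The nibble lookup used in the listing above can be sketched in C as follows. This is a minimal illustration under assumptions, not the author's program: the names nibble_tab, double_byte, and check_nibble_tab are hypothetical, while the sixteen table values are the ones given in BYTETB. The self-check confirms that each entry is simply its four-bit index with every bit written twice.

```c
#include <assert.h>
#include <stdint.h>

/* Same sixteen values as BYTETB: each 4-bit pattern with its bits doubled. */
static const uint8_t nibble_tab[16] = {
    0x00, 0x03, 0x0C, 0x0F, 0x30, 0x33, 0x3C, 0x3F,
    0xC0, 0xC3, 0xCC, 0xCF, 0xF0, 0xF3, 0xFC, 0xFF
};

/* Double one source byte with two 4-bit lookups.
   out[0] is the left output byte, out[1] the right one. */
static void double_byte(uint8_t b, uint8_t out[2])
{
    out[0] = nibble_tab[b >> 4];   /* high nibble -> left byte  */
    out[1] = nibble_tab[b & 0x0F]; /* low nibble  -> right byte */
}

/* Self-check: recompute every entry by shifting and compare with the table. */
static void check_nibble_tab(void)
{
    for (unsigned n = 0; n < 16; n++) {
        unsigned expect = 0;
        for (int bit = 3; bit >= 0; bit--)
            expect = (expect << 2) | (((n >> bit) & 1u) ? 3u : 0u);
        assert(nibble_tab[n] == expect);
    }
}
```

The trade is the one the text describes: sixteen bytes of table replace eight shift steps per source byte.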
There is another way, because the XLAT instruction was designed for exactly this kind of lookup. From line 25 onward the program is adjusted as follows:

  25: MVBYTE:
  26:         MOV     CL,4                ; AL will be shifted by four bits
  27: MVDB:
  29:         LODSB                       ; fetch one byte
  30:         MOV     AH,AL               ; copy it into both halves
  31:         AND     AX,0FF0H            ; AH and AL each keep four bits
  32:         SHR     AL,CL
  33:         XLAT
  34:         XCHG    AL,AH
  35:         XLAT
  36:         STOSW
  37:         DEC     DL
  38:         JNZ     MVDB

The executable is 959 bytes and runs in 3.2 seconds, which is better still. The drawback of this version is that the loop still costs time and that every byte needs two four-bit lookups. If we look up a whole byte at a time, the table must be enlarged so that it holds the value for every possible byte (a "full table"), and then a single lookup is enough.
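The step from the sixteen-entry table to the full table can be sketched in C. This is a hedged illustration rather than the author's code: word_tab, build_word_tab, and zoom_row are hypothetical names, and the table is built at run time here only to show its structure, whereas the assembly version stores it as constant data. Each of the 256 entries is two bytes, so the table occupies 512 bytes.

```c
#include <assert.h>
#include <stdint.h>

static const uint8_t nibble_tab[16] = {
    0x00, 0x03, 0x0C, 0x0F, 0x30, 0x33, 0x3C, 0x3F,
    0xC0, 0xC3, 0xCC, 0xCF, 0xF0, 0xF3, 0xFC, 0xFF
};

/* word_tab[b][0] is the left output byte for source byte b,
   word_tab[b][1] the right one. */
static uint8_t word_tab[256][2];

static void build_word_tab(void)
{
    assert(sizeof word_tab == 512); /* 256 entries of two bytes each */
    for (unsigned b = 0; b < 256; b++) {
        word_tab[b][0] = nibble_tab[b >> 4];
        word_tab[b][1] = nibble_tab[b & 0x0F];
    }
}

/* Enlarge one 3-byte source row into two identical 6-byte output rows,
   using a single table lookup per source byte. */
static void zoom_row(const uint8_t *src, uint8_t *dst)
{
    for (int col = 0; col < 3; col++) {
        const uint8_t *e = word_tab[src[col]];
        dst[2 * col]     = dst[2 * col + 6] = e[0];
        dst[2 * col + 1] = dst[2 * col + 7] = e[1];
    }
}
```

Calling build_word_tab() once and then zoom_row() for each of the 24 rows, advancing the destination by 12 bytes each time, enlarges the whole glyph.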
Starting again from line 20, and trimming the instructions as far as possible:

  20:         MOV     DX,OFFSET BYTETB
  21: MVDB:
  22:         LODSB
  23:         SUB     AH,AH
  24:         SHL     AX,1                ; one byte selects a two-byte entry
  25:         ADD     AX,DX               ; add the start of the table
  26:         MOV     BX,AX               ; BX can be used for indirect addressing
  27:         MOV     AX,[BX]             ; fetch the word from the table
  28:         STOSW                       ; store it in the first row
  29:         MOV     [DI+4],AX           ; repeat it on the row below
  30:         LODSB                       ; process column 2
  31:         SUB     AH,AH
  32:         SHL     AX,1
  33:         ADD     AX,DX
  34:         MOV     BX,AX
  35:         MOV     AX,[BX]
  36:         STOSW
  37:         MOV     [DI+4],AX
  38:         LODSB                       ; process column 3
  39:         SUB     AH,AH
  40:         SHL     AX,1
  41:         ADD     AX,DX
  42:         MOV     BX,AX
  43:         MOV     AX,[BX]
  44:         STOSW
  45:         MOV     [DI+4],AX
  46:         ADD     DI,6                ; move on to the next row
  47:         LOOP    MVDB                ; 24 times in all
  48:         DEC     BP                  ; 10,000 passes
  49:         JNZ     S3
  50:         MOV     AX,4C00H            ; finished
  51:         INT     21H
  52:         RET

The code ends here; the full table used by the program follows.
   1: ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
   4: BYTETB  LABEL   WORD
   5:         DB      000H,000H,000H,003H,000H,00CH,000H,00FH
   6:         DB      000H,030H,000H,033H,000H,03CH,000H,03FH
   7:         DB      000H,0C0H,000H,0C3H,000H,0CCH,000H,0CFH
   8:         DB      000H,0F0H,000H,0F3H,000H,0FCH,000H,0FFH
   9:         DB      003H,000H,003H,003H,003H,00CH,003H,00FH
  10:         DB      003H,030H,003H,033H,003H,03CH,003H,03FH
  11:         DB      003H,0C0H,003H,0C3H,003H,0CCH,003H,0CFH
  12:         DB      003H,0F0H,003H,0F3H,003H,0FCH,003H,0FFH
  13:         DB      00CH,000H,00CH,003H,00CH,00CH,00CH,00FH
  14:         DB      00CH,030H,00CH,033H,00CH,03CH,00CH,03FH
  15:         DB      00CH,0C0H,00CH,0C3H,00CH,0CCH,00CH,0CFH
  16:         DB      00CH,0F0H,00CH,0F3H,00CH,0FCH,00CH,0FFH
  17:         DB      00FH,000H,00FH,003H,00FH,00CH,00FH,00FH
  18:         DB      00FH,030H,00FH,033H,00FH,03CH,00FH,03FH
  19:         DB      00FH,0C0H,00FH,0C3H,00FH,0CCH,00FH,0CFH
  20:         DB      00FH,0F0H,00FH,0F3H,00FH,0FCH,00FH,0FFH
  21:         DB      030H,000H,030H,003H,030H,00CH,030H,00FH
  22:         DB      030H,030H,030H,033H,030H,03CH,030H,03FH
  23:         DB      030H,0C0H,030H,0C3H,030H,0CCH,030H,0CFH
  24:         DB      030H,0F0H,030H,0F3H,030H,0FCH,030H,0FFH
  25:         DB      033H,000H,033H,003H,033H,00CH,033H,00FH
  26:         DB      033H,030H,033H,033H,033H,03CH,033H,03FH
  27:         DB      033H,0C0H,033H,0C3H,033H,0CCH,033H,0CFH
  28:         DB      033H,0F0H,033H,0F3H,033H,0FCH,033H,0FFH
  29:         DB      03CH,000H,03CH,003H,03CH,00CH,03CH,00FH
  30:         DB      03CH,030H,03CH,033H,03CH,03CH,03CH,03FH
  31:         DB      03CH,0C0H,03CH,0C3H,03CH,0CCH,03CH,0CFH
  32:         DB      03CH,0F0H,03CH,0F3H,03CH,0FCH,03CH,0FFH
  33:         DB      03FH,000H,03FH,003H,03FH,00CH,03FH,00FH
  34:         DB      03FH,030H,03FH,033H,03FH,03CH,03FH,03FH
  35:         DB      03FH,0C0H,03FH,0C3H,03FH,0CCH,03FH,0CFH
  36:         DB      03FH,0F0H,03FH,0F3H,03FH,0FCH,03FH,0FFH
  37:         DB      0C0H,000H,0C0H,003H,0C0H,00CH,0C0H,00FH
  38:         DB      0C0H,030H,0C0H,033H,0C0H,03CH,0C0H,03FH
  39:         DB      0C0H,0C0H,0C0H,0C3H,0C0H,0CCH,0C0H,0CFH
  40:         DB      0C0H,0F0H,0C0H,0F3H,0C0H,0FCH,0C0H,0FFH
  41:         DB      0C3H,000H,0C3H,003H,0C3H,00CH,0C3H,00FH
  42:         DB      0C3H,030H,0C3H,033H,0C3H,03CH,0C3H,03FH
  43:         DB      0C3H,0C0H,0C3H,0C3H,0C3H,0CCH,0C3H,0CFH
  44:         DB      0C3H,0F0H,0C3H,0F3H,0C3H,0FCH,0C3H,0FFH
  45:         DB      0CCH,000H,0CCH,003H,0CCH,00CH,0CCH,00FH
  46:         DB      0CCH,030H,0CCH,033H,0CCH,03CH,0CCH,03FH
  47:         DB      0CCH,0C0H,0CCH,0C3H,0CCH,0CCH,0CCH,0CFH
  48:         DB      0CCH,0F0H,0CCH,0F3H,0CCH,0FCH,0CCH,0FFH
  49:         DB      0CFH,000H,0CFH,003H,0CFH,00CH,0CFH,00FH
  50:         DB      0CFH,030H,0CFH,033H,0CFH,03CH,0CFH,03FH
  51:         DB      0CFH,0C0H,0CFH,0C3H,0CFH,0CCH,0CFH,0CFH
  52:         DB      0CFH,0F0H,0CFH,0F3H,0CFH,0FCH,0CFH,0FFH
  53:         DB      0F0H,000H,0F0H,003H,0F0H,00CH,0F0H,00FH
  54:         DB      0F0H,030H,0F0H,033H,0F0H,03CH,0F0H,03FH
  55:         DB      0F0H,0C0H,0F0H,0C3H,0F0H,0CCH,0F0H,0CFH
  56:         DB      0F0H,0F0H,0F0H,0F3H,0F0H,0FCH,0F0H,0FFH
  57:         DB      0F3H,000H,0F3H,003H,0F3H,00CH,0F3H,00FH
  58:         DB      0F3H,030H,0F3H,033H,0F3H,03CH,0F3H,03FH
  59:         DB      0F3H,0C0H,0F3H,0C3H,0F3H,0CCH,0F3H,0CFH
  60:         DB      0F3H,0F0H,0F3H,0F3H,0F3H,0FCH,0F3H,0FFH
  61:         DB      0FCH,000H,0FCH,003H,0FCH,00CH,0FCH,00FH
  62:         DB      0FCH,030H,0FCH,033H,0FCH,03CH,0FCH,03FH
  63:         DB      0FCH,0C0H,0FCH,0C3H,0FCH,0CCH,0FCH,0CFH
  64:         DB      0FCH,0F0H,0FCH,0F3H,0FCH,0FCH,0FCH,0FFH
  65:         DB      0FFH,000H,0FFH,003H,0FFH,00CH,0FFH,00FH
  66:         DB      0FFH,030H,0FFH,033H,0FFH,03CH,0FFH,03FH
  67:         DB      0FFH,0C0H,0FFH,0C3H,0FFH,0CCH,0FFH,0CFH
  68:         DB      0FFH,0F0H,0FFH,0F3H,0FFH,0FCH,0FFH,0FFH
  69: CG      ENDS
  70:         END     START

Because this version adds a conversion table, the space occupied grows to 1,471 bytes, but the execution time drops to 2.5 seconds. It is the best illustration of trading space for time.
Second, C
C has recently been held in high regard by the major systems companies, so we compare it with assembly language here. Unfortunately, when it comes to trimming instructions, C does not let the programmer work as freely as assembly language does. We can therefore only test its efficiency by method, using the small-table lookup and the large-table lookup. First, the large-table method:
   1: main()
   2: {
   3:     unsigned char s[24][3];
   4:     unsigned short tab[256], d[48][3], count;
   5:     register short i, j, k;
   6:
   7:     for (count = 0; count < 10000; count++)
   8:     {
   9:         k = 0;
  10:         for (i = 0; i < 24; i++)
  11:         {
  12:             for (j = 0; j < 3; j++)
  13:                 d[k][j] = d[k+1][j] = tab[s[i][j]];
  14:             k += 2;
  15:         }
  16:     }
  17: }
This program takes very little time to write, far less than the assembly versions. It occupies 4,575 bytes, about three times as much, and its execution time of 18 seconds is about seven times slower. For another approach, here is a second test that uses the small table:

   1: main()
   2: {
   3:     unsigned char i, j, s[24][3], d[48][6], tab[16];
   4:     unsigned short count;
   5:     register short k, l, x;
   6:
   7:     for (count = 0; count < 10000; count++)
   8:     {
   9:         k = 0;
  10:         for (i = 0; i < 24; i++)
  11:         {
  12:             l = 0;
  13:             for (j = 0; j < 3; j++)
  14:             {
  15:                 x = s[i][j];
  16:                 d[k][l] = d[k+1][l] = tab[(x & 0360) >> 4];
  17:                 d[k][l+1] = d[k+1][l+1] = tab[x & 017];
  18:                 l += 2;
  19:             }
  20:             k += 2;
  21:         }
  22:     }
  23: }

This version occupies 4,693 bytes, about five times the space of the assembly version, and runs in 30 seconds, more than four times slower. This demonstrates the flexibility of assembly language: by trading space against time, the programmer can choose the most favorable balance. Now let us go back to the bit-by-bit method and see what result it gives.
   1: main()
   2: {
   3:     unsigned char ss[24][3];
   4:     unsigned short dd[48][3];
   5:     int i, k, count;
   6:     register short d, j;
   7:     register unsigned char s;
   8:
   9:     for (count = 0; count < 10000; count++)
  10:     {
  11:         k = 0;
  12:         for (i = 0; i < 24; i++)
  13:         {
  14:             for (j = 0; j < 3; j++)
  15:             {
  16:                 s = ss[i][j];
  17:                 d = 0;
  18:                 if (s & 01)
  19:                     d |= 03;
  20:                 if (s & 02)
  21:                     d |= 014;
  22:                 if (s & 04)
  23:                     d |= 060;
  24:                 if (s & 010)
  25:                     d |= 0300;
  26:                 if (s & 020)
  27:                     d |= 01400;
  28:                 if (s & 040)
  29:                     d |= 06000;
  30:                 if (s & 0100)
  31:                     d |= 030000;
  32:                 if (s & 0200)
  33:                     d |= 0140000;
  34:                 dd[k+1][j] = dd[k][j] = d;
  35:             }
  36:             k += 2;
  37:         }
  38:     }
  39: }

This version occupies 4,727 bytes, and its execution time of 29 seconds is almost four times slower than the assembly version. Written with high-level statements in this way, C remains a long way behind assembly language, even though the assembly version it is compared with makes no use of instruction-trimming techniques. Most programmers seldom master those techniques, and so never come to know the true face of assembly language.
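One detail of the small-table C listing in this section deserves a note. To select the high four bits, the mask must be applied before the shift, that is (x & 0360) >> 4. In C the shift operator binds more tightly than &, so the unparenthesized form x & 0360 >> 4 is parsed as x & (0360 >> 4), which is x & 017 and selects the low four bits instead. The sketch below illustrates this; the function names are hypothetical and chosen only for the example.

```c
#include <assert.h>

/* How C parses the unparenthesized expression  x & 0360 >> 4 . */
static unsigned as_parsed(unsigned x)   { return x & (0360 >> 4); }

/* The intended extraction of the high and low four bits of a byte. */
static unsigned high_nibble(unsigned x) { return (x & 0360) >> 4; }
static unsigned low_nibble(unsigned x)  { return x & 017; }

/* For the byte 0xA5 the unparenthesized form yields the LOW nibble. */
static void precedence_demo(void)
{
    assert(as_parsed(0xA5) == low_nibble(0xA5));
    assert(high_nibble(0xA5) == 0xA);
}
```

With a blank glyph the timing is the same either way, but the enlarged output is only correct with the parentheses in place.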
Third, Basic
  10 DIM WD24(23,2), WD48(47,5), TABLE(255), MASK(7)
  20 R1 = 0
  30 R2 = 0
  40 REM Test of the dot-shifting method; each glyph byte is processed in two halves.
  50 MASK(0) = 1
  60 MASK(1) = 2
  70 FOR I = 2 TO 7
  80 MASK(I) = MASK(I-1) * 2
  90 NEXT I
 100 INPUT A$
 110 FOR COUNT = 1 TO 10
 120 K = 0
 130 FOR I = 0 TO 23
 140 T = 0
 150 FOR J = 0 TO 2
 160 FOR M = 0 TO 7
 170 TEMP = TABLE(WD24(I,J))
 180 TEMP = TEMP AND MASK(M)
 190 IF TEMP = 128 THEN R1 = 192 OR R1
 200 IF TEMP = 64 THEN R1 = 48 OR R1
 210 IF TEMP = 32 THEN R1 = 12 OR R1
 220 IF TEMP = 16 THEN R1 = 3 OR R1
 230 IF TEMP = 8 THEN R2 = 192 OR R2
 240 IF TEMP = 4 THEN R2 = 48 OR R2
 250 IF TEMP = 2 THEN R2 = 12 OR R2
 260 IF TEMP = 1 THEN R2 = 3 OR R2
 270 NEXT M
 280 WD48(K,T) = R1
 290 WD48(K,T+1) = R2
 300 WD48(K+1,T) = R1
 310 WD48(K+1,T+1) = R2
 320 T = T + 2
 330 NEXT J
 340 K = K + 2
 350 NEXT I
 360 NEXT COUNT
 370 PRINT "Finished"
 380 END

This program took 10 minutes to write and occupies 12,764 bytes; its execution time is 23,000 seconds! That is enough to show that BASIC is not suited to dot-matrix processing. The method above relies mainly on shifting, and BASIC is at a great disadvantage there because it has no dedicated shift instruction. Now let us use the table-lookup method and see how it fares.
  10 REM This program enlarges a 24 * 24 glyph to 48 * 48
  20 REM Compiled with QuickBASIC Version 4.00, Microsoft Inc.
  30 DIM WD24(23,2), WD48(47,2), TABLE(255)
  40 FOR K = 1 TO 100
  50 T = 0
  60 FOR I = 0 TO 23
  70 FOR J = 0 TO 2
  80 A = TABLE(WD24(I,J))
  90 WD48(T,J) = A
 100 WD48(T+1,J) = A
 110 NEXT J
 120 NEXT I
 130 NEXT K
 140 END

With the lookup table, the executable occupies 11,642 bytes and the execution time is 1,800 seconds. Other improvements are of course possible, but the results would probably be much the same.
Fourth, Pascal
Pascal is suited only to the full-table method; before we had worked out the table-lookup approach we were ready to give up this test. Now, with the same full table that the assembly version uses, how does it fare?
   1: program pasteable;
   2: var
   3:     source: packed array [1..24, 1..3] of 0..255;
   4:     objct: array [1..48, 1..3] of integer;
   5:     table: array [0..255] of integer;
   6:     i, j, k, n: integer;
   7: begin
   8:     for n := 1 to 10000 do
   9:     begin
  10:         k := 1;
  11:         for i := 1 to 24 do
  12:         begin
  13:             for j := 1 to 3 do
  14:             begin
  15:                 objct[k, j] := table[source[i, j]];
  16:                 objct[k+1, j] := objct[k, j]
  17:             end;
  18:             k := k + 2
  19:         end
  20:     end
  21: end.

This program took 10 minutes to write, occupies 11,650 bytes, and runs in 17 seconds, far better than BASIC. Pascal is clearly less efficient than C and assembly language, but if the full table is left out of account the program is only 21 lines long, and the difference is not large.
Fifth, Fortran
Similarly, Fortran can only use the table-lookup method. The program is as follows: