We use strcpy all the time, and I believe many people know how it is implemented. But have you ever thought about writing your own strcpy? In C, the following is said to be a classic implementation, simple and clean:
char *strcopy2(char *szDst, const char *szSrc)
{
    char *szTemp = szDst;
    /* Copies each byte, including the terminating '\0', then stops. */
    while ((*szDst++ = *szSrc++) != '\0')
        ;
    return szTemp;
}
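A quick way to convince yourself that the loop copies the terminating '\0' on its own, and writes nothing beyond it, is to compare the result with what the CRT's strcpy produces. This is only a minimal sketch: the helper name copies_like_strcpy is made up for this post, and it assumes the source string fits in 64 bytes.

```c
#include <string.h>

/* Same loop as above: the assignment copies each byte, including the
   final '\0', and the loop ends once that byte has been copied. */
char *strcopy2(char *szDst, const char *szSrc)
{
    char *szTemp = szDst;
    while ((*szDst++ = *szSrc++) != '\0')
        ;
    return szTemp;
}

/* Hypothetical helper (not part of the original code): returns 1 when
   strcopy2 and strcpy leave two identically pre-filled buffers in the
   same state. ASSUMPTION: szSrc plus its terminator fits in 64 bytes. */
int copies_like_strcpy(const char *szSrc)
{
    char a[64], b[64];
    memset(a, 'x', sizeof a);
    memset(b, 'x', sizeof b);
    strcopy2(a, szSrc);
    strcpy(b, szSrc);
    /* Comparing the whole buffers also catches stray writes past the end. */
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling copies_like_strcpy("China") should return 1 if the two functions behave the same for that input.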
However, because it is written in C, the compiler typically generates frame code when it translates the function to assembly (at least in an unoptimized build): PUSH EBP; MOV EBP, ESP; ... MOV ESP, EBP; POP EBP, and so on, which save and restore the frame pointer and registers.
The strcpy in the CRT is itself implemented in assembly, so why not write our own strcpy the same way? Of course, when I write it I still dress it up in a C skin :)
__declspec(naked) char *strcopy(char *szDst, const char *szSrc)
{
    __asm
    {
        mov eax, [esp + 4]       // szDst
        mov edx, [esp + 8]       // szSrc
    Begin:
        mov cl, byte ptr [edx]   // load the next source byte
        cmp cl, 0x0
        je End                   // stop at the terminator
        mov byte ptr [eax], cl
        inc eax
        inc edx
        jmp Begin
    End:
        mov byte ptr [eax], 0x0  // write the terminating '\0'
        mov eax, [esp + 4]       // return szDst
        ret
    }
}
With the __declspec(naked) attribute, the compiler does not generate that annoying frame code such as PUSH EBP for the function. Why save things only to restore them again? That is a lot of useless work, so I wrote it the way shown above :)
However, actual testing shows that strcpy in the CRT library is still noticeably faster. Why is my strcopy not as efficient as the CRT's strcpy? Please help me analyze the reason!
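One guess of mine, which I have not checked against the CRT source: optimized string routines often move several bytes per iteration instead of one, while my loop handles a single byte per pass. The sketch below only illustrates that idea in portable C and is not the CRT's code. The names strcopy4 and has_zero_byte are made up for this post, and the function assumes the source buffer is readable for up to 3 bytes past its terminator.

```c
#include <stdint.h>
#include <string.h>

/* Nonzero when at least one of the four bytes of v is 0x00. */
static int has_zero_byte(uint32_t v)
{
    return ((v - 0x01010101u) & ~v & 0x80808080u) != 0;
}

/* Hypothetical sketch, NOT the CRT implementation.
   ASSUMPTION: szSrc is readable for up to 3 bytes past its '\0',
   so the 4-byte loads below never leave the source buffer. */
char *strcopy4(char *szDst, const char *szSrc)
{
    char *d = szDst;
    uint32_t chunk;

    for (;;) {
        memcpy(&chunk, szSrc, sizeof chunk);  /* load 4 bytes at once */
        if (has_zero_byte(chunk))
            break;                            /* terminator is inside this chunk */
        memcpy(d, &chunk, sizeof chunk);      /* store 4 bytes at once */
        szSrc += sizeof chunk;
        d += sizeof chunk;
    }
    /* Finish the last, partial chunk one byte at a time. */
    while ((*d++ = *szSrc++) != '\0')
        ;
    return szDst;
}
```

Whether this is actually what makes the CRT version faster is something I cannot confirm. For a 5-character string like "China", the cost of the call itself may matter just as much.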
Some efficiency tests:
1. Using the strcpy function from the CRT library:
#include <stdio.h>
#include <string.h>
#include <windows.h>

int main(int argc, char *argv[])
{
    const char *p1 = "China";
    char p[20];
    printf("Start Time: %08lu\n", (unsigned long)GetTickCount());
    for (int i = 0; i < 10000000; i++) {
        strcpy(p, p1);
    }
    printf("End Time: %08lu\n", (unsigned long)GetTickCount());
    return 0;
}
The result is:
Start Time: 29515000
End Time: 29515120
Elapsed time: 120 ms
2. Using the strcopy function above, that is, my own assembly implementation:
Replacing strcpy in the test program with my own strcopy gives:
Start Time: 29540947
End Time: 29541147
Elapsed time: 200 ms
3. Using the pure C strcopy2 function above:
Replacing strcpy in the test program with strcopy2 gives:
Start Time: 29727525
End Time: 29728327
Elapsed time: 805 ms
Judging by the elapsed times, the CRT library's strcpy is clearly the fastest, my assembly implementation comes second, and the pure C strcopy2 is the slowest.
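The three tests differ only in which copy function is called, so they could share one timing helper. Below is a minimal sketch of that, using the standard clock() instead of GetTickCount so that it is not tied to Windows. The names copy_fn and time_copy_ms are made up for this post, the helper assumes the source string fits in 64 bytes, and clock() reports processor time rather than wall-clock time.

```c
#include <string.h>
#include <time.h>

/* Any function with strcpy's shape can be timed. */
typedef char *(*copy_fn)(char *, const char *);

/* Hypothetical helper (not part of the original test program): calls
   fn(dst, src) `iterations` times and returns the elapsed processor
   time in milliseconds. ASSUMPTION: src plus '\0' fits in 64 bytes. */
double time_copy_ms(copy_fn fn, const char *src, long iterations)
{
    char dst[64];
    volatile char sink = 0;  /* reading the result each pass discourages
                                the compiler from dropping the copies */
    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        fn(dst, src);
        sink = dst[0];
    }
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) * 1000.0 / CLOCKS_PER_SEC;
}
```

For example, time_copy_ms(strcpy, "China", 10000000L) would correspond to the first test. The numbers above were measured with the GetTickCount program, not with this helper, so I cannot say how they would compare.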
I have not looked at the implementation of strcpy in the CRT library. Can anyone tell me what is going on and why my strcopy is slower by comparison? Thank you!