Test program 2
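#include <iostream>
// The second #include, presumably the header declaring Timer, is missing its file name.
using namespace std;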
void Sum4()
{
    int j = 0;
    // Body written out four times; i is advanced inside the body.
    for (unsigned i = 1; i < 630001;)
    {
        j += i++;
        j += i++;
        j += i++;
        j += i++;
    }
}
void Sum5()
{
    int j = 0;
    // Body written out five times; i is advanced inside the body.
    for (unsigned i = 1; i < 630001;)
    {
        j += i++;
        j += i++;
        j += i++;
        j += i++;
        j += i++;
    }
}
int main()
{
    int i, j;
    Timer timer;

    // Sum4 / Sum4E: five calls, looped versus written out
    timer.Start();
    for (i = 0; i < 5; i++) Sum4();
    cout << "Sum4:" << timer.getTime() << endl;
    timer.Start();
    Sum4(); Sum4(); Sum4(); Sum4(); Sum4();
    cout << "Sum4E:" << timer.getTime() << endl;

    // Sum5 / Sum5E: five calls, looped versus written out
    timer.Start();
    for (i = 0; i < 5; i++) Sum5();
    cout << "Sum5:" << timer.getTime() << endl;
    timer.Start();
    Sum5(); Sum5(); Sum5(); Sum5(); Sum5();
    cout << "Sum5E:" << timer.getTime() << endl;

    // Sum4H / Sum5H: 100 calls in a single loop
    timer.Start();
    for (i = 0; i < 100; i++) Sum4();
    cout << "Sum4H:" << timer.getTime() << endl;
    timer.Start();
    for (i = 0; i < 100; i++) Sum5();
    cout << "Sum5H:" << timer.getTime() << endl;

    // Sum4T / Sum4TE: 1000 x 5 calls, inner loop versus inner calls written out
    timer.Start();
    for (i = 0; i < 1000; i++)
        for (j = 0; j < 5; j++) Sum4();
    cout << "Sum4T:" << timer.getTime() << endl;
    timer.Start();
    for (i = 0; i < 1000; i++)
    { Sum4(); Sum4(); Sum4(); Sum4(); Sum4(); }
    cout << "Sum4TE:" << timer.getTime() << endl;

    // Sum5T / Sum5TE. The timer is not restarted before Sum5T,
    // so that reading continues from the Sum4TE start.
    for (i = 0; i < 1000; i++)
        for (j = 0; j < 5; j++) Sum5();
    cout << "Sum5T:" << timer.getTime() << endl;
    timer.Start();
    for (i = 0; i < 1000; i++)
    { Sum5(); Sum5(); Sum5(); Sum5(); Sum5(); }
    cout << "Sum5TE:" << timer.getTime() << endl;
    return 0;
}
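The listing relies on a Timer class whose definition is not shown. The following is a minimal sketch of what such a class might look like, assuming that Start() records the current time and that getTime() returns the milliseconds elapsed since the last Start(). It uses std::chrono from C++11, which VC6 and BCC32 do not provide, so it is a modern stand-in and not the author's implementation.

```cpp
#include <chrono>

// Hypothetical stand-in for the article's Timer class (original source not shown).
class Timer {
    using Clock = std::chrono::steady_clock;  // monotonic clock, suited to interval timing

public:
    // Record the current time as the start of the measured interval.
    void Start() { start_ = Clock::now(); }

    // Milliseconds elapsed since the last Start() call (or since construction).
    double getTime() const {
        std::chrono::duration<double, std::milli> elapsed = Clock::now() - start_;
        return elapsed.count();
    }

private:
    Clock::time_point start_ = Clock::now();
};
```

With this sketch, a reading taken without a preceding Start() keeps counting from the previous Start(), so every measurement needs its own Start() call.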
Test results 2
All times are in ms.

Test      VC6 Release             BCC32
          (file size 57,344 B)    (file size 140,800 B)
Sum4      8.781                   8.5874
Sum4E     8.45918                 8.91789
Sum5      7.54705                 8.61534
Sum5E     7.54677                 7.64287
Sum4H     174.909                 169.769
Sum5H     152.656                 180.58
Sum4T     8672.62                 8548.64
Sum4TE    8794.3                  8758.32
Sum5T     16433.9                 17586.1
Sum5TE    7633.09                 7835.57
This table shows only the general trend; I will not list more data. The result is quite unexpected and the variation is irregular. I can only draw the following conclusions, and I hope everyone will discuss them:
1. For a single-level loop, a short loop does not need to be unrolled; the performance gain is not obvious. With BCC32 in particular, unrolling Sum4 was slower than the original loop (8.91789 ms versus 8.5874 ms), probably because of compiler optimization, although unrolling Sum5 still gave an improvement (7.64287 ms versus 8.61534 ms).
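2. For single-level short loops that are not unrolled, VC6 behaves consistently as the iteration count grows: Sum5 stays faster than Sum4 at 100 iterations (152.656 ms versus 174.909 ms). With BCC32 the order is reversed (180.58 ms versus 169.769 ms), apparently because of optimization.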
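3. For a short loop nested inside another loop, if the original function does not perform well, unrolling has no obvious effect and can even be slightly slower (Sum4T versus Sum4TE). The strangest case is Sum5, the better-performing function: without unrolling it is the slowest of all, and after unrolling it is the fastest, more than twice as fast. This is really odd. (One possible explanation: the listing does not restart the timer before the Sum5T measurement, so that reading may also include the Sum4TE run. For VC6, 16433.9 ms is close to 8794.3 ms + 7633.09 ms.)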
In summary, when the original code already performs well, unrolling a short loop can improve performance. If you are not confident in your code, do not unroll it; sometimes unrolling is counterproductive.
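The unrolling style used inside Sum4 and Sum5 can be sketched as follows, assuming their bodies are meant to add i to the running sum and advance i on each line. The function names SumRolled and SumUnrolled5 are hypothetical. A 64-bit accumulator is used here so that the two forms can be compared for equality without overflow, which the original int j does not allow.

```cpp
#include <cassert>
#include <cstdint>

// Rolled form: one addition per iteration, increment in the loop header.
std::uint64_t SumRolled(std::uint32_t n) {
    std::uint64_t sum = 0;
    for (std::uint32_t i = 1; i <= n; ++i) {
        sum += i;
    }
    return sum;
}

// Unrolled by five, in the style of Sum5: the loop header has no increment
// and the body advances i five times. n must be a multiple of 5.
std::uint64_t SumUnrolled5(std::uint32_t n) {
    assert(n % 5 == 0);
    std::uint64_t sum = 0;
    for (std::uint32_t i = 1; i <= n;) {
        sum += i++;
        sum += i++;
        sum += i++;
        sum += i++;
        sum += i++;
    }
    return sum;
}
```

Both functions return the same value for any n that is a multiple of 5, for example n = 630000 as in the test program. Returning the sum also gives the compiler a reason to keep the loop, whereas a result that is never used may be optimized away entirely.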