While working on my paper over the past two days, I ran into this problem: my program has the GPU write its results to the depth buffer and then hands them to the CPU to query. The GPU is not a general-purpose processor, so the usual SMP synchronization techniques do not apply. I went back through the OpenGL extension specs and found that NVIDIA's occlusion query extension discusses this issue. I had never read it carefully before, so here are some simple notes from the last few days.
Standard OpenGL provides only two synchronization mechanisms: glFlush and glFinish.
glFlush only guarantees that the commands issued so far will complete in finite time, but that time cannot be determined, so it is a mild tool at best;
glFinish stalls the CPU until all pending graphics commands have been executed.
The NV_fence extension provides an intermediate granularity: it allows a partial finish, waiting only for the commands issued before a given fence, and it also makes it possible to query whether those commands have completed. One useful application is measuring how long the GPU takes to complete each group of GL commands, as shown:
start = getCurrentTime();
updateTextures();
glSetFenceNV(TEXTURE_LOAD_FENCE, GL_ALL_COMPLETED_NV);
drawBackground();
glSetFenceNV(DRAW_BACKGROUND_FENCE, GL_ALL_COMPLETED_NV);
drawCharacters();
glSetFenceNV(DRAW_CHARACTERS_FENCE, GL_ALL_COMPLETED_NV);
glFinishFenceNV(TEXTURE_LOAD_FENCE); // blocks the application until this fence has completed
textureLoadEnd = getCurrentTime();
glFinishFenceNV(DRAW_BACKGROUND_FENCE);
drawBackgroundEnd = getCurrentTime();
glFinishFenceNV(DRAW_CHARACTERS_FENCE);
drawCharactersEnd = getCurrentTime();
printf("texture load time = %d\n", textureLoadEnd - start);
printf("draw background time = %d\n", drawBackgroundEnd - textureLoadEnd);
printf("draw characters time = %d\n", drawCharactersEnd - drawBackgroundEnd);
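To make the difference between a fence and a full finish concrete, here is a small toy model in plain C. It is only a sketch for building intuition: every name in it (ToyQueue, ToyFence, toy_submit and so on) is made up for illustration and none of it is OpenGL or NV_fence API. It also assumes that the "GPU" retires commands strictly in submission order, one per step. The non-blocking check plays the role of glTestFenceNV in the real extension, which the timing example above does not use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are OpenGL or NV_fence API. */
typedef struct {
    size_t submitted;  /* commands queued by the "CPU" so far */
    size_t completed;  /* commands retired by the "GPU" so far */
} ToyQueue;

typedef struct {
    size_t position;   /* value of `submitted` when the fence was set */
} ToyFence;

/* Queue one command; nothing executes yet. */
static void toy_submit(ToyQueue *q) { q->submitted++; }

/* Let the pretend GPU retire at most n pending commands, in order. */
static void toy_gpu_step(ToyQueue *q, size_t n) {
    size_t pending = q->submitted - q->completed;
    q->completed += (n < pending) ? n : pending;
}

/* Rough analogue of glSetFenceNV(..., GL_ALL_COMPLETED_NV):
   remember where the queue ends right now. */
static ToyFence toy_set_fence(const ToyQueue *q) {
    ToyFence f = { q->submitted };
    return f;
}

/* Rough analogue of glTestFenceNV: a non-blocking check. */
static bool toy_test_fence(const ToyQueue *q, ToyFence f) {
    return q->completed >= f.position;
}

/* Rough analogue of glFinishFenceNV: wait only until the fence has
   passed. Returns the number of steps waited. */
static size_t toy_finish_fence(ToyQueue *q, ToyFence f) {
    size_t steps = 0;
    while (!toy_test_fence(q, f)) { toy_gpu_step(q, 1); steps++; }
    return steps;
}

/* Rough analogue of glFinish: wait until the whole queue has drained. */
static size_t toy_finish(ToyQueue *q) {
    size_t steps = 0;
    while (q->completed < q->submitted) { toy_gpu_step(q, 1); steps++; }
    return steps;
}

/* Queue 3 commands, set a fence, queue 2 more. Returns how many
   commands are still pending right after waiting on the fence. */
size_t toy_demo(void) {
    ToyQueue q = { 0, 0 };
    for (int i = 0; i < 3; i++) toy_submit(&q);
    ToyFence f = toy_set_fence(&q);
    for (int i = 0; i < 2; i++) toy_submit(&q);

    assert(!toy_test_fence(&q, f));  /* nothing has executed yet */
    toy_finish_fence(&q, f);         /* waits for the first 3 commands only */
    size_t pending = q.submitted - q.completed;

    toy_finish(&q);                  /* a full finish drains the rest */
    assert(q.completed == q.submitted);
    return pending;
}
```

Calling toy_demo() returns 2: after waiting on the fence, the two commands queued behind it are still pending, whereas a full finish would have waited for all five. That is the "intermediate granularity" in miniature. In a real program the non-blocking test is what lets the CPU do other work and only poll the fence occasionally, instead of stalling.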