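RDTSC and QueryPerformanceCounter
1. Overview
RDTSC is an x86 assembly instruction that returns the CPU's cycle count. QueryPerformanceCounter is a Windows API function that returns the value of the high-resolution performance counter. MSDN describes it as follows: Retrieves the current value of the performance counter, which is a high resolution (<1us) time stamp that can be used for time-interval measurements.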
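Because the CPU frequency is not constant while the system runs, a time interval computed from RDTSC alone is not accurate. QueryPerformanceFrequency together with QueryPerformanceCounter gives a reasonably accurate interval. Dividing the number of cycles RDTSC counted during that interval by the interval length then gives a rough estimate of the CPU frequency. On processors with an invariant TSC the counter ticks at a fixed rate instead, so the estimate reflects that nominal rate rather than the momentary core clock.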
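The arithmetic behind this can be sketched without any Windows dependency. The helper names `elapsed_seconds` and `estimate_cpu_hz` are made up for illustration. Their inputs stand for two QueryPerformanceCounter readings, the QueryPerformanceFrequency result, and two RDTSC readings.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: seconds between two performance-counter readings.
// `freq` is the counter rate in counts per second.
double elapsed_seconds(std::int64_t start, std::int64_t end, std::int64_t freq) {
    assert(freq > 0);  // a zero or negative rate would make the result meaningless
    return static_cast<double>(end - start) / static_cast<double>(freq);
}

// Hypothetical helper: average time-stamp-counter rate in Hz over an interval.
double estimate_cpu_hz(std::uint64_t tsc_start, std::uint64_t tsc_end, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(tsc_end - tsc_start) / seconds;
}
```

For example, a counter running at 10 MHz that advances by 10,000,000 counts covers one second, and 4,400,000,000 cycles over two seconds corresponds to 2.2 GHz.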
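QueryPerformanceFrequency: Retrieves the frequency of the performance counter. The frequency of the performance counter is fixed at system boot and is consistent across all processors. Therefore, the frequency need only be queried upon application initialization, and the result can be cached. The frequency this function returns is not the CPU frequency but the frequency of the high-resolution counter as calibrated by the system. It stays constant while the program runs, so it can be used for timing and only needs to be read once at program start.
2. RDTSC
The RDTSC (Read Time Stamp Counter) instruction returns the number of CPU clock cycles since the system started. The result is 64 bits wide and is returned in EDX:EAX, with the high 32 bits in EDX and the low 32 bits in EAX. Mixing C and assembly makes it usable for monitoring program performance. To keep the thread from being scheduled onto another processor while it runs, pin it with SetThreadAffinityMask(GetCurrentThread(), 8). The mask 8 sets bit 3, so it selects the fourth logical processor, and the call fails on a machine with fewer than four.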
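How an affinity mask maps to a processor can be shown with a small helper. This is a minimal sketch; `affinity_mask_for_cpu` is a hypothetical name, not a Windows function, and it returns a 64-bit value even though the real mask type (DWORD_PTR) is only 32 bits wide in a 32-bit process.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: build a mask that selects exactly one logical
// processor. Bit i of an affinity mask stands for logical processor i.
std::uint64_t affinity_mask_for_cpu(unsigned cpu_index) {
    assert(cpu_index < 64);  // one bit per processor in a 64-bit mask
    return std::uint64_t{1} << cpu_index;
}
```

Index 3 yields the mask 8 used in the text, and index 0 yields 1, which exists on every machine. SetThreadAffinityMask returns 0 on failure, so checking its return value shows whether the chosen processor was actually available.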
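    void RDTSC_test()
    {
        // Pin the thread to logical processor 3 (mask 8 = bit 3).
        SetThreadAffinityMask(GetCurrentThread(), 8);
        unsigned time, time_low, time_high;
        unsigned hz = 2200000000;       // assumed constant CPU frequency: 2.2 GHz
        __asm rdtsc
        __asm mov time_low, eax
        __asm mov time_high, edx
        Sleep(35000);
        __asm rdtsc
        __asm sub eax, time_low
        __asm sbb edx, time_high        // sbb carries the borrow from the low half
        __asm div hz                    // EDX:EAX / hz, quotient ends up in EAX
        __asm mov time, eax
        printf("Seconds:%u\n", time);
    }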
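A 64-bit difference taken in two 32-bit halves needs the borrow from the low half carried into the high half, which is what `sbb` does and a second plain `sub` does not. The sketch below reproduces that step in portable C++; the `Halves` struct and both function names are illustrative only.

```cpp
#include <cstdint>

// Illustrative stand-in for a 64-bit value held in EDX (high) and EAX (low).
struct Halves {
    std::uint32_t high;
    std::uint32_t low;
};

// Compute end - start one half at a time, as `sub` followed by `sbb` would.
Halves sub_with_borrow(Halves end, Halves start) {
    Halves d;
    d.low = end.low - start.low;                           // wraps modulo 2^32, like `sub`
    std::uint32_t borrow = (end.low < start.low) ? 1u : 0u;
    d.high = end.high - start.high - borrow;               // like `sbb`
    return d;
}

// Join the two halves back into one 64-bit value.
std::uint64_t join(Halves h) {
    return (static_cast<std::uint64_t>(h.high) << 32) | h.low;
}
```

With end = 2:00000000h and start = 1:FFFFFFFFh the true difference is 1. Dropping the borrow would leave 1 in the high half and give a result that is too large by 2^32.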
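This part relies on the union LARGE_INTEGER, which represents a 64-bit signed integer. The union is convenient because it gives easy access to the high 32 bits, the low 32 bits, and the whole 64-bit value. For arithmetic and comparisons, use QuadPart. __int64 can also represent a 64-bit value, but LARGE_INTEGER is the more common choice in driver development.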
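The relationship between the three views of LARGE_INTEGER can be written out with shifts. This is a sketch, not the Windows definition: the function names are invented, and the parameter names only mirror the HighPart (signed 32-bit), LowPart (unsigned 32-bit) and QuadPart (signed 64-bit) members. The narrowing casts assume the usual two's-complement behavior, which C++20 guarantees and earlier compilers provide in practice.

```cpp
#include <cstdint>

// Assemble a 64-bit signed value from its two 32-bit parts.
std::int64_t quad_from_parts(std::int32_t high_part, std::uint32_t low_part) {
    // Shift as unsigned so a negative high part does not invoke undefined behavior.
    std::uint64_t bits =
        (static_cast<std::uint64_t>(static_cast<std::uint32_t>(high_part)) << 32) | low_part;
    return static_cast<std::int64_t>(bits);
}

// Low 32 bits of a 64-bit value (always unsigned).
std::uint32_t low_part_of(std::int64_t quad) {
    return static_cast<std::uint32_t>(static_cast<std::uint64_t>(quad));
}

// High 32 bits of a 64-bit value (carries the sign).
std::int32_t high_part_of(std::int64_t quad) {
    return static_cast<std::int32_t>(static_cast<std::uint64_t>(quad) >> 32);
}
```

The sign lives in the high part only: the value -1 splits into a high part of -1 and a low part of FFFFFFFFh.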
    // Declare a 64-bit variable named `tsc` and store the time stamp counter in it.
    #define read_tsc(tsc) \
        long long tsc; \
        _asm rdtsc \
        _asm mov dword ptr[tsc], eax \
        _asm mov dword ptr[tsc+4], edx

    // Declare a LARGE_INTEGER named `pc` and store the performance counter in it.
    #define read_hrpc(pc) \
        LARGE_INTEGER pc; \
        QueryPerformanceCounter(&pc);

    void Query()
    {
        read_hrpc(pc1);
        read_tsc(tsc1);
        cout << "waiting..." << endl;
        Sleep(1000);
        read_tsc(tsc2);
        read_hrpc(pc2);
        LARGE_INTEGER freq;
        QueryPerformanceFrequency(&freq);
        // Elapsed time in seconds
        double elapse = (pc2.QuadPart - pc1.QuadPart) / (double)freq.QuadPart;
        // Cycle count divided by elapsed time
        double cpu_freq = (tsc2 - tsc1) / elapse;
        cout << "perf freq:\t" << freq.QuadPart / 1000.0 / 1000 << "\tMHz" << endl;
        cout << "cpu freq:\t" << cpu_freq / 1000 / 1000 / 1000 << "\tGHz" << endl;
    }
Appendix: the complete program for this section and for the "Beauty of Programming: CPU Curve" section
    // Builds with 32-bit MSVC only: inline __asm is not available for x64 targets.
    #include <windows.h>
    #include <math.h>
    #include <stdio.h>
    #include <iostream>
    using namespace std;

    // Draw a sine curve in the CPU usage graph
    void SinCPU()
    {
        SetThreadAffinityMask(GetCurrentThread(), 8);
        // Build the sine-wave tables
        const double SPLIT = 0.01;
        const int COUNT = 200;          // 200 sample points
        const double PI = 3.14159265;
        const int INTERVAL = 300;       // peak-to-peak value of the waveform is 300
        int busySpan[COUNT];
        int idleSpan[COUNT];
        int half = INTERVAL / 2;
        double rad = 0.0;
        for (int i = 0; i < COUNT; i++)
        {
            // Keep the sine away from the borders: peak-to-peak reduced to 150
            busySpan[i] = (int)(half + (sin(PI * rad) * half / 2));
            idleSpan[i] = INTERVAL - busySpan[i];
            rad += SPLIT;
        }
        DWORD startTime = 0;
        int j = 0;
        // Main loop
        while (1)
        {
            j = j % COUNT;
            startTime = GetTickCount();
            while ((GetTickCount() - startTime) <= (DWORD)busySpan[j]);
            Sleep(idleSpan[j]);
            j++;
        }
    }

    // Direct counting; not general: 50% usage on a single core at 2.2 GHz
    void LineCPU1()
    {
        SetThreadAffinityMask(GetCurrentThread(), 8);
        long i;
        while (1)
        {
            for (i = 0; i < 4400000; i++);
            Sleep(10);
        }
    }

    // Tick counting; fairly accurate
    void LineCPU2()
    {
        SetThreadAffinityMask(GetCurrentThread(), 8);
        DWORD busyTime = 100;
        DWORD idleTime = busyTime;
        DWORD startTime = 0;
        while (1)
        {
            startTime = GetTickCount();
            while ((GetTickCount() - startTime) <= busyTime);
            Sleep(idleTime);
        }
    }

    // Inline-assembly function; the result is returned in EDX:EAX
    inline __int64 GetCPUTickCount()
    {
        __asm
        {
            rdtsc
        }
    }

    // The CPU frequency is not constant and the math is integer-only,
    // so this method has a large error
    void RDTSC_test()
    {
        SetThreadAffinityMask(GetCurrentThread(), 8);
        unsigned time, time_low, time_high;
        unsigned hz = 2200000000;
        __asm rdtsc
        __asm mov time_low, eax
        __asm mov time_high, edx
        Sleep(35000);
        __asm rdtsc
        __asm sub eax, time_low
        __asm sbb edx, time_high        // sbb carries the borrow from the low half
        __asm div hz
        __asm mov time, eax
        printf("Seconds:%u\n", time);
    }

    #define read_tsc(tsc) \
        long long tsc; \
        _asm rdtsc \
        _asm mov dword ptr[tsc], eax \
        _asm mov dword ptr[tsc+4], edx

    #define read_hrpc(pc) \
        LARGE_INTEGER pc; \
        QueryPerformanceCounter(&pc);

    // Use RDTSC together with QueryPerformanceCounter to obtain
    // the CPU frequency and the time interval
    void Query()
    {
        read_hrpc(pc1);
        read_tsc(tsc1);
        cout << "waiting..." << endl;
        Sleep(2000);    // wait 2 s
        read_tsc(tsc2);
        read_hrpc(pc2);
        LARGE_INTEGER freq;
        QueryPerformanceFrequency(&freq);
        double elapse = (pc2.QuadPart - pc1.QuadPart) / (double)freq.QuadPart;  // time interval
        double cpu_freq = (tsc2 - tsc1) / elapse;   // cycle count divided by time
        cout << "perf freq:\t" << freq.QuadPart / 1000.0 / 1000 << "\tMHz" << endl;
        cout << "cpu freq:\t" << cpu_freq / 1000 / 1000 / 1000 << "\tGHz" << endl;
    }

    int main()
    {
        Query();
        return 0;
    }