Intel's Real Time Stamp Counter

Introduction

For applications that need an accurate, high-resolution time stamp, Intel introduced, beginning with the Pentium, an instruction that reads the number of processor cycles elapsed since the machine was powered on. This instruction is RDTSC. On execution it loads the contents of the 64-bit time-stamp counter register into two 32-bit registers: the high-order 32 bits go to EDX and the low-order 32 bits go to EAX.
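As a small illustration of how the two halves fit together, the sketch below rebuilds one 64-bit count from a high and a low 32-bit value. The names CombineTsc and CheckCombineTsc and the sample values are invented for this example; on real hardware the two inputs would be the contents of EDX and EAX after RDTSC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of the class in this article): rebuilds the
// 64-bit counter value from the two 32-bit halves that RDTSC leaves in
// EDX (high-order bits) and EAX (low-order bits).
inline std::uint64_t CombineTsc(std::uint32_t high, std::uint32_t low)
{
    // Widen to 64 bits before shifting so the high half is not lost.
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// Self-check with made-up register values.
inline void CheckCombineTsc()
{
    assert(CombineTsc(0x00000001u, 0x00000000u) == 0x100000000ULL);
    assert(CombineTsc(0x00000000u, 0x00000005u) == 5ULL);
}
```

The cast before the shift matters: shifting a 32-bit value left by 32 bits without widening it first would not produce the intended result.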

The value returned by RDTSC is a count of processor cycles, not of seconds. To obtain the elapsed time in seconds, the difference between two readings needs to be divided by the processor clock speed expressed in hertz (cycles per second).
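The conversion described above can be sketched as a single division. The helper name CyclesToSeconds is hypothetical, and the clock speed in MHz is an assumption that the caller has to supply, since RDTSC itself does not report it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts an elapsed cycle count to seconds.
// frequencyMHz is the processor clock speed in MHz, supplied by the caller.
inline double CyclesToSeconds(std::uint64_t cycles, double frequencyMHz)
{
    assert(frequencyMHz > 0.0);
    // 1 MHz is 1,000,000 cycles per second.
    return static_cast<double>(cycles) / (frequencyMHz * 1.0e6);
}
```

For example, 2,000,000,000 cycles on a 1000 MHz (1 GHz) processor correspond to 2 seconds. Doing the multiplication in floating point avoids the integer overflow that a 32-bit product of MHz and 1,000,000 could run into at high clock speeds.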

To make this instruction easy to use, a class _RealTimeStampCounter can be prepared as shown below.

#ifndef __RTSC_H_6482AB30_B593_4e9a_A640_0BEAF4FC9643_
#define __RTSC_H_6482AB30_B593_4e9a_A640_0BEAF4FC9643_
#include <memory.h>
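// Real Time Stamp Counter related declarations.
// The opcodes are emitted as raw bytes:
//   0F A2 = CPUID (a serializing instruction, executed before each reading)
//   0F 31 = RDTSC
#define CPUID	__asm __emit 0fh __asm __emit 0a2h
#define RDTSC	__asm __emit 0fh __asm __emit 031h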
class _RealTimeStampCounter
{
  unsigned MHZ;
  unsigned Base;
  unsigned StartCyclesHigh;
  unsigned StartCyclesLow;
  unsigned EndCyclesHigh;
  unsigned EndCyclesLow;
  __int64	 ElapsedCycles;
public:
  _RealTimeStampCounter(unsigned MachineSpeedInMHZ = 100)
  {
    memset(this, 0, sizeof(_RealTimeStampCounter));
    Base = FindBase();
    MHZ = MachineSpeedInMHZ;
  }
  inline void StartCounter()
  {
    GetTimeStamp(&StartCyclesHigh, &StartCyclesLow);
  }
  inline void StopCounter()
  {
    GetTimeStamp(&EndCyclesHigh, &EndCyclesLow);
    unsigned __int64 TempCycles1 = 0, TempCycles2 = 0;
    TempCycles1 = ((unsigned __int64)StartCyclesHigh << 32) | StartCyclesLow;
    TempCycles2 = ((unsigned __int64)EndCyclesHigh << 32) | EndCyclesLow;
    ElapsedCycles = TempCycles2 - TempCycles1 - Base;
  }
  inline __int64 GetElapsedCycles()
  {
    return ElapsedCycles;
  }
  inline double GetElapsedSeconds()
  {
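    // Convert MHz to Hz in floating point; MHZ * 1000000 in 32-bit unsigned
    // arithmetic would overflow for speeds above 4294 MHz.
    return ((double)(ElapsedCycles) / ((double)MHZ * 1000000.0));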
  }
private:
  inline void GetTimeStamp(unsigned *pCyclesHigh, unsigned *pCyclesLow)
  {
    __asm
    {
      pushad;
      CPUID;
      RDTSC;
      mov ebx, pCyclesHigh;
      mov [ebx], edx;
      mov ebx, pCyclesLow;
      mov [ebx], eax;
      popad;
    }
  }
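  // Estimates the overhead, in cycles, of one CPUID/RDTSC reading so that
  // StopCounter() can subtract it. The first passes serve as warm-up; the
  // last two back-to-back measurements are compared and the smaller one
  // is returned.
  inline unsigned FindBase()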
  {
    unsigned Base, BaseExtra = 0;
    unsigned CyclesLow, CyclesHigh;
    __asm
    {
      pushad;
      CPUID;
      RDTSC;
      mov	CyclesHigh, edx;
      mov CyclesLow, eax;
      popad;
      pushad;
      CPUID;
      RDTSC;
      popad;
      pushad;
      CPUID;
      RDTSC;
      mov	CyclesHigh, edx;
      mov CyclesLow, eax;
      popad;
      pushad;
      CPUID;
      RDTSC;
      popad;
      pushad;
      CPUID;
      RDTSC;
      mov	CyclesHigh, edx;
      mov CyclesLow, eax;
      popad;
      pushad;
      CPUID;
      RDTSC;
      sub eax, CyclesLow;
      mov BaseExtra, eax;
      popad;
      pushad;
      CPUID;
      RDTSC;
      mov	CyclesHigh, edx;
      mov CyclesLow, eax;
      popad;
      pushad;
      CPUID;
      RDTSC;
      sub eax, CyclesLow;
      mov Base, eax;
      popad;
    }
    if (BaseExtra <  Base)
      Base = BaseExtra;
    return Base;
  }
};
#endif

Usage

The _RealTimeStampCounter class is straightforward to use. All that is required is to create a _RealTimeStampCounter object and call its StartCounter() and StopCounter() methods immediately before and after the code to be monitored, as shown below.

_RealTimeStampCounter RTSC(1000); // The machine CPU Speed is 1GHz, which is equal to 1000MHz.

RTSC.StartCounter();
 for(int i=0; i < 99999; ++i) ; // The piece of code to be time-monitored..
RTSC.StopCounter();

printf("\n Number of Seconds Elapsed = %f", RTSC.GetElapsedSeconds());
printf("\n Number of Cycles Executed = %I64d", RTSC.GetElapsedCycles()); // GetElapsedCycles() returns a signed __int64

The above code prints the number of seconds elapsed and the number of processor cycles spent executing the for loop placed between StartCounter() and StopCounter(). Replacing the for loop with any other piece of code gives the time spent in that code instead. This can come in handy when comparing the performance of two different pieces of code. There are no restrictions on using this class: a _RealTimeStampCounter object can be started and stopped as many times as one wishes, and more than one _RealTimeStampCounter can be used together, provided one keeps in mind that RDTSC counts every processor cycle executed, so intervals measured by several objects at the same time may overlap.
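The overlap between counters can be made concrete with a little arithmetic. In the sketch below, Interval and ExclusiveCycles are hypothetical names and the readings are made-up numbers standing in for real counter values; it assumes one measured region lies entirely inside another.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one start/stop pair of counter readings.
struct Interval
{
    std::uint64_t start;
    std::uint64_t stop;

    std::uint64_t Elapsed() const
    {
        assert(stop >= start);
        return stop - start;
    }
};

// Cycles attributable to the outer region alone, assuming the inner
// interval lies entirely inside the outer one.
inline std::uint64_t ExclusiveCycles(const Interval& outer, const Interval& inner)
{
    assert(inner.start >= outer.start && inner.stop <= outer.stop);
    return outer.Elapsed() - inner.Elapsed();
}
```

With an outer interval of 1000 to 2000 and an inner interval of 1200 to 1700, the outer counter reports 1000 cycles and the inner one 500. Adding the two would give 1500, more than the 1000 cycles that actually passed, because the inner 500 cycles are counted by both objects; the outer region alone accounts for 500.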

Note that this mechanism works only on processors that support the RDTSC instruction (all Intel Pentium processors do).
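Support for RDTSC is reported by the processor itself: bit 4 of the EDX value returned by CPUID function 1 is the time-stamp counter feature flag. The sketch below shows only the bit test. The names kTscFeatureBit, HasTimeStampCounter and CheckHasTimeStampCounter are invented for this example, and obtaining the EDX value is compiler specific and deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Bit 4 of EDX, as returned by CPUID function 1, is the TSC feature flag.
constexpr std::uint32_t kTscFeatureBit = 1u << 4;

// Hypothetical helper: cpuidLeaf1Edx is the EDX value from CPUID function 1
// (how it is read is compiler specific and not shown here).
inline bool HasTimeStampCounter(std::uint32_t cpuidLeaf1Edx)
{
    return (cpuidLeaf1Edx & kTscFeatureBit) != 0;
}

// Self-check with made-up feature words.
inline void CheckHasTimeStampCounter()
{
    assert(HasTimeStampCounter(0x00000010u));   // only bit 4 set
    assert(!HasTimeStampCounter(0x0000000Fu));  // bits 0 to 3 set, bit 4 clear
}
```

A program could run such a check once at start-up and fall back to another timing source when the flag is clear.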

By GopalaKrishna.Palem
