Darryl Gove's blog
Analyzer probe effect
When profiling an application with the Sun Studio Performance Analyzer, there is some probe effect, both from interrupting the application to gather profiling data and from recording that data to disk. The following study is an attempt to quantify the probe effect when running on a T2000 (UltraSPARC-T1) system. This system can run many threads, so the relationship between the probe effect and the number of threads is particularly interesting.
The following test program was used:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <pthread.h>
#include <sys/time.h>   /* gethrtime() and hrtime_t on Solaris */

#define SIZE (128*1024*1024)

int * restrict array;
int array_length;
int round_off;
long threads=1;
long long results=0;    /* 64-bit: the total exceeds INT_MAX for 2 or more threads */
pthread_mutex_t results_mutex;
__thread int sum = 0;
__thread int id;

/* Each thread sums the whole array, then adds its sum to the shared total. */
void* thread_code(void* v)
{
    int i;
    id = (int)(intptr_t)v;
    for (i=0; i<SIZE; i++)
    {
        sum+=array[i];
    }
    pthread_mutex_lock(&results_mutex);
    results+=sum;
    pthread_mutex_unlock(&results_mutex);
    return 0;
}

int main(int argc,char** argv)
{
    int i,id;
    pthread_t *thread_array;
    hrtime_t start_time;

    if (argc==2) {threads=atoi(argv[1]);}
    if (threads<1) {threads=1;}
    array=(int*)malloc(sizeof(int)*SIZE);
    thread_array=(pthread_t *)malloc(sizeof(pthread_t)*threads);
    if (array==NULL || thread_array==NULL) {return 1;}
    pthread_mutex_init(&results_mutex,NULL);
    /* Computed but not used: every thread walks the full array. */
    array_length=SIZE/threads;
    round_off=SIZE-(array_length*threads);
    for (i=0; i<SIZE;i++)
    {
        array[i]=1;
    }
    /*Multithreaded*/
    start_time=gethrtime();
    for (i=0; i<10; i++)
    {
        /*Make the threads*/
        for (id=0;id<threads;id++)
        {
            pthread_create(&thread_array[id],NULL,thread_code,(void*)(intptr_t)id);
        }
        /*Join the threads*/
        for (id=0;id<threads;id++)
        {
            pthread_join(thread_array[id],NULL);
        }
    }
    printf("Elapsed time (seconds)=%5.3f ",(gethrtime()-start_time)/1000000000.0);
    printf("Total is %lld\n",results);
    return 0;
}
The study looked at how the runtime of the application changed as the number of threads was increased, and at whether the experiment was recorded to local or remote disk. The other factor investigated was the use of low frequency profiling (-p lo).
The results from this can be shown as a graph. The y-axis is the ratio of the runtime under profiling against the runtime without profiling. The x-axis is the number of active threads.
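To make the y-axis concrete, here is a minimal sketch of how one point on such a graph could be computed. The function names probe_effect_ratio and print_ratio are hypothetical helpers, not part of the test program above, and any timings passed to them are for the reader to supply from their own runs.

```c
#include <stdio.h>

/* Ratio of profiled runtime to unprofiled runtime.
   1.0 means no measurable probe effect; 2.0 means the run took twice as long.
   Returns -1.0 if the baseline is not a positive time. */
double probe_effect_ratio(double profiled_seconds, double baseline_seconds)
{
    if (baseline_seconds <= 0.0) {
        return -1.0;
    }
    return profiled_seconds / baseline_seconds;
}

/* Print one point of the graph: x = active threads, y = ratio. */
void print_ratio(int threads, double profiled_seconds, double baseline_seconds)
{
    printf("threads=%d ratio=%.2f\n", threads,
           probe_effect_ratio(profiled_seconds, baseline_seconds));
}
```

With made-up timings of 12.0 seconds under profiling and 10.0 seconds without, print_ratio(8, 12.0, 10.0) prints threads=8 ratio=1.20. These numbers are illustrative only and are not measurements from the study.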
The results suggest that profiling to remote disk can often cause significant probe effect; however, this probe effect may be manageable for low active thread counts. The alternatives of recording experiments to local disk, recording to /tmp, or using low frequency profiling (-p lo) all result in much lower overhead.
Posted at 08:16AM Jun 02, 2006 by Darryl Gove in Sun


