Darryl Gove's blog

Friday Jun 02, 2006

Analyzer probe effect

When profiling an application with the Sun Studio Performance Analyzer, there is some probe effect, caused both by interrupting the application to gather profiling data and by recording that data to disk. The following study attempts to quantify the probe effect when running on a T2000 (UltraSPARC-T1) system. Because this system can run many threads, the relationship between the probe effect and the number of threads is of particular interest.

The following test program was used:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <sys/time.h>


int * restrict array;
int array_length;
int round_off;
long threads=1;

#define SIZE (128*1024*1024)

int results=0;
pthread_mutex_t results_mutex;
__thread int sum = 0;
__thread int id;

void* thread_code(void* v)
{
  int i;
  id = (int)(long)v;  /* recover the thread index passed through the void* argument */
  for (i=0; i<SIZE; i++)
  {
    sum+=array[i];
  }
  pthread_mutex_lock(&results_mutex);
  results+=sum;  
  pthread_mutex_unlock(&results_mutex);
  return 0;
}
  


int main(int argc,char** argv)
{
  int i,rtn,id;
  pthread_t *thread_array;
  hrtime_t start_time;
  
  
  array=(int*)malloc(sizeof(int)*SIZE);
  int sum=0;
  
  if (argc==2) {threads=atoi(argv[1]);}
  
  pthread_mutex_init(&results_mutex,NULL);
  
  thread_array=(pthread_t *)malloc(sizeof(pthread_t)*threads);
  
  array_length=SIZE/threads;
  round_off=SIZE-(array_length*threads);
  
  for (i=0; i<SIZE;i++)
  {
    array[i]=1;
  }
  
  /*Multithreaded*/
  rtn=1;
  start_time=gethrtime();
  /*Repeat the create/join cycle 10 times*/
  for (i=0; i<10; i++)
  {
    /*Make the threads*/
    for (id=0;id<threads;id++)
    {
      pthread_create(&thread_array[id],NULL,&thread_code,(void*)(long)id);
    }

    /*Join the threads*/
    for (id=0;id<threads;id++)
    {
      pthread_join(thread_array[id],NULL);
    }
  }
  printf("Elapsed time (seconds)=%5.3f ",(gethrtime()-start_time)/1000000000.0);
  printf("Total is %i\n",results);
   
}

The study looked at how the runtime of the application changed as the number of threads was increased, and at whether the experiment was recorded to local or remote disk. The use of low-frequency profiling (-p lo) was also investigated.

The results from this can be shown as a graph. The y-axis is the ratio of the runtime under profiling against the runtime without profiling. The x-axis is the number of active threads.

Graph of probe effect against number of threads
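
The ratio plotted on the y-axis is simple to compute from two timed runs. The following is a minimal sketch, not the code used in the study; the function names probe_effect_ratio and print_ratio are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of profiled to unprofiled runtime: 1.0 means no measurable
   probe effect, 2.0 means the run took twice as long under profiling.
   Returns -1.0 if the baseline is not a positive time. */
double probe_effect_ratio(double profiled_seconds, double baseline_seconds)
{
  if (baseline_seconds <= 0.0)
  {
    return -1.0;
  }
  return profiled_seconds / baseline_seconds;
}

/* Print one data point of the graph for a given thread count
   (hypothetical helper). */
void print_ratio(int threads, double profiled_seconds, double baseline_seconds)
{
  assert(threads > 0);
  printf("threads=%d ratio=%.2f\n", threads,
         probe_effect_ratio(profiled_seconds, baseline_seconds));
}
```

Each elapsed time would come from the "Elapsed time" line printed by the test program, once run normally and once run under the profiler with the same thread count.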

The results suggest that profiling to remote disk can cause significant probe effect, although this may be manageable for low active thread counts. The alternatives of recording experiments to local disk or /tmp, or of using low-frequency profiling (-p lo), all result in much lower overhead.
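
One way to see why low-frequency profiling helps is to estimate how many samples a run generates. The sketch below is illustrative only: it assumes every active thread is sampled once per interval, and it assumes nominal intervals of roughly 10 ms for default profiling and roughly 100 ms for -p lo, which should be checked against the collect documentation for the version in use. The function name estimated_samples is hypothetical.

```c
#include <assert.h>

/* Estimate the number of profiling samples recorded during a run.
   ASSUMPTION: each active thread is interrupted once per interval,
   so the sample count scales with both runtime and thread count. */
long estimated_samples(double runtime_seconds, double interval_ms, int threads)
{
  assert(interval_ms > 0.0 && threads > 0);
  return (long)(runtime_seconds * 1000.0 / interval_ms) * threads;
}
```

Under these assumptions a 10 second run with 32 threads produces about 32000 samples at a 10 ms interval, and about 3200 at a 100 ms interval, so there are ten times fewer interruptions and ten times less data to write to the experiment.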
