Darryl Gove's blog
When threads go bad
When one thread in a multithreaded application hits a fatal error, such as a segmentation fault, that error will take out the entire app. Here's some example code:
#include <pthread.h>
#include <stdio.h>
void *work(void *param)
{
    /* Deliberately write through an invalid pointer to provoke SIGSEGV. */
    int *a;
    a = (int *)(1024 * 1024);
    (*a)++;
    printf("Child thread exit\n");
    return 0;
}

int main(void)
{
    pthread_t thread;
    pthread_create(&thread, 0, work, 0);
    pthread_join(thread, 0);
    printf("Main thread exit\n");
    return 0;
}
Compiling and running this produces:
% cc -O -mt pthread_error.c
% ./a.out
Segmentation Fault (core dumped)
Not entirely unexpected, that. The app died without the main thread having the chance to clean up resources etc. This is probably not ideal. However, it is possible to write a signal handler that captures the segmentation fault and terminates the child thread without causing the main thread to terminate. It's important to realise that there's probably little chance of actually recovering from the unspecified error, but this at least might give the app the chance to report the symptoms of its demise.
#include <pthread.h>
#include <stdio.h>
#include <signal.h>
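void *work(void *param)
{
    /* Deliberately write through an invalid pointer to provoke SIGSEGV. */
    int *a;
    a = (int *)(1024 * 1024);
    (*a)++;
    printf("Child thread exit\n");
    return 0;
}

/* Runs in the faulting thread: report the signal, then end only that thread. */
void hsignal(int i)
{
    printf("Signal %i\n", i);
    pthread_exit(0);
}

int main(void)
{
    pthread_t thread;
    sigset(SIGSEGV, hsignal);   /* install the handler before starting the thread */
    pthread_create(&thread, 0, work, 0);
    pthread_join(thread, 0);
    printf("Main thread exit\n");
    return 0;
}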
Which produces the output:
% cc -O -mt pthread_error.c
% ./a.out
Signal 11
Main thread exit
Posted at 10:02AM Nov 23, 2009 by Darryl Gove in Sun | Comments[3]



printf is not supposed to be signal-safe.
I know that you know that, but since I don't see any reference to it in your blog, one of your readers may try to use that in some code :-) (I only mention it because, since you wrote Solaris app programming, you are likely to be quoted many times :-))
Clicking on the opensparc link on your about page redirected me to http://box455.bluehost.com/suspended.page/disabled.cgi/opensparc.net which says that the domain opensparc.net is suspended :-(
I have been subscribed to your blog for quite some time and enjoy it very much: keep up the good work.
cheers,
-- paulo
Posted by paulo on November 23, 2009 at 11:20 AM PST #
@Paulo. Exactly! The list of async-signal-safe functions is here:
http://docs.sun.com/app/docs/doc/816-5137/gen-61908?a=view
If you look at the OpenSolaris code for printstack you'll find they use write:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/port/gen/walkstack.c
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/port/gen/walkstack.c
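A handler along those lines might look something like the sketch below, which builds the "Signal N" message by hand and emits it with write() instead of printf(). The names format_signal and hsignal_safe are hypothetical, and the 32-byte buffer is an arbitrary choice that comfortably fits the message. pthread_exit() is kept from the original handler even though it is not on the async-signal-safe list either.

```c
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: formats "Signal <n>\n" into buf without stdio.
   buf must hold at least 32 bytes. Returns the number of bytes stored
   (no terminating NUL is added). Negative values are treated as 0. */
static size_t format_signal(char *buf, int sig)
{
    static const char prefix[] = "Signal ";
    char digits[12];
    size_t n = 0, d = 0, i;
    unsigned int v = (sig < 0) ? 0u : (unsigned int)sig;

    for (i = 0; prefix[i] != '\0'; i++)
        buf[n++] = prefix[i];
    /* Collect decimal digits least significant first, then reverse. */
    do {
        digits[d++] = (char)('0' + v % 10u);
        v /= 10u;
    } while (v != 0u);
    while (d > 0)
        buf[n++] = digits[--d];
    buf[n++] = '\n';
    return n;
}

/* Handler that reports with write(), which is async-signal-safe,
   and then ends only the faulting thread. */
static void hsignal_safe(int sig)
{
    char buf[32];
    size_t n = format_signal(buf, sig);

    if (write(STDOUT_FILENO, buf, n) < 0) {
        /* nothing sensible to do about a failed write here */
    }
    pthread_exit(0);
}
```

Passing hsignal_safe to sigset() in place of hsignal should print the same "Signal 11" line as before.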
Thanks for pointing out the problem with OpenSPARC.net. Not sure what's going on there... hopefully the site will be back up soon.
Thanks,
Darryl.
Posted by Darryl Gove on November 23, 2009 at 12:13 PM PST #
And now we have no core file with which to debug the problem... it's almost always a bad idea to try to catch things like this.
Posted by John Levon on November 24, 2009 at 06:21 PM PST #