Darryl Gove's blog

Tuesday Oct 27, 2009

Programming and electronics for kids

I've been continuing to look into programming and electronics for kids. I wrote some of the programming options up a while back. Scratch is still a firm favourite.

On the list of things to try we have brickcc to try out with the lego NXT. Here's an old comparison of the various approaches to programming the NXT brick.

The other on-going project is a microcontroller - the STM Primer. Includes a screen, tilt sensor, and a single button.

On the electronics side (which is what lead me to microcontrollers in the first place), this is a nice article on kits for kids, and a second earlier one. There's also a bunch of kits available at makershed (the most surprising one is an EX-150, which is a couple of kits up from what I had. I think I had the EX-60). Here's a list of some microcontroller starter kits, and a different list of microcontroller like options.

Monday Oct 26, 2009

Things to see in Dorset

We spent a bit of time in Dorset on our trip home. Interesting things that we missed seeing giant sea monsters and unexploded bombs.

Wednesday Oct 21, 2009

GMT

One of the things I didn't manage to do on our recent vacation in the UK was to visit Greenwich Observatory. This is the "home" of Greenwich Mean Time. The BBC has a nice article on the history of GMT. Perhaps next time....

Monday Oct 19, 2009

Fishing with cputrack

I'm a great fan of the hardware performance counters that you find on most processors. Often you can look at the profile and instantly identify what the issue is. Sometimes though, it is not obvious, and that's where the performance counters can really help out.

I was looking at one such issue last week, the performance of the application was showing some variation, and it wasn't immediately obvious what the issue was. The usual suspects in these cases are:

  • Excessive system time
  • Process migration
  • Memory placement
  • Page size
  • etc.

Unfortunately, none of these seemed to explain the issue. So I hacked together the following script cputrackall which ran the test code under cputrack for all the possible performance counters. Dumped the output into a spreadsheet, and compared the fast and slow runs of the app. This is something of a "fishing trip" script, just gathering as much data as possible in the hope that something leaps out, but sometimes that's exactly what's needed. I regularly get to sit in front of a new chip before the tools like ripc have been ported, and in those situations the easiest thing to do is to look for hardware counter events that might explain the runtime performance. In this particular instance, it helped me to confirm my suspicion that there was a difference in branch misprediction rates that was causing the issue.

Tuesday Oct 13, 2009

Surprisingly slow compile time

I had an e-mail which told the sorry tale of a new system which tool longer to build a project than an older system, of theoretically similar performance. The system showed low utilisation when doing the build indicating that it was probably spending a lot of time waiting for something.

The first thing to look at was a profile of the build process using `collect -F on`, which produced the interesting result that the build was taking just over 2 minutes of user time, a few seconds of system time, and thousands of seconds of "Other Wait" time.

"Other wait" often means waiting for network, or disk, or just sleeping. The other thing to realise about profiling multiple processes is that all the times are cumulative, so all the processes that are waiting accumulate "other wait" time. Hence it will be a rather large number if multiple processes are doing it. So this confirmed and half explained the performance issue. The build was slow because it was waiting for something.

Sorting the profile by "other wait" indicated two places that the wait was coming from, one was waitpid - meaning that the time was due to a process waiting for another process, well we knew that! The other was a door call. Tracing up the call stack eventually lead into the C and C++ compiler, which were calling gethostbyname. The routine doing the calling was "generate_prefix" which is the routine responsible for generating a random prefix for function names - the IP address of the machine was used as one of the inputs for the generation of a prefix.

The performance problem was due to gethostbyname timing out, common reasons for this are missed configurations in the /etc/hosts and /etc/nsswitch.conf files. In this example adding the host name to the hosts file cured the problem.

Monday Oct 12, 2009

An aliasing example

The compiler flag -xalias_level allows a user to assert the degree of aliasing that exists within the source code of an application. If the assertion is not true, then the behaviour of the application is undefined. It is definitely worth looking at the examples given in the user's guide, although they can be a bit "dry" to read. So here's an example which illustrates what can happen:

struct stuff{
 int value1;
 int value2;
};

void fill(struct stuff *x)
{
  x->value1=0;      // Clear value1 
  int * r=(int*)x;  // Take the address of the structure
  int var = *r;     // Take the value from value1
  x->value1=var;    // And store it back into value1
}

The above code will clear value1 and then load and store this value back. So for correctly working code value1 should exit the function containing zero. However, if -xalias_level=basic is used to build the application, then this tells the compiler that no two pointers to variables of different types will alias. So pointer to an int will never alias with an int. So the read from *r does not alias with x.value1.

So with this knowledge the compiler is free to remove the original store to x.value1, because it has been told that nothing will alias with it, and there is a later store to the same address. The later store will overwrite the initial store.

Fortunately it the lint utility can pick up these issues:

$ lint -Xalias_level=basic alias.c
(9) warning: cast of nonscalar pointer to scalar pointer is valid only at -xalias_level=any

For the example above the compiler does the correct thing and eliminates all the instructions but the store to value1. For more complex examples there is no guarantee that the code will be correct if it violates the -xalias_level setting.

Friday Oct 09, 2009

Webcast: Improving the performance of parallel codes using the Performance Analyzer

Earlier in the summer I recorded a slidecast on using the Performance Analyzer on parallel codes, it's just come out on the HPC portal.

Calendar

Search this blog

About

Solaris Application Programming

Book resources

The Developer's Edge

Book resources

OpenSPARC Internals

Book resources

Recent entries

Custom search

Tag cloud

book cmt communityone compiler cooltools cpu2006 dtrace gcc libraries linker multithreading openmp opensolaris opensparc optimisation optimization parallelisation parallelization performance performanceanalyzer programming secondlife solaris solarisapplicationprogramming sparc spot sunstudio ultrasparc ultrasparct2 x86

Links

Webcasts

Articles

Presentations

Interesting docs

Navigation

Referers

Feeds