Darryl Gove's blog

Monday Apr 28, 2008

static and inline functions

Hit a problem when compiling a library. The problem is with mixing static and inline functions, which is not allowed by the standard, but is allowed by gcc. Example code looks like:

char * c;

static void foo(char *);

inline void work()
{
  foo(c);
}

void foo(char* c)
{
} 

When this code is compiled it generates the following error:

% cc s.c
"s.c", line 7: reference to static identifier "foo" in extern inline function
cc: acomp failed for s.c

It turns out that there is a workaround for this problem, which is the flag -features=no%extinl. Douglas Walls describes the issue in much more detail.

Monday Mar 17, 2008

auto_ilp32, unsafe for production?

A while back I was reading up on the intel compiler's -auto_ilp32 flag. This flag produces a binary that enjoys the benefits of the EMT64 instructions set extensions, but uses a 32-bit memory model. The idea of the flag is to get the performance from the instruction set while avoiding the increase in memory footprint that comes from 64-bit pointers. I'm totally in favour of the idea, after all, that's the idea behind the v8plus/sparcvis architecture.

However, I was a bit distressed at the details of the implementation. The flag tells the compiler to make two assumptions, firstly that longs can be constrained to 4-bytes (rather than 8-bytes), secondly that pointers can also be held in 4 bytes (rather than 8).

The assumption for longs can be argued that if the application works when compiled for IA32, then any longs in the program do fit into 32-bits, so only using 32-bits for longs is therefore ok.

The docs place this restriction on the use of the flag for pointers:

Specifies that the application cannot exceed a 32-bit address space, which allows the compiler to use 32-bit pointers whenever possible.

The idea being that if the application only uses a 32-bit memory range, then the upper bits of the 64-bit pointer are going to be zero anyway - so why store them.

The problem with both of these is that the code ends up being run as if it were a 64-bit application. So the OS thinks the app is 64-bit, so will quite happily pass 8 byte longs to the app, or pass memory that is not in the low 32-bit addressable memory.

To go into a couple of situations where this could be 'bad':

  • Imagine a program which relies on a zero return value from malloc to tell it to stop allocating memory (perhaps it caches data, or maybe the algorithm it choses depends on how much memory is available). Under -auto_ilp32, the OS keeps returning new memory, so the application thinks it has not run out of memory, but in fact it is getting memory that is no longer addressable using just 32-bits.
  • Consider an application which allocates a chunk of memory to handle a problem. The larger the problem, the more memory is required. At some point the size of the problem will require a chunk of memory that is not entirely 32-bit addressable. Because malloc does not return zero, the application has no way of knowing that the problem is too large.
  • The application takes the pointers to memory that is not 32-bit addressable, and drops the upper 32-bits. Now it has a pointer into 32-bit addressable memory that has already been used for other data, and it starts storing new data on top of the old data.
  • Imagine another situation where the application is profiled, and it's realised that too much time is spent in the memory management library. So the developer produces a new library and gets the application to use this in preference to the system provided library. The only side-effect of this library is that it starts allocating memory at a higher address, which doesn't matter if you have a 64-bit address, but it eats up a sizable amount of the 32-bit addressable space.

In many of these cases the memory corruption problem would just leap out, and the user would know to remove the flag, but there will be some cases where the flag could cause silent data corruption.

But hang on a second ... didn't I just say that SPARC has the same hybrid. Well, not quite. v8plus is, as it's name suggests, built on SPARC V8 - so the OS thinks that it is a 32-bit application. Hence longs are 32-bits in size, as are pointers. There's some code necessary to store the additional state of a V9 processor. But basically, the application is a 32-bit app which happens to use a few more instructions.

The thing that's frustrating is that it's quite possible to make the OS aware of the 32-bit/64-bit hybrid, or to engineer a layer to be able to safely run a 32-bit hybrid app with the OS believing it to be a 64-bit app, but that was not the approach taken. A bit of a missed opportunity, IMO.

Monday Jan 28, 2008

Sample chapter from Solaris Application Programming available

There's a sample chapter from my book up on sun.com/books.

It's chapter 4 which is the chapter which discusses the tools that come with Solaris and Sun Studio. The chapter exists because I find that there are some tools that I use every day, and some tools that I might touch once a month, and some that I use even more rarely. The problems I hit are:

  • What was the name of the tool which ....?
  • What are the command line options to ...?
  • Is there a tool to ....?

Obviously I hit the third problem very infrequently, but I'm sometimes surprised when I discover a tool which I'd previously never heard of which just happens to do exactly what I need. Anyway I hope you find the chapter useful. It's one of my two solutions to this problem.

The other solution is spot which attempts to collect all the data that you routinely need for performance analysis of an application. So it calls the other tools - so you don't need to know the commandlines, or the names of the tools. One of the things that should be noticeable with spot is that it has few commandline options. I was hoping that we'd end up with none, but some are inevitable; but those are really house-keeping options (where to put the report, what to call it). There's only -X which generates an extended report, given the time it can take to get the data, it seemed appropriate to do the high value stuff quickly with an option for the tool to take a longer time when the user specified that it was ok.

Tuesday Nov 06, 2007

Values defined by the compiler

The Sun Studio compilers have a number of default #defines included in the header files, more details on these can be found here:

Tuesday Oct 09, 2007

Compiling for the UltraSPARC T2

Today, Sun launched systems based on the UltraSPARC T2. A question that is bound to come up is what compiler flags should be used for the processor?

Sun Studio 12 has the flag -xtarget=ultraT2 to specifically target the UltraSPARC T2. But before jumping off and using this flag, let's take the flag apart and see what it actually means. There are three components that are set by the -xtarget flag :

  • -xcache flag. This flag tells the compiler to target a particular cache configuration. The flag will have an impact on floating point code where the loops can be tiled to fit into cache. Obviously not all codes are amenable to this optimisation, so the -xcache setting is usually unimportant.
  • -xchip flag. This sets the instruction latencies and instruction selection preferences. The UltraSPARC T2 (in common with the UltraSPARC T1) has a simple pipeline so there is nothing much to gain from accurately modelling the instruction latencies. There are also no real situations where it will do better with one instruction sequence in preference to another (unless one is longer than the other). So for the UltraSPARC T2 this flag has little impact on the generated code.
  • -xarch flag. The -xarch flag controls the target architecture. This is traditionally used principally to control whether 32-bit or 64-bit binaries are generated. However, Sun Studio 12 introduced the flags -m32 and -m64 to separate the address-size of the binary from the instruction set selection. There are no UltraSPARC T2 specific instructions which the compiler currently generates, so the default of the SPARC V9 ISA is fine.
  • To summarise, there is an UltraSPARC T2 specific compiler flag, but for most situations the best target to use would be -xtarget=generic which should give good performance over a wide range of processors.

Thursday Aug 16, 2007

Presenting at Stanford HPC conference

I'll be presenting at Stanford next week as part of their HPC conference (Register here). I plan to cover:

Wednesday Jul 25, 2007

List of Sun Studio redistributable libraries

List of libraries that are included with Sun Studio, and can be redistributed with applications compiled by Sun Studio.

All the documentation for Sun Studio 12.

Tuesday Jul 24, 2007

CMT Developer Tools

We've just released the CMT Developer Tools for Sun Studio 12. The tools are available for both SPARC and x64 systems, although not all tools are on the x64 platform. The list of tools is as follows:

  • ATS - Automatic Tuning and Troubleshooting System. Automatically finds the best compiler flags or locates optimisation bugs in an application without access to the source code. (SPARC and x64)
  • SPOT - Simple Performance Optimisation Tool. Generates an html report detailing where the application is spending time, and information about why it might be spending time there.(SPARC and x64)
  • BIT - Binary Improvement Tool. Reports information about application coverage also produces instruction execution counts, branch taken data etc. (SPARC only)
  • Discover - Sun Memory Error Discovery Tool. Detects memory access issues such in an application. Examples are accesses to uninitialised memory, accesses beyond array bounds, memory leaks, etc. (SPARC only)

Calendar

Search this blog

About

Solaris Application Programming

Book resources

Recent entries

Custom search

Tag cloud

ats bit book c++ cmt communityone compiler cooltools cpu2006 developers dtrace gccfss hpc multithreading openmp opensparc parallelisation parallelization performance performanceanalyzer secondlife solaris solarisapplicationprogramming sparc spot sunstudio t2 ultrasparc ultrasparct2 x86

Links

Webcasts

Articles

Presentations

Navigation

Referers

Feeds