compiler thoughts

Thursday March 09, 2006

gcc4ss flags

The goal of 100% compatibility between gcc4ss (GCC for SPARC Systems) and plain GCC couldn't be achieved if we didn't support all gcc flags. So we do! gcc4ss accepts all gcc flags, and we added more to control the Sun Code Generator for SPARC Systems (scg4ss).

The maximum optimization level is still -O3 (same as GCC). At -O3 gcc4ss performs initial inlining and passes its IR (Internal Representation) to scg4ss, which does advanced optimizations and further inlining. scg4ss's heuristics are tuned for SPARC processors and can be driven by profile feedback and by inter-module and inter-procedural analysis. Unfortunately I'm not in a position to talk about exact numbers, but grab your favourite app and measure -O2 vs -O3 performance with gcc4ss. And send us your results, of course!
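A minimal sketch of that -O2 vs -O3 comparison follows. It assumes the gcc4ss driver is on your PATH as gcc and uses app.c as a placeholder for your own source; neither name comes from gcc4ss itself. By default the script only prints the commands; set RUN=1 to actually execute them.

```shell
#!/bin/sh
# Sketch: build one program at -O2 and at -O3, then time each binary.
# Assumptions: "gcc" is the gcc4ss driver, app.c is a placeholder source file.
CC=${CC:-gcc}
SRC=${SRC:-app.c}

# Print each command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

compare_levels() {
    for level in -O2 -O3; do
        bin="app${level}"              # app-O2, then app-O3
        run "$CC" "$level" -o "$bin" "$SRC"
        run time "./$bin"              # time is the POSIX timing utility
    done
}

compare_levels
```

Run each binary several times on a quiet machine before comparing the timings.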

On top of -O3 we added the -fast flag. Those familiar with Sun Studio know about this flag already. -fast is a macro for -O3 -xtarget=native -fns -fsimple=2 and other flags. -xtarget=native detects the architecture, chip, and cache of the machine on which the compiler is running, so you don't have to worry about an improper -xarch or -xchip on your build server. Of course there is also -xtarget=generic in scg4ss for a 'blended' arch/chip model. -fns and -fsimple=2 allow scg4ss's optimizer to perform aggressive floating point optimizations which do not strictly conform to IEEE 754, but make floating point code run much faster. Once you're comfortable with -O3, try -fast instead. That's what we use to run SPEC benchmarks.
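To see why relaxing strict IEEE 754 rules can change results, here is a small illustration that does not need gcc4ss at all. It relies only on awk doing its arithmetic in double precision. Floating point addition is not associative, so an optimizer that is allowed to regroup a sum may produce a different answer. Whether scg4ss performs this particular regrouping under -fsimple=2 is an assumption here, not something stated above.

```shell
#!/bin/sh
# Sketch: the same three numbers summed with two different groupings.
assoc_demo() {
    awk 'BEGIN {
        a = 1e16; b = -1e16; c = 1
        left  = (a + b) + c    # a and b cancel exactly, then 1 is added
        right = a + (b + c)    # the 1 is rounded away: doubles near 1e16 are 2 apart
        print left, right
    }'
}

assoc_demo    # prints: 1 0
```

If your code depends on the exact grouping of such sums, compare its results with and without -fast before switching.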

As an extra topping to your -fast shake you can add the -xipo flag to do inter-procedural optimizations. scg4ss's internal representation is stored within the object file and fetched back at link time, hence the optimizer can see the IR for all modules at once. During an -xipo build each module is compiled at an -O0-like level, hence all .o files are built quickly, but linking takes quite some time, because the optimizer needs to recompile all modules at the original optimization level and call the code generator for each .o again. -xipo works best with -xprofile.
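An -xipo build of a two-module program could look roughly like the sketch below. The file names a.c, b.c and app are placeholders, and the driver name gcc is an assumption. The flag is passed at both the compile and the link step, since the link step is where the cross-module work happens. The script prints the commands by default; set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch: compile two modules with -xipo, then link with -xipo.
# Assumptions: "gcc" is the gcc4ss driver; a.c, b.c and app are placeholders.
CC=${CC:-gcc}

# Print each command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

ipo_build() {
    # Fast per-module compiles: each .o carries the IR for later use.
    run "$CC" -fast -xipo -c a.c
    run "$CC" -fast -xipo -c b.c
    # Slow link: the optimizer sees all modules at once and regenerates code.
    run "$CC" -fast -xipo -o app a.o b.o
}

ipo_build
```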

The -xprofile flag is used in two steps: step one collects training data with -xprofile=collect, and step two uses that profile data with -xprofile=use. Normally you don't have to use -xipo during the 'collect' phase even if you want to use it during 'use', but it's recommended to keep the optimization level and other flags the same between the two phases.
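The two steps could be scripted roughly as follows. This is a sketch under assumptions: gcc is taken to be the gcc4ss driver, app.c is a placeholder, and the training run is simply the program run once with no arguments. No -o is given, so both builds produce a.out; that choice sidesteps the question of how the profile data is named, which is not covered here. The script prints the commands by default; set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch: profile feedback in two steps, with -xipo added only for "use".
# Assumptions: "gcc" is the gcc4ss driver; app.c is a placeholder source file.
CC=${CC:-gcc}
SRC=${SRC:-app.c}

# Print each command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

profile_build() {
    # Step 1: instrumented build (writes a.out because no -o is given).
    run "$CC" -fast -xprofile=collect "$SRC"
    # Training run: exercise the program on representative input.
    run ./a.out
    # Step 2: rebuild with the same optimization flags, using the profile.
    run "$CC" -fast -xipo -xprofile=use "$SRC"
}

profile_build
```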

There are a bunch of other performance-related flags.
Please read about them here:
http://cooltools.sunsource.net/gcc/flags.html

Alexey.

Posted by alexey ( Mar 09 2006, 02:08:17 PM PST )

Comments:

Do you plan to submit your GCC changes not related to the SUN backend to GCC? (Looking over the patch I noted some changes to the dwarf* files, there might have been others too).

Posted by p on March 10, 2006 at 11:16 AM PST #

Hi P, the GCC committee doesn't like anything that is not GCC. We are going to support gcc4ss on our own.

Posted by Alexey on March 10, 2006 at 02:32 PM PST #

I think you misunderstood. I was referring to some code in dwarf*.c that seemed to add support for a new type of debug info, dwarf_string_reference (or something of the sort, I don't remember exactly and I don't have the sources anymore). That code looked to me like an improvement to gcc, not just something needed for the Sun backend (I might be wrong though, I didn't look too thoroughly).

Posted by p on March 10, 2006 at 03:00 PM PST #

WRT performance, I ran the Briggs compiler benchmarks with both gcc-4.1 and the Sun compiler. gcc performs quite a bit better. Try it!

Posted by p on March 10, 2006 at 04:51 PM PST #

Are you talking about dwarf2_indirect_strings? This is the feature needed to compile Solaris, and I believe there is a csl-sol210-3_4 gcc branch with such support. I'm not sure it was ever integrated into the main branch.

Posted by 192.18.42.11 on March 10, 2006 at 05:58 PM PST #

Pardon the ignorance, but what are the 'Briggs compiler benchmarks'? I'm really interested in trying them.

Posted by Alexey on March 10, 2006 at 06:02 PM PST #

See http://citeseer.ist.psu.edu/85455.html The code used to be available on the rice.edu ftp server, but does not seem to be there anymore. There are a few files that test about 50 cases for value numbering, code motion, constant propagation, and strength reduction. The only place where I could still find the files is in llvm-1.2. Look for dead.c

Posted by 128.195.11.178 on March 10, 2006 at 09:53 PM PST #


Disclaimer:

This site is a personal blog and is to be used for informational purposes only. The views expressed on this blog are those of the author only, and should not be attributed to any past or present employers.
