Tuesday June 28, 2005 After struggling a bit to get my laptop working with Solaris 10, everything is now sorted and I'm happy to notice how much faster it is. I arrived back in the United States last Thursday - it's nice to see my own bed. Working from Dublin was great and has left me feeling charged. I finally sorted through a number of balls I've been juggling and am finding myself focusing on interface stability. The Sun GNOME team is also focusing on performance issues (especially GNOME memory footprint), so I got involved and spent some time teaching some of my coworkers how to use the Sun Forte collect profiler, a really useful and easy-to-use performance tool. To use it, the program you are instrumenting must be compiled with Forte. What I really like is that you don't have to compile with any special options, no -g, no special compile steps. In other words, you can generate a performance report for pretty much any program that ships with Solaris.
To get performance timing information, you just need to run "collect programname". When the program exits, a profile report will be created in a subdirectory named test.#.er. You can then review a performance report with the following command:
er_print -metrics i.user:e.user:i.system:e.system:i.wall:e.wall -function test.#.er
And a caller-callees report with the following command:
er_print -cmetrics i.user:e.user:i.system:e.system:i.wall:e.wall -caller-callees test.#.er
The -metrics/-cmetrics argument generates a report that shows inclusive and exclusive user, system, and wall time. Inclusive time is the time spent in that function and in functions called by that function. Exclusive time is the time spent just in that function.
The -sort option is also useful. If you want to sort by effective user time you use "-sort e.user".
For example, here is a functions report for metacity-theme-viewer:
current: i.user:e.user:i.system:e.system:i.wall:e.wall:name current sort metric: Exclusive User CPU Time Functions sorted by metric: Exclusive User CPU Time Incl. Excl. Incl. Excl. Incl. Excl. Name User CPU User CPU Sys. CPU Sys. CPU Wall Wall sec. sec. sec. sec. sec. sec. 0.460 0.460 0.070 0.070 1.511 1.511 <total> 0.110 0.110 0.020 0.020 0.130 0.130 take_deferred_signal 0.020 0.020 0. 0. 0.020 0.020 FcSortCompare 0.020 0.020 0. 0. 0.020 0.020 gdk_rgb_convert_0888 0.020 0.020 0. 0. 0.020 0.020 strcmp 0.010 0.010 0.010 0. 0.020 0.010 FcCacheReadString 0.010 0.010 0. 0. 0.010 0.010 FcCharSetIsSubset 0.010 0.010 0. 0. 0.010 0.010 FcCompareFamily 0.010 0.010 0. 0. 0.010 0.010 FcDebug 0.010 0.010 0. 0. 0.010 0.010 FcLangSetIndex 0.010 0.010 0. 0. 0.010 0.010 FcPatternPosition 0.010 0.010 0. 0. 0.010 0.010 FcValueListDestroy [...]
The callers-callees report looks like this (one report for each function):
Incl. Excl. Incl. Excl. Incl. Excl. Name User CPU User CPU Sys. CPU Sys. CPU Wall Wall sec. sec. sec. sec. sec. sec. 0.020 0. 0. 0. 0.020 0. FcCharSetAddLeaf 0.020 0. 0. 0. 0.020 0. *FcCharSetFindLeafCreate 0.010 0. 0. 0. 0.010 0. FcCharSetPutLeaf 0.010 0.010 0. 0. 0.010 0.010 calloc
This report is for the function FcCharSetFindLeafCreate, since it is the function prefixed with an "*". It shows that FcCharSetFindLeafCreate had one caller, FcCharSetAddLeaf, since it is listed above. It also shows that FcCharSetFindLeafCreate called two functions, FcCharSetPutLeaf and calloc, since these are listed below.
Sun also makes a program called analyzer which is a GUI front-end for er_print, which is nice to use. However, I often find that simply using collect and er_print can help identify what the problem might be if you notice a particular program acting slow.
( Jun 28 2005, 12:08:01 AM CDT ) Permalink Comments [0]