GUADEC followup: evolution mallocs by callstack
Wednesday Jul 05, 2006
Evolution memory usage has been a sore spot for many GNOME desktop users, including those of us who run GNOME on a thin client. In his GUADEC talk on GNOME performance, Federico Mena Quintero mentioned that it would be useful to profile evolution memory usage. This entry shows the dtrace oneliner which profiles arg0 (malloc size) for the calls to malloc during the startup of the evolution process. During our talk I tried to also profile mallocs by callstack, but Federico's blog on this topic, indicated that profiling evolution-data-server mallocs while visiting all email folders might be useful. Here is the dtrace script:
#!/usr/sbin/dtrace -s
pid$target:libc:malloc:entry
{
@memwhere[ustack()] = quantize(arg0);
}
I call it with:
./g_evolutionmalloc.d -x aggsize=10m -c /usr/lib/evolution-data-server-1.2/evolution-data-server-1.6 > edatamalloc.out
(aggsize is the size of the aggregation buffer). Then I launch evolution and visit all of the folders in my mailbox. The output is quite large, remember for each unique callstack which calls malloc, I quantize the size of the malloc. Using the oneliner shows the overall distribution of sizes by evolution-data-server when I run evolution and browse all of my folders:
value ------------- Distribution ------------- count
0 | 0
1 | 27
2 |@ 311
4 |@@@ 663
8 |@@@@@@@@@@ 2046
16 |@@@@@@@@@@@@ 2553
32 |@@@@@@@ 1471
64 |@@@ 721
128 |@ 177
256 |@ 142
512 |@ 132
1024 | 91
2048 | 21
4096 | 5
8192 | 48
16384 | 2
32768 | 0
65536 | 4
131072 | 0
262144 | 2
524288 | 0
Lots of 8 and 16 byte mallocs, is this death by a thousand cuts? Gslice (and/or Solaris's slab allocator) should at least help reduce the time for these small mallocs. But where are we when these allocs take place? You are here. (bzip2 compressed because our roller admin doesn't like files > 3M.) As you might expect, most of the unique callstacks only have one value for arg0 (within a power of 2). Here are a few excerpts that caught my eye: Calendar timezone stuff calls malloc with a wide range of sizes:
libc.so.1`malloc
libecal-1.2.so.3.2.8`icalarray_append+0x3a
libecal-1.2.so.3.2.8`icaltimezone_parse_zone_tab+0x21f
libecal-1.2.so.3.2.8`icaltimezone_init_builtin_timezones+0x24
libecal-1.2.so.3.2.8`icaltimezone_get_utc_timezone+0x21
libecalbackendfile.so`e_cal_backend_file_init+0x63
libgobject-2.0.so.0.1000.2`g_type_create_instance+0x2e4
libgobject-2.0.so.0.1000.2`g_object_constructor+0x23
libgobject-2.0.so.0.1000.2`g_object_newv+0x350
libgobject-2.0.so.0.1000.2`g_object_new_valist+0x331
libgobject-2.0.so.0.1000.2`g_object_new+0x62
libecalbackendfile.so`_events_new_backend+0x35
libedata-cal-1.2.so.1.3.10`e_cal_backend_factory_new_backend+0x35
libedata-cal-1.2.so.1.3.10`impl_CalFactory_getCal+0x336
libedata-cal-1.2.so.1.3.10`_ORBIT_skel_small_GNOME_Evolution_Calendar_CalFactory_getCal+0x35
libORBit-2.so.0.1.0`ORBit_POAObject_invoke+0x3d
libORBit-2.so.0.1.0`ORBit_OAObject_invoke+0x34
libORBit-2.so.0.1.0`ORBit_small_invoke_adaptor+0x2b5
libORBit-2.so.0.1.0`ORBit_POAObject_handle_request+0x334
libORBit-2.so.0.1.0`ORBit_POAObject_invoke_incoming_request+0x54
value ------------- Distribution ------------- count
512 | 0
1024 |@@@ 1
2048 |@@@ 1
4096 |@@@@@@@@@@ 3
8192 |@@@@@@@@@@@@@@@@@@@@ 6
16384 |@@@ 1
32768 | 0
And there are some individual calls to malloc from calendar timezone code which are relatively large.
libc.so.1`malloc
libecal-1.2.so.3.2.8`icalarray_append+0x3a
libecal-1.2.so.3.2.8`icaltimezone_parse_zone_tab+0x21f
libecal-1.2.so.3.2.8`icaltimezone_init_builtin_timezones+0x24
libecal-1.2.so.3.2.8`icaltimezone_get_utc_timezone+0x21
libecalbackendfile.so`e_cal_backend_file_init+0x63
libgobject-2.0.so.0.1000.2`g_type_create_instance+0x2e4
libgobject-2.0.so.0.1000.2`g_object_constructor+0x23
libgobject-2.0.so.0.1000.2`g_object_newv+0x350
libgobject-2.0.so.0.1000.2`g_object_new_valist+0x331
libgobject-2.0.so.0.1000.2`g_object_new+0x62
libecalbackendfile.so`_events_new_backend+0x35
libedata-cal-1.2.so.1.3.10`e_cal_backend_factory_new_backend+0x35
libedata-cal-1.2.so.1.3.10`impl_CalFactory_getCal+0x336
libedata-cal-1.2.so.1.3.10`_ORBIT_skel_small_GNOME_Evolution_Calendar_CalFactory_getCal+0x35
/
libedata-book-1.2.so.2.2.1`e_book_backend_load_source+0x89
libedata-book-1.2.so.2.2.1`e_book_backend_open+0x89
libedata-book-1.2.so.2.2.1`impl_GNOME_Evolution_Addressbook_Book_open+0x5a
libedata-book-1.2.so.2.2.1`_ORBIT_skel_small_GNOME_Evolution_Addressbook_Book_open+0x2e
libORBit-2.so.0.1.0`ORBit_POAObject_invoke+0x3d
libORBit-2.so.0.1.0`ORBit_OAObject_invoke+0x34
libORBit-2.so.0.1.0`ORBit_small_invoke_adaptor+0x2b5
libORBit-2.so.0.1.0`ORBit_POAObject_handle_request+0x334
libORBit-2.so.0.1.0`ORBit_POAObject_invoke_incoming_request+0x54
libORBit-2.so.0.1.0`giop_thread_queue_process+0x105
libORBit-2.so.0.1.0`giop_request_handler_thread+0x5b
libglib-2.0.so.0.1000.2`g_thread_pool_thread_proxy+0xb4
libglib-2.0.so.0.1000.2`g_thread_create_proxy+0x11b
value ------------- Distribution ------------- count
131072 | 0
262144 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1
524288 | 0
Uggh, too much data! So I can add a predicate to the original script so I only have look at malloc calls between 8 and 32 bytes:
#!/usr/sbin/dtrace -s
pid$target:libc:malloc:entry
/arg0 > 8 && arg0 < 32/
{
@memwhere[ustack()] = quantize(arg0);
}
That does thin it out a bit, but there are still 578 unique callstacks. Maybe I could sum the total amount of memory allocated for each callstack and only print the ones which exceed 1M? I'm open to suggestions.
Similar memory profiles of the evolution process itself are also interesting. During evolution startup and folder browse, there are 30 malloc requests for 64K from this evolution process callstack:
libc.so.1`malloc+0x60
libresolv.so.2`__memget+0x1f
libresolv.so.2`ho_byname2+0x42
libresolv.so.2`ho_byname2+0x7c
libresolv.so.2`gethostbyname2_p+0x111
libresolv.so.2`gethostbyname_p+0x45
libresolv.so.2`res_gethostbyname+0x21
nss_dns.so.1`_gethostbyname+0x1c
nss_dns.so.1`getbyname+0x212
libc.so.1`nss_search+0x1a9
libnsl.so.1`_switch_getipnodebyname_r+0x73
libnsl.so.1`_door_getipnodebyname_r+0xf6
libnsl.so.1`_get_hostserv_inetnetdir_byname+0xb62
libnsl.so.1`getipnodebyname+0xdf
libsocket.so.1`get_addr+0x126
libsocket.so.1`_getaddrinfo+0x414
libsocket.so.1`getaddrinfo+0x19
libcamel-1.2.so.0.0.0`cs_getaddrinfo+0x2b
libc.so.1`_thr_setup+0x51
libc.so.1`_lwp_start
value ------------- Distribution ------------- count
32768 | 0
65536 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 30
If anyone finds this useful, I can upload a dump of this or any simililar information.










