WebSphere Tuning Tip: Scalability on Solaris
The malloc and free functions in the default libc.so library are effectively single-threaded: all callers serialize on one lock. On Solaris 10, you can find the alternative libmtmalloc.so and libumem.so libraries in the /usr/lib directory. One of the main reasons Sun keeps the standard malloc in libc is that many ISVs and applications depend on this library. Thus, we provide the alternative memory allocators as options. Below, I provide an example of how you can check for hot locks in libc.so in your WAS Java process.
You can use the plockstat command on Solaris 10. This is a DTrace client, so you have to execute it in the global zone as root. To execute it as non-root, you must have the proper DTrace privileges granted. See the plockstat man page for more details about the options. Here, I hit <Control-c> after several seconds. Alternatively, you can use the "-e <secs>" option to run plockstat for a fixed number of seconds without having to hit <Control-c> to break out of the command.
# plockstat -H -p <WAS_PID>
^C
Mutex hold
Count nsec Lock Caller
-------------------------------------------------------------------------------
30 89886 0xfe3e1940 libjvm.so`__1cCosRpd_suspend_thread6Fpn
30 78896 0xfe3e17c0 libjvm.so`__1cCosRpd_suspend_thread6Fpn
14 72921 libc.so.1`_uberdata libc.so.1`_lwp_start
30 72633 0xfe3e1800 libjvm.so`__1cCosRpd_suspend_thread6Fpn
30 71996 0xfe3e1840 libjvm.so`__1cCosRpd_suspend_thread6Fpn
1 34800 0xfe3e1b40 libjvm.so`_start+0x4c
2 33500 libc.so.1`libc_malloc_lock libjvm.so`__1cUGenericGrowableArrayUcle
4 32750 libc.so.1`libc_malloc_lock 0xfea32e40
1 28300 libc.so.1`libc_malloc_lock 0xfeaa7a88
1 26600 0xfe3e1940 libjvm.so`__1cCosMstart_thread6FpnGThre
1 26300 0xfe3e1940 libjvm.so`_start+0x4c
1 25600 libc.so.1`libc_malloc_lock libjvm.so`__1cQChunkPoolCleanerEtask6M_
12 25541 libc.so.1`_uberdata libc.so.1`thr_create+0x2c
3 25433 libc.so.1`libc_malloc_lock libjava.so`Java_java_lang_ClassLoader_d
6 23566 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x20
1 23500 libc.so.1`__sbrk_lock libc.so.1`_malloc_unlocked+0x1fc
1 21900 libc.so.1`libc_malloc_lock libjava.so`JNU_GetStringPlatformChars+0
2 20200 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x20
1 20000 libc.so.1`libc_malloc_lock libverify.so`VerifyFormat+0xdf8
12 19200 libc.so.1`_uberdata libjvm.so`__1cCosNcreate_thread6FpnGThr
1 19200 libc.so.1`libc_malloc_lock libWs60ProcessManagement.so`process_str
1 18900 0xfe3e1b00 libjvm.so`__1cCosMstart_thread6FpnGThre
1 18000 0xfe3e1980 libjvm.so`__1cCosMstart_thread6FpnGThre
1 17800 0xfe3e1ac0 libjvm.so`__1cCosMstart_thread6FpnGThre
1 17500 0xfe3e1b40 libjvm.so`__1cCosMstart_thread6FpnGThre
1 17400 0xfe3e1bc0 libjvm.so`__1cCosMstart_thread6FpnGThre
1 17400 0xfe3e1a40 libjvm.so`__1cCosMstart_thread6FpnGThre
30 17300 0xfe3e1940 libjvm.so`__1cGThreadMdo_vm_resume6Mi_i
4 17250 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x20
2 16700 libc.so.1`libc_malloc_lock libjava.so`JNU_ReleaseStringPlatformCha
4 16475 libc.so.1`libc_malloc_lock libjvm.so`__1cCosGmalloc6FI_pv_+0x20
From this output, you can see that libc_malloc_lock shows up repeatedly, which points to contention on the libc malloc lock. To improve this situation, you can switch to the multi-threaded libumem.so library. Assume you have a 32-bit WAS JVM and the default server profile. You need to stop the WAS Java process, set the LD_PRELOAD_32 environment variable, and restart WAS. For a 32-bit process, LD_PRELOAD_32 has the same effect as LD_PRELOAD, but it applies only to 32-bit processes. For a 64-bit JVM, use LD_PRELOAD_64.
bash-3.00# cd ${WAS_PROFILE_BIN}
bash-3.00# ./stopServer.sh server1
bash-3.00# LD_PRELOAD_32=/usr/lib/libumem.so ./startServer.sh server1
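If you manage a mix of 32-bit and 64-bit JVMs, a small helper can pick the variable name for you. This is only a sketch: preload_var and UMEM_LIB are hypothetical names, not part of WAS or Solaris, and you still have to supply a libumem path that matches the bitness of the JVM.

```shell
# Print the preload variable name for a JVM of the given bitness (32 or 64).
# preload_var is a hypothetical helper name used only for illustration.
preload_var() {
    case "$1" in
        32) echo LD_PRELOAD_32 ;;
        64) echo LD_PRELOAD_64 ;;
        *)  echo "usage: preload_var 32|64" >&2; return 1 ;;
    esac
}

# Hypothetical usage (UMEM_LIB is a placeholder for the libumem path
# that matches the bitness of your JVM):
#   env "$(preload_var 32)=$UMEM_LIB" ./startServer.sh server1
```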
You can use the pldd command to verify that the WAS process is indeed started with libumem. The output of pldd will report libumem in one of the lines.
bash-3.00# pldd <WAS_PID>
Now, while running some user load, you can examine your new WAS process with libumem using the plockstat command again.
bash-3.00# plockstat -H -p <WAS_PID>
^C
Mutex hold
Count nsec Lock Caller
-------------------------------------------------------------------------------
1 1317400 libumem.so.1`umem_cache_lock libumem.so.1`umem_update_thread+0x298
1 207200 libumem.so.1`vmem_nosleep_lock libumem.so.1`vmem_populate+0x204
2 126950 0xfe2e1980 libjvm.so`__1cCosRpd_suspend_thread6Fpn
2 106550 0xfe2e1840 libjvm.so`__1cCosRpd_suspend_thread6Fpn
2 105800 0xfe2e18c0 libjvm.so`__1cCosRpd_suspend_thread6Fpn
2 87500 0xfe2e1880 libjvm.so`__1cCosRpd_suspend_thread6Fpn
2 35250 0x41840 libumem.so.1`umem_cache_alloc+0x1f4
1 31200 0x459c0 libumem.so.1`umem_cache_alloc+0x1f4
2 30850 0xfe2e18c0 libjvm.so`__1cGThreadMdo_vm_resume6Mi_i
2 28050 0x364a8 libumem.so.1`vmem_alloc+0x188
1 26300 0x4e340 libumem.so.1`umem_cache_alloc+0x1f4
2 25950 0x44380 libumem.so.1`umem_cache_alloc+0x1f4
1 25100 0x459c0 libumem.so.1`umem_cache_alloc+0xdc
169 24936 libumem.so.1`vmem0+0x30 libumem.so.1`vmem_alloc+0x1f4
1 24700 0x4e8c0 libumem.so.1`umem_cache_alloc+0x1f4
1 24700 0x44340 libumem.so.1`umem_cache_alloc+0x1f4
2 23300 0x4eec0 libumem.so.1`umem_cache_alloc+0x1f4
1 23200 0x453c0 libumem.so.1`umem_cache_alloc+0xdc
13 22876 0x41940 libumem.so.1`umem_cache_alloc+0xdc
2 21850 0x45980 libumem.so.1`umem_cache_alloc+0x1f4
2 20200 0xfe2e1880 libjvm.so`__1cGThreadMdo_vm_resume6Mi_i
1 19600 0x46780 libumem.so.1`umem_cache_free+0xfc
2 19100 0x4a8c0 libumem.so.1`umem_cache_alloc+0xdc
1 17700 0x46340 libumem.so.1`umem_cache_alloc+0xdc
2 17700 0x448c0 libumem.so.1`umem_cache_alloc+0xdc
4 17625 0x467c0 libumem.so.1`umem_cache_alloc+0x1f4
1 17600 0x47940 libumem.so.1`umem_cache_alloc+0xdc
1 17600 0x453c0 libumem.so.1`umem_cache_alloc+0xdc
As you can see, libc_malloc_lock no longer appears in the output, so the malloc lock contention is gone. Repeat this for your other WAS instances and their Java processes. This should improve your application performance and overall system efficiency. With libumem, you can also gain performance for applications that depend heavily on socket communications.
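When you repeat these checks across several instances, reading the pldd and plockstat output by eye gets tedious. The two functions below are a minimal sketch of how the checks could be scripted. The names has_libumem and lock_holds, the AWK variable, and the file name in the usage comments are my own assumptions. Both functions read text on stdin that you captured from pldd or plockstat, and lock_holds assumes the four-column "Count nsec Lock Caller" layout shown above.

```shell
# Succeeds if the pldd output on stdin lists a libumem library.
has_libumem() {
    grep -q 'libumem\.so'
}

# Sum the Count column for rows whose Lock column contains the given name.
# Header and separator lines are skipped because their first field is not
# a number. Set AWK=nawk if the default awk on your system rejects -v.
lock_holds() {
    "${AWK:-awk}" -v name="$1" '
        $1 ~ /^[0-9]+$/ && index($3, name) > 0 { total += $1 }
        END { print total + 0 }
    '
}

# Hypothetical usage on the Solaris host (WAS_PID and after.txt are placeholders):
#   pldd "$WAS_PID" | has_libumem && echo "libumem is loaded"
#   plockstat -H -e 10 -p "$WAS_PID" > after.txt 2>&1
#   lock_holds libc_malloc_lock < after.txt
```

Running lock_holds against output saved before and after the switch gives you one number per lock to compare, rather than two long listings.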


