Co-locating multiple instances of WebSphere on scalable Sun servers
Using the SPECjAppServer 2004 benchmark to test IBM WebSphere Application Server (WAS) v6.1 and v7.0, we learned how many WAS instances are needed to drive certain Sun servers to full utilization. (Note: I use "processors" here to mean the logical processors reported by the psrinfo command on Solaris.) The results were:
- On a T2000 (1 CPU, 8 cores with 4 threads each, 32 processors), 1 WAS instance
- On a T5120/T5220 (1 CPU, 8 cores with 8 threads each, 64 processors), 2 WAS instances
- On a T5140/T5240 (2 CPUs, 8 cores with 8 threads each, 128 processors), 4 WAS instances
- On a T5440 (4 CPUs, 8 cores with 8 threads each, 256 processors), 7 WAS instances
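The processor counts in the list above follow from CPUs x cores x threads. As a rough sketch (the table layout and function names are my own, purely for illustration), the arithmetic and the resulting processors-per-instance ratio can be checked like this:

```python
# Each entry: (server, CPUs, cores per CPU, threads per core, WAS instances
# needed for full utilization). Values are taken from the list above.
SERVERS = [
    ("T2000", 1, 8, 4, 1),
    ("T5120/T5220", 1, 8, 8, 2),
    ("T5140/T5240", 2, 8, 8, 4),
    ("T5440", 4, 8, 8, 7),
]


def logical_processors(cpus, cores, threads):
    """Logical processors as psrinfo would count them."""
    return cpus * cores * threads


def summarize(servers=SERVERS):
    """Return (name, processors, instances, processors per instance) rows."""
    rows = []
    for name, cpus, cores, threads, instances in servers:
        procs = logical_processors(cpus, cores, threads)
        rows.append((name, procs, instances, procs / instances))
    return rows


if __name__ == "__main__":
    for name, procs, instances, per_instance in summarize():
        print(f"{name:<12} {procs:>4} processors, {instances} instances, "
              f"{per_instance:.1f} processors per instance")
```

In these results, each WAS instance ends up covering roughly 32 to 37 logical processors, which is a handy rule of thumb when estimating instance counts for similar CMT systems.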
We have real-life experience with top Fortune 500 customers who run CMT servers as well as Sun Fire 6900, 25K, and SPARC Enterprise M5000-M9000 servers, where hundreds of JVMs are deployed. Many of our customers use these large systems to consolidate many smaller servers into just a couple, which makes their data centers more manageable. A key difference between these larger enterprise-class systems and rack-mountable servers is vertical scaling: they can have as much as 4 TB of memory and 64 quad-core CPUs (e.g., the M9000). Yes, that's 4 terabytes and 512 processors (64*4*2)! The challenge for us is to help our customers make use of such physical resources, knowing that the hardware capacity often far exceeds what a single software instance can use. Thus, we need to deploy many instances of the software to raise system utilization.
Here are the key things to consider in order to achieve a reliable and scalable environment for co-locating multiple WebSphere instances on scalable Sun servers:
- JVM Ergonomics: I discussed this in my previous blog entry
- JVM tuning with proper heap sizing and an appropriate GC policy: You should fine-tune each JVM instance while keeping all co-existing instances on the system in mind. We describe JVM tuning in our Redbook [Chapter 9.4 (pp. 346-376)] and in the IBM Impact 2008 presentation we gave on WAS Performance Management on Solaris
- Partitioning the system with Dynamic System Domains on enterprise-class servers, Logical Domains on CMT servers, Solaris Containers on all Solaris 10 systems, and xVM (in the near future), as described in our Redbook Chapter 5
- Isolating Processor and Memory Consumption: see Dileep Kumar's blog entry, our Redbook [Chapter 5.4 (pp. 142-149)], and the IBM Impact 2008 presentation we gave on WAS Deployment Best Practices on Solaris
- Isolating Interrupt Processing: psradm -i
- Distributing Network Load: ifconfig
- Increasing certain kernel parameters, such as rlim_fd_cur for the file descriptor limit and segkpsize for the kernel stack segment size
- If you have hundreds of JVMs on a system, look into segkpsize. In Solaris 10, each LWP uses 32 KB of kernel stack virtual address space: 24 KB of stack plus an 8 KB redzone. The default kernel stack segment size is 2 GB for 64-bit kernels, so the limit is reached at 64K LWPs. You can raise the limit by setting segkpsize in /etc/system.
It is specified in units of 8 KB pages; for example, to set the limit to 4 GB, use:
set segkpsize = 0x80000
- Leverage Dynamic Caching services (DynaCache) in WAS
You should also make sure that other settings, such as the WAS thread pool settings, are sized properly across all instances collectively. If you take an inventory of all configuration settings and tuning parameters of the co-existing software and choose settings that do not exceed the physical capacity, your systems will be more reliable and scalable. We have documented most of these best practices in our Redbook.
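To make the segkpsize arithmetic from the list above concrete, here is a minimal sketch that uses only the figures quoted there (32 KB of kernel stack address space per LWP, 8 KB pages). The function names are my own, and the calculation assumes LWP stacks are the only consumer of the segment, so treat the results as an upper bound rather than an exact limit.

```python
KB = 1024
GB = 1024 ** 3

PAGE_SIZE = 8 * KB    # segkpsize is expressed in 8 KB pages
LWP_KSTACK = 32 * KB  # 24 KB stack + 8 KB redzone per LWP


def max_lwps(segment_bytes):
    """Upper bound on LWPs that fit in a kernel stack segment of this size."""
    return segment_bytes // LWP_KSTACK


def segkpsize_pages(segment_bytes):
    """Convert a desired segment size in bytes to a segkpsize page count."""
    return segment_bytes // PAGE_SIZE


def etc_system_line(segment_bytes):
    """Format the /etc/system entry for the desired segment size."""
    return f"set segkpsize = {hex(segkpsize_pages(segment_bytes))}"


if __name__ == "__main__":
    for size_gb in (2, 4):
        size = size_gb * GB
        print(f"{size_gb} GB: up to {max_lwps(size)} LWPs, "
              f"{etc_system_line(size)}")
```

For the 4 GB case this reproduces the 0x80000 value shown earlier, and for the 2 GB default it gives the 64K LWP ceiling mentioned in the text.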


