Friday Mar 20, 2009

In an earlier post, I talked about my takeaways from a JRuby on Rails on Glassfish code camp we ran. Many thanks to Astrails, who supported and led the Ruby/Rails side of the code camp. They are also hosting the code camp disk image, available for download (thank you, Boris).

I am making the code camp content available here, in the hope that it helps people get started quickly with JRuby, Rails and Glassfish. Everything is integrated with the Netbeans IDE and ready to run as a VirtualBox disk image (OpenSolaris 2008.11 guest OS), so you also get the goodies of OpenSolaris (DTrace, to name just one). Just install VirtualBox on Windows, Mac or another host OS (no further installations are needed, as all required software is already inside the image), then set up and start the disk image (see directions). You can then follow the exercises or just explore the environment.

Note that the compressed disk image is ~1.5GB (more than 4GB after decompression). I am not sure whether it would have been better to split it into a few parts. Anyway, I will see if it works for people...

Log in to the OpenSolaris image as user 'Britney' (yes, with an uppercase 'B', don't ask why...); the password is just '1'. No password is required to switch user to root.

Sunday Mar 15, 2009


Web-based applications, and Java web-based applications in particular, should scale well across multiple processors / machines. If they don't, the service cannot grow. And if they do, they should scale well on Sun's CMT servers.

A recent example I have seen is Trivnet's multi-service mobile payments platform, TRIV Platform version 3.1.1, which was recently benchmarked on an 8-core (64 virtual processors, 1.2GHz, 32GB RAM) Sun SPARC Enterprise T5220 server. This power-efficient single-CPU machine successfully handled 520 transactions per second (TPS), well above expectations, but the main point is the way it scales.

Good scaling, with linear scalability as the ultimate target, means that handling more load takes a proportional increase in the utilization of computing resources. When the proportion is 1:1, scalability is perfectly linear. The T5220 has 64 virtual CPUs, so the key to high throughput is good parallelism between application threads (or processes). Good parallelism translates directly into the ability to utilize all virtual processors.
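
One rough way to check for this kind of proportionality is to compare the CPU cost per transaction at each load step. The following is only a sketch: the function names, the 15% tolerance and the sample numbers are all made up for illustration and are not taken from the TRIV benchmark.

```python
def cpu_cost_per_tps(samples):
    """Return CPU utilization (percent) spent per TPS at each load step."""
    return [cpu_percent / tps for tps, cpu_percent in samples]


def is_near_linear(samples, tolerance=0.15):
    """True if the CPU cost per TPS stays within `tolerance` of the first step.

    `tolerance` is an arbitrary illustrative threshold, not a standard value.
    """
    costs = cpu_cost_per_tps(samples)
    baseline = costs[0]
    return all(abs(cost - baseline) / baseline <= tolerance for cost in costs)


# Hypothetical (TPS, CPU %) measurements, NOT the benchmark results.
samples = [(100, 15.0), (200, 31.0), (400, 63.0)]
print(is_near_linear(samples))
```

If the cost per transaction climbs steeply as the load grows, something (often lock contention) is eating into the parallelism.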

TRIV 3.1.1 was deployed on two application server (Weblogic 10.3) instances running on the T5220 machine, with Java 1.6.0_11 on Solaris 10 u5. The most significant tuning step was using libumem as the allocation library, which allows thread-parallel (native, not Java) memory allocation; the throughput gain was around 15%. The other was enlarging the young generation of the Java heap (NewRatio=2), as most allocations are very short-lived and full garbage collection is rarely needed. Java 1.6.0 also brings a significant improvement in GC performance (and parallelism).

See the graph below. As we keep increasing the load, more CPU is consumed, not far from linear growth. This is what we want to see when examining application scalability. This is not trivial, as internal locking might increase, but again – web applications should scale.




A typical load distribution across the virtual processors looks like the following:




We stopped at 520 TPS, when the loader machines became the bottleneck. Remember that if horizontal scaling is preferred (it was not needed here), we can always split the machine into Solaris containers or LDOMs.



Monday Feb 23, 2009

We ran a JRuby on Rails on Glassfish code camp last week. We presented and practiced with a development environment based on Glassfish V3 Prelude, running inside VirtualBox, which is freely available for all popular desktop platforms. The guest OS we used inside VirtualBox was OpenSolaris 2008.11. People installed VirtualBox on their laptops, copied a virtual disk image file that we had prepared, and ran some examples and exercises.

My main takeaways from the code camp were:

  1. Glassfish has many benefits for developers (and of course as a deployment application server). For development, it is tightly integrated with Netbeans. Netbeans also integrates very well with databases like Mysql, so you really do not need to leave the IDE: controlling Glassfish and Mysql is easily done from within Netbeans. It also has a very convenient debugger for JRuby. Ruby/Rails profilers like NewRelic support Glassfish, and there are also the built-in Java profiling tools (jconsole, jstat, jmap, jinfo, etc.). Deployment is easy, either from Netbeans, from the web-based GUI, or from the command line (Glassfish's asadmin is among the best-built command line interfaces I have experienced).

  2. DTrace capabilities in Solaris/OpenSolaris have been widely discussed before, but they continue to charm (yes, I think this is the appropriate word) developers. Its ability to observe system-wide activity, end to end and at both high and low levels, is unique. Its extensibility (based on USDT probes) enables new environments and user applications to implement new probes easily and quickly. In our code camp, we showed how to trace the whole flow with DTrace: from the DTrace-instrumented Javascript engine in Firefox 3.0, to the Solaris DTrace-instrumented Ruby (C Ruby; even from Netbeans you can switch between Ruby environments) or Java 6.0 with its built-in DTrace support, to the back-end DB, Mysql, which is easily DTraceable (see here), and Mysql 6.0 will even bring its own built-in DTrace probes. Developers of high-level environments like Rails are sometimes quite detached from what's going on behind the scenes; DTrace can provide the complete picture.

  3. Ruby on Rails is, to a large extent, about getting started quickly with end-to-end web application development. Starting from a ready-to-run virtual disk image, which contains all the components and can run under almost any OS, is well appreciated by developers. You save the downloads (except for the image itself) and the installations (except for the trivial VirtualBox installation). I will try to put our ready image (with Mysql automatically running, Glassfish V3 integrated with Netbeans 6.5, JRuby, DTrace-instrumented Ruby 1.8) on the web and refer to it in my next post. Stay tuned...

Thursday Jan 29, 2009

Most commercial applications, particularly 'request/response' based ones, can perform very well on CMT platforms, but they must be able to implement parallelism in their flow. In many cases, it is possible to deploy such applications in a way that enables more parallelism; in some cases, Solaris containers and virtual domains can help.

Tuesday Dec 09, 2008

Memory corruption incidents can be tough to handle. Solaris' libumem (an alternative allocation library) offers a debugging facility that is useful for inspecting memory when trying to detect leaks or corruption. Using libumem is recommended anyway when running a multi-threaded application (especially on a multi-processor or other multi-threaded architecture), and it can also be used just for debugging.

There are good documents describing this facility in detail (see below). I will try to give a quick-start guide here.

MDB (the modular debugger) is used as the front-end tool to retrieve the debugging data that libumem collected. It works on a core file, so one should be generated (automatically, or manually with gcore). In order to start working with the basic (and powerful) features, you should:

  • pre-load libumem library and set environment variables for debugging

  • get basic familiarity with the libumem buffer structure in debug mode

  • be familiar with a few mdb commands

Here is a short description of those 3 items:

Pre-loading and Environment Variables

Have these settings active when and where you are running your application:

Pre-load libumem:

export LD_PRELOAD=libumem.so.1

(or setenv LD_PRELOAD libumem.so.1 in csh)

Define UMEM_DEBUG and UMEM_LOGGING, for example (ksh/bash):

export UMEM_DEBUG=default

(or setenv UMEM_DEBUG default in csh)

export UMEM_LOGGING=transaction

(or setenv UMEM_LOGGING transaction in csh)
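
If you prefer not to export these variables for the whole shell session, you can set them only for the process you launch. Here is a minimal sketch in Python; the helper name umem_debug_env and the ./my_app path are hypothetical, while the variable names and values are the ones listed above.

```python
import os
import subprocess  # only needed for the commented-out launch below


def umem_debug_env(base_env=None):
    """Return a copy of the environment with the libumem debug settings added."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_PRELOAD"] = "libumem.so.1"    # pre-load libumem
    env["UMEM_DEBUG"] = "default"         # enable the debugging features
    env["UMEM_LOGGING"] = "transaction"   # enable transaction logging
    return env


# Hypothetical launch of your application with these settings only:
# subprocess.run(["./my_app"], env=umem_debug_env())
```

This keeps other programs started from the same shell running with their usual allocator.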



Buffer Structure (when using debug)

Libumem uses memory caches, each containing a set of buffers of a pre-defined size. Thus, there might be one cache for 16-byte buffers, another one for 512-byte buffers, etc. Each allocated buffer is structured this way:



Metadata (8 bytes) | User Data | Redzone (8 bytes) | Debug metadata (8 bytes)

The first 8 bytes of metadata are ignored here; we are interested in the user data, redzone and debug metadata segments.

Zooming in on the structure of these segments:


User Data

Application available memory (uninitialized memory is set to 0xbaddcafe) | '0xbb', denotes end of application buffer | Rest of the allocated buffer (uninitialized memory is set to 0xbaddcafe)

  • '0xbaddcafe' value is written to all uninitialized memory of the user data segment.

Redzone

Value of '0xfeedface' (4 bytes) | Integer value (4 bytes) from which the application allocation size can be calculated

  • The application allocation size is calculated from the last 4 bytes of the redzone (let's denote their decimal integer value by x): allocation-size = ((x - 1) / 251) - 8
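
As a worked example of this formula, here is a small sketch. The function name is hypothetical, and the sanity check assumes that a valid encoded value always has the form 251*n + 1, which is what the formula implies if the size is to come out as a whole number.

```python
def decode_allocation_size(redzone_size_word):
    """Apply allocation-size = ((x - 1) / 251) - 8 to the redzone's last 4 bytes."""
    quotient, remainder = divmod(redzone_size_word - 1, 251)
    if remainder != 0:
        # Assumption: a non-zero remainder means the redzone was overwritten.
        raise ValueError("redzone size word does not look like a valid encoding")
    return quotient - 8


# 0x69e5 is 27109 decimal: (27109 - 1) / 251 = 108, and 108 - 8 = 100
print(decode_allocation_size(0x69E5))  # 100
```

A value that fails the check is itself a hint that the redzone was written over.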

Debug metadata

Pointer to umem_bufctl_audit structure (4 bytes) | Checksum value (4 bytes)

  • We'll see in a minute that the umem_bufctl_audit structure, which includes the stack trace of the allocation, can be dumped inside mdb

  • XORing the pointer to umem_bufctl_audit (first 4 bytes) with the checksum value should result in the value 0xa110c8ed. If not, this segment is probably corrupted.
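
The XOR check is easy to do by hand, but a short sketch makes it explicit. The function name and the sample pointer value are hypothetical; the 0xa110c8ed constant and the 4-byte field sizes are the ones described above.

```python
CHECKSUM_MAGIC = 0xA110C8ED  # expected result of pointer XOR checksum


def debug_metadata_intact(bufctl_ptr, checksum):
    """Return True if the two 4-byte debug metadata words are consistent."""
    return (bufctl_ptr ^ checksum) & 0xFFFFFFFF == CHECKSUM_MAGIC


# Hypothetical pointer value, with a checksum constructed to match it.
ptr = 0x0003A1C8
print(debug_metadata_intact(ptr, ptr ^ CHECKSUM_MAGIC))  # True
print(debug_metadata_intact(ptr, 0x12345678))            # False
```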

A Few MDB Commands to start with and references to examples

Invoke mdb on a core file, simply by:

# mdb core-file

At the mdb prompt, you can:

scan allocated buffers for potential out-of-bounds writes:

> ::umem_verify

You will get a list like:

...

umem_alloc_64 2e608 clean

umem_alloc_80 2e808 1 corrupt buffer

...

Note that “_64” or “_80” are the sizes of the user data described before. Use the address in the following column for the next step.

You can then run ::umem_verify on the specific cache:

> address::umem_verify

The latter will give you the addresses of the corrupted buffers. Dump enough of the buffer to reach the pointer to the bufctl_audit structure (the X format prints 4-byte hexadecimal words, and the number before it is the repeat count):

> buffer-address/countX (e.g., > 37f88/90X)

Match the buffer structure (explained before) with the dumped data, and retrieve the pointer to the bufctl_audit structure. Then run

> bufctl_audit-ptr::bufctl_audit

If the debug data is not corrupted, you will get the buffer information, including the allocation stack trace.

See an example here; look for 'Traditional Memory Corruption'
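
The matching step can be sketched as follows, assuming you have copied the dumped 32-bit words into a list by hand. The function name and the sample values are hypothetical; the layout (0xfeedface, size word, pointer, checksum) is the one described in the buffer structure section.

```python
REDZONE_PATTERN = 0xFEEDFACE
CHECKSUM_MAGIC = 0xA110C8ED


def locate_debug_fields(words):
    """Find the redzone in a list of dumped 32-bit words and pick out the
    fields that follow it. Returns None if no redzone pattern is found,
    which may happen when an overrun has overwritten it."""
    for i, word in enumerate(words):
        if word == REDZONE_PATTERN and i + 3 < len(words):
            size_word, bufctl_ptr, checksum = words[i + 1:i + 4]
            return {
                "size_word": size_word,
                "bufctl_ptr": bufctl_ptr,
                "metadata_intact": (bufctl_ptr ^ checksum) == CHECKSUM_MAGIC,
            }
    return None


# Hypothetical dump: user data, redzone, then debug metadata.
dump = [0x41414141, 0xBADDCAFE, 0xFEEDFACE, 0x69E5,
        0x0003A1C8, 0x0003A1C8 ^ CHECKSUM_MAGIC]
fields = locate_debug_fields(dump)
print(hex(fields["bufctl_ptr"]), fields["metadata_intact"])
```

Note that user data could contain 0xfeedface by coincidence, so treat the first match with some care.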

still on out-of-bounds writes

Sometimes the allocation stack is not sufficient. To generate a core immediately after such an offending write occurs, you might try a hidden feature. It comes with a performance impact and memory overhead, so it will probably not fit all cases.

Set UMEM_DEBUG="firewall=1" UMEM_OPTIONS="backend=mmap" and run your application.

check memory status

> ::umem_status

This will help you detect modify-after-free incidents. See here

See Also

Identifying Memory Management Bugs Within Applications Using the libumem Library

Using libumem to detect modify-after-free corruptions

Using libumem to detect write-beyond-what-you-allocate errors

http://blogs.sun.com/jwadams/entry/debugging_with_libumem_and_mdb

mdb/kmdb, libumem (pdf)


This blog copyright 2009 by Amit Hurvitz