Olio is a web2.0 toolkit designed to help evāaluate the suitability, functionality and performance of various web technologies. The deployment I was looking at involved several components:
- Faban driver on a separate server, generating web requests on average every 5 second per simulated user
- Apache 2.2.9 from CoolStack 1.3.1
- PHP 5.2.6 also from CoolStack
- MySQL, initially 5.1.25 from CoolStack, later switched to 5.1.30 that I compiled myself
Apache/PHP and MySQL were both running on the same T5220, and a processor set had been created for MySQL.
Since Apache was involved, one of the things that I did is edit httpd.conf to create a CustomLog format for its access log.
LogFormat "%h %l %u %t \"%r\" %>s %b %{pid}P %X %D" rscommon
CustomLog logs/access_log rscommon
This added process ID (pid), KeepAlive status, and response time (in microseconds) to each log record. Running an experiment for a very small number of users then showed which web queries were most common and which were inherently much slower than the rest. Temporarily for this purpose I set certain KeepAlive parameters to make it easier to follow the sequence of requests from a particular user, because they'd be serviced by the same process (when using mpm_worker and configuring sufficent processes).
KeepAlive on KeepAliveTimeout 60 MaxKeepAliveRequests 0
For 10 users over 360 seconds, Faban reported 700 Operations, or 1.944 ops/sec. However the access log showed 5579 web requests, so clearly one or more Operations involved many web requests. The access_log confirmed that there were several inherently slow request types which were also quite common.
210 GET /events.php?socialEventID=... 286 GET /taggedEvents.php?tag=... 286 POST /taggedEvents.php 316 GET /index.php 3396 GET /fileService.php?cache=false&file=...
I probably wouldn't have chosen a quasi-open queuing system workload myself for performance tuning: a closed queuing system with zero think time makes it easier to identify bottlenecks. Still, having a list of candidate web queries meant that I could quickly create a simpler workload that exercised just Apache or Apache+PHP. For that I used the Apache utility ab, and created two simple pages, one static and one dynamic via phpinfo().
The motivation for this is to verify what throughput can be achieved for relatively lightweight requests. As will be seen below, there is a trap that involves a nasty interaction between tcp_time_wait_interval, KeepAlive, and the characteristics of this workload.
If KeepAlive is not used then each request creates a new connection, which gets closed at the end of the request. TCP requires a delay, which on Solaris is specified by tcp_time_wait_interval, before the port number can be reused by another connection. There are only 64k different port numbers, so once they've all been used, then a new connection has to wait for a port to become available due to expiry of tcp_time_wait_interval for a previous connection. The default on Solaris is 60000ms, which implies a sustainable connection rate of about 1000/sec. In a LAN environment, MSL (Maximum Segment Lifetime) should be less than 1 second, in which case the default can be safely reduced. This is necessary for very high connection rates.
# ndd /dev/tcp tcp_time_wait_interval 60000 # ndd -set /dev/tcp tcp_time_wait_interval 2000
Alternatively if KeepAlive is used, then each httpd process potentially can sit around idle for KeepAliveTimeout seconds before becoming available to service requests from another user. With a large user population, this means having a lot of apache processes, each with their own connection to mysqld at the backend. Careful attention need to be paid to whether there is sufficient real memory for so many httpd processes. Experience also suggests large numbers of active mysql processes can reduce efficiency.
For 1000 users, each combination failed to deliver the required performance (throughput and/or response time targets).
| KeepAlive | KeepAliveTimeout |
MaxClients | Result |
| on | 60 |
2048 | fail: response times > 5 sec |
| on | 1 | 400 | fail: 90th percentile > target |
| off | |
400 | fail: unstable response times |
The instability with KeepAlive off is particularly interesting as for majority of the run response times and throughput were looking good, when suddenly things deteroriated badly. This is reflected in the graphs generated by Faban of cpu utilisation and response time.


Before pursuing this though, I decided to investigate more deeply the inherently slow transactions. One way of doing this is to select a transaction type and run it repeatedly in a single thread. Using a single thread makes it more practical to instrument things with DTrace's pid provider, as only a single apache process need be started up. Mysqld itself is a single process, but multithreaded. Each apache process establishes its own connection to myqld. I wrote a DTrace script commandx.d to capture SQL text, response time and cpu time from mysql, which I found very useful. [Caveat: it needs to be run using the same data model as the process its observing, either -32 or -64.] DTrace scripts tend to grow organically, as you hack, modify, adapt, enhance, and expand them according to need. This one needs further work to capture stmt ids returned from PREPARE commands.
There are a few other techniques I have found useful for MySQL. One is to capture microstate accounting and other resource information from /proc pseudo-filesystem at start and end of an experiment and output the difference via a small utility I wrote, pidusage. Mysqld maintains a set of global variables about its internal activity that can be queried. To compute the differences between two samples I use the nawk script mysqldiff. Putting it all together the resulting microbenchmark script looks something like this:
mpid=`pgrep -x mysqld` mysql -B -e 'show global status;' >sample1 pidusage -s -p$mpid ab -c 1 -n 100 "http://localhost:8080/events.php?socialEventID=..." pidusage -g -p$mpid mysql -B -e 'show global status;' >sample2 mysqldiff sample1 sample2
For DTracing mysqld:
commandx.d -64 -p $mpid -o commandx.log
And for starting a standalone single apache process:
httpd -k start -X
Combining all this generates a wealth of information, but I'll quote selectively here. 100 queries took 15.908s, and according to ab the mean was 159.077ms, so that's consistent. From microstate accounting, mysqld used 13.084s cpu time so the workload is cpu-bound (the database fits in memory). Statistics from mysqld give a clue about the cpu time: 100 web queries resulted in 500 select statements, which in total read 439800 rows.
Com_admin_commands 100 Com_begin 100 Com_commit 100 Com_select 501 Com_update 100 Handler_read_first 100 Handler_read_key 418800 Handler_read_next 22600 Handler_read_rnd_next 417170 Innodb_buffer_pool_read_requests 934231 Innodb_rows_read 439800 Questions 803
And this is what commandx captured at mysqld level for a typical events.php query (formatted to make the SQL text more readable). Note that all times are in microseconds (with Time referring to elapsed time since the script started logging).
Time Tid RespTM CpuTM Type Len SQL
10070946 6487 51 36 14:PING 0
10071118 6487 123550 123500 03:QUERY 194 select tag, refcount as count
from SOCIALEVENTTAG
where socialeventtagid in
( select socialeventtagid
from SOCIALEVENTTAG_SOCIALEVENT
where socialeventid = '2086')
and refcount > 0
order by tag ASC
10195907 6487 1005 974 03:QUERY 84 select count(username) as count from PERSON_SOCIALEVENT where socialeventid = '2086'
10197179 6487 747 706 03:QUERY 275 select title,description,submitterUserName,imagethumburl,literatureurl,telephone,timezone,
eventtimestamp,street1,street2,city,state,zip,country,latitude,longitude,summary
from SOCIALEVENT as se, ADDRESS as a
where se.socialeventid = '2086' and se.ADDRESS_a...
10199070 6487 746 724 03:QUERY 68 select username from PERSON_SOCIALEVENT where socialeventid = '2086'
10209447 6487 97 73 03:QUERY 17 start transaction
10209761 6487 594 568 03:QUERY 110 update COMMENTS_RATING set comments='', ratings=''
where username='' and socialeventid='2086' and commentid=''
10210475 6487 1356 1329 03:QUERY 113 select commentid,username,comments,ratings,created_at,updated_at
from COMMENTS_RATING where socialeventid = '2086'
10221075 6487 81 65 03:QUERY 6 commit
So a single query dominates. Prior to MySQL 6.0.x, there is a known issue around subqueries not being processed optimally. In this case there is a workaround, which is to use a join instead of a subquery:
select tag, refcount as count from SOCIALEVENTTAG a, SOCIALEVENTTAG_SOCIALEVENT b where a.socialeventtagid = b.socialeventtagid and b.socialeventid = '2086' and a.refcount > 0 order by tag ASC
I chose to use super-smack to test the two versions in isolation, before making changes to the PHP source code. Super-smack provides a convenient facility for plugging keys selected randomly from a dictionary into SQL commands. This makes for a more realistic workload. Since I had already turned off mysqld's Query Cache, this wasn't strictly necessary, but I chose to do it anyway. [Aside: I made a few changes to the super-smack source that I downloaded to clean up output formatting, ensured it used an MT-hot malloc, and replaced calls to random() with a thread-safe random number generator. This last change required passing an additional parameter for per-thread state around the callstack.]
$ super-smack -D. soc002.smack 1 100 Query Barrel Report for client smacker connect: max=0ms min=-1ms avg= 42ms from 1 clients Query_type num_queries max_time min_time q_per_s soc002 100 124 124 8.05 $ super-smack -D. soc002a.smack 1 100 Query Barrel Report for client smacker connect: max=0ms min=-1ms avg= 41ms from 1 clients Query_type num_queries max_time min_time q_per_s soc002a 100 0 0 681.73
The next problem query came from index.php, and took about 99ms.
select count(*) as count From SOCIALEVENT where eventtimestamp >= CURRENT_TIMESTAMP
The problem here is that almost all 18000 records in SOCIALEVENT have an eventtimestamp in the future. This part of the application looks a bit dodgy since it appears to be involved in pagination, in which links are provided for the first 10 pages. My suggestion here is to rely on count(*) of all records being a special case, and subtract the few records older than CURRENT_TIMESTAMP. Its still expensive relative to most queries though.
select count(*) as count From SOCIALEVENT select count(*) as count From SOCIALEVENT where eventtimestamp < CURRENT_TIMESTAMP$ super-smack -D. cnt001.smack 1 100 Query Barrel Report for client smacker connect: max=0ms min=-1ms avg= 47ms from 1 clients Query_type num_queries max_time min_time q_per_s cnt001 100 100 100 9.95 $ super-smack -D. cnt002.smack 1 100 Query Barrel Report for client smacker connect: max=0ms min=-1ms avg= 47ms from 1 clients Query_type num_queries max_time min_time q_per_s cnt002a 100 39 39 24.03 cnt002b 100 0 0 24.03
The third change is a bit subtle, but here's the original query:
select socialeventid,title,summary,imagethumburl,createdtimestamp,eventdate,submitterusername from SOCIALEVENT where eventtimestamp >= CURRENT_TIMESTAMP order by eventdate asc limit 0,10
The issue here is that the order clause sorts by eventdate, not eventtimestamp. For this application, eventdate can actually be derived from eventtimestamp (via DATE function), and eventtimestamp has an index that facilitates getting records in the right order. It makes more sense to order by eventtimestamp.
Ideally I would have rerun the original experiment at this point, but my changes had been applied to a version of the application just prior to transaction start/commit being inserted. So temporarily I just used the modified older version, but note that it doesn't have the explicit transaction logic. This should be a relatively minor thing, as by default each SQL statement is a transaction when using InnoDB engine in MySQL. Before simulating 500+ users, I ran a smaller test to verify from the apache access_log that several of the common requests were now inherently faster. Making the mysql component more efficient also shifts the relative cpu consumption between mysql and apache. Rather than manually adjusting the number of cpus in a processor set, I deleted them and let the OS work it out. Another decision was to run with KeepAlive off, as performance was less stable with it enabled. [Maybe in the future it would be worth hacking the apache source to allow subsecond KeepAliveTimeout ?]
Cranking up the number of users, its now passing all performance criteria at 1800 users. There are a few signs of it approaching instability with the run queue occasionally jumping. At its busiest the system overall is approaching 90% busy, so if throughput is to be increased much further, further attacks will need to be made on the components consuming the majority of the cpu time. Some of the requests are still quite slow (~1 second) due to image processing but I'll leave that to another blog entry.
| UIDriver metric: | 344.169 ops/sec |
| Driver start: | Thu Nov 06 03:35:52 PST 2008 |
| Driver end: | Thu Nov 06 03:43:03 PST 2008 |
| Total number of operations: | 123901 |
| Pass/Fail: | PASSED |
Response Times (seconds)
| Type | Avg | Max | 90th% | Reqd. 90th% | Pass/Fail |
|---|---|---|---|---|---|
| HomePage | 0.317 | 4.511 | 0.500 | 1.000 | PASSED |
| Login | 0.292 | 2.429 | 0.400 | 1.000 | PASSED |
| TagSearch | 0.222 | 5.393 | 0.500 | 2.000 | PASSED |
| EventDetail | 0.108 | 4.104 | 0.200 | 2.000 | PASSED |
| PersonDetail | 0.169 | 3.640 | 0.380 | 2.000 | PASSED |
| AddPerson | 1.390 | 6.814 | 2.040 | 3.000 | PASSED |
| AddEvent | 1.468 | 7.117 | 2.140 | 4.000 | PASSED |