Introduction
DBT2 is a TPC-C like benchmark tool supporting MySQL. The results of a test run include transactions per minute, CPU utilization, I/O activity, and memory utilization. The DBT2 test kit can be downloaded from the DBT sourceforge site. The purpose of this document is to show how to use the virtualization technology: Solaris container to scale-out multiple mysql instances on the Sun's CoolThreads server.
Hardware and Software configurations
Sun SPARC T5240 Server: 2x8-core UltraSPARC T2 Plus(1165 Mhz,128 Threads) Processor; 64GB RAM; 2x146GB Internal Disk
Database: MySQL 5.1.30
Operating System: Solaris Express Community Edition snv_98 SPARC
Test Case 1: Single MySQL server on T5240 in 32 warehouse DBT2 Test
1. Test Results
Threads# Transaction/Minutes CPU Utilization
2 2111.96 1/0/99
4 4734.78 3/1/96
8 8742.33 5/1/93
16 14608.25 10/3/87
32 22215.11 18/5/77
64 18072.00 10/7/83
2. Performance Problem in the Test
In the DBT2 Test, the size of dbt2 with 32 warehouse is around 3G which can be entirely fit into the innodb_buffer_pool, so the single mysql testing on the T5240 did not show much disk I/O . However, the CPU utilization from vmstat/mpstat showed only around 15% with >128 concurrent users connections, MySQL can not vertically scale up the 128 HW threads of the T5240 server in the DBT2 test.
MySQL: innodb_thread_concurrency setting
In theory innodb_thread_concurrency was described as variable to limits number of operating system threads that can run concurrently in the Innodb kernel, the rest will wait in the FIFO queue until the slot is free for execution. This is an option worth experimenting with in the DBT2 test. Setting this value to be 0 disabling thread concurrency checking to allows InnoDB to create as many threads as it needs, in the test, it showed around 5-7% performance gain comparing to set it as 64
Threads# TPM TPM
256 8148.16 8580.10
512 7958.08 8523.50
1024 7642.54 8185.67
Solaris Fixed Priority Scheduler Class For MySQL
There are several scheduling classes available on Solaris: TS(Timeshare), IA (interactive), FSS (fair-share scheduling), FX (fixed-priority), SYS (system), and RT (real-time). Using FX scheduler class for MySQL process can optimize performance than the default time-share (TS) scheduler, by reducing context switching and providing longer quantums on CPU. Switch to fixed priority scheduling mode using command: priocntl -s -c FX <mysql_pid>
You can check the FX scheduler class is used by executing the command:
priocntl -d <mysql_pid>
In the DBT2 test, replacing the TS scheduler to be FX scheduler brings another 1-2% performance improvement.
Threads# Transaction/Minutes Transaction/Minutes
256 8580.10 8694.52(FX)
512 8523.50 8597.90(FX)
1024 8185.67 8373.72(FX)
3. Conclusion: The MySQL 5.1.30 GA can not fully utilize the 128 HW threads on T5240 server, tunings of the innodb's server parameter innodb_thread_concurrency and switching to use Solaris's FX scheduler for MySQL can improve MySQL performance up to 10%, but there is still lots of idle CPU, we would need horizontal MySQL scale-out solutions on the T5240 server by using Solaris containers as part of Sun's virtualization technologies.
Test Case 2: Eight MySQL instances on Solaris containers in 4 warehouse DBT2 Test
Considering the current scalability limit of running single mysql instance on the 128 threads T5240 server, we re-run the DBT2 test with 4 warehouse database using eight MySQL instances concurrently running under eight Solaris zones. The installation and configuration of the non-global zone for the MySQL instance is as bellow:
1.Create the zone's root directory
#mkdir /zone1
2. Create, install and boot a local zone by executing the command:
# create_zone.sh mysqlzone1 /zone1
3. Since the /usr directory on the mysqlzone was inherite from the global zone, we need to first add a new “subdirectory” structure as: /usr/local to the mysqlzone1
#mkdir /zone1/mysqlzone1/local
#chmod 700 /zone1/mysqlzone1/local
#zonecfg -z mysqlzone1
zonecfg:mysqlzone1> add fs
zonecfg:mysqlzone1:fs> set dir=/usr/local
zonecfg:mysqlzone1:fs> set special=/zone1/mysqlzone1/local
zonecfg:mysqlzone1:fs> set type=lofs
zonecfg:mysqlzone1:fs> add options [rw,nodevices]
zonecfg:mysqlzone1:fs> end
zonecfg:mysqlzone1:fs> verify
zonecfg:mysqlzone1:fs> commit
4. Dedicate 16 HW threads to each local zone
# pooladm -e
# pooladm -s
# poolcfg -c 'create pset mysql1-pset (uint pset.min=16;uint pset.max=16)'
# poolcfg -c 'create pool mysql1-pool'
# poolcfg -c 'associate pool mysql1-pool (pset mysql1-pset)'
# zonecfg -z mysqlzone1
zonecfg:mysqlzone1> set pool=mysql1-pool
zonecfg:mysqlzone1> verify
zonecfg:mysqlzone1> commit
zonecfg:mysqlzone1> exit
5.Adjust memory & swapping capping for each local zone
#adjust_memcap.sh 4096m 8192m
1. Test Results
Com# zone0 zone1 zone2 zone3 zone4 zone5 zone6 zone7 CPU%
2 2222.32 2208.30 2222.08 2189.69 2233.20 2185.59 2182.81 2097.74 11/2/87
4 3776.18 3697.26 3772.33 3795.18 3541.84 3510.07 3532.50 3477 18/10/80
8 5803.43 6181.83 5825.04 6298.63 5462.73 5389.83 5693.18 4919 38/7/55
16 7921.16 8137.38 8374.60 9094.10 7661.91 7601.32 8054.97 7377 70/12/18
32 7804.09 8385.24 9370.03 7852.72 8182.36 8135.14 7595.89 7944 76/15/8
64 8029.58 8247.97 8478.72 7465.87 7598.11 7263.20 7460.07 7263 84/15/2
MySQL configurations:
[mysqld]
transaction-isolation = REPEATABLE-READ
sort_buffer_size = 32k
max_connect_errors=1000
max_connections = 1100
table_open_cache=2048
query_cache_size=0
query_cache_type=0
log_slow_queries=/usr/local/mysql/data/slow.log
long_query_time=5
innodb_log_buffer_size=64M
innodb_log_file_size=300M
innodb_buffer_pool_size=3G
innodb_thread_concurrency=0
innodb_additional_mem_pool_size=20M
innodb_support_xa=OFF
innodb_flush_method=O_DIRECT
innodb_flush_log_at_trx_commit=0
innodb_locks_unsafe_for_binlog=1
2. Known issues: With 8 mysql instances running simultaneously in 4 warehouse DBT2 Test, setting innodb_flush_log_at_trx_commit as 1 to flush mysql’s transaction log at each transaction commit could cause I/O bottleneck with internal disk(using separate slice for each zone), which can significantly slow down the MySQL performance in the DBT2 tests. By adding external storage StorageTek6120(4*73G) configured with one RAID 5 volume with two vol slices, the mysql performance with “innodb_flush_log_at_trx_commit =1” can be achieved as the following:
Test Results
Cons# zone0 zone1 zone2 zone3 zone4 zone5 zone6 zone7 CPU%
2 2176.81 2183.81 2218.22 2207.27 2248.63 2228.14 2224.86 2212.41 11/3/86
4 3673.92 3606.14 3662.45 3697.20 3672.41 3713.11 3754.81 3726 21/5/74
8 5535.90 5557.41 5537.47 5580.20 5562.48 5562.80 5715.29 5606 39/8/53
16 7459.16 7476.00 7493.31 7558.91 7530.58 7529.39 7657.34 7552 71/14/15
32 7743.88 7747.29 7777.46 7776.68 7833.00 7813.30 7926.75 7788 82/15/2
64 8142.63 8224.81 8302.57 8229.40 7973.80 7881.89 8180.98 8404 81/18/2
Summary:
1. Some MySQL database workloads cannot scale well on large number of CPU system. The single instance of MySQL server in DBT2 test on T5240 server showed that peak performance can be achieved only at around 22k TPM on Sun’s T5240 server.
2. Using Solaris containers to partitioned system, we can run multiple instances of MySQL server on a single system to maximize the MySQL & applications performance with increased CPU utilization on T5240 server. If we can separate each application and mysql instance running in each zone like our configurations in the DBT2 benchmark test, or we can use the database sharding techiques to add logic in the application level to distribute queries in different mysql instances on the Solaris containers, we can significantly improve the throughput results. On DBT2 test, the peak overall performance can be achieved at around 65k TPM on Sun's T5240 server. Check the DBT2 Test results in the mysql_t5240 Graph.