Yufei Zhu's Weblog
Tuesday Aug 22, 2006
CoolStack
It includes an AMP package (Apache, PHP and MySQL) and individual MySQL, Squid, PHP, Perl and Tomcat package.
It would make life easier to install SAMP on Solaris. It also includes README in each package explaining what flag and option have been used to compile each package in case you need to recompile.
Try it out and let us know how you feel about it.
Posted at 12:54PM Aug 22, 2006 by yufei in Sun | Comments[0]
Tuesday Aug 08, 2006
MySQL InnoDB performance tuning on T2000
MySQL is one of the most popular opensource database. It supports several pluggable storage engines,
MyISAM manages non-transactional
tables and it is good performer especially on MySQL 5.1.
The InnoDB and BDB
storage engines provide transaction-safe tables. They were acquired by
Oracle not so long ago. The MEMORY
storage engine which formerly known as heap storage engine provides
in-memory tables.The
InnoDB supports both file system and
raw disks on Solaris. It does row level locking.MySQL InnoDB allocates a few types of memory buffers out of the heap. The main buffer that is sized by innodb_buffer_pool_size is used to cache data and indexes of the tables. Each buffer block is 16k bytes. There is a dedicated thread that does pread to bring in the data to the buffer from the disk. There are also other types of memory buffers like sort buffer, log buffer, query_cache buffer. This is how to monitor MySQL statistics to get idea of whether need to increase those buffer size.
mysql> show innodb status
It is important to understand how MySQL executes select queries by explain. That can help optimize the select performance.
mysqladmin extended-status
MySQL is single process multi-threaded. For each new user connection, there is 1 threaded created from MySQL. We
can control the number of concurrent threads that can run simultaneous by limiting innodb_thread_concurrency. Normally on T2000 we set it equal to the number of cpus.
There is no dedicated log write threads. There is one io thread that wakes up frequently to do group writes but in our evaluation I rarely see it fires. Each user thread could issue pwrite when the transaction is commited. And it is serialized! The default innodb_flush_log_at_trx_commit is 1, meaning that after each transaction is committed, there is a pwrite first, then followed by fsflush. If set innodb_flush_log_at_trx_commit to 0, the performance would improve quite dramatically, pwrite and fsflush is only done every one second instead of after each transaction commit. But the risk is that customer might lose one second of data in case of power loss. innodb_flush_log_at_trx_commit = 2 would issue pwrite after each commit but fsflush every 1 second. If log disk response time is not fast enough, it could become a huge performance bottleneck. It is very important to size enough IOPs for log disks. We have seen performance being doubled or tripled right away by using a few more disks for the log files. For performance purpose, if you use filesystems to store datafiles and log files, you should use forcedirectio.
This is an example of how to use Solaris Volume Manager(svm) to create a soft partition and then set up the logfile links to this location
metadb -d -f c1t0d0s7 c1t1d0s7
metadb -a -f c1t0d0s7 c1t1d0s7
Then create a concatenation
metainit d10 1 4 c1t0d0s0 c1t1d0s0 c1t2d0s0 c1t3d0s0
You can stop here and do newfs on /dev/md/rdsk/d10 or continue to create a soft partition(for example 10G) on d10
metainit d101 -p d10 10G
After that
newfs /dev/md/rdsk/d101There is innodb_flush_method. The default is performing very well on Solaris which uses fsync() to flush both the data
mount -o forcedirectio /dev/md/dsk/d101 /logs
and log files.
When MySQL starts, there are 10 threads created. Thread 1 handles network connections and create new threads for new user connections. Then there are 4 io threads. There is 1 log write thread doing group commits once a while. There is one insert thread, InnoDB stores data physically by its primary key order and insertion would not cause random reads on the disk, but for the non unique secondary index insert, it could cause a lot of random reads. So to avoid that, InnoDB checks whether the secondary index page is in the buffer pool. If it is, InnoDB does the insertion directly to the index page. If the index page is not found in the buffer pool, InnoDB inserts the record to a special insert buffer structure. The insert buffer is kept so small that it fits entirely in the buffer pool, and insertions can be done very fast. Periodically, the insert buffer is merged into the secondary index trees in the database. Often it is possible to merge several insertions to the same page of the index tree, saving disk I/O operations. There is another thread that I am not sure its purpose. Thread 6 is handling rollback. Thread 7 and 8 are monitoring thread. Thread 9 is the master thread. Thread 10 is the signal handler thread
Here are a few steps of how to tune MySQL InnoDB performance on T2000 or other Solaris platforms
-
Choose at least MySQL version 5.0.22. In my experience MySQL 4.1 has more user locks inside hence it doesn't scale well. MySQL 5.1 beta is even better than 5.0.
-
If your database size is more than 4GB, using MySQL 64bit is better because you can use more than 4GB memory for innoDB database buffers.
- MySQL
5.0.22 64bit from MySQL website is a pretty good performer.
But in case if you want to compile it on your own on Solaris, You can
refer a good whitepaper
from Jenny Chen from Sun Microsystem. Compile time can be reduced
to a few minutes using parallel build "dmake"
in sun
studio 11
- You can use libumem or libmtmalloc to get scalable memory allocation performance on multi-threaded MySQL. The way to do this is before mysql is started, setenv LD_PRELOAD_64 /usr/lib/sparcv9/libumem.so. This example is for 64bit MySQL
- Always keep an eye on MySQL internal statistics. Normally you can
look at reads versus hits and tell if you need to increase certain
buffer size
- mysql> show innodb status;
- mysql> show full processlist;
- mysqladmin extended-status
- Making sure there is no io bottleneck. Collect "iostat -xtcnz 5" on the system and
look at that what is the disk response time (column asvc_t in
miliseconds) and what is the average outstanding ios(column actv).
Making sure that you have enough IOPs for the logfile disks and
datafile disks.
- On solaris systems, you can use "prstat
-Lmc" to monitor the active processes on the systems.
- MySQL startup options are very important. Following is an example of /etc/my.cnf I use to evaluate MySQL sysbench OLTP performance on T2000
#
#MySQL configure
#
# The MySQL server
[mysqld]
port=3306
socket=/tmp/mysql.sock
user=root
datadir=/data
basedir=/usr/local/mysql
max_connections=1600
max_connect_errors=10
table_cache=2048
max_allowed_packet=1048576
binlog_cache_size=1048576
key_buffer_size=16777216
max_heap_table_size=67108864
sort_buffer_size=65536
join_buffer_size=1048576
query_cache_size=26214400
thread_cache=16
thread_concurrency=32
thread_stack=64K
ft_min_word_len=4
default_table_type=MYISAM
transaction_isolation=REPEATABLE-READ
tmp_table_size=64M
skip-locking
skip-concurrent-insert
server-id=1
#######below is for innodb############
innodb_status_file=0
innodb_data_file_path=ibdata1:100M:autoextend
innodb_buffer_pool_size=1800M
innodb_additional_mem_pool_size=20M
innodb_log_file_size=200M
innodb_log_files_in_group=2
innodb_log_buffer_size=80M
innodb_flush_log_at_trx_commit=1
innodb_lock_wait_timeout=300
innodb_max_dirty_pages_pct=90
innodb_thread_concurrency=32
Posted at 05:27PM Aug 08, 2006 by yufei in Sun | Comments[0]
Monday Aug 07, 2006
MySQL MyISAM performance tuning on T2000
The default storage engine of MySQL is MyISAM. Unlike InnoDB, it stores each MyISAM table in three files, the schema file(.frm), the data file(.MYD) and the index file(.MYI). It only supports filesystem. It does table level locking.
Compared to InnoDB, MyISAM doesn't have its own data buffer cache, it only has index buffer cache, the key buffer (variable
key_buffer_size).
We need to use file system buffer cache for data cache for MyISAM
tables.Here are a few steps of how to tune MySQL MyISAM performance on T2000 or other Solaris platforms
-
There is a very good performance feature in MySQL 5.1 beta that allows MyISAM using mmap memory instead of malloced buffers. If you experienced very high mutex contention in earlier MySQL releases, you can get a huge performance improvments. How do you know if you have this problem?
- If you are using MySQL 5.0 or earlier version, during the test
you can collect "lockstat sleep 1" and if you see ufs_lockfs_begin and
ufs_lockfs_end in the callers column.
You are likely to get the performance boost by going to the latest 5.1.
You need to have "myisam_use_mmap=1"
in /etc/my.cnf to be able to use this feature. This is an example of
lockstat output that has the performance issue.
Adaptive mutex spin: 140294 events in 1.145 seconds (122505 events/sec)
Count indv cuml rcnt spin Lock Caller
-------------------------------------------------------------------------------
31341 22% 22% 0.00 9 0x600052c4d10 ufs_lockfs_end+0x70
30952 22% 44% 0.00 10 0x600052c4d10 ufs_lockfs_begin+0xe4
- You can use libumem or libmtmalloc to get scalable memory allocation performance on multi-threaded MySQL. The way to do this is before mysql is started, setenv LD_PRELOAD_64 /usr/lib/sparcv9/libumem.so. This example is for 64bit MySQL. For 32bit MySQL you can do setenv LD_PRELOAD /usr/lib/libumem.so
- Since we have to use filesystem buffer cache to cache data for MyISAM tables, it is important to tune segmap_percent in /etc/system. The default segmap_percent is 12% on solaris, that means you can only get to use 12% of the system memory for filesystem buffer cache. It depends on your database size, but setting it too high could lead low memory on the system causing excessive paging. In our case, we set it to 80%. In /etc/system, set segmap_percent=80. You need to reboot the system to make it effective.
- key
cache is important feature for MyISAM to cache index blocks
that allows multiple threads to access key buffer simultaneously. You
can check the performance of key cache by doing "show status" and look
at what is the ratio of key_reads and key_read_requests. It should be
less than 1%. Otherwise, you might need to increase
key_buffer_size. The maximum for key_buffer_size is 4G however you can
create multiple key cache (The size limit of 4GB applies to each cache
individually, not as a group.) That will also help on the situation
where access to one key cache structure does not block access to the
other key cache
- Making sure there is no io bottleneck. Collect "iostat -xtcnz 5" on the system and
look at that what is the disk response time (column asvc_t in
milliseconds) and what is the average outstanding ios(column actv).
- On solaris systems, you can use "prstat
-Lmc" to monitor the active processes on the systems.
Posted at 12:43PM Aug 07, 2006 by yufei in Sun | Comments[0]