Wednesday December 10, 2008
restricting MySQL memory with Solaris resource controls and rcapd
Now that OpenSolaris 2008.11 is out, that means the next iteration from the OpenSolaris Web Stack project, a.k.a. Sun Web Stack, is available! I'll post another blog or two on this soon.
One of the underutilized features of Solaris (IMHO) is its resource management capability. Generally speaking, people know zones have resource controls, but they may not be aware that resource controls have been in Solaris for quite a while, since the Solaris 9 updates. In other words, you don't need zones to use resource management.
Recently a customer encountered a memory leak bug with MySQL. We'll obviously fix that, but getting it into the patch cycle and tested will take a little while. In the interim, we needed a solution to keep MySQL from leaking so much that it affects the system. Enter Solaris Resource Management. With a bit of experimentation, as expected, I found that I could effectively limit the resident memory with rcapd and the max address space with generic resource controls.
Assuming the user mysql, which is the default in the version we ship in OpenSolaris Web Stack, here's what I set (I'm running OpenSolaris build 101, but expect similar behavior on S10U6):
# projadd user.mysql
# projmod -s -K "rcap.max-rss=100MB" -K "process.max-address-space=(priv,100MB,deny)" user.mysql
# rctladm -e syslog=WARNING process.max-address-space
This will limit the max-rss (resident memory, via rcapd, which needs to be enabled: svcadm enable rcap) and the process max address space to 100MB for the user mysql (by configuring the user's default project). In practical terms, prstat showed the memory usage a bit lower, probably due to how the accounting is done for shared objects.
The rctladm command tells Solaris to syslog a warning (on a global basis) if the process tries to exceed the configured memory amount. Note that because I'm using deny on the max-address-space, it can't get any more memory anyway. Solaris will just act like there's no more memory available when the process tries to allocate some. In this case, we may want to restart the MySQL service through a cron job that watches the syslog, though we'd have to handle that carefully.
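Here is a minimal sketch of what such a cron check could look like. It is not something I ran in this experiment, and it rests on a few assumptions: the warnings end up in /var/adm/messages, each warning line mentions the control name process.max-address-space, and the state file path is a made-up location. The helpers only report a decision; they never restart anything themselves.

```shell
#!/bin/sh
# Hypothetical cron helper: report "restart" only when NEW rctl warnings
# have appeared since the previous run.
# Assumption: each warning line contains the string process.max-address-space.

# Count log lines that mention the address-space control.
count_rctl_warnings() {
    # grep -c prints 0 when nothing matches, and nothing if the file is missing
    n=$(grep -c 'process.max-address-space' "$1" 2>/dev/null || true)
    echo "${n:-0}"
}

# Compare the current count with the one saved by the previous run.
# Prints "restart" if new warnings appeared, otherwise "ok".
# After a log rotation the count drops, so this prints "ok" and resets.
check_once() {
    log=$1
    state=$2
    prev=0
    if [ -f "$state" ]; then
        prev=$(cat "$state")
    fi
    cur=$(count_rctl_warnings "$log")
    echo "$cur" > "$state"
    if [ "$cur" -gt "$prev" ]; then
        echo restart
    else
        echo ok
    fi
}
```

From cron, the result could gate something like svcadm restart on the MySQL service (the exact service name depends on the install), for example only when check_once /var/adm/messages /var/tmp/mysql-rctl.count prints restart. Keeping the decision separate from the restart makes it easy to dry-run the check first.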
I experimented with mysql 5.0 and a small program to get mysql to try to use a lot of memory and show rcapd/resource controls doing their job. I ran rcapd in debug mode to verify it was doing the correct thing.
rcapd: collection types: 0x1
rcapd: vmusage sample flags 0x4
rcapd: getvmusage time: 170.14 milliseconds
rcapd: kernel nres 4
rcapd: vmusage_sample
rcapd: 0: id: 100, type: 0x4, rss_all: 108523520 (105980KB), swap: 2916417536
rcapd: 1: id: 3, type: 0x4, rss_all: 3014656 (2944KB), swap: 253952
rcapd: 2: id: 10, type: 0x4, rss_all: 1350504448 (1318852KB), swap: 1255268352
rcapd: 3: id: 0, type: 0x4, rss_all: 193441792 (188908KB), swap: 180629504
rcapd: project user.mysql rss/cap: 105980/102400, excess = 3580 kB
rcapd: any collection/project over cap = 1, 1
rcapd: enforcing caps
rcapd: project user.mysql scanner starting to scan, excess 3580k
rcapd: project user.mysql scanner resuming process 29686
rcapd: process 29686: 4/0kB rfd/mdfd since last read
rcapd: identified nonpageable schedctl mapping at fea04000
rcapd: identified nonpageable schedctl mapping at fea04000
rcapd: process 29686: 4/0kB rfd/mdfd since hand swept
rcapd: process 29686: 2857044/0kB scannable
rcapd: identified nonpageable schedctl mapping at fea04000
rcapd: project user.mysql scanner trying to resume from 0x33634000, next 0x33634000
rcapd: project user.mysql scanner paging out process 29686
rcapd: project user.mysql scanner paged out 0x33634000+0t(468/3580)kB
rcapd: project user.mysql scanner paged out 0x339b3000+0t(0/3112)kB
rcapd: project user.mysql scanner paged out 0x33cbd000+0t(0/3112)kB
rcapd: project user.mysql scanner paged out 0x33fc7000+0t(0/3112)kB
rcapd: project user.mysql scanner paged out 0x342d1000+0t(120/3112)kB
rcapd: project user.mysql scanner paged out 0x345db000+0t(0/2992)kB
rcapd: project user.mysql scanner paged out 0x348c7000+0t(0/2992)kB
rcapd: project user.mysql scanner paged out 0x34bb3000+0t(0/2992)kB
rcapd: project user.mysql scanner paged out 0x34e9f000+0t(0/2992)kB
rcapd: project user.mysql scanner paged out 0x3518b000+0t(0/2992)kB
rcapd: project user.mysql scanner paged out 0x35477000+0t(4/2992)kB
rcapd: project user.mysql scanner paged out 0x35763000+0t(0/2988)kB
rcapd: project user.mysql scanner paged out 0x35a4e000+0t(844/2988)kB
rcapd: project user.mysql scanner paged out 0x35d39000+0t(1732/2144)kB
rcapd: project user.mysql scanner paged out 0x35f51000+0t(412/412)kB
rcapd: project user.mysql scanner done, excess 0
rcapd: sleeping 0.71 seconds
rcapd: updating statistics...
rcapd: project user.mysql status: succeeded/attempted (k): 3580/42512, ineffective/scans/unenforced/samplings: 0/1/0/1, RSS min/max (k): 0/243872, cap 102400 kB, processes/thpt: 1/0, 1 scans over 907 ms
rcapd: sleeping 3.81 seconds
rcapd: collection types: 0x1
rcapd: vmusage sample flags 0x4
rcapd: getvmusage time: 3.34 microseconds
rcapd: kernel nres 4
rcapd: vmusage_sample
rcapd: 0: id: 100, type: 0x4, rss_all: 108523520 (105980KB), swap: 2916417536
rcapd: 1: id: 3, type: 0x4, rss_all: 3014656 (2944KB), swap: 253952
rcapd: 2: id: 10, type: 0x4, rss_all: 1350504448 (1318852KB), swap: 1255268352
rcapd: 3: id: 0, type: 0x4, rss_all: 193441792 (188908KB), swap: 180629504
rcapd: project user.mysql rss/cap: 105980/102400, excess = 3580 kB
rcapd: any collection/project over cap = 1, 1
rcapd: enforcing caps
rcapd: project user.mysql scanner starting to scan, excess 3580k
rcapd: project user.mysql scanner resuming process 29686
rcapd: process 29686: 4/0kB rfd/mdfd since last read
rcapd: identified nonpageable schedctl mapping at fea04000
rcapd: identified nonpageable schedctl mapping at fea04000
rcapd: process 29686: 4/0kB rfd/mdfd since hand swept
rcapd: process 29686: 2857044/0kB scannable
rcapd: identified nonpageable schedctl mapping at fea04000
rcapd: project user.mysql scanner trying to resume from 0x35fb8000, next 0x35fb8000
rcapd: project user.mysql scanner paging out process 29686
rcapd: project user.mysql scanner paged out 0x35fb8000+0t(3580/3580)kB
rcapd: project user.mysql scanner done, excess 0
rcapd: sleeping 0.88 seconds
rcapd: updating statistics...
rcapd: project user.mysql status: succeeded/attempted (k): 3580/3580, ineffective/scans/unenforced/samplings: 0/1/0/1, RSS min/max (k): 0/243872, cap 102400 kB, processes/thpt: 1/0, 1 scans over 299 ms
rcapd: sleeping 3.81 seconds
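The interesting lines in that output are the rss/cap samples: the 100MB cap shows up as 102400 kB, and the excess is just RSS minus cap (105980 - 102400 = 3580). If you want to pull only those lines out of a long debug log, something like the following awk wrapper would do. It's a sketch that assumes the debug output has been saved with one rcapd message per line; the function name is my own.

```shell
#!/bin/sh
# Summarize rcapd debug samples: project, RSS, cap and excess (all in kB).
# Reads the named files, or standard input when no file is given.
rcap_excess() {
    awk '/rss\/cap:/ {
        # field 5 looks like "105980/102400," so split on "/" and ","
        split($5, a, "[/,]")
        printf "%s rss=%d cap=%d excess=%d\n", $3, a[1], a[2], a[1] - a[2]
    }' "$@"
}

# Example with one line taken from the output above.
echo 'rcapd: project user.mysql rss/cap: 105980/102400, excess = 3580 kB' | rcap_excess
```

The excess is recomputed from the two numbers rather than read from the end of the line, which doubles as a quick sanity check on the log.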
Setting the resident set and max address space to the same amount is probably not the right approach for most uses; it's just what I experimented with here. I'd think in most cases you'd set the max higher and set the resident set to something sane for the system. If you want to be sure that mysqld won't overflow to swap, you may want to actually set the max address space to something less than physical memory. Keep in mind, we were pretty coarse grained here by setting it up for the user mysql. You can use a dedicated project and the newtask(1) command instead if you want the resource controls to apply to particular processes owned by a user.
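As a rough sketch of that finer-grained setup: the helper below only prints the commands, so they can be reviewed before being run as root. The project name mysql-capped, the 512MB/1GB limits, and the mysqld_safe path are all illustrative assumptions, not values I tested.

```shell
#!/bin/sh
# Print (do not run) the commands for a dedicated, capped project.
# Arguments: project name, user, RSS cap, address-space limit, command path.
capped_project_cmds() {
    proj=$1
    user=$2
    rss=$3
    vsz=$4
    cmd=$5
    cat <<EOF
projadd -U $user -K "rcap.max-rss=$rss" -K "process.max-address-space=(priv,$vsz,deny)" $proj
svcadm enable rcap
newtask -p $proj $cmd
EOF
}

# Hypothetical values: higher address-space limit than the resident cap.
capped_project_cmds mysql-capped mysql 512MB 1GB /usr/mysql/bin/mysqld_safe
```

Only processes started through newtask in that project (and their children) pick up the limits, so other processes owned by mysql are left alone.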
This is just a simple example of the kinds of things you can do. Have runaway processes or threads occasionally? You can catch them and kill them with resource controls. You can also use coreadm(1M) to be sure you're capturing the errant behavior to analyze and fix the issues in useful core files, not just droppings all over the filesystem. Have a look at resource_controls(5) and related documentation for details.
( Dec 10 2008, 12:52:44 AM PST )