MySQL backup soon becomes an important matter when the database is used in production. The pain-point comes from the fact that while backuping the database is not available to respond to client requests anymore. With mysqldump - the standard tool for performing MySQL backups - and a large database the operation can go over many tenth of minutes if not hours. If I am running my business on line this is simply not acceptable.
The
classical approach to workaround this problem
is to take advantage of MySQL replication. I set up a master/slave
configuration where the slave acts as copy of the master. Then, when
needed, I run mysqldump on the
slave without any service interruption on the master.
But ZFS snapshosts bring a new straightforward approach that avoids the pain and the complexity of a master/slave replication.
Snapshots
are a key feature of ZFS that allows me to save a copy of any ZFS file
system in less than a second. Yesterday, Sept 27th I first created
a myDB
ZFS filesystem for my database, and since I decided
to backup my database everyday I also created a first snapshot. I am using InnoDB, so I made sure that both the data and the logs are located in the myDB filesystem. Today,
Sept 28th, I created a second snapshot. Here are the commands:
Yesterday, September 27th
# zfs create pool/myDB
# zfs snapshot pool/myDB@Sept27.2009
Today, September 28th
# zfs snapshot pool/myDB@Sept28.2009
Note
that as long as the dataset does not change, the snapshot does not use
any disk space, so since there was no change to the database when I
created the first snapshot zfs list reporst me a null USED space :
# zfs list -t snapshot
NAME USED AVAIL REFER MOUNTPOINT
pool/myDB@Sept27.2009 0 - 667M
The commands are simple, but what about the performance? If I run the commands just as is, the filesystem that I save takes 667M on disk. On my laptop I have much more data in my home directory. Let's see how long it takes me to snapshot this data.
# zfs list rpool/export/home/tmanfe
NAME USED AVAIL REFER MOUNTPOINT
rpool/export/home/tmanfe 47.8G 22.5G 28.1G /export/home/tmanfe
# time zfs snapshot rpool/export/home/tmanfe@28.2009
real 0m0.370s
user 0m0.004s
sys 0m0.009s
I am able to snapshot 47.9GBytes of data in 0.37seconds.
Good enough. As long as I can stay bellow a second, this will certainly
go unnoticed by the users. By the way, should my dataset increase, the
snapshot time will stay the same.
Now, what do I need to do with MySQL?
For
my data to be properly and consistently saved, I need to make sure that
it gets written to disk. To do this, MySQL provides me with the command
FLUSH and its TABLES WITH READ LOCK option. About this option the documentation says: "Closes all open tables and locks all tables for all
databases with a read lock until you explicitly release the
lock by executing UNLOCK TABLES. This is a very convenient way to get backups
if you have a file system such as Veritas that can take
snapshots in time". Note that the documentation should be modified to add ZFS as another filesystem that supports snapshots. What FLUSH TABLES WITH READ LOCK really does is:
- Blocks any table update. No new update is accepted
- Wait for tables currently being updated to be closed/flushed
- Blocks any commit
Since I am using InnoDB I also need to do a couple of things:
- First, InnoDB does not guaranty that when flushing tables they are actually written to the filesystem. To address this issue I need to snapshot both the tables and the MySQL binary log files, and make sure I save the MASTER STATUS before unlocking tables. The binary log contains all the changes that have been committed to the database, and the master status tells me which modifications in the log files have been applied to the tables and which are still pending. So even if tables are not in a consistent state on the filesystem, as long as I have the binary log I am able to make tables consistent by replaying all the changes from the log that have not yet been applied to the tables. To sum up, for InnoDB, a straightforward solution consists in having both the log files and the tables on the same filesystem and save the MASTER STATUS while tables are locked by running SHOW MASTER STATUS.
- I also need to turn autocommit off before running the FLUSH command otherwise every MySQL statement - such as FLUSH TABLES WITH READ LOCK - is wrapped into a transaction and at the end of each transaction all the table locks are released. In other words, my attempt to lock tables would be self-canceling, would have no effect.
No such thing is
required with MyISAM.
Then, I create the ZFS snapshot and I release the tables with the command UNLOCK TABLES. I am done with MySQL.
I need to run and synchronize commands at both the shell level and in MySQL. More specifically, after executing the FLUSH in MySQL, I must not close the connection, otherwise the locks will be freed. An easy way to meet these requirements is to use a Python script:
dbh = MySQLdb.connect(host='localhost', user='mysqdl-admin', passwd=mysql_pass)
dbh.query('FLUSH TABLES WITH READ LOCK;')
os.system('zfs snapshot pool/myDB@Sept29.2009 )
dbh.query('UNLOCK TABLES;')
dbh.close()
The
snapshot I created is located on the same disks and server than my
original data. To put a real backup system in place I need to transfer
the snapshot to another backup server or storage system. Once again ZFS
provides me with all I need. The zfs send command enables me to send a snapshot somewhere else. The only prerequisite is that ZFS must be available on my backup server in order to take advantage of the snapshot that it receives.
Here is the exact syntax:
# zfs send -Ri pool/myDB@Sept27.2009 pool/myDB@Sept28.2009 | ssh backupHost “zfs receive -Fd pool”
With the -i option, only the difference between my two snapshots is sent to a system named backupHost. This option enables me to do incremental backups. Ahead of incremental backups I need to send a first complete snapshot to the backup host. I did it yesterday, when creating my very first snapshot:
# zfs send -R pool/myDB@Sept27.2009 | ssh backupHost “zfs receive -Fd pool”
Note that the stream generated by zfs send replicates the filesystem, not the snapshot.
The stream is then funneled into an ssh command that connects to the backup host. Once connected, the zfs receive command is executed that takes the content of the stream - i.e. the filesytem - and copies it into a zfs pool that is local to the backup host.
I am done.
 
Does anyone has time for taking 1TB snapshot in zfs? Also in perf implications after the snapshot or when snapshot is transferred
Posted by Bingu on September 29, 2009 at 06:20 PM CEST #
What about restoring?
Posted by John on September 29, 2009 at 10:26 PM CEST #
For informational purposes: ZFS is a filesystem available on Solaris only. If you are using a flavor of Linux, you can use LVM (Linux Volume Manager) which is above the filesystem level (so you can still have, say, an ext3 filesystem).
Posted by Sheeri K. Cabral on September 29, 2009 at 11:55 PM CEST #
ZFS _is_ available under Linux using FUSE - obviously not ideal but a nice way of playing with it...
It's also available for BSD, two variants I've used being FreeBSD and FreeNAS.
Also, LVM stands _LOGICAL_ volume manager...
Posted by Harry Sheppard on September 30, 2009 at 02:19 PM CEST #
To answer the snapshot question, the time taken to perform snapshots is trivial. A snap of a 600GB filesystem took circa 3 seconds, even with moderately heavy IO to the filesystem.
Posted by Harry Sheppard on September 30, 2009 at 02:20 PM CEST #
I will be covering restore in a next article.
Posted by Thierry Manfé on September 30, 2009 at 04:47 PM CEST #
Thanks Harry for this useful input.
Posted by Thierry Manfé on October 17, 2009 at 07:23 PM CEST #