Tuesday August 04, 2009
Many databases get backed up by simply stopping the database, copying all the data files, and then restarting the database. This is fine for things that don't require 24-hour access. However, if you are concerned about how long the backup takes, then don't do this:
stop_database
cp /data/file1.db .
gzip file1.db
cp /data/file2.db .
gzip file2.db
start_database
Now there are many ways to improve this, using ZFS and snapshots being one of the best, but if you don't want to go there then at the very least stop doing the “cp”. It is completely pointless. The above should just be:
stop_database
gzip < /data/file1.db > file1.db
gzip < /data/file2.db > file2.db
start_database
You can make it faster still by backgrounding those gzips, if the system has spare capacity while the backup is running, but that is another point. Just stopping those extra copies will make the backup faster, as they are completely unnecessary.
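For what it is worth, the backgrounding idea could look something like the sketch below. It is only an illustration under a few assumptions: stop_database and start_database are the same placeholder commands as above, DATA_DIR and BACKUP_DIR are hypothetical variable names, backup_parallel is a made-up function name, and the output files are given a .gz suffix.

```shell
#!/bin/sh
# Sketch only. stop_database and start_database are placeholders for
# whatever your database provides; DATA_DIR and BACKUP_DIR are
# hypothetical names chosen for this example.
DATA_DIR=${DATA_DIR:-/data}
BACKUP_DIR=${BACKUP_DIR:-.}

backup_parallel() {
    stop_database
    for f in "$DATA_DIR"/*.db; do
        # Skip the unexpanded pattern if no .db files exist.
        [ -f "$f" ] || continue
        # Compress straight from the source file, one background job each.
        gzip < "$f" > "$BACKUP_DIR/$(basename "$f").gz" &
    done
    # Do not restart the database until every background gzip has finished.
    wait
    start_database
}
```

Calling backup_parallel then runs the whole backup. Note that a bare wait does not report whether any individual gzip failed, so a real script would want to check the results separately.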
Shouldn't there be a .gz at the end of the second and third lines in the new script?
Posted by Brandon Bergren on August 04, 2009 at 06:34 PM BST #
In my days as a DB admin we just used to back up the transaction logs on a daily basis and do a full database backup weekly for the less busy DBs. Of course we also had expensive software that would back up a live database without stopping it. So the full backups ended up getting done nightly and the transaction logs got shipped every 30 minutes to the warm standby server.
Posted by James Legg on August 04, 2009 at 06:57 PM BST #
If you are going to background the gzips, then you should probably add a "wait" before the start_database too.
Makes me think that I should take the talk I gave at last year's CEC on the impact of poor shell scripting on performance, ... maybe as a bit of a series
Alan..
Posted by Alan Hargreaves on August 05, 2009 at 01:30 AM BST #
Yes, there should be a .gz at the end, but it is not really the important point. The point is that the cp is completely pointless and just induces lots of extra I/O.
Posted by Chris Gerhard on August 05, 2009 at 09:10 AM BST #