Thursday Jun 11, 2009

In my quest to add minor usability features to MySQL I have created two new status variables as patches and submitted them as community contributions to MySQL.

The first patch adds a new status variable for query cache which counts the number of queries which have been invalidated from the query cache due to things like inserts and updates.  More information on this patch can be found at:

http://www.linuxjedi.co.uk/doku.php?id=patches:query_cache_invalidates

The second patch helps with the new replication heartbeat functions in 6.0 by adding a datetime status variable of when the last heartbeat was received.  This patch can be found at:

http://www.linuxjedi.co.uk/doku.php?id=patches:mysql_slave_last_heartbeat

Monday May 04, 2009

For those who haven't seen it so far, my mysqlbinlog patch is aimed for use in continuous backup situations.  It is designed retrieve binary logs direct from remote servers and write them directly to disk in their raw format.

When combined with mysqldump snapshots you can have an almost up-to-the-second point-in-time backup for your server just in case its blown off the face of the earth.  The advantage this has over a blackhole slave is you can run multiple instances of mysqlbinlog on one server with minimal overhead to backup multiple servers into one.  Its also much easier to setup.

After some great feedback from a user of my patch and Luís Soares, who is reviewing my patch as a possible community contribution submission into MySQL 6.0, I have made several fixes.

You can find the new patch here, and for basic usage information please see my previous blog post here.

Change Summary:

  • Fixed a possible segfault
  • There were situations where an extra empty file could be created
  • Fixed a couple of warnings
  • Use more mysys functions
  • Add test cases
  • Minor code cleanups

Monday Mar 02, 2009

After a minor coding blitz I have created 3 patches for mysqlbinlog (one is more of an enhancement of a previous patch):

  • Compression support (patch here)
  • SSL support (patch here)
  • RAW mode (updated patch here) - See update below for updated patch

The compression support patch adds the --compress parameter adds support for compression when retrieving data from a remote MySQL server, much in the same way that other mysql clients do.  Basic tests indicate it compresses data up to 10x for SBR statements.

The SSL support patch adds the --ssl* parameters to the mysqlbinlog client just as the other mysql clients.

The RAW mode patch is almost an entire rewrite of the original RAW mode patch fixing all the problem introduced in it.  For those not familia with it the patch introduces a way of backing up the remote binlog files without having a remote slave running or using something like SCP.  One possible usage for this to have a live up-to-the-second backup of your data like this:

  1. Run mysqlbinlog --raw --read-from-remote-server --host=mysqlserver.address --to-last-log=2 mysqld-bin.000001
  2. Grab a mysqldump --master-data=2 --host=mysqlserver.address
  3. Repeat #2 every hour (or however long you want the snapshots to be)

Change summary:

  • mysqlbinlog safe exits with CTRL-C or kill
  • --raw added which will retrieve raw binlog files from the remote server and save them using the filenames in the replication headers
  • --to-last-log is now an int, =1 is the old behaviour (stop at end of last log), =2 is continuous (sets serverID=255)
  • --result-dir added for use with --raw mode
  • --result-file modified so that when used with --raw mode it is a prefix for the output files

UPDATE 2009-03-13:

A user of this patch noticed 2 bugs with it.  Firstly the --result-dir caused a crash if the directory did not exist, and secondly it was not handling the binlog headers correctly.  I have fixed both of these problems and take4 of the patch can be found here.

Wednesday Feb 11, 2009

After a discussion with Adam Dixon over ways to retrieve binlogs from remote servers we came to the conclusion that mysqlbinlog should have a way to retrieve binlogs direct from a remote mysql server and save them in raw binlog format.

So after a little bit of tinkering I came up with a small prototype patch to do this, to use this simply do:

mysqlbinlog --raw --result-file=out.bin -t -R --host=192.168.1.101 mysqld-bin.000001

This can probably be extended to wait for more data rather than ending when it hits the end of the log, but this is just a proof of concept for now.

This new patch against MySQL 5.1.31 can be found here.

Tuesday Jan 20, 2009

After suggestions by Sinisa on ways to improve my mysqldump progress patch I finally got around to working on these improvements.

This new patch has an extra parameter --show-progress-size which by default is set to 10,000.  So when --verbose is used, every 10,000 lines you will get a status output of the number of rows for a particular table dumped.  So what you should see is something along these lines:

shell> mysqldump -A --verbose >out.sql
-- Connecting to localhost...
-- Retrieving table structure for table testing...
-- Sending SELECT query...
-- Retrieving rows...
-- 10000 of ~94347 rows dumped for table `testing`
-- 20000 of ~94347 rows dumped for table `testing`
-- 30000 of ~94347 rows dumped for table `testing`
-- 40000 of ~94347 rows dumped for table `testing`
...

shell> mysqldump -A --show-progress-size=1500 --verbose >out.sql
-- Connecting to localhost...
-- Retrieving table structure for table testing...
-- Sending SELECT query...
-- Retrieving rows...
-- 1500 of ~94347 rows dumped for table `testing`
-- 3000 of ~94347 rows dumped for table `testing`
-- 4500 of ~94347 rows dumped for table `testing`
-- 6000 of ~94347 rows dumped for table `testing`

With mysqldump \r is used for this output so it will in fact all show on a single line rather than multi-line.  This new patch can be found here.

Sunday Jan 18, 2009

I am not an expert C++ developer, but I do know enough to do my day job as a MySQL Support Engineer, not only delving into the MySQL source code to fix issues but also looking at customer's API usage.

I originally wanted to write some patches that would benefit me with MySQL when dealing with large amounts of data, I would then have a rough idea of how far various client apps were progressing with a task.

Brian Aker first commented on one of my patches saying it would be ported to Drizzle, I was pretty flattered because I didn't think many others would find them useful.  A few days later Monty Taylor contacted me to say the patch had been converted.  I noticed a small bug in the patch so using my experience of Subversion I quickly learned Bazaar, created a branch and sorted the patch.

Soon after the Jay Pipes suggested improvements I could make to it, and within an hour I had made those changes too.

After playing a bit with Drizzle I got interested in the projects goals and how easy it was to contribute.  I thought it could be a good way to develop my programming skills.  So after a little bit of searching the Blueprints I found a task that would glide me in easily to Drizzle, I had a Skype conversation with to Jay about it and set to work.

That was all only around one month ago.  In this short amount of time I have become one of the top 20 contributors, I have fixed several bugs and made several improvements.  Some of these have been to do with fixing drizzledump (the drizzle port of mysqldump), thanks to the small changes I have made, this now works correctly.

I'm happy to be contributing to such a great project in my spare time, the organisation is efficient and the team are very friendly and helpful.  Most importantly you don't have to be a C++ guru to contribute something, the guys are there to mentor any developer who wants to get involved.  I urge anyone who wants to improve their development skills or to be involved in what could possibly be the next big thing to get involved with Drizzle!

Saturday Jan 17, 2009

Monty Taylor recently took my import progress patch and ported it into Drizzle.  I managed to spot a small bug in his conversion so took a bzr branch and fixed it.  I then spoke to Jay Pipes who suggested I could improve it by making it take a parameter which would be the number of lines per output (rather than fixed at 1000 for the original patch).

After I made the improvement I have started taking on more work for the Drizzle project in my spare time and have somehow got myself into the top 20 contributors.  But that is another story.

I have now taken my work on Drizzle for the import progress patch and ported it back to MySQL 5.1.30 in a new patch which can be found here.  This patch now has the parameter "--show-progress-size=#" where # is the number of lines per output.

For example:

shell> mysql --show-progress-size=1500 < input-file.sql
Processing line: 1500
Processing line: 3000
...

With MySQL I have used \r so the progress shows on a single line.  With Drizzle \n is used for gettext compatibility.

Thursday Dec 11, 2008

To complement my patches to show import progress using the mysql client I have created a minor patch to mysqldump.  This new patch for MySQL 5.1.30 will show you the number of rows dumped so far every 10,000 rows when using the --verbose parameter.  The output should be similar to this:

-- Retrieving table structure for table testing...
-- Sending SELECT query...
-- Retrieving rows...
-- 20000 rows dumped

Tuesday Dec 09, 2008

After a discussion with Mark Leith yesterday I decided to modify my import progress patch to have an option to turn it on/off.

This new patch adds a --show-progress parameter to the MySQL client which is disabled by default.  This is because there may be utilities or scripts using the MySQL client and capturing stderr output, the original patch would create some mess in this situation.  To use the new patch you can do:

shell> mysql --show-progress < importfile.sql

Monday Dec 08, 2008

Sometimes when importing data into MySQL using the following method it can take a very long time, especially if the file is very large in size:

shell> mysql < importfile.sql

I have therefore developed a tiny patch to the mysql client which uses stderr to show the progress of the import in number of lines every 1000 lines of import.  This isn't quite a progress bar but should give you an indication of how long the import will take.

The patch for MySQL 5.1.30 is available here.

This blog copyright 2009 by LinuxJedi