Henk Vandenbergh

Thursday Aug 20, 2009

Vdbench Data Validation, synchronous vs. asynchronous journal files

There are times for all of us when the job becomes monotonous. You just grind it out and wait for better days. This week has not been easy, but I survived it partly because of a very fun problem that we ran into yesterday (Thank you Jim Kapus).

Imagine the following scenario: “Henk, I ran two identical tests against two identical storage devices on identical servers, and I got around 17,000 IOPS on each. But I added journaling and one system now gets only 1000 IOPS and the other gets 11,000 IOPS. Shouldn’t I get the same IOPS on both?”

Jim loves to use Swat, so that is how he noticed the huge discrepancy between these runs.

Because of the overhead involved with journaling, a drop in throughput is expected, but going from 17,000 IOPS down to 1000 while the other system gets 11,000 is just too much.

After a lot of checking, guessing, comparing and looking at loads of different Swat charts, all of a sudden a light bulb went on above my head when Jim mentioned the word ‘journaling’ again. Oops.

Journaling by default does a synchronous write to the Vdbench journal file just before the write is issued, followed by another synchronous journal write just after the write is completed. ‘Synchronous’ of course means ‘slow’: every journal write has to wait for the disk holding the journal file, so the journal's response time is added to each write and hinders the throughput on the devices being tested. But again, why such a big difference between the two systems?
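To make the pattern concrete, here is a minimal Python sketch of a write bracketed by a 'before' and an 'after' journal record. This is not Vdbench's code or its journal format; the function names and the record layout are made up for illustration, and `os.fsync` simply stands in for 'synchronous'.

```python
import os
import tempfile


def journaled_write(journal, data_file, offset, payload, sync):
    """Write payload at offset, bracketed by 'before' and 'after' journal records.

    Hypothetical record format; only the ordering mirrors the description above.
    """
    def log(record):
        journal.write(record)
        journal.flush()                  # hand the record to the operating system
        if sync:
            os.fsync(journal.fileno())   # wait until the journal disk has it

    log(f"before {offset}\n".encode())
    data_file.seek(offset)
    data_file.write(payload)
    log(f"after {offset}\n".encode())


def run(sync, writes=100, block=512):
    """Do a number of journaled writes and return how many journal records exist."""
    with tempfile.TemporaryDirectory() as workdir:
        journal_path = os.path.join(workdir, "journal")
        data_path = os.path.join(workdir, "data")
        with open(journal_path, "wb") as journal, open(data_path, "wb") as data:
            for i in range(writes):
                journaled_write(journal, data, i * block, b"x" * block, sync)
        with open(journal_path, "rb") as journal:
            return len(journal.read().splitlines())
```

Both `run(True)` and `run(False)` leave two journal records per write. The only difference is whether each record waits for the journal disk, and that wait is exactly the cost that showed up in the IOPS.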

It turned out that, though the servers were identical, the internal disk drives holding the journal files were not. I bet you one drive probably even had its write cache turned on. The response time difference of the journal files was the cause of the huge discrepancy in IOPS.
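A back-of-the-envelope check shows how far apart the two journal disks must have been. The sketch below assumes a simple closed-loop model in which every thread waits for its device write plus two journal writes, and it assumes 8 threads. The thread count is not from the actual test; it is a placeholder, and the function name is made up.

```python
def implied_journal_latency(journaled_iops, baseline_iops=17_000, threads=8):
    """Estimate seconds per synchronous journal write from observed IOPS.

    Assumed model (not measured): time per I/O = threads / IOPS, and the time
    added by journaling is split evenly over the two journal writes.
    """
    time_per_io_plain = threads / baseline_iops
    time_per_io_journaled = threads / journaled_iops
    return (time_per_io_journaled - time_per_io_plain) / 2


slow = implied_journal_latency(1_000)    # the system that dropped to 1000 IOPS
fast = implied_journal_latency(11_000)   # the system that dropped to 11,000 IOPS
print(f"slow journal disk: {slow * 1000:.2f} ms per journal write")
print(f"fast journal disk: {fast * 1000:.2f} ms per journal write")
```

Under those assumptions the slow system comes out at roughly 3.8 ms per journal write and the fast one at roughly 0.13 ms, a gap that looks a lot like a disk without write cache next to one with it. A different thread count scales both numbers but keeps their ratio.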

Solution: Of course we could look for identical disk drives, but since nobody expected the OS or the disk drives holding the journal files to fail during these tests, there was a much simpler solution. We did not really need to do synchronous writes. Using the ‘-jn’ Vdbench execution parameter we switched to asynchronous journal writes. Not only did we end up with equal IOPS on both systems, the IOPS even went back to 17,000 because we no longer depended on the speed of the journal file.

Henk

PS: By the end of this week I will release Vdbench 5.01-1, a release that contains fixes and enhancements to Data Validation and Journaling.
