Henk Vandenbergh

Thursday Nov 19, 2009

/var/adm/messages in Vdbench on Solaris

Vdbench on Solaris scans /var/adm/messages every 5 seconds to see if any new messages have been generated. In case a new message is related to the testing (for instance SCSI timeouts), Vdbench displays it.

Frequently this message is not related to the Vdbench run so it only pollutes your terminal window.

To suppress the message display, add '-d25' as an execution parameter or 'debug=25' at the top of your parameter file. Realize though that if there is an important message, you won't see it on your terminal: with this option the message is written to your 'localhost-0.stdout.html' file instead.

Henk.

Wednesday Nov 18, 2009

'patterns=' parameter no longer works

The 'patterns=' parameter allows you to tell Vdbench what data pattern to write on your storage.

Alas, I just noticed that this no longer works since the LFSR rewrite in Vdbench 5.01. I am not sure if it is worth it to put in any effort to correct this. To be honest, it is unlikely anyone is using this option; otherwise I would have expected a question about this by now.

Henk.

Thursday Oct 22, 2009

Fix for NullPointerException in Vdb.Report.reportKstatDetail

For vdbench501fix1. In an earlier blog I mentioned a problem with Vdbench spreading the requested work over its available JVMs: http://blogs.sun.com/henk/entry/nullpointerexception_running_multi_host_vdbench501.

Here is a fix: Download seqjvm.tar and place it in the Vdbench install directory. Then untar (tar -xvf seqjvm.tar) and you'll have a new directory and file: /vdbenchxx/classes/Vdb/RD_entry.class

Henk.

Wednesday Oct 21, 2009

'end_cmd=' parameter executed too often

There are two sets of 'start_cmd=' and 'end_cmd=' parameters. One set is used as a 'general' parameter, allowing these commands to be executed at the start and at the end of an execution. The other set is a sub parameter of a Run Definition (RD), allowing these commands to be executed at the start and at the end of each run (RD). Francois just notified me that the former 'end_cmd=' command is not executed at the end of a Vdbench execution, but instead at the end of each RD.

To fix this, place 'endcmd.tar' in your Vdbench install directory, untar (tar -xvf endcmd.tar) and you now should have a new directory and file: vdbench/classes/Vdb/Reporter.class.

Henk.

Tuesday Oct 06, 2009

Vdbench 5.01 reporting incorrect error code text for Windows

Just noticed a few days ago that the code attempting to translate the GetLastError() error codes returned by Windows reads and writes disappeared. Vdbench therefore now puts a Unix error text with a Windows error code and that's pretty confusing.

Until I fix this, just look up your errors on http://msdn.microsoft.com/en-us/library/ms681381%28VS.85%29.aspx

Monday Sep 28, 2009

NullPointerException running Multi-host Vdbench501

Two new problems were discovered last week, both resulting in a NullPointerException.

With the first one, four hosts were defined, but only three were used. Vdbench was trying to print a 'rd started' message on the device reports for the unused host, but of course no devices were busy there. Just remove the unused host to get around this problem.

The second one was caused by the way Vdbench calculates the number of JVMs needed. Normally that is one JVM per SD per 5000 iops, with a max of eight. This run had 4 SDs per host for a total of 16. Vdbench should have taken multi-host into consideration here, but did not. The resulting eight JVMs per host then caused the first JVM (slave 0) on a host not to have any work to do. Since slave 0 on each host is responsible for collecting Kstat statistics and slave 0 was idle, no Kstat data was returned, causing the NullPointerException. You can work around this problem by either specifying '-m4' as an execution parameter, or adding 'hd=default,jvms=4' as your first Host Definition.
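To make the arithmetic concrete, here is a rough sketch of the miscount. It is a simplified model, not the actual Vdbench code: it assumes every SD is busy enough to ask for its own JVM, so the 5000 iops term is ignored, and the function and variable names are made up for illustration.

```python
MAX_JVMS = 8  # the cap mentioned above


def jvms_for(sd_count: int, max_jvms: int = MAX_JVMS) -> int:
    """Simplified rule: one JVM per SD, capped at max_jvms (iops term ignored)."""
    return min(sd_count, max_jvms)


sds_per_host = 4
total_sds = 16  # 4 SDs per host across all hosts in this run

# What happened: the SD count across all hosts was used for each host.
jvms_using_total = jvms_for(total_sds)

# What the workaround ('-m4' or 'jvms=4') forces: only this host's SDs count.
jvms_using_host = jvms_for(sds_per_host)

print("JVMs per host when counting all SDs:", jvms_using_total)
print("JVMs per host when counting this host's SDs:", jvms_using_host)
```

With more JVMs than SDs on a host, some JVMs have nothing to do, and in this run slave 0 was one of them.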

Henk.

Tuesday Sep 15, 2009

New Version of Vdbench: vdbench501fix1

After I sent out Vdbench 5.01 some interesting problems were discovered around Data Validation and Journaling. See the release notes on vdbench.org.

I also added the new Vdbench '-print' function that was accidentally dropped during the 501 build.

Enter './vdbench -print device lba xfersize', and the data block on logical byte address 'lba' from device or file 'device' for a length of 'xfersize' will be printed. This new function is very useful when analyzing data corruption issues identified by Vdbench.

Henk.

Wednesday Sep 09, 2009

Fix for 'STF trace start not working' in swat 302

The 'start trace' function on STF for swat 3.02 does not work.

Symptoms: 'invalid parameter: s' or 'Parameter scan error'.

Just put stf_start.tar into your /swat/ directory, run 'tar -xvf stf_start.tar', and you're done.


'openflags=' when used in a Run Definition (RD) does not work for raw i/o.

The sd=sd1,.....,openflags=xxx or fsd=fsd1,....,openflags=xxx parameter allows you to pass extra options like O_DSYNC to the Unix open() function for each specific SD or FSD. 

You can also use 'openflags=' as an RD parameter to override all previous settings. However, this only works for file system workloads (FSD) and not for raw i/o workloads (SD). This is clearly an oversight; it should work for both.

Henk.

Thursday Aug 20, 2009

Vdbench Data Validation, synchronous vs. asynchronous journal files

There are times for all of us where our job becomes monotonous. You just grind it out and wait for better days. This week has not been easy, but I survived it partly because of a very fun problem that we ran into yesterday (Thank you Jim Kapus).

Imagine the following scenario: “Henk, I ran two identical tests against two identical storage devices on identical servers, and I got around 17,000 IOPS on each. But I added journaling and one system now gets only 1000 IOPS and the other gets 11,000 IOPS. Shouldn’t I get the same IOPS on both?”

Jim loves to use Swat, so that is how he noticed the huge discrepancy between these runs.

Because of the overhead involved with journaling a drop in throughput is expected, but going from 17,000 IOPS down to 1000 while the other system gets 11,000 is just too much.

After a lot of checking, guessing, comparing and looking at loads of different Swat charts, all of a sudden a light bulb went on above my head when Jim mentioned the word ‘journaling’ again. Oops.

Journaling by default does a synchronous write to the Vdbench journal file just before the write is issued, followed by another synchronous journal write just after the write is completed. ‘Synchronous’ of course means ‘slow’, which has a clear negative impact on the response time of the journal file and therefore also hinders the throughput on the devices being tested. But again, why such a big difference between the two systems?

It turned out that, though the servers were identical, the internal disk drives were not. I bet you one drive probably even had its write cache turned on. The response time difference of the journal files was the cause of the huge discrepancy in IOPS.

Solution: Of course we could look for identical disk drives, but since there were no expectations during these tests that the OS or the disk drives where the journal files reside would fail there was a much simpler solution. We did not really need to do synchronous writes. Using the ‘-jn’ Vdbench execution parameter we switched to using asynchronous journal writes. Not only did we end up with equal IOPS on both systems, the IOPS even went back to 17,000 IOPS because we now no longer depended on any journal file speed.

Henk

PS: by the end of this week I will release Vdbench 5.01-1, a release that contains fixes and enhancements to Data Validation and Journaling

Sunday Aug 16, 2009

Swat302 buffer overflow: java.nio.BufferOverflowException

When a system has lots of devices Swat may abort with:

java.nio.BufferOverflowException
   at java.nio.LongBuffer.put(LongBuffer.java:785)
   at Utils.Bin.write_record(Bin.java:747) 

You can find a fix here: http://blogs.sun.com/henk/resource/fixes/binfix.tar

- place tar file in your swat install directory, then 'cd /swatxxx' and 'tar -xvf binfix.tar'

You'll now have a new file: classes/Utils/Bin.class.

If that still fails then you have a REALLY large number of devices. Contact me at swat@sun.com if needed.

Henk.

Friday Aug 14, 2009

cr/lf problems on windows vdbench.bat file

Even though when building the distribution file I executed the 'unix2dos' command on Solaris to correct any possible problem with cr/lf issues, the vdbench.bat file still appears to have some problems.

These problems apparently do not show up until you edit the file, for instance to specify the java directory.

I could not get the unix2dos command to resolve this, so I ended up just running 'type vdbench.bat' and then doing a copy/paste from the command prompt window.

Henk

Saturday Aug 08, 2009

Vdbench Data Validation while creating a new file

Symptom: "The amount of data blocks in a lun must be at least 3000 times the data transfersize when using Data Validation. "

When running Data Validation your file or lun needs to be large enough to contain at least 3000 blocks.
So, if you have xfersize=8k then your lun or file size must be at least 3000*8k.

However, when you run Data Validation against a new file that will be created during a Data Validation run, Vdbench wants to create and pre-format the file using xfersize=1m. That means that Vdbench wants the file to be at least 3000*1m=3g large. To avoid the problem, just run the creation of the file WITHOUT Data Validation, and then as soon as the format is complete, cancel the run and restart it with Data Validation. Since the file now exists a format with xfersize=1m no longer is needed. In the next version I will change the xfersize for a file creation and format using Data Validation to 64k.
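The 3000-block rule is simple arithmetic, and a small sketch shows why the 1m format transfer size is the one that hurts. The helper names are hypothetical, and the sketch assumes binary units (1k = 1024 bytes); it only reproduces the calculation described above, not Vdbench's own check.

```python
MIN_BLOCKS = 3000  # from the error message quoted above

# Assumption: sizes use binary units.
UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(text: str) -> int:
    """Convert a size such as '8k' or '1m' to bytes."""
    text = text.strip().lower()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def min_lun_bytes(xfersize: str) -> int:
    """Smallest lun or file size that holds MIN_BLOCKS blocks of xfersize."""
    return MIN_BLOCKS * parse_size(xfersize)


for xfersize in ("8k", "64k", "1m"):
    size = min_lun_bytes(xfersize)
    print(f"xfersize={xfersize}: at least {size} bytes ({size / 1024**2:.1f} MiB)")
```

Under these assumptions the 8k case needs only a few tens of megabytes, while the 1m format case needs roughly 3 gigabytes, which is why a small new file trips the check.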

Henk.

Friday Aug 07, 2009

Java deadlock in Vdbench. No new i/o generated.

Discovered a deadlock in the code that handles the I/O scheduling and the prioritization of i/o skew between workloads. None of the code around this deadlock has changed since Vdbench 5.00.
What I think is that the high iops generated during the run's cache hits (200k), together with the fact that there was only one SD, together with this being a sequential run and therefore only ONE JVM, created so much thread concurrency (32 threads in this test) that it brought this problem to light.

If others run into this problem, let me know. I can give you a pre-beta copy of vdbench 5.02 where all this scheduling and prioritization code already has been rewritten. The original code has always been a thorn in my side because of its complexity and this spring I finally had the time to rewrite it.

If you think you have this problem, then, using a JDK (not a JRE) run './vdbench jstack' and send me the output. When confirmed I'll send you a 5.02 pre-beta.

Henk.

Wednesday Aug 05, 2009

Data Validation and Journaling and journal recovery

One thing that I discovered last summer, but have not had the time to resolve yet, is the fact that there is a small window after journal recovery where Vdbench may report a Data Validation error while there really is no cause for it. Since Vdbench writes a journal record before it starts the i/o, and then another after the i/o completes, it can be that Vdbench or the system shuts down before the write is complete. The question then during journal recovery is "did this i/o complete or not". Vdbench does not resolve this. It should accept the block to contain either the BEFORE contents or the AFTER contents. This window is pretty small though.
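The intended rule can be sketched as follows. This is only an illustration of the idea, not Vdbench code: the names are hypothetical, and the block images are passed in as plain byte strings for simplicity.

```python
from enum import Enum


class Verdict(Enum):
    OK_BEFORE = "block still holds the BEFORE contents"
    OK_AFTER = "block holds the AFTER contents"
    CORRUPT = "block matches neither image"


def check_pending_block(actual: bytes, before: bytes, after: bytes) -> Verdict:
    """Validate a block whose journal shows a write started but not completed.

    Either image is acceptable because we cannot know whether the write
    reached the device before the shutdown.
    """
    if actual == after:
        return Verdict.OK_AFTER
    if actual == before:
        return Verdict.OK_BEFORE
    return Verdict.CORRUPT


before = b"A" * 8
after = b"B" * 8
print(check_pending_block(before, before, after))
print(check_pending_block(after, before, after))
print(check_pending_block(b"A" * 4 + b"B" * 4, before, after))
```

Note that in this sketch a partially written block still counts as corrupt; only a clean BEFORE or AFTER image is accepted.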
