20060127 Friday January 27, 2006

Filesystem Benchmarks: iozone

In my last post I discussed the vxbench I/O load generator, which may (or may not) be available from Symantec for the use of all. Recent work with Windows 2003 Server has given me the excuse to use Iozone, which has many things in common with vxbench. In fact I feel a taxonomic table coming on:

Feature                    VxBench                                       Iozone
-------------------------  --------------------------------------------  ---------------------------
Open source                No. Copyrighted but freely available.         Yes: ANSI C
Async I/O                  Yes: aread, awrite, arand_mixed workloads     Yes: -H, -k options
Memory mapped I/O          Yes: mmap_read, mmap_write workloads          Yes: -B option
Multi-process workloads    Yes: -P/-p options                            Yes: default
Multi-threaded workloads   Yes: -t option                                Yes: -T option
Single stream measurement  Yes                                           Yes
Spreadsheet output         No                                            Yes
Large file compatible      No                                            Yes
Random reads/writes        Yes: rand_[read|write|mixed]                  Yes: -i 2 option
Strided I/O                Yes: stride=n subopt                          Yes: -j option
Simulate compute delay     Yes: sleep workload, sleeptime=n seconds      Yes: -J milliseconds
Caching options            O_SYNC, O_DSYNC, direct I/O, unbuffered I/O   O_SYNC,
OSs                        Solaris, AIX, HP, Linux. Not MS Win           As vxbench + MS Win. POSIX

There are challenges in using these tools. The first is that they are not benchmarks; they are load generators with no load (benchmark) defined. There are two approaches to defining a load: (a) how many operations of a specific type can be achieved in a set time, or (b) how long it takes to complete a specific number of operations. The difference, for a lot of people, is a matter of taste. The consequence is that each new analyst who approaches these tools starts to write a new cookbook.
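
The two definitions of a load can be sketched in a few lines of R. This is only an illustration, not how Iozone or vxbench work internally: `do_op` is a hypothetical stand-in for a single I/O operation (here, appending one block to a temporary file), and the function names and block size are assumptions made for the example.

```r
# Hypothetical stand-in for one I/O operation: append one block to a file.
do_op <- function(con, block) writeBin(block, con)

# (a) Fixed time: how many operations complete within `seconds`?
ops_in_fixed_time <- function(seconds, block_size = 4096) {
  path <- tempfile()
  con <- file(path, "wb")
  on.exit({ close(con); unlink(path) })
  block <- raw(block_size)
  n <- 0
  deadline <- Sys.time() + seconds
  while (Sys.time() < deadline) {
    do_op(con, block)
    n <- n + 1
  }
  n
}

# (b) Fixed work: how long do `n` operations take (elapsed seconds)?
time_for_fixed_ops <- function(n, block_size = 4096) {
  path <- tempfile()
  con <- file(path, "wb")
  on.exit({ close(con); unlink(path) })
  block <- raw(block_size)
  start <- proc.time()[["elapsed"]]
  for (i in seq_len(n)) do_op(con, block)
  proc.time()[["elapsed"]] - start
}

print(ops_in_fixed_time(0.5))    # operations achieved in half a second
print(time_for_fixed_ops(1000))  # seconds needed for 1000 operations
```

Both functions exercise the same operation; they differ only in which quantity is held fixed and which is reported, which is why the choice so often comes down to taste.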

Another challenge is that the principal dimensions of performance in a benchmark are:

  1. Latency - how long until the first byte is returned to the user, or committed to disk and the call returns.
  2. Throughput - how much data under different access patterns can be sent to or retrieved from permanent storage.
  3. Efficiency - how much of the system's resources were consumed in moving data to and from storage rather than doing computation upon it. (Resources can be memory, CPU cycles, hardware and software synchronisation mechanisms. In this millennium we also bring in the consumption of electricity and the generation of heat.)

Load generators, including the ones we are discussing, are pretty good on the first two counts but no good at the third. That distinction marks the difference between a load generator and a benchmarking framework such as Filebench, SLAMD or tools such as LoadRunner from Mercury. It is no minor matter to coordinate the gathering of system metrics with the execution of the workload. It's even more difficult to achieve this across distributed systems sharing access to a filesystem such as NFS or Shared QFS. In this case a common and precise idea of the current time needs to be maintained across the systems.
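
As a rough illustration of what the third dimension adds, the sketch below times a small write workload in R and reports CPU seconds consumed per megabyte alongside throughput. It is a toy under stated assumptions: `measure_write` and its parameters are invented for this example, and `system.time()` only sees the CPU time of the R process itself, not system-wide resource use, which is exactly the gap a real framework has to fill.

```r
# Time a simple sequential write and derive throughput and a crude
# efficiency figure. All names here are illustrative.
measure_write <- function(n_blocks = 256, block_size = 64 * 1024) {
  path <- tempfile()
  on.exit(unlink(path))
  block <- raw(block_size)

  timing <- system.time({
    con <- file(path, "wb")
    for (i in seq_len(n_blocks)) writeBin(block, con)
    close(con)
  })

  mb  <- n_blocks * block_size / 2^20
  cpu <- timing[["user.self"]] + timing[["sys.self"]]

  list(
    mb              = mb,
    elapsed_s       = timing[["elapsed"]],
    # May be Inf if the run is too short for the timer to register.
    throughput_mb_s = mb / timing[["elapsed"]],
    # CPU seconds of this process spent per megabyte moved.
    cpu_s_per_mb    = cpu / mb
  )
}

str(measure_write())
```

Two runs with the same throughput can differ widely in `cpu_s_per_mb`; a load generator that reports only the first number cannot tell them apart.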

Tools such as Iozone and vxbench need to be embedded in scripted frameworks to do performance metrics collection. On several Unixes this simply means running any and every tool whose name ends in "stat" in the background. In Microsoft's world there are the CIM probes accessible through VBScript or Perl, and in Solaris 10, dtrace provides access to arbitrary counters.


Putting Iozone to Work

Using Iozone we can generate output similar to the graphs below.

I created a 32 Gb volume across 12 Seagate ST13640 disks on 2 JBODs, connected via 2 Adaptec Ultra320 SCSI controllers to a dual 2 GHz AMD Opteron with 2 Gb RAM. For the care and feeding of this sandbox, I am grateful to Paul Humphreys and his band of lab engineers.

I then ran iozone and collected the results. As it dumps straight into spreadsheet format, you can quickly do some interesting graphical analysis such as this example at the tool's website. However, I was after something more mundane.

The first graph below is from OpenOffice. I don't like it much because it follows the data, so you end up with a powers-of-two x-axis which hides important detail. Also, all OpenOffice graphs tend to look the same without extensive fiddling.

The graph below it is done with R and although it is plainer, I think it gives a clearer picture.

Here is the data and R code - not a lot to it really. I continue to urge you to use this tool as I have in the past. James Holtman has made a compelling case for the use of R in performance analysis in his papers Visualisation of Performance Data (CMG2005) and The Use of "R" for System Performance Analysis (CMG2004). Sadly CMG do not make their papers available to the wider community.

size  FW      NW      FR      NR
4     426.78  327.59  627.4   560.92
8     544.29  467.6   733.93  672.68
16    594.61  362.57  878.66  725.4
32    628.45  587.71  883.71  754.75
64    662.49  606.35  886.44  748.12
128   664.56  619.54  846.31  815.49
256   700.33  666.51  933.96  769.15
512   12.55   13.6    664.16  660.76
1024  8.8     10.77   600.13  592.36
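
# Only base graphics are used below, so no extra packages are needed.
# The file holds the whitespace-separated table above (size, FW, NW, FR, NR).
g_data <- read.table("C:\\home\\dominika\\FATvNTFS.csv", header=TRUE)
attach(g_data)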

plot(size, FR, type="l",
   main="NTFS and FAT32 I/O Performance",
   sub="Sequential Reads/Writes to 1 Gb File in 32 Gb Filesystem",
   xlim=c(0,1024),
   ylim=c(0,1000),
   xlab="I/O size (Kb)",
   ylab="I/O rate (Mb/s)",
   lty=5,col=5, lwd=2  )

lines(size,FW,lty=2,col=2, lwd=2)
lines(size,NW,lty=3,col=3, lwd=2)
lines(size,NR,lty=4,col=4, lwd=2)

text(150,700,"FAT Wr"); text(130,600,"NTFS Wr")
text(350,750,"NTFS Rd"); text(400,800,"FAT Rd")
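
For readers without the CSV file to hand, the same table can be inlined and summarised numerically. The sketch below is an illustration rather than part of the original analysis: the column meanings (F = FAT32, N = NTFS, W = write, R = read) are inferred from the plot labels, and the names `results`, `write_ratio`, `read_ratio` and `ntfs_wins` are made up for the example.

```r
# The iozone results quoted above, inlined so the sketch is self-contained.
# Assumed column meanings: F = FAT32, N = NTFS, W = write, R = read.
results <- read.table(header = TRUE, text = "
size  FW      NW      FR      NR
4     426.78  327.59  627.4   560.92
8     544.29  467.6   733.93  672.68
16    594.61  362.57  878.66  725.4
32    628.45  587.71  883.71  754.75
64    662.49  606.35  886.44  748.12
128   664.56  619.54  846.31  815.49
256   700.33  666.51  933.96  769.15
512   12.55   13.6    664.16  660.76
1024  8.8     10.77   600.13  592.36
")

# A ratio above 1 means FAT32 was faster at that I/O size.
results$write_ratio <- round(results$FW / results$NW, 2)
results$read_ratio  <- round(results$FR / results$NR, 2)

# I/O sizes at which NTFS writes were faster than FAT32 writes.
ntfs_wins <- results$size[results$NW > results$FW]

print(results[, c("size", "write_ratio", "read_ratio")])
print(ntfs_wins)
```

The ratios make the gap between the two filesystems easier to compare across I/O sizes than the raw rates, and they also pick out the sharp fall in write rates for both filesystems once the I/O size reaches 512.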

The graph appears to show us a good deal, but it's what it doesn't show that has to be remembered - the qualitative side to all this.

The expectation of several people I showed it to had been that NTFS, being the more modern filesystem, should have better performance. Not so, but for good reasons. Yes, in the simple case FAT32 is faster than NTFS. Out-and-out performance is not the point of NTFS. It has many value-add features not found in FAT, such as file and directory permissions, encryption, compression, quotas, content-addressability (indexing) and so forth. These come at a cost, as do other features in NTFS that the OS relies on to provide such facilities as shadow copy and replication.

Longer code path - longer to wait for those I/Os to return!

( Jan 27 2006, 05:46:09 PM GMT ) Permalink

20060123 Monday January 23, 2006

Filesystem Benchmarks: vxbench

For a long time I've used a simple I/O load generator from Veritas called vxbench for doing just that - generating I/O loads against systems that have been configured either in the lab or on customer sites. vxbench is a tool available on AIX, HP-UX, Linux and Solaris for benchmarking I/O loads on raw disk or file systems. It can produce various I/O workloads such as sequential and random reads/writes, asynchronous I/Os, and memory mapped (mmap) operations. It has many options specific to the VERITAS File System (VxFS).

It also has the characteristics that I need in a simple load generator: it can generate multithreaded workloads, which are essential, and it has a simple command-line interface, which makes it easy to incorporate in a scripting harness. It can also do strided reads/writes and sleep - important for database-like operations.

vxbench arrives on the CD in the package VRTSspt - the Veritas Software Support Tools - and most sites have it to hand. However, I've always shied away from publishing any work done with it because I've never quite pinned down its status as a piece of software in terms of copyright or license. Recently, however, I've been driven to take a closer look. Two papers appeared recently which I'm afraid I can cite but not give you a URL for:

As corporate history has moved on, the first paper is copyrighted by the VERITAS Software Corporation and the second by the Symantec Corporation. I never realised that CMG does not own the content of its own proceedings but there you are. The second paper is a continuation of the first and uses the same methodology and tools. Yes; vxbench.

At first sight this is a little annoying. As another recent paper pointed out (I won't point you at it just at the moment because I want to talk about it in more detail in a later post), if you can't reproduce a benchmark from its report, it's not really very scientific, and I'm sure that's not what the authors of these papers intended. This need for the rigor imposed by writing reproducibility into benchmarking papers is one reason why people working in the field often resort to the "usual suspects" when looking for load generators - iozone, postmark, bonnie++. They all have their weaknesses but are at least available on the net.

So I set about tracking down vxbench. The header in the source code was not encouraging: "This software contains confidential information and trade secrets of VERITAS Software. Use, disclosure or reproduction is prohibited without prior express written permission of VERITAS Software". Well, I won't be sharing any more of the contents of vxbench.c with you, that's for sure. Onward!

...to a Veritas support document pointed out to me by the README that comes with the package. Apparently you can download the package from the VERITAS ftp site (without the need to purchase media and/or a license). The support document was no more encouraging than the source header: "These tools are designed to be used under the direction of a VERITAS Technical Support Engineer only." Does this mean you shouldn't use them in other circumstances? (For "other circumstances" read "benchmarking against competing vendors of storage software".) Well, it seems you can. Document 261451 leaves out the sentence that follows the one I've quoted, but in the README.VRTSspt it continues: "Any other use of these tools is at your own risk." So you can amuse yourself with vxbench and publish the results, but if you fry your disks and panic your system you have only yourself to blame.

Vxbench is a useful tool. Its availability is important - the implementors of Linux LVM (and VxVM!) will no doubt want to study these papers and work to improve their products. I'm glad Symantec continue to make it available to the storage software community.

( Jan 23 2006, 11:24:07 AM GMT ) Permalink

20060105 Thursday January 05, 2006

Goodbye Windows XP Professional x64 Edition

It's time to say "Farewell" to Windows XP Professional x64 Edition. When I bought my Acer Ferrari 4000 laptop, it seemed the obvious choice - have 64-bit AMD processor; buy 64-bit operating system. I also have Solaris and SUSE Linux on the laptop, so why do I need XP? Well, I'm doing benchmarking on Windows so it helps to have a sandpit to develop stuff, but I have to protect myself in the face of those anxious to promote Solaris on the desktop (i.e. the other 400 people in this building) by repeating this mantra.

Many things work OK on XP 64-bit as you would expect: Macromedia Flash, Mozilla, NetBeans, OpenOffice, QuickTime, and even more arcane things such as PostgreSQL, R, Ptolemy and Vim. So in general, binary compatibility was fine at the application level.

However there was a list of things that just didn't; some didn't attempt to install; some got to the end of the install and then showed their disdain for their new home; some installed but would not run.

Most spooky of all was Microsoft's Windows Utilities for Unix. This is developed for the server market so you would think...but no. Also, there is no RealPlayer for x64 or any sign of there being one in the near future; you have to make do with a similar but less functional alternative.

There are no drivers for my scanner - a very popular HP model. This was when doubt began to set in. If these are not available from HP, is there a real problem? Yes of course there is. Yes of course there isn't. ("You work for who?") Solaris got around this when it moved to 64-bit in Solaris 7; it was 64-bit but retained the 32-bit framework and loaded appropriately; also if you didn't like one flavour of addressing, you could simply boot into the other.

Microsoft could have gone down this road but haven't, and the reasons are not hard to guess. Many if not most of the drivers in that world are written by 3rd parties, and there are also good business reasons for a firm distinction between the 32-bit and 64-bit products. It's also only fair to say that take-up of Solaris 7 within the installed base was not immediate by any means (Y2K forced the pace eventually), so looked at in those terms, Microsoft are simply where Sun were about 8 years ago.

Continuing on down the list, I thought the new XP was responsible for the failure of Apple's iTunes to make contact with the iTunes store site. In fact hundreds of new iPod owners thought the same and mused about firewalls, virus protection software and all manner of other possible barriers to their enjoyment on Apple's self-help forum (note singular absence of input from Apple!). Amusingly, it never dawned on any of these people (and I'm sure Apple would never admit to the notion) that, as it was December 25th and the whole world had just unwrapped their new Christmas present and installed the software, eager to make their first iTunes purchase, the iTunes site might just have gone completely and utterly tits-up and be refusing any further custom due to insufficient advertence to IT capacity planning on the part of its owners (who are, er, an IT company). It's one possible explanation, but I'm sure Apple can provide a more rational one.

I had already predicted the final nail in the coffin when the HP scanner drivers failed to exist. If Cisco's VPN drivers were not available, I would not be able to work from home; there's not much point in having a laptop if you have to leave it at the office - and VPN drivers are pretty low level stuff and might take a while to write and test. The good news is I will be able to work from home. The bad news is it won't be until the courier van turns up with a shiny new copy of the 32 bit version of Windows XP.

It would be tempting to reiterate the advice given to those contemplating marriage ("Don't.") but the actual moral of the story is that drivers in this world don't ship with dual 32/64-bit frameworks as Solaris did, and you can't just reboot your way from one world to the other. If you wish to go down the XP x64 Edition route, take advantage of the evaluation program and make use of the repository of drivers at PlanetAMD64. What I really hope is that the vendors will co-package the 32-bit and 64-bit versions and detect appropriately at install time (or even runtime). Chances?

And now a word from our sponsors.

( Jan 05 2006, 01:11:09 PM GMT ) Permalink Comments [2]
