Tuesday December 20, 2005 
This blog entry is for the amusement of Solaris folks. Seasoned enterprise Windows administrators, look away.
Figure 1: The Windows Server 2003 Disk Management GUI - Looking Glass, it ain't!
For reasons best not gone into, I have to do I/O benchmarking on Windows Server 2003. To be fair, the command shell language has grown up a lot since I used to teach it to help-desk unfortunates 10 years ago. Scarily, it has adopted Unix-style I/O redirection (e.g. "2>&1" and so forth) and flow control, making it the bastard son of Korn shell and DOS Batch. Hmm. Variable substitution is still reassuringly hellish though.
Configuring up filesystems is not a million miles from format(1M) and metainit(1M). In order to make it more interesting I am working with a server connected to a KVM switch which exports the Windows screen over HTTP. This has some interesting effects;
Anyway, enough whinging. How is it going? Well. To start with, it's worth saying that Server 2003 comes with a graphical volume manager (Control Panel -> Administrative Tools -> Computer Management -> Disk Management). This, as the Help menu item will reassure you, is provided to Microsoft by the VERITAS Software Corporation, about which I have written before. Now you have found the GUI, forget you ever saw it for two reasons:
Real men in this part of the operating system forest use Diskpart - a command shell which you can use interactively or feed with scripts from the command line. This is combined with the Format command, so named to confuse Solaris folk. Diskpart plays the role of format(1M) and metainit(1M). Format plays the role of newfs(1M) or mkfs(1M). Do keep up!
I was going to list a quick sample of the filesystem creation process but the KVM web server makes it all too painful to gather the data. Instead, I feel I probably haven't irritated Jonathan enough by 'fessing up to using XP (my two years at Dell with a Sun Ultra 1 Creator on my desk for similar purposes caused much management wailing and gnashing of teeth) so I'm going to whole-heartedly recommend Bill Stanek's Windows Command Line Pocket Book from Microsoft Press. If you were brought up in a Unix shell, starting here will cut through a ton of larger books.
Anyone starting I/O tuning wants to match the modal read and write request of their application to the capabilities of the underlying hardware. The mechanism for achieving this is the volume manager (the functionality of which may or may not be part of the file system but never mind). The mechanics of QFS for example are set out here and its counterpart for Solaris Volume Manager is here. Similar information for Veritas Volume Manager is in this large PDF manual. The point I am making is that anyone with the vaguest interest in tuning their I/O subsystem ends up in this section of the manual for the product of choice. With Windows Server 2003, it was quite hard to find anything on this topic. My web search revealed many pages with bland reassurances similar to;
"With a striped volume, data is divided into blocks and spread in a fixed order among all the disks in the array, similar to spanned volumes. Striping writes files across all disks so that data is added to all disks at the same rate."
This really tells me nothing, or rather it raises more questions than it answers: How big are these blocks? What is the order (round-robin, parallel, ...)? How do I change these things? Tantalising, isn't it? More digging revealed this gnomic utterance:
"For Windows Server 2003, the size of each stripe is 64 kilobytes (KB)."
That's it. End of story: I expected more, I really did - especially as, like I mentioned, it's got Veritas written on the label. If I'm missing something (another manual? strange Registry witchcraft?) please let me know (I'm Dominic Kay, I work at Sun; take a wild guess at the email address). I know I could download Mark Russinovich and Bryce Cogswell's debugging kit and plough through the stack traces to reverse engineer the I/O subsystem for myself but you know that might be cheating. I've looked at the Enterprise, Datacentre and Storage versions of Windows Server 2003 and I can't see the vital difference from the vanilla flavour I'm looking for. By the way, the semantic splicing of those product names is a bit frown inducing, like "Large", "Big" and "Not Small".
So I am left with two possibilities as to why there are no tuning knobs for the Windows Server 2003 volume manager:
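Going back to that 64 KB figure for a moment: here is a minimal sketch of what it would imply if the order were plain round-robin. The round-robin layout, the four-disk example and the function names are assumptions made purely for illustration; the only number taken from the documentation is the 64 KB stripe size, and nothing here claims to describe what Windows Server 2003 actually does.

```python
STRIPE_BYTES = 64 * 1024  # the one figure the documentation gives


def locate(offset, n_disks, stripe_bytes=STRIPE_BYTES):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    ASSUMPTION: stripes are dealt out to the disks in plain round-robin order.
    """
    stripe_no, within = divmod(offset, stripe_bytes)
    disk = stripe_no % n_disks      # which disk holds this stripe
    row = stripe_no // n_disks      # how many full rows of stripes precede it
    return disk, row * stripe_bytes + within


def disks_touched(offset, length, n_disks, stripe_bytes=STRIPE_BYTES):
    """Return the sorted disk indexes that a single request has to visit."""
    first = offset // stripe_bytes
    last = (offset + length - 1) // stripe_bytes
    return sorted({s % n_disks for s in range(first, last + 1)})


if __name__ == "__main__":
    # An aligned 8 KB request versus an aligned 256 KB request on four disks.
    print(disks_touched(0, 8 * 1024, 4))
    print(disks_touched(0, 256 * 1024, 4))
    # An 8 KB request that straddles a stripe boundary.
    print(disks_touched(60 * 1024, 8 * 1024, 4))
```

Under that assumption a small aligned request lands on a single spindle, a request of stripe size times disk count keeps every spindle busy, and a small misaligned request costs two disks instead of one. That is exactly why one wants to know the stripe size and the order before matching the application's modal request size to the hardware.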
We shall see. Talk to you later.
( Dec 20 2005, 01:50:59 PM GMT )
Permalink
Thanks for your kind offer of free alcohol in one of London's top hotels via SMS. That's not an invitation I get often, especially from the head chef. Unfortunately it arrived just as I had turned in for the night 60 miles down the road on the coast. I could have used a pint though. How long has it been now?
My cellphone didn't wake the baby. Did I tell you we had another? That's 3 now. No more. Also we have a spanking new kitchen so I can get on with my baking without swearing all the time. Did I tell you I'm a bread baker? Probably not. I'm thinking about doing it for a living but am too scared to open a boulangerie even though I have the capital to do so. I was inspired by The Handmade Loaf. What else is new in Kay's life?
* My stepfather died at the end of 2003; my dad died at the end of last year; my mum died 4 months ago. I think my family is done with funerals for a while but we have several house jokes which the children are required to recite, one of which is:
Q: What makes God laugh?
A: People with plans
Talking of which....
* I lost my job last week after 7 years. My last day at Sun will be 14 October. Guess I'll need a new email address and blog site, huh. It was nothing personal - I was just amongst the number and so am looking for work, principally in the areas of system performance, capacity planning, benchmarking, system and network architectures, and secondarily in technical team leadership and project management.
My resume is here though for reasons of brevity I have left out all the work I did on fishing boats, in French restaurant kitchens and fighting Argentina (for reasons that are still not clear to me, 13 years later). Some papers I have written are here and here. I'm sure you will offer me a start in your kitchen (as Plongeur de Maison, naturally) but hopefully the world of IT has not yet had its fill of me.
* We are going to buy a house in France (with a BIG old stone bread oven). There is symmetry here: My Mum loved the place and her old friends first talked of selling it to me as we drove from the crematorium. Another coincidence is that it is in the town of Descartes and I am hopeless at maths.
* We now have 3 Mercedes with a combined mileage of over half a million miles. The secret is frequent oil changes. Being heir to an oil fortune would also help with the bills. If you hear people saying they don't make them like they used to, believe it. They're still a whole lot more robust than any other brand though.
* Out of the dozen vines in my vineyard in the garden, 4 have fruited. This is year 3 for them so that's not so good. Fortunately the proposed purchase in France has a big field out back so I can desist with this fool's errand and grow them in the climate God provided for the purpose.
Keep in touch. Must run now - I have revision to do as one of my prospective employers expects me to have more than a passing knowledge of TCP internals (never mind).
Dom.
( Sep 10 2005, 01:10:12 AM BST )
Permalink
Comments [0]
My boss asked me if there was any training I needed. This is a good sign. Unless of course he is looking for something to pad out my severance package such as a course in bricklaying or preparation for the Microsoft Certified Systems Engineer exams. But, hey, the world will always need bricklayers, right? Anyway I replied drily that I get most of my training from Amazon these days. Which is true but the costs are beginning to match those of training courses. Supply and demand dictates that if you want obscure books on systems modelling you must pay the price? Not necessarily so. Here are some expensive titles, for free.
If you are new to the area, I would say that googling for "Markov Chains" is probably not the best way to start, particularly if, like me, you took a fairly relaxed attitude to your mathematics education. Instead go and read Neil Gunther's quite approachable series of articles at the Teamquest site.
Quantitative System Performance: Computer System Analysis Using Queueing Network Models dates back to 1984. No doubt if it were still in print it would cost the far side of sixty bucks. It's still very relevant, cited frequently, and is yours for the patience to download and unzip it. A gentle introduction to the joys of mean value analysis: I commend it to you.
If on the other hand you are slightly better mathematically equipped you might want to jump straight in and get Introduction to Queueing Theory (pdf) by Bob Cooper. This and the sundry others are there for the taking at Myron Hlynka's listings.
By now you may well want to play: WinPEPSY (for Windows-enabled readers) is an implementation of PEPSY-QNS; it's a very useful tool for graphically constructing and visualising the parameters for systems of network queues. Here is a screenshot to whet your appetite.
Enjoy.
( Aug 08 2005, 11:41:55 AM BST )
Permalink
Comments [0]
I am gratified to learn that it isn't just me who is banging his head against the wall with the blogging infrastructure, Roller.
Phil Harman, whom few would describe as a technical slouch, is also vexed. Ditto Richard McDougall. The concept of blogging is brilliant and the way it's been executed in Sun is marvellous. The gripe is that Roller is a web application. It's a fine effort but it's simply not finished. A keen knowledge of HTML and CSS is required to make forward progress on any but the most minor formatting issues. To which I hear you reply "If you can't even master a trivial markup language and meta-language in order to create your glorified post-it notes, why on earth did they give you a job?" Quite right. Given that I've told my children they can only play computer games if they construct them with the editor and assembler I've supplied, this is hypocrisy of the worst sort.
Another observation is that when initialising a blogspace you get a number of links to other people's blogs "for free". These are, I'm told, the great and the good of the Roller project and so forth and I'm advised to retain them as a mark of respect. No. Nor will I link to Jonathan: Everyone else does - he really doesn't need me. Instead I shall save my sycophancy for a select few (several of whom I've never physically met - the joys of iWork!). I will only entertain a few links to other bloggers and the criteria are
but broadly, they have to be people who have changed the way I think.
I'm not saying this man is the Travis Bickle of the Operating Systems world but would you argue with him? Are you Luca Brasi?
So that's it. No-one else. Not never. And I won't even link to these until the libel proceedings have subsided.
There are several things that interest me. Filesystem and datapath software design is one. Computer performance is another; particularly datapath performance of course but also the whole stack. Open Source software that helps to improve performance is a third: load generators, probes and monitors, mathematical and graphical software for statistical manipulation, queuing theory and simulation; that sort of thing. I'm not alone here. Richard Cockroft, author of perhaps the primary source on Solaris performance, has blogged on this topic.
What do I mean by visualising performance? Well, look at the following table, extracted from the Lustre Wiki - data gleaned from a netperf benchmark of 10 gigabit ethernet interfaces, increasing the payload size and the size of the socket buffer:
Throughput in MBytes/s, by send size (rows) and socket buffer size (columns):

| Send Size | 128K   | 256K   | 512K   | 1M     | 2M     | 4M     | 8M     | 16M    |
|-----------|--------|--------|--------|--------|--------|--------|--------|--------|
| 8K        | 212.79 | 260.79 | 273.72 | 314.31 | 362.51 | 349.24 | 358.81 | 376.20 |
| 16K       | 218.68 | 259.10 | 273.53 | 314.24 | 362.34 | 348.82 | 358.39 | 376.09 |
| 32K       | 213.63 | 260.07 | 273.29 | 329.90 | 362.17 | 349.00 | 358.63 | 376.01 |
| 64K       | 221.17 | 263.98 | 273.31 | 316.10 | 361.51 | 348.74 | 358.11 | 375.91 |
| 128K      | 224.50 | 266.42 | 273.96 | 313.34 | 362.08 | 348.88 | 358.20 | 376.39 |
| 256K      | 221.97 | 260.96 | 275.27 | 290.05 | 361.51 | 348.68 | 357.97 | 376.48 |
| 512K      | 222.43 | 265.68 | 274.28 | 289.10 | 361.28 | 348.95 | 358.14 | 376.37 |
| 1M        | 226.24 | 266.02 | 275.66 | 295.67 | 361.64 | 348.70 | 357.93 | 376.71 |
This is a common enough scenario. There is one dependent variable: the throughput of the connection. There are two independent variables: the size of the socket buffer and the size of the request. I had to look at that table for quite a while before I could see the result - the relationship. This is very common in benchmarking. Often, only two causal factors would be considered to be on the light side; the mount parameters for a filesystem can run to a dozen or more.
OK, so this example is not one that is going to set the world alight but it's in the public domain, which helps. I have to get drunk with people who, in terms of scientific visualisation, have bigger fish to fry. But these days we (Sun) have bigger fish on the chopping board - especially petabyte storage and grids; both of the compute and storage varieties. You cannot build these things in the lab on a whim; you have to model and modelling means visualisation.
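Before any tool can plot a table like this it has to be flattened into one row per observation, with the dependent variable and the two independent variables as columns. A minimal sketch of that step follows, using only the first two rows of the table. The column names mbs, soc_buf and send_kb are borrowed from the R snippet further down; the helper names and the decision to express both sizes in kilobytes are assumptions for illustration.

```python
import csv
import io

# First two rows of the table above, as whitespace-separated text.
TABLE = """\
Send 128K 256K 512K 1M 2M 4M 8M 16M
8K 212.79 260.79 273.72 314.31 362.51 349.24 358.81 376.20
16K 218.68 259.10 273.53 314.24 362.34 348.82 358.39 376.09
"""


def to_kb(label):
    """Convert a size label such as '128K' or '2M' to kilobytes."""
    number, unit = int(label[:-1]), label[-1]
    return number * {"K": 1, "M": 1024}[unit]


def flatten(table_text):
    """Yield one (mbs, soc_buf, send_kb) tuple per table cell."""
    lines = table_text.strip().splitlines()
    buffers = [to_kb(label) for label in lines[0].split()[1:]]
    for line in lines[1:]:
        send_label, *cells = line.split()
        for soc_buf, cell in zip(buffers, cells):
            yield float(cell), soc_buf, to_kb(send_label)


rows = list(flatten(TABLE))

# Write the long-format rows out as CSV text.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["mbs", "soc_buf", "send_kb"])
writer.writerows(rows)
csv_text = out.getvalue()

if __name__ == "__main__":
    print(csv_text)
```

Each cell of the grid becomes one line of CSV, which is the shape that formula-style plotting interfaces generally expect.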
I found this graph more intuitive:
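[Figure: wireframe plot of throughput (MB/s) against socket buffer size (KB) and send size (KB)]

# lattice provides wireframe(); it ships with R as a recommended package.
library(lattice)
# Pick the CSV interactively. choose.files() is Windows-only;
# file.choose() is the portable equivalent.
g_data <- read.csv(fileName <- choose.files("*.csv"), header = TRUE)
# Throughput as a surface over the two independent variables.
print(wireframe(mbs ~ soc_buf * send_kb, data = g_data,
                zlab = "MB/s",
                ylab = "Send size (KB)",
                xlab = "Socket buffer size (KB)",
                drape = TRUE,
                colorkey = TRUE))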
The code above is for R, a free software environment for statistical computing and graphics, more of which below. I think the key message is "This is not a lot of code" (to 'fess up, I did have to convert the pretty-printed table back to CSV first). So this more or less tells us that one of the variables has little effect. But we can do better than this:
[Figure: scatterplot matrix of the benchmark variables]

g_data <- read.csv(fileName <- choose.files("*.csv"), header = TRUE)
print(splom(~ g_data))  # every variable plotted against every other
This gives us a scatterplot matrix. In two lines of code we can compare the relationship between every variable in the test and the relationships leap from the page. In our case there are only three dimensions but trellis graphics (in S-Plus, the commercial version) or lattice graphics (in R) allow us several graphical methods to visually explore our data.
What does it tell us? That after a certain point, increasing the size of the buffer provides no further boost in throughput. This is important as kernel memory is a finite resource.
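The same conclusion can be checked numerically. The sketch below averages the table's throughput figures over all send sizes for each socket buffer size and then reports the percentage change for each doubling of the buffer. The figures are copied from the table above; averaging across send sizes is a simplification chosen here for illustration, and the function names are hypothetical.

```python
from statistics import mean

BUFFERS = ["128K", "256K", "512K", "1M", "2M", "4M", "8M", "16M"]

# Throughput in MBytes/s, one row per send size (8K up to 1M).
TABLE = [
    [212.79, 260.79, 273.72, 314.31, 362.51, 349.24, 358.81, 376.20],
    [218.68, 259.10, 273.53, 314.24, 362.34, 348.82, 358.39, 376.09],
    [213.63, 260.07, 273.29, 329.90, 362.17, 349.00, 358.63, 376.01],
    [221.17, 263.98, 273.31, 316.10, 361.51, 348.74, 358.11, 375.91],
    [224.50, 266.42, 273.96, 313.34, 362.08, 348.88, 358.20, 376.39],
    [221.97, 260.96, 275.27, 290.05, 361.51, 348.68, 357.97, 376.48],
    [222.43, 265.68, 274.28, 289.10, 361.28, 348.95, 358.14, 376.37],
    [226.24, 266.02, 275.66, 295.67, 361.64, 348.70, 357.93, 376.71],
]


def mean_by_buffer(table):
    """Average each column, i.e. each socket buffer size, over all send sizes."""
    return [mean(column) for column in zip(*table)]


def gains(means):
    """Percentage change in mean throughput for each doubling of the buffer."""
    return [100.0 * (b - a) / a for a, b in zip(means, means[1:])]


means = mean_by_buffer(TABLE)

if __name__ == "__main__":
    for label, m in zip(BUFFERS, means):
        print(f"{label:>5}  {m:7.2f} MBytes/s")
    for label, g in zip(BUFFERS[1:], gains(means)):
        print(f"doubling to {label:>4}: {g:+.1f}%")
```

On these figures the mean climbs from roughly 220 MBytes/s at 128K to roughly 362 at 2M, then only to roughly 376 at 16M. Eight times the kernel memory buys a few percent, and the step from 2M to 4M is actually a small loss.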
Then it's just a matter of drilling down for the "management summary" (But 'fessing up again, I am daintily sidestepping the thorny topic of non-linear regression analysis. Another day.):
[Figure: throughput (MB/s) plotted against socket buffer size (KB)]

xyplot(mbs ~ soc_buf, data = g_data, aspect = "xy", ylab = "MB/s", xlab = "Socket buffer size (KB)")
So then. My elevator pitch for R.
When all is said and done, it's really good for performance & capacity planning "exploration"; later on I'll measure an elephant for you in pretty quick time in R.
So endeth my first blog; respect and gratitude to David Levy for requisite motivational arse kicking and Simon Dachtler for finding time to produce my banner graphic while still keeping the Far-East manufacturing economy ticking over.