20050721 Thursday July 21, 2005

Visualising Performance



There are several things that interest me. Filesystem and datapath software design is one. Computer performance is another: datapath performance in particular, of course, but also the whole stack. Then there is Open Source software that helps in improving performance: load generators, probes and monitors, and mathematical and graphical software for statistical manipulation, queuing theory and simulation; that sort of thing. I'm not alone here. Richard Cockroft, author of perhaps the primary source on Solaris performance, has blogged on this topic.

What do I mean by visualising performance? Well, look at the following table, extracted from the Lustre Wiki - data gleaned from a netperf benchmark of 10 gigabit ethernet interfaces, increasing the payload size and the size of the socket buffer:

Throughput in MBytes/s, by send size (rows) and socket buffer size (columns)

Send Size    128K     256K     512K     1M       2M       4M       8M       16M
8K           212.79   260.79   273.72   314.31   362.51   349.24   358.81   376.20
16K          218.68   259.10   273.53   314.24   362.34   348.82   358.39   376.09
32K          213.63   260.07   273.29   329.90   362.17   349.00   358.63   376.01
64K          221.17   263.98   273.31   316.10   361.51   348.74   358.11   375.91
128K         224.50   266.42   273.96   313.34   362.08   348.88   358.20   376.39
256K         221.97   260.96   275.27   290.05   361.51   348.68   357.97   376.48
512K         222.43   265.68   274.28   289.10   361.28   348.95   358.14   376.37
1M           226.24   266.02   275.66   295.67   361.64   348.70   357.93   376.71

This is a common enough scenario in benchmarking. There is one dependent variable: the throughput of the connection. There are two independent variables: the size of the socket buffer and the size of the request. I had to look at that table for quite a while before I could see the result, that is, the relationship. And two causal factors would often be considered on the light side; the mount parameters for a filesystem can run to a dozen or more.
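To make the one-dependent, two-independent structure concrete, here is a minimal sketch that puts the table into "long" form, one row per measurement. The figures are typed in from the table above. The column names `send_kb`, `soc_buf_kb` and `mbs` follow the R snippets in this post, but the exact layout of the original CSV is an assumption, as is reading 1M as 1024 KB.

```r
# Axis values in KB (assumption: 1M = 1024 KB, and so on).
send_kb    <- c(8, 16, 32, 64, 128, 256, 512, 1024)
soc_buf_kb <- c(128, 256, 512, 1024, 2048, 4096, 8192, 16384)

# Throughput in MBytes/s: one row per send size, one column per buffer size.
mbs <- matrix(c(
  212.79, 260.79, 273.72, 314.31, 362.51, 349.24, 358.81, 376.20,
  218.68, 259.10, 273.53, 314.24, 362.34, 348.82, 358.39, 376.09,
  213.63, 260.07, 273.29, 329.90, 362.17, 349.00, 358.63, 376.01,
  221.17, 263.98, 273.31, 316.10, 361.51, 348.74, 358.11, 375.91,
  224.50, 266.42, 273.96, 313.34, 362.08, 348.88, 358.20, 376.39,
  221.97, 260.96, 275.27, 290.05, 361.51, 348.68, 357.97, 376.48,
  222.43, 265.68, 274.28, 289.10, 361.28, 348.95, 358.14, 376.37,
  226.24, 266.02, 275.66, 295.67, 361.64, 348.70, 357.93, 376.71
), nrow = length(send_kb), byrow = TRUE)

# as.vector() walks the matrix column by column, so the send size
# cycles fastest and each buffer size repeats once per send size.
g_data <- data.frame(
  send_kb    = rep(send_kb, times = length(soc_buf_kb)),
  soc_buf_kb = rep(soc_buf_kb, each = length(send_kb)),
  mbs        = as.vector(mbs)
)

# Write it out as CSV (to a temporary file here, purely for illustration).
csv_path <- tempfile(fileext = ".csv")
write.csv(g_data, csv_path, row.names = FALSE)
```

A long layout like this is what formula-based plotting functions expect: each independent variable is a column, and adding a third or fourth factor means adding a column rather than reshaping the table.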

OK, so this example is not one that is going to set the world alight, but it's in the public domain, which helps. I have to get drunk with people who, in terms of scientific visualisation, have bigger fish to fry. But these days we (Sun) have bigger fish on the chopping board, especially petabyte storage and grids, both of the compute and storage varieties. You cannot build these things in the lab on a whim; you have to model, and modelling means visualisation.

I found this graph more intuitive:
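require(lattice)

# Read the CSV form of the table (columns: send_kb, soc_buf_kb, mbs).
# choose.files() is Windows-only; file.choose() is the portable equivalent.
g_data <- read.csv(fileName <- choose.files("*.csv"), header = TRUE)

# Throughput as a surface over socket buffer size and send size
print(wireframe(mbs ~ soc_buf_kb * send_kb, data = g_data,
    zlab = "MB/s",
    ylab = "Send size (KB)",
    xlab = "Socket buffer size (KB)",
    drape = TRUE,
    colorkey = TRUE
    ))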

The code above is written for R, a free software environment for statistical computing and graphics, more of which below. I think the key message is "This is not a lot of code" (to 'fess up, I did have to deprocess the pretty-printed table back to CSV). So this more or less tells us that one of the variables, the send size, has little effect. But we can do better than this:
g_data <- read.csv(fileName <- choose.files("*.csv"), header = TRUE)
print(splom(~ g_data))

This gives us a scatterplot matrix. In two lines of code we can compare the relationship between every variable in the test and the relationships leap from the page. In our case there are only three dimensions but trellis graphics (in S-Plus, the commercial version) or lattice graphics (in R) allow us several graphical methods to visually explore our data.

What does it tell us? That after a certain point, increasing the size of the buffer provides no further boost in throughput. This is important as kernel memory is a finite resource.
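As a rough numerical check on that reading, here is a sketch that averages the table's figures per socket buffer size and per send size. The matrix is typed in from the table above; the names `mbs`, `by_buffer` and `by_send` are illustrative only, and simple averaging is a swapped-in shortcut, not a substitute for a proper regression.

```r
# Throughput in MBytes/s: rows are send sizes, columns are socket buffer sizes.
mbs <- matrix(c(
  212.79, 260.79, 273.72, 314.31, 362.51, 349.24, 358.81, 376.20,
  218.68, 259.10, 273.53, 314.24, 362.34, 348.82, 358.39, 376.09,
  213.63, 260.07, 273.29, 329.90, 362.17, 349.00, 358.63, 376.01,
  221.17, 263.98, 273.31, 316.10, 361.51, 348.74, 358.11, 375.91,
  224.50, 266.42, 273.96, 313.34, 362.08, 348.88, 358.20, 376.39,
  221.97, 260.96, 275.27, 290.05, 361.51, 348.68, 357.97, 376.48,
  222.43, 265.68, 274.28, 289.10, 361.28, 348.95, 358.14, 376.37,
  226.24, 266.02, 275.66, 295.67, 361.64, 348.70, 357.93, 376.71
), nrow = 8, byrow = TRUE,
dimnames = list(
  send    = c("8K", "16K", "32K", "64K", "128K", "256K", "512K", "1M"),
  soc_buf = c("128K", "256K", "512K", "1M", "2M", "4M", "8M", "16M")
))

# Mean throughput for each socket buffer size, averaged over send sizes.
by_buffer <- colMeans(mbs)

# Mean throughput for each send size, averaged over buffer sizes.
by_send <- rowMeans(mbs)

# Gain from each doubling of the buffer, as a percentage of the previous mean.
gain_pct <- 100 * diff(by_buffer) / head(by_buffer, -1)

print(round(by_buffer, 1))
print(round(by_send, 1))
print(round(gain_pct, 1))
```

On this data the per-buffer means climb steeply up to 2M and move far less beyond it, while the per-send-size means stay within a few MBytes/s of each other.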

Then it's just a matter of drilling down for the "management summary" (but 'fessing up again, I am daintily sidestepping the thorny topic of non-linear regression analysis. Another day.):

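# Throughput against socket buffer size alone; data = g_data lets the
# formula find the mbs and soc_buf_kb columns
xyplot(mbs ~ soc_buf_kb, data = g_data,
    aspect = "xy",
    ylab = "MB/s",
    xlab = "Socket buffer size (KB)")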

So then, my elevator pitch for R.

When all is said and done, it's really good for performance & capacity planning "exploration"; later on I'll measure an elephant for you in pretty quick time in R.

So endeth my first blog; respect and gratitude to David Levy for requisite motivational arse kicking and Simon Dachtler for finding time to produce my banner graphic while still keeping the Far-East manufacturing economy ticking over.


( Jul 21 2005, 04:29:31 PM BST )

Trackback URL: http://blogs.sun.com/dom/entry/visualising_performance
Comments:

Always nice to see netperf mentioned :)

Posted by rick jones on August 25, 2005 at 02:01 AM BST #

http://blogs.sun.com/eclectic/

Posted by Wayne Horkan on October 03, 2007 at 01:49 AM BST #

i think you have a intresting website it is very complicated it is me rosie your friend simon dachtler i am his daughter rosie you know me and nice web site

Posted by rosie dachtler on December 23, 2008 at 05:24 PM GMT #
