Henk Vandenbergh
Swat: a disk and tape I/O performance and workload reporter.
Swat: Sun StorageTek Workload Analysis Tool.
This tool was written to help Sun’s engineering, sales, and service organizations understand a customer’s storage I/O workload.
Swat can be used for, among many other things:
- Problem analysis
- Configuration sizing (simply buying x GB of storage won’t do anymore)
- Trend analysis: is my workload growing, and can I identify/resolve problems before they happen?
Swat is storage agnostic, so it does not matter what type or brand of storage you are trying to report on. Swat reports the host’s view of the storage performance, using the same Kstat (Solaris) data that iostat uses.
Swat consists of several different major functions:
- Swat Performance Monitor (SPM)
- Swat Trace Facility (STF)
- Swat Trace Monitor (STM)
- Swat Real Time Monitor
- Swat Local Real Time Monitor
- Swat Reporter
Swat Performance Monitor (SPM):
For Solaris and Windows, with an attempt in the current Swat 3.01 beta to also collect data on AIX and Linux.
Swat 3.01 also reports Network Adapter statistics on Solaris, Windows, and Linux.
A Swat Data Collector (agent) runs on some or all of your servers/hosts, collects I/O performance statistics every 5, 10, or 15 minutes, and writes the data to a disk file. A new file is started every day, switched automatically at midnight.
The data then can be analyzed using the Swat Reporter.
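The daily file switch can be pictured with a small sketch. This is only an illustration under assumptions: the class name, the method names, and the file naming scheme are hypothetical and are not taken from Swat itself. Only the "one file per day, switched at midnight" behavior comes from the description above.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

// Hypothetical sketch of a collector that switches output files at midnight.
// Not Swat's actual code; the file naming scheme is an assumption.
class DailyFileSwitchSketch {
    private LocalDate currentDay;   // day of the file currently being written
    private int filesOpened = 0;    // how many daily files have been started

    // Assumed naming scheme, for illustration only.
    static String fileNameFor(LocalDate day) {
        return "swat_" + day + ".dat";
    }

    // Returns the file a sample taken at 'now' belongs to,
    // starting a new file whenever the calendar date has changed.
    String fileForSample(LocalDateTime now) {
        LocalDate day = now.toLocalDate();
        if (!day.equals(currentDay)) {
            currentDay = day;
            filesOpened++;
        }
        return fileNameFor(currentDay);
    }

    int filesOpened() { return filesOpened; }

    public static void main(String[] args) {
        DailyFileSwitchSketch collector = new DailyFileSwitchSketch();
        LocalDateTime sampleTime = LocalDateTime.of(2008, 7, 31, 23, 50);
        for (int i = 0; i < 4; i++) {
            System.out.println(sampleTime + " -> " + collector.fileForSample(sampleTime));
            sampleTime = sampleTime.plusMinutes(5);   // 5-minute collection interval
        }
    }
}
```

Keying the switch on the sample's own date means the first sample taken at or after midnight starts the new file, whatever the collection interval is.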
Swat Trace Facility (STF):
For Solaris and Windows. STF collects detailed I/O trace information. This data then goes through a data Extraction and Analysis phase that generates hundreds or thousands of second-by-second statistics counters. That data then can be analyzed using the Swat Reporter.
You run this trace for, say, 30 to 60 minutes, at a time when you know you will have a performance problem.
A disk I/O workload traced using Swat can be replayed on any test system to any type of storage using Vdbench (another of my tools). This allows you to trace a customer’s production I/O workload, bring the trace data to your lab, and then replay your customer’s I/O workload on whatever storage you want. Want to see how the storage performs when the I/O rate doubles or triples? Vdbench Replay will show you.
With this you can test your customer’s production workload without the hassle of having to get your customer’s database software and licenses, his application software and licenses, or even his production data.
Solaris uses TNF to collect trace data. Alas, Solaris only allows for a 128MB trace buffer. To allow for longer traces, Swat monitors how full the trace buffer is, and once it is 80% full it offloads the trace buffer to disk, all the while allowing the creation of new trace data to continue. This continues until the requested trace duration has been reached.
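The offload rule can be sketched roughly as follows. This is a minimal illustration, not Swat's actual code: the class and method names are hypothetical, and the trace buffer is simulated with a simple byte counter. Only the 128MB limit and the 80% threshold come from the text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the 80% offload rule; not Swat's actual implementation.
class TraceBufferOffloadSketch {
    static final long CAPACITY_BYTES = 128L * 1024 * 1024; // 128MB limit (from the text)
    static final double OFFLOAD_FRACTION = 0.80;           // offload at 80% full (from the text)

    private long usedBytes = 0;                                    // simulated fill level
    private final List<Long> offloadedChunks = new ArrayList<>();  // stands in for disk writes

    // Simulates the tracer adding new trace data to the buffer.
    void write(long bytes) {
        usedBytes += bytes;
    }

    // Called periodically by the monitor: offload once the buffer is 80% full.
    boolean poll() {
        if (usedBytes >= CAPACITY_BYTES * OFFLOAD_FRACTION) {
            offloadedChunks.add(usedBytes); // a real tool would write the buffer to disk here
            usedBytes = 0;                  // space is free again; tracing continues
            return true;
        }
        return false;
    }

    long usedBytes() { return usedBytes; }
    int offloadCount() { return offloadedChunks.size(); }

    public static void main(String[] args) {
        TraceBufferOffloadSketch buffer = new TraceBufferOffloadSketch();
        long oneMb = 1024L * 1024;
        for (int second = 0; second < 250; second++) {
            buffer.write(oneMb);   // assumed rate: 1MB of trace data per second
            buffer.poll();
        }
        System.out.println("offloads=" + buffer.offloadCount()
                + ", bytes still buffered=" + buffer.usedBytes());
    }
}
```

Offloading at 80% rather than 100% leaves headroom, so new trace records still have somewhere to go while the offload is in progress.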
Swat Trace Monitor (STM):
With STF you need to know when the performance problem will occur so that you can schedule the trace to run. Not every performance problem, however, is predictable. STM runs an in-memory trace and monitors the overall storage performance. Once a certain threshold is reached, for instance a response time greater than 100 milliseconds, the in-memory trace buffer is dumped to disk and the trace then continues for another nn seconds before terminating.
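The trigger logic can be illustrated with a small sketch. Everything here is an assumption made for illustration: the class and method names are hypothetical, trace records are plain strings, and the "disk" is a list. Only the idea of an in-memory trace that is dumped when a threshold is crossed and then continues for a fixed number of seconds comes from the description above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of a threshold-triggered in-memory trace; not Swat's actual code.
class TraceMonitorSketch {
    private final int capacity;            // seconds of trace kept in memory
    private final double thresholdMs;      // e.g. 100 milliseconds
    private final int postTriggerSeconds;  // the "nn seconds" after the trigger
    private final Deque<String> ring = new ArrayDeque<>();
    private final List<String> dumped = new ArrayList<>(); // stands in for the disk file
    private boolean triggered = false;
    private int remainingSeconds = 0;

    TraceMonitorSketch(int capacity, double thresholdMs, int postTriggerSeconds) {
        this.capacity = capacity;
        this.thresholdMs = thresholdMs;
        this.postTriggerSeconds = postTriggerSeconds;
    }

    // Feeds one second of trace data; returns false once tracing should terminate.
    boolean onSecond(String record, double responseMs) {
        if (triggered) {
            dumped.add(record);            // after the trigger, records go straight to disk
            remainingSeconds--;
            return remainingSeconds > 0;
        }
        if (ring.size() == capacity) {
            ring.removeFirst();            // buffer full: overwrite the oldest record
        }
        ring.addLast(record);
        if (responseMs > thresholdMs) {
            dumped.addAll(ring);           // dump the in-memory trace
            ring.clear();
            triggered = true;
            remainingSeconds = postTriggerSeconds;
            return remainingSeconds > 0;
        }
        return true;
    }

    List<String> dumped() { return dumped; }

    public static void main(String[] args) {
        TraceMonitorSketch monitor = new TraceMonitorSketch(3, 100.0, 2);
        double[] responseMs = {5, 6, 7, 8, 150, 20, 20, 20};
        for (int t = 0; t < responseMs.length; t++) {
            if (!monitor.onSecond("t" + t, responseMs[t])) {
                break;                     // requested post-trigger time has elapsed
            }
        }
        System.out.println("dumped records: " + monitor.dumped());
    }
}
```

Because the buffer keeps overwriting its oldest records, the dump contains the moments leading up to the problem, which is the part a trace started after the fact would miss.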
Swat Real Time Monitor:
When a data collector is active on your current or any network-connected host, Swat Real Time Monitor will open a Java socket connection with that host, allowing you to actively monitor the current storage performance either from your local or any of your remote hosts.
Swat Local Real Time Monitor:
The quickest way to start using Swat. Just enter ‘./swat -l’ and Swat will start a private Data Collector for your local system and then will show you exactly what is happening to your current storage workload. No more fiddling trying to get some useful data out of a pile of iostat output.
Swat Reporter:
The Swat Reporter ties everything together. All data collected by the above Swat functions can be displayed using this powerful GUI reporting and charting function.
You can generate hundreds of different performance charts or tabulated reports, giving you an intimate understanding of your storage workload and performance.
Swat will even create JPG files for you that then can be included in documents and/or presentations. There is even a batch utility (Swat Batch Reporter) that will automate the JPG generation for you. If you want, Swat will create a script for this batch utility for you.
Some of the many available charts:
- Response time per controller or device
- I/O rate per controller or device
- Read percentage
- Data transfer size
- Queue depth
- Random vs. sequential (STF only)
- CPU usage
- Device skew
- Etc. etc.
Swat has been written in Java. This means that once your data has been collected on its originating system, the data can be displayed and analyzed using the Swat Reporter on ANY Java-enabled system, including any type of laptop.
Swat is now available for the general public:
The version currently available for the general public (Swat 3.00) will expire on September 3, 2009.
We are also planning to make Swat open source. One major issue that needs to be resolved is the fact that Swat uses a licensed third-party Java-based charting package named KavaChart. All that code will have to be rewritten. I’ll keep you posted.
Henk.
Posted at 12:12PM Jul 31, 2008 by Henk Vandenbergh in Sun | Comments[2]
Henk,
How does Swat calculate the response time (round trip)? Where is the check point? (For example, if the check point is on the HBA: the HBA sends an I/O request out to the storage and a response comes back to the HBA.)
thanks,
Peisui Allen
Posted by Peisui Allen on September 18, 2008 at 02:33 PM MDT #
The response times are calculated as follows:
For SPM: Swat uses the raw Kstat performance statistics maintained by Solaris.
For STF: The delta between the strategy and the biodone trace probes.
Posted by Henk Vandenbergh on September 18, 2008 at 02:48 PM MDT #
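For the STF case in the reply above, the delta between the two probes could be computed along these lines. This is a sketch under assumptions: the class and method names are hypothetical, and each probe event is assumed to carry a buffer identifier and a nanosecond timestamp. The actual TNF record layout and Swat's own matching logic are not described in this post.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: response time as the delta between a 'strategy' probe
// (I/O handed to the driver) and the matching 'biodone' probe (I/O completed).
// Field names and the matching key are assumptions, not Swat's actual code.
class ProbeDeltaSketch {
    // Outstanding I/Os: buffer id -> timestamp of the strategy probe, in nanoseconds.
    private final Map<Long, Long> startNanos = new HashMap<>();

    void onStrategy(long bufferId, long timestampNanos) {
        startNanos.put(bufferId, timestampNanos);
    }

    // Returns the response time in milliseconds, or -1 if no matching
    // strategy probe was seen (for example, it predates the trace).
    double onBiodone(long bufferId, long timestampNanos) {
        Long start = startNanos.remove(bufferId);
        if (start == null) {
            return -1;
        }
        return (timestampNanos - start) / 1_000_000.0;
    }

    int outstanding() { return startNanos.size(); }

    public static void main(String[] args) {
        ProbeDeltaSketch sketch = new ProbeDeltaSketch();
        sketch.onStrategy(1L, 1_000_000L);
        sketch.onStrategy(2L, 2_000_000L);
        System.out.println("I/O 1: " + sketch.onBiodone(1L, 6_500_000L) + " ms");
        System.out.println("I/O 2: " + sketch.onBiodone(2L, 4_000_000L) + " ms");
    }
}
```

Measured this way, the response time is the host's view of the I/O, covering everything between the request being issued and its completion being reported, whatever happens further down the path.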