The following screenshot shows two spikes of unusually high disk I/O latency during a streaming write test:
This screenshot is from Analytics on the 7410. The issue is not with the 7410; it's with disk drives in general. The disk latency here is also not suffered by the client applications, as this is ZFS asynchronously flushing write data to disk. Still, it was great to see how easily Analytics could identify this latency, and interesting to see what the cause was.
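As a rough illustration of what picking out spikes like these involves, here is a minimal Python sketch that flags samples sitting well above the typical latency. It is not how Analytics works internally (that is DTrace-based); the function name, the sample values and the 5x-median threshold are all assumptions made up for this example.

```python
from statistics import median


def find_latency_spikes(samples_ms, factor=5.0):
    """Return the indices of samples whose latency exceeds factor x the median.

    samples_ms: list of per-interval maximum disk I/O latencies, in milliseconds.
    factor:     hypothetical threshold multiplier; tune for your own data.
    """
    if not samples_ms:
        return []
    baseline = median(samples_ms)
    return [i for i, value in enumerate(samples_ms) if value > factor * baseline]


if __name__ == "__main__":
    # Invented data: steady ~9 ms latency with two large spikes.
    samples = [8, 9, 10, 9, 480, 10, 9, 520, 8, 9]
    print(find_latency_spikes(samples))
```

Using the median as the baseline keeps the spikes themselves from dragging the threshold upward, which a plain average would do.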
See this video for the bizarre explanation:
Don't try this yourself...
Ok, Brendan, I have to ask... how the hell did you discover this behaviour?
Posted by Boyd Adamson on January 01, 2009 at 02:08 AM PST #
Must have been the poor IO they got from the system when Brendan screamed after seeing the cricket results.
Posted by Chris Gerhard on January 01, 2009 at 03:31 AM PST #
A highly unreliable source ( http://www.publications.parliament.uk/pa/ld200001/ldhansrd/vo010124/text/10124-01.htm ) suggests that 121 dB is about as loud as the word QUIET can be shouted. Curious how the Gregg primordial scream measures up.
Posted by Clive King on January 01, 2009 at 05:54 AM PST #
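For a sense of scale, a sound pressure level L (in dB SPL) converts to pressure as p = p0 * 10^(L/20), with the standard reference p0 = 20 micropascals in air. Below is a small Python sketch of that conversion; the helper names are my own. By this formula, 121 dB works out to roughly 22 Pa.

```python
import math

# Standard reference sound pressure in air: 20 micropascals.
P_REF_PA = 20e-6


def spl_to_pascals(db_spl):
    """Convert a sound pressure level in dB SPL to RMS pressure in pascals."""
    return P_REF_PA * 10 ** (db_spl / 20.0)


def pascals_to_spl(pressure_pa):
    """Convert an RMS pressure in pascals back to dB SPL."""
    return 20.0 * math.log10(pressure_pa / P_REF_PA)


if __name__ == "__main__":
    for level in (60, 94, 121):
        print(f"{level} dB SPL is about {spl_to_pascals(level):.3f} Pa")
```

This only gives the pressure in the air. How much of it couples into a drive's head positioning depends on the enclosure and the drive, which the formula says nothing about.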
Wow!
Do you get the same effect from continuous noise and vibration, e.g. in vehicle-mounted gear, or if you put the JBODs in the same room with a jet engine at max power (aka the fans on a 52xx)?
Posted by Paul Murphy on January 01, 2009 at 08:25 AM PST #
Brendan, a short loud noise is obviously the extreme end of the spectrum. I wonder if we would see a performance improvement by running a system in a quiet data center with no fans running on the chassis. This is not sustainable, but might give designers some ideas for next generation disk platforms.
Posted by Jesse St. Laurent on January 01, 2009 at 11:48 AM PST #
Great :D
Posted by Adam Ziaja on January 01, 2009 at 12:39 PM PST #
"Alright spindles, I want to see your war face! Arrrgh!"
Posted by Andre on January 01, 2009 at 02:19 PM PST #
Boyd, as Bryan said ( http://blogs.sun.com/bmc/entry/catching_disk_latency_in_the ) I found this when investigating an intermittent latency issue caused by vibration on a disk missing a drive bracket screw (which Adam found once before). After reinserting the disk, I enabled additional Analytics to get a good trace of the effect - but it wasn't reproducing. I was trying to figure out why, and wondering how close the disk was to the vibration point - when I shouted at it to give it some extra vibration. This worked better than expected, and caused issues in all the disks I was shouting at, not just the one with the missing screw. I was using the Analytics by-disk breakdown to map where I was shouting (although in the video I shouted a bit too loud, and vibrated most disks in both JBODs). If I can get the missing screw issue to reproduce, I'll post a blog entry with the screenshots.
I do have a decibel meter, and I need to take it into the lab and test further - I would think that my shout at that range is much, much louder than anything else, including the blade server during POST. But this may be possible with a softer continuous noise. Perhaps there is a "brown note" for disks ("rust note"?) - some harmonic that destroys their performance; although based on what we think is happening (disk heads being thrown off track), we think this has more to do with shock force than resonance. I'm sure there are disk vendor engineers out there who know all about this (Bryan links to a good reference in his blog entry).
... Chris, yes - it's hard/expensive but I do get a TV cricket feed over here in the US. I'll stay away from JBODs during the Ashes.
Posted by Brendan Gregg on January 01, 2009 at 04:44 PM PST #
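The by-disk mapping described in this comment can be approximated offline if you have per-I/O latency records. Here is a minimal Python sketch, assuming hypothetical (disk, latency_ms) tuples and an arbitrary 100 ms threshold; neither the record format nor the disk names come from Analytics.

```python
from collections import defaultdict


def slow_io_by_disk(events, threshold_ms=100.0):
    """Count slow I/Os per disk.

    events:       iterable of (disk_name, latency_ms) tuples (hypothetical format).
    threshold_ms: latency at or above which an I/O counts as slow (assumed value).
    Returns a dict mapping disk name to its number of slow I/Os.
    """
    slow = defaultdict(int)
    for disk, latency_ms in events:
        if latency_ms >= threshold_ms:
            slow[disk] += 1
    return dict(slow)


if __name__ == "__main__":
    # Invented trace: two disks take a hit, one is unaffected.
    trace = [
        ("disk00", 9.0), ("disk01", 450.0), ("disk02", 8.5),
        ("disk01", 510.0), ("disk02", 120.0), ("disk00", 10.0),
    ]
    for disk, count in sorted(slow_io_by_disk(trace).items()):
        print(disk, count)
```

Disks that never cross the threshold are simply absent from the result, so the output lists only the disks that were within "shouting range".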
Would bolting the chassis in the rack fix or mitigate this? It's only a 5 minute job, and I'd guess the manufacturer says you should do so in the manual. It'll also stop the chassis sliding out and falling on the floor if you have to push a cable hard into it at the back (I've heard of this happening before ..).
Posted by Mark on January 01, 2009 at 05:00 PM PST #
It has long been suspected: can sound affect hard drives more than background interference sources do? The answer is yes. At long last someone has captured data on it.
Some of the noise in data transfer speeds has long been suspected of coming from outside sources. It was always attributed to electronic noise or vibration affecting read speed.
Thank you. Vibration transferred by sound is for sure another factor. That sound provides enough vibration is interesting. More study with sound-protected areas will be needed before people start mounting their drives on rubber and the like; that could really make the drive more open to sound interference. Rubber is good against earth-based vibrations, bad against air-based ones.
Now the next question: do drives have a harmonic frequency? The worst outcome would be a sound generator producing exactly the right frequency to stuff up drive reads, in a range outside human hearing. I do have my own suspicions about this, because I can hear outside the normal human range due to an operation stuff-up in my ears. Replacing fans with dead bearings, which I could hear but no one else could, did improve drive throughput. I replaced the fans because the noise was driving me up the wall. So yes, at first I got into trouble for a non-required change. Then I was getting thanked the next day for doing something that boosted the drive array throughput. Like being asked what settings I changed, and having to answer: none, all I did was change the fans. Kinda too late, because I had already dropped the fans through an industrial shredder so no one could refit them. They were 120 mm case fans for cooling. They would have been nowhere near 120 dB; lucky to have been 60 dB.
Mind you, this kind of hard drive interference is used by http://www.ninsight.at/tsunami/index.shtml . Sound interference really adds a new problem for them.
Posted by oiaohm on January 01, 2009 at 05:11 PM PST #
But I can still yell at my SSD, right?
Posted by Ken on January 01, 2009 at 11:00 PM PST #
Have you done any tests with RAID-5 or RAID-1 sets? :)
And yes, I forwarded your video to our storage guys, and also told them they should keep their voices down.
Greetings from Finland.
/Marko P
Posted by Marko P on January 02, 2009 at 12:56 AM PST #
Having looked at your video, the temporary change in humidity and temperature near the disks should also be taken into account. The influence of these factors could be ruled out by simply blowing some air from the mouth, without shouting, in the same way it was tested (using the palms on both sides).
Regards,
Dw.
Posted by Dwokfur on January 02, 2009 at 04:07 AM PST #
Hey, Brendan,
I especially like your use of foreshadowing at the beginning of the video: "Sorry for screaming..."
Chad
Posted by Chad Mynhier on January 02, 2009 at 05:07 AM PST #
This is a wonderful discovery!
One more example of the power of the human voice.
heh heh heh.
Happy New Year to you guys.
Posted by Deirdre B. on January 02, 2009 at 07:16 AM PST #
The sound translates into mechanical vibration. So then, why wouldn't the noise in the chassis create additional vibrations, and the combined effects of the drive vibrations and the noise it creates, also affect the performance? Well, of course it would. Looks like a call for a redesigned chassis.
Posted by Roland Ansgar on January 02, 2009 at 07:56 AM PST #
How about bad breath? Does brushing your teeth before shouting help? :)
Posted by mauronr on January 02, 2009 at 08:16 AM PST #
"This ... is ... SPARCa!!"
Posted by 300 disks on January 02, 2009 at 08:58 AM PST #
hm, disk drives as decibel meter.
makes sense from a mechanical point of view; the drive heads should behave like extra accurate sensors.
do you have a 3-axis accelerometer strapped into that rack anywhere? i'd guess you could get good feedback from that too.
Posted by Edward Vielmetti on January 02, 2009 at 09:23 AM PST #
My bet is this is a hoax to get people to find out about your project/tool.
Either way, it worked. I'll keep an eye out on this blog/tool going forward.
Posted by Adam Prato on January 02, 2009 at 09:25 AM PST #
Nice analytics software - much nicer than the crap that VML created for us a few years back. Those guys are idiots - that's what you get when you hire a computer company from Kansas City.
Posted by ChrissyT on January 02, 2009 at 11:07 AM PST #
I think this is a normal effect. Drives need highly accurate tracking operation; any vibration disturbs this, creating a need for supplementary tracking decisions. They compensate for most problems, like putting many HDDs next to each other, where disk vibrations create crossover mechanical noise, but a higher-frequency vibration such as voice (800-1600 Hz), I think, may delay track confirmation, thus affecting general drive latency. It will be interesting to find out if the drive loses performance constantly under a constant voice/signal disturbance, or if it is only a momentary effect until it compensates, even if the disturbance remains.
Posted by Andrei on January 02, 2009 at 11:25 AM PST #
Jesse said: "I wonder if we would see a performance improvement by running a system in a quiet data center with no fans running on the chassis."
It's absolutely possible: put your machines outside and let the open air do your cooling for you. If we didn't insist on encasing our CPUs and hard drives, and then putting those cases in a big building, we wouldn't need anywhere near as much cooling AND it would be a lot quieter. And yeah, I know, precipitation and flying animals are issues, but I bet we can find better solutions to those issues.
I seem to recall Intel talking about this, but I can't find the reference. Supposedly Microsoft decided to investigate it too, and set up a rack of machines outside of one of their datacenters, covered only loosely with a tarp. Despite the dustier outside environment, the failure rate of the rack outside was in line with the failure rate inside the carefully environment-controlled building. And zero cooling cost.
Posted by Jorg Brown on January 02, 2009 at 12:43 PM PST #
ChrissyT said: "Those guys are idiots - that's what you get when you hire a computer company from Kansas City."
No, VML is a joke around KC. There's plenty of talent around here, none of it at VML.
Posted by matt on January 02, 2009 at 02:10 PM PST #
This reminds me of the MythBusters "Pottery Record" segment. In the myth, which was previously used as a plot device in "The X Files: Hollywood A.D." (2000) and "CSI: Crime Scene Investigation: Committed" (2005), people could extract auditory information off pottery.
Posted by Paul Tiffany on January 02, 2009 at 03:15 PM PST #
You can afford to shout at a million dollar computer setup.
Try this on an OpenSolaris desktop with 512MB memory.
Shutdown -- remove the 256MB memory card (the other half) --
Restart the computer ..
Search for a Sun engineer or programmer, don't shout just whisper --
Posted by oruwatun on January 02, 2009 at 03:33 PM PST #
Lol ... that was killer
Posted by dom on January 02, 2009 at 05:20 PM PST #
This patent refers to noise-induced drive failure:
http://www.freepatentsonline.com/5587855.html
"As hard disk drives become smaller, and are used in more diverse environments, their ability to withstand physical shocks, vibrations and noise decreases, and consequently, the need to protect the disk drives against physical shocks, vibrations and noise becomes more apparent. Typically, noise arises from the rotation of the spindle motor and the disks, and also from the actuator that supports the read/write heads when a positioning operation is performed. While these are sources of acoustic noise in the typical disk drive, amplification can be caused by the mechanical base and cover. The acoustic noise could ultimately cause errors to arise in head-to-track positioning accuracy, thus limiting data recording densities. Similarly, physical shocks and vibrations from an external source could also cause errors to arise in the head-to-track positioning accuracy, and directly damage the disks which could cause loss of data, and damage to the heads which could end the life of the disk drive life, resulting in a total loss of data. Consequently, resistance to physical shocks, vibrations and noise is critical to protecting the disk, the head and the various bearings in a disk drive from damage. Conventional disk drives designed to reduce physical shocks, vibrations and noise are however, detailed, elaborate, expensive and often ineffective. "
Posted by Edward Vielmetti on January 02, 2009 at 06:14 PM PST #
In the interests of science, I was able to reproduce Brendan's results on my PC at home. I started a 15GB file copy from a drive in an external enclosure to a drive mounted in a trayless hot-swap sled in the tower case of the PC.
Under normal conditions, the transfer rate for this copy is about 55 MB/s. Screaming at the destination drive causes throughput to drop noticeably, as indicated by the green arrows. Having now independently confirmed your findings, Brendan, is there still time to submit a paper to FAST '09? :))))
http://luxography.ca/Images/DWF/yelling_at_drives.png
Posted by Brian Tao on January 02, 2009 at 09:08 PM PST #
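A home reproduction like this can be quantified by sampling the cumulative bytes copied at intervals and converting to a rate. Here is a minimal sketch in Python; the sample numbers are invented, with the 55 MB/s nominal rate borrowed from the comment above, and the 50% dip threshold is an arbitrary assumption.

```python
def throughput_mb_s(samples):
    """Convert (seconds, cumulative_bytes) samples into MB/s per interval.

    samples must be in increasing time order; MB here means 10**6 bytes.
    """
    rates = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        rates.append((b1 - b0) / (t1 - t0) / 1e6)
    return rates


def find_dips(rates, nominal_mb_s, fraction=0.5):
    """Return indices of intervals running below fraction x the nominal rate."""
    return [i for i, rate in enumerate(rates) if rate < fraction * nominal_mb_s]


if __name__ == "__main__":
    # Invented samples: a steady 55 MB/s copy with one bad second.
    samples = [
        (0, 0),
        (1, 55_000_000),
        (2, 110_000_000),
        (3, 120_000_000),  # the "scream" interval: only 10 MB copied
        (4, 175_000_000),
    ]
    rates = throughput_mb_s(samples)
    print(rates)
    print(find_dips(rates, nominal_mb_s=55.0))
```

Short sampling intervals matter here: a one-second stall is obvious at one-second resolution but nearly vanishes if the rate is averaged over a minute.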
The comment about temperature and humidity from exhaling doesn't fit the response time of the change in the latency. It's all about the noize :-)
It makes sense, with the geometries involved, that HDU's would be pretty sensitive.
I read an article in Wired about a digital cinema camera and one of the things that they had to overcome was the microphonics of some of the parts. Evidently, explosions and vibrations caused problems with crystal oscillators and other critical timing passive components.
It's an analog world after all...
Posted by Goat on January 02, 2009 at 11:33 PM PST #
Whoa, it's a great test, but have you tested any other disk storage?
Posted by Lukas Hakos on January 03, 2009 at 01:46 AM PST #
Jorg Brown said: ..."I seem to recall Intel talking about this, but I can't find the reference. Supposedly Microsoft decided to investigate it too, and set up a rack of machines outside of one of their datacenters, covered only loosely with a tarp. Despite the dustier outside environment, the failure rate of the rack outside was in line with the failure rate inside the carefully environment-controlled building. And zero cooling cost."
Intel: http://weblog.infoworld.com/sustainableit/archives/2008/09/intel_air_side.html
Microsoft's "Tent City" server farm: http://blogs.msdn.com/the_power_of_software/archive/2008/09/19/intense-computing-or-in-tents-computing.aspx
Posted by z1d on January 03, 2009 at 04:43 AM PST #
Don't the drives have some acoustic management with feedback? (I don't know how AAM works, but on one drive I have something that looks like a mic.)
Posted by just_me on January 03, 2009 at 10:44 AM PST #
Put them in a vacuum: no air, no voice/noise vibrations.
Obviously direct vibrations from the casing, fans or the other disks remain.
Anyway, interesting test.
Brendan, I don't envy you; what a hell of a noise in there??!!?? :D
Posted by elwood_mb on January 04, 2009 at 02:02 AM PST #
Elwood,
unfortunately no cooling either, so it would fry pretty fast because of not being able to radiate heat faster than it is generated.
Posted by Robert Varga on January 04, 2009 at 02:26 AM PST #
Title: SUBHARMONIC TRAVELING WAVES IN A GEOMETRICALLY NONLINEAR CIRCULAR PLATE
Author(s): NAYFEH TA, VAKAKIS AF
Source: INTERNATIONAL JOURNAL OF NON-LINEAR MECHANICS Volume: 29 Issue: 2 Pages: 233-245 Published: MAR 1994
So as an undergrad I worked with the authors of this paper to work out what kind of harmonics develop in spinning disks when you excite them at frequencies other than the natural frequency of the disk. They had solved the subharmonic case, so I solved the superharmonic case and found that you could generate stable waves in the disk that had amplitudes up to 10 times the thickness of the disk. So shouting at the disk could cause these types of harmonic vibrations with amplitudes high enough to cause read errors. It was kinda cool seeing your video and then thinking "hey I've seen something like that before".
Posted by Thurston on January 04, 2009 at 10:42 AM PST #
You should do this to Jonathan ....
Posted by ???? on January 05, 2009 at 05:38 AM PST #
So... has anyone possibly thought that this could NOT be a vibration problem?
Let me start off by saying that the demonstration was awesome. I used to work @ Sun and one of the best parts about the place was people getting crazy and just testing stuff like that (or installing a bbq igniter on an E6500 system board).
One thing you may want to test... try your test on "cold" disks. I am going to assume that the system had been up and running for a while. Not sure what JBOD you are using but if it is like most Sun gear, heat/cooling are major factors in design. I am wondering if when you are screaming at the disks if it is not just a factor of you disrupting the airflow enough around those specific disks that you see that spike. The initial scream would momentarily disrupt the airflow and eventually SHOULD "normalize" even if you were to continue to scream.
I could be totally off my nut with that theory... but you are the guys screaming at your disks. =)
Posted by drew on January 05, 2009 at 06:59 AM PST #
This is not unusual, as the write performance of a disk drive, depending on type and manufacturer, is affected when a particular disturbing frequency matches the natural frequency of the drive (actuators). It will cause the performance to fall to its knees. I saw this on a three-dimensional array that I designed (5 drives on a blade), where an adjacent drive under random reads (the heads going back and forth at a certain frequency) crossed over the resonant frequency of the drive next to it doing writes (more sensitive). By pressing on the top cover of the drives (in the sheet metal enclosure) I was able to change the natural frequency of the drive to cure that phenomenon. The HD disk cover itself can be a microphone (or speaker), translating vibrations to the actuator. Maxtor drives had a notch at 500 Hz and 1000 Hz where you just could not excite them without losing all performance. All drives have something similar. You just found the right frequency. Try yelling in a tenor voice rather than a baritone. :o)
Posted by Mike Konshak on January 05, 2009 at 09:01 AM PST #
Brian, great work! If someone can find a resonant "rust note" for disks, that may be worthy of a publication. :) The starting point may be the paper that Thurston mentioned, "SUBHARMONIC TRAVELING WAVES IN A GEOMETRICALLY NONLINEAR CIRCULAR PLATE" - generating stable waves in disks sounds interesting indeed.
Lukas, yes - I shouted at a non-Sun disk array (with FC disks) and got the same effect.
Roland, the chassis has already been carefully designed to keep vibration away from the disks, which it does really well - and I'm able to check this using the DTrace/Analytics software I demo'd.
Posted by Brendan Gregg on January 05, 2009 at 11:42 AM PST #
Did you or anyone think that the vibration actually came from you not so gently putting your hands on the disks to make your cone? I honestly don't think your voice is doing that - more like your hands contacting the disks and thus making the minor vibration.
The vibration resistance of a normally operating hard drive is 0.5 g. The radial acceleration of a disk at rest (in a rack) isn't going to increase by 0.5 g, or by enough to make that disk suspend itself to protect against shock and decrease its performance.
Posted by Todd on January 05, 2009 at 06:01 PM PST #
Come on guys.... all physics aside, remember that MACHINES are people too!
Posted by OPIFEX on January 06, 2009 at 09:23 AM PST #
There is a vibration sensor on the disk that can correct the read/write head position according to environmental vibration. Maybe the tester's voice generated a sound that resonated with the sensor, which in turn displaced the write head so that it could not stay aimed at the right track. After the sound, the write head re-aimed at the right track and the I/O moved on normally.
My msn: myprotein0007@hotmail.com
Posted by MelonHead on January 06, 2009 at 08:30 PM PST #
Hi, what happens here is that you disturb the vibration/G sensors. In cheap disks, these do not have a dedicated processor, so the vibration sensors' IRQs steal I/O time from the disk.
Nothing new here.
Posted by The Kmattski man on January 07, 2009 at 04:00 AM PST #
Huh. Makes me wonder if a rack-mounted black noise generator might improve disk performance.
Posted by Nick on January 07, 2009 at 10:56 AM PST #
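The vibration sensor design must have compensated for machine noise from the start, but it never considered that someone would shout at it, haha.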
Posted by 冬瓜头 on January 07, 2009 at 06:10 PM PST #
It should be noted that this is in fact a well-understood phenomenon, at least by most drive array manufacturers. The human voice can indeed produce frequencies and amplitude that will interact with the coils in a disk drive. (As an aside, you perhaps have seen the YouTube video demonstrating the use of a disk drive as a speaker...and the coils that operate heads are often called "voice coils" from their heritage).
Also, not all drives will demonstrate this effect, or the same significance of the effect - different classes of drives have various mitigation techniques to minimize the impact of vibrations. Desktop/consumer ATA drives are perhaps the most susceptible to this effect, exhibiting the 500ms impact on I/O response times. Enterprise-class fibre channel drives are designed to have no more than a 50ms impact on I/Os due to vibrational interference.
Perhaps most important is the design of the drive cage apparatus. As mentioned, a single missing screw can have a significant detrimental impact on vibration; an "enterprise-class" drive cage should be designed specifically to limit or restrict vibrational interference between adjacent drives. In addition, isolation of drives from interference caused by service actions (e.g. replacing a failed drive or power supply) is also critical - at least when performance and availability are important to the application.
Posted by the storage anarchist on January 09, 2009 at 04:13 AM PST #
More than a screenshot, you posted a screamshot.
Posted by enlavin on January 12, 2009 at 03:24 AM PST #
Hello, I know big, big geeks surf this blog, and I need help with a disk performance issue. I am working as a field support engineer.
The issue is:
My client has a Sun Netra 1280 (8 sunsparc III and 16 GB memory) with a Sun StorageTek 3320 SCSI Array with 12 Seagate 148 GB HDDs. The Netra 1280 connects to the RAID box with 2 Sun StorEdge™ PCI/PCI-X Dual Ultra320 SCSI Host Adapters, each having one connection to the RAID box. On the Sun server we installed Oracle 10.1 g. The applications installed on it use Oracle very frequently.
Now the problem we are facing is unnecessary latency. To check performance from the storage point of view, I logged on to the RAID box and checked from the firmware screen menu that the speed does not go beyond 50 MB/s during the Oracle backup, which is the highest disk usage activity. As Ultra320 and the RAID 3320 promise a 320 MB/s transfer speed, is this normal and expected behavior?
Please let me know if anyone has any comments on this. And if you need any more info, please feel free to ask.
Posted by Santosh Choudhury on February 08, 2009 at 10:57 PM PST #
This isn't a streaming write test... It's a SCREAMING write test. :-D
Posted by Admiral Memo on February 15, 2009 at 10:34 PM PST #
I am not surprised at all. If you look at the 3rd chart you'll see a sawtooth performance pattern... indicating a response to cyclical/random noise in the room or minute aberrations in the disk build. Now superimpose a shock wave in the form of a scream and you'll trigger an immediate performance degradation in the shape of an impulse. Whatever disk Brendan is using, it obviously has a damper to absorb shocks; otherwise there would be quite a bit of residual vibration from the scream.
Posted by Koo Koo on March 04, 2009 at 07:05 PM PST #
Have you made this test with the JBOD flipped 90 degrees? I'm curious to see if this changes the effect of the vibrations on the latency when flipped vertically instead of horizontally.
Posted by Dhiok on April 20, 2009 at 11:31 AM PDT #