DC Temperature measurement done right
As part of Sun's Managed Operations group, I find that keeping an eye on customers' computing environments during the current heat waves is extremely important. I cannot rely on generic temperature readings from a datacenter provider, as they only probe on request or never get deep into a customer rack. So for the last 5 years I've been using a simple kit that incorporates DS1820 sensors spread out across my customer racks, all communicating via a serial interface and a simple perl proggie. The sensors can be run at least 200 feet from the monitoring server over low-voltage wire - which means I can (and do) use a standard network patching system to locate my probes anywhere I can run a network cable.
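For anyone curious what a reading from one of these sensors looks like, here is a minimal sketch in Python (not the perl proggie mentioned above) of decoding a raw DS1820 temperature value. It assumes you already have the first two scratchpad bytes from the sensor in hand; how they arrive over the serial interface depends on the kit and is not shown. The function names are made up for illustration. The DS1820 reports temperature as a 16-bit two's complement number in steps of 0.5 degrees C.

```python
def ds1820_raw_to_celsius(lsb: int, msb: int) -> float:
    """Convert the two temperature bytes of a DS1820 scratchpad to Celsius.

    lsb and msb are the low and high bytes of the temperature register
    (hypothetical inputs here; reading them off the wire is not shown).
    """
    raw = (msb << 8) | lsb
    # The register is 16-bit two's complement, so fold high values negative.
    if raw & 0x8000:
        raw -= 0x10000
    # Each count is half a degree Celsius.
    return raw * 0.5


def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert Celsius to Fahrenheit for reporting."""
    return celsius * 9.0 / 5.0 + 32.0


if __name__ == "__main__":
    temp_c = ds1820_raw_to_celsius(0x32, 0x00)
    print(temp_c, celsius_to_fahrenheit(temp_c))
```

The sample bytes 0x32, 0x00 correspond to 25.0 C (77.0 F); 0xCE, 0xFF would be -25.0 C.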
Using this method, rather than pulling the internal CPU or system temperature readings to determine a given environment's cooling performance, gives me some major advantages:
- System-agnostic monitoring: no need for a custom SNMP poller, MIBs or scripts based on a hardware or OS type (we manage more mixed, heterogeneous environments of Windows, Linux, Solaris, storage and network gear than single-technology customers)
- Ability to measure air temperature BEFORE it passes through a given computer's cooling system and, if needed, calculate a delta to highlight cooling inefficiencies.
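The delta idea in the last point can be sketched roughly as follows. This is an illustration only: it assumes one probe in front of each rack (inlet) and one behind it (exhaust), and the rack names, the readings and the 30F threshold are all made up for the example, not values from our monitoring.

```python
from dataclasses import dataclass


@dataclass
class RackReading:
    """One pair of probe readings for a rack, in degrees Fahrenheit."""
    rack: str
    inlet_f: float    # air temperature before it enters the hardware
    exhaust_f: float  # air temperature after it leaves the hardware

    @property
    def delta_f(self) -> float:
        return self.exhaust_f - self.inlet_f


def flag_large_deltas(readings, max_delta_f=30.0):
    """Return (rack, delta) pairs above the threshold, largest delta first.

    max_delta_f is a hypothetical threshold; a real one would come from
    the hardware specs and the environment SLA.
    """
    flagged = [(r.rack, r.delta_f) for r in readings if r.delta_f > max_delta_f]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    sample = [
        RackReading("A1", 68.0, 88.0),
        RackReading("A2", 72.0, 112.0),
        RackReading("B1", 70.0, 101.0),
    ]
    for rack, delta in flag_large_deltas(sample):
        print(f"{rack}: {delta:.1f}F across the rack")
```

Sorting by delta puts the worst offender at the top of the report, which is usually the rack you want to walk over to first.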
We've used data collected from these probes to match fluctuations in temperature to hardware failures and developed a predictive/proactive monitoring solution for our customers (as well as gathering some good data to hold a given DC provider to their contracted environment SLAs). We've even been able to pinpoint hot spots; when we targeted them with Sun Galaxy hardware (X4100s and X4200s), we managed to reduce the average air temperature by around 10°F while doubling processing power.
All great stuff for work, and now I'm going to set one up at home and see if I can pump the results to my blog in real time.
Posted by 42 [General] ( July 10, 2007 05:02 PM )
