Sunday, October 31, 2004 | Paul Rogers' Weblog Notes during my pilgrimage
This weekend's cinematic outing was "Being Julia," a pretty period piece about London theater in the 1930s based on a novella by W. Somerset Maugham. I love Annette Bening and Jeremy Irons. I also liked the background music, and since the characters often go to nightclubs we get Cole Porter tunes sung live. The costumes and scenery are fantastic. However, several elements of the story strained credulity and ultimately made the movie less than satisfying. The final scene of Julia's revenge is quite good, but the movie dragged in the middle.
Why do bad things happen to good people
(2004-10-31 06:01:17.0) Permalink
This is just so cool. And here, for all these years, I thought I was just a geek. Of course, being a geek made high school hell. OK, I admit it, I was not on the football team. But I was the first kid in my high school to get a teletype login to Dartmouth's computer in the 1960s. I was the first kid to waste thousands of dollars playing Canadian road race on that teletype. I learned Basic and Fortran on that computer. I worked my way through university in the computer center operating and managing a Univac 1108. I wrote my programs for that thing on punched cards. (The link brings back so many memories. Interesting insight into the economics of mainframes. In 1968 you could pick up a 1.3 MHz CPU with half a megabyte of magnetic core memory and a 100 megabyte hard drive for a mere US$1.6 million. We had several FastRand II rotating drum memory units, each of which weighed about 2 tons, had 90 MB usable and rotated at 880 RPM with an average access time of 92 milliseconds.) It's not actually me in this picture from 1973, but I did hang many of those old half inch reels of tape, and several spewed all over the computer room.
But before you feel sorry for me, check it out! Now in the new millennium I have been upgraded from just a dweeby geek to a technosexual!!!
(2004-10-27 17:58:56.0) Permalink
The New York Times has a great article asking the question "What Makes an Equation Beautiful" (NB: ID and password required). The NY Times picked up on an article from Physics World in which Robert P. Crease surveyed the readers of Physics World, asking what are The Greatest Equations Ever (NB: no ID required for this article), and the winner is below:
My reaction to this display of over-the-top high school athletics is to think of the horrific practice condemned explicitly in the Old Testament called "passing your children through the fire to Molech," which is an idiom for sacrificing your children by burning them. However, the barbarity of child sacrifice, or even Roman gladiatorial combat, seems to be alive and well in high school athletics, where winning is everything and no one counts the cost in physical, emotional, spiritual or psychological terms. Where is the line between excellence in athletics and abuse of high school students? This movie shows it is not hard to cross over that line.
(2004-10-23 18:58:16.0) Permalink
When I started this blog I said that I did performance and capacity planning for Sun's customers. I want to offer up a technical study or two to help others with performance issues. I entitled this Capacity vs Performance in order to highlight the difference. Often a capacity limitation manifests itself as a performance issue. To differentiate between the two, performance might be defined as 'how fast it is going' while capacity is 'the maximum performance of the system or an individual component.' Imagine capacity as the dump truck carrying a load and performance as a sports car racing. Even a sports car has to slow down for corners. Not to be too simple, but we need to look at each component of the system's performance: CPU, memory, network, disk and tape.
One specific example was a customer who has a directory on the internet. Their customers submit searches from multiple sites, and the Service Level Agreement (SLA) allowed no more than 5% of requests to have response times of over 3 seconds. At the time, 15% of requests took more than 3 seconds, which put our customer in a penalty situation. The system is a 6800 with 12x900MHz CPUs.
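The SLA described above boils down to a simple ratio, and a minimal sketch of that check might look like the following. The function names and the sample data are hypothetical, made up only to mirror the 5% limit, the 3 second threshold and the 15% figure quoted in the post; nothing here comes from the customer's actual tooling.

```python
def fraction_over(response_times_s, threshold_s=3.0):
    """Return the fraction of requests slower than threshold_s."""
    if not response_times_s:
        raise ValueError("need at least one response time")
    slow = sum(1 for t in response_times_s if t > threshold_s)
    return slow / len(response_times_s)


def meets_sla(response_times_s, threshold_s=3.0, max_slow_fraction=0.05):
    """True when no more than max_slow_fraction of requests exceed threshold_s."""
    return fraction_over(response_times_s, threshold_s) <= max_slow_fraction


# Hypothetical sample: 100 requests, 15 of them slower than 3 seconds,
# which mirrors the 15% figure quoted in the post.
sample = [0.8] * 85 + [4.2] * 15
print(f"{fraction_over(sample):.0%} over 3 s, SLA met: {meets_sla(sample)}")
```

Note that a check like this says nothing about why the slow requests are slow; it only tells you whether you are in the penalty zone.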
Unfortunately someone attempted to fix the problem by 'throwing more iron' at it and adding CPUs and memory without knowing why there was a problem. Let's look at a few numbers. From vmstat:
procs      memory               page               disk         faults       cpu
r  b w    swap    free  re  mf pi  po  fr de sr m0 m1 m1 m1   in   sy   cs us sy id
0  2 0 8948920 5015176 374 642 10  12  13  0  2  1  2  1  2  132 2694 1315 14  3 83
0 19 0 4089432  188224 466 474 50 276 278  0 55  5  5  4  3 7033 6191 2198 19  4 77
0 19 0 4089232  188304 430 529 91 211 211  0 34  8  6  5  4 6956 9611 2377 16  5 79
0 18 0 4085680  188168 556 758 96 218 217  0 40 12  4  6  4 6979 7659 2354 18  6 77
0 18 0 4077656  188128 520 501 75 217 216  0 46  9  3  5  2 7044 8044 2188 17  5 78
There is something odd about these numbers. On vmstat, we look at the right 3 columns, us=user, sy=system and id=idle, so there is over 50% idle CPU available to throw at the problem. One way to detect a memory problem is to look at the sr, Scan Rate, column of vmstat (near the middle of the display). If the page scanner ever starts running, or sr gets over 0, then we need to dig deeper into the memory system. The very odd part of this display is that the blocked queue (b) on the left of the display has 18 or 19 processes in it but there are no processes in the run queue (r). That means we are blocking somewhere in Solaris without using all the CPUs available to us. So now we need to turn to the I/O subsystem. With Solaris 8, the iostat command has a new switch, -C, which will aggregate I/Os at the controller level. My favorite iostat command is iostat -xnMCz -T d (interval in seconds) (count of iostat outputs):
extended device statistics
r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device
396.4 10.7 6.6 0.1 0.0 20.3 0.0 49.9 0 199 c1
400.2 8.8 6.7 0.0 0.0 20.2 0.0 49.4 0 199 c3
199.3 6.0 3.3 0.0 0.0 10.1 0.0 49.4 0 99 c1t0d0
197.1 4.7 3.3 0.0 0.0 10.2 0.0 50.4 0 100 c1t1d0
198.2 3.7 3.4 0.0 0.0 9.4 0.0 46.3 0 99 c3t0d0
202.0 5.1 3.3 0.0 0.0 10.8 0.0 52.4 0 100 c3t1d0
Whoa! On controller 1 we are doing 396 reads per second and on controller 3 we are doing 400 reads per second. On the right side of the data we see that iostat thinks each controller is almost 200% busy (iostat error...never checked to see if there has been a bug filed). The individual disks are doing almost 200 reads per second each, and iostat figures that's 100% busy on the disks. That leads us to a rule of thumb, or heuristic, that individual disks perform at approximately 150 I/Os per second. This does not apply to LUNs or LDEVs from the big disk arrays. So our examination of the numbers lets us suggest adding 2 disks to each controller and re-laying out the data. Unfortunately, due to the disk array configurations, we could only add 1 disk to each controller. That did improve the situation as seen by the next iostat:
extended device statistics
r/s w/s Mr/s Mw/s wait actv wsvc_t asvc_t %w %b device
410.6 5.4 4.8 0.0 0.0 5.7 0.0 13.7 0 218 c1
386.0 9.0 4.6 0.0 0.0 5.3 0.0 13.4 0 211 c3
129.4 2.2 1.5 0.0 0.0 1.9 0.0 14.7 0 73 c1t0d0
139.4 1.8 1.6 0.0 0.0 2.3 0.0 16.0 0 79 c1t1d0
141.8 1.4 1.7 0.0 0.0 1.5 0.0 10.4 0 66 c1t2d0
133.0 1.0 1.6 0.0 0.0 2.1 0.0 15.6 0 76 c3t0d0
125.4 2.2 1.5 0.0 0.0 1.9 0.0 14.6 0 72 c3t1d0
127.6 5.8 1.5 0.0 0.0 1.4 0.0 10.2 0 63 c3t2d0
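As a rough sketch of the disk arithmetic, the per-disk rows from the first iostat sample can be summed per controller and compared with the 150 I/Os per second rule of thumb. The function names and the choice to hard-code the rows are my own assumptions for illustration; the numbers are copied from the first table. In these samples the controller rows happen to equal the sums of their disk rows, %b included, which may be where the roughly 200% busy figure comes from, though that is only an observation about this data and not a statement about how iostat works internally.

```python
import math
from collections import defaultdict

# Per-disk rows copied from the first iostat sample: (device, r/s, w/s, %b).
BEFORE = [
    ("c1t0d0", 199.3, 6.0, 99),
    ("c1t1d0", 197.1, 4.7, 100),
    ("c3t0d0", 198.2, 3.7, 99),
    ("c3t1d0", 202.0, 5.1, 100),
]

IOPS_PER_DISK = 150  # the rule-of-thumb ceiling from the post


def controller_of(device):
    """'c1t0d0' -> 'c1' (everything before the target number)."""
    return device.split("t")[0]


def summarize(rows):
    """Sum I/Os per second and %b over the disks behind each controller."""
    totals = defaultdict(lambda: {"iops": 0.0, "busy": 0, "disks": 0})
    for device, reads, writes, busy in rows:
        t = totals[controller_of(device)]
        t["iops"] += reads + writes
        t["busy"] += busy
        t["disks"] += 1
    return dict(totals)


def disks_needed(iops, per_disk=IOPS_PER_DISK):
    """Smallest disk count keeping each disk at or under per_disk I/Os per second."""
    return math.ceil(iops / per_disk)


for ctrl, t in sorted(summarize(BEFORE).items()):
    print(ctrl, round(t["iops"], 1), t["busy"], disks_needed(t["iops"]))
```

Under the 150 figure, roughly 407 I/Os per second on a controller calls for at least three disks, which fits with the third disk bringing each drive down to about 130 to 145 I/Os per second, still close to the ceiling. A fourth disk per controller would have left more headroom.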
We are still close to the top end of the performance of an individual disk, but we dropped from 15% of transactions out of the SLA down to 6 or 7% of transactions out of the SLA. And the CPUs look good:
procs      memory                page               disk         faults        cpu
r  b w    swap    free   re   mf  pi po fr de sr m0 m1 m1 m1   in    sy   cs us sy id
0  2 0 9283064 5482928  787 1293  36  0  0  0  0  0 23  0 13 5145 14763 1394 27  6 67
0  1 0 6547512 2483056  869  984 110  0  0  0  0  0 14  0  8 5377  8114 1372 23  6 71
0  1 0 6525816 2461496 1190 1230   0  0  0  0  0  0  0  0  0 6414 17808 1402 33  9 58
0  1 0 6516240 2451976 1316  481   0  0  0  0  0  0  0  0  0 5432  8226 1509 30  7 63
0  1 0 6506616 2442768  684  660   0  0  0  0  0  0  0  0  0 5188 16922 1259 26  7 67
Now we still have plenty of idle CPU time and only 1 or 2 processes in the blocked queue. It would have been nice to be able to add 2 disks to each controller, but even 1 disk on each alleviated this problem. After this, the customer studied some of the internal design of their directory search algorithms. As the proverb says, fixing one performance or capacity limitation only exposes the next issue.
The point of this exercise is looking at all the numbers and attempting to locate the precise nature of the problem. Do not assume adding CPUs and memory will fix all performance problems. In this case the search programs were exceeding the capacity of the disk drives, which manifested itself as a performance problem of transactions with extreme response times. All those CPUs were waiting on the disk drives. One other thing to note in this example is that there were no 'magic' /etc/system parameters to tweak. There are fewer and fewer knobs (or parameters) in Solaris to adjust to improve performance.
(2004-10-18 20:39:25.0) Permalink Comments [2]
What an interesting day October 16th is:
Team, I tend to participate in aliases and share problems I have run into. Tonight I have to confess that I croaked my Windows 2000 partition when I attempted to dual boot my laptop last Saturday. First let me reassure you that I know this can be done, and I completed the process Thursday night. However, I will confess my failure in order to save you heartache and grief. I am excited about trying out the new features of Solaris 10 x86 like DTrace and Containers. Download your copy of Solaris 10 x86 here.
I first cleaned up my hard drive by deleting outdated files and taking out the trash. Then I defragmented my drive using the Windows 2000 Disk Defragmenter. This is important for resizing the Windows partition. Then I did follow RULE #1 and made a backup of my important files. I used the backup facility of the Nero tool (burning CDs, not Rome, get it?). I made 2 mistakes here. Mistake #1 was that I did not follow RULE #2...Verify your backup, and so several files would not restore later because of media problems. Mistake #2 was that I backed up my data starting at the level of 'My Documents,' not at the level of my User ID (one level up), which would have included my Application Data folder, that is, my bookmarks file and my Outlook PST file. Now I have 'recent' backups of those files, but I lost 2 weeks of data when I thought I had fresh backups of these files.
My problem was not understanding some features of the tools I attempted to use. I picked the tool Partition Commander to resize the 40GB hard drive into a 25GB partition for Windows 2000 and a 14+GB partition for Solaris 10 x86. Unfortunately for me, Partition Commander installed a utility, checkmbr (Check Master Boot Record), which automatically attempts to reinstall a base Master Boot Record. When you install another OS like Linux or Solaris x86, the new OS must update the master boot record and offer you the choice of which OS partition to boot. The repartition worked and the Solaris x86 install worked fine.
I rebooted Solaris x86 several times and it was fine. The problem occurred when I rebooted the Windows 2000 partition and the automatic utility checkmbr found the Solaris boot partition chooser in the master boot record. It attempted to restore the record to its original state, and then neither partition would boot.
I believe you can and should do this. There are issues in doing this that are challenging, but documents like this one can help you. I happen to have a Toshiba Tecra 9100 laptop, which needs some BIOS setting changes:
Disable USB Legacy FDD support
Disable USB Legacy support for keyboard and mouse (if a separate setting)
Disable Parallel port
On Thursday night, I got out my Knoppix CD, which everyone should have in their CD case as a rescue CD. It has a utility, qtparted, which I used to partition my hard drive. Other versions of Linux also have this utility. I then rebooted Windows 2000 and let checkdisk run to get used to the new partition size. Then I took my stack of 4 Solaris 10 x86 CDs and ran the install. Success! I am ironing out a few display issues but looking forward to writing my first DTrace program tomorrow.
(2004-10-13 20:34:44.0) Permalink
"You have made us for yourself, and our hearts are restless until they find their rest in you, O Lord" This famous quote is from the first paragraph of the first book of The Confessions of St. Augustine. I decided to reach back and review a classic for my contemplation for the next few weeks. My suggestion is not to read about great books but read the great books yourself.
This weekend's cinematic outing was Primer. Cherryl Dawson and Leigh Ann Palone, The Movie Chicks, summed it up as, "The movie is hard to describe and even harder to fully grasp. It's not for everybody, but could be a cult hit amongst the pocket-protector crowd." I know they are talking about me and, if you are reading this, perhaps they are talking about you too. Four guys extend their extreme engineering work weeks in a garage and develop something amazing. The protagonists are engineers, yet dramatic and compelling. Shot on an amazingly low budget (note the picture of the device itself), this movie shows in lots of ways that it is not a multimillion dollar blockbuster, but it is creative in the way the movies Pi or Memento were.
(2004-10-11 09:47:39.0) Permalink
In the creative music video category, I nominate Sarah for this effort. She had the video on its own website, but I guess the traffic was too heavy after bloggers started watching it. If you have iTunes installed, click this link. One warning though...do not watch this at work. I hate the sight of grown men sniffling without a Kleenex. When I watch it, I am verklempt.
(2004-10-09 08:21:31.0) Permalink
My mother-in-law gave me a sweatshirt with the complaint "So many books, So little time," and how true it is. Naturally I read my share of technical manuals, but for enjoyment I have been reading historical fiction lately. My favorites have been from the Roman period. First, Lindsey Davis has a great series of novels starring Marcus Didius Falco, the emperor Vespasian's chief fixer/detective/spy. Then Colleen McCullough, author of The Thorn Birds, did a series of novels on Julius Caesar. On my doctor's recommendation I am switching historical periods and reading some of Bernard Cornwell's novels on the Napoleonic wars. I picked up Sharpe's Eagle yesterday, so I am going to go read now.
(2004-10-09 08:09:22.0) Permalink
Everybody is positively mental over Grid computing (aka distributed computing) today. You may be curious about the Grid phenomenon but worried that you have not yet installed Solaris x86 at home. Not to worry. I know you are going to make the switch Real Soon Now, but even before you do, you can participate in the grid with your home computer even if it runs an operating system from another company. :)
Grids have been around a long time. In the 90's I participated in SETI@Home aka Search for Extraterrestrial Intelligence aka Search for Little Green Men. It's an impressive project that has 5.2 million users who have donated 2 million YEARS of computer time to the project. Today they are working at 66.5 teraflops (trillion floating point operations per second). That's some serious computational rock and roll. For my money, though, Little Green Men is just so last millennium. I mean, if Agent Mulder has given up the search, why am I still working on it? :)
Another of my favorite grid projects is Folding@Home. This Grid project is modeling "protein folding, misfolding, aggregation and related diseases" (like Alzheimer's). Currently they have 171,628 CPUs running Windows, Mac OS X and Linux, which means that they have 196 teraflops on the problems. The image preceding this paragraph is a beta-amyloid peptide, "thought to be responsible for nerve cell death in Alzheimer's Disease." It is a part of Projects 722-724, so your computer could be a part of helping with Alzheimer's research. Truly a worthy cause, and you can join up here.
I was folding proteins for Professor Pande, but 2 years ago I had a colon polyp removed that was becoming cancerous. So these days my computers are working on cancer research headed up by the University of Oxford. There are over a million members with almost 3 million computers working on the problem, and yesterday 270,000 result sets were submitted. If you are concerned about cancer, download the software from here. Then join my team, SunONE. I am not actually the captain but I liked the name of the team.
I urge you to pick one of these projects and let your computer(s) crunch numbers while you are not actively using them. Each one of these processes runs in the background at a lower priority than all other work your computer is doing. Most can be paused, but you will not notice them in the background because your OS only schedules them when there are idle cycles available. If you are surfing the web, they back off. Even while you are typing they work, and trust me, a modern CPU is doing a lot of waiting around even if you type 300 words per minute. If "a mind is a terrible thing to waste," wasting good computer cycles is just criminal. Thanks for your support.
(2004-10-08 08:08:29.0) Permalink
I've been with Sun for 7 years in Professional Services. I primarily do large systems, HA clusters, Oracle 9iRAC clusters and performance and capacity planning on these systems for our customers.
I will talk more about my work later.
(2004-10-06 19:55:56.0) Permalink Comments [1]
In the spiritual department, I am currently working my way through the books of Chronicles and reading the book He Gave Us Stories by Richard Pratt. It's a technical tome on the interpretation of Old Testament narratives suitable for the serious layman or a preacher.
(2004-10-06 19:51:10.0) Permalink