Friday October 14, 2005 Well, my final day as a Sun employee approaches. Since being kicked out by Sun UK as part of its cost-cutting exercise I have managed to find another job. That is good news in itself, and it also shows that there are other companies out there that want Oracle-skilled people and know how to use them to their advantage.
Since being told of my redundancy so that Sun could save money, the main other news I have seen from Sun is its continued spending of money to buy other companies - Tarantella, SeeBeyond and StorageTek. The cost of StorageTek alone was $4.1 billion.
( Oct 14 2005, 04:21:44 PM BST )
Sun UK has had another round of redundancies, and I have been caught by it this time, as have a number of other good people. So it is goodbye to Sun from me, and to this blog.
I was looking forward to writing lots of interesting things about the new Galaxy servers and Niagara as they came to market, and we got to see what they could do running Oracle databases and applications. Especially when they would be running Solaris 10, with all its new features and advantages.
At the moment I am not angry about the decision (it is important to keep the business viable), just mainly disappointed. It seems a waste of the past 6 years as an Oracle / Performance specialist at Sun to be 'let go' when Sun is releasing some of the most interesting technology and products during the whole of that time (see the previous paragraph).
Anyway, I still wish Sun the best, as it is the most innovative computer systems company out there, and does have some interesting products and technology coming down the line. As I see some of this stuff get released I might blog about it at my personal blog at johnmbrady.blogspot.com
( Sep 16 2005, 10:19:15 AM BST )
I've just spent a long weekend in Northern Ireland attending a wedding. Everything went well, and everyone had a good time. One of the nice things during the wedding ceremony itself was that a piece of music was played on a harp at one point. This made a nice change from the organ traditionally played in churches, and the acoustics of the harp gave a different feel to the church. The harp player was Rachel Hair, who has been playing the Scottish harp since she was a child.
We made a weekend of the visit over from England, and had a day doing some of the tourist stuff. We visited the Giant's Causeway, which is quite a spectacular piece of coastline, and Bushmills Distillery, which makes Irish Whiskey.
( Aug 30 2005, 12:39:57 PM BST )
DTrace is really good. I'm sure most of you know that by now, so I'm not telling you anything new. But I was in an internal presentation on DTrace the other day, and I was just blown away by the ease with which the presenter created D scripts (the DTrace language) to find out what was happening on a system, and then drill down further.
Up to now I have always viewed Solaris Containers as the most useful feature of Solaris 10 i.e. the feature that you would benefit from the most, and would be the one that made you migrate to Solaris 10. But after this presentation I have changed my mind. DTrace gives you so much observability on what is happening on your system. And it does it immediately, non-intrusively (no need to recompile your application), in real time (meaning as it happens with no special set up), gives you as much detail as you want (including access to all arguments to functions and system calls), and is totally safe and always ready to use in Solaris 10.
Presuming that most application environments have some inefficiency or other in them, DTrace just opens up everything for you to find out where your bottlenecks are, and help identify the root causes of them. And you don't need the source code of any application to do this.
Want to see what is consuming all the I/O on the system? Then write a small DTrace script to count the calls to the read and write system calls. In fact, look at the size of each I/O too, show the distribution, and count the I/Os by process.
I know that this is all standard DTrace stuff, and that it has all been described many times in the DTrace documentation and other articles and blogs. So I won't waste time trying to describe what DTrace is capable of. I just wanted to say, that if I was an administrator or developer I would be begging for Solaris 10, so that I could use DTrace to investigate any anomaly with the system.
Containers would take some time to do all the necessary work: design the final configuration, set it all up, migrate the applications over from other systems and environments into each container, and require monitoring to ensure that performance of each application was acceptable. But DTrace I could use immediately on Solaris 10, on all applications, regardless of their configuration and set up.
( Aug 10 2005, 10:57:55 AM BST )
Taking a snapshot of the activity on a system, or a baseline profile of its performance, is a useful tool for dealing with future performance problems. By establishing a baseline, and recording all of its associated details, we have a reference point for comparison at any point in the future. Should any performance problems be reported about the system in the future, then we can compare the current profile of the system to its previous baseline. Any differences will help us identify what has changed and from that the cause of the change in behaviour on the system. How else are we to find out what has changed, and is causing the change in performance behaviour of the system? Without a baseline for reference, you literally do not know what has changed since the system was last working properly.
A performance profile consists of saving information about the utilisation of all of the resources on the system (CPU, Memory, Disk, Network, etc.), and all of the processes on the system (the consumers of the resources). This information is normally taken as a series of snapshots over a period of time. Examples range from every hour throughout a 24 hour day, to every minute during the peak hour of the day.
If you have some kind of performance management or monitoring tool, then it should be capable of capturing this data for you over your representative period, and then saving it away somewhere permanently for later use. If you don't have such a tool, then you can achieve something similar using standard UNIX tools like sar, ps, System Accounting and maybe even top (or prstat on Solaris), saving the outputs to a set of files. Of course it will require some manipulation to turn this raw data into a profile of the system and the applications running on it. But it must be better to have some data, no matter how raw it is, than no data at all.
Once this snapshot of the system behaviour has been established, we now have a point of reference for what we consider to be normal activity on the system. Should anything appear to be wrong at any point in the future, then we can compare it to this baseline snapshot, and find out what is different.
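As a rough illustration of that comparison step, here is a minimal Python sketch. Everything in it is an assumption made for the example: the metric names, the sample values, the 20% tolerance and the helper `compare_to_baseline` are all hypothetical, and in practice the numbers would come from whatever tool captured the profile.

```python
# Minimal sketch: flag metrics that have drifted from a saved baseline.
# Metric names, values and the tolerance are hypothetical examples.

def compare_to_baseline(baseline, current, tolerance=0.20):
    """Return {metric: (baseline_value, current_value)} for every metric
    whose current value differs from the baseline by more than
    `tolerance` (a fraction of the baseline value)."""
    changed = {}
    for name, base_value in baseline.items():
        if name not in current:
            continue  # metric was not captured this time, nothing to compare
        now = current[name]
        if base_value == 0:
            # Avoid dividing by zero: any non-zero value counts as a change.
            if now != 0:
                changed[name] = (base_value, now)
            continue
        if abs(now - base_value) / abs(base_value) > tolerance:
            changed[name] = (base_value, now)
    return changed


baseline = {"cpu_busy_pct": 40.0, "disk_reads_per_sec": 120.0, "free_memory_mb": 2048.0}
current = {"cpu_busy_pct": 42.0, "disk_reads_per_sec": 310.0, "free_memory_mb": 2000.0}

for metric, (was, now) in compare_to_baseline(baseline, current).items():
    print(f"{metric}: baseline {was}, now {now}")
```

With these made-up numbers only the disk read rate is reported, which is the kind of pointer a baseline is meant to give you.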
Performing this baseline does not require a lot of effort, yet has enormous potential benefits:
The cost of this is very little in real terms – some disk space to store the performance profile data, and some software tools to capture that data. Note that these tools would be needed to undertake any performance problem analysis anyway. So if you are serious about performance management on your systems, and have such tools, then there is no real extra expense to using them to baseline your systems.
( Aug 02 2005, 02:03:08 PM BST )
Sun's future Niagara processor has many implications when it comes to measuring the performance of a system, and the behaviour of the applications running on it. A single threaded processor and its execution core are either utilised or not at any given point in time – all or nothing. With a multi-threaded processor like Niagara, utilisation of the execution core can now have values between 0% and 100% - there are shades of grey between black and white.
Our goal is to maximise the performance of our applications on the computer hardware configuration we have available to us. One aspect of this is to understand how utilised the system capacity is. Another aspect is to understand how the resources are being used by each of the applications running on it. With single threaded processors this simplistic on/off view is close enough to the truth, and we have been used to it for many years. With multi-threaded processors we can either retain this simplistic view, or we can enhance it and be honest about the existence of the shades of grey.
Until now a processor has only had a single execution core capable of executing instructions from one thread at a time. Switching between threads involved intervention by the operating system, which in turn involved executing instructions from the kernel to save the current state of the running thread, and then choosing another thread to schedule on this processor. So only one thread was ever executing on that processor at any moment.
The operating system, such as Solaris, records the elapsed time that the thread spends running on that processor, and treats the processor as busy (utilised) for the same period. This elapsed time is added to the running total of CPU time consumed by the thread and to the busy time of the processor, which is used in calculating its overall utilisation.
With a Niagara processor the operating system can still record the same pieces of data about each thread it schedules on each execution core, and they are still valid (i.e. they reflect the scheduling performed by the operating system). However, they no longer tell the full story. Some things are missing which could be very relevant for a multi-threaded processor such as Niagara. We will now also be interested in the efficiency of the processor – how much of its potential capacity has been used and how much is spare - and how effectively a thread has been able to execute on it.
A Niagara processor will have 8 execution cores. Each of these execution cores will be able to concurrently deal with 4 threads. So Solaris will see 32 separate processors for scheduling threads on. However, there are in reality only 8 execution cores, not 32. Each execution core is being shared by 4 separate threads, with the execution core switching between their instruction streams during relatively long memory accesses by one thread to keep itself efficiently utilised. But at any instant in time the execution core is only executing one instruction at a time, not four instructions at a time.
Four threads concurrently executing on a Niagara processor can have different performance behaviours dependent on which execution cores the operating system schedules them on. If the four threads are each scheduled onto different execution cores, and presuming that there are no other active threads, then there is no interference between the execution of the threads. This is similar to having four separate single threaded processors.
If, instead, the four threads are all scheduled onto the same execution core, then they will be sharing that execution core, and there will be some form of interference between them, as they each in turn have their instructions executed while other threads access data from memory.
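The two placements can be contrasted with a small Python sketch. It is only a model: the mapping of virtual processor numbers to execution cores (consecutive numbers sharing a core) is an assumption made for illustration, not something Solaris or Niagara is known to guarantee, and `summarise` is a hypothetical helper.

```python
# Sketch of the two placements: four busy threads spread over four cores,
# or packed onto a single core. Figures from the text: 8 cores x 4 threads.

CORES = 8
THREADS_PER_CORE = 4
VIRTUAL_CPUS = CORES * THREADS_PER_CORE  # the 32 processors Solaris would see


def core_of(cpu_id):
    # Assumption: virtual CPUs 0-3 share core 0, 4-7 share core 1, and so on.
    return cpu_id // THREADS_PER_CORE


def summarise(busy_cpu_ids):
    """Describe a set of busy virtual CPUs at both thread and core level."""
    busy = set(busy_cpu_ids)
    threads_per_core = {}
    for cpu_id in busy:
        core = core_of(cpu_id)
        threads_per_core[core] = threads_per_core.get(core, 0) + 1
    return {
        "virtual_cpu_utilisation": len(busy) / VIRTUAL_CPUS,
        "cores_in_use": len(threads_per_core),
        "max_threads_sharing_a_core": max(threads_per_core.values(), default=0),
    }


spread = summarise([0, 4, 8, 12])  # one thread on each of four cores
packed = summarise([0, 1, 2, 3])   # four threads on the same core

print("spread:", spread)
print("packed:", packed)
```

Both placements look identical if you only count busy virtual processors (4 of 32), yet one uses four execution cores without sharing and the other has four threads contending for a single core.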
Due to this sharing of the execution core, I would now like to know about the utilisation of the physical execution cores in the Niagara processor, as well as the normal thread level scheduling. I would like to know about the Niagara's utilisation both in terms of number of concurrently executing threads, and overall efficiency (how many potential execution cycles were successfully used and how many were not due to lack of a ready instruction to execute).
Obviously, if an execution core is not at maximum efficiency and has unused execution cycles due to long memory accesses, then it is worth adding an extra thread of execution to it. If it is at maximum efficiency already then we would be better off adding in extra, separate execution cores to handle extra threads.
Knowing this information would help us make the right decisions on how to size our systems, and how much extra capacity they could deal with. So the big question is, how is Solaris going to actually measure and report both CPU utilisation and thread CPU usage (consumption) on multi-threaded Niagara processors? Another interesting set of challenges to have with the new design principles behind Niagara.
( Jul 25 2005, 02:33:44 PM BST )
Having posted a few entries to my blog here, I thought it was about time I tweaked its presentation and how it all got displayed.
I have played with HTML in the past to create a few simple, crude web pages, so I thought I would stand a chance with this. However, these were static pages that never changed, and it was really about simple formatting of some text and pictures on the page. Roller, which is used as the engine behind all these blogging entries, is a different and more up to date beast than the HTML I was used to. So I had to do some quick catching up with the current state of HTML, before I could go ahead and amend my blog configuration to achieve what I wanted.
The first thing I had to come to grips with is that HTML has evolved a bit, and that HTML 4.0 is quite different to HTML 3.2 in a number of areas. This is most obvious with regard to formatting the display of your text. HTML 3.2 allowed you to mix formatting specifications, such as FONT tags, in amongst the content that was being formatted by them. HTML 4.0 has deprecated all this, separating formatting specifications from the content it formats. (However, HTML 4.0 still currently supports FONT tags for all the existing pages on the web that use it.) This allows you to reformat the whole of a page, by simply changing a few format controls, which should be grouped together at the start of the page. To further enhance this ability, HTML allows you to centralise all your formatting controls into one file, and use them in all your web pages via Cascading Style Sheets (CSS). However, if needed, you can still override these within a web page, and essentially define a local, anonymous class style and use it at the same time.
Roller uses a set of CSS pages to define standard formats (called a class in HTML). The web pages that Roller uses to publish your blog entries refer to these classes to control the formatting of different parts of the contents of the web page. So changing some aspects of the formatting and layout of my blog meant that I had to find out what classes were being used in my web blog page, and how these were defined in the CSS pages.
The master Roller page (_decorator) includes a page called _css, which in turn expands to include themes/base.css, followed by themes/basic/theme.css because I am using the Basic Theme. One of the rules of CSS seems to be that it uses the last definition it finds of any class style. We can see that theme.css redefines some classes that have already been defined in base.css. In other words, I can use my blog's _css file to redefine class formats used by my blog, in order to customise the appearance of my blog page.
My _css file now includes the following additional lines at the end of it:
<style type="text/css">
.entries {width: 75%}
.rightbar {width: 25%}
h2 { font-size:15px }
h3 { font-size:16px }
h4 { margin:10px; }
pre.code {font-size:12px; }
ul.recentposts {list-style-type:disc}
</style>
This does the following for me:
Then in my _day page I changed the bold around the entry title to be h3 instead:
<h3>$entry.title</h3>
In my main weblog page itself, I added a section to list the most recent 10 blog entries I have made, after the Referers section.
<h2>Recent Posts</h2>
<div class="sidebar">
#showRecentEntries(10)
</div>
So now you see the results of these changes around you on my weblog page. I have also made a few other smaller changes, around the order of the sections in the right hand sidebar, and the display of my blog description after the title at the top of the page. I shall also probably make a few more tweaks over time. At the moment I am considering whether I should swap around my blog title and its description. I see that a number of other people have, so that the title is more punchy and descriptive, and the description states the name of the person. If this page is not called "John Brady's Weblog" but something else, then I have obviously gone ahead and changed it.
( Jul 14 2005, 10:46:12 AM BST )
Having preached the benefits of monitoring your systems by measuring what is happening on them, the next question is “How do I actually measure system activity?”. Well, there are a number of third party products out there that will do this for you. They are worth having because they are designed to continually collect data about the system and then let you report on it and analyse it in a number of ways later. Also, they can help you manage performance across a large number of systems. But we'll get to those later. Presuming you don't have access to these kinds of tools and want to be doing something rather than nothing, what can you do with Solaris out of the box?
I'm a big fan of sar, the System Activity Report package, although it is by no means a perfect tool. Sar and its associated package of commands come standard with Solaris (and most UNIXes in general), and you can even get it for Linux. Sar gathers most system activity statistics at once, has a low overhead, and can store its data in a binary file for later analysis. You can use sar to collect system activity data by either using the associated commands that come with it (sa1 and sa2) or by directly running the data collector itself (sadc).
I like sar because it can be set up very quickly, and will have minimal impact on the system if you do not collect data too often. The Solaris kernel is always gathering and calculating these statistics - all we are doing is telling it to save them to disk every now and then. So the only real cost is the disk space required to save the daily data files away.
Solaris comes with some cron entries for the sys user, ready for you to enable sar to collect data. The default entries in the crontab for sys that do this are:
# 0 * * * 0-6 /usr/lib/sa/sa1
# 20,40 8-17 * * 1-5 /usr/lib/sa/sa1
# 5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A
The net effect of this would be to collect data every hour around the clock, and every 20 minutes during the working day (8am to 6pm, Monday to Friday), to a data file named /var/adm/sa/sadd, where dd is the day of the month. At 6:05pm every weekday it would process the day's data, and produce a text file with all the sar reports in it, called /var/adm/sa/sardd. It also deletes sar data files older than 7 days.
To enable sar data collection at this default frequency, all you need to do is edit this file (as the root user do 'crontab -e sys') and remove the leading '# ' on these 3 lines. Do not edit the file directly; you must use the crontab command to make changes.
However, many systems are now 24 x 7, so these defaults may be inappropriate. To change the collection frequency to every 15 minutes every day (which would now give us 96 sample points per day), edit the crontab as before, but replace the two sa1 lines with the following:
0,15,30,45 * * * * /usr/lib/sa/sa1
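To check the sample counts, here is a small Python sketch that counts how often a cron entry fires in one day. The helper `runs_per_day` is hypothetical and only understands explicit lists of minutes and hours, not the full crontab syntax.

```python
# Count cron firings per day from explicit minute and hour lists.
# This is a simplification: it ignores day-of-week and day-of-month fields.

def runs_per_day(minutes, hours):
    """Number of times an entry with these minutes and hours fires in a day."""
    return len(set(minutes)) * len(set(hours))


# The replacement entry: every 15 minutes, every hour of the day.
every_15_minutes = runs_per_day([0, 15, 30, 45], range(24))

# The two default sa1 entries on a weekday: on the hour all day, plus
# 20 and 40 minutes past each hour from 08:00 to 17:00.
default_weekday = runs_per_day([0], range(24)) + runs_per_day([20, 40], range(8, 18))

print("every 15 minutes:", every_15_minutes)
print("default weekday: ", default_weekday)
```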
Sa1 and sa2 are actually just shell scripts. It would be very easy to customise these. However, you should not change these scripts in place, as they might get overwritten by future upgrades to the operating system. It would be better to copy these scripts to another location, such as /usr/local/lib/sa, creating it if it does not exist, and change the sys crontab entries to refer to the files in these locations instead. Then you can edit and customise these local scripts as desired.
One possibility would be to extend the retention period for the sar data files, and to not bother producing the sar report files every day or only produce the CPU utilisation report. Using a naming convention of sayymmdd instead of sadd we can safely retain the data files for over a year (400 days allows year on year comparisons).
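A customised collection script would need two small pieces of logic: building the sayymmdd file name and deciding when a file is old enough to delete. The real sa1 and sa2 are shell scripts, so the Python below is only a sketch of that logic; the function names are hypothetical and the 400 day retention is the figure suggested above.

```python
# Sketch of the naming and retention rules for long-lived sar data files.
from datetime import date

RETENTION_DAYS = 400  # long enough for year on year comparisons


def data_file_name(day):
    """Build the sayymmdd name for the given date, e.g. sa050712."""
    return "sa" + day.strftime("%y%m%d")


def is_expired(file_day, today, retention_days=RETENTION_DAYS):
    """True if a data file for file_day is older than the retention period."""
    return (today - file_day).days > retention_days


today = date(2005, 7, 12)
print(data_file_name(today))
print(is_expired(date(2004, 7, 12), today))  # one year old: still kept
print(is_expired(date(2004, 6, 1), today))   # over 400 days old: can go
```

Unlike sadd, which repeats every month, the sayymmdd name is unique for each day within a century, so nothing is overwritten during the retention period.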
Sa1 and sa2 use sadc, the SAR Data Collector. You can use sadc directly instead of using sa1 and sa2, to perform regular collection of activity data to a file. Sadc can be run to collect data and save it to a file, at a regular collection interval for a number of sample points. You can use sadc manually to gather more detailed data for any specific period of time you want, and analyse the data later using sar.
sar takes a number of options. One option is to specify the data sets you want to see. The default is -u for CPU utilisation. Another option (-f) allows you to specify the data file to read from and report on. And you can also specify the time periods you want reported (-s and -e).
Sar gives you one half of the performance monitoring picture – how busy is each of the resources in the system? The other half – what is using the resources? - is much more difficult, for a number of reasons. There is no direct utility that captures process level data at regular time intervals to a data file for later reporting. There are some utilities that give per process information, but they all work separately and tend to produce simple text files. Given the number of processes on most systems, these can run to large files, and could make navigation awkward.
For a simple list of all processes on a system you can use ps. For a list of the most active processes, you can use prstat, which is really meant to be a real time analysis tool, rather than a data capture and reporting tool. System Accounting can also be used to report on process resource consumption (acct), but suffers from the flaw that it is only updated when a process finishes. Therefore you never get any information about a process until it terminates.
There are a number of third party tools, which I shall not begin to attempt to list or describe or compare. You can either use specialised performance management products, or use more generic system management products that have a module for collecting and reporting on system performance.
I use TeamQuest quite a lot, and am happy with it for what I use it for. I can install TeamQuest Manager easily and have it immediately start collecting data on what the system is up to. Using View I can then report on both the resource utilisation and the processes that are consuming them. If necessary I can use Model to perform 'what-if' analysis on a problem system to see the effect of any configuration changes.
( Jul 12 2005, 11:38:27 AM BST )
Most people probably want a 'scalable' system when they buy a computer. 'Scalable' can be taken to mean a whole bunch of different things, but for what I am talking about here we can presume it means both the ability for the system to support increases in workload while maintaining the same transaction response time, and that by doubling the system configuration (the resources) it can support twice the workload.
Scalability of a system is not really about what is there in the system, but rather what is 'not' there. Scalability becomes limited or constrained when something gets in the way. Less is more. Good scalable design is about either avoiding the things that can get in the way, or explicitly designing around them when they are unavoidable. Examples of things that have limited system scalability in the past include:
The classic description of the results from this kind of scalability is Amdahl's Law. However, computer hardware and software vendors have spent many years developing various techniques that let them build and deploy large systems that do not suffer from Amdahl's Law. So, generally, today's systems are more balanced designs, that scale well within their stated capacity.
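For reference, Amdahl's Law says that if only a fraction p of the work can be spread across n processors, the overall speedup is 1 / ((1 - p) + p / n). A minimal Python sketch follows; the function name and the sample fractions are just illustrative choices.

```python
# Amdahl's Law: the serial part of the work caps the achievable speedup.

def amdahl_speedup(parallel_fraction, processors):
    """Speedup over one processor when `parallel_fraction` of the work
    (between 0 and 1) can be spread evenly across `processors`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


for n in (2, 4, 8, 16):
    print(n, "processors:", round(amdahl_speedup(0.9, n), 2))
```

Even with 90% of the work parallel, the speedup can never exceed 10 however many processors are added, which is the kind of limit that good scalable design tries to avoid.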
With today's generation of computers, it is actually the memory sub-system that limits the scalability. This is because the processors in computers have got at least 1,000 times faster in the past 10 years, whereas memory has probably only got 100 times faster. It is the relative ratio between these two that is important, not the absolute amount of the increases. Also, the other end of the computer system where it connects to the outside world – the network – and where work comes from and results go back to, has also got a lot faster over the past 10 years. We have moved from 10 Mb/sec through 100 Mb/sec to 1 Gb/sec (Gigabit Ethernet) as standard for many networks.
Modern processors work internally at multiples of the external system bus frequency – anything from 4 to 10 times would be possible. So a 2 GHz processor may be interfaced to a 400 MHz bus, for a multiple of 5. This already shows that any memory access is going to waste multiple internal cycles of the processor. On top of this, modern memory sub-systems do not respond with the data within a single system cycle. It is several system cycles after being given an address that they respond with the data.
How does this affect a CPU? If a CPU had an internal clock speed of 1 GHz, then one CPU cycle is 1 ns (nanosecond). If the total time to obtain data from memory was 100 ns, then the CPU has been idle for most of the 100 CPU cycles. (It will not have been totally idle, as modern CPUs use an internal pipeline of sub-tasks to execute each instruction in parts. Stages of the pipeline already executing other instructions will be able to finish them during the memory access).
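The arithmetic behind these figures is simple enough to sketch in Python. The function names are hypothetical, and the numbers are the ones used above (a 1 GHz CPU with 100 ns memory latency, and a 2 GHz processor on a 400 MHz bus).

```python
# Convert a memory latency into CPU cycles spent waiting.

def cycles_per_memory_access(clock_hz, memory_latency_ns):
    """CPU cycles that elapse while one memory access completes."""
    cycle_time_ns = 1e9 / clock_hz
    return memory_latency_ns / cycle_time_ns


def clock_multiplier(cpu_hz, bus_hz):
    """How many internal CPU cycles fit into one system bus cycle."""
    return cpu_hz / bus_hz


print(cycles_per_memory_access(1e9, 100))  # 1 GHz CPU, 100 ns memory
print(cycles_per_memory_access(2e9, 100))  # same memory, 2 GHz CPU
print(clock_multiplier(2e9, 400e6))        # 2 GHz CPU on a 400 MHz bus
```

Doubling the clock speed without making memory any faster doubles the number of cycles lost on each access, which is why the ratio matters more than either absolute figure.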
So in modern computers we end up with the situation that often the CPU is idle, wasting cycles waiting for data from memory, and that it is the memory sub-system that is the limitation to how the system scales as more work engines (CPUs) are added to the system. Clearly there is an imbalance between a processor's ability to do work and how quickly the work can be supplied, so that the rest of the system spends time waiting on memory. This is where the key focus of good, balanced system design should be.
A good system design principle is therefore about hiding this difference between real memory speed and CPU speed, so that the impact on the CPU of the much slower memory is minimised. Most CPUs today have areas of silicon on them dedicated to this memory interface, doing things to try and offset the relative cost of memory access. This is where you will find things like data and instruction caches running at the same speed as the CPU, branch prediction, pre-fetch buffers and write behind buffers. Many of these are aimed at trying to get the data before the CPU needs it, which is not always possible due to the variations in how programs behave.
Sun's future Niagara processor has a new approach to this 'memory speed hiding' principle, by having four threads co-exist within the CPU's execution core at the same time. The CPU will only ever be executing one of these threads at any moment in time, as other current CPUs do. However, when the currently executing thread needs an external memory access, the CPU simply switches to another thread while this is happening. Thus the delay incurred for the memory access for one thread, is actively used to execute instructions of another thread. This has a number of benefits:
As someone who spends a lot of time concerned with the performance of computer systems, and the actual performance achieved by customers with their applications running on real hardware, the Niagara processor looks like a great win-win deal to me. It uses a simpler CPU execution core design, has a zero-cost switch between threads, hides memory access times, and increases overall system throughput and utilisation. And with less hardware (just one processor) than current systems.
I believe that as Sun ships an actual system using the resultant CPU from Niagara we will see radically different behaviour profiles from systems and their applications. We will have to learn to interpret CPU utilisation and application throughput in different ways. An existing application could behave differently on a Niagara based system, and achieve a greater throughput, yet with only a single CPU. In this case, less is truly more.
As the saying goes – “May you live in interesting times”.
( Jul 11 2005, 02:31:19 PM BST )
Is RAC better on a few large computer nodes, or on many small nodes? This question came up in the office yesterday. We hear a lot from Oracle about using Real Application Clusters with Linux and commodity systems and the like, but which is really the best solution? And why?
Obviously everyone has opinions about Oracle RAC, but the question was about what published material there was out there that could be shown to a customer. For me, this material falls into two camps:
Technically, RAC cannot make anything go any faster. A single transaction on a RAC database will take longer to complete than on a non-RAC database. This is because of both the extra work that a RAC database has to do with checking ownership of blocks containing data before accessing them, and also any messages and data blocks that need to be transferred over the cluster interconnect between the nodes as part of that transaction. This is all extra to what would occur on a non-RAC database, and so must make each transaction take longer to complete.
Although transferring a block between two nodes is faster than a disk access (say 1 millisecond compared to 10 milliseconds), it is still far slower than a memory access within a node (1 microsecond or less). It is not the absolute values which are important here, which can vary a lot, but the relative scale of each. So, although a cache fusion block transfer between nodes is 10 times quicker than reading that block from disk, it is still about 1,000 times slower than accessing that block from memory in the computer system. See Oracle RAC's Secret in Dave Brillhart's Blog for a more detailed description of this issue.
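The relative scale is easy to see when the three illustrative figures are put in the same unit. The Python sketch below uses the round numbers quoted above, which are not measurements.

```python
# Illustrative access times in microseconds, taken from the text.
DISK_READ_US = 10_000         # about 10 milliseconds
INTERCONNECT_TRANSFER_US = 1_000  # about 1 millisecond
LOCAL_MEMORY_US = 1           # 1 microsecond or less

# How much quicker a block transfer is than a disk read.
transfer_vs_disk = DISK_READ_US / INTERCONNECT_TRANSFER_US

# How much slower a block transfer is than a local memory access.
transfer_vs_memory = INTERCONNECT_TRANSFER_US / LOCAL_MEMORY_US

print("transfer is", transfer_vs_disk, "times quicker than disk")
print("transfer is", transfer_vs_memory, "times slower than memory")
```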
What RAC does offer is both higher levels of availability and out of the box scalability, if you need them.
Using RAC on two or more nodes significantly reduces any failover time due to a single system failure. As well as reducing the time to failover any service between nodes to potentially sub-minute, it can reduce the percentage of users affected by the single node failure. If you load balance user connections across all the nodes in the RAC cluster, then only those on the crashed node are affected. Other users' connections remain, and are uninterrupted.
Clustering systems together using RAC also allows you to support a greater workload than a single system could handle. However, scalability is not what I call linear. By linear I mean that twice the resource results in twice the throughput. With RAC the best published results give about 1.8 times the throughput for 2 times the resources. This 1.8 is compounded each time you double the resource, so that 4 nodes would give you about 3.24 times the throughput of 1 node (1.8 * 1.8). If you went to 8 nodes, you would expect 5.8 times the throughput of 1 node. This is why most published implementations of RAC use no more than 4 nodes.
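The compounding can be written out as a short Python sketch. The 1.8 factor per doubling is the figure quoted above; the function name is hypothetical, and applying the factor to node counts that are not powers of two is an extrapolation rather than a published result.

```python
# Throughput relative to one node when each doubling of nodes gives 1.8x.
import math


def relative_throughput(nodes, factor_per_doubling=1.8):
    """Throughput of `nodes` nodes as a multiple of one node's throughput."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    doublings = math.log2(nodes)
    return factor_per_doubling ** doublings


for n in (1, 2, 4, 8, 16):
    total = relative_throughput(n)
    print(f"{n:2d} nodes: {total:5.2f}x total, {total / n:.0%} of linear")
```

The second column shows why large clusters become unattractive: at 8 nodes each node delivers under three quarters of what it would as a standalone system.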
The only scenario under which RAC can offer better than 1.8 scalability factor, is if the workload is highly partitioned between the nodes in the cluster, with no data sharing between the users on the nodes. In essence you end up with a number of independent nodes, with discrete user populations, who never share data and so never need to get a data block from another node via the cluster interconnect. But the users are all hosted on a shared, clustered database giving very fast failover times in the event of any node crashing for any reason. This can give good scalability along with high levels of availability.
Which brings us to the practical side of using Oracle RAC. What is involved in actually deploying RAC and maintaining it on a day to day basis? For me, the sweet spot is between 2 and 4 nodes in the cluster. Any more than this and the management and maintenance tasks take up too much time and resources.
Any maintenance operation for a node has to be repeated across all the nodes. And the more nodes, the longer such operations will take. Most upgrades, whether of the operating system or the Oracle database, will involve downtime of the service on the node being upgraded. While other users can continue on the other nodes of the cluster, these upgrades have to be done one after the other, one node at a time. So more nodes just make things harder and take longer. And if you wanted to minimise disruption to users by only doing 1 node each night, it would take you over a week to upgrade an 8 node system, and half a month for 16 nodes.
More nodes also lead to a greater rate of node failure. If 1 node has a failure rate of x failures per year, then 8 nodes will fail at 8x per year, i.e. 8 times as often. So, for example, if we expected an average of 1 failure per year for a single node, then an 8 node cluster would have a node failing every 1.5 months or 6 weeks (presuming a 4 week month) on average, and a 16 node cluster would have a node failure every 3 weeks on average.
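The maintenance and failure arithmetic above can be sketched in a few lines of Python. The function names are hypothetical, and the 48 week year is simply the "4 week month" simplification used above; with a 52 week year the intervals come out slightly longer (about 6.5 and 3.25 weeks).

```python
WEEKS_PER_YEAR = 48  # assumption carried over from the text: 12 months of 4 weeks


def nights_to_upgrade(nodes: int, nodes_per_night: int = 1) -> int:
    """Nights needed for a rolling upgrade, doing a fixed number of nodes per night."""
    return -(-nodes // nodes_per_night)  # ceiling division


def weeks_between_node_failures(nodes: int,
                                failures_per_node_per_year: float = 1.0,
                                weeks_per_year: float = WEEKS_PER_YEAR) -> float:
    """Average gap between node failures somewhere in the cluster,
    assuming every node fails independently at the same rate."""
    cluster_failures_per_year = nodes * failures_per_node_per_year
    return weeks_per_year / cluster_failures_per_year


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(f"{n:2d} nodes: {nights_to_upgrade(n):2d} nights to upgrade, "
              f"a node failure every {weeks_between_node_failures(n):.1f} weeks")
```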
This is why somewhere between 2 and 4 nodes offers all the benefits of high availability and load balancing, while minimising the impact of maintenance and management tasks, and the frequency of node failures.
How are customers doing this in the real world? In fact, how are Oracle deploying RAC themselves? Oracle have consolidated over 70 separate database instances into one single global instance using Oracle RAC. And they decided that the best architecture to deploy this on to achieve their targets of performance and availability was a 4 node cluster. And the platform for this? A cluster of Sun Fire 12Ks, each with 36 CPUs.
In fact, since then these systems have been upgraded, and the CPU count increased. Larry Ellison seems very happy with these big Sun servers. (If that link is broken, here is another link to a cached copy of that article on Google.)
So, from both a technical and a practical point of view, between 2 and 4 nodes seems to be the best architecture for deploying business critical databases using Oracle Real Application Clusters.
( Jul 05 2005, 12:48:21 PM BST )
I'm a big believer in proactive performance management, which means lots of different things. One aspect of it is doing proper performance monitoring of your systems.
Performance monitoring a system really means measuring and recording what is happening on that system. In simple terms – if you don't measure it, how can you know what is going on? If you don't know what is happening on each of your systems, how are you ever going to be able to diagnose the real cause of a performance problem?
And if you don't record it somewhere, how can you do any kind of analysis on the data being gathered? You need to have reliable data on what the system was doing, both recently and in the past.
Reliability is both about recording the data and measuring it accurately. You need to record the data so that you are looking at a consistent set of related values, rather than a set of constantly changing values. Accuracy is important, as inaccurate data will not help you identify the real cause of any performance problem, and so help with finding a solution. You need to be able to trust the performance data being gathered about your system.
However, it is not enough to know that CPU utilisation has risen to 80%. You also need to know what is consuming those CPU cycles. Only then can you focus in on the culprit(s), and try to find out what they are doing.
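As a minimal illustration of that point, the Python sketch below takes one interval's worth of per-process CPU samples and reports both the system total and the biggest consumers. The `ProcessSample` type, the process names and the numbers are all made up for the example; how you actually gather the samples depends on your platform and tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessSample:
    pid: int
    name: str
    cpu_percent: float  # share of total system CPU used during the interval


def summarise_interval(samples, top_n=3):
    """Return (total CPU %, top_n heaviest processes) for one sampling interval."""
    total = sum(s.cpu_percent for s in samples)
    top = sorted(samples, key=lambda s: s.cpu_percent, reverse=True)[:top_n]
    return total, top


if __name__ == "__main__":
    # Hypothetical data for a single interval
    samples = [
        ProcessSample(101, "oracle_batch", 55.0),
        ProcessSample(102, "oracle_oltp", 20.0),
        ProcessSample(103, "backup", 4.0),
        ProcessSample(104, "sshd", 1.0),
    ]
    total, top = summarise_interval(samples, top_n=2)
    print(f"System CPU utilisation: {total:.0f}%")
    for s in top:
        print(f"  pid {s.pid} ({s.name}): {s.cpu_percent:.0f}%")
```

The total alone says the system is busy; only the per-process breakdown says where to look.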
So performance monitoring needs to collect data from a system perspective (a collection of resources) and a workload perspective (a collection of processes), to get both sides of the system activity equation:
( Jul 04 2005, 12:07:29 PM BST )
As previously posted, I like analogies when describing how best to approach managing performance of a computer system running a business application. One analogy I have tended to use is to compare performance management to insurance.
In today's world we accept that we need insurance for all kinds of things. Apart from the fact that some of these are mandatory (car insurance if you drive a car in the UK), most people understand that the consequences of not having insurance when tragedy strikes far outweigh the costs of obtaining that insurance in the first place. Even though you never intend or expect to make a claim against that insurance.
So today we understand that we need separate insurance policies to cover many different aspects of our life:
In fact the list of different types of insurance you can buy today just goes on and on.
I see proactive performance management as a form of insurance. By paying some extra money up front to instrument your systems, and to monitor and record what is happening on them, you will be in the best possible situation to respond when something starts misbehaving.
If something ever goes wrong, then you will already have in place all the information you need to analyse what is happening, identify the root cause, and decide on the most appropriate form of action to remedy it.
But this is not what most computer departments do. They wait until something goes wrong, and then take an iterative approach of trying different fixes until one of them succeeds. Often these fixes involve either downtime of the application for each change, or significant monetary outlay to obtain extra resources (typically CPUs, memory or disks).
But the key point is that without proper information about how the application and systems are behaving, you cannot identify the true cause of the problem. Often you are just applying your own "rules of thumb" and tackling anything that seems unusual, whether or not it is related to the cause of the performance problem.
Some of the performance management and analysis tools out there can record where time is being spent by the application, and how much of what resources it is using. With this information you can easily identify what has changed when a performance problem is reported. Having identified the cause, you can determine what effect any changes you propose might have on the overall performance of the application and the systems it is running on. Knowing where the application is spending its time during each transaction, will help you focus on the areas that would give the greatest payback.
Furthermore, these performance management tools will let you easily identify any change in the performance behaviour of the systems and the application. So you can identify changes in normal behaviour before they grow to the level of impacting the observed performance of the application. Even if the degree of change is very small, you can still use trend analysis to estimate when in the future there could be a noticeable impact on performance.
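As a rough sketch of what such trend analysis can look like, the Python below fits a straight line to recorded response times by ordinary least squares and estimates how long until a chosen threshold is crossed. This is a deliberately simple assumption on my part (real tools may use more sophisticated models), and the function names, sample data and 250 ms threshold are invented for the example.

```python
def linear_trend(xs, ys):
    """Least-squares straight line through the points; returns (slope, intercept)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_threshold(days, values, threshold):
    """Days after the last sample until the fitted line reaches threshold.
    Returns None when the trend is flat or improving."""
    slope, intercept = linear_trend(days, values)
    if slope <= 0:
        return None
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - days[-1])


if __name__ == "__main__":
    # Hypothetical weekly averages of transaction response time, in milliseconds
    days = [0, 7, 14, 21, 28]
    response_ms = [200.0, 204.0, 208.0, 212.0, 216.0]
    remaining = days_until_threshold(days, response_ms, threshold=250.0)
    print(f"Estimated days until 250 ms is reached: {remaining:.1f}")
```

A drift of 2% a week is easy to miss on any one day, but with recorded history it is enough to put an approximate date on when users will start to notice.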
And all this for some extra money up front. Instead of having to keep teams of troubleshooters around, just in case. And then experiencing lengthy periods of degraded performance and service levels when any performance problem occurs, while you try different fixes until one of them works. And then hoping that you have finally fixed it all, and that it doesn't happen again.
( Jun 30 2005, 03:29:47 PM BST )
I am a big believer in what I call "Proactive Performance Management". In other words, doing something about performance of an application on a computer system before it becomes a problem. By which point, of course, it is too late.
One of the problems I have is persuading people that this is something worth spending time, effort and money on today. Most people take the approach of "If it ain't broke, don't fix it", and so do not see the benefit of spending money on addressing a problem that doesn't yet exist. So I am always on the lookout for any good descriptions of the dangers of not addressing performance properly, and of the benefits when you do.
I also like analogies, as they stop us getting stuck in a set of specialised terminology related to computers. And a good analogy will get the point over, and show that the principle applies to other scenarios too. Which should increase the strength of the argument being put forward.
So, while reading Adrian Cockcroft's blog I came across a posting comparing fighting house fires to managing performance ( Playing with Fire ). And this made a lot of sense to me. No one would choose to live in a building that was not well designed, and whose design had not taken the consequences of fire into account. Otherwise, you would end up spending a lot of your time dealing with spontaneous fires. Given the choice, most people would choose a well designed, safe building.
So why do we not design performance into the environments in which we deploy software applications? Why do we continue to presume that nothing needs to be done about performance, and end up spending significant amounts of time and effort "fighting fires" when some system or other starts behaving badly?
The analogy to avoiding fires brings out another point. You do not add performance or performance management in at the end, when the system has been built and deployed. Performance is not something you can just bolt on to an existing system. Just like you cannot bolt on fire safety to an inadequately designed building after it has been built. Good performance management needs to be designed in from the very start of the system.
( Jun 30 2005, 10:37:14 AM BST )
Like many other people, no doubt, after much time I have eventually decided to become another blogger at Sun. Here, believe it or not, is my first weblog post.
I'm John Brady, part of Sun Microsystems in the United Kingdom. Basically I'm a kind of Technical Consultant specialising in the Oracle database software, and all things related to performance of computer systems. A quick summary of my history:
Since starting as an application developer many years ago, I have always been working with relational database systems on UNIX in one way or another. During the past 15 years, I have been working with some of the largest and most powerful UNIX systems, especially multi-processor (SMP) systems, for different system manufacturers. In this time I have always focused on performance of large databases running on UNIX multi-processor systems. As a result I have experience of the issues involved in designing, building and deploying large complex server based solutions.
Given this, I will mainly be making posts about the Oracle database software, and performance management of systems, including good design of large, scalable systems.
( Jun 29 2005, 03:46:37 PM BST )