Wednesday November 08, 2006
Unbreakable Oracle 10g Release 2 : What if you have ORA-600 kcratr1_lastbwr ?
This is an interesting story that happened yesterday at one of our customer sites. An engineer powered off the wrong rack of equipment, which contained a Sun Fire X4600 running Oracle 10g Release 2. Almost no transactions were being performed at the time, so when the system came back up the customer expected the database to be up and running very quickly.
In reality this is what happened :
Tue Nov 7 11:19:42 2006
ALTER DATABASE OPEN
Tue Nov 7 11:19:42 2006
Beginning crash recovery of 1 threads
parallel recovery started with 16 processes
Tue Nov 7 11:19:44 2006
Started redo scan
Tue Nov 7 11:19:44 2006
Errors in file /xxx/oracle/oracle/product/10.2.0/db_1/admin/xxx/udump/xxx_ora_947.trc:
ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [], [], [], []
Tue Nov 7 11:19:44 2006
Aborting crash recovery due to error 600
Tue Nov 7 11:19:44 2006
Errors in file /xxx/oracle/oracle/product/10.2.0/db_1/admin/xxxtest/udump/xxxtest_ora_947.trc:
ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...
Not too pretty ! Checking the ASM configuration and the IO subsystem showed nothing wrong. So what to do if you do not have a backup handy ?
Well, here is the idea .... what would we do if we had a backup that was inconsistent ?
The recover database command will start an Oracle process which will roll forward all transactions stored in the restored archived logs necessary to make the database consistent again. The recovery process must run up to a point that corresponds with the time just before the error occurred after which the log sequence must be reset to prevent any further system changes from being applied to the database.
So we tried :
startup mount
Tue Nov 7 11:54:03 2006
Starting background process ASMB
ASMB started with pid=61, OS id=1070
Starting background process RBAL
RBAL started with pid=67, OS id=1074
Tue Nov 7 11:54:13 2006
SUCCESS: diskgroup xxxTESTDATA was mounted
Tue Nov 7 11:54:17 2006
Setting recovery target incarnation to 2
Tue Nov 7 11:54:17 2006
Successful mount of redo thread 1, with mount id 2364224219
Tue Nov 7 11:54:17 2006
Database mounted in Exclusive Mode
Completed: ALTER DATABASE MOUNT
Tue Nov 7 11:54:32 2006
recover database
Tue Nov 7 11:54:32 2006
Media Recovery Start
parallel recovery started with 16 processes
Tue Nov 7 11:54:33 2006
Recovery of Online Redo Log: Thread 1 Group 3 Seq 4 Reading mem 0
Mem# 0 errs 0: +xxxTESTDATA/xxxtest/onlinelog/group_3.263.605819131
Tue Nov 7 11:59:25 2006
Media Recovery Complete (xxxtest)
Tue Nov 7 11:59:27 2006
Completed: ALTER DATABASE RECOVER database
alter database open
alter database open
Tue Nov 7 12:03:01 2006
Beginning crash recovery of 1 threads
parallel recovery started with 16 processes
Tue Nov 7 12:03:01 2006
Started redo scan
Tue Nov 7 12:03:01 2006
Completed redo scan
273 redo blocks read, 0 data blocks need recovery
Tue Nov 7 12:03:01 2006
Started redo application at
Thread 1: logseq 4, block 12858574
Tue Nov 7 12:03:01 2006
Recovery of Online Redo Log: Thread 1 Group 3 Seq 4 Reading mem 0
Mem# 0 errs 0: +xxxTESTDATA/xxxtest/onlinelog/group_3.263.605819131
Tue Nov 7 12:03:01 2006
Completed redo application
Tue Nov 7 12:03:01 2006
Completed crash recovery at
Thread 1: logseq 4, block 12858847, scn 824040
0 data blocks read, 0 data blocks written, 273 redo blocks read
Tue Nov 7 12:03:02 2006
Thread 1 advanced to log sequence 5
Thread 1 opened at log sequence 5
Current log# 1 seq# 5 mem# 0: +xxxTESTDATA/xxxtest/onlinelog/group_1.261.605819081
Successful open of redo thread 1
Tue Nov 7 12:03:02 2006
MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Nov 7 12:03:02 2006
SMON: enabling cache recovery
Tue Nov 7 12:03:03 2006
Successfully onlined Undo Tablespace 1.
Tue Nov 7 12:03:03 2006
SMON: enabling tx recovery
Tue Nov 7 12:03:03 2006
Database Characterset is UTF8
replication_dependency_tracking turned off (no async multimaster replication found)
Starting background process QMNC
QMNC started with pid=56, OS id=1128
Tue Nov 7 12:03:05 2006
Completed: alter database open
And we are up and running ! The real thing that Oracle should work on is the quality and clarity of its error messages.
At this point they are quite poor ...
Unbreakable database, maybe. Automatic (and simple) , not yet.
Nov 08 2006, 04:44:44 PM PST Permalink
Do you need an OLTP benchmark for ANSI v2 databases ?
Do you need a new OLTP benchmark ?
A benchmark that could be lightweight as well as heavyweight ?
That could be IO intensive or CPU intensive ?
That could run on any database and any operating system ?
That you could run on your laptop but also scale up to a 144-CPU Sun Fire E25K+ ?
That could run standalone, in client/server or in a 3 tier model ?
That would instantly produce color charts and comprehensive PDF or HTML reports ?
Well, send me an email or a comment ( I am sure you are smart enough to find my email address somewhere)
if you want it. If my mailbox becomes full - I'll see what I can do....
MrBenchmark
Oct 19 2006, 01:43:17 PM PDT Permalink
A second benchmark to compare V40z, V490 and T2000 : iGenRAM v1.2
The iGenRAM v1.2 benchmark is a Java-based memory application useful for comparing the memory performance of different systems. Based on the functional requirements of the California Lotto, this application simulates a California Lotto play consisting of :
1-Players play Lotto tickets by choosing series of six numbers. Each thread simulates 6 million tickets played. To store tickets, memory is allocated in Java in the form of multi-dimensional integer arrays.
2-The system generates a list of six winning numbers.
3-The system has to determine which tickets won and for what amount. For this, it needs to browse all tickets and compare each one to the winning combination.
The total duration of these tasks produces the throughput in Lotto plays computed per second (iGenRAM_Thp) and the average response time (iGenRAM_RT).
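The three steps above can be sketched in a few lines. This is only an illustration in Python, not the iGenRAM source (which is Java and uses multi-dimensional integer arrays): the number range `MAX_NUMBER`, the function names and the way throughput and response time are derived are all assumptions made for the example.

```python
import random
import time

NUMBERS_PER_TICKET = 6
MAX_NUMBER = 49  # assumption: the post does not give the range of playable numbers


def make_tickets(count, rng):
    """Step 1: each ticket is a series of six distinct numbers."""
    return [rng.sample(range(1, MAX_NUMBER + 1), NUMBERS_PER_TICKET)
            for _ in range(count)]


def count_matches(tickets, winning):
    """Step 3: browse all tickets and compare each to the winning combination.

    Returns a histogram where index k holds the number of tickets
    that matched exactly k winning numbers.
    """
    winning_set = set(winning)
    histogram = [0] * (NUMBERS_PER_TICKET + 1)
    for ticket in tickets:
        matches = sum(1 for n in ticket if n in winning_set)
        histogram[matches] += 1
    return histogram


def run_draws(tickets, draws, rng):
    """Run several draws and return (plays per second, average time per play in ms)."""
    start = time.perf_counter()
    for _ in range(draws):
        # Step 2: generate six winning numbers
        winning = rng.sample(range(1, MAX_NUMBER + 1), NUMBERS_PER_TICKET)
        count_matches(tickets, winning)
    elapsed = time.perf_counter() - start
    return draws / elapsed, 1000.0 * elapsed / draws
```

The real benchmark holds 6 million tickets per thread, so almost all of the time is spent walking memory in `count_matches`, which is why memory latency dominates the result.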
Systems with low memory latency and scalable memory interconnect will succeed. We expect good things from the Sun Fire T2000.

Next up : iGenRAM v1.2 results on V490, V40z and T2000
Jan 02 2006, 02:28:01 PM PST Permalink
What is SWaP and iGenCPU SWaP values
- Performance: Using industry-standard benchmarks or your own benchmark !
- Space: Measuring the height of the server in rack units (RUs).
- Power: Determining the watts consumed by the system, using data from actual benchmark runs or vendor site planning guides
The SWaP metric is calculated this way :
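As published by Sun, SWaP divides performance by the product of space and power: SWaP = Performance / (Space x Power). A minimal sketch of that calculation follows; the function name and the figures in the usage note are hypothetical and are not measurements from this blog.

```python
def swap_score(performance, rack_units, watts):
    """SWaP = Performance / (Space x Power).

    performance: any benchmark throughput (for example fractals per second)
    rack_units:  height of the server in rack units (RUs)
    watts:       power consumed by the system
    """
    if rack_units <= 0 or watts <= 0:
        raise ValueError("space and power must be positive")
    return performance / (rack_units * watts)
```

For example, a hypothetical 3 RU server drawing 600 W and delivering 90 units of throughput would score `swap_score(90.0, 3, 600)`, that is 0.05. A higher score means more work done per rack unit and per watt.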

I recently provided iGenCPU v2.1 benchmark results for various platforms (see previous entries). I started with this benchmark because it is the absolute worst case for the T2000. Let's see how it translates into SWaP numbers :

Or as expressed in a chart :

So what is the message here ?
If you are running floating-point intensive applications (the immense majority of commercial applications are not), and you need a small form factor, the AMD-equipped V20z/V40z or, better, the Galaxy line ( Sun Fire™ X4100 and Sun Fire™ X4200 ) are the right answer.
I hope I convinced you here that the SWaP metric was not designed to make the UltraSPARC-T1 a winner every time but to provide a critical metric for modern datacenters !
In 2006, we will look at a second microbenchmark called iGenRAM , and then explore LDAP performance (iGenLDAP), database performance (iGenOLTP) and even web performance (iGenWEB).
I am leaving for Tahoe, so see you on the Heavenly slopes or next year on this forum...
Dec 16 2005, 11:54:42 AM PST Permalink
MrBenchmark iGen benchmarks : A clarification
Thank you for your comments on my previous post.
Walter's guidelines are good and applicable to all standard benchmarks. Please note that my benchmarks are not standards: all results are provided for your information and without any type of performance guarantee....
If you are searching for standard benchmark results, do not hesitate to go there (BM seer blog)
Also, apples-to-apples comparisons are only a dream from my perspective. Why ? Because all of these systems (and processors) are different from the bottom up...
I do see my comparison data as a way to provide an approximate idea of how systems rank against each other on a specific workload. No benchmark is universal... so you can never say System X is better/faster than System Y without being specific about the application tested.
Next up : SWaP values for iGenCPU v2.1
Dec 15 2005, 05:23:58 PM PST Permalink
iGenCPU results on UltraSPARC T1 (T2000) and UltraSPARC IV+ (V490+)
This is an update on my previous entry, adding the UltraSPARC T1 CoolThread server T2000 and the V490+. The throughput increase between the UltraSPARC IV and the UltraSPARC IV+ is fairly proportional to the clock frequency. No surprise here. The simplicity of this microbenchmark does not allow it to take advantage of many of the UltraSPARC IV+ innovations.
Regarding the T2000, please note that I am not afraid to publish the results. This is not a marketing blog... Indeed, this benchmark is not recommended for the UltraSPARC T1 because 25% of the instructions are floating-point operations (see two previous blog entries).
That does not prevent us from collecting the data. Please remember that we want our customers to run the right platform for their workload. So, if more than 2% of the instructions in your workload are floating-point operations, the UltraSPARC T1 is probably not what you should choose....
Note : The column threads is the number of threads used to observe the highest throughput with a response time (RT) less than 100 ms.
| Processor | Frequency | # CPU | # Cores | Ram | OS | Threads | Fractals/s | RT (ms) |
| Intel XEON | 3 Ghz | 2 | 4 (HT) | 4 GB | S10 03/05 | 3 | 31.16 | 96.26 |
| Sun UltraSPARCIIIi | 1.2Ghz | 4 | 4 | 8 GB | S10 03/05 | 4 | 53.01 | 75.4 |
| AMD OPTERON | 2.4Ghz | 4 | 4 | 8 GB | S10 HW2 | 5 | 90.18 | 55.4 |
| Sun UltraSPARC IV | 1.2Ghz | 4 | 8 | 8 GB | S10 03/05 | 8 | 98.88 | 80.8 |
| Sun UltraSPARC IV+ | 1.5Ghz | 4 | 8 | 8 GB | S10 HW1 | 8 | 123.48 | 78.36 |
| Sun UltraSPARC T1 | 1.2Ghz | 1 | 8 | 8 GB | S10 HW2 | 9 | 18.62 | 93.08 |
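The rule behind the Threads column (keep the thread count that gives the highest throughput while the response time stays under 100 ms) can be sketched as below. The function name and the sample runs in the usage note are hypothetical; they are not the raw measurements behind the table.

```python
def best_thread_count(runs, rt_limit_ms=100.0):
    """Pick the run with the highest throughput whose response time is under the limit.

    runs: iterable of (threads, throughput, rt_ms) tuples.
    Returns the selected tuple, or None if no run meets the limit.
    """
    eligible = [run for run in runs if run[2] < rt_limit_ms]
    if not eligible:
        return None
    return max(eligible, key=lambda run: run[1])
```

With hypothetical runs `[(2, 40.0, 50.0), (4, 70.0, 60.0), (8, 95.0, 85.0), (16, 99.0, 160.0)]`, the function returns `(8, 95.0, 85.0)`: the 16-thread run has a higher throughput but breaks the 100 ms limit.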

Let me know your thoughts....
Next, I will publish a description of the iGenRAM 1.6 and related benchmark results. We will see that the T1 is pretty good at this....
Dec 14 2005, 02:24:01 PM PST Permalink
iGenCPU 2.1 results - V40z, V65x, V490 and V440
As promised here are our first iGenCPU v2.1 benchmark results. See previous blog entry for the benchmark description.
As reported by my pfp tool , this benchmark produces about 25% floating-point operations and 75% others...
Therefore, it is absolutely not recommended for the UltraSPARC T1-based T1000 and T2000...
This table shows the performance obtained on this benchmark for four popular Sun Microsystems servers :
The V40z single core , the V65x, the V440 and the V490, all using Solaris 10 . Please note that these servers may be available
today at a higher frequency.
Note : The column threads is the number of threads used to observe the highest throughput with a response time (RT) less than 100 ms.
| Server | Processor | Frequency | # CPU | # Cores | Ram | OS | Threads | Fractals/s | RT (ms) |
| V65x | Intel XEON | 3 Ghz | 2 | 4 (HT) | 4 GB | S10 03/05 | 3 | 31.16 | 96.26 |
| V440 | Sun UltraSPARCIIIi | 1.2Ghz | 4 | 4 | 8 GB | S10 03/05 | 4 | 53.01 | 75.4 |
| V40z | AMD OPTERON | 2.4Ghz | 4 | 4 | 8 GB | S10 HW2 | 5 | 90.18 | 55.4 |
| V490 | Sun UltraSPARC IV | 1.2Ghz | 4 | 8 | 8 GB | S10 03/05 | 8 | 98.88 | 80.8 |

Please use the comments section for your observations, I am sure you will have plenty...
Next, I will publish iGenCPU 2.1 results for UltraSPARC T1 and UltraSPARC IV+ and provide my observations....
Dec 12 2005, 05:26:43 PM PST Permalink
MrBenchmark benchmarks : Opteron vs UltraSPARC IV vs UltraSPARC T1
Whiners came to me saying : "MrBenchmark : Enough theory, please give us some benchmark results.."
And I said, fine...so here we are ..I will publish some informal benchmark results in this forum. And Yes I will compare
UltraSPARC IV, IV+, Opteron , UltraSPARC T1 and even Xeon !
Let me present the first microbenchmark of my series. It is called iGenCPU. It is written in 100% pure Java. I am using Java 1.5.
The iGenCPU benchmark is a Java™-based CPU micro-benchmark used to compare the CPU performance of different systems.
Based on a customized Java complex number library, the code computes Benoit Mandelbrot's highly dense fractal structure using
integer and floating-point calculations. The simplicity of the code and the fact that it is not recursive allow very scalable behavior using
less than 64 MB of memory per thread.
iGenCPU reports multiple statistics. We are mostly interested in analyzing iGenCPU_Thp (how many fractals per second can we
compute with this number of threads ?) and iGenCPU_RT (what is the average time needed to compute a complete fractal with this number of threads ?)
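To make the workload concrete, here is a minimal escape-time Mandelbrot sketch with the two statistics derived from it. It is not the iGenCPU code: it uses Python's built-in complex type instead of a custom Java complex library, runs in a single thread, and the grid size, iteration limit and function names are assumptions chosen for the example.

```python
import time


def mandelbrot_iterations(c, max_iter):
    """Iterate z = z*z + c and return the step at which |z| exceeds 2.

    Returns max_iter if the point never escapes within the limit.
    """
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter


def compute_fractal(width, height, max_iter):
    """Compute one complete fractal over the region [-2, 1] x [-1, 1]."""
    rows = []
    for y in range(height):
        imag = -1.0 + 2.0 * y / max(height - 1, 1)
        row = []
        for x in range(width):
            real = -2.0 + 3.0 * x / max(width - 1, 1)
            row.append(mandelbrot_iterations(complex(real, imag), max_iter))
        rows.append(row)
    return rows


def measure(fractals, width=64, height=48, max_iter=100):
    """Return (fractals per second, average time per fractal in ms)."""
    start = time.perf_counter()
    for _ in range(fractals):
        compute_fractal(width, height, max_iter)
    elapsed = time.perf_counter() - start
    return fractals / elapsed, 1000.0 * elapsed / fractals
```

The inner loop is a non-recursive mix of floating-point multiplies, adds and integer counting, which is why this kind of benchmark stresses the floating-point units so heavily.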
iGenCPU uses the system this way, as represented by my iTarget chart :

Next to come : our first iGenCPU benchmark result (table & diagram ): V40z (4xAMD Opteron @2.4Ghz with 8GB RAM ) vs V490 (4xUltraSPARC IV 1.2Ghz 8GB RAM)
Dec 09 2005, 04:25:28 PM PST Permalink
How to demonstrate the value of the CoolThread UltraSPARC T1 servers (T1000 - T2000) to your boss ?
Well, after a very long entry presenting my pfp tool , here is a very short one...
To demonstrate the value of the CoolThread UltraSPARC T1 servers (T1000 - T2000) to your boss
there is only one thing to do : make her/him benchmark it using Sun Sim Datacenter
(yes ! your boss is gonna run a benchmark and she/he will like it ! )
How to do it and simulate all your Datacenter with UltraSPARC T1000 or T2000 ?
Very simple, download Sim Datacenter here ,
and run it on Solaris 9 or 10 !
What, you don't have Solaris 10 on your laptop ?
Get it right now on this Solaris page ...
Easy, no ?
Dec 08 2005, 05:04:37 PM PST
Permalink
Is my workload recommended for a CoolThread UltraSPARC T1 server ( T1000 - T2000 ) ?
Since the pre-release and announcement of UltraSPARC T1 systems (T1000 - T2000),
our customers coming to the Sun Solution Benchmark Center have been very interested to know if their
application will work well on UltraSPARC T1. While assessing the multi-threaded nature of a
workload is easy using standard system tools, it is less straightforward to obtain at will
the amount and proportion of floating-point instructions executed by a system. Some complex
tools exist but we would like to have a simple go/no-go binary that would answer
only this question. (If you are interested in a more detailed analysis of CPU behavior, please
ask me about a great tool called ripc )
The key information coming from our UltraSPARC T1 engineers is the choice they had to make (because
of space limitations) to have a single floating-point unit shared by the 8 cores (and 32 strands).
Please note that this challenge has been solved in the next release of this processor.
They tell us that, in their best estimation, any workload in which more than 2% of the total instructions
are floating-point instructions is not recommended for UltraSPARC T1. Between 1% and 2% is the gray area where
they recommend that we try, because a number of the simpler FPU commands were moved to the
core and don't incur a 40-cycle penalty.
The idea of this article is to explain how to get this information and provide a simple tool
(for all UltraSPARC based systems).
The UltraSPARC III (or UltraSPARC IV core) has a maximum of four instructions that can
be fetched from cache in a clock cycle and a total of sixteen fetched instructions that
can wait for an execution unit to become available. Six parallel execution units exist on
the chip : one load/store unit, one branch unit, two identical integer Arithmetic Logical
Units, one add (and therefore subtract) floating-point unit named FA_PIPE (see FP 1
on the schema below) and one multiply (and therefore divide) floating-point unit named FM_PIPE
(see FP 2 below).

For the UltraSPARC III (and IV or IV+), multiple performance
instrumentation counters are provided to analyze the CPU performance
behavior under load, but for our purpose we need to consider only three of them :
1-The total number of instructions completed not counting annulled, mispredicted or
trapped instructions. This is the Instr_cnt counter
2-The total number of instructions completed on the FA_PIPE. This is the FA_pipe_completion
counter.
3-The total number of instructions completed on the FM_PIPE. This is the FM_pipe_completion
counter.
Note that counters 2 and 3 are also incremented for some types of VIS instructions. Therefore,
they have to be considered only as estimates.
For the UltraSPARC T1 based systems, it is simpler as the single counter FP_instr_cnt is directly provided.
As you have already deduced, we will be able to determine the percentage of floating-point
operations with the formula :
%FP_ops = 100 * (FA_pipe_completion + FM_pipe_completion) / Instr_cnt
We are also able to provide this simple heuristic :
if (%FP_ops < 1%) -> Recommended for UltraSPARC T1
else if (%FP_ops between 1% and 2%) -> Possible fit for UltraSPARC T1
else -> Not recommended for UltraSPARC T1
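The formula and the heuristic translate directly into code. The sketch below is not the pfp source; the function names are made up for the example, the counter values in the usage note are hypothetical, and reading the counters from the hardware is not shown.

```python
def fp_percentage(fa_pipe_completion, fm_pipe_completion, instr_cnt):
    """%FP_ops = 100 * (FA_pipe_completion + FM_pipe_completion) / Instr_cnt."""
    if instr_cnt <= 0:
        raise ValueError("Instr_cnt must be positive")
    return 100.0 * (fa_pipe_completion + fm_pipe_completion) / instr_cnt


def t1_recommendation(fp_pct):
    """Apply the 1% / 2% thresholds quoted from the UltraSPARC T1 engineers."""
    if fp_pct < 1.0:
        return "Recommended for UltraSPARC T1"
    if fp_pct <= 2.0:
        return "Possible fit for UltraSPARC T1"
    return "Not recommended for UltraSPARC T1"
```

For instance, hypothetical counters of 10 FA_PIPE and 10 FM_PIPE completions out of 10000 instructions give `fp_percentage(10, 10, 10000)`, that is 0.2%, which falls in the recommended range. On UltraSPARC T1 the single FP_instr_cnt value would take the place of the two pipe counters.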
To do this, here is a program named pfp that you can use as pfp <duration in seconds>
If you are on a T1000 or T2000 system, please use the flag -n, as this program does not detect the CPU
type in its first release. Please remember to start your workload first and, while it is running,
use this program as shown below.
paris # ./pfp 30
We observed 22756679 instructions separated in 0.20% floating point and 99.80% others
This workload is recommended for UltraSPARC T1 systems.
ontario # ./pfp -n 30
We observed 342593950 instructions separated in 0.77% floating point and 99.33% others
This workload is recommended for UltraSPARC T1 systems.
If you just want the percentage of floating point instructions, you can also do
paris # ./pfp -s 30
0.20
Finally, you can also use the tool on Solaris 8 or Solaris 9 with :
Dtrace # ./pfp -ps 30
1.97
The binary of this tool can be found here.
Dec 07 2005, 05:02:21 PM PST
Permalink