Saturday Nov 08, 2008
Several years ago, I wrote about how Oracle views multi-threaded processors. At the time we were just introducing a dual-core processor. This doubling of the number of cores was presented by Solaris as virtual CPUs and Oracle would automatically size the CPU_COUNT accordingly. But what happens when you introduce a 1RU server that has 128 virtual CPUs?
The UltraSPARC T1/T2/T2+ servers have many threads, or virtual CPUs. CPU_COUNT on these systems is sized no differently than before. So the newly introduced T5440 with 4x UltraSPARC T2+ processors would have 256 threads, and CPU_COUNT would be set to 256.
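To make the thread arithmetic concrete, here is a minimal sketch of how the virtual CPU count scales with socket count. It assumes the UltraSPARC T2+ layout of 8 cores per socket and 8 hardware threads per core; the function name is made up for illustration, and the result is only what Solaris would be expected to present as virtual CPUs, not something read from a live system.

```python
# Assumed UltraSPARC T2+ topology: 8 cores per socket, 8 threads per core.
CORES_PER_SOCKET = 8
THREADS_PER_CORE = 8


def virtual_cpus(sockets: int) -> int:
    """Return the number of hardware threads (virtual CPUs) for a socket count."""
    return sockets * CORES_PER_SOCKET * THREADS_PER_CORE


# Hypothetical labels for the two-socket and four-socket cases discussed above.
for label, sockets in (("2-socket T2+ server", 2), ("4-socket T2+ server", 4)):
    print(f"{label}: {virtual_cpus(sockets)} virtual CPUs (default CPU_COUNT)")
```

The two-socket case corresponds to the 128 virtual CPUs mentioned for the 1RU server, and the four-socket case to the 256 threads above.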
So, what does CPU_COUNT have to do with memory?
Thanks to my friends in the Oracle Real World Performance group, I was made aware that Oracle uses CPU_COUNT to size the minimum SGA allowed. In one particular case, the DBA was trying to run 70 database instances on a T5140 with 64GB of memory and 128 virtual CPUs. Needless to say, the SGA_TARGET would have to be set fairly low in order to accomplish this. An SGA_TARGET of 256MB was set, but the following error was encountered.
ORA-00821: Specified value of sga_target 256M is too small
After experimentation, they were able to start Oracle with a target of 900MB, but with 70 instances this would not fly. Manually lowering the CPU_COUNT allowed the DBA to use an SGA_TARGET of 256MB. Obviously, this is an extreme case and changing CPU_COUNT was reasonable.
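A rough sketch of the memory budget shows why 900MB per instance "would not fly". It treats MB and GB as binary units and counts only SGA; PGA, the OS, and everything else on the box are ignored, so the real headroom would be even smaller. The variable and function names are illustrative only.

```python
# Figures from the case above: 64GB of memory, 70 instances.
PHYSICAL_MB = 64 * 1024
INSTANCES = 70


def total_sga_mb(sga_target_mb: int) -> int:
    """Total SGA across all instances for a given per-instance SGA_TARGET."""
    return INSTANCES * sga_target_mb


def headroom_mb(sga_target_mb: int) -> int:
    """Memory left over after SGA alone (PGA and OS are not counted)."""
    return PHYSICAL_MB - total_sga_mb(sga_target_mb)


for sga_target in (900, 256):
    used = total_sga_mb(sga_target)
    left = headroom_mb(sga_target)
    print(f"SGA_TARGET={sga_target}MB: {used}MB of SGA, {left}MB left for everything else")
```

At 900MB the SGAs alone consume nearly all of the machine, while at 256MB they take well under a third of it.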
Core and virtual CPU counts have been on the rise for some years now. Combine rising virtual CPU counts with the current economic climate and I would suspect that consolidation will be more popular than ever. In general, I would not advocate changing CPU_COUNT manually. If you had one instance on this box, the default would be just fine. CPU_COUNT automatically sizes so many other parameters that you should be very careful before making a change.
Thursday Feb 14, 2008
I usually dislike blog entries that do nothing more than repackage bug descriptions and offer them up as knowledge, but in this case I have made an exception since the full impact of the bug is not fully described.
There is a fairly nasty Oracle bug in 10.2.0.3 that prevents the use of DirectIO on Solaris. MetaLink note "406472.1" describes the failure modes but fails to mention the performance impact if you use "filesystemio_options=setall" without the mandatory patch "5752399" in place.
This was particularly troubling to me since we have been recommending "setall" for years to ensure all the proper filesystem options are set for optimal performance. I just finished working a customer situation where this patch was not installed and their critical batch run-times were nearly 4x as long. Not a pretty situation. OK, so bottom line:
MAKE SURE YOU INSTALL "5752399" WHEN USING ORACLE 10.2.0.3 !!!
Friday Aug 04, 2006
"Why does Oracle call times() so often? Is something broken?"
When using truss or dtrace to profile Oracle shadow processes, one often sees a lot of calls to "times". Sysadmins often approach me with this query.
root@catscratchb> truss -cp 7700
^C
syscall               seconds   calls  errors
read                     .002     120
write                    .008     210
times                    .053   10810
semctl                   .000      17
semop                    .000       8
semtimedop               .000       9
mmap                     .003      68
munmap                   .003       5
yield                    .002     231
pread                    .150    2002
kaio                     .003      68
kaio                     .001      68
                     --------  ------   ----
sys totals:              .230   13616      0
usr time:               1.127
elapsed:               22.810
At first glance it would seem alarming to have so many times() calls, but how much does this really affect performance? This question can best be answered by looking at the overall "elapsed" and "cpu" time. Below is output from the "procsystime" tool included in the DTrace Toolkit.
root@catscratchb> ./procsystime -Teco -p 7700
Hit Ctrl-C to stop sampling...
^C
Elapsed Times for PID 7700,
         SYSCALL          TIME (ns)
            mmap           17615703
           write           21187750
          munmap           21671772
           times           90733199   <<== Only 0.28% of elapsed time
          semsys          188622081
            read          226475874
           yield          522057977
           pread        31204749076
          TOTAL:        32293113432
CPU Times for PID 7700,
         SYSCALL          TIME (ns)
          semsys            1346101
           yield            3283406
            read            7511421
            mmap           16701455
           write           19616610
          munmap           21576890
           times           33477300   <<== 10.6% of CPU time for the times syscall
           pread          211710238
          TOTAL:          315223421
Syscall Counts for PID 7700,
         SYSCALL              COUNT
          munmap                 17
          semsys                 84
            read                349
            mmap                350
           yield                381
           write                540
           pread               3921
           times              24985   <<== 81.6% of syscalls.
          TOTAL:              30627
According to the profile above, the times() syscall accounts for only 0.28% of the overall response time. It does use 10.6% of sys CPU. The usr/sys CPU percentages are "83/17" for this application. So, using the 17% for system CPU we can calculate the overall amount of CPU for the times() syscall: 100*(.17*.106)= 1.8%.
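The percentages quoted above can be reproduced directly from the procsystime totals. The sketch below is just that arithmetic; the numbers are copied from the output for PID 7700, the 17% system CPU share is the figure given in the text, and the variable names are illustrative.

```python
# Values copied from the procsystime output for PID 7700.
times_elapsed_ns, total_elapsed_ns = 90733199, 32293113432
times_cpu_ns, total_cpu_ns = 33477300, 315223421
times_calls, total_calls = 24985, 30627

# Share of total CPU spent in system mode (the "83/17" usr/sys split).
SYS_CPU_SHARE = 0.17


def pct(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


elapsed_pct = pct(times_elapsed_ns, total_elapsed_ns)  # share of elapsed time
sys_cpu_pct = pct(times_cpu_ns, total_cpu_ns)          # share of syscall CPU time
calls_pct = pct(times_calls, total_calls)              # share of syscall count
overall_cpu_pct = SYS_CPU_SHARE * sys_cpu_pct          # share of all CPU (usr + sys)

print(f"times(): {elapsed_pct:.2f}% of elapsed time")
print(f"times(): {sys_cpu_pct:.1f}% of system CPU time")
print(f"times(): {calls_pct:.1f}% of syscalls")
print(f"times(): {overall_cpu_pct:.1f}% of overall CPU")
```

The point of laying it out this way is that the alarming number (the call count) and the number that matters (the share of overall CPU) come from the same sample and differ by well over an order of magnitude.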
Oracle uses the times() syscall to keep track of timed performance statistics. Timed statistics can be enabled or disabled by setting the init.ora parameter "TIMED_STATISTICS" to TRUE or FALSE. In fact, it is an *old* benchmark trick to disable TIMED_STATISTICS after all tuning has been done. This is usually good for another 2% in overall throughput. In a production environment, it is NOT advisable to ever disable TIMED_STATISTICS. These statistics are extremely important for monitoring and maintaining application performance. I would argue that disabling timed statistics would actually hurt performance in the long run.