GIRI MANDALIKA's SCRATCHPAD
Oracle Business Intelligence : Workaround / Solution to "[46036] Internal Assertion" Error
Symptom:
When checking in changes to Oracle BI repository (RPD), Admintool fails with an error message:
[46036] Internal Assertion: Condition FALSE, file server/Utility/Generic/NQThreads/SUGThread.cpp, line 515
Solution / Workaround:
Edit <BI_HOME>/server/Config/NQSConfig.INI configuration file to increase the value of SERVER_THREAD_STACK_SIZE parameter. Replace the line SERVER_THREAD_STACK_SIZE = 0; with SERVER_THREAD_STACK_SIZE = 512 KB; and restart the Analytics server (SAS)
Posted at 08:49PM Jul 11, 2009 by Giri Mandalika in Enterprise | Comments[0]
Another MySQL Connector/C++ Webinar on July 9th
On May 20th, Ulf Wendel and Andrey Hristov from the MySQL Connectors development team delivered a webinar around MySQL Connector/C++. We have another Connector/C++ webinar scheduled for July 9th (Thursday) at 10:00 AM PT. This time, I will be talking about the Connector/C++ driver for about 40 minutes. I am not involved with the development of Connector/C++ in any way. However I have been playing with, and evangelizing this driver for the past few months - so in a way I'm qualified to talk about the driver and its features. As of now, I have the following topics in the agenda.
* What is MySQL Connector/C++?
* High Level Architecture
* Installation, Dependencies
* Implemented Classes
* Driver, Connection
* Statement, PreparedStatement
* ResultSet
* ResultSetMetaData, ParameterMetaData, DatabaseMetaData
* Savepoint
* Transactions
* Stored Procedures
* Exceptions
* Debug Tracing
It is free for all the registered users. If interested, please check the following web page for the registration and for the WebEx access details.
Getting the Most Out of the New MySQL Connector/C++
[Updated 07/10/2009]
Presentation material from the Webinar: Let's Explore MySQL Connector/C++
Check the "Developing Database Applications Using MySQL Connector/C++" white paper in pdf or HTML format.
Posted at 01:49AM Jul 06, 2009 by Giri Mandalika in MySQL | Comments[0]
Sun Studio: Debugging Multi-Threaded Applications with dbx
(Crossposting the three and half year old blog entry "as is" from my other blog hosted on blogger. It needs some serious editing, but I believe the content is still relevant. Source URL: http://technopark02.blogspot.com/2005/12/sun-studio-debugging-multi-threaded.html)Multi-threading lets different tasks to run concurrently in a single process, hence multi-threaded programs would run faster [or achieve better throughput] on machines with multiple processors and on CPUs with multiple cores. On an SMP (Symmetric Multi-Processing system, where multiple processors share a single memory system) system with no CMT (Chip Multi-Threading), software threads are executed on different processors; and on an SMP system with CMT, the threads are executed on cores, and logical processors in CMP (Chip Multi-Processing) processors. As revolutionary chip designs are evolving, many important commercial applications like Oracle, SAP, Siebel, PeopleSoft are designed to be multi-threaded.
Debugging a multi-threaded (MT in short) application is a bit hard, due to the number of software threads running in parallel, compared to a single threaded program where only one task will be running per process, at any given time. Thread synchronization plays an important role when concurrently running threads have to share global resources. Improperly synchronized threads may starve, and lead to unnecessary dead locks, and race conditions. So, it is good to have an MT aware debugger handy, during development and in support phases of software life cycle, to debug threading issues.
Fortunately on Solaris, Sun Studio's debugger, dbx, has support for MT applications that are designed to use Solaris threads, and/or POSIX threads. With dbx, it is possible to get information like thread state, stack trace, locks from all threads, navigate between threads, suspend/resume threads, put break points in a thread and can do step by step execution in a function in a designated thread. Note that Solaris Modular Debugger (mdb) also has support for MT programs; but this blog post concentrates on Studio's dbx.
Siebel processes were used to show various dbx commands in the following examples. Siebel is a multi-threaded application, written in C/C++.
Core dump analysisThe following example shows some useful commands to get the stack trace in the thread, where the process crashed. For more information about dbx commands, type help or help <command> in dbx environment ie., at dbx prompt.
% ls -lh core -rw------- 1 giri other 273M Dec 9 16:56 core % file core core: ELF 32-bit MSB core file SPARC Version 1, from 'siebprocmw' % /opt/SS11/SUNWspro/prod/bin/dbx siebprocmw core For information about new features see `help changes' To remove this message, put `dbxenv suppress_startup_message 7.5' in your .dbxrc Reading siebprocmw core file header read successfully Reading ld.so.1 Reading libsslcwsl.so Reading libssscsci.so Reading libssscscf.so ... ... Reading libsbcfui.so Reading libsbcfuiapps.so t@1 (l@1) terminated by signal KILL (Killed) 0xfd2bc7e0: ___nanosleep+0x0008: blu _cerror ! 0xfd2206a0Since we don't know which thread crashed the process, let's list all known threads with
threads command. threads -all lists all threads, including zombies.(dbx) threads
> t@1 a l@1 ?() LWP suspended in ___nanosleep()
t@2 b l@2 MwTimerThread() LWP suspended in __pollsys()
t@3 b l@3 MwAsyncSignalThread() sleep on 0xfd874078 in __lwp_park()
t@4 b l@4 MwThread() LWP suspended in __pollsys()
t@5 b l@5 MwThread() LWP suspended in __pollsys()
o t@6 b l@6 MwThread() signal SIGABRT in __lwp_kill()
t@7 b l@7 MwThread() LWP suspended in __pollsys()
t@9 b l@9 MwThread() LWP suspended in ___nanosleep()In the above list, t@1 is the current thread, which is indicated by ">", and the start function is not known (indicated with a "?()").
(dbx) thread current thread ($thread) is t@1 (dbx) where current thread: t@1 =>[1] ___nanosleep(0x4, 0xffbfd9a8, 0x0, 0xff000000, 0x0, 0x0), at 0xfd2bc7e0 [2] _sleep(0x64, 0x0, 0xfd2e8bc0, 0xfd0e2000, 0xfd0e2000, 0x0), at 0xfd2afaa0 [3] thr_t::do_thr_action(0xfd86ba10, 0xc, 0x1608, 0xfd86ba20, 0x1, 0x2), at 0xfd770e14 [4] thr_t::t_sleep(0xfb80f5c0, 0x0, 0xffbfdb0e, 0xffbfdb08, 0xfd8546cc, 0xffffffff), at 0xfd770c58 [5] MwWaitForMultipleObjects(0xfb80f5c0, 0x2, 0xfb80f5c8, 0x2, 0xffffffff, 0x9cd48), at 0xfd774dd4 [6] WaitForMultipleObjectsEx(0x2, 0xffbfde3c, 0x0, 0x100000, 0x0, 0x9cd48), at 0xfd77fe9c [7] OSDNTWait::WaitForThread(0xc, 0xffffffff, 0xffbfdecc, 0xd0108, 0x1004f, 0xff8a1d64), at 0xffa7b050 [8] OSDWaitTid(0xc, 0xffffffff, 0xffbfe7c4, 0x0, 0xc, 0xc), at 0xff05f1c4 [9] scfEventFacility::scfEventFac::ShutdownCmd(0xe14450, 0x1, 0x7, 0xfe4de0f4, 0xffbfe7c8, 0xff48f8d4), at 0xff819884 [10] scfEventFacility::scfEventFac::Shutdown(0xffbfe96c, 0xff877530, 0x0, 0x5e000, 0xff874e8c, 0x5e114), at 0xff819390 [11] ScfSisDetach(0x0, 0x0, 0x0, 0xffffffff, 0xffbfe96c, 0xfc81c), at 0xff781ed4 [12] _shutdown(0x6479c, 0x0, 0x651a8, 0x651a8, 0x7, 0x0), at 0x49c7c [13] wmain(0x12a, 0x6479c, 0x0, 0x0, 0xffbfedac, 0x6479c), at 0x4995c [14] main(0xfd85f310, 0xc94, 0xffbfef90, 0x54, 0xfd85f310, 0xc00), at 0x4d3cc
This is not exactly what we are looking for. The above call stack shows where the current thread (t@1) is waiting. Since our interest is to find out the thread that is responsible for the process crash, we need to look for an o before the thread id. t@6 is the ill fated thread in the list of all known threads; and the process was killed because of a SIGABRT in lwp_kill method. Note that OS provides the necessary abstraction for creating, and destroying threads; and also has the freedom of killing malfunctioning threads when things go haywire. In this example, __lwp_kill() was called by the operating system, due to some event which we are going to investigate.
thread -info <tid> command provides more information like what exactly happened in application code that triggered the forcible shutdown.
(dbx) thread -info t@6
Thread t@6 (0xfcb80c00) at priority 0
state: bound to l@6
base function: 0xfd770ff4: MwThread() stack: 0xfa380000[524288]
flags: BOUND|DETACHED|SUSPENDED
masked signals: SEGV
Currently active in __lwp_killObserve that kernel trapped an illegal memory access with a SEGV signal. The default behavior for a SEGV, is to shutdown the process with a possible core file generation (aka core dump). Let's switch to thread t@6 with thread <tid> command, and get to the instruction which raised the segmentation fault.
(dbx) thread t@6
t@6 (l@6) stopped in __lwp_kill at 0xfd2bd5ec
0xfd2bd5ec: __lwp_kill+0x0008: bcc,a,pt %icc,__lwp_kill+0x18 ! 0xfd2bd5fc
(dbx) thread
current thread ($thread) is t@6
(dbx) where
current thread: t@6
=>[1] __lwp_kill(0x0, 0x6, 0x0, 0x6, 0xffff0000, 0x0), at 0xfd2bd5ec
[2] raise(0x6, 0x0, 0xfd2a1af4, 0x42770, 0xfd2e4278, 0x6), at 0xfd25d884
[3] abort(0xe15220, 0x1, 0x0, 0xa6544, 0xfd2e7298, 0x0), at 0xfd23de38
[4] SehScanInvokeTryList(0x44bd308, 0x108000, 0xfd8571c4, 0x0, 0x2, 0x0), at 0xfd74c9d4
[5] Signal_Handler::raise(0xc0000005, 0xfa37cde8, 0x0, 0x2, 0xfa37cc80, 0x1800), at 0xfd74d778
[6] Raise_Exception::operator()(0x67670, 0xb, 0xfa37d0a0, 0xfa37cde8, 0xfd86a07c, 0x2c), at 0xfd74d8dc
[7] __sighndlr(0xb, 0xfa37d0a0, 0xfa37cde8, 0xfd74d7c8, 0x0, 0x1), at 0xfd2bc52c
---- called from signal handler with signal 11 (SIGSEGV) ------
[8] CSSSqlObj::GetTrxDbConn(0x458a7d8, 0x0, 0x1394478, 0x64c00, 0x0, 0x4611290), at 0xf91de72c
[9] CSSSqlObj::Execute(0x4611290, 0x0, 0x0, 0x0, 0x0, 0xfe4dd294), at 0xf91c7b98
[10] CSSBusComp::SqlExecute(0x4606640, 0x0, 0x0, 0x0, 0x1, 0x4b22e84), at 0xf9a9c160
[11] CSSBCBase::SqlExecute(0x4606640, 0x0, 0xfa37d6fc, 0x0, 0x1, 0xf57be3e8), at 0xf56c2294
[12] CSSBusComp::Execute(0x0, 0x0, 0x0, 0x0, 0x4606640, 0xfa37d7cc), at 0xf9a6b118
[13] CSSMsgBoardMaintSvc::UpdTaskHistory(0x44b5ae0, 0xfa37df90, 0x0, 0x4567d14, 0xf8611198, 0x489cd94), at 0xf85f2d48
[14] CSSMsgBoardMaintSvc::HandleEventDataList(0x44b5ae0, 0x43a0018, 0xff486b38, 0x0, 0xfa37e0ac, 0xf8611198), at 0xf85f5afc
[15] CSSMsgBoardMaintSvc::ReadTaskHistory(0x44b5ae0, 0x43a0018, 0xf85f4e60, 0x44b5ae0, 0x43a0018, 0x1), at 0xf85f53c0
[16] scfEventFacility::scfEventFac::CallRegSub(0x2a59448, 0x4109bd8, 0x0, 0x0, 0x8, 0x2), at 0xff81ad20
[17] scfEventFacility::scfEventFac::HandleCurrProcEvents(0xe14450, 0x7530, 0xe14450, 0xff432ef0, 0xff874e8c, 0x1),
at 0xff81b19c
[18] scfEventFacility::scfEventFac::scfEventThreadMain(0x0, 0x0, 0x0, 0x7400, 0xfa37fc90, 0xd0001), at 0xff81a7dc
[19] OSDWslThreadStart(0x101d58, 0xff81a580, 0x101d58, 0x6, 0x0, 0x101d70), at 0xff05bec8
[20] _AfxThreadEntry(0xffbfde34, 0xe9568, 0x0, 0x1, 0x0, 0x17289c), at 0xfeb95730
[21] MwThread(0x1, 0x0, 0x1, 0x0, 0xfd86bed0, 0xe15220), at 0xfd771230From the above stack trace it is clear that the binary doesn't contain necessary debug information to show high level instructions; so, let's try to get the disassembly with dis command.
(dbx) dis GetTrxDbConn / 50 More than one identifier 'GetTrxDbConn'. Select one of the following: 0) Cancel 1) `libsscfdm.so`#__1cPCSSModelPhysDefMGetTrxDbConn6MpkH_pnJCSSDbConn__ [non -g, demangles to: CSSModelPhysDef::GetTrxDbConn(const unsigned short*)] 2) `libsscfdm.so`#__1cJCSSSqlObjMGetTrxDbConn6kM_pnJCSSDbConn__ [non -g, demangles to: CSSSqlObj::GetTrxDbConn()const] > 2 0xf91de6c0: GetTrxDbConn : save %sp, -96, %sp 0xf91de6c4: GetTrxDbConn+0x0004: mov %i0, %i5 0xf91de6c8: GetTrxDbConn+0x0008: ld [%i0 + 388], %i0 0xf91de6cc: GetTrxDbConn+0x000c: cmp %i0, 0 0xf91de6d0: GetTrxDbConn+0x0010: be,pn %icc,GetTrxDbConn+0x60 ! 0xf91de720 0xf91de6d4: GetTrxDbConn+0x0014: sethi %hi(0x5b400), %l6 0xf91de6d8: GetTrxDbConn+0x0018: call GetTrxDbConn+0x20 ! 0xf91de6e0 0xf91de6dc: GetTrxDbConn+0x001c: mov %o7, %o7 0xf91de6e0: GetTrxDbConn+0x0020: sethi %hi(0x2d1400), %o5 0xf91de6e4: GetTrxDbConn+0x0024: xor %l6, 88, %l4 0xf91de6e8: GetTrxDbConn+0x0028: inc 420, %o5 0xf91de6ec: GetTrxDbConn+0x002c: sethi %hi(0x1000), %l5 0xf91de6f0: GetTrxDbConn+0x0030: add %o5, %o7, %l3 0xf91de6f4: GetTrxDbConn+0x0034: add %l5, 868, %l1 0xf91de6f8: GetTrxDbConn+0x0038: add %l3, %l4, %l2 0xf91de6fc: GetTrxDbConn+0x003c: ld [%l2], %l0 0xf91de700: GetTrxDbConn+0x0040: ld [%l0 + %l1], %o4 0xf91de704: GetTrxDbConn+0x0044: cmp %o4, 0 0xf91de708: GetTrxDbConn+0x0048: be,a,pn %icc,GetTrxDbConn+0x68 ! 0xf91de728 0xf91de70c: GetTrxDbConn+0x004c: ld [%i5 + 128], %i2 0xf91de710: GetTrxDbConn+0x0050: ld [%o4 + 88], %l7 0xf91de714: GetTrxDbConn+0x0054: cmp %i5, %l7 0xf91de718: GetTrxDbConn+0x0058: bne,a,pn %icc,GetTrxDbConn+0x68 ! 0xf91de728 0xf91de71c: GetTrxDbConn+0x005c: ld [%i5 + 128], %i2 0xf91de720: GetTrxDbConn+0x0060: ret 0xf91de724: GetTrxDbConn+0x0064: restore %g0, 0, %o0 0xf91de728: GetTrxDbConn+0x0068: ld [%i2 + 188], %i1 0xf91de72c: GetTrxDbConn+0x006c: ld [%i1 - 16], %i3 0xf91de730: GetTrxDbConn+0x0070: cmp %i3, 0 0xf91de734: GetTrxDbConn+0x0074: bge,pn %icc,GetTrxDbConn+0x90 ! 0xf91de750 0xf91de738: GetTrxDbConn+0x0078: add %i2, 188, %i4 0xf91de73c: GetTrxDbConn+0x007c: clr %o0 0xf91de740: GetTrxDbConn+0x0080: call RequiredConditionIsFalse [PLT] ! 0xf94b0684 0xf91de744: GetTrxDbConn+0x0084: mov 84, %o1 0xf91de748: GetTrxDbConn+0x0088: ld [%i4], %i1 0xf91de74c: GetTrxDbConn+0x008c: ld [%i5 + 388], %i0 0xf91de750: GetTrxDbConn+0x0090: call GetTrxDbConn ! 0xf90e0e00 0xf91de754: GetTrxDbConn+0x0094: restore %g0, 0, %g0 0xf91de758: GetTrxDbConn+0x0098: unimp 0x0 ... ...
To see the actual C++ instruction which seg faulted, compile the binary with -g (debug) option, and reproduce the crash. If the source code is readable from the location where you run the dbx session, you will see the actual high level instructions.
Some fun with an active process
The objective of this section is to show how to use some of the dbx commands to get some useful information, from a running MT process.
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
2754 giri 399M 302M sleep 59 0 0:00:34 2.0% siebmtshmw/21
% /opt/SS11/SUNWspro/prod/bin/dbx - 2754
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.5' in your .dbxrc
Reading -
Reading ld.so.1
Reading libsslcwsl.so
Reading libssscsci.so
Reading libssscscf.so
...
...
Reading libsscasvbc.so
Reading libswcasvfr.so
Attached to process 2754 with 21 LWPs
t@1 (l@1) stopped in __pollsys at 0xfd13d1c4
0xfd13d1c4: __pollsys+0x0004: ta 8
(dbx) threads
> t@1 a l@1 ?() running in __pollsys() <- t@1 is always the default current thread under dbx
t@2 b l@2 MwTimerThread() sleep on 0xfb80f4c0 in __lwp_park()
t@3 b l@3 MwAsyncSignalThread() sleep on 0xfd774078 in __lwp_park()
t@4 b l@4 MwThread() running in __pollsys()
t@5 b l@5 MwThread() running in __pollsys()
t@6 b l@6 MwThread() sleep on 0xf9b7eb80 in __lwp_park()
t@7 b l@7 MwThread() running in __pollsys()
t@8 b l@8 MwThread() running in _so_recv()
t@9 b l@9 MwThread() sleep on 0xf927fb68 in __lwp_park()
t@10 b l@10 MwThread() sleep on 0xf877f500 in __lwp_park()
t@11 b l@11 MwThread() sleep on 0xf867fa40 in __lwp_park()
t@12 b l@12 MwThread() sleep on 0xf857fa50 in __lwp_park()
t@13 b l@13 MwThread() sleep on 0xf847fa38 in __lwp_park()
t@14 b l@14 MwThread() running in __pollsys()
t@15 b l@15 MwThread() sleep on 0xf827f490 in __lwp_park()
t@16 b l@16 MwThread() running in __pollsys()
t@17 b l@17 MwThread() sleep on 0xf807f490 in __lwp_park()
t@18 b l@18 MwThread() running in __pollsys()
t@19 b l@19 MwThread() sleep on 0xf4c7f490 in __lwp_park()
t@20 b l@20 MwThread() running in __pollsys()
t@21 b l@21 MwThread() sleep on 0xf4a7f490 in __lwp_park()Put a break point in thread 21 (t@21) for all calls to memcpy():
(dbx) stop in memcpy -thread t@21 More than one identifier 'memcpy'. Select one of the following: 0) Cancel 1) `libc.so.1`memcpy 2) `libc_psr.so.1`memcpy a) All > a dbx: warning: 'memcpy' has no debugger info -- will trigger on first instruction dbx: warning: 'memcpy' has no debugger info -- will trigger on first instruction Will create handlers for all 2 hits (2) stop in _private_memcpy -thread t@21 <- implicit break point set by dbx (3) stop in _memcpy -thread t@21 <- implicit break point (dbx) cont t@21 (l@21) stopped in _memcpy at 0xfe1f04c0 0xfe1f04c0: _memcpy : nop
Note that dbx is synchronous -- when any thread or lightweight process (LWP) stops, all other threads and LWPs stop as well.
(dbx) thread current thread ($thread) is t@21 (dbx) where current thread: t@21 =>[1] _memcpy(0x5080e14, 0xff406b38, 0x2, 0x36, 0x1, 0x6c), at 0xfe1f04c0 [2] SSstring::GetWriteBuffer(0xf4a7e6ac, 0xff406b28, 0xff874e8c, 0x32, 0x0, 0xff3b2ef0), at 0xff31ffcc [3] sciProcState::sciBlock::FormatLatchName(0xf4a7e6ac, 0x1, 0x7, 0x853c, 0xffa30bd8, 0x8400), at 0xffa02744 [4] sciProcState::sciProcState(0x5ad31f8, 0xf9fc0000, 0xf4a7e644, 0xff406b3c, 0x0, 0x0), at 0xffa012c4 [5] sciProcState::GetSciProcState(0xf4a7e7f8, 0x26fcb8, 0x5ad31f8, 0xff88db30, 0x5f5e4, 0x61e6c90), at 0xffa014f0 [6] SciCheckShutdown(0xf4a7e8cc, 0x34151f8, 0x74, 0x26fcb8, 0x0, 0x2ef798), at 0xff9fe0e4 [7] SciGetInterrupt(0x0, 0x6a20950, 0x0, 0xf4a7e864, 0x25cd94, 0x1da84), at 0xff9fde40 [8] _smiMessageQ::ProcessMessage(0x15f85c0, 0x6a20950, 0x0, 0x0, 0x24a360, 0x32e18f0), at 0x2158e4 [9] _smiMessageQ::ProcessRequest(0x3380c48, 0x6a20950, 0x191, 0x2, 0x5ae22f0, 0x15f85c0), at 0x21461c [10] _smiWorkQueue::ProcessWorkItem(0x15f98b8, 0x3380c48, 0x6a20950, 0x5ae2390, 0x0, 0x101f180), at 0x208d08 [11] _smiWorkQueue::WorkerTask(0x15f98b8, 0x5b7f6b8, 0x3326338, 0x1500e0, 0x0, 0x0), at 0x208764 [12] SmiThrdEntryFunc(0x32f72d8, 0x70000f, 0x700010, 0x0, 0x0, 0x0), at 0x1f7a0c [13] OSDWslThreadStart(0x3380568, 0x1f75a0, 0x3380568, 0x15, 0x0, 0x3380760), at 0xfefdbec8 [14] _AfxThreadEntry(0xf4b7de5c, 0x3386210, 0x0, 0x1, 0x0, 0x17289c), at 0xfeb95730 [15] MwThread(0x1, 0x0, 0x1, 0x0, 0xfd76bed0, 0x33cdc40), at 0xfd671230
Let's step into memcpy() with stepi, and observe how the thread state changes.
(dbx) stepi
t@21 (l@21) stopped in _memcpy at 0xfe1f04c4
0xfe1f04c4: _memcpy+0x0004: nop
(dbx) threads
t@1 a l@1 ?() running in __pollsys()
t@2 b l@2 MwTimerThread() sleep on 0xfb80f4c0 in __lwp_park()
t@3 b l@3 MwAsyncSignalThread() sleep on 0xfd774078 in __lwp_park()
t@4 b l@4 MwThread() running in __pollsys()
t@5 b l@5 MwThread() running in __pollsys()
t@6 b l@6 MwThread() sleep on 0xf9b7eb80 in __lwp_park()
t@7 b l@7 MwThread() running in __pollsys()
t@8 b l@8 MwThread() running in _so_recv()
t@9 b l@9 MwThread() sleep on 0xf927fb68 in __lwp_park()
t@10 b l@10 MwThread() sleep on 0xf877f500 in __lwp_park()
t@11 b l@11 MwThread() sleep on 0xf867fa40 in __lwp_park()
t@12 b l@12 MwThread() sleep on 0xf857fa50 in __lwp_park()
t@13 b l@13 MwThread() sleep on 0xf847fa38 in __lwp_park()
t@14 b l@14 MwThread() running in __pollsys()
t@15 b l@15 MwThread() sleep on 0xf827f490 in __lwp_park()
t@16 b l@16 MwThread() running in __pollsys()
o t@17 b l@17 MwThread() breakpoint in _memcpy()
o t@18 b l@18 MwThread() breakpoint in _memcpy()
o t@19 b l@19 MwThread() breakpoint in _memcpy()
t@20 b l@20 MwThread() running in __pollsys()
*> t@21 b l@21 MwThread() single stepped in _memcpy()In the above example, t@17, t@18 and t@19 are stopped at calls to memcpy(); and t@21 stepped into memcpy(). Get out of memcpy() with step up command.
(dbx) step up _memcpy returns 84413972 t@21 (l@21) stopped in SSstring::GetWriteBuffer at 0xff31ffd4 0xff31ffd4: GetWriteBuffer+0x0114: ld [%i1 + 4], %i2
Clear the break point (in current thread) with clear command
(dbx) cont t@21 (l@21) stopped in _memcpy at 0xfe1f04c0 0xfe1f04c0: _memcpy : nop (dbx) clear cleared (3) stop in _memcpy -thread t@21Locks
thread -blocks [<tid>] lists all locks held by the given thread, blocking other threads. If tid is not specified, dbx lists the locks held by the current thread. In the following example, t@21 (current thread) is not holding any locks.
(dbx) thread -blocks Locks held by t@21:
thread -blockedby [<tid>] shows the synchronization object (monitor) on which the given thread is blocked. If tid is not specified, dbx shows this information for the current thread. Note that only sleeping threads must be in blocked state.
(dbx) thread -blockedby t@10 Thread t@10 is blocked by: 0xf877f500 (0xf877f500): thread condition variable (dbx) thread -blockedby t@12 Thread t@12 is blocked by: 0xf857fa50 (0xf857fa50): thread condition variable (dbx) thread -blockedby t@17 Thread t@17 is not asleep
syncs command lists all synchronization objects ie., locks/monitors.
(dbx) syncs All locks currently known to libthread: 0x01020320 (0x01020320): thread mutex(unlocked) 0x010203f8 (0x010203f8): thread mutex(unlocked) 0xf827f490 (0xf827f490): thread condition variable 0xf827f4a0 (0xf827f4a0): thread mutex(unlocked) 0xf877f500 (0xf877f500): thread condition variable 0xf877f510 (0xf877f510): thread mutex(unlocked) 0xf927fb68 (0xf927fb68): thread condition variable 0xf927fb78 (0xf927fb78): thread mutex(unlocked) 0xf867fa40 (0xf867fa40): thread condition variable 0xf867fa50 (0xf867fa50): thread mutex(unlocked) 0xf9b7eb80 (0xf9b7eb80): thread condition variable 0xf9b7eb90 (0xf9b7eb90): thread mutex(unlocked) 0x015c2ed8 (0x015c2ed8): thread mutex(unlocked) 0x015c2f38 (0x015c2f38): thread mutex(unlocked) 0x015c2f18 (0x015c2f18): thread mutex(unlocked) 0x015c2dd8 (0x015c2dd8): thread mutex(unlocked) 0x015c34d8 (0x015c34d8): thread mutex(unlocked) 0x03325fb8 (0x03325fb8): thread mutex(unlocked) 0x033264b8 (0x033264b8): thread mutex(unlocked) 0x033261b8 (0x033261b8): thread mutex(unlocked) 0x017a6ce8 (0x017a6ce8): thread mutex(locked) 0xfa4f4314 (0xfa4f4314): process mutex(locked) 0x0332c438 (0x0332c438): thread mutex(unlocked) 0x0332c348 (0x0332c348): thread mutex(unlocked) 0x02fcd7e8 (0x02fcd7e8): thread mutex(unlocked) 0x0028f860 (0x0028f860): thread mutex(unlocked) __1cUCSSSISLocalTransSrvrKs_instLock_+0x8 (0xff1ee220): thread mutex(unlocked) 0x034150e8 (0x034150e8): thread mutex(unlocked) 0x034151d8 (0x034151d8): thread mutex(unlocked) __uberdata+0x80 (0xfd168c40): thread mutex(unlocked) 0x01878b98 (0x01878b98): thread mutex(unlocked) 0x01878aa8 (0x01878aa8): thread mutex(unlocked) 0xfa4c7e9c (0xfa4c7e9c): process mutex(unlocked) libc_malloc_lock (0xfd1676f8): thread mutex(unlocked) 0x0179cb30 (0x0179cb30): thread mutex(unlocked) 0x0179c830 (0x0179c830): thread mutex(unlocked) 0xfa5c2664 (0xfa5c2664): process mutex(unlocked) 0xfa5c2c94 (0xfa5c2c94): process mutex(unlocked) 0x0161dd90 (0x0161dd90): thread mutex(unlocked) 0x0101f6e0 (0x0101f6e0): thread mutex(unlocked) 0x0101f718 (0x0101f718): thread mutex(unlocked) 0x0101f770 (0x0101f770): thread mutex(unlocked) 0x0101f508 (0x0101f508): thread mutex(locked) 0x0101f5a8 (0x0101f5a8): thread mutex(unlocked) 0x015bfe90 (0x015bfe90): thread mutex(unlocked) 0x015bfe20 (0x015bfe20): thread mutex(unlocked) 0x015bfe58 (0x015bfe58): thread mutex(unlocked)
To get information about a synchronization object at a given address, use sync -info <address>
(dbx) sync -info 0x0028f860 0x0028f860 (0x28f860): thread mutex(unlocked) Lock is unowned No threads are blocked by this lock (dbx) sync -info 0xf877f500 0xf877f500 (0xf877f500): thread condition variable (dbx) sync -info 0xfd1676f8 libc_malloc_lock (0xfd1676f8): thread mutex(unlocked) Lock is unowned No threads are blocked by this lockTracing
trace command can be used to trace the executed source lines, function calls, or variable changes. The following example traces the thread creation, and prints a message whenever a thread gets created.
(dbx) trace thr_create
(4) trace thr_create
(dbx) cont
trace: thread created t@22 on l@22
trace: thread created t@23 on l@23
Reading libsrlcver.so
Reading libsscafsbc.so
...
(dbx) threads
*> t@1 a l@1 ?() signal SIGINT in __pollsys()
t@2 b l@2 MwTimerThread() sleep on 0xfb80f4c0 in __lwp_park()
t@3 b l@3 MwAsyncSignalThread() sleep on 0xfd774078 in __lwp_park()
...
...
t@20 b l@20 MwThread() running in __pollsys()
t@21 b l@21 MwThread() sleep on 0xf4a7f490 in __lwp_park()
t@22 b l@22 MwThread() running in __pollsys() <- new thread
t@23 b l@23 MwThread() sleep on 0xea6ff490 in __lwp_park() <- new threadIn the above example, there is no information about who created the threads t@22 & t@23. Even to get that information, use when command as shown below:
(dbx) when thr_create { echo "New thread $newthread was created by thread $thread"; }
(6) when thr_create { kprint "New thread ${newthread} was created by thread ${thread}"; }
(dbx) cont
New thread t@24 was created by thread t@10
New thread t@25 was created by thread t@24$newthread and $thread are pre-defined variables of dbx, which holds the thread ID of a newly created thread, and the thread ID of the current thread, respectively.
Similarly thread exits can be traced as follows:
(dbx) trace thr_exit (5) trace thr_exit (dbx) cont New thread t@26 was created by thread t@10 New thread t@27 was created by thread t@26 trace: thr_exit t@27Suspending/Resuming threads
To suspend the execution of a thread, run the command thread -suspend <tid>; to resume the suspended thread, thread -resume <tid>
(dbx) thread -suspend t@26 Thread t@26 suspended (dbx) thread -resume t@26 Thread t@26 unsuspendedBreak point with
stop commandThe following example shows how to set a break point to stop the execution, when a new thread with id t@34 gets created.
(dbx) stop thr_create t@34 (9) stop thr_create t@34 (dbx) cont t@10 (l@10) stopped in tdb_event_create at 0xfd1377e8 0xfd1377e8: tdb_event_create : retl trace: thread created t@34 on l@34 (dbx) where <- who initiated the new thread creation? entire call stack current thread: t@10 =>[1] tdb_event_create(0x2, 0x1084, 0x3ff, 0x0, 0xfc8e1c00, 0x1000), at 0xfd1377e8 [2] _thrp_create(0x180, 0x10f8, 0xfd1377e8, 0x1e, 0xc1, 0xfde32000), at 0xfd138c04 [3] _pthread_create(0xf877f310, 0x0, 0xfd670ff4, 0xf877f318, 0x0, 0xfd168bc0), at 0xfd12d104 [4] MwCreateThread(0x0, 0xfeb95630, 0xf877f414, 0x4, 0x0, 0x9383cb0), at 0xfd671460 [5] CreateThread(0x0, 0x0, 0xfeb95630, 0xf877f414, 0x4, 0x9383cb0), at 0xfd67d124 [6] CWinThread::CreateThread(0x9383c80, 0x4, 0x0, 0x0, 0xfd164278, 0x88cabc9), at 0xfeb95f1c [7] AfxBeginThread(0xffa7a420, 0x88cabc0, 0x0, 0x0, 0x4, 0x0), at 0xfeb958a4 [8] WslCreateThread(0xfefdbe00, 0x5c135c0, 0x0, 0x88cabc0, 0xf877f584, 0x16b8c), at 0xffa7a4cc [9] OSDCreateThread(0x211200, 0x5b40660, 0x0, 0x0, 0x5ab1590, 0x5c135c0), at 0xfefdc16c [10] SmiDispatchThrdMain(0x101f180, 0x5ab1588, 0x5ab1590, 0xf877fd64, 0xf877fcec, 0xff40f8d4), at 0x1f53f4 [11] OSDWslThreadStart(0x10b8ad0, 0x1f5240, 0x10b8ad0, 0xa, 0x0, 0x15d07e8), at 0xfefdbec8 [12] _AfxThreadEntry(0xffbfeaac, 0x2f4948, 0x0, 0x1, 0x0, 0x17289c), at 0xfeb95730 [13] MwThread(0x1, 0x0, 0x1, 0x0, 0xfd76bed0, 0x15cd558), at 0xfd671230Light Weight Processes (LWPs)
Application (user) threads are not visible to the kernel. Kernel treats light weight processes (LWPs) as the only schedulable entities within a process. LWPs bridge the user level and kernel level threads. Each process contains one or more LWPs; and each LWP is associated with a kernel thread. Prior to Solaris 9, each of LWPs would run one or more user level threads (ie., 1xN). From Solaris 9 onwards, there is one LWP for every user level thread (ie., 1x1).
Use lwps command to list all LWPs in the process.
(dbx) lwps l@1 running in _private_mprotect() l@2 running in __lwp_park() l@3 running in __lwp_park() l@4 running in __pollsys() l@5 running in __pollsys() l@6 running in __lwp_park() l@7 running in __pollsys() l@8 running in _so_recv() l@9 running in __lwp_park() l@10 running in __lwp_park() l@11 running in __lwp_park() l@12 running in __lwp_park() l@13 running in __time() l@14 running in __pollsys() l@15 running in __lwp_park() l@16 running in __pollsys() o l@17 breakpoint in SSstring::GetWriteBuffer() l@18 running in __lwp_unpark() o l@19 breakpoint in SSstring::GetWriteBuffer() l@20 running in __pollsys() *>l@21 breakpoint in SSstring::GetWriteBuffer()
lwp command displays the current LWP. To switch to a different LWP, use lwp <lwpid>. lwp -info [<lwpid>] shows some useful information for a given LWP.
(dbx) lwp current LWP ($lwp) is l@21 (dbx) lwp -info l@21 breakpoint in SSstring::GetWriteBuffer() masked signals are: (dbx) lwp -info l@12 l@12 running in __lwp_park() masked signals are: (dbx) lwp l@18 t@18 (l@18) stopped in __pollsys at 0xfd13d1c4 0xfd13d1c4: __pollsys+0x0004: ta 8
Scalability issues
In general, MT applications that make heavy use of the standard {Solaris operating system's} memory allocator, may exhibit poor scalability. This problem occurs when multiple threads are in malloc() or free() waiting to obtain the mlock.
If the application suffers from this scalability issue, the top of the thread stacks (which can be obtained using either dbx or pstack command) will appear as below:
lwp_park mutex_lock_queue slow_lock freeor
lwp_park mutex_lock_queue slow_lock malloc
One such problem was described in this Solaris forum's thread slow_lock making application hang.
MT aware memory allocatorsmtmalloc, umem libraries of Solaris distribution will resolve this kind of scalability problem. libmtmalloc was introduced in Solaris 7; and libumem was introduced in Solaris 9 Update 3. These userland memory allocators are packaged as a drop-in replacement to the standard malloc() and free() library calls; so, to take advantage of these allocators, link the MT application with any of these allocators.
mtmalloc, umem allocators are a redesign of the standard library; and hence results in finer grained locking. These libraries will significantly outperform the standard library in cases where multiple concurrent requests are made to the memory allocator. In the case of a single threaded application, the standard memory allocator will however provide better performance. The standard memory allocator also provides a smaller memory footprint. Note that the trade-off with mtmalloc, umem allocators is much bigger memory footprint, due to the way the memory gets allocated. For these reasons the standard memory allocator may be preferred in cases where the advantages of mtmalloc and umem, do not apply. Make sure to experiment with these memory allocators to see which one fits best for your application.
Linking with mtmalloc or umemAt compile time, the application can be linked against mtmalloc or umem library. Adding -lmtmalloc or -lumem, option to the link line results in the application being linked appropriately.
% cc -mt -o my_program my_program.c -lmtmalloc or
% cc -mt -o my_program my_program.c -lumem
You can check the library dependency with ldd my_program.
If re-building the application by linking with mtmalloc or umem, is not feasible, either of these libraries can be preloaded with LD_PRELOAD environment variable, when the program is executed.
% setenv LD_PRELOAD libmtmalloc.so
% ./my_program
or
% setenv LD_PRELOAD libumem.so
% ./my_program
You can verify whether the library is preloaded, with pldd `pgrep my_program`.
Posted at 01:46AM Jun 23, 2009 by Giri Mandalika in Benchmarks | Comments[0]
Invitation to a Free MySQL Connector /C++ Webinar
This Wednesday (May 20, 2009) Andrey Hristov and Ulf Wendel from the MySQL Connector development team are going to talk about the MySQL Connector for C++ at 10:00 AM PT. Andrey and Ulf are planning to talk about the architecture, API, portability, support for: buffered/unbuffered result sets, prepared statements, stored procedures; and the planned features for Connector/C++ 1.0.6 GA.
Plan to attend if you are a C++ developer working [or planning to work] on MySQL database applications. Anyone can attend this webinar for free, and it may last for about 45 minutes. Register at the following location to receive further instructions on how to join the web conference:
Register for the new MySQL Connector/C++ webinar
Posted at 12:07AM May 19, 2009 by Giri Mandalika in MySQL | Comments[2]
Installing Siebel Web Extension (SWE) on top of Sun Java System Web Server 7.0
As of today, Sun Java System Web Server 7.0 is not a certified platform to deploy Siebel 8.0 enterprise on. We are working with Oracle Corporation to make this certification happen so our customers can take advantage of the performance optimizations that went into the web server release 7.0.
Meanwhile those who want to give it a try can do so with little effort. In release SJSWS 7.0, the start/stop/restart/.. scripts were appropriately relocated to bin directory under the virtual web server instance. The installer for Siebel 8.0 Web Server Extension looks for the start script [of the web server] under the home directory of the virtual web server instance. (because it was the default location until the release of SJSWS 7.0). The installation fails if the installer cannot find the start script in the location it is expecting it to be.
Due to the relocation mentioned above, installation of the Siebel Web Server Extension fails at the very last step where it tries to modify the start script with a bunch of LD_PRELOADs so the Siebel Web Extension loads up and runs on the Sun Java System Web Server. To get around this failure, all you have to do is to create a symbolic link in the home directory of the virtual web server instance pointing to the startserv script residing in the bin directory.
The following example shows the necessary steps.
% pwd /export/pspp/SJWS7U5/https-siebel-pspp % ln -s bin/startserv start % ls -l start lrwxrwxrwx 1 pspp dba 13 May 17 17:01 start -> bin/startserv
Install Siebel Web Extension in the normal way. No other changes are required.
AFTER SWE INSTALLATION:
% ls -l start* -rwxr-xr-x 1 pspp dba 4157 May 17 17:38 start -rwxr-xr-x 1 pspp dba 3456 May 17 17:38 start_.bak % mv bin/startserv bin/startserv.orig % mv start bin/startserv
Notice that the Siebel installer actually made two copies of the startup script from the symbolic link. The original bin/startserv remained intact after the SWE installation.
Finally start the Web Server instance by running the startserv script. It should start with no issues.
% pwd /export/pspp/SJWS7U5/https-siebel-pspp/bin % ./startserv Sun Java System Web Server 7.0U5 B03/10/2009 16:38 info: swe_init reports: SWE plug-in log file info: CORE5076: Using [Java HotSpot(TM) Server VM, Version 1.5.0_15] from [Sun Microsystems Inc.] info: HTTP3072: http-listener-1: http://siebel-pspp:8000 ready to accept requests info: CORE3274: successful server startup
Before we conclude, do not forget the fact that Sun Java System Web Server 7.0 is not yet certified with Siebel 8.0 release. Use the instructions mentioned in this blog post at your own risk. However if you do like to take that risk, consider installing the latest release of Sun Java System Web Server, which is SJSWS 7.0 Update 5 as of this writing.
Stay tuned for the certification news though.
Posted at 02:32AM May 18, 2009 by Giri Mandalika in Enterprise | Comments[1]
Saturday Jul 11, 2009
