Saurabh Mishra's Weblog

20050721 Thursday July 21, 2005

An interesting signal delivery related problem

Recently, we found an interesting performance problem using Dtrace. The program was when using Virtual timer created using setitimer(2). The interval passed was 10m (one clock tick) but SIGVTALRM signal used to arrive late and sometimes 6 ticks or more. Now how will you Dtrace the code and from where will you start tracing? I'll start tracing from signal generation to delivery. In Solaris kernel to post a signal we use sigtoproc() and eat_signal() is  called on the thread to make the thread on proc (TS_ONPROC) depending upon the state (TS_RUN, TS_SLEEP, TS_STOPPED). psig() is called we kernel finds a pending signal (for instance when returning from trap).

The program spins in userland after setting up the timer. Since the state of thread would be TS_ONPROC, it would be required to poke the  target CPU if thread happens to be running on different CPU. So I started tracing following functions: sigtoproc(), eat_signal(), poke_cpu() and psig(). Now lets take a look at the Dtrace probes output:


CPU Probe ID              Function
  8  11263                 eat_signal:entry  1027637980027920 sig : 28
  8   2981                   poke_cpu:entry  1027637980030560 cpu : 9
  8  11263                 eat_signal:entry  1027637990025440 sig : 28
  8   2981                   poke_cpu:entry  1027637990032160 cpu : 9
  8  11263                 eat_signal:entry  1027638000036320 sig : 28
  8   2981                   poke_cpu:entry  1027638000043600 cpu : 9
  8  11263                 eat_signal:entry  1027638010025520 sig : 28
  8   2981                   poke_cpu:entry  1027638010032240 cpu : 9
  8  11263                 eat_signal:entry  1027638020023840 sig : 28
  8   2981                   poke_cpu:entry  1027638020031280 cpu : 9
  8  11263                 eat_signal:entry  1027638030028720 sig : 28
  8   2981                   poke_cpu:entry  1027638030035920 cpu : 9
  8  11263                 eat_signal:entry  1027638040024480 sig : 28
[.]
  9   8317                       psig:entry  1027638170086480 sig : 28

If you calculate the difference (ie timestamp) between psig() and the first eat_signal(), you will notice that the difference is huge.

1027638170086480-1027637980027920
190058560 = 19 ticks (190 ms)


We also noticed that CPU 8 (from where sigtoproc() is being called by clock_tick()) is poking CPU 9, however CPU 9 is not preempting the current running thread (program which is spinning). So why and how will it happen? In order to understand this, I'll first describe a bit on how preemption works in Solaris. In order to preempt a running thread, kernel sets t_astflag (using aston() macro) and also sets appropriate CPU preemption flag. There are two CPU preemption flags viz: cpu_runrun for user level preemptions and cpu_kprunrun for kernel level preemptions. RT threads can preempt TS or SYS or IA class threads since kernel level preemptions typically kicks off when current  running threads priority is <= 100 (KPQPRI). For signal we don't set CPU level preemption flags. We just need to set t_sig_check and t_astflag followed by poke call.

Since we are interested in user level preemption, we should know what happens when CPU 8 poked CPU 9 (using cross call). If the current running thread on CPU 9 is in userland, then we call user_rtt() which calls trap() if the checks for t_astflag succeeds. So lets check whether t_astflag would be set when we call eat_signal() or not. And that's where the problem was. If the target thread in eat_signal() is TS_ONPROC, we should set t_astflag and then poke the CPU. It will be clear from the following probe that the running thread on CPU 9
was getting preempted because the time quantum finished and clock would have set t_astflag in cpu_surrender().

  9  15055               post_syscall:entry  1027637970269440
  8  11263                 eat_signal:entry  1027637980027920 sig : 28
  8   2981                   poke_cpu:entry  1027637980030560 cpu : 9
[.]
  8  11263                 eat_signal:entry  1027638040024480 sig : 28
  8   2981                   poke_cpu:entry  1027638040026800 cpu : 9
[.]
  8  11263                 eat_signal:entry  1027638110024160 sig : 28
  8   2981                   poke_cpu:entry  1027638110026560 cpu : 9
  8   2435              cpu_surrender:entry  1027638170024720 t:3001b7af3e0
  8   2981                   poke_cpu:entry  1027638170027280 cpu : 9
  8  11263                 eat_signal:entry  1027638170032720 sig : 28
  9   2919              poke_cpu_intr:entry  1027638170033760
  8   2981                   poke_cpu:entry  1027638170034400 cpu : 9
  9   3390                       trap:entry  1027638170037840 type :512, pc: 10984, ast:1
  8   2981                   poke_cpu:entry  1027638170038640 cpu : 9
  9   2919              poke_cpu_intr:entry  1027638170045680
  9   1497               trap_cleanup:entry  1027638170054880 0
  9   8317                       psig:entry  1027638170086480 sig : 28
  9   2278                   trap_rtt:entry  1027638170117440
  9  15055               post_syscall:entry  1027638170143360
  9   8317                       psig:entry  1027638170150880 sig : 2

So Dtrace did help us in finding out where the problem could be. This is just once example. Happy Dtracing...
(2005-07-21 00:47:44.0) Permalink Comments [0]

20050718 Monday July 18, 2005

Dtrace rocks...


Sometime back I had a problem with my desktop and as a result it started crawling whenever Java ticker used to kick in. I think I must share this with the rest of the world. I'd also share a kernel problem that we cracked and it was related to performance. So Dtrace has helped in solving many problems so far.

My desktop running Solaris 10 started crawling when I noticed that Xsun is eating up 68% of CPU. From prstat(1M)

# prstat
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
   594 ******     85M   78M run     30    0  14:03:19  68% Xsun/1
   796 root       16M   13M sleep   59    0   1:06:25 5.8% stfontserverd/18
[.]

I then started Dtrac'ing Xsun and noticed that lwp_sigmask() syscall call is being made too frequently by Xsun. Here is the data :-

# ./syscall.d
^C
Ran for 26 seconds


  writev                                                         2832
  pollsys                                                        3261
  read                                                           5910
  doorfs                                                        27199
  lwp_sigmask                                                  217592

LWP ID     COUNT
1          217592

              libc.so.1`__systemcall6+0x20
              libc.so.1`pthread_sigmask+0x1b4
              libc.so.1`sigprocmask+0x20
              libc.so.1`sighold+0x54
              libST.so.1`fsexchange+0x78
              libST.so.1`FSSessionDisposeFontInstance+0x8c
             9063

              libc.so.1`__systemcall6+0x20
              libc.so.1`pthread_sigmask+0x1b4
              libc.so.1`sigprocmask+0x20
              libc.so.1`sigrelse+0x54
              libST.so.1`fsexchange+0xc0
              libST.so.1`FSSessionGetFontRenderingParams+0x8c

...and many more such stack traces from libST.so.1`fsexchange().

Infact the stack is like this:-

              libc.so.1`__systemcall6+0x20
              libc.so.1`pthread_sigmask+0x1b4
              libc.so.1`sigprocmask+0x20
              libc.so.1`sighold+0x54
              libST.so.1`fsexchange+0x90
              libST.so.1`FSSessionGetFontRenderingParams+0x8c
              libST.so.1`GetRenderProps+0x344
              libST.so.1`GlyphVectorRepQuery+0xf4
              libST.so.1`STGlyphVectorQuery+0xd0
              SUNWXst.so.1`_XSTUseCache+0x68

Notice that in this stack trace, we are calling sighold() and sigrelse() too frequently. So this process is disabling and enabling signals for some reason. Looks like we are rendering characters, but why do we block and unblock signals in this path?. Here is the Dtrace script which was used :-

#!/usr/sbin/dtrace -s

#pragma D option quiet

BEGIN
{
        start = timestamp;

}

syscall:::entry
/execname == "Xsun"/
{
        @s[probefunc] = count();
}

syscall::lwp_sigmask:entry
/execname == "Xsun"/
{
        @c[curthread->t_tid] = count();
        @st[ustack(6)] = count();
}

END
{
        printf("Ran for %d seconds\n\n", (timestamp - start) / 1000000000);

        trunc(@s,5);
        printa(@s);

        printf("\n%-10s %-10s\n", "LWP ID", "COUNT");
        printa("%-10d %@d\n", @c);

        printa(@st);
}

In fact Dtrace could help us in solving much more complex problems. Happy Dtrac'ing...
(2005-07-18 19:44:08.0) Permalink

20050713 Wednesday July 13, 2005

And the gambling continues


2) 8th Aug 2005

This is my second blog on this subject. The other day I got the opportunity to hear about the elements of investing on CNBC TV18 channel (classroom program).

-> Knowledge and understanding of business
-> stock that are considered undervalued have larger scope
-> Look at the benchmarks (charts, reviews)
-> Convition to succeed
-> Time management is critical
-> Margin of safety.

Finding value stock is very tough. Most people go by P/E ratio and we still see loads of stocks which are overvalued but moving up  continuously. So you pay for the quality of the stock. We should perhaps look at the stories which have growth and the managment of the company also needs to be good in order to deliver the guidance. So growth and management of the company are two critical things which you must consider while investing. Afterall investing increases the wealth of a person.


1) 13th July 2005

Since this's my first blog on Indian stock market, I thought I should write
about some useful tips if you are an investor. In past one year, I figured out
couple of things which are worth mentioning here :-

(a) Be less greedy. You should be greedy because you want to make money :-) (b) Keep strict stop losses on a counter
(c) Look at the fundamental view and adhere to technical analysis (like moving avg, volumes, stop losses, support levels and all) (d) Participate in momentum driven stocks. For instance VSNL was up Rs 80 in two consequetive trading sessions. (e) Look at the long term view on the stock. (f) Keep your portfolio diversified (g) Invest in good sectors which have good growth potentional and should less depend upon commodity prices like steel pricess or fright rates. (h) It's difficult to time top or bottom. So catch the train when it's slow and get out when the steam is cooling a bit. (i) Punter play is not so easy in the market, so riding the wave is more appropriate (j) Buying good stocks when they are low (due to margin pressure or whatever) could give good returns if the fundamental story is intact (k) You should book profits from time to time. I think you should always have short term, medium term and long term view on the stock. Happy investing!!...
(2005-07-13 02:24:36.0) Permalink Comments [0]

20050706 Wednesday July 06, 2005

Photographs of Solang Valley and Chaukori

I'm putting the photographs of Solang Valley in Himanchal Pradesh (14 kms from Manali) and Chaukori in Pithoragarh distt. of Uttaranchal. I went alone to Solang for skiing in early March 2005 and then to Chaukori for a trip.

My uncle, Dipanker (in HP) and I recetly scaled Kumaraparvata (in western ghats) again and this time we did it in one day and that too when rains were at its peak. We decided to drive upto Subrahmanya and were out of Bangalore by 4.30 pm on 1st July 2005. Next day we started the trek at 6.30 am and were back to Subrahmanya by 8.15 pm. We were all wet and in fact climbing the last hurdle i.e a rock clif was most difficult because of the water fall. If you do this trek in summar you will a rock whereas if you do this trek in monsoon you will find a water fall. So the last climb was difficult. I had goofed up on the way back. Since the peak had gatthered thick fog, rain and high velocity winds, I had failed to find the route back to the water fall. It was when we tried to approach towards Pushpagiri, we noticed a small trail going somewhere and that was the route back to Subrahmanya. It was 4.30 pm by the time we discovevered the route. The Leeches had troubled us a lot. Our cloths were red by the time we were in Subrahmanya. Next day we drove from Subrahmanya and it was amazing. We took too many halts on the way back because we had ample time to return to bangalore. We couldn't take photographs because of rains.

  
(2005-07-06 20:08:39.0) Permalink Comments [1]


Locations of visitors to this page
archives
links
referers