David Blasingame's Weblog David Blasingame's Weblog

Monday Dec 07, 2009

Over the years I've looked at a ton of truss outputs.  I've always been bothered by path orders within the truss output.  Let me take, for example, the uname command.  If I run it and look at the time it takes to set up a path, on a 1200 MHz processor, it does take some work.  Not much but some time to set up.


0.0824 stat64("/home/dave/bin/uname", 0xFFBFEBA0)   Err#2 ENOENT
 0.0827 lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
 0.0828 lwp_sigmask(SIG_SETMASK, 0x00020000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
 0.0829 stat64("/usr/dt/bin/uname", 0xFFBFEBA0)         Err#2 ENOENT
 0.0831 lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
 0.0832 lwp_sigmask(SIG_SETMASK, 0x00020000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
 0.0833 stat64("/usr/openwin/bin/uname", 0xFFBFEBA0)    Err#2 ENOENT
 0.0834 lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
 0.0835 lwp_sigmask(SIG_SETMASK, 0x00020000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
 0.0836 stat64("/usr/ccs/bin/uname", 0xFFBFEBA0)        Err#2 ENOENT
 0.0838 lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
 0.0839 lwp_sigmask(SIG_SETMASK, 0x00020000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
 0.0840 stat64("/usr/bin/uname", 0xFFBFEBA0)            = 0


Now if I extrapolate this to a serious of commands then you will start seeing some difference (7 in this case)


Commands with paths in them 



-bash-3.00$ time ./commands_path.sh

real    0m0.071s
user    0m0.019s
sys     0m0.051s


Commands without paths defined


-bash-3.00$ time ./commands.sh

real    0m0.146s
user    0m0.020s
sys     0m0.054s


Loop Impact


In this uname_loop.sh script, I'm setting a variable HOST to a part of my hostname before and dashes.  It is not a real useful script, but I'm only trying to show the impact of the path name not being set. 


Path not set 


-bash-3.00$ cat uname_loop.sh
#!/usr/bin/ksh

max=500;
i=0;
while [[ $i -lt $max ]];
do
   HOST=`uname -n | cut -d"-" -f1`;
   (( i += 1 ));
done


******************End Script***********


now to time it.


-bash-3.00$ time ./uname_loop.sh

real    0m29.680s
user    0m2.155s
sys     0m7.448s


Change the script for the define the paths


-bash-3.00$ cat uname_loop_path.sh
#!/usr/bin/ksh

max=500;
i=0;
while [[ $i -lt $max ]];
do
   HOST=`/usr/bin/uname -n|/usr/bin/cut -d"-" -f1`;
   (( i += 1 ));
done


And now for the time it takes.


-bash-3.00$ time uname_loop_path.sh

real    0m5.791s
user    0m1.958s
sys     0m6.276s


Well that is a pretty significant impact, especially in loops.  Give it a try and see if it makes your script run faster.



Tuesday Sep 29, 2009

I've been working on a way to gather data about an application when the TCP receive window hits zero, an indication that the application is too busy for additional data.   What I've been wanting to do is stop the applicaton, gcore it and run a couple of proc commands (pfiles, pstack).  The tricky part is the trigger.  At first I thought I had it solved.  Set a probe on tcp_rput_data() and see if the window size hits zero.  Here is a snippet of the code.


Receive Window Size Probe


fbt:ip:tcp_rput_data:entry
/((conn_t *)arg0)->conn_tcp->tcp_rwnd == 0/
{
   connp = (conn_t *)arg0;
   tcp = (tcp_t *)connp->conn_tcp;
   printf("pipeline full or window closed %d\n",
        ((conn_t *)arg0)->conn_tcp->tcp_rwnd);


   system("pstop %d", $1);
   system("pfiles %d", $1);
   printf("\n");
   system("pstack %d", $1);
   printf("\n");
   system("gcore -o webservd.%d %d", x, $1);
   system("prun %d", $1);


}


I put it on the server I want to monitor an of course, it does not work.


dtrace: error on enabled probe ID 1 (ID 31450: fbt:ip:tcp_rput_data:entry): invalid address (0x0) in predicate at DIF offset 36


Using MIBs


So of course search around and can't figure out how to fix it.  I asked some various people and tried some emails and got good input, but all the changes still did not fix my issue.  But going through the tcp.c source code, I found a MIB that fires when tcp_rput_data() hits a receive window size of zero.


dtrace -n 'mib:ip:tcp_rput_data:tcpInWinProbe'



verifying this with snoop shows the probe only fires during a zero window condition, not caused by a reset.


bash-3.00# /usr/sbin/snoop -d bge0 | grep "Win=0"
Using device /dev/bge0 (promiscuous mode)
    elvis -> server   TCP D=56344 S=8016 Rst Ack=3385993653 Win=0  (*** does not fire)
  server -> lab146 TCP D=45597 S=5001 Ack=728518838 Seq=3393801077 Len=0 Win=0  (*** probe fires)






Monday Sep 28, 2009

LOST 


With all the programs that have kept into businesses, government and education, there can be the tendency to lose focus on the mission.  When we lose that focus, we become ineffective.  This will hurt the organization and if too many people lose focus, the organization will fail.  Very similar to the hit TV show, we become LOST.  Since I'm in service, I try to focus on the mission of a service organization and making sure my understanding of the service industry lines up with my mission. (4 8 15 16 23 42)


Mission 


So what's the mission?  In trying to decide the mission, I believe you should think of the mission from the customers perspective.  If my air conditioning in my house breaks down, what do I want?  Quick, effective repair, especially during the middle of summer.  I usually have a few constraints associated with the repair, such as cost and more importantly knowledge and reputation of the person doing the repair.  I've learned the hard way that an individual who lacks the knowledge will result in more costly repairs.


Just Fix It! 


In the technology sector it may be a little harder to define because of the complexity of all the interactions. But essentially, customers want to move from a state of broken to a state of fix or working.  In other words, just fix it!  Just Fix It, a simple motto but very important.  What does it take to Just Fix It?  Knowledge.  Knowing what and where the problem is probably the most difficult task a service engineer has.  It takes skill, experience, and understanding of the product and environment to determine the problem.  But at times there can be road blocks imposed by the organization which limits effectiveness to fix a problem. 


Leaders


So leaders within organizations need to focus on the mission.  Just Fix It!  Think what would happen if everyone in an organization, especially in a service organization made that their motto.  Instead of protectionism, Just Fix It would break down barriers.  Instead of bureaucratic red tape, Just Fix It would create effective documentation (I like to call green tape).  Instead of low morale, Just Fix It,  would make people more effective and shift to higher morale.  The list just goes on.  So Just Fix IT.





Wednesday Jul 08, 2009


Wireshark is a great tool for looking at network problems.  It's easy to use and easy to apply filters.  But there is one setback.  Memory.



One way to solve this is to get more memory, but that can be expensive.  I once needed to look at a very large snoop capture as a whole.  So I loaded wireshark on a Netra T2000 with 32G of memory.  Guess what, it worked great.  Just took a long time since I was looking at a very large file.


Unfortunately, I don't always have access to one of these great servers.



Wireshark also has a less known feature that you can edit the snoop with called edit cap.  This gives you the ability to split up a capture into more manageable pieces.  For example if you wanted to split up a snoop into the first 50000 packets you would do this.


editcap -r snoop.out   snoop-0 1-50000


Now on my opensolaris laptop, I can view captures without having to add more memory. 

Tuesday Apr 21, 2009

Recently I've been working deep in the network protocol stack.  I've had the need to use tools external to what I usually use.  Let me explain.  Say for instance you have a complex network environment with multiple switches and hosts possibly even with trunking/aggregation in the mix.  Within Solaris you can use snoop to gather packet data to look at the what is going on the wire from a host perspective.  Which is great if you want to see what is coming into a Solaris Host.  After gathering the snoop data, I like using wireshark to look at and filter out data.  But how do you gather visibility in other points on the network.


CT900In the telco space, if you have a CT900, then you can use port mirroring to see the packets entering into  server blade or even the ATCA switch.  This is great if you are only looking at a single port, but if you are trunking into the switch, then you cannot use port mirroring to gather data.  Port mirroring has limitation in high volume traffic.  And if your not using telco equipment, you need another way of monitoring data.





A number of Cisco switches have port mirroring capabilities, but have similar limits as our ATCA switch.  Cisco does have a network analysis module product line which can be added to some of their series switches which will gather captures across it's backplane, but then again you are limited to a single switch.


In a simple environment you could build an ethernet tap and look at a single link.  But in my testing you can't guarantee gigabit speed on a home built tap, so you might need to buy a commercial grade product.  This would allow you to monitor a single port, but once again you would lack visibility into a complex environment.


Thus enter in the protocol or network analyzer.  I can gather data from end points on a server, but in between points, the protocol analyzer or sniffer is needed.  There aren't a whole lot of these tools out there.  I've had the privilege of using the XLink Analyzer from Fluke Networks recently.  Even though complex, it has the right tools to help in diagnosing problems.  Unfortunately the XLink Analyzer has one big drawback, price.


The protocol analyzer and a good network engineer and really help discover problems with configurations, routing, hardware, software and potentially tunning opportunities to reduce network load.  And in a network centric world that we now live in it is a critical tool for any corporation.  There may be other possibilities of doing similar protocol analysis using a Solaris Server with a couple of network cards.  I'll discuss that configuration in a later blog.


It's time to start paying attention to the Network.

Thursday Mar 26, 2009

With all the rumors abounding that Sun is going to be bought by IBM, I've reflected about what I really love about Sun.  Without a doubt, in the technical ranks, employees are passionate about Sun.  Over my employment at Sun, I have seen passion rule the technical ranks.  Passion has kept up my spirits and drive for improvement during years of layoffs.  Passion has kept me working all hours of the night to resolve customer problems.  Passion has given me the ability to think outside of the box in troubleshooting.  Passion for Sun and our customers have ruled my work ethic.


The financial analyst for years have been predicting the demise of Sun, but passion has kept the company going.  Will this passion transfer to any corporation that might buy Sun.  Who knows?  I guess the better question would be, what created this passion.  Some of what I believe are; The corporate culture, the creativity you see in Sun's Products over the years, being an underdog in a pool of big fish, April Fools pranks and the people.


I would even take it one step further.  We have customers that are passionate about Sun.  The system administrators and developers that I've worked with over the years love Solaris.  And why not, it's the coolest operating system out there.  And with OpenSolaris it has really become easier to use, install and manage.  These customers are passionate, because we have tried to do what is right for them.


Keep up the passion! 

Friday Feb 13, 2009

A common method of troubleshooting is to count errors and the part with the most number of errors is the culprit.  So Solaris has part of it's interface kstat structures.  These structures are initialized and maintained since boot.  However, this method of troubleshooting is flawed and can be dangerous by itself.  It does not take into consideration, cause and effect.



For instance when a disk is having problems, on a scsi bus, the HBA will reset the bus to try to re-establish communication with the disk.  Every drive on that bus will reset.  Now kstat counters will increment and report errors on drives that may not be experiencing the problems.  Overtime these counters may accumulate more errors on drives than the failing disk.


Real example.  Looking at these  events you will notice targets 0 and 1 have a huge number of hard errors.  So you may be apt to conclude that one of these disks is causing c1t3d0 to have some transport errors.


#  iostat -En
c1t0d0           Soft Errors: 0 Hard Errors: 230 Transport Errors: 31
Vendor: SEAGATE  Product: ST373207LSUN72G  Revision: 045A Serial No: 060433Y77Y
Size: 73.40GB <73400057856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 230 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t1d0           Soft Errors: 0 Hard Errors: 230 Transport Errors: 31
Vendor: SEAGATE  Product: ST373207LSUN72G  Revision: 045A Serial No: 060433Y55L
Size: 73.40GB <73400057856 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 230 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t2d0           Soft Errors: 0 Hard Errors: 45 Transport Errors: 13
Vendor: SEAGATE  Product: ST314670LSUN146G Revision: 045A Serial No: 060843QDRV
Size: 146.80GB <146800115712 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 45 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t3d0           Soft Errors: 0 Hard Errors: 7 Transport Errors: 482
Vendor: SEAGATE  Product: ST314670LSUN146G Revision: 045A Serial No: 060843RTC7
Size: 146.80GB <146800115712 bytes>
Media Error: 5 Device Not Ready: 0 No Device: 2 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

This would be a reasonable conclusion, but incorrect.  What really is happening in this case is target 3 is failing with an incorrect phase and the bus resets.  On reset the must reset and since a write is in flight to target 0, target 0 records an error.


Jan 27 19:11:44 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2 (glm0):
Jan 27 19:11:44 blaze        Resetting scsi bus, got incorrect phase from (3,0)
Jan 27 19:11:44 blaze genunix: [ID 408822 kern.info] NOTICE: glm0: fault detected in device; service still available
Jan 27 19:11:44 blaze genunix: [ID 611667 kern.info] NOTICE: glm0: Resettingscsi bus, got incorrect phase from (3,0)
Jan 27 19:11:44 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2 (glm0):
Jan 27 19:11:44 blaze        got SCSI bus reset
Jan 27 19:11:44 blaze genunix: [ID 408822 kern.info] NOTICE: glm0: fault detected in device; service still available
Jan 27 19:11:44 blaze genunix: [ID 611667 kern.info] NOTICE: glm0: got SCSI bus reset
Jan 27 19:11:44 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2/sd@3,0 (sd3):
Jan 27 19:11:44 blaze        auto request sense failed (reason=reset)
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2/sd@0,0 (sd0):
Jan 27 19:11:47 blaze        Error for Command: write(10)               Error Level: Retryable
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice]  Requested Block: 53976752                  Error Block: 53976752
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                           Serial Number: 060433Y77Y
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice]  ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@1c,600000/scsi@2/sd@1,0 (sd1):
Jan 27 19:11:47 blaze        Error for Command: write(10)               Error Level: Retryable
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice]  Requested Block: 53997104                  Error Block: 53997104
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                           Serial Number: 060433Y55L
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
Jan 27 19:11:47 blaze scsi: [ID 107833 kern.notice]  ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2

So when using the counters, use then to give you the big picture, but make sure you look through the events to evaluate the problem.

Wednesday Nov 19, 2008

"Everybody's lost but me", is one of my favorite quotes from Indian Jones "The Last Crusade".  Yet how often do we feel like no one knows what is going on but ourself.  Why?  Usually because your the person who has the technical expertise. With OpenSolaris you don't have to be lost.  Since this is in my Debugging Tips section, I would like for everyone to get the expertise in at least being able to start understanding problems.


So today's tip is with looking at what is transpiring on your server as it relates to Storage Area Networks.  Under Solaris 10, there is a feature within the leadville stack that traces fiber channel events.  It's scattered all through fp driver source call FP_TRACE.


Looking at a live system, you can retrieve this data using mdb (Modular Debugger)


# mdb -k
Loading modules: [ unix krtld genunix dtrace specfs ufs sd isp pcisch sgsbbc md ip sctp usba wrsm fcp fctl qlc nca ssd zfs random ptm nfs crypto sppp logindmux cpc wrsmd fcip ipc ]

Using the fptrace macro. 


> ::fptrace

fp trace buffer contents
[Wed Oct 22 14:31:50 2008] 0=>fp(1)::fp_fabric_login Begin; port=60003ab3000, job=6000584ff18
[Wed Oct 22 14:31:50 2008] 1=>fp(3)::fp_fabric_login Begin; port=60005837000, job=60005851f20
[Wed Oct 22 14:31:50 2008] 2=>fp(1)::fp_fabric_login End; port=60003ab3000, job=6000584ff18
[Wed Oct 22 14:31:50 2008] 3=>fp(1)::fp_flogi_intr; port=60003ab3000, pkt=600058399c8, state=1
[Wed Oct 22 14:31:50 2008] 4=>fp(3)::fp_fabric_login End; port=60005837000, job=60005851f20
[Wed Oct 22 14:31:50 2008] 5=>fp(3)::fp_flogi_intr; port=60005837000, pkt=6000583b9c8, state=1
[Wed Oct 22 14:31:50 2008] 6=>fp(3)::fp_plogi_intr: port=60005837000, job=60005851f20, d_id=fffffc, jcount=1 pkt=6000583b6f0, state=1
[Wed Oct 22 14:31:50 2008] 7=>fp(1)::fp_plogi_intr: port=60003ab3000, job=6000584ff18, d_id=fffffc, jcount=1 pkt=600058396f0, state=1
[Wed Oct 22 14:31:50 2008] 8=>fp(3)::GAN response; port=60005837000, d_id=20700
[Wed Oct 22 14:31:50 2008] 9=>fp(3)::GAN response details; port=60005837000, d_id=20700, type_id=7ba68800, pwwn=21 0 0 e08b e c2 8d, nwwn=20 0 0 e0 8b e c2 8d
[Wed Oct 22 14:31:50 2008] 10=>fp(3)::GAN response; port=60005837000, d_id=20000
[Wed Oct 22 14:31:50 2008] 11=>fp(3)::GAN response details; port=60005837000, d_id=20000, type_id=7ba68800, pwwn=20 3 0 3ba 4d 36 95, nwwn=10 0 0 3 ba 4d 36 95
[Wed Oct 22 14:31:50 2008] 12=>fp(3)::GAN response; port=60005837000, d_id=20600
[Wed Oct 22 14:31:50 2008] 13=>fp(1)::GAN response; port=60003ab3000, d_id=10700
[Wed Oct 22 14:31:50 2008] 14=>fp(1)::GAN response details; port=60003ab3000, d_id=10700, type_id=7ba68800, pwwn=21 0 0 e0 8b 9 e8 e9, nwwn=20 0 0 e0 8b 9 e8 e9
[Wed Oct 22 14:31:50 2008] 15=>fp(1)::GAN response; port=60003ab3000, d_id=10000
[Wed Oct 22 14:31:50 2008] 16=>fp(1)::GAN response details; port=60003ab3000, d_id=10000, type_id=7ba68800, pwwn=20 3 0 3ba 4e 85 83, nwwn=10 0 0 3 ba 4e 85 83
[Wed Oct 22 14:31:50 2008] 17=>fp(3)::GAN response; port=60005837000, d_id=20700
[Wed Oct 22 14:31:50 2008] 18=>fp(3)::GAN response details; port=60005837000, d_id=20700, type_id=7ba68800, pwwn=21 0 0 e0 8b e c2 8d, nwwn=20 0 0 e0 8b e c2 8d
[Wed Oct 22 14:31:50 2008] 19=>fp(3)::GAN PD stuffing; pd=6000392c0c0, port_id=70201, sym_len=0 fc4-type=20010000
[Wed Oct 22 14:31:50 2008] 20=>fp(1)::GAN response; port=60003ab3000, d_id=10600
[Wed Oct 22 14:31:50 2008] 21=>fp(3)::GAN response; port=60005837000, d_id=20000
[Wed Oct 22 14:31:50 2008] 22=>fp(3)::GAN response details; port=60005837000, d_id=20000, type_id=7ba68800, pwwn=20 3 0 3ba 4d 36 95, nwwn=10 0 0 3 ba 4d 36 95
[Wed Oct 22 14:31:50 2008] 23=>fp(3)::GAN PD stuffing; pd=60003bd9780, port_id=201, sym_len=0 fc4-type=10010000
[Wed Oct 22 14:31:50 2008] 24=>fp(3)::GAN response; port=60005837000, d_id=20600

>> More [<space>, <cr>, q, n, c, a] ?

 


MDB gives you the capability of scrolling through the events like more does.  You can also pull the data from a core dump, either live core or a post mordem dump.


# mdb unix.0 vmcore.0
Loading modules: [ krtld genunix specfs dtrace ufs ssd fcp fctl qlc pcisch sd ip hook neti sctp arp usba nca random md logindmux ptm cpc sppp crypto fcip wrsmd zfs nfs ipc ]

> ::fptrace
fp trace buffer contents
[Fri Sep 12 2008 13:09:34:745:629:500] 0=>fp(1)::fp_get_lilpmap Begin; port=60032c18000, map=60032c180f0
[Fri Sep 12 2008 13:09:34:745:720:800] 1=>fp(1)::fp_get_lilpmap End; port=60032c18000, map=60032c180f0
[Fri Sep 12 2008 13:09:34:745:726:300] 2=>fp(1)::fp_fabric_login Begin; port=60032c18000, job=60032c95f18
[Fri Sep 12 2008 13:09:34:745:755:500] 3=>fp(1)::fp_fabric_login End; port=60032c18000, job=60032c95f18
[Fri Sep 12 2008 13:09:34:746:029:800] 4=>fp(1)::fp_flogi_intr; port=60032c18000, pkt=60032c8d9c8, state=e
.......

[Fri Nov 14 2008 00:45:57:990:286:400] 3299=>fp(0)::fp_unsol_cb: s_id=3703ef, d_id=375c00, type=1, r_ctl=22, f_ctl=290000
seq_id=0, df_ctl=0, seq_cnt=0, ox_id=ffff, rx_id=ffff ro=0, buffer[0]:5
[Fri Nov 14 2008 00:45:57:990:323:200] 3300=>fp(0)::Accepting LOGO; d_id=3703ef, pd=60032cfcac0, tolerance=0 retain=1
[Fri Nov 14 2008 00:45:57:990:391:400] 3301=>fp(0)::fp_ulp_devc fired; port=60032dc1000, len=1
[Fri Nov 14 2008 00:45:57:990:559:100] 3302=>fp(0)::fp_plogi_group begin; port=60032dc1000, job=60032c95428
[Fri Nov 14 2008 00:45:57:990:567:600] 3303=>fp(0)::fp_plogi_group contd; port=60032dc1000, pd=60032cfcac0
[Fri Nov 14 2008 00:45:58:045:465:100] 3304=>fp(0)::fp_plogi_intr: port=60032dc1000, job=60032c95428, d_id=3703ef, jcount=1 pkt=60032dc96f0, state=e
[Fri Nov 14 2008 00:45:58:045:564:200] 3305=>fp(0)::PLOGI to 3703ef failed. state=e reason=5.
[Fri Nov 14 2008 00:45:58:045:723:700] 3306=>fp(0)::fp_plogi_group end: port=60032dc1000, job=60032c95428
[Fri Nov 14 2008 00:45:58:045:729:900] 3307=>fp(0)::fp_plogi_group begin; port=60032dc1000, job=60032c94ee8

Very cool.  Looks somewhat similar to a portlogdump from the brocade switches.  Besides, now that solaris is open, you can actually see what things mean.  Take for instance fp_unsol_cb.  Take a look at opensolaris code and you see what fp_unsol_cb (line 10454)  does.  And how we are tracing.



   10468 	FP_TRACE(FP_NHEAD1(1, 0), "fp_unsol_cb: s_id=%x,"
10469 " d_id=%x, type=%x, r_ctl=%x, f_ctl=%x"
10470 " seq_id=%x, df_ctl=%x, seq_cnt=%x, ox_id=%x, rx_id=%x"
10471 " ro=%x, buffer[0]:%x", buf->ub_frame.s_id, buf->ub_frame.d_id,
10472 buf->ub_frame.type, buf->ub_frame.r_ctl, buf->ub_frame.f_ctl,
10473 buf->ub_frame.seq_id, buf->ub_frame.df_ctl, buf->ub_frame.seq_cnt,
10474 buf->ub_frame.ox_id, buf->ub_frame.rx_id, buf->ub_frame.ro,
10475 buf->ub_buffer[0]);


You can now go through the source code and see what things mean.  Now there shouldn't be anymore need for you to quote the Young Indiana Jones, "Everybody's lost but me".  At least those who read my blog. 



Thursday Nov 13, 2008

My First Sun Storage 7000.  Ok not a real 7000 but a virtual 7000.  I downloaded the VMware Player and installed the virtual machine on top of Windows XP on my acer laptop.  I had one problem when I installed the virtual machine.  "VMware Workstation cannot open one of the virtual disks needed"  I fixed this by modifying my the following file:


"Sun Storage VMware.vmx"


Added: 


diskLib.sparseMaxFileSizeCheck = "FALSE".  


After that, it worked. (slowly)  My laptop isn't powerful enough to handle running much on it.  Hey, I'm in service.  Do you think we get the top of the line gear?  It worked despite the lack of hardware.  And the interface is easy to use.  How did that happen?  It didn't take long to configure the virtual device through the web interface.


My first impression.  Very cool.  I've been reading about the technical changes that have been coming in ZFS for sometime now.  After reading Brendan about the L2ARC and Jonathan's blog about flash drives, I thought it would be great to mirror the two into a technology.  But the additional features such as "Analytics", makes this product rock.  As someone who has had to track down IO performance problems for quite some time, I now have great visibility into a storage product.


This product is unreal.  It changes everything, and makes life better for everyone in IT who uses it.

Monday Nov 10, 2008


What do you get when you mix Sun Servers, ZFS, DTrace and Commodity Storage?


A:  Nervous competitors. 


Sun just took the storage business and made it a commodity.  Customers are getting the best innovations in the market in a general purpose storage platform.  Think about all the features of ZFS, now available to you on a storage platform.  And you don't have to be a storage guru anymore to understand Storage anymore.  With DTrace and FMA, storage couldn't be easier.  Best of all it's not a secret, like some proprietary vendors. 


Want to know how flash drives can be incorporated into an openstorage product?  Check out ZFS L2ARC.


How about iSCSI?  Integrated right into ZFS.


Snapshots?  ZFS.  Replication?  ZFS send/recv


Want to look at performance.  Industry leading DTrace.  Fault Management?  Built right into Solaris FMA. 


Awesome.  Engineering delivered.  Now it's time to sell.

Sunday Nov 09, 2008

Debugging Sun Systems and Software is one of my main roles at Sun.  I enjoy this part of my job.  Especially in the face of escalations, where I get to play Hero.  So I've started a new category on my blog, Debugging Tips.  Dedicated to helping those in the service and system administration roles understand some of the debugging principles I employ in practical situations.  So let me start with SCSI disks.


Because they are mechanical in nature, disk drives failures are one of the most common type of problems I run across.  Going through /var/adm/messages can be quite difficult to the unseasoned veteran of SCSI problems.  Take the following.

Nov  6 18:30:24 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@8,700000/pci@3/scsi@2/sd@0,0 (sd6):
Nov  6 18:30:24 blaze        Unhandled Sense Key 'Vendor Unique'
Nov  6 18:30:24 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@8,700000/pci@3/scsi@2/sd@0,0 (sd6):
Nov  6 18:30:24 blaze        Error for Command: read(10)                Error Level: Retryable
Nov  6 18:30:24 blaze scsi: [ID 107833 kern.notice]  Requested Block: 768                       Error Block: 912
Nov  6 18:30:24 blaze scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                            Serial Number: ABCDEFGHI
Nov  6 18:30:24 blaze scsi: [ID 107833 kern.notice]  Sense Key: Vendor Unique
Nov  6 18:30:24 blaze scsi: [ID 107833 kern.notice]  ASC: 0x80 (<vendor unique code 0x80>), ASCQ: 0x0, FRU: 0xa
Nov  6 18:30:24 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@8,700000/pci@3/scsi@2 (glm0):
Nov  6 18:30:24 blaze        Resetting scsi bus, <null string> from (0,0)
Nov  6 18:30:24 blaze genunix: [ID 408822 kern.info] NOTICE: glm0: fault detected in device; service still available
Nov  6 18:30:24 blaze genunix: [ID 611667 kern.info] NOTICE: glm0: Resetting scsi bus, <null string> from (0,0)
Nov  6 18:30:24 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@8,700000/pci@3/scsi@2 (glm0):
Nov  6 18:30:24 blaze        got SCSI bus reset
Nov  6 18:30:24 blaze genunix: [ID 408822 kern.info] NOTICE: glm0: fault detected in device; service still available
Nov  6 18:30:24 blaze genunix: [ID 611667 kern.info] NOTICE: glm0: got SCSI bus reset
Nov  6 18:30:24 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@8,700000/pci@3/scsi@2/sd@1,0 (sd8):
Nov  6 18:30:24 blaze        SCSI transport failed: reason 'reset': retrying command
Nov  6 18:30:27 blaze scsi: [ID 107833 kern.warning] WARNING: /pci@8,700000/pci@3/scsi@2/sd@0,0 (sd6):
Nov  6 18:30:27 blaze        Error for Command: read(10)                Error Level: Retryable
Nov  6 18:30:27 blaze scsi: [ID 107833 kern.notice]  Requested Block: 1024                      Error Block: 1024
Nov  6 18:30:27 blaze scsi: [ID 107833 kern.notice]  Vendor: SEAGATE                            Serial Number: ABCDEFGHI
Nov  6 18:30:27 blaze scsi: [ID 107833 kern.notice]  Sense Key: Unit Attention
Nov  6 18:30:27 blaze scsi: [ID 107833 kern.notice]  ASC: 0x29 (scsi bus reset occurred), ASCQ: 0x2, FRU: 0x2


This can quickly fill up your messages file and make it difficult to determine the root cause of a problem, especially on a SCSI bus, when the component in question doesn't fail.  And these messages were written by a programmer of the device driver, which makes it a little more difficult to understand.  Because of the complexity of the messages files, some people like to use iostat -E to summarize the problem.  Unfortunately this can often lead to a mis-diagnosis.  Take the following output for instance.



sd6       Soft Errors: 12 Hard Errors: 112 Transport Errors: 136
Vendor: SEAGATE  Product: ST336607LSUN36G  Revision: 0207 Serial No: ABCDEFGHI
Size: 36.42GB <36418595328 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 105 Recoverable: 12
Illegal Request: 0 Predictive Failure Analysis: 0
sd8       Soft Errors: 0 Hard Errors: 68 Transport Errors: 459
Vendor: SEAGATE  Product: ST336607LSUN36G  Revision: 0207 Serial No: CDEFGHIJK
Size: 36.42GB <36418595328 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 68 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
sd10      Soft Errors: 0 Hard Errors: 76 Transport Errors: 282
Vendor: FUJITSU  Product: MAN3367M SUN36G  Revision: 1804 Serial No: EFGHIJKLM
Size: 36.42GB <36420075008 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 76 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

If I was to determine which component was failing by the shear number of events, sd8 would win.  In this case, I would be wrong.  Seeing that they are spread out on all three drives on this bus, maybe I could guess, the HBA was failing.  Wrong again.  iostat -E summarizes what is available in the kstat structures, meaning active since the last reboot.  That could create a problem itself.  It's only a counter of events.  On a scsi bus, all devices are affected in some way or another by each other.  If one device determines it needs to reset, the whole bus will reset.


So the key to looking at the problem is to start with the very first event and then see what is happening.  Get a feel for what is going on a try to figure out a pattern.  In snippet of /var/adm/messages above, you see sd6 reporting an error, because of this, the glm driver (Host Bus Adapter) resets the bus, then other drives report the scsi bus reset.  This my be an overly simple way of looking at what is happening on the scsi bus but it is a good start. 


In summary, look at the errors, get a pattern of events and if needed summarize the errors.  But remember, you need to start with the first event and to see what is going on.  If you start in the middle you will more than likely get the diagnosis wrong.  If you base your diagnosis only on counters, you will probably get the wrong diagnosis.



Wednesday Oct 15, 2008


With the current economic crisis, it is my opinion is that people want a leader right now.  Someone who can motivate people and take control of things.  Someone they can trust and can guide them out of these troubled times.  Someone who is "looking out for them" as Bill O'Reilly says. 


When I was in the military we defined Leadership as the art of influencing and directing men and women to accomplish a mission.  Influencing is the first part of the action required by definition. How to influence?  First of all you have to know and understand what people are going through.  That means you have to know your people and their circumstances.  What motivates people and how can you inspire them.  For a small few, monetary compensation alone inspires them.  But for most, it's their value to the organization or cause.  Just think about it.  How many organizations are there who have non paid volunteers?



Next, leadership is an art.  You may have some natural tendencies toward leadership, but just like every other skill, leadership must be developed, practiced and refined. You aren't given leadership skills, by virtue of your position.  Manager is not in the definition, so you don't have to be a manager to lead.  Some of the people I respect most at Sun have a technical role and not a manager role.  And in so many cases I work on for customers, the customer is looking for the technical resource to lead them out of their crisis. 


For example, I was on a call a few weeks ago in which the server was running out of swap space for an unknown reason.  Their online application was going to crash unless I could resolve the problem.  The customer was not looking for a manager to help them out, but a competent technical resource who could quickly narrow down their problem and direct them to a fix.  As a matter of fact, the manager was looking to me to lead the customer through their crisis.  With some communication to understand what was going on and a few commands, I was able to lead the customer out of the problem.


My son's football coaches tell them to lead by example.  The economic crisis we face today in the US, did not have those leaders who lead by example.  Instead we have seen greed by example.  No sacrifices by people who should be our leaders.  Instead the sacrifices are on the backs of others who will truely lead.  Those in power are going to have to change and put others first to lead this economy out of our problems.  Isn't that how you lead in the first place, put others first.  And as a technical resource, I am going to lead to help my peers and customers through this struggle.  Let's do what we can to lead out of the US financial crisis.



Monday Oct 06, 2008

The integration of iSCSI into to ZFS makes it really simple to make your own storage system.  Let me take you through a very simple example.


 Indentify a lun to use.


# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c0t1d0 <LSILOGIC-LogicalVolume-3000 cyl 65533 alt 2 hd 16 sec 136>
          /pci@780/pci@0/pci@9/scsi@0/sd@1,0
       1. c0t3d0 <FUJITSU-MAY2073RCSUN72G-0401-68.37GB>
          /pci@780/pci@0/pci@9/scsi@0/sd@3,0


Create a zpool


# zpool create blaze c0t3d0
# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
blaze                    68G     88K   68.0G     0%  ONLINE     -


Partition up pool as you would like.


# zfs create -V 2g blaze/disk1


Turn on iSCSI


# zfs set shareiscsi=on blaze/disk1


Now on the target add the target server


 # iscsiadm add discovery-address 10.10.10.10


# iscsiadm modify discovery -t enable


View if you like


# iscsiadm list discovery-address -v


Now you can format the lun on the target system and create the file system.  Simple and quick.  There is more for configuring dynamic multipathing

Monday Sep 08, 2008


My Billfold is going to explode.  I'm unable to fit my latest reward card in my wallet.  The shear number of reward club cards that I have is staggering.  Gas Stations, Grocery Stores, Sporting Good Stores, Airlines, Auto Rental and Hotels are all trying to make me their loyal customer by getting me to sign up with their business and then track everything I buy with them.  Let's not forget all the credit cards, fitness membership IDs, member only stores, drivers license and various hunting and fishing license, no wonder my poor wallet is sick and not to mention the pain such a large bulge inflicts on it's carrier ( no, I don't want to start a class action law suite).


And to gap everything off, each reward membership has an internet site for you to maintain and use your rewards.  You need to hirer a secretary to manage this nightmare.  That's not the way to build loyalty.  A good product and good service is really the only way to build loyalty.  I don't use my Grocery Shopping Card because I like the store.  I use it because I forced to if I want to save money.  That gas program card, the only thing I like about it is that it shuts up the speaker next to the pump quicker than pressing mute multiple times.  Sporting Good stores, really rewarding me after spending a huge amount with a $10.00 certificate.  Airlines, the truth is the way you treat us and pack us into an airplane, then charge us for a snack, let me not forget baggage fees.  All but one Airline have lost my loyalty.


So businesses take note.   When I purchase a product, quality is very important to me but service is probably the most important factor.  Starting with how you treat me during the sell and helping me determine what is best for my needs, and ending up with how you treat and work with me to after the sell.  I realize that some items break, but if you treat me right and do what is right, you will win my loyalty.  Those reward cards have got to be adding to my expense in the first place.  Get rid of them.


And to my fellow Sun Services employees, do your best and treat the customer with respect and concern and we will win their loyalty that way I don't have to carry one more reward card in my wallet.



Thursday Sep 04, 2008

One measure of how popular something has become is to do a google search on it.  For example, if I want to see how popular DTrace is I type in dtrace and I get 904,000 entries; opensolaris 5,640,000; Sun Microsystems 28,100,000; "David Blasingame" 748 (I need to work on that). 


The google search method will show you how much interest there is in a person or topic.  I believe executives can use this method as a gauge of their products.  Let me take politics as an example.  How much interest has been generated by the nomination of Govenor Sarah Palin for Vice President; "Sarah Palin" 7,010,000.  It would be interesting to see the number of hits for her prior to her nomination.  I wonder if google tracks this. 


Just think, you introduce a new product and you want to track interest then google it and chart it.  How google tracks things, I have no idea.  But for a relative measure of success, and since most people use google for their search, you can use it to track the interest generated by your products. 


Now how to generate excitement.  One way is to make really good product like Solaris 10, OpenSolaris, or check out mysql 257,000,000.  Compared to oracle 129,000,000.  It's no wonder Sun decided to buy mysql.  So what Sun needs to do is to generate more excitement.  Hopefully the more interest and excitement, the more revenue.