News, How-to's, and More SolarisCAT

Tuesday Feb 10, 2009

One new feature in Solaris CAT 5.1 is the ability to run the tool's sanity checks without having to fully start the tool.  You can do this using the --sanity_checks option. For example:

# scat --sanity_checks vmcore.0
sanity checks: settings...
NOTE: /etc/system: ce:ce_taskq_disable set to 0x1 2 times
NOTE: /etc/system: module ge not loaded for "set ge:ge_intr_mode=0x833"
vmem...CPU...
WARNING: CPU0 has cpu_intr_actv for PIL 1
WARNING: TS thread 0x3000f36f520 on CPU3 using 98%CPU
WARNING: TS thread 0x33732a2a7e0 on CPU515 using 98%CPU

sysent...clock...misc...
WARNING: 213 severe kstat errors (run "kstat xck" )
WARNING: tmpfs filesystem on /tmp using 4.66G virtual memory

NOTE: kcage_freemem < kcage_lotsfree
WARNING: 1 pending softcalls (no softlevel1 interrupt queued)
done

These are the same checks that are run when the tool first reads the crash dump, as well as when the coreinfo command is used.
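
If you want a quick tally rather than the full listing, the output can be piped through a small filter. The following is only a sketch: summarize_sanity is a hypothetical helper, not part of Solaris CAT, and it assumes the messages start with "WARNING:" or "NOTE:" at the beginning of a line, as in the output above.

```shell
# summarize_sanity: hypothetical helper, not part of Solaris CAT.
# Reads sanity-check output on stdin and prints how many lines
# begin with "WARNING:" and how many begin with "NOTE:".
summarize_sanity() {
    awk '
        /^WARNING:/ { warnings++ }
        /^NOTE:/    { notes++ }
        END { printf "warnings=%d notes=%d\n", warnings, notes }
    '
}
```

It could be used as `scat --sanity_checks vmcore.0 2>&1 | summarize_sanity`. The `2>&1` is there because we are not assuming which stream the messages are written to.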

Even better, with the proper permissions (for example, when running the command as root), the same checks can be run on the live system at any time to get a snapshot of the overall "health" of the system.  For example, sanity check output for a healthy system would look like:

# scat --sanity_checks
sanity checks: settings...vmem...CPU...sysent...misc...done
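
Because a healthy system produces no WARNING or NOTE lines, a periodic check can be reduced to a pass/fail test. This is a minimal sketch under that assumption; sanity_clean is a hypothetical helper name, not something shipped with the tool.

```shell
# sanity_clean: hypothetical helper, not part of Solaris CAT.
# Reads sanity-check output on stdin.
# Returns 0 if no line begins with "WARNING:" or "NOTE:", 1 otherwise.
sanity_clean() {
    if grep -E '^(WARNING|NOTE):' >/dev/null; then
        return 1
    fi
    return 0
}
```

For example, `scat --sanity_checks 2>&1 | sanity_clean || echo "sanity checks reported something"` prints a message only when the live system is not clean. Treating NOTE lines as a failure is a design choice; drop NOTE from the pattern if only warnings matter to you.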


Once you've installed the tool, complete details on the sanity checks can be found in /opt/SUNWscat/docs/sanity.html.

NOTE: In some cases, things like disk drivers generate a few errors in kstats during initialization.  If Solaris CAT reports kstat counts of one or two hits for a device, be sure to research the device's kstats before assuming there's a problem. Within Solaris CAT, a simple thing to try is "kstat xck", which runs a cross check of the kstats. You can also display kstats outside Solaris CAT using Solaris' kstat(1M) command.


Friday Feb 06, 2009

It's always fun when you can beat your own goals and release something early.  The bits made it through the tests, the legal tasks got completed, and the packaging was done, so why wait until next week when one could "pull the trigger" today?  We are therefore happy to announce that Solaris CAT 5.1 is now available for download here. The release notes are provided in /opt/SUNWscat/docs/relnotes_5.1.html after you install the package, or in our last blog entry.

If you have comments, questions, want to report a bug, or request an RFE, please feel free to send us a note at SolarisCAT_Feedback@sun.com.

Wednesday Feb 04, 2009

As promised, we'll be releasing Solaris Crash Analysis Tool updates every six months or so.  Yes!  The final release process for Solaris CAT 5.1 is reaching its end and you'll soon be able to download the latest and greatest bits. Come back here on Feb 15th and you'll likely see the release announcement.

Though this release mostly addresses bugs and adds functionality to support the latest changes in the Nevada/OpenSolaris kernel, there are a few new features.  Here are the release notes:

General

Solaris 2.4, 2.5, and 2.5.1 Support

This version of Solaris CAT no longer supports Solaris 2.4, 2.5, or 2.5.1. Please use the 5.0 version of the tool if support for those Solaris releases is required.

FMRI Reporting

In Solaris Nevada/OpenSolaris build 86 and up, the FMRI string for the SMF service is maintained with each proc. Therefore, the FMRI string for that service is now displayed with the command name for all threads and procs.

Solaris Volume Manager Scans for Active Data Set

On Solaris 9 and up, the svm command now scans for an MD set that has active devices. This means that the command increments the set it is using until it finds a set that contains metadevices. The command starts with the currently defined set, which is 0 by default, and displays a message when the set number is changed. For example:

SolarisCAT(vmcore.14/10U)> svm
Solaris Volume Manager Status (md_status):
    MD_GBL_DAEMONS_LIVE (Master daemon has been started)
    MD_GBL_OPEN (Administration is open)

Active Metadata Set(s):
    Set   Address     Name  Status
    0     0x7003ef08  null  MD_SET_SNARFED MD_SET_NM_LOADED
    1     0x7003ef78  foo   MD_SET_SNARFED MD_SET_NM_LOADED

 SVM's md_set 0 is empty, trying set 1 instead.

d0 (ms_unit @ 0x6001468c8c8) (md_set 1) Concat/Stripe
    actual # blocks:      113207296 (53.9G)
    Unit Status:          Okay
    Stripe 0:
    Device       Starting Block  State
    239(did),96  0               Okay
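
The scan described above can be pictured with a small sketch. This is not the svm command's actual code: first_active_set is a hypothetical function, and the per-set metadevice counts are passed in as plain arguments purely for illustration.

```shell
# first_active_set START COUNT0 COUNT1 ...
# Hypothetical illustration only. COUNTn is the number of metadevices
# in set n. Prints the number of the first set at or after START that
# has metadevices; returns 1 if no such set exists.
first_active_set() {
    start=$1
    shift
    set_no=0
    for count in "$@"; do
        if [ "$set_no" -ge "$start" ] && [ "$count" -gt 0 ]; then
            if [ "$set_no" -ne "$start" ]; then
                # Mirror the notice shown when the set number changes.
                echo "md_set $start is empty, trying set $set_no instead." >&2
            fi
            echo "$set_no"
            return 0
        fi
        set_no=$((set_no + 1))
    done
    return 1
}
```

With the two sets from the example, where set 0 is empty and set 1 holds one metadevice, `first_active_set 0 0 1` prints 1 and writes the notice to standard error.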

New Commands

CYCLIC

cyclic

This new command displays the cyclic at the address specified. That address is sometimes referred to as a cyclic_id.

This command ONLY works on Solaris 9+.

SLISTT

slistt

This new command displays kernel list_t linked lists and the structures linked by them. The address provided needs to be the address of a list_t structure. This is often embedded in other structures. In those cases, the offset into the structure of the list_t should be added to the structure address for printing.

This command ONLY works with CTF data.
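
The address arithmetic for an embedded list_t is simple enough to do in the shell. A minimal sketch follows; list_t_addr is a hypothetical helper, and the offset must come from the definition of the enclosing structure.

```shell
# list_t_addr STRUCT_ADDR OFFSET
# Hypothetical helper, not part of Solaris CAT. Prints, in hex, the
# address of a list_t embedded OFFSET bytes into the structure at
# STRUCT_ADDR. Both arguments may be given in hex (0x...) or decimal.
list_t_addr() {
    printf '0x%x\n' "$(( $1 + $2 ))"
}
```

For instance, `list_t_addr 0x6001468c8c8 0x18` prints 0x6001468c8e0. The offset 0x18 here is made up for the example and does not refer to any particular kernel structure.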

Interface Changes

callout

The command-line interface of this command was substantially changed, both for internal consistency and for consistency with the kernel code.

By default, all tables are displayed. Instead of the rt and ts options, there are now the flags -r for selecting the realtime callouts and -n for selecting the normal callouts.

Options were added to include relevant structure addresses (-a), to decode the callout's argument into the thread or process it represents (-t) or to display only expired callouts (-e).

Finally, an option was added to display only a specified callout. The callout is selected using its XID.

scat --sanity_checks

The new command line argument --sanity_checks can be used to run a quick check of a running system or crash dump file. The intent is to allow easy access to these extensive checks without having to run Solaris CAT interactively.

send_scat_explore
scat --scat_explore

In release 5.1, one can now run scat_explore in a quasi standalone mode using the --scat_explore option to scat. For customers who open a Sun Service Request, crash data can now be gathered and transmitted to Sun using the send_scat_explore command. The syntax for send_scat_explore is:
send_scat_explore [-n service_number] [-e email_address] [unix.x] vmcore.x

Where:

  • -n service_number - sets the Sun Service Request number to assign to the crash data
  • -e email_address - sets the reply-to email address that Sun should use to acknowledge the receipt of the data.
  • [unix.x] vmcore.x - the crash dump from which crash data should be gathered. Please note that unix.X need not be supplied and the core number, X, can be specified with or without the vmcore. prefix.

If the above -n and -e options are not specified, the user is prompted for them at run time.
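
To illustrate the two accepted spellings of the core argument, here is a sketch of how a wrapper script of your own might normalize it. core_file is a hypothetical helper and is not how send_scat_explore itself is implemented.

```shell
# core_file ARG
# Hypothetical helper for wrapper scripts. Accepts either a bare core
# number ("0") or a full name ("vmcore.0") and prints the full name.
core_file() {
    case $1 in
        vmcore.*) echo "$1" ;;
        *)        echo "vmcore.$1" ;;
    esac
}
```

Both `core_file 0` and `core_file vmcore.0` print vmcore.0.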

If the system in question is not configured to send email directly to Sun, the crash data can be collected manually using scat --scat_explore. The scat_explore feature prints the name of the directory in which the data was placed and also places a compressed tar archive of the crash data in that directory.