"HERD 2.0": what is that?
HERD is the "Hardware Error Report and Decode" utility. It's a monitoring daemon for all Sun AMD processor based systems running fairly recent versions of SUSE and Red Hat Linux, designed to monitor the machine check log and decode any errors found there. Earlier versions were available, but they knew the hardware topology of only certain Sun AMD processor based systems; with version 2.0 it covers ALL Sun AMD based systems.
With the complete topology of the system it can decode the information that the CPU provides for an error into physical components, e.g. down to a specific DIMM.
It is available from the Sun download centre: https://cds.sun.com/is-bin/INTERSHOP.enfinity/WFS/CDS-CDS_SMI-Site/en_US/-/USD/ViewProductDetail-Start?ProductRef=HERD-2.0-M-G-F@CDS-CDS_SMI
Low frequencies of correctable errors are an expected part of modern computing and you need to consult your hardware service provider to determine when a DIMM is generating too many Correctable Errors and should be swapped.
I remember that when you wanted to add a driver to control a scsi device you added the driver to the right directory, created a driver.conf file, edited that file to select what target/lun combination you wanted and maybe set some properties or address filters, ran devfsadm or did a boot -r and it all just worked...
Now it appears I am old fashioned: the mpt SCSI driver became one of these new-fangled "self identifying" HBA devices with Solaris 10 mpt patch 125081-09 and later revisions, which are now in the kernel patch and hence in the latest updates to Solaris 10.
We did document this in the scsi(4) manual page; it's just that I don't think we really documented that mpt had changed type.
So if you add an SL500 tape changer device to an mpt SCSI bus, a /devices node will now be created by mpt rather than by the target driver. As mpt has no idea what driver will attach to the node, it uses the name "sgen" for scsiclass 08, st for scsiclass 01, and sd for scsi classes 00, 03 and 05. This is the big difference: previously the /devices node would only have appeared when your target driver attached, and it would have used the name of your driver, so you could search for it with find during your solution's install script and put a nice simple entry in /etc/devlink.tab to get a /dev symbolic link.
Now you will always get a /devices node called sgen@XXX for a changer device attached to mpt as soon as mpt attaches and probes the bus, and sgen will attach to it without any modification to sgen.conf. So how do I get my "foo" driver to attach to it?
"man -s4 scsi" says that you will get a list of compatible properties attached to the /devices node, ranging from some very specific strings through to very generic strings. The sgen driver picks up on the very generic entry, but you can use update_drv -a -i "specific compat property" foo to insert the association ahead of sgen. At this point sgen should let go of the device and the foo driver will attach to it. You can undo this association by using update_drv -d -i "specific compat property" foo to delete the association and allow sgen to re-attach to the changer device.
So for my SL500, how do I see these compatible properties?
prtconf -v | grep -i stk
value='scsiclass,08R.vSTK.pSL500.r1126' + 'scsiclass,08.vSTK.pSL500.r1126' + 'scsiclass,08R.vSTK.pSL500' + 'scsiclass,08.vSTK.pSL500' + 'scsa,08.bmpt' + 'scsiclass,08R' + 'scsiclass,08' + 'scsiclass'
#
So to add my foo driver ahead of sgen and take control of all scsi class 08 changers in STK SL500s I used
update_drv -a -i '"scsiclass,08.vSTK.pSL500"' foo
and to get symbolic links under /dev/foo I added this line (tab, not spaces) to /etc/devlink.tab
type=ddi_pseudo;name=sgen; foo/\M0
instead of the old fashioned form ...
type=ddi_pseudo;name=foo; foo/\M0
all without editing a single driver.conf file...
Of course the old /dev links for sgen will still be there under /dev/scsi/changer/, but they can be removed with a devfsadm -C.
The same would have to be done for the SL500's tape devices, but they would show up under /devices as a node with a name like st@XXXX. You would have to add an alias for them using update_drv -a, plus a line in /etc/devlink.tab using name=st; my_driver
Certainly some installation programs and solutions will come unstuck when faced with this new mpt behaviour. Hopefully this will help explain what you are seeing.
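The choice of alias above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the selection rule (vendor and product present, no revision, no "R" suffix on the class) is an assumption that simply mirrors the string I picked by hand, not something scsi(4) mandates.

```python
import re

# The value line printed by prtconf -v for the SL500 changer (copied from above).
PRTCONF_LINE = (
    "value='scsiclass,08R.vSTK.pSL500.r1126' + 'scsiclass,08.vSTK.pSL500.r1126' + "
    "'scsiclass,08R.vSTK.pSL500' + 'scsiclass,08.vSTK.pSL500' + 'scsa,08.bmpt' + "
    "'scsiclass,08R' + 'scsiclass,08' + 'scsiclass'"
)


def parse_compatible(value_line):
    """Return the quoted compatible strings in the order prtconf printed them."""
    return re.findall(r"'([^']*)'", value_line)


def pick_alias(compat):
    """Pick the first class.vendor.product string without a revision (assumed rule)."""
    pattern = re.compile(r"^scsiclass,[0-9A-Fa-f]{2}\.v[^.]+\.p[^.]+$")
    for entry in compat:
        if pattern.match(entry):
            return entry
    return None


def update_drv_command(alias, driver):
    """Build the update_drv command line with the nested quoting used above."""
    return "update_drv -a -i '\"%s\"' %s" % (alias, driver)


if __name__ == "__main__":
    alias = pick_alias(parse_compatible(PRTCONF_LINE))
    print(update_drv_command(alias, "foo"))
```

An install script could use something like this to work out the alias instead of hard coding it, but check the string against scsi(4) before trusting the rule.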
The customer's brand new machines were having problems with a jvm stalling in garbage collection. The jvm was set to have a 3.1 Gbyte heap, but after a few hours of running the application would grind to a halt; the garbage collector stats showed that garbage collection was taking an increasing amount of time, up into the 10s of seconds.
prstat -m 1 on the box with the application showed the jvm using a lot of user and system time. The system time was odd, but at the top of the active processes was rcapd. A quick look in /etc/project showed that someone had set a resource control limiting RSS for the user.root project to 1.7 Gbytes.
The line looked something like..
user.root:123245::::rcap.max-rss=1707745024
So that was bad. Every time the processes that belonged to root got an RSS of over 1.7 Gbytes, rcapd would pick on them, page out their pages, and then set them running again. In the case of the jvm, as soon as the garbage collector had to manage a heap bigger than 1.7 Gbytes it would scan the objects, paging all those heap pages back in, and then rcapd would page them all back out. The garbage collector was now being limited by the speed of the swap device and was fighting a losing battle.
iostat -xnz 1 showed 100s of Mbytes/sec going to swap and then 100s of Mbytes/sec coming back from swap.
vmstat -p 1 just showed a massive amount of page ins.
rcapd was disabled using rcapadm -D and the new machines flew along. I did some investigation here in the office using a small test case that malloc'ed a large array and then looped around touching its pages; it was crippled as soon as the array was much bigger than the cap. Interestingly, as rcapd is procfs based, sometimes a procfs command like pfiles or truss would either fail with "the process is traced" or stop rcapd getting to the process's proc files, so the process would escape capping for a short while...
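To see the mismatch at a glance, here is a minimal Python sketch that pulls rcap.max-rss out of a project entry and compares it with the jvm heap. The function names are hypothetical, the field layout (name:projid:comment:users:groups:attributes, with semicolon separated attributes) is taken from project(4) as I understand it, and treating "3.1 Gbytes" as 3.1 * 2^30 bytes is an assumption.

```python
def parse_project_line(line):
    """Split one /etc/project entry into its name, id and attribute dictionary."""
    name, projid, _comment, _users, _groups, attrs = line.strip().split(":", 5)
    attributes = {}
    for item in attrs.split(";"):
        if not item:
            continue
        key, _, value = item.partition("=")
        attributes[key] = value
    return {"name": name, "projid": int(projid), "attributes": attributes}


def rss_cap_bytes(project):
    """Return the rcap.max-rss cap in bytes, or None if the project is uncapped."""
    value = project["attributes"].get("rcap.max-rss")
    return int(value) if value else None


# The entry found on the customer's machine (copied from above).
PROJECT_LINE = "user.root:123245::::rcap.max-rss=1707745024"

# Assumption: the 3.1 Gbyte jvm heap expressed in binary gigabytes.
HEAP_BYTES = int(3.1 * 2**30)

if __name__ == "__main__":
    cap = rss_cap_bytes(parse_project_line(PROJECT_LINE))
    if cap is not None and cap < HEAP_BYTES:
        print("heap of %d bytes cannot stay resident under a cap of %d bytes"
              % (HEAP_BYTES, cap))
```

Any process in the project whose working set is larger than the cap will be paged out repeatedly, which is exactly what the garbage collector was running into.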
ksh: /usr/sbin/dtrace -qs /var/tmp/aio.d $(pgrep aio)
------- 1648930 value axis is in millisecs -------
write against /devices/pseudo/pts@0:1
value ------------- Distribution ------------- count
-1 | 0
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 82
1 | 0
kaio writes to /devices/pci@0/pci@0/pci@2/scsi@0/sd@1,0:a,raw
value ------------- Distribution ------------- count
2 | 0
4 |@@@@@@@@@@@@@@@@@@@@ 299
8 |@@@@@@@@@@@@@@@@@@@@ 298
16 | 1
32 | 0
kaio writes to /devices/pci@0/pci@0/pci@2/scsi@0/sd@1,0:b,raw
value ------------- Distribution ------------- count
2 | 0
4 |@@@@@@@@@@@@@@@@@@@@ 295
8 |@@@@@@@@@@@@@@@@@@@@ 295
16 |@ 9
32 | 0
------- 1648940 value axis is in millisecs -------
write against /devices/pseudo/pts@0:1
value ------------- Distribution ------------- count
-1 | 0
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 114
1 | 0
kaio writes to /devices/pci@0/pci@0/pci@2/scsi@0/sd@1,0:a,raw
value ------------- Distribution ------------- count
1 | 0
2 | 2
4 |@@@@@@@@@@@@@@@@@@@@ 328
8 |@@@@@@@@@@@@@@@@@@@@ 334
16 | 1
32 | 0
kaio writes to /devices/pci@0/pci@0/pci@2/scsi@0/sd@1,0:b,raw
value ------------- Distribution ------------- count
2 | 0
4 |@@@@@@@@@@@@@@@@@@@@ 328
8 |@@@@@@@@@@@@@@@@@@@@ 329
16 | 7
32 | 0
^C
application (aioread/aiowrite)
libaio
--------
kaio (aioread/aiowrite)
a volume manager
ssd target driver
mpxio - multipathing
leadville - san stack
qlc - hba
--- disk ---
qlc interrupt handler
leadville completion routines
ssd interrupt handler
the volume manager's completion routines
kaio (aiowait)
-----
libaio
application (aiowait)
# modload /home/timu/src/syscalls/test1
can't load module
Dec 9 21:27:22 aaaaa test1: REQUEST INFO
: Out of memory or no room in system tables
Dec 9 21:27:22 aaaaa test1: INITIALIZED
Dec 9 21:27:22 aaaaa genunix: WARNING: system call missing from bind file
Obviously I must have missed adding an entry to /etc/name_to_sysnum!
# grep timu /etc/name_to_sysnum
timuSyscall 66
Oh! It's not that, and I had used an unused number. Ah! How does the system match the module name to the syscall number? There is nothing in the driver I have written to do that... It must be the name.
# cp test1 timuSyscall
# modload /home/timu/src/syscalls/timuSyscall
Dec 9 21:30:04 aaaaa timuSyscall: REQUEST INFO
Dec 9 21:30:04 aaaaa timuSyscall: INITIALIZED
#
So the module name that you are modloading must match the name you put in /etc/name_to_sysnum.
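A quick pre-flight check along these lines would have saved me the head scratching. It is a minimal Python sketch with hypothetical helper names, and it assumes each line of /etc/name_to_sysnum is simply a name followed by a number, as in the grep output above.

```python
import os


def syscall_entries(text):
    """Map each name in name_to_sysnum style text to its syscall number."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            entries[fields[0]] = int(fields[1])
        except ValueError:
            # Not a "name number" line; ignore it.
            continue
    return entries


def module_has_binding(module_path, entries):
    """True if the module's file name matches a name in the bind file."""
    return os.path.basename(module_path) in entries


if __name__ == "__main__":
    sample = "timuSyscall 66\n"
    entries = syscall_entries(sample)
    for path in ("/home/timu/src/syscalls/test1",
                 "/home/timu/src/syscalls/timuSyscall"):
        print(path, module_has_binding(path, entries))
```

The first path has no matching entry, which corresponds to the "system call missing from bind file" warning; the second one does.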
ksh: mdb unix.0 vmcore.0
Loading modules: [ krtld genunix specfs dtrace ufs scsi_vhci sd mpt px ip sctp arp usba fcp fctl nca lofs audiosup md cpc random crypto zfs fcip logindmux ptm sppp nfs ipc ]
> ::walk ereportq_dump | ::ereport -v
class='ereport.io.fire.pec.mfp'
ena=13e4662751600401
detector
version=00
scheme='dev'
device-path='/pci@1e,600000'
primary=1
tlu-uele=00000000001fffff
tlu-uie=001fffff001fffff
tlu-uis=0000000000040000
tlu-uess=0000000000040000
tlu-rueh1l=0000000000000000
tlu-rueh2l=000000007e4151da
tlu-tueh1l=000000010000000f
tlu-tueh2l=0080004400000000
class='ereport.io.fire.pec.mfp'
ena=13e466e707800401
detector
version=00
scheme='dev'
device-path='/pci@1e,600000'
primary=1
tlu-uele=00000000001fffff
tlu-uie=001fffff001fffff
tlu-uis=0004000000040000
tlu-uess=0004000000040000
tlu-rueh1l=0000000000000000
tlu-rueh2l=000000002db088b7
tlu-tueh1l=0200000100000f0c
tlu-tueh2l=0000104400000000
I bought myself a USB card from maplin.co.uk - a Velleman k8055 card, 2 analog inputs, 2 analog outputs, 5 digital inputs, 8 digital outputs. I bought it to experiment with libusb.
It took about an hour to solder together, and as soon as I plugged it in to my Solaris laptop (nv_b51) it was autodetected and the HID driver attached to it. Odd to have it export the HID class, but easy to fix.
I took the usb vid and pid from /var/adm/messages and used update_drv -a to force a binding to the ugen driver. Unplugged and replugged the card and the ugen driver bound to the card.
Then via google I found a k8055 linux libusb application, which built without error using the studio 11 compilers (free from www.sun.com). The k8055 works perfectly: I can read the sensors and write values to the ports. That has taken all the fun out of it...
I think I'll turn it into a kernel driver to give me some practice.
It is "%k" as in
/usr/sbin/dtrace -n 'BEGIN{ @foo[ cpu, stack(3)] = count();} END{ printa("cpu %d stack %k count %@d\n", @foo);}'
Several folks have asked when a program should set the stack size rlimit. Just before exec() is the only sensible point.
Once your process has started up, things have been mapped just below the reserved stack space, the size of which is the value of the stack size resource limit at the time the program assembled its address space (i.e. during exec).
Let's use pmap and have a look.
ulimit -S -s 20000
ulimit -S -s
20000
sleep 20 & pmap $!
2460: sleep 20
00010000 8K r-x-- /usr/bin/sleep
00022000 8K rwx-- /usr/bin/sleep
00024000 8K rwx-- [ heap ]
FE700000 864K r-x-- /lib/libc.so.1
FE7E8000 32K rwx-- /lib/libc.so.1
FE7F0000 8K rwx-- /lib/libc.so.1
FE810000 8K r-x-- /platform/sun4u-us3/lib/libc_psr.so.1
FE820000 24K rwx-- [ anon ]
FE830000 184K r-x-- /lib/ld.so.1
FE86E000 8K rwx-- /lib/ld.so.1
FE870000 8K rwx-- /lib/ld.so.1
FFBFE000 8K rw--- [ stack ]
total 1168K
mdb
> (FFBFE000-FE870000)%0t1024=D
20024
>
So the first shared library, ld.so.1, has been mapped just below the reserved stack space.
ulimit -S -s 200000
ulimit -S -s
200000
sleep 20 & pmap $!
[1] 2463
2463: sleep 20
00010000 8K r-x-- /usr/bin/sleep
00022000 8K rwx-- /usr/bin/sleep
00024000 8K rwx-- [ heap ]
F3700000 864K r-x-- /lib/libc.so.1
F37E8000 32K rwx-- /lib/libc.so.1
F37F0000 8K rwx-- /lib/libc.so.1
F3840000 8K r-x-- /platform/sun4u-us3/lib/libc_psr.so.1
F3850000 24K rwx-- [ anon ]
F3860000 184K r-x-- /lib/ld.so.1
F389E000 8K rwx-- /lib/ld.so.1
F38A0000 8K rwx-- /lib/ld.so.1
FFBFE000 8K rw--- [ stack ]
total 1168K
mdb
> (FFBFE000-F38A0000)%0t1024=D
200056
>
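The two mdb calculations can be repeated in Python as a cross-check. This is just the same arithmetic, using the stack segment address and the start address of the highest ld.so.1 mapping from the two pmap listings; the function name is made up for the example.

```python
STACK_SEGMENT = 0xFFBFE000  # address of the [ stack ] segment in both runs

# (soft stack limit in KB, start address of the highest ld.so.1 mapping)
RUNS = [
    (20000, 0xFE870000),
    (200000, 0xF38A0000),
]


def gap_kb(stack_segment, highest_mapping):
    """Distance in KB between the highest library mapping and the stack segment."""
    return (stack_segment - highest_mapping) // 1024


if __name__ == "__main__":
    for limit_kb, mapping in RUNS:
        gap = gap_kb(STACK_SEGMENT, mapping)
        print("limit %d KB, gap %d KB, slack %d KB" % (limit_kb, gap, gap - limit_kb))
```

In both runs the gap is the stack limit plus a few tens of KB, so the libraries are sitting directly under the space reserved for the stack.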
So if I use setrlimit to change the current stack space setting to a bigger number then all future mappings will be pushed down below that reserved space but existing mappings won't move, and if your stack tries to grow over them you will get a segv signal. So you should only ever increase the stack size rlimit just before a call to exec().
This stack size only affects the default stack for the main thread in a process; the stacks for other threads are sized at thread_create() time, using either the default 1MB or a program specified amount.
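As a minimal sketch of "set it just before exec", here is how it might look from Python, where subprocess runs a hook in the child between fork and exec. The helper name is hypothetical, the limit is passed in KB to match ulimit -s, and the clamp to the hard limit is my addition so that the call does not fail on a machine with a low hard limit.

```python
import resource
import subprocess
import sys


def run_with_stack_limit(argv, soft_kb):
    """Run argv with the soft stack limit set in the child just before exec."""

    def set_limit():
        # Runs in the child after fork and before exec, so the parent's
        # own limit and address space layout are left untouched.
        _soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
        wanted = soft_kb * 1024  # setrlimit works in bytes
        if hard != resource.RLIM_INFINITY:
            wanted = min(wanted, hard)
        resource.setrlimit(resource.RLIMIT_STACK, (wanted, hard))

    return subprocess.run(argv, preexec_fn=set_limit,
                          capture_output=True, text=True, check=True)


if __name__ == "__main__":
    child = [sys.executable, "-c",
             "import resource; print(resource.getrlimit(resource.RLIMIT_STACK)[0])"]
    print(run_with_stack_limit(child, 20000).stdout.strip())
```

The new program assembles its address space with the new limit already in place, so nothing gets mapped where the stack may later want to grow. Note that preexec_fn is not safe to use in a multithreaded parent.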
My customer was complaining that his server process was running out of memory: malloc() was returning NULL. A pmap of the process showed it was a 32 bit application, so limited to a touch less than 4GB of address space. The pmap showed it had only 600MB of space used: a small stack section, lots and lots of shared libraries and a 500MB heap (malloc stuff) that was right up to the base of the shared libraries, so had no room to grow.
A bit of careful looking and there was an approximately 2GB hole in the address space starting at about 2GB. How odd!
After getting a truss of the application starting up it became obvious: it was performing a setrlimit() of RLIMIT_STACK to RLIM_INFINITY just before the hole appeared. That call sets the stack size to 2GB (the stack starts out way up at the top of the address space, near 4GB on a 32 bit application). The kernel, when handing out user address space, has to avoid the area reserved for the stack, so all future mmaps are located below 2GB, halving the process's available address space.