Solaris 9 kernel patch 122300-28 and Solaris 10 kernel patch 137111-02 introduced cache line retirement (a while ago).
Before cache line retirement, a single weak cell generating correctable errors (CEs) in a processor's L2 or L3 cache could cause the processor to be offlined; it would then do no useful work and simply await replacement. With cache line retirement, the offending cache line is no longer used and the processor continues to run as normal. The result is more available processing power and less downtime.
An UltraSPARC IV+ CPU module has 2 MB of L2 cache and 32 MB of L3 cache. If 64 of the 524,288 L3 cache lines are retired, the CPU module is offlined, well before the retirements can have any impact on system performance. (The actual implementation is a little more complex, as caches are organized in indexes and ways. A generic primer is here: http://en.wikipedia.org/wiki/CPU_cache )
On Solaris 10, cache line retirement is implemented with FMA (Fault Management Architecture) and persists across reboots through FMA log replay, which happens at Solaris boot time.
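To put the retirement threshold in perspective, here is a small sketch of the arithmetic using the figures quoted above. The 64-byte line size is not stated in the text; it is simply what 32 MB divided by 524,288 lines works out to, and the constant names are made up for illustration.

```python
# Figures quoted in the text; the names are illustrative only.
L3_CACHE_BYTES = 32 * 1024 * 1024   # 32 MB L3 cache per CPU module
L3_LINE_COUNT = 524_288             # total L3 cache lines
RETIRE_LIMIT = 64                   # retired lines before the module is offlined

# Derived values (the line size is inferred, not stated in the text).
line_size = L3_CACHE_BYTES // L3_LINE_COUNT
retired_bytes = RETIRE_LIMIT * line_size
retired_fraction = RETIRE_LIMIT / L3_LINE_COUNT

print(f"line size: {line_size} bytes")
print(f"cache lost at the limit: {retired_bytes} bytes ({retired_fraction:.4%})")
```

In other words, the module is taken out of service after losing roughly a hundredth of a percent of its L3 cache, which is why retirement itself should not be noticeable as a performance problem.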
To see whether, and how many, cache lines are retired on Solaris 9:
kstat -n pn_cacheline_retire
And for Solaris 10:
fmdump -av
To see if a CPU module needs replacement on Solaris 10:
fmadm faulty -a
Example FMA message:
--------------- ------------------------------------ -------------- ---------
TIME EVENT-ID MSG-ID SEVERITY
--------------- ------------------------------------ -------------- ---------
Oct 28 15:15:38 9f7ef7e2-4282-cd2d-f7b4-eeac8c9986d6 SUN4U-8001-1E Major
Fault class : fault.cpu.ultraSPARC-IVplus.l3cachetag
Affects : cpu:///cpuid=4/serial=80010221135D2559
cpu:///cpuid=20/serial=80010221135D2559
degraded but still in service
FRU : "Slot C" (hc://:product-id=SUNW,Sun-Fire-V890:server-id=s064130/component=Slot C)
Description : The number of errors associated with this CPU has exceeded
acceptable levels. Refer to http://sun.com/msg/SUN4U-8001-1E for
more information.
Response : The fault manager will attempt to remove the affected CPU from
service.
Impact : System performance may be affected.
Action : Schedule a repair procedure to replace the affected CPU, the
identity of which can be determined using fmdump -v -u <EVENT_ID>.
Also
psrinfo
will show "offline" processors.
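If you want to act on such a message from a script, the affected CPU IDs can be pulled out of the text. The following is only a sketch: the sample text is copied from the example message above, the function names are made up, and real fmadm output may be laid out differently on other releases.

```python
import re

# Excerpt copied from the example FMA message above.
FMA_TEXT = """\
Fault class : fault.cpu.ultraSPARC-IVplus.l3cachetag
Affects : cpu:///cpuid=4/serial=80010221135D2559
cpu:///cpuid=20/serial=80010221135D2559
degraded but still in service
"""

def affected_cpus(text):
    """Return (cpuid, serial) pairs for every cpu:/// FMRI found in the text."""
    pattern = re.compile(r"cpu:///cpuid=(\d+)/serial=([0-9A-Fa-f]+)")
    return [(int(cpuid), serial) for cpuid, serial in pattern.findall(text)]

def fault_class(text):
    """Return the value of the 'Fault class' line, or None if it is absent."""
    match = re.search(r"^Fault class\s*:\s*(\S+)", text, re.MULTILINE)
    return match.group(1) if match else None

for cpuid, serial in affected_cpus(FMA_TEXT):
    print(f"cpu {cpuid} (serial {serial}): {fault_class(FMA_TEXT)}")
```

Both CPU IDs in the example share one serial number, which fits the message naming a single FRU ("Slot C") for replacement.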
Generally, Solaris 10 with FMA doesn't alert the sysadmin until action needs to be taken.
In that case it posts a message in /var/adm/messages.
Solaris 9, which is by default more verbose in error reporting, reports each retired index and way in /var/adm/messages, and subsequently offlines the processor if a threshold is crossed.
Reference:
UltraSPARC-IV+
Reference:
SunSolve Document ID: 23862 (requires access to SunSolve contract documents).
Cache line retirement should not be confused with Memory page retirement.
In the early days of twisted-pair (TP) Ethernet, getting ports from different vendors to play nice together was sometimes a challenge. Those days are over. I guess that, to battle this problem, some companies required Ethernet ports to be "forced" in software to the highest speed and "full duplex". This sometimes seems to work, but too often causes weird connection problems. Now that most of these interoperability problems are gone, the "old" rule still causes hard-to-find problems every now and then. Nowadays, and for quite some time, autonegotiation works (and if it doesn't, have the vendors fix it). I guess the rule is now no more than an urban legend, IT style.