James Legg
OpenSolaris 2009.06 as a Sun Ray server with EA2 of SRSS 4.2
In my department we run Sun Ray servers so that our users can use and abuse the latest builds of OpenSolaris and try to find and log bugs. We used to maintain a collection of Nevada-based servers, but the regular upgrades became less regular as we got stuck with Live Upgrade issues and the time taken to maintain them increased. Currently Sun Ray 4.2 or 4.1 on OpenSolaris 2009.06 is not supported, but getting it working has been possible for a while now - the community efforts are documented on wiki.sun-rays.org.
We have two systems: a Sun Fire x4240 with two quad-core Opterons and 32GB of RAM, and a 64-thread SPARC T5220 with 64GB of RAM. Both are running build 122 from the /dev IPS repository. Build 123 will go onto another server, and then we will probably upgrade them one at a time when a new build comes out.
These days installing OpenSolaris 2009.06 is easy in our lab using AI (the Automated Installer). Once you have a working install, these are some of the things you need to do to get everything working. Most of this information is from the various internal and external wikis (see above), but I'm going to write it down in case anybody else wants to play.
- Set up a static IP and disable NWAM.
$ svcadm disable svc:/network/physical:nwam
$ svcadm enable svc:/network/physical:default
- Grab the EA2 Sun Ray Server Software release, then gunzip and untar it.
- Install SUNWdhcs, SUNWdhcsb, SUNWdhcm, SUNWmfrun, SUNWtltk and SUNWdtbas. This can be done using IPS:
$ pfexec pkg install SUNWdhcs SUNWdhcsb SUNWdhcm
$ pfexec pkg install SUNWmfrun SUNWtltk SUNWdtbas
- Install the software using utinstall as normal.
- If you are running a build earlier than 115 (that includes the release version of 2009.06) then you will need to work around bug 6822673 by editing /etc/opt/SUNWut/loginGUI.start:
197c197
< $LOGIN_GUI_PROG -l "$LOGIN_TYPE" "$@" &
---
> LANG=C $LOGIN_GUI_PROG -l "$LOGIN_TYPE" "$@" &
- Edit /etc/pam.conf and add the pam stack for gdm support.
# START: To support gdm on SRSS, added following by hand...
gdm auth requisite /opt/SUNWut/lib/pam_sunray_hotdesk.so.1
gdm auth requisite /opt/SUNWut/lib/sunray_get_user.so.1 property=user
gdm auth required /opt/SUNWut/lib/pam_sunray_amgh.so.1
gdm auth sufficient /opt/SUNWkio/lib/pam_kiosk.so log=user ignoreuser
gdm auth requisite /opt/SUNWkio/lib/pam_kiosk.so log=user
gdm auth sufficient /opt/SUNWut/lib/pam_sunray.so
gdm auth requisite /opt/SUNWut/lib/sunray_get_user.so.1 prompt
gdm auth required /opt/SUNWut/lib/pam_sunray_amgh.so.1 clearuser
gdm auth requisite pam_authtok_get.so.1
gdm auth required pam_dhkeys.so.1
gdm auth required pam_unix_cred.so.1
gdm auth required pam_unix_auth.so.1
gdm account sufficient /opt/SUNWkio/lib/pam_kiosk.so log=user
gdm account sufficient /opt/SUNWut/lib/pam_sunray.so
gdm account requisite pam_roles.so.1
gdm account required pam_unix_account.so.1
gdm session requisite /opt/SUNWut/lib/pam_sunray_hotdesk.so.1
gdm session required /opt/SUNWkio/lib/pam_kiosk.so log=user
gdm session required pam_unix_session.so.1
gdm password required pam_dhkeys.so.1
gdm password requisite pam_authtok_get.so.1
gdm password requisite pam_authtok_check.so.1
gdm password required pam_authtok_store.so.1
# END: To support gdm on SRSS
- Reboot the system.
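The build check behind the 6822673 workaround can be scripted. What follows is only a minimal sketch: it assumes the build string has the form snv_<number> with an optional trailing letter (for example snv_111b), and that uname -v reports it on your install. The function names are made up for illustration.

```shell
# Sketch only: decide whether the loginGUI.start workaround for bug 6822673
# is needed. Assumes a build string such as "snv_111b" or "snv_122".

# Print the numeric part of a build string, e.g. "snv_111b" prints 111.
build_number() {
    printf '%s\n' "$1" | sed -n 's/^snv_\([0-9][0-9]*\).*$/\1/p'
}

# Succeed (exit status 0) when the build is older than 115.
# Fails for strings that do not look like snv_<number>.
needs_login_workaround() {
    n=$(build_number "$1")
    [ -n "$n" ] && [ "$n" -lt 115 ]
}

# Example use on a live system (assumes uname -v reports the build):
#   if needs_login_workaround "$(uname -v)"; then
#       echo "edit /etc/opt/SUNWut/loginGUI.start and add LANG=C"
#   fi
```

A build string that cannot be parsed is treated as not needing the workaround, so check by hand if the output of uname -v looks unusual.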
At this point you can configure the Sun Ray server as you're used to: do your utpolicy settings, set utcrypto up to your preferences, and all the normal stuff that you do when setting up a Sun Ray server.
On a basic level that should be it - almost everything works fine. We have NSCM, RHA and AMGH working correctly and, since we configured it, server selection. When we find things that don't work, we log bugs. It's certainly nice to have OpenSolaris on my desktop at work.
Posted at 09:28PM Sep 22, 2009 by James Legg in Sun | Comments[0]
OpenSolaris Support and Extra Repository Certificate Expiry checking
So you paid for support for OpenSolaris and have a certificate for the supported IPS repository at pkg.sun.com/supported, or you have got yourself a free certificate from pkg.sun.com/extras so that you can add VirtualBox and/or Adobe Flash, but you want to get some advance warning when your certificate is due to expire?
Take a look at check_cert.sh and see if it does what you want.
Edit the top of the script so that it has a sensible e-mail and number of months to check for:
EMAIL_ADDR=user@example.com
WARN_MONTHS=3
Run it something like this from a regular cron job:
0 0 10,20 * * check_cert.sh /var/pkg/ssl/OpenSolaris_standard_support.certificate.pem
0 0 10,20 * * check_cert.sh /var/pkg/ssl/OpenSolaris_extras.certificate.pem
Your system will have to be set up to let mailx send email someplace sensible for this script to work - and configuring sendmail is a little bit beyond the scope of this blog post.
Please do let me know if you spot any bugs or think there is a better way.
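For anybody curious about the idea behind this kind of check, here is a minimal sketch. It is not the real check_cert.sh: the function names are made up, a month is approximated as 30 days, and it assumes openssl is installed and supports the x509 -checkend option.

```shell
# Sketch only, not the real check_cert.sh.

# Convert a number of months to seconds, treating a month as 30 days.
months_to_seconds() {
    echo $(( $1 * 30 * 24 * 60 * 60 ))
}

# Succeed when the certificate in file $1 expires within $2 months.
# "openssl x509 -checkend N" exits non-zero if the certificate expires
# within N seconds. An unreadable file also counts as "expiring" here.
cert_expires_within() {
    ! openssl x509 -checkend "$(months_to_seconds "$2")" -noout -in "$1" >/dev/null
}

# Hypothetical use, mirroring the EMAIL_ADDR and WARN_MONTHS settings above:
#   cert_expires_within /var/pkg/ssl/OpenSolaris_extras.certificate.pem 3 &&
#       echo "certificate expires soon" | mailx -s "cert warning" user@example.com
```

Treating an unreadable certificate as a warning is a deliberate choice in this sketch, so a missing file produces an email rather than silence.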
Posted at 10:36PM Jul 13, 2009 by James Legg in Personal |
"S.M.A.R.T. Capable but command failed" aka ZFS saves the day!
I get to join the ranks of smug people that have had ZFS save their bacon/data:
So my system threw a bit of a wobbly when I unplugged it to move it (it doesn't get turned off much these days). It seems that one of the Western Digital WS5000YS RE (that's RAID Edition) drives failed to come back to life after being spun down. The first clue was the "S.M.A.R.T. Capable but command failed" error in the BIOS POST messages; the second was the zio_read_data_fail messages and getting dropped to the grub> prompt instead of the boot selection screen.
Ahh, but surely ZFS will save the day? Well yes, but the key to all situations like this is preparation - and of course I had not prepared properly. Hey, this is my personal home workstation, and apart from being unable to stream music and movies to the PS3, my users are pretty laid back about outages.
The problem is that although I have a mirrored ZFS root pool (rpool), I didn't install the GRUB bootloader onto the second disk, despite the output from attaching a second mirror to the rpool specifically telling you to do this.
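For anyone in the same position, the skipped step can be sketched roughly as below. The helper only prints the commands so they can be reviewed before being run. The function name is made up, the slice names are the two halves of the mirror shown in the zpool status output later in this post, and the stage1 and stage2 paths assume the running system's /boot/grub.

```shell
# Sketch only: print (do not run) an installgrub command for each
# root pool slice given as an argument, so that every half of the
# mirror ends up with a boot loader.
print_installgrub_cmds() {
    for slice in "$@"; do
        echo "pfexec installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/$slice"
    done
}

# Example: print_installgrub_cmds c6d0s0 c9d0s0
```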
I had some weirdness and had to reset the BIOS to defaults and eventually remove the bad drive to get the system to boot from the 2009.06 Live CD. I suspect this was due to the number of errors I got on the console from the disk as it booted.
So once I had booted off the 2009.06 Live CD I could mount my pool and try to work around the damage.
First, import the pool using the -f flag to override the fact that it is technically still in use.
jack@opensolaris:~$ pfexec zpool import -f rpool
cannot share 'rpool/export/home': smb add share failed
cannot share 'rpool/export/home/james': smb add share failed
cannot share 'rpool/export/home/media': smb add share failed
I can safely ignore the messages about being unable to share out my filesystems - I don't think the Live CD has CIFS, so the SMB shares fail.
Make a place to mount the root file system. I'm mounting 111a as it is the latest BE on this system (it has been broken since before 2009.06 came out - I told you my users were laid back).
jack@opensolaris:~$ mkdir /tmp/mnt
jack@opensolaris:~$ pfexec mount -F zfs rpool/ROOT/111a /tmp/mnt
Now the bit that I should have done before I had the disk fail:
jack@opensolaris:/rpool/boot/grub$ pfexec installgrub -m /tmp/mnt/boot/grub/stage1 /tmp/mnt/boot/grub/stage2 /dev/rdsk/c9d0s0
Updating master boot sector destroys existing boot managers (if any).
continue (y/n)?y
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 271 sectors starting at 50 (abs 16115)
stage1 written to master boot sector
jack@opensolaris:/rpool/boot/grub$
Unmount and reboot:
jack@opensolaris:~$ pfexec umount /tmp/mnt
jack@opensolaris:~$ pfexec reboot
I am back in a working boot environment, though only running on a single disk. It looks like Western Digital will accept this disk as an RMA, so it only remains to be seen how long I have to wait. I think I'm getting itchy already with only a single disk between me and data loss - I wonder how much a temporary replacement would cost.
james@frank ~ $ zpool status
pool: rpool
state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
see: http://www.sun.com/msg/ZFS-8000-2Q
scrub: scrub in progress for 0h23m, 17.09% done, 1h54m to go
config:
NAME STATE READ WRITE CKSUM
rpool DEGRADED 0 0 0
mirror DEGRADED 0 0 0
c6d0s0 UNAVAIL 0 0 0 cannot open
c9d0s0 ONLINE 0 0 0
errors: No known data errors
james@frank
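A degraded mirror like this is easy to miss if nobody looks at the pool. As a minimal sketch, the filter below reads zpool status output on standard input and prints every entry in the config table whose state is not ONLINE. It assumes the table layout shown above (a NAME STATE READ WRITE CKSUM header, ending at the errors: line), and the function name is made up.

```shell
# Sketch only: list entries in `zpool status` output (read from stdin)
# whose STATE column is not ONLINE.
unhealthy_devices() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The "errors:" line marks the end of the table.
        $1 == "errors:" { in_config = 0 }
        # Inside the table, print the name and state of anything not ONLINE.
        in_config && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}

# Example: zpool status rpool | unhealthy_devices
```

Run against the output above, this would report rpool, mirror and c6d0s0, since the pool and the mirror vdev are themselves marked DEGRADED.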
Posted at 10:51PM Jun 08, 2009 by James Legg in Personal | Comments[2]
Sheffield University Tech Day Slides
As promised to those that attended the Sun Tech Day and install fest at Sheffield University attached are my slides.
http://blogs.sun.com/jameslegg/resource/sheff-tech-demo-final.pdf
If anybody has any questions feel free to drop me a line and I will do my best to answer them. (though I may be off line for the next week or so as I am on holiday)
Thanks to all that attended and especially to the organizers for making me welcome. I hope that you found the talk interesting and informative.
Posted at 05:24PM Mar 21, 2009 by James Legg in Sun | Comments[1]
So you want to dual boot Linux with OpenSolaris 2008.11?
How to get back your Linux grub entries after installing OpenSolaris 2008.11 on a system.
Posted at 11:00PM Mar 18, 2009 by James Legg in Personal | Comments[3]
Progress
My dad lent me a book the other day (a 1972 edition of Starship Troopers) - this punchcard was the bookmark I found in it. According to Wikipedia it's an IBM 80 that was originally designed in 1928. That's an 80-year-old bit of computer-related technology. Next to it is the computer I carry with me in my pocket at all times, capable of voice, video and VOIP calls, mobile data for my laptop, email, web and calendar, and it can pinpoint my current location on this planet.
It doesn't work when you get it wet - but then I bet the punch card wouldn't either.

Posted at 11:26PM Nov 23, 2008 by James Legg in Personal | Comments[3]
Mobile Internet under OpenSolaris
I've just spent a couple of hours discovering a bit about how ppp is set up under Solaris. I now know enough to get a data connection to Vodafone UK using a USB cable, my Nokia E71 and my laptop (running OpenSolaris 2008.05 build 99).
Solaris supports a lot of USB-based modems using the usbsacm driver. For my Nokia E71 I plugged it in (selecting PC Suite on the phone from the pop-up menu) and a serial device appeared as /dev/term/0 (and /dev/term/1). For simplicity I created a link using:
james@ickle ~ $ ln -s /dev/term/0 /dev/e71
pppd appears to be the best choice for a modem connection under Solaris - I assume it was once used for modems and ISDN links when they were more common.
To configure pppd to work with Vodafone I created the following files. The extra init strings in /etc/ppp/vodafone-chat are from when I set wvdial up under Linux - I'm not sure how important they are (and I haven't tested yet).
james@ickle ~ $ touch /etc/ppp/options
james@ickle ~ $ cat /etc/ppp/vodafone-chat
'' 'ATZ'
'OK' 'ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0'
'OK' 'AT+CGDCONT=1,"IP","internet"'
'OK' 'ATD*99#'
CONNECT ''
james@ickle ~ $ cat /etc/ppp/peers/vodafone
modem
e71 # use this device (ln -s /dev/term/0 /dev/e71)
460800 # baud rate
noauth # do not authenticate the ISP's identity (client)
noipdefault # assume no IP address; get it from ISP
defaultroute # install default route; ISP is Internet gateway
usepeerdns
noccp # ISP doesn't support free compression
novj
user "web" # username for vodafone gprs
nodetach
show-password
crtscts
connect "/usr/bin/chat -V -t15 -f /etc/ppp/vodafone-chat" # dial into ISP
james@ickle ~ $
To bring up the connection, execute the pppd call command:
james@ickle ~ $ pppd call vodafone
ATZ
OK
ATQ0 V1 E1 S0=0 &C1 &D2 +FCLASS=0
OK
AT+CGDCONT=1,"IP","internet"
OK
ATD*99#
CONNECTSerial connection established.
Using interface sppp0
Connect: sppp0 <--> /dev/e71
possibly broken peer detected; restarting LCP
LCP: Rcvd Code-Reject for Identification id 46
local IP address 10.49.31.29
remote IP address 10.6.6.6
primary DNS address 10.203.65.68
secondary DNS address 10.203.65.68
As I used the nodetach option in my /etc/ppp/peers/vodafone config file, pppd stays in the foreground, so you can just use Ctrl-C to disconnect. If you want, you could just do a pkill pppd instead.
^CTerminating on signal 2.
Connection terminated.
Connect time 0.8 minutes.
Sent 513 bytes (13 packets), received 364 bytes (10 packets).
james@ickle ~ $
A couple of last things to note. The pppd daemon does not seem to sort out the DNS for you, so you either have to manually edit /etc/resolv.conf with the nameserver information or copy /etc/ppp/resolv.conf over your existing /etc/resolv.conf (back it up first!).
As I normally use NWAM to bring up my network connections, and I was at home when I was testing this, I disabled it while I was using the 3G modem connection - I don't know off the top of my head what would happen if it was active at the same time. Hopefully at some point NWAM will be expanded to cope with mobile phone data connections and data cards - I'm looking forward to the new GUI improvements in the next version.
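The DNS step mentioned above (back up, then copy) can be wrapped in a small helper. This is only a sketch: the function name and the .pre-ppp backup suffix are made up, and it assumes pppd has written the nameservers it received to the source file because of the usepeerdns option.

```shell
# Sketch only: back up $2 (if it exists) to $2.pre-ppp, then copy $1 over it.
use_ppp_dns() {
    src=$1
    dst=$2
    # Refuse to touch anything if pppd has not written its resolver file.
    [ -r "$src" ] || { echo "missing $src" >&2; return 1; }
    # Keep a copy of the current resolver settings before overwriting them.
    if [ -f "$dst" ]; then
        cp -p "$dst" "$dst.pre-ppp" || return 1
    fi
    cp "$src" "$dst"
}

# Hypothetical use, run with root privileges once the link is up:
#   use_ppp_dns /etc/ppp/resolv.conf /etc/resolv.conf
```

Copying the .pre-ppp file back over /etc/resolv.conf after disconnecting restores the original settings.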
Most of this information was cribbed from this guide on opensolaris.org.
Posted at 08:18PM Oct 19, 2008 by James Legg in Personal |
S3 Suspend
My home-built workstation, made up of an Asus mainboard with an Intel 965 chipset and an Nvidia 8800GS graphics card, now suspends (and resumes) as of OpenSolaris 2008.11 build 97. It has actually suspended since about build 93 but until now had never resumed. It just used to stick in some kind of half-resumed state without logging any messages (or bringing back the screen), and I must admit to never getting around to hooking it up to a serial console to debug it further.
To test if suspend works on your hardware try Randy's instructions to enable S3 resume.
I'm still poking at my x40 Thinkpad in an attempt to persuade it to suspend/resume; so far it is 50% working... (it goes to sleep... forever).
As standard when it boots it uses the vgatext driver, which doesn't appear to support the requisite DDI_SUSPEND (it has it, but supposedly only as a placeholder for debugging). I have messed about with /etc/driver_aliases to persuade it to load the i915 driver for the graphics card (Intel 845GM) instead of vgatext. This was achieved by adding the line
i915 "pciclass,030000"
before the vgatext lines.
After this the x40 suspends but still won't resume: it just beeps, spins the fans up and sits there, and I still get no logs. I will need to learn more about how it works to debug it; unfortunately there is no serial port on this laptop, so that option is out as well.
Posted at 09:47PM Sep 17, 2008 by James Legg in Personal |
New Blog Location - same miscellaneous content
I'm moving my fairly intermittent blog over to blogs.sun.com/jameslegg from its old home at jameslegg.blogspot.com. I won't be migrating content, but new stuff will appear here and the old site will just go unupdated. The type of content will stay the same, with random notes from whatever interesting tech I am playing with at the moment, but at least on blogs.sun.com I can mention any work-related bits as well.
Posted at 08:04AM Sep 10, 2008 by James Legg in Personal | Comments[0]