Monday Jan 05, 2009

Crossbow was integrated in Solaris (snv_105) last month and the bits are now available for download.

It opens the door to several interesting possibilities. One can now build a virtual stack around any service (such as HTTP, FTP, or NFS), transport protocol, or virtual machine (Containers, Xen domUs). Each virtual stack can be assigned a priority and bandwidth on a shared NIC without causing performance degradation. This is achieved with Virtual NICs (VNICs), which are built on top of dedicated resources such as Rx/Tx rings, DMA channels, kernel queues and threads, and CPUs. For the detailed design document, click here.

While the project spanned a couple of years, my first exposure to Crossbow was as a summer intern in 2007. I worked on a prototype for implementing VLANs as VNICs. It required moving VLAN processing from the DLS layer to the MAC layer, thereby simplifying the implementation of the GLDv3 framework. As a result, VLANs could now be assigned to virtual machines, and one could also do bandwidth control and fanout on VLAN interfaces. You can find more details here.

Two semesters, a Master's degree, and a thesis defense later, I was back at Sun working on Crossbow. Working on this open source project has been a fantastic learning experience. The project was nearing integration when I returned, so the primary focus has been on code consolidation for a cleaner design and better readability, as well as comprehensive testing to uncover and fix bugs and deliver a robust product.

It gave me exposure to, and the opportunity to work on, various aspects of the product. To name a few:
    - Flows: Configured on the basis of a Layer 3 and/or Layer 4 classification rule. One can associate properties such as maximum bandwidth and priority with packets matching the rule.
    - Etherstubs: Can be viewed as a special VNIC created without any underlying physical NIC. Since VNICs can be built on top of an etherstub, one can build arbitrarily complex virtual networks, and do performance analysis and debugging, on a single box.
  And needless to say, I learned innumerable nitty-gritty details of the implementation and how world-class operating system code is developed.

Project Crossbow's contribution is not limited to the features it currently supports. It is in fact a framework from which a feature-rich product can be delivered. Several extensions are in the pipeline, such as providing bandwidth guarantees, supporting real-time priorities, providing VNIC and flow abstractions over aggregations, and supporting Crossbow over non-Ethernet clients (e.g. WiFi).

Sunday Aug 17, 2008

Playing with the kernel can be tricky. One little mistake and you can end up with a system that won't even boot! If you are fiddling with a kernel module, it is best to keep a backup of its working version somewhere else on the same machine. I hadn't, and I ran into the following on reboot:

Use is subject to license terms.
DEBUG enabled

WARNING: The following files in / differ from the boot archive:

changed /kernel/drv/amd64/vnic
changed /kernel/drv/vnic

The recommended action is to reboot to the failsafe archive to correct
the above inconsistency. To accomplish this, on a GRUB-based platform,
reboot and select the "Solaris failsafe" option from the boot menu.
On an OBP-based platform, reboot then type "boot -F failsafe". Then
follow the prompts to update the boot archive. Alternately, to continue
booting at your own risk, you may clear the service by running:
"svcadm clear system/boot-archive"

Aug 14 16:42:01 svc.startd[100004]: svc:/system/boot-archive:default: Method "/lib/svc/method/boot-archive" failed with exit status 95.
Aug 14 16:42:01 svc.startd[100004]: system/boot-archive:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)
Requesting System Maintenance Mode
(See /lib/svc/share/README for more information.)
Console login service(s) cannot run

Root password for system maintenance (control-d to bypass)

If I had stored a working copy of the module, I could have simply booted into failsafe mode, overwritten the troublesome copy with the backup, and rebooted. However, I hadn't. The problem:

* Failsafe mode allowed mounting the disk in read/write mode, but the machine was not on the network.

* Normal mode booted with the aforementioned error messages. While the network was up, the disk was mounted in read-only mode. Thus, though sftp to my build machine worked, it would not let me fetch any files onto the read-only disk.

Thanks to Gopi, I could fix it. Just follow these simple steps:

1> cat /etc/vfstab

In the output, look for the "device to mount" entry whose mount point is "/". It should look similar to /dev/dsk/c1t0d0s0 (the matching /dev/rdsk path on the same line is the "device to fsck").

2> mount -o remount /dev/dsk/c1t0d0s0 # i.e. the device to mount whose mount point is "/"

You can now copy the correct modules from the network and get your system up and running!
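
If you would rather not pick the root entry out of the vfstab output by eye, the lookup in step 1 can be scripted. This is only a rough sketch: it assumes a root filesystem that is listed in /etc/vfstab with the standard whitespace-separated columns (device to mount, device to fsck, mount point, and so on), and root_device is a helper name I made up for illustration, not a Solaris command.

```shell
# root_device (hypothetical helper): print the "device to mount" (column 1)
# of the vfstab entry whose mount point (column 3) is "/".
# Lines starting with "#" are comments and are skipped.
# An alternate file can be passed as the first argument; default is /etc/vfstab.
root_device() {
    awk '$1 !~ /^#/ && $3 == "/" { print $1 }' "${1:-/etc/vfstab}"
}

# Possible use in step 2, from the maintenance shell:
#   mount -o remount "$(root_device)"
```

If the function prints nothing, the root filesystem is not listed in vfstab and these steps do not apply as written.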