A Penguin in Purple Clothing

Tom Duffy's Online Journal

20050210 Thursday February 10, 2005

Late night hacking at openib workshop

A whole bunch of guys from Trilabs came to the openib workshop in Sonoma. In order to make the conference more productive, they brought some hardware. Well, with a little power, brains, and beer, anything can happen. They hijacked the bar in the hotel and gathered around for an all-night hackathon. Did they ever get LinuxBIOS to initialize the VGA head? Well, sorta: if you forget to hook up the power to the card, X does come up, which it shouldn't. In any event, a fun night was had by all, including the HPC Microsoft guy who happened upon us and shared a beer.

(2005-02-10 18:21:42.0)

20050209 Wednesday February 09, 2005

openib.org conference

I am sitting at the openib developers conference in Sonoma, CA. A large collection of vendors and open source contributors are here talking about what has been done and what needs to be done to make openib world class. It has been a very helpful conference and there are some very good things coming. SDP will probably be checked into the tree on Thursday. [uk]DAPL will be ready, license-wise, in about a month. Both iSER and SRP are in the works.

(2005-02-09 09:05:41.0)

20050104 Tuesday January 04, 2005

OpenIB in Linus's development tree

Great News! Dave Miller took the OpenIB rev 5 patches posted to netdev and lkml and put them into his netdev tree. Subsequently, Andrew Morton and Linus Torvalds pulled the patches into their trees. So, now OpenIB will be in 2.6.11 and is already in 2.6.10-mm1.

If you want the cutting edge, go grab a copy of Linus's bk tree or download this patch on top of 2.6.10.

(2005-01-04 10:44:05.0)

20041216 Thursday December 16, 2004

OpenIB ready for inclusion

The OpenIB base codebase has been submitted for inclusion into the 2.6.11 branch of the Linux kernel.

This is the basic gen2 tree that has been developed over the past year or so through an open source process involving national labs and many vendors including TopSpin, Voltaire, Intel, IBM, Infinicon, Mellanox, and Sun.

The gen2 tree includes:

  • mthca driver (InfiniBand Host Channel Adapter)
    • Mellanox tavor support
    • Mellanox arbel support (in tavor compat mode)
  • middle layer software (abstraction of VAPI)
  • One upper layer protocol: IPoIB

A discussion is underway on both the Linux Kernel Mailing List and the OpenIB mailing list. If you have anything to say, please speak up.

(2004-12-16 08:13:46.0)

20041118 Thursday November 18, 2004

OpenIB news

Two good news items on the OpenIB front:

  1. OpenIB gets funding from the Department of Energy. http://news.zdnet.com/2100-9593_22-5446887.html
  2. The openib gen2 codebase is cleaned up and ready to be submitted to the Linux kernel mailing list next Monday. Here is Roland's post on this matter.

(2004-11-18 09:25:37.0)

20040921 Tuesday September 21, 2004

Debugging sucks (sometimes)

I spent all day trying to figure out a problem. I was trying to redo the threading in the OpenIB IPoIB code to use kthreads instead of a home-grown Topspin method, and I think I almost have it. There just seems to be some weirdness still going on. If my client cannot join the multicast group, it behaves fine: it backs off exponentially and dies properly when the interface is brought down.
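
The back-off behavior on a failed join can be sketched roughly like this. It is only an illustration in Python, not the IPoIB kernel code: the function name `backoff_delays` and the starting delay, cap, and attempt count are all made up for the example.

```python
def backoff_delays(initial=1.0, cap=16.0, attempts=6):
    """Return the wait (in seconds) before each retry of a failed join.

    All three parameters are hypothetical; the real driver's values
    are not given in this post.
    """
    delays = []
    delay = initial
    for _ in range(attempts):
        delays.append(delay)
        # Double the wait after every failure, but never exceed the cap.
        delay = min(delay * 2, cap)
    return delays


if __name__ == "__main__":
    print(backoff_delays())
```

The point of the cap is that a node that cannot join keeps retrying at a bounded interval instead of waiting ever longer.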

My problem seems to show up when the interface actually works. If I bring it up and it can join the multicast group, it does so in a two-stage process. The first stage works fine: it comes up and joins the group. It is in the second stage that it fails and locks up the console. I can log in via ssh on another screen, but the console locks hard. I don't have a fresh enough brain to figure this out tonight. Maybe tomorrow, with more coffee, I will be able to diagnose the problem.

So, goodnight folks, I am s00per frustrated.

(2004-09-21 22:09:50.0)

20040920 Monday September 20, 2004

IPoIB Works!

Woot!!!

[root@nisus ~]# uname -a
SunOS nisus.SFBay.Sun.COM 5.10 s10_64 sun4u sparc SUNW,Sun-Fire-280R
[root@nisus ~]# netstat -nr
 
Routing Table: IPv4
  Destination           Gateway           Flags  Ref   Use   Interface
-------------------- -------------------- ----- ----- ------ ---------
192.168.0.0          192.168.0.78         U         1      6  ibd1
10.0.0.0             10.6.98.78           U         1  87102  eri0
224.0.0.0            10.6.98.78           U         1      0  eri0
default              10.6.98.1            UG        1      1  eri0
127.0.0.1            127.0.0.1            UH        8 292360  lo0
[root@nisus ~]# ping 192.168.0.233
192.168.0.233 is alive

[root@sins-stinger-10 root]# uname -a
Linux sins-stinger-10 2.6.9-rc2openib #3 SMP Mon Sep 13 10:40:38 PDT 2004 x86_64 x86_64 x86_64 GNU/Linux
[root@sins-stinger-10 root]# ping 192.168.0.78
PING 192.168.0.78 (192.168.0.78) 56(84) bytes of data.
64 bytes from 192.168.0.78: icmp_seq=0 ttl=255 time=1.21 ms
64 bytes from 192.168.0.78: icmp_seq=1 ttl=255 time=0.130 ms
64 bytes from 192.168.0.78: icmp_seq=2 ttl=255 time=0.125 ms
 
--- 192.168.0.78 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 0.125/0.488/1.210/0.510 ms, pipe 2
[root@sins-stinger-10 root]# netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.6.98.0       0.0.0.0         255.255.255.0   U         0 0          0 eth0
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0          0 ib0.8001
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
0.0.0.0         10.6.98.1       0.0.0.0         UG        0 0          0 eth0

So, what does all this mean? Well, it means that Linux/OpenIB and Solaris are configured on the same IB network and are able to talk to each other via IPoIB. That is pretty cool. Thanks to Roland and Jeremy for all the help to make this first important compatibility step happen.

(2004-09-20 14:46:40.0)

20040916 Thursday September 16, 2004

IBSRM

Today, I fixed a bug with the OpenIB IPoIB implementation (this is the module that allows you to run an IP network on top of an InfiniBand network). There was an issue with /proc file creation that was b0rking unloading of the module. Roland has also been actively fixing up ipoib and it is getting pretty good.

Also, I finally got a couple of Arbel cards in after a mixup with the first order. Arbel is a PCI Express InfiniBand card. Since I needed to bring down my bringup system to install the card, it gave me a good opportunity to upgrade the Opterons from 244s to 248s, bringing them both from 1.8GHz to 2.2GHz. Q had already put the faster 3200 memory in there, so now I have a rocking machine for sure. With big, honking heat sinks and fans on there, it looks pretty cool too. I should take a picture and put it up here...

The picture here is a graphical representation of the IB network. The big blue dot is a sleipner IB switch. The red ones are host nodes. Even though there is nothing indicating what is what, the one with the SM on it is a Sparc64/Solaris box running IBSRM (Sun's subnet manager). The other red dots are Linux systems with openib on them. One is an Opteron system. The other a Sparc64/Linux box as mentioned previously.

(2004-09-16 16:25:50.0)

20040915 Wednesday September 15, 2004

OpenIB on sparc64/linux

One of my skunkworks projects for a while has been to play with sparc64/linux. I have a couple of old e250s in the lab and a blade 100 on my desk that run Debian Sid sparc64. Well, since Roland wanted to test more platforms for openib, I said I would check out sparc64. I got it to compile with a few minor tweaks.

On Monday, I spent a bit of time trying to debug an interaction between openib on Linux and Sun's InfiniBand subnet manager (ibsrm) on Solaris. I was getting byte swap issues on the wire because I was using an x86_64 client with a sparc64 host (x86_64 is little endian, sparc64 big).
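
For anyone who has not been bitten by this before, here is a small Python sketch of what a byte swap issue looks like. It is not taken from openib or ibsrm, and it does not say which side was at fault. The helper names and the value 0x8001 are just examples. It assumes a 16-bit field that is meant to travel in big-endian (network) order, which is how InfiniBand lays out multi-byte fields on the wire.

```python
import struct


def pack_u16(value, byteorder):
    """Pack a 16-bit unsigned value as 'big' or 'little' endian bytes."""
    fmt = ">H" if byteorder == "big" else "<H"
    return struct.pack(fmt, value)


def read_as_big_endian(raw):
    """Interpret two raw bytes the way a big-endian receiver would."""
    return struct.unpack(">H", raw)[0]


value = 0x8001  # arbitrary example value

# Sender converts to network order first: the receiver sees the same value.
correct = read_as_big_endian(pack_u16(value, "big"))

# Sender writes its native little-endian layout: the bytes arrive swapped.
swapped = read_as_big_endian(pack_u16(value, "little"))

if __name__ == "__main__":
    print(hex(correct), hex(swapped))
```

A big-endian machine like sparc64 gets away with skipping the conversion, which is why running both ends on the same byte order hides this class of bug.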

I thought it might be a problem with ibsrm, so I asked Jeremy from the East coast to take a look. While he was setting up a Linux machine to test with it, I thought I would try a different tack.

If I ran openib on sparc64, there wouldn't be an endian issue, so I thought I might get a little further.

I needed to upgrade the firmware on the Tavor IB card and I had two options: pull the card and stick it into a Solaris box or try to get the recently open sourced tavor flash tool working on sparc64/Linux. The latter seemed more interesting, plus I wouldn't have to go into the lab and touch hardware :-) After Roland helped work out a struct packing issue, the tool compiled and I took the leap. Luckily, it worked and I didn't end up with a dead card.
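
As a generic illustration of what a struct packing issue is (not the actual bug in the flash tool, whose details are not covered here), the Python sketch below compares the size of the same two fields with and without native alignment padding. The record layout, one 1-byte field followed by one 4-byte field, is hypothetical.

```python
import struct

# Hypothetical record: a 1-byte field followed by a 4-byte field.
NATIVE = "@BI"  # native alignment: padding may be inserted before the int
PACKED = "=BI"  # standard sizes with no alignment padding


def layout_sizes():
    """Return (native_size, packed_size) in bytes for the example record."""
    return struct.calcsize(NATIVE), struct.calcsize(PACKED)


if __name__ == "__main__":
    native, packed = layout_sizes()
    print("native:", native, "packed:", packed)
```

When code assumes one layout and the compiler or platform produces the other, field offsets no longer line up with the on-disk or on-card format.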

Next was to get the tavor driver working on sparc64. This required a bit more, as the openib code mostly assumed that the PCI configuration was x86ish. Sparc64 has an IOMMU and only allows you to DMA into a 32-bit address space. In any event, once again Roland pulled through with a quick hack to take out the x86 assumptions, and the driver loaded properly.

Next on the plate: get IPoIB working with IBSRM.

(2004-09-15 18:35:17.0)
