Thursday February 10, 2005
A Penguin in Purple Clothing
Tom Duffy's Online Journal

Late night hacking at openib workshop

A whole bunch of guys from Trilabs came to the openib workshop in Sonoma. In order to make the conference more productive, they brought some hardware. Well, with a little power, brains, and beer, anything can happen. They hijacked the bar in the hotel and grouped around for an all-night hackathon. Did they ever get LinuxBIOS to initialize the VGA head? Well, sorta: if you forget to hook up the power to the card, X does come up, which it shouldn't. In any event, a fun night was had by all, including the HPC Microsoft guy who happened upon us and shared a beer.
(2005-02-10 18:21:42.0) Permalink Comments [1]

I am sitting at the openib developers conference in Sonoma, CA. A large collection of vendors and open source contributors are here talking about what has been done and what needs to be done to make openib world class. It has been a very helpful conference and there are some very good things coming. SDP will probably be checked into the tree on Thursday. [uk]DAPL will be ready, license-wise, in about a month. Both iSER and SRP are in the works.
(2005-02-09 09:05:41.0) Permalink Comments [0]

OpenIB in Linus's development tree

If you want the cutting edge, go grab a copy of Linus's bk tree or download this patch on top of 2.6.10.

The OpenIB base codebase has been submitted for inclusion into the 2.6.11 branch of the Linux kernel. This is the basic gen2 tree that has been developed over the past year or so through an open source process involving national labs and many vendors including TopSpin, Voltaire, Intel, IBM, Infinicon, Mellanox, and Sun. The gen2 tree includes:
A discussion is underway both on the Linux Kernel Mailing List and on the OpenIB mailing list. If you have anything to say, please speak up.
(2004-12-16 08:13:46.0) Permalink

Two good news items on the OpenIB front:

I spent all day trying to figure out this problem. I was trying to redo the threading in the OpenIB IPoIB code to use kthreads instead of a Topspin home-grown method. And in fact, I think I almost have it. It is just that there seems to be some weirdness still going on.

If my client cannot join the multicast group, it behaves fine. It backs off according to the exponential algorithm and dies properly when the interface is brought down. My problem seems to stem from when the interface actually works OK. If I try to bring it up and it can join the multicast group, it does so in a two-stage process. The first stage works fine: it can come up and join the group. It is when it goes to the second stage that it fails. It locks up the console. I can log in via ssh on another screen, but the console locks hard.

I don't have a fresh enough brain to figure this out tonight. Maybe tomorrow, with more coffee, I will be able to diagnose the problem. So, goodnight folks, I am s00per frustrated.
(2004-09-21 22:09:50.0) Permalink

Woot!!!

[root@nisus ~]# uname -a
SunOS nisus.SFBay.Sun.COM 5.10 s10_64 sun4u sparc SUNW,Sun-Fire-280R
[root@nisus ~]# netstat -nr

Routing Table: IPv4
  Destination           Gateway           Flags  Ref   Use   Interface
-------------------- -------------------- ----- ----- ------ ---------
192.168.0.0          192.168.0.78         U         1      6 ibd1
10.0.0.0             10.6.98.78           U         1  87102 eri0
224.0.0.0            10.6.98.78           U         1      0 eri0
default              10.6.98.1            UG        1      1 eri0
127.0.0.1            127.0.0.1            UH        8 292360 lo0
[root@nisus ~]# ping 192.168.0.233
192.168.0.233 is alive

[root@sins-stinger-10 root]# uname -a
Linux sins-stinger-10 2.6.9-rc2openib #3 SMP Mon Sep 13 10:40:38 PDT 2004 x86_64 x86_64 x86_64 GNU/Linux
[root@sins-stinger-10 root]# ping 192.168.0.78
PING 192.168.0.78 (192.168.0.78) 56(84) bytes of data.
64 bytes from 192.168.0.78: icmp_seq=0 ttl=255 time=1.21 ms
64 bytes from 192.168.0.78: icmp_seq=1 ttl=255 time=0.130 ms
64 bytes from 192.168.0.78: icmp_seq=2 ttl=255 time=0.125 ms

--- 192.168.0.78 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 0.125/0.488/1.210/0.510 ms, pipe 2
[root@sins-stinger-10 root]# netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.6.98.0       0.0.0.0         255.255.255.0   U         0 0          0 eth0
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0          0 ib0.8001
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
0.0.0.0         10.6.98.1       0.0.0.0         UG        0 0          0 eth0

So, what does all this mean? Well, it means that Linux/OpenIB and Solaris are configured on the same IB network and are able to talk to each other via IPoIB. That is pretty cool. Thanks to Roland and Jeremy for all the help to make this first important compatibility step happen.
(2004-09-20 14:46:40.0) Permalink Comments [1]

Also, I got a couple of Arbel cards in finally after a mixup with the first order. Arbel is a PCI Express InfiniBand card. Since I needed to bring down my bringup system to install the card, it gave me a good opportunity to upgrade the Opterons from 244s to 248s, bringing them both from 1.8GHz to 2.2GHz. Q had already put the faster 3200 memory in there, so now I have a rocking machine for sure. With big, honking heat sinks and fans on there, it looks pretty cool too. I should take a picture and put it up here...

The picture here is a graphical representation of the IB network. The big blue dot is a sleipner IB switch. The red ones are host nodes. Even though there is nothing indicating what is what, the one with the SM on it is a Sparc64/Solaris box running IBSRM (Sun's subnet manager). The other red dots are Linux systems with openib on them. One is an Opteron system. The other is a Sparc64/Linux box, as mentioned previously.
(2004-09-16 16:25:50.0) Permalink

One of my skunx works projects for a while has been to play with sparc64/linux. I have a couple of old e250s in the lab and a blade 100 on my desk that run Debian Sid sparc64. Well, since Roland wanted to test more platforms for openib, I said I would check out sparc64. I got it to compile with a few minor tweaks.

On Monday, I spent a bit of time trying to debug an interaction between openib on Linux and Sun's InfiniBand subnet manager (ibsrm) on Solaris. I was getting byte swap issues on the wire because I was using an x86_64 client with a sparc64 host (x86_64 is little endian, sparc64 big). I thought it might be a problem with ibsrm, so I asked Jeremy from the East coast to take a look. While he was setting up a Linux machine to test with it, I thought I would try a different tack. If I ran openib on sparc64, there wouldn't be an endian issue, so I thought I might get a little further.
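
For anyone who has not been bitten by this class of bug, here is a minimal Python sketch of what a missed byte swap looks like. It assumes a 16-bit field sent in big-endian (network) order; the function names and the value 0x0012 are made up for illustration and are not taken from openib or ibsrm.

```python
import struct
import sys


def to_wire16(value: int) -> bytes:
    """Pack a 16-bit value in big-endian (network) byte order."""
    return struct.pack(">H", value)


def from_wire16(wire: bytes) -> int:
    """Decode a big-endian 16-bit field correctly on any host."""
    return struct.unpack(">H", wire)[0]


def misread_as_little_endian(wire: bytes) -> int:
    """Decode the same bytes as little-endian, i.e. a forgotten byte swap."""
    return struct.unpack("<H", wire)[0]


value = 0x0012  # hypothetical field value, for illustration only
wire = to_wire16(value)

print("host byte order:", sys.byteorder)
print("bytes on the wire:", wire.hex())
print("decoded correctly:", hex(from_wire16(wire)))
print("decoded without the swap:", hex(misread_as_little_endian(wire)))
```

A big-endian host can read such a field straight out of memory, so the missing swap never shows up there. A little-endian host that skips the conversion sees 0x1200 instead of 0x12, which is why the same code can look fine on one architecture and broken on the other.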

I needed to upgrade the firmware on the Tavor IB card and I had two options: pull the card and stick it into a Solaris box, or try to get the recently open sourced tavor flash tool working on sparc64/Linux. The latter seemed more interesting, plus I wouldn't have to go into the lab and touch hardware :-) After Roland helped work out a struct packing issue, the tool compiled and I took the leap. Luckily, it worked and I didn't end up with a dead card.

Next was to get the tavor driver working on sparc64. This required a bit more, as the openib code mostly assumed that the PCI configuration was x86ish. Sparc64 has an IOMMU and only allows you to DMA into a 32-bit address space. In any event, once again Roland pulled through with a quick hack to take out the x86 assumptions, and the driver loaded properly.

Next on the plate: get IPoIB working with IBSRM.
(2004-09-15 18:35:17.0) Permalink