Sunday June 19, 2005
Continuing my discussion of the Solaris 1394 Software Framework, in this post I'm going to go into some detail on the methods by which a 1394 target driver can register and deregister itself with the core framework. And, in the process, I will also touch on the Solaris event notification mechanisms leveraged by the framework.
All 1394 target drivers must register themselves with the framework in order to operate properly. The 1394 framework will make an association between the device driver instance for the target and the corresponding HAL driver instance (called the 'parent') for the adapter to which it is attached.
In addition, the 1394 framework will allocate resources (internally) to track the target driver state (in the target driver 'handle') and return useful information about the current state of the target device on the 1394 bus and DMA and/or interrupt properties of the parent HAL.[1]
/*
* Function: t1394_attach()
* Input(s): dip The dip given to the target driver
* in its attach() routine
* version The version of the target driver -
* T1394_VERSION_V1
* flags The flags parameter is unused (for now)
*
* Output(s): attachinfo Used to pass info back to target,
* including bus generation, local
* node ID, dma attribute, etc.
* t1394_hdl The target "handle" to be used for
* all subsequent calls into the
* 1394 Software Framework
*
* Description: t1394_attach() registers the target (based on its dip) with
* the 1394 Software Framework. It returns the bus_generation,
* local_nodeID, iblock_cookie and other useful information to
* the target, as well as a handle (t1394_hdl) that will be used
* in all subsequent calls into this framework.
*/
/* ARGSUSED */
int
t1394_attach(
dev_info_t *dip, /* supplied by the target */
int version, /* supplied by the target */
uint_t flags, /* supplied by the target */
t1394_attachinfo_t *attachinfo, /* filled in by the framework */
t1394_handle_t *t1394_hdl) /* returned to the target */
During a target driver's attach processing, it calls t1394_attach() to register with the 1394 Software Framework. The Framework initializes any necessary internal data structures and returns a t1394_hdl which the target driver uses with all other calls into the Framework. The Framework also returns additional information in attachinfo, which is needed by some target driver implementations.
/*
* t1394_attachinfo_t
* is filled in and returned by the 1394 Framework at attach time
* (returned from the call to t1394_attach()). This structure contains
* the t1394_localinfo_t structure described above, as well as the
* iblock cookie and the attributes necessary for DMA allocations, etc.
*/
typedef struct t1394_attachinfo_s {
ddi_iblock_cookie_t iblock_cookie;
ddi_device_acc_attr_t acc_attr;
ddi_dma_attr_t dma_attr;
t1394_localinfo_t localinfo;
} t1394_attachinfo_t;
/*
* t1394_localinfo_t
* is filled in and returned by the 1394 Framework at attach time
* (in the t1394_attachinfo_t structure returned from t1394_attach())
* to provide the local host nodeID and the current bus generation.
*/
typedef struct t1394_localinfo_s {
uint_t bus_generation;
uint_t local_nodeID;
} t1394_localinfo_t;
/*
* Function: t1394_detach()
* Input(s): t1394_hdl The target "handle" returned by
* t1394_attach()
* flags The flags parameter is unused (for now)
*
* Output(s): DDI_SUCCESS Target successfully detached
* DDI_FAILURE Target failed to detach
*
* Description: t1394_detach() unregisters the target from the 1394 Software
* Framework. t1394_detach() can fail if the target has any
* allocated commands that haven't been freed.
*/
/* ARGSUSED */
int
t1394_detach(
t1394_handle_t *t1394_hdl,
uint_t flags)
The target driver calls t1394_detach() to deregister from the 1394 Software Framework. Typically the target calls this from its detach(9E) routine.
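To make the attach/detach handshake concrete, here is a minimal user-space mock-up. It is only a sketch: the kernel types and constant values are simplified stand-ins, the mock framework bodies are invented for illustration, and the tdA_* names and the assumption that t1394_attach() returns DDI_SUCCESS on success are hypothetical. Only the two function signatures and the "detach fails while commands are still allocated" rule come from the comments above.

```c
#include <stddef.h>

#define DDI_SUCCESS      0      /* stand-in value, not the kernel header */
#define DDI_FAILURE      (-1)   /* stand-in value */
#define T1394_VERSION_V1 1      /* stand-in value */

typedef struct dev_info { int instance; } dev_info_t;  /* stand-in */
typedef void *t1394_handle_t;                           /* stand-in */

typedef struct t1394_localinfo_s {
    unsigned int bus_generation;
    unsigned int local_nodeID;
} t1394_localinfo_t;

/* Trimmed: the real structure also has iblock_cookie, acc_attr, dma_attr. */
typedef struct t1394_attachinfo_s {
    t1394_localinfo_t localinfo;
} t1394_attachinfo_t;

/* Hypothetical bookkeeping kept by the mock framework for one target. */
typedef struct mock_target {
    dev_info_t *dip;
    int cmds_allocated;
} mock_target_t;

static mock_target_t mock_slot;

/* Mock framework: register the target and report made-up bus state. */
int
t1394_attach(dev_info_t *dip, int version, unsigned int flags,
    t1394_attachinfo_t *attachinfo, t1394_handle_t *t1394_hdl)
{
    (void) flags;                       /* unused, as in the real interface */
    if (dip == NULL || version != T1394_VERSION_V1)
        return (DDI_FAILURE);
    mock_slot.dip = dip;
    mock_slot.cmds_allocated = 0;
    attachinfo->localinfo.bus_generation = 1;
    attachinfo->localinfo.local_nodeID = 0;
    *t1394_hdl = &mock_slot;
    return (DDI_SUCCESS);
}

/* Mock framework: refuse to detach while commands are still allocated. */
int
t1394_detach(t1394_handle_t *t1394_hdl, unsigned int flags)
{
    mock_target_t *t = *t1394_hdl;

    (void) flags;
    if (t == NULL || t->cmds_allocated != 0)
        return (DDI_FAILURE);
    t->dip = NULL;
    *t1394_hdl = NULL;
    return (DDI_SUCCESS);
}

/* Hypothetical target driver soft state. */
typedef struct tdA_state {
    t1394_handle_t hdl;
    t1394_attachinfo_t info;
} tdA_state_t;

/* What the 1394 part of a target's attach(9E) path might look like. */
int
tdA_attach(dev_info_t *dip, tdA_state_t *sp)
{
    return (t1394_attach(dip, T1394_VERSION_V1, 0, &sp->info, &sp->hdl));
}

/* What the 1394 part of a target's detach(9E) path might look like. */
int
tdA_detach(tdA_state_t *sp)
{
    return (t1394_detach(&sp->hdl, 0));
}
```

The point of the shape is that the target keeps the handle and the attachinfo in its own per-instance state, and that its detach path has to be prepared for t1394_detach() to fail.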
Target drivers may register callbacks for general 1394 Software Framework events by using the Solaris Event Framework. All calls to the Event Framework must be performed before the call to t1394_attach() is performed. For details about the Solaris Event Framework, see:
The following events are supported by the 1394 Framework:
void (*handler)(dev_info_t *dip, ddi_eventcookie_t cookie, void *arg,
void *impl_data);
The impl_data argument passed to the event callbacks associated with each of these events is a t1394_localinfo_t *, as described above.
Note: Within an event callback function, a target driver shouldn't invoke any procedure that blocks or sleeps. For example, an event callback function shouldn't issue any outgoing asynch request that has the CMD1394_BLOCKING flag set. (Yep, again... more on this in a later blog entry).
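As a rough illustration of a callback that follows this rule, the sketch below only copies the t1394_localinfo_t data and sets a flag, leaving anything that might block to another context. The dev_info_t and ddi_eventcookie_t typedefs are opaque stand-ins so the fragment compiles outside the kernel, and tdA_state_t, tdA_event_cb() and the work_pending flag are hypothetical names, not part of the Framework.

```c
#include <stddef.h>

typedef struct dev_info dev_info_t;                 /* opaque stand-in */
typedef struct ddi_eventcookie *ddi_eventcookie_t;  /* opaque stand-in */

typedef struct t1394_localinfo_s {
    unsigned int bus_generation;
    unsigned int local_nodeID;
} t1394_localinfo_t;

/* Hypothetical per-instance state of target driver "tdA". */
typedef struct tdA_state {
    unsigned int generation;    /* last bus generation seen */
    unsigned int node_id;       /* last local node ID seen */
    int work_pending;           /* set here, consumed elsewhere */
} tdA_state_t;

/*
 * Event callback with the handler signature shown above. It never blocks:
 * it records the new bus state and defers the real work. A real driver
 * would also protect this state with its own locking.
 */
void
tdA_event_cb(dev_info_t *dip, ddi_eventcookie_t cookie, void *arg,
    void *impl_data)
{
    tdA_state_t *sp = arg;
    const t1394_localinfo_t *li = impl_data;

    (void) dip;
    (void) cookie;
    if (sp == NULL || li == NULL)
        return;
    sp->generation = li->bus_generation;
    sp->node_id = li->local_nodeID;
    sp->work_pending = 1;   /* a context that may block picks this up */
}
```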
OK. So obviously there's more I could say here about the details of the Framework's implementation for tracking and coordinating target drivers and their events, but as I've said before, I want to first go through the Framework at a high level.
Next time, I'm going to go over the outgoing asynch interfaces: command structure allocation, command completion mechanisms (event-driven, blocking, polling), command types (read, write, lock), quadlet requests, block requests, etc. That's a ton of stuff to cover, so maybe it won't all get into a single blog entry, but anyway that's where I'm headed next.
Wednesday June 15, 2005
Yeah, that's right. Unfortunately, Sun is not yet able to open up our source code for our Solaris InfiniBand HCA driver. (One of my colleagues, Steve Rust, touches on this in his most recent blog entry.) Although we wrote all the code ourselves, we did it with access to info that we got under NDA. So we're still under obligation not to disclose anything. I sincerely hope we will soon be able to open it up too, because there is some really interesting code in there that Steve R. and I are really proud of. For now, though, I guess it is among those few OpenSolaris drivers which you can get only as a binary.
The driver itself is called tavor and it basically started out as my baby (Steve R. owns it now). After my work on the Solaris 1394 Software Framework (and a handful of aborted or "development only" projects with InfiniBand HCA's), I finally got an opportunity in early 2002 to design and implement my own driver, from the ground up. The driver was to be for the Mellanox InfiniHost MT23108 HCA device, which was going to be the central I/O component in a SPARC-based blade server platform (which we never ultimately shipped).
But although the driver started out life with a very specific purpose for a very specific (and since canceled) platform, we (the engineers) anticipated from the beginning that it would be valuable if it could also work well with plug-in cards. And today, a plug-in card is still the primary mechanism for adding InfiniBand to a system.
It took about a year and a half of design/implementation/testing before it was ready for putback into Solaris (Steve Rust's blog says August 6th, 2003 and I'll trust him, since he was our 'gatekeeper' for the entire Solaris InfiniBand Framework putback). Subsequent to that putback, there were bug fixes (obviously), enhancements for x86 and AMD64 support, the userland access support, and (most recently) support for Shared Receive Queues (SRQ) and for the new Mellanox InfiniHost III Ex MT25208 HCA device.
The latter half of that work above was done by Steve R. and was done subsequent to Solaris 10 release. (I had the "project lead" role, he did all the hard work.) But if you want to check out the fruits of Steve R's latest work - check out Solaris Express 04/05 for the latest 'tavor' bits.
I know we're both extremely proud of this code (and really do wish we could show it off). And it's got some really fun stuff in it: handling userland access to HCA resources (i.e. OS bypass for lower latency), extreme configurability (honestly probably too configurable), a fancy mechanism for keeping track of "Work Request Identifiers" (for which I recently received US Patent #6,901,463), and a cool queue pair number allocation/reuse scheme for which Steve R. and I have a patent pending.
But anyway, I probably sound like a tease, since the driver isn't yet available in source form. But, if you're interested in InfiniBand, there's still plenty of really excellent code in OpenSolaris to check out. (Check Steve Rust's latest blog entry "InfiniBand Support in OpenSolaris" for a good starter.)
And if you've got an InfiniBand HCA card (from any of a number of vendors - Sun, TopSpin, Mellanox, etc.), then you can see this driver attached to your hardware and you can use it. (Matter of fact, like I said, this same 'tavor' driver will also attach to and operate on the latest generation of Mellanox's PCI-Express-capable InfiniHost III Ex MT25208 card. So, if you've got a system with PCI-Express - there are a few out there and I know Sun's got plenty coming - then you can get some really kick-ass performance out of our IB stack.)
Also, if you want to read more about our Solaris IB stuff, here's a few blogs by some other colleagues of mine:
There's a ton of other engineers (dozens, literally) who've contributed to the Solaris InfiniBand Framework. But maybe they're a little shy? I can't seem to find blogs by any of them. Anyway, they should all be proud too. And I'm sure that they are happy to have you folks able to see their code now in OpenSolaris.
Shoot me a comment (below) if you've used our IB software, or if you've been poking through the code. I'm very curious to hear from folks on the other end about what they think of our work.
(2005-06-15 18:14:00.0)
Tuesday June 14, 2005
When I started at Sun about seven years ago, the Solaris 1394 Software Framework was my first project. A team of four, we designed, implemented, tested and putback to Solaris 8 (Update 2) in about a year and a half. Since then the code has been transitioned to others (like Alan Perry and Artem Kachitchkine), who continue to maintain and extend its functionality even today. Now, almost six years later, I finally get a chance to share this code with the world (and, hopefully, at least some interested readers.)
So I figured I'd start by giving a brief overview of the stack and the features that it provides and then talk a bit about the source files for the modules: how they're organized, what's in them, etc. Then (over the course of several posts) I'll get into the specifics of how to use each of the interfaces, gotchas for potential developers working with the source, some of the little bits of which I'm most proud, and maybe some discussion of what could be improved or extended in the existing code (and I'll be interested to hear what others think). This will not be an intro to the IEEE 1394 specification or the technology, nor will it be an introduction to writing Solaris device drivers (though I'll try to help anyone in any way I can). What follows will assume a certain familiarity with IEEE 1394 and with writing Solaris device drivers.
The Solaris 1394 Software Framework Device Driver Interface provides a set of kernel interface routines to facilitate access to devices on an IEEE 1394 bus. The interface routines are intended for use by 1394 device drivers, also referred to as target drivers.
There are two kinds of target drivers: class drivers and vendor-specific drivers. Class drivers adhere to a general standard for a particular kind of device and can drive any vendor's device that adheres to the same standard. For example, a class driver for the IEEE 1394 Digital Conferencing Camera specification can drive video conferencing cameras manufactured by a variety of vendors (even though each may have a different set of features). Vendor-specific drivers are built to drive a specialized non-standard device. The 1394 Software Framework supports both kinds of target drivers.
The 1394 Software Framework provides several features to support target drivers using the IEEE 1394 bus.
There are two sides of asynchronous I/O: issuing outgoing requests and handling incoming requests.
Outgoing Requests - The 1394 Framework provides the ability to send the basic set of IEEE 1394 asynchronous requests: read, write and lock. In addition to the IEEE 1394 defined set of lock request options, the Framework lock request interface also provides a set of bit and arithmetic functions.
For asynchronous requests, the 1394 Framework automatically determines the device's destination ID, sends the request using the local host's 1394 hardware interface, tracks the status of the request until the transaction completes, and supplies response information as needed.
Target drivers choose whether to: 1) block while waiting for the transaction to complete, 2) poll on the request completion status, or 3) have the 1394 Framework call a specified callback routine when the transaction completes. Using the poll and callback mechanisms, target drivers can issue several outstanding requests and poll or be notified for each completion.
Incoming Requests - The primary role of the 1394 Framework with respect to incoming asynchronous requests is to dispatch the request to the appropriate target driver or to handle the request on behalf of the appropriate target driver.
To support this, the 1394 Framework provides an allocation mechanism that target drivers use to reserve ranges of addresses within the 48-bit local node address space. Target drivers have several options available when allocating 1394 address space including the ability to specify a kernel virtual buffer to map to the allocated space. When the 'destination_offset' of an incoming request falls within an allocated address range, the 1394 Framework fulfills the request if possible, notifies the target driver if desired, and sends the response. Target drivers may also allocate 1394 address space with the characteristic that an incoming request to that space will be handled by hardware. In this case hardware directly accesses host memory bound to that address space, and transmits the appropriate response.
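The dispatch decision described above boils down to a range lookup on the 48-bit offset. The sketch below shows one simple way such a lookup could work. It is not the Framework's actual data structure or code: addr_range_t, addr_lookup() and the linear table are all hypothetical, and requiring the whole request to fit inside a single range is a simplifying assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define ADDR_48BIT_MAX 0xFFFFFFFFFFFFULL    /* top of the 48-bit space */

/* Hypothetical record of one allocated address range. */
typedef struct addr_range {
    uint64_t base;      /* first offset in the range */
    uint64_t len;       /* length of the range in bytes */
    int owner;          /* hypothetical ID of the owning target */
} addr_range_t;

/*
 * Return the range that fully contains the request
 * [destination_offset, destination_offset + req_len), or NULL when no
 * allocated range covers it.
 */
const addr_range_t *
addr_lookup(const addr_range_t *tbl, size_t n, uint64_t destination_offset,
    uint64_t req_len)
{
    size_t i;

    if (destination_offset > ADDR_48BIT_MAX || req_len == 0)
        return (NULL);
    for (i = 0; i < n; i++) {
        uint64_t off;

        if (destination_offset < tbl[i].base)
            continue;
        off = destination_offset - tbl[i].base;
        /* Written as subtractions so the comparison cannot overflow. */
        if (off < tbl[i].len && req_len <= tbl[i].len - off)
            return (&tbl[i]);
    }
    return (NULL);
}
```

A request that starts inside a range but runs past its end is treated as a miss here, which is one reasonable policy among several.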
Due to the potentially large volume of isochronous data and the critical isochronous timing needs, the 1394 Software Framework provides a mechanism designed to reduce call overhead and to maximize throughput.
Before starting isochronous I/O, a target driver sets up the overall sequence and structure of receive or transmit buffers, indicating other needs such as when the 1394 Framework should invoke a target driver's callback. Once isochronous I/O is started, the target driver can focus most of its time on handling the data.
The 1394 Framework mechanism for configuring isochronous I/O is the Isochronous Transfer Language (IXL). The IXL is a hardware independent set of control blocks that the target driver uses to direct isochronous DMA. The 1394 Software Framework converts the hardware independent IXL into the appropriate DMA directives for the local host 1394 interface hardware. If the interface hardware changes, the impact to the target driver is minimal or non-existent.
The 1394 Software Framework also facilitates peer to peer communication by tracking all target drivers with an interest in a particular isochronous stream, allocating a channel number and bandwidth as needed, and coordinating the target driver notification of stream starts and stops.
In addition to complying with the IEEE 1394 requirements for bus reset processing, such as cancelling pending asynchronous requests, the 1394 Framework provides several bus reset related features.
One of the most severe effects of a bus reset is the re-enumeration of all the nodes on the bus. The 1394 Software Framework assesses the post bus reset topology and determines the new node_IDs for all target driver instances. It can then reissue any uncompleted outgoing asynchronous requests on behalf of the issuing target driver, and each target driver can continue on without concern for their new node number.
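To picture what re-enumeration means for the bookkeeping, here is a small sketch that refreshes node IDs after a reset. It assumes targets are matched to the post-reset nodes by GUID, which stays fixed while node IDs change. That matching rule, and the names target_ent_t, found_node_t, remap_node_ids() and NODE_GONE, are illustrative assumptions rather than the Framework's internals.

```c
#include <stddef.h>
#include <stdint.h>

#define NODE_GONE (-1)  /* hypothetical marker: device no longer on the bus */

/* Hypothetical record the framework keeps per target driver instance. */
typedef struct target_ent {
    uint64_t guid;      /* stable across bus resets */
    int node_id;        /* changes across bus resets */
} target_ent_t;

/* Hypothetical result of post-reset discovery for one node. */
typedef struct found_node {
    uint64_t guid;
    int node_id;
} found_node_t;

/*
 * Refresh every target's node ID from the post-reset discovery results.
 * Returns the number of targets that are still present on the bus.
 */
size_t
remap_node_ids(target_ent_t *targets, size_t nt, const found_node_t *found,
    size_t nf)
{
    size_t i, j, present = 0;

    for (i = 0; i < nt; i++) {
        targets[i].node_id = NODE_GONE;
        for (j = 0; j < nf; j++) {
            if (found[j].guid == targets[i].guid) {
                targets[i].node_id = found[j].node_id;
                present++;
                break;
            }
        }
    }
    return (present);
}
```

Because the remap happens below the target driver, a request reissued after the reset can simply pick up the refreshed node ID.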
As part of the topology evaluation, the 1394 Software Framework also creates a speed map to determine the maximum packet speed between any two nodes. The Framework uses the speed map to select the most efficient speed for target driver outgoing asynchronous requests.
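The idea behind a speed map is that a packet between two nodes can go no faster than the slowest node on the path between them. The sketch below computes that for a tree-shaped bus. It is an illustration under stated assumptions, not the Framework's algorithm: the parent[] and speed[] arrays, path_speed(), and the use of plain Mb/s numbers are hypothetical, and the input is assumed to be a well-formed tree with a single root.

```c
#define MAX_NODES 63    /* a single 1394 bus holds at most 63 nodes */
#define NO_PARENT (-1)  /* parent[] entry of the root node */

/*
 * Return the fastest speed usable between nodes a and b: the minimum
 * speed over every node on the tree path joining them (both ends
 * included). Returns -1 on bad arguments or a malformed tree.
 */
int
path_speed(const int *parent, const int *speed, int n, int a, int b)
{
    int on_a_path[MAX_NODES] = { 0 };
    int min_from_a[MAX_NODES];  /* slowest node between a and node i */
    int cur, min, steps;

    if (n <= 0 || n > MAX_NODES || a < 0 || a >= n || b < 0 || b >= n)
        return (-1);

    /* Walk from a up to the root, recording the running minimum. */
    min = speed[a];
    for (cur = a, steps = 0; cur != NO_PARENT && steps < n;
        cur = parent[cur], steps++) {
        if (speed[cur] < min)
            min = speed[cur];
        on_a_path[cur] = 1;
        min_from_a[cur] = min;
    }

    /* Walk from b upward until we meet a's path (the common ancestor). */
    min = speed[b];
    for (cur = b, steps = 0; cur != NO_PARENT && steps < n;
        cur = parent[cur], steps++) {
        if (speed[cur] < min)
            min = speed[cur];
        if (on_a_path[cur])
            return (min < min_from_a[cur] ? min : min_from_a[cur]);
    }
    return (-1);
}
```

Running this for every pair of nodes would fill in a full speed map.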
In addition to the topology map and speed map which are part of 1394 bus manager duties, the 1394 Software Framework also contends for isochronous resource manager and bus manager. If it is bus manager, the Framework will ensure that the root is cycle master capable and optimize the gap count.
Another aspect of the Framework's topology evaluation is that it determines which devices, if any, have been removed from the bus and which ones have been added to the bus. For removed devices, the 1394 Software Framework calls into the Solaris Hotplug Framework to notify it that the device is offline. For added devices, the Framework reads the device's configuration ROM to determine the pertinent information, the Global Unique ID and often the Unit_Spec_Id and Unit_Sw_Version, and creates the Solaris "/devices" node using the Solaris Hotplug Framework interfaces.
To ensure that the Solaris 1394 Software Framework is loaded, target drivers must link with a dynamic dependency on the Framework misc module. This is done using the '-N' flag with ld:
ld -r -dy -Nmisc/s1394 -o target target1.o target2.o
Solaris device entries for IEEE 1394 devices are created based on the device's global unique ID. The format of the name uses a prefix of "unit@" followed by the GUID in hexadecimal. An example /devices pathname for device A above is as follows ("tdA" is target driver A's minor name):
/devices/pci@1f,4000/firewire@4,2/unit@0800460200000016,0:tdA
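For illustration, the node-name format in the path above can be reproduced with a few lines of C. This is only a sketch of the string format, not the Framework's code: format_unit_name() is a hypothetical helper, and the value after the comma is simply passed through because the text does not say what it represents.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build "unit@<16 lower-case hex digits of GUID>,<suffix>" into buf.
 * Returns 0 on success, -1 if the result would not fit.
 */
int
format_unit_name(char *buf, size_t len, uint64_t guid, unsigned int suffix)
{
    int n = snprintf(buf, len, "unit@%016" PRIx64 ",%x", guid, suffix);

    return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}
```

Calling format_unit_name() with the GUID 0x0800460200000016 and a suffix of 0 produces the "unit@0800460200000016,0" component seen in the example path.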
1. Unit_Spec_Id, Unit_Sw_Version
2. Node_Spec_Id, Node_Sw_Version
3. Node_Vendor_Id, Node_Hw_Version
4. Module_Spec_Id, Module_Sw_Version
5. Module_Vendor_Id, Module_Hw_Version
For further information on the layout of configuration ROM and the meaning of these values, refer to IEEE 1212-1994 Section 8 and IEEE 1394-1995 Section 8.3.2.5. For specific information about configuration ROM for a particular device class, refer to the device class specification.
After parsing configuration ROM and locating one of the pairs as shown above, the Solaris 1394 Software Framework provides the information to the hotplug framework. If a driver is configured to bind to the designated pair, the /devices and /dev entries are created and the driver's attach() routine is invoked. For example, to add a driver for a video conferencing camera which adheres to the 1394 Digital Camera Draft 1.04 (note that hexadecimal letters must be in lower case):
add_drv -n -i \"firewire00a02d,000100\" tdA
Where firewire is the hardware interface, 00a02d is the Unit_Spec_ID and 000100 is the Unit_Sw_Version.
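The binding string passed to add_drv is easy to build programmatically, which also takes care of the lower-case requirement. The helper below is a hypothetical sketch, not part of the Framework, and it assumes both values are 24 bits wide, as the six-digit fields in the example suggest.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build "firewire<spec_id>,<sw_version>" with each value printed as six
 * lower-case hex digits. Returns 0 on success, -1 if buf is too small.
 */
int
format_binding_name(char *buf, size_t len, uint32_t spec_id,
    uint32_t sw_version)
{
    /* Keep only the low 24 bits of each value (assumed field width). */
    int n = snprintf(buf, len, "firewire%06x,%06x",
        (unsigned int)(spec_id & 0xFFFFFF),
        (unsigned int)(sw_version & 0xFFFFFF));

    return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}
```

With a Unit_Spec_ID of 0x00A02D and a Unit_Sw_Version of 0x000100, this yields "firewire00a02d,000100", the same string used in the add_drv line above.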
All the source and headers for the Solaris 1394 Framework can be found under:
usr/src/uts/common/io/1394
usr/src/uts/common/sys/1394
The files themselves break down this way:
* t1394.c - 1394 Target Driver Interfaces
* s1394.c - 1394 Services Layer Initialization and Cleanup Routines
* s1394_addr.c - 1394 Address Space Routines
* s1394_asynch.c - 1394 Services Layer Asynch Communications Routines
* s1394_bus_reset.c - 1394 Services Layer Bus Reset Routines
* s1394_csr.c - 1394 Services Layer CSR and Config ROM Routines
* s1394_dev_disc.c - 1394 Services Layer Device Discovery Routines
* s1394_hotplug.c - 1394 Services Layer Hotplug Routines
* s1394_isoch.c - 1394 Services Layer Isoch Communications Routines
* s1394_misc.c - 1394 Services Layer Miscellaneous Routines
* nx1394.c - 1394 Services Layer Nexus Support Routines
* h1394.c - 1394 Services Layer HAL Interfaces
* t1394_errmsg.c - Utility function that targets can use to convert an
error code into a printable string.
* s1394_cmp.c - 1394 Services Layer Connection Management Procedures Support Routines
* s1394_fa.c - 1394 Services Layer Fixed Address Support Routines
(Currently used only for FCP support)
* s1394_fcp.c - 1394 Services Layer FCP Support Routines
* t1394.h
This is the primary header file for the 1394 Framework and it includes
all other header files listed below. In addition, it contains all 1394
Framework interface routine prototypes as well as all data structures and
defines beginning with the "t1394_" prefix. (n.b. This one's pretty
well-commented, if I say so myself.)
* cmd1394.h
This file contains all structures and defines for handling asynchronous
commands.
* id1394.h
This file contains all structures and defines for managing a local
isochronous DMA resource.
* ieee1394.h
This file contains general IEEE 1394 defines.
* ieee1212.h
This file contains general IEEE 1212 defines.
* ixl1394.h
This file contains all structures and defines for utilizing IXL programs.
* h1394.h
This file contains the structure and error codes used to communicate between
the HAL and the rest of the 1394 Software Framework.
* s1394.h
This file contains all of the structures used (internally) by the 1394
Software Framework.
* s1394_impl.h
This file contains typedefs and defines used by all 1394 Software Framework
files.
The source for our OpenHCI-compliant HAL driver can be found in:
usr/src/uts/common/io/1394/adapters
usr/src/uts/common/sys/1394/adapters
The files here are numerous, so I will hold off saying more about this driver until some later entries.
hci1394_extern.c hci1394_misc.c hci1394.c
hci1394_ioctl.c hci1394_ohci.c hci1394.conf
hci1394_async.c hci1394_isr.c hci1394_s1394if.c
hci1394_attach.c hci1394_ixl_comp.c hci1394_tlabel.c
hci1394_buf.c hci1394_ixl_isr.c hci1394_tlist.c
hci1394_csr.c hci1394_ixl_misc.c hci1394_vendor.c
hci1394_detach.c hci1394_ixl_update.c
hci1394_isoch.c hci1394_q.c
hci1394.h hci1394_extern.h hci1394_state.h
hci1394_async.h hci1394_ioctl.h hci1394_tlabel.h
hci1394_buf.h hci1394_isoch.h hci1394_tlist.h
hci1394_csr.h hci1394_ixl.h hci1394_tnf.h
hci1394_def.h hci1394_ohci.h hci1394_vendor.h
hci1394_descriptors.h hci1394_q.h hci1394_drvinfo.h
hci1394_rio_regs.h
And the source for the existing target drivers (mentioned above) can be found in:
usr/src/uts/common/io/1394/targets/av1394
usr/src/uts/common/io/1394/targets/scsa1394
So my basic plan is to continue with some discussion of driver attach() and detach() interfaces (see t1394_attach() and t1394_detach() in the source) and basic 1394 event processing. Then to move on to talk about the asynch interfaces, the isoch interfaces, and finally some of the miscellaneous interface routines. At this point, I thought I'd change the focus from a description of the interfaces and how to use them to a more detailed examination of some specific bits of internals code.
But I'm open to suggestions too. If you've read this far and feel like you may be interested in reading more, going through the code yourself, sharing your thoughts and comments, and end up with something specific you'd like to hear about... lemme know.
(2005-06-14 09:00:00.0)