SteveJay's Weblog

20060711 Tuesday July 11, 2006

"Andromeda" Blade Server - Congratulations

Congratulations to the team

Sun today announced the Sun Blade 8000 Modular System (code-named "Andromeda"). As the System Integration Lead for this platform, I want to take this opportunity to congratulate the entire engineering team (hardware team, software teams - including OS, BIOS, ILOM, and system diagnostics teams, SQA team, documentation team, services team, etc.) on a job well done. Customers are already very excited about this product.

The Sun Blade 8000 Modular System and its first server module, the Sun Blade x8400, are an impressive accomplishment for our team here on the East Coast.[1] You can read all about it here, here, here, etc... and you might have even caught the webcast of the announcement. What might not have been quite as obvious is that this modular system has a pretty robust and exciting roadmap of "pluggables" coming down the line as well.

Rock-solid foundation by design

By virtue of Andromeda's 'no compromises' design, we've got a rock-solid foundation in place for the dozens of new, innovative modules getting ready to roll. Can't say much, obviously, but you can use a little imagination. We've got multiple generations of compute module upgrades already in the pipeline, and we'll be further extending our I/O module offerings - both in the NEM form-factor and in the industry-standard PCIe ExpressModule form factor. (And we might even have a few other ideas in our bag-of-tricks that'll surprise more than a few folks about what we can do with this design.)

"If you can touch it, you can hot-plug it."

It's our support for the industry-standard PCIe ExpressModules form factor, though, that made me put this entry in my "PCI Express" category. This is because it's the Sun Blade 8000's service model, its complete 'hot-pluggability', that's the real winner for a modular computing system like ours. If you can touch it on our system, you can hot-plug it. This means fans, power supplies, disk drives... the typical stuff. But it also means the redundant chassis monitoring modules (CMMs) and, most importantly, the I/O modules - the NEMs and the EMs.[2]

And... read the FUD with a critical eye

Anyway, you'll likely hear a lot of noise from our competitors in the next few days and weeks. They'll tend to confuse the issues, of course, but I suppose that's their job.

They'll comment on how unsuccessful Sun's first blade server (the Sun Fire B1600) was... the implication being that Sun doesn't know how to do this right. They're wrong, of course. On the contrary, our team - although an entirely different team than the original B1600 team - fully deconstructed all these issues, and many others that our competitors had or still have, and took it all back to the drawing board.

But then they'll tell you it's "too big". Well, tell them we can fit two (2) of our systems in a standard rack (total of 160 cores in 38 rack units) and power and cool them using today's data center standards. Then ask them about theirs... can they match the compute/memory/IO density and be able to cool and power it? Not without some compromises.

And maybe they'll tell you "Sun doesn't do Windows". Well, hmm, don't tell my buddies on the Windows team. And pay no attention to the fact that we're WHQL certified (with more certifications on the way) and listed in the Microsoft Windows Server Catalog.

Or perhaps they'll try to tell you "Sun's missing out on Linux". Well, again, I think my colleagues on our Linux team would beg to differ. We've got Red Hat and SLES certifications too. And, of course, this is an x86 system, so should it really surprise anyone that we support all these OSes?[3]

Congratulations!

Anyway, this is a really exciting time for me and the team. Congrats again to everyone involved in this milestone. Let's keep it rolling!


[1]Despite the fact that this morning's product announcement took place in sunny San Francisco, the team responsible for the engineering design, SQA, and delivery of this system is our team here in wet, rainy Burlington, MA.

[2]The PCI Express I/O hot-plug on Andromeda is one of the aspects of the system of which I'm personally most proud. The fact that we were able to implement an entirely OS-neutral ACPI hot-plug implementation is still a source of personal pride. Actually, maybe I'll blog about this in more detail in the future.

[3]We just happen to offer one of those OSes free-of-charge: Solaris.

(2006-07-11 20:50:00.0)

20060514 Sunday May 14, 2006

The 1394 Software Framework - Incoming Asynchronous Requests (Part 1 of 3)

In this next installment of my series of entries about the Solaris 1394 Software Framework, I'm continuing with a discussion of asynchronous request handling. (Looking for details on isochronous handling? Be patient.) But this time I'm going to be talking about handling and responding to incoming asynchronous requests.

When the 1394 Framework receives a request from the bus, it must determine which target driver the request is intended for. This is made possible by requiring target drivers to first reserve ranges within the 1394 host address space. Then, when a request arrives, the Framework can look up the target driver that owns the range and processing of the request can begin.
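The lookup can be pictured as a search over the reserved ranges. The sketch below is only an illustration: `addr_range_t`, `find_owner()` and the integer owner ids are hypothetical names, not Framework interfaces, and the real implementation may well use a different data structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one reserved block of 48-bit 1394 address space. */
typedef struct addr_range {
        uint64_t        start;          /* first offset in the block */
        uint64_t        length;         /* number of bytes reserved */
        int             owner;          /* stand-in for a target driver id */
} addr_range_t;

/*
 * Return the owner of the range containing 'offset', or -1 if no
 * target driver has reserved that address.
 */
int
find_owner(const addr_range_t *ranges, size_t nranges, uint64_t offset)
{
        size_t i;

        for (i = 0; i < nranges; i++) {
                if (offset >= ranges[i].start &&
                    offset - ranges[i].start < ranges[i].length)
                        return (ranges[i].owner);
        }
        return (-1);
}
```

A linear scan keeps the idea visible; a sorted table or tree would be the obvious next step if many ranges were reserved.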



The other side of incoming request processing, of course, is sending a response. The Framework provides four ways for target drivers to do this:

IEEE 1394 Addressing Overview

IEEE 1212-1994 (also known as ISO/IEC 13213) is a specification describing a generalized control and status register (CSR) architecture that a variety of current and future protocols can use to provide commonality and some interoperability.

IEEE 1394 is based on IEEE 1212 and uses IEEE 1212 style 64-bit addressing. In the 64-bit address, the high-order 10 bits identify the bus number (0x3FF for the local bus) and the next 6 bits identify the node ID. Node ID 63 is reserved for broadcast, so this allows for up to 63 devices (0 - 62) on the bus. The remaining 48 bits are used to address locations on the specified device.
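As a rough illustration of that layout, here is a small sketch that packs and unpacks a 64-bit address. The helper and macro names (`addr_pack()`, `addr_bus()`, `IEEE1394_LOCAL_BUS`, and so on) are made up for this example and are not taken from the Framework headers.

```c
#include <stdint.h>

#define IEEE1394_LOCAL_BUS      0x3FFu  /* bus number meaning "local bus" */
#define IEEE1394_BROADCAST_NODE 63u     /* node ID reserved for broadcast */
#define IEEE1394_OFFSET_MASK    0x0000FFFFFFFFFFFFull

/* Build a 64-bit address from a bus number, node ID and 48-bit offset. */
uint64_t
addr_pack(uint32_t bus, uint32_t node, uint64_t offset)
{
        return (((uint64_t)(bus & 0x3FFu) << 54) |
            ((uint64_t)(node & 0x3Fu) << 48) |
            (offset & IEEE1394_OFFSET_MASK));
}

/* High-order 10 bits: bus number. */
uint32_t
addr_bus(uint64_t addr)
{
        return ((uint32_t)(addr >> 54) & 0x3FFu);
}

/* Next 6 bits: node ID. */
uint32_t
addr_node(uint64_t addr)
{
        return ((uint32_t)(addr >> 48) & 0x3Fu);
}

/* Low 48 bits: offset within the node's own address space. */
uint64_t
addr_offset(uint64_t addr)
{
        return (addr & IEEE1394_OFFSET_MASK);
}
```

For instance, offset 0x1234 on node 5 of the local bus packs to 0xFFC5000000001234.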

Looking at it another way, each node (device) can assign, partition, and manage its own 48 bits of address space. For a particular node, this matters for the requests it receives from other nodes on the bus. By ascribing meaning or functionality to certain addresses, a node can handle incoming asynchronous requests as defined in a standards specification or as otherwise planned. Conversely, a node transmitting a request needs to know the meaning of the address locations at the destination node.

Again according to IEEE 1212 (and used by IEEE 1394), some of the highest 48-bit addresses, those beginning with 0xFFFF_Fxxx_xxxx, are reserved for use by various protocols, including IEEE 1394 itself.

Physical I/O

One of the features of IEEE 1394 bus architecture is the ability to directly access device or host memory without having to pass messages or packets via software. Protocols that rely on a lot of asynchronous I/O throughput, such as SBP-2 (Serial Bus Protocol 2) for disks, are designed to take advantage of this feature. But these protocols do occasionally need to ensure software intervention, for example notifying software of a status update.

To satisfy this need, the 48-bit space is partitioned into sections:

Currently, the 1394 Software Framework supports a physical address range within the low 32 bits (0x0000_0000_0000 to 0x0000_FFFF_FFFF). Received write and read requests that have a 48-bit destination offset within that range can be automatically handled by hardware to or from host memory.
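The two fixed ends of the 48-bit space mentioned so far (the physical range at the bottom and the reserved 0xFFFF_Fxxx_xxxx addresses at the top) can be told apart with a couple of comparisons. This is a minimal sketch; `region_t` and `classify_offset()` are hypothetical names, and everything in between is simply lumped together as "other" here.

```c
#include <stdint.h>

/* Hypothetical labels for the regions discussed in the text. */
typedef enum {
        REGION_PHYSICAL,        /* 0x0000_0000_0000 - 0x0000_FFFF_FFFF */
        REGION_RESERVED,        /* 0xFFFF_F000_0000 and above */
        REGION_OTHER            /* everything in between */
} region_t;

/* Classify a 48-bit destination offset. */
region_t
classify_offset(uint64_t offset)
{
        offset &= 0x0000FFFFFFFFFFFFull;        /* keep the low 48 bits */

        if (offset <= 0x0000FFFFFFFFull)
                return (REGION_PHYSICAL);
        if ((offset >> 28) == 0xFFFFFull)       /* top 20 bits all ones */
                return (REGION_RESERVED);
        return (REGION_OTHER);
}
```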

Posted Writes and CSR

A posted write is a received write request for which an ack_complete (instead of an ack_pending) is sent to the requesting node, indicating that the write was performed successfully before the write is actually completed. Posted writes are very efficient when they succeed because they obviate the need for sending a write response packet. Typically, posted writes do eventually complete successfully, but in some circumstances the write fails and the requesting node cannot be notified. For some applications and protocols this functionality is acceptable, but for others it is not.
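To make the distinction concrete, here is a toy decision function. It is a sketch of the rule as described in this post, not Framework code: `ack_t` and `ack_for_request()` are hypothetical, and the treatment of non-write requests in a posted-write area (ack_pending) is my assumption.

```c
#include <stdbool.h>

/* Hypothetical labels for the two acknowledgements discussed above. */
typedef enum {
        ACK_COMPLETE,   /* no response packet will follow */
        ACK_PENDING     /* a response packet will follow later */
} ack_t;

/*
 * Pick the acknowledgement for an incoming request.  Only writes that
 * land in a posted-write area are acknowledged as complete up front;
 * everything else is ack_pended and answered with a response packet.
 */
ack_t
ack_for_request(bool posted_write_area, bool is_write)
{
        if (posted_write_area && is_write)
                return (ACK_COMPLETE);
        return (ACK_PENDING);
}
```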

In addition, IEEE 1212-1994 reserves a region of address space for control and status registers, some of which may be used by protocols. This space is near the highest end of the address space.

The 1394 Software Framework further divides the 48-bit address space as shown:

Address Allocation

The 1394 Software Framework uses two mechanisms for assigning a range of address space to a target driver.

The routines for allocating and freeing 1394 address space do not themselves result in any transactions being sent over the bus and therefore are not handled like the outgoing asynchronous request commands. Instead, these routines perform the requested function and return immediately with DDI_SUCCESS or DDI_FAILURE.

As with most 1394 Software Framework calls, these are called with the t1394_handle and a parameters structure as defined below.

Address Allocation Data Structures

/*
 * t1394_alloc_addr_t
 *    is passed to t1394_alloc_addr(), when 1394 address space is being
 *    allocated, to describe the type of address space.  The target driver
 *    is responsible for specifying the aa_enable, aa_type, and aa_evts
 *    fields described above as well as the size of the allocated block.
 *    Additionally, the target driver may specify backing store
 *    (aa_kmem_bufp), a specific address (in aa_address if aa_type is
 *    T1394_ADDR_FIXED), and a callback argument (in aa_arg) to be
 *    passed to the target in any of its callback routines.
 *    When it returns, t1394_alloc_addr() will return in aa_address the
 *    starting address of the requested block of 1394 address space and
 *    an address block handle (aa_hdl) used to free the address block
 *    in a call to t1394_free_addr().
 */
typedef struct t1394_alloc_addr {
        t1394_addr_type_t       aa_type;        /* IN: address region */
        size_t                  aa_length;      /* IN: # bytes requested */
        t1394_addr_enable_t     aa_enable;      /* IN: request enables */
        t1394_addr_evts_t       aa_evts;        /* IN: event callbacks */
        opaque_t                aa_arg;         /* IN: evt callback arg */
        caddr_t                 aa_kmem_bufp;   /* IN: backing-store buf */
        uint64_t                aa_address;     /* IN/OUT: alloced address */
        t1394_addr_handle_t     aa_hdl;         /* OUT: returned to target */
} t1394_alloc_addr_t;

Address Type

Target drivers select the address space type for the allocation by selecting one of the address type choices below.
/*
 * t1394_addr_type_t
 *    is used in the t1394_alloc_addr_t structure, passed to
 *    t1394_alloc_addr(), to indicate what type of address block the
 *    target driver would like to allocate.
 *    T1394_ADDR_POSTED_WRITE indicates posted write memory, where
 *    incoming write requests are automatically acknowledged as complete.
 *    T1394_ADDR_NORMAL indicates memory, unlike the posted write area,
 *    where all requests regardless of type are ack_pended upon receipt
 *    and are subsequently responded to.
 *    T1394_ADDR_CSR memory range is generally used by target drivers
 *    that are implementing a well-defined protocol.
 *    And T1394_ADDR_FIXED is used to indicate to t1394_alloc_addr()
 *    that a specific set of addresses are needed.  Unlike the other three
 *    types, this type of request is used to choose a specific address or
 *    range of addresses in 1394 address space.
 */
typedef enum {
        T1394_ADDR_POSTED_WRITE = 0,
        T1394_ADDR_NORMAL       = 1,
        T1394_ADDR_CSR          = 2,
        T1394_ADDR_FIXED        = 3
} t1394_addr_type_t;

Address Access Enables

When allocating local 1394 address space, target drivers specify which request types should be allowed in that space. If a request is received for an address range which, by the setting of these flags, does not permit the request type, then the 1394 Software Framework sends the appropriate response packet with a response code of resp_type_error. See IEEE 1394-1995 Section 9.5 for response code information.
Target drivers may specify any combination of these access enables when allocating local 1394 address space.
/*
 * t1394_addr_enable_t
 *    is used in the t1394_alloc_addr_t structure, passed to
 *    t1394_alloc_addr(), to indicate what types of (incoming)
 *    asynchronous requests will be allowed in a given address block.
 *    If, for example, an address block is intended to be read-only,
 *    then only the T1394_ADDR_RDENBL bit should be enabled at allocation
 *    time.  Then, when incoming requests of an inappropriate type (write
 *    or lock requests, in this case) arrive, the 1394 Framework can
 *    automatically respond to them with TYPE_ERROR in the response
 *    without having to notify the target driver.
 */
typedef enum {
        T1394_ADDR_RDENBL =     (1 << 0),
        T1394_ADDR_WRENBL =     (1 << 1),
        T1394_ADDR_LKENBL =     (1 << 2)
} t1394_addr_enable_t;
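The check the Framework makes before deciding whether to answer with resp_type_error can be sketched as a simple bit test. The enable values are copied from the enum above; `req_kind_t` and `request_permitted()` are hypothetical names used only for this illustration.

```c
#include <stdbool.h>

/* Copied from the t1394_addr_enable_t definition above. */
typedef enum {
        T1394_ADDR_RDENBL =     (1 << 0),
        T1394_ADDR_WRENBL =     (1 << 1),
        T1394_ADDR_LKENBL =     (1 << 2)
} t1394_addr_enable_t;

/* Hypothetical request kinds, for illustration only. */
typedef enum {
        REQ_READ,
        REQ_WRITE,
        REQ_LOCK
} req_kind_t;

/*
 * Return true if a request of the given kind is allowed in an address
 * block allocated with 'enables'.  A false result corresponds to the
 * case where resp_type_error would be sent back.
 */
bool
request_permitted(unsigned int enables, req_kind_t kind)
{
        switch (kind) {
        case REQ_READ:
                return ((enables & T1394_ADDR_RDENBL) != 0);
        case REQ_WRITE:
                return ((enables & T1394_ADDR_WRENBL) != 0);
        case REQ_LOCK:
                return ((enables & T1394_ADDR_LKENBL) != 0);
        }
        return (false);
}
```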

Address Event Callbacks

To be notified when a read, write, or lock request is received for the allocated range, a target driver sets the address event callbacks shown below. If it does not want to be notified, it sets the corresponding entry point to NULL.
/*
 * t1394_addr_evts_t
 *    is used in the t1394_alloc_addr_t structure, passed to
 *    t1394_alloc_addr(), to specify callback routines for the
 *    allocated address block.  When a request of the appropriate type
 *    (read/write/lock) is received to a target driver's address
 *    block, the appropriate callback routine is consulted and if it is
 *    non-NULL it is called and passed a cmd1394_cmd_t structure used to
 *    describe the incoming asynch request.
 */
typedef struct t1394_addr_evts {
        void    (*recv_read_request)(cmd1394_cmd_t *req);
        void    (*recv_write_request)(cmd1394_cmd_t *req);
        void    (*recv_lock_request)(cmd1394_cmd_t *req);
} t1394_addr_evts_t;
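The "consult the callback and call it if non-NULL" step might look something like the sketch below. The callback structure matches the one above, but `cmd1394_cmd_t` is reduced to a one-field stand-in, and `req_kind_t`, `dispatch_request()` and `count_read()` are hypothetical. What the Framework actually does when the entry point is NULL is not shown here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in; the real cmd1394_cmd_t has many more fields. */
typedef struct cmd1394_cmd {
        int     cmd_type;
} cmd1394_cmd_t;

/* Same layout as the t1394_addr_evts_t shown above. */
typedef struct t1394_addr_evts {
        void    (*recv_read_request)(cmd1394_cmd_t *req);
        void    (*recv_write_request)(cmd1394_cmd_t *req);
        void    (*recv_lock_request)(cmd1394_cmd_t *req);
} t1394_addr_evts_t;

/* Hypothetical request kinds, for illustration only. */
typedef enum { REQ_READ, REQ_WRITE, REQ_LOCK } req_kind_t;

/* Example callback: just counts the read requests it has seen. */
int reads_seen = 0;

void
count_read(cmd1394_cmd_t *req)
{
        (void) req;
        reads_seen++;
}

/*
 * Call the callback registered for 'kind', if any.  Returns true if a
 * callback was invoked and false if the entry point was NULL.
 */
bool
dispatch_request(const t1394_addr_evts_t *evts, req_kind_t kind,
    cmd1394_cmd_t *req)
{
        void (*cb)(cmd1394_cmd_t *) = NULL;

        switch (kind) {
        case REQ_READ:
                cb = evts->recv_read_request;
                break;
        case REQ_WRITE:
                cb = evts->recv_write_request;
                break;
        case REQ_LOCK:
                cb = evts->recv_lock_request;
                break;
        }
        if (cb == NULL)
                return (false);
        cb(req);
        return (true);
}
```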

Incoming Asynchronous Requests (Part 2 of 3)... upcoming

OK, so this entry's just barely scratched the surface on the command structure used for incoming requests. The next entries will discuss the functions used for allocating and freeing 1394 addresses, the process for using 1394 physical addresses, and the methods necessary for handling and sending responses to incoming requests.

(2006-05-14 10:29:00.0)

20050710 Sunday July 10, 2005

The 1394 Software Framework - Outgoing Asynchronous Requests (Part 2 of 2)

OK... here's part 2 of my discussion on the Solaris 1394 Framework's interfaces for sending outgoing asynch requests. This continues my discussion of the Solaris 1394 Software Framework, and in this post I'm going to go over the functions used to allocate, free, and transmit asynch requests (read, write, or lock). In my previous entry, I described the details of the cmd1394_cmd_t structure that is used to encapsulate all the relevant information for an asynch command.




t1394_alloc_cmd()

/*
 * Function:    t1394_alloc_cmd()
 * Input(s):    t1394_hdl               The target "handle" returned by
 *                                          t1394_attach()
 *              flags                   The flags parameter is described below
 *
 * Output(s):   cmdp                    Pointer to the newly allocated command
 *
 * Description: t1394_alloc_cmd() allocates a command for use with the
 *              t1394_read(), t1394_write(), or t1394_lock() interfaces
 *              of the 1394 Software Framework.  By default, t1394_alloc_cmd()
 *              may sleep while allocating memory for the command structure.
 *              If this is undesirable, the target may set the
 *              T1394_ALLOC_CMD_NOSLEEP bit in the flags parameter.  Also,
 *              this call may fail because a target driver has already
 *              allocated MAX_NUMBER_ALLOC_CMDS commands.
 */
int
t1394_alloc_cmd(t1394_handle_t t1394_hdl, uint_t flags, cmd1394_cmd_t **cmdp)

Using this interface, a target driver allocates one command used to issue outgoing asynchronous requests. The Framework limits the total number of commands that a target driver may have outstanding. This limit is fixed and currently defaults to 256 commands.

Context:

Can be called from base kernel context or from interrupt context. If called from interrupt context, flags must be set to T1394_ALLOC_CMD_NOSLEEP.

Parameters:

Return Values:


t1394_free_cmd()

/*
 * Function:    t1394_free_cmd()
 * Input(s):    t1394_hdl               The target "handle" returned by
 *                                          t1394_attach()
 *              flags                   The flags parameter is unused (for now)
 *              cmdp                    Pointer to the command to be freed
 *
 * Output(s):   DDI_SUCCESS             Target successfully freed command
 *              DDI_FAILURE             Target failed to free command
 *
 * Description: t1394_free_cmd() attempts to free a command that has previously
 *              been allocated by the target driver.  It is possible for
 *              t1394_free_cmd() to fail because the command is currently
 *              in-use by the 1394 Software Framework.
 */
/* ARGSUSED */
int
t1394_free_cmd(t1394_handle_t t1394_hdl, uint_t flags, cmd1394_cmd_t **cmdp)

Using this interface, a target driver frees one cmd1394_cmd_t command.

Context:

Can be called from base kernel context or from interrupt context.

Parameters:

Return Values:


t1394_read()

/*
 * Function:    t1394_read()
 * Input(s):    t1394_hdl               The target "handle" returned by
 *                                          t1394_attach()
 *              cmd                     Pointer to the command to send
 *
 * Output(s):   DDI_SUCCESS             Target successfully sent the command
 *              DDI_FAILURE             Target failed to send command
 *
 * Description: t1394_read() attempts to send an asynchronous read request
 *              onto the 1394 bus.
 */
int
t1394_read(t1394_handle_t t1394_hdl, cmd1394_cmd_t *cmd)

The t1394_read() command issues an asynchronous read quadlet request or read block request to the indicated address (based on cmd_type) and places the response data into the provided cmd->cmd_u.b.data_block message buffer or the cmd->cmd_u.q.quadlet_data field.

If the CMD1394_OVERRIDE_ADDR option is not set, the 1394 Software Framework fills in the appropriate high-order 16 bits of the cmd_addr field with the 10-bit bus number and 6-bit node ID for the attached device. If the CMD1394_OVERRIDE_ADDR option is set, the Framework uses the complete 64-bit cmd_addr field as provided.
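In code form, the address fix-up amounts to something like the following. This is a sketch of the rule only: `effective_addr()` is a hypothetical helper, the bus number and node ID are passed in explicitly rather than taken from the attached device, and the flag value is the one given for CMD1394_OVERRIDE_ADDR in the cmd1394_flags_t listing in Part 1.

```c
#include <stdint.h>

/* Value taken from the cmd1394_flags_t listing in Part 1. */
#define CMD1394_OVERRIDE_ADDR   (1 << 1)

/*
 * Return the 64-bit destination address that would go on the bus.
 * Without the override option, the high-order 16 bits are replaced by
 * the 10-bit bus number and 6-bit node ID; with it, cmd_addr is used
 * exactly as provided.
 */
uint64_t
effective_addr(uint64_t cmd_addr, unsigned int cmd_options, uint32_t bus,
    uint32_t node)
{
        uint64_t hi;

        if (cmd_options & CMD1394_OVERRIDE_ADDR)
                return (cmd_addr);

        hi = ((uint64_t)(bus & 0x3FFu) << 6) | (uint64_t)(node & 0x3Fu);
        return ((hi << 48) | (cmd_addr & 0x0000FFFFFFFFFFFFull));
}
```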

The Framework splits the read request into multiple read requests if any of the following conditions are true.

  1. The specified data length exceeds the prevailing maximum payload size which is either the current maximum payload size or the overridden max_payload value.
  2. The specified data length exceeds the maximum packet size permitted by the current bus speed.

If the CMD1394_DISABLE_ADDR_INCREMENT option is not set, successive read requests are issued with the destination address advancing accordingly. If the CMD1394_DISABLE_ADDR_INCREMENT option is set, the Framework sends each read request to the same destination address.
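The splitting behaviour can be sketched as a loop that carves the transfer into pieces no larger than the smaller of the two limits. None of this is Framework code: `chunk_t` and `split_request()` are hypothetical, the limits are passed in as plain byte counts, and the flag value is the one given for CMD1394_DISABLE_ADDR_INCREMENT in the cmd1394_flags_t listing in Part 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Value taken from the cmd1394_flags_t listing in Part 1. */
#define CMD1394_DISABLE_ADDR_INCREMENT  (1 << 3)

/* Hypothetical description of one request placed on the bus. */
typedef struct chunk {
        uint64_t        addr;   /* destination address of this request */
        size_t          len;    /* bytes carried by this request */
} chunk_t;

/*
 * Split 'length' bytes starting at 'addr' into requests no larger than
 * the smaller of 'max_payload' and 'speed_limit'.  Up to 'max_chunks'
 * entries are written to 'out'; the return value is the number of
 * requests needed (which may exceed 'max_chunks').
 */
size_t
split_request(uint64_t addr, size_t length, size_t max_payload,
    size_t speed_limit, unsigned int cmd_options, chunk_t *out,
    size_t max_chunks)
{
        size_t limit = (max_payload < speed_limit) ? max_payload : speed_limit;
        size_t done = 0;
        size_t n = 0;

        if (limit == 0)
                return (0);

        while (done < length) {
                size_t len = length - done;

                if (len > limit)
                        len = limit;
                if (n < max_chunks) {
                        /* Advance the address unless increment is disabled. */
                        out[n].addr =
                            (cmd_options & CMD1394_DISABLE_ADDR_INCREMENT) ?
                            addr : addr + done;
                        out[n].len = len;
                }
                n++;
                done += len;
        }
        return (n);
}
```

With a 512-byte limit, a 1200-byte transfer at 0x1000 becomes three requests: 512 bytes at 0x1000, 512 bytes at 0x1200, and 176 bytes at 0x1400.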

Context:

Can be called from base kernel context or from interrupt context. If called from interrupt context, cmd_options must not include CMD1394_BLOCKING.

Parameters:

Return Values:

If the command succeeds (i.e. if cmd_result equals CMD1394_CMDSUCCESS), the read data is in cmd->cmd_u.b.data_block or cmd->cmd_u.q.quadlet_data and bytes_transferred indicates the total number of bytes read. For block requests, the data block's b_wptr points to the byte following the last read byte.

Possible immediate errors (see 'Error Codes' in my previous blog entry):

Possible completion statuses (in cmd_result) are below. Note that if the operation fails, bytes_transferred indicates if part of the operation succeeded.


t1394_write()

/*
 * Function:    t1394_write()
 * Input(s):    t1394_hdl               The target "handle" returned by
 *                                          t1394_attach()
 *              cmd                     Pointer to the command to send
 *
 * Output(s):   DDI_SUCCESS             Target successfully sent the command
 *              DDI_FAILURE             Target failed to send command
 *
 * Description: t1394_write() attempts to send an asynchronous write request
 *              onto the 1394 bus.
 */
int
t1394_write(t1394_handle_t t1394_hdl, cmd1394_cmd_t *cmd)

The t1394_write() command issues an asynchronous write quadlet request or write block request to the indicated address (based on cmd_type) and uses the provided cmd->cmd_u.b.data_block message buffer or cmd->cmd_u.q.quadlet_data field.

If the CMD1394_OVERRIDE_ADDR option is not set, the 1394 Software Framework fills in the appropriate high-order 16 bits of the cmd_addr field with the 10-bit bus number and 6-bit node ID for the attached device. If the CMD1394_OVERRIDE_ADDR option is set, the Framework uses the complete 64-bit cmd_addr field as provided.

The Framework splits the write request into multiple write requests if any of the following conditions are true.

  1. The specified data length exceeds the prevailing maximum payload size which is either the current maximum payload size or the overridden max_payload value.
  2. The specified data length exceeds the maximum packet size permitted by the current bus speed.

If the CMD1394_DISABLE_ADDR_INCREMENT option is not set, successive write requests are issued with the destination address advancing accordingly. If the CMD1394_DISABLE_ADDR_INCREMENT option is set, the Framework sends each write request to the same destination address.

Context:

Can be called from base kernel context or from interrupt context. If called from interrupt context, cmd_options must not include CMD1394_BLOCKING.

Parameters:

Return Values:

If the command succeeds (i.e. if cmd_result equals CMD1394_CMDSUCCESS), bytes_transferred indicates the total number of bytes written. For block writes, the 1394 Framework transmits the data between the data block's b_rptr and b_wptr and leaves both b_rptr and b_wptr unchanged.

Possible immediate errors (see 'Error Codes' in my previous blog entry):

Possible completion statuses (in cmd_result) are below. Note that if the operation fails, bytes_transferred indicates if part of the operation succeeded.


t1394_lock()

/*
 * Function:    t1394_lock()
 * Input(s):    t1394_hdl               The target "handle" returned by
 *                                          t1394_attach()
 *              cmd                     Pointer to the command to send
 *
 * Output(s):   DDI_SUCCESS             Target successfully sent the command
 *              DDI_FAILURE             Target failed to send command
 *
 * Description: t1394_lock() attempts to send an asynchronous lock request
 *              onto the 1394 bus.
 */
int
t1394_lock(t1394_handle_t t1394_hdl, cmd1394_cmd_t *cmd)

There are several lock operations provided by the 1394 Software Framework (above and beyond the basic lock operations provided by IEEE 1394). These are described below (with pseudo-code to describe their operation).

For all lock operations, if the CMD1394_OVERRIDE_ADDR option is not set, the 1394 Software Framework fills in the appropriate high-order 16 bits of the cmd_addr field with the 10-bit bus number and 6-bit node ID for the attached device. If the CMD1394_OVERRIDE_ADDR option is set, the Framework uses the complete 64-bit cmd_addr field as provided.

Context:

Can be called from base kernel context or from interrupt context. If called from interrupt context, cmd_options must not include CMD1394_BLOCKING.

Parameters:

Return Values:

Possible immediate errors (see 'Error Codes' in my previous blog entry):

Possible completion statuses (in cmd_result) are below.


(2005-07-10 13:15:00.0)

20050709 Saturday July 09, 2005

The 1394 Software Framework - Outgoing Asynchronous Requests (Part 1 of 2)

Continuing my discussion of the Solaris 1394 Software Framework, in this post I'm going to go into some detail on the methods by which a 1394 target driver can transmit outgoing asynchronous requests and receive the corresponding responses.

Command Allocation/Freeing

Handling of all asynchronous packets is achieved via commands. A command is a data structure which contains fields for the necessary packet components as well as command options and other fields used to specify the desired command processing.

To acquire a command structure for transmitting asynchronous requests and receiving responses, the 1394 target driver calls t1394_alloc_cmd(). This routine allocates and returns a pointer to a cmd1394_cmd_t structure (which I'll say more about below.) Note: This same command structure is also used for handling incoming asynchronous requests, although some fields are used differently.

A 1394 target driver can re-use a command after it has completed (see below), but it should not reissue or alter any fields for any command that is still pending. And when a target driver no longer needs a command, it will call t1394_free_cmd() to release the command structure back to the Framework. Note: 1394 target drivers are responsible for freeing all of their allocated commands before detaching (or the detach will not be allowed - see my previous blog entry.)

Command Action

After initializing the relevant parameters for the outgoing request, the 1394 target driver calls the appropriate asynchronous command routine: t1394_write(), t1394_read(), or t1394_lock().

The command itself often takes some time to complete, since a packet must be sent over the bus to the destination node and the 1394 Framework must await the response. Therefore, the Framework gives the target driver the option to block or not block pending command completion (more on this below).

Once the command is "handed off" to the 1394 Software Framework, the target driver should not re-use or modify the same allocated command until the target driver can determine that the requested action has fully completed.

A command is "completed" when the bus transaction(s) used to perform the command have finished. The command's cmd_result field indicates either success (CMD1394_CMDSUCCESS) or failure, where a failure is indicated by an error code (see 'Error Codes' below for details).

Command Completion

Target drivers have three options for determining command completion status:

Bus Reset and Command Handling

The IEEE 1394 bus is reset when devices are added to or removed from the bus (or for a variety of other reasons). When this happens, all devices on the bus are re-enumerated and can possibly be assigned different bus addresses (referred to as "Node IDs") from the ones they each had prior to the bus reset.

From the IEEE 1394 protocol perspective, when a bus reset occurs all the pending and in-progress command requests are canceled. Target drivers have two options with respect to processing of any outstanding commands.

Asynch Command Structure

/* cmd1394_cmd: cmd1394 - common command type */
typedef struct cmd1394_cmd
{
        int                     cmd_version;
        volatile int            cmd_result;
        cmd1394_flags_t         cmd_options;
        cmd1394_cmd_type_t      cmd_type;
        void                    (*completion_callback)(struct cmd1394_cmd *);
        opaque_t                cmd_callback_arg;
        uint64_t                cmd_addr;
        uint_t                  cmd_speed;
        uint_t                  bus_generation;
        uint_t                  nodeID;
        uint_t                  broadcast;
        union {
                cmd1394_quadlet_t       q;
                cmd1394_block_t         b;
                cmd1394_lock32_t        l32;
                cmd1394_lock64_t        l64;
        } cmd_u;
} cmd1394_cmd_t;

Command Types

/*
 * cmd1394_cmd.cmd_type
 *    Used to select/indicate the request packet type
 */
typedef enum {
        CMD1394_ASYNCH_RD_QUAD  = 0,
        CMD1394_ASYNCH_WR_QUAD  = 1,
        CMD1394_ASYNCH_RD_BLOCK = 2,
        CMD1394_ASYNCH_WR_BLOCK = 3,
        CMD1394_ASYNCH_LOCK_32  = 4,
        CMD1394_ASYNCH_LOCK_64  = 5
} cmd1394_cmd_type_t;

Command Options

A target driver uses these options to tailor the Framework's command processing behavior.
/*
 * cmd1394_cmd.cmd_options
 *    Used to select the request's behavior, including
 *    how the destination address is determined, how
 *    a large request will be broken into smaller requests,
 *    whether the command should be resent after a
 *    bus reset has happened, etc.
 */
typedef enum {
        CMD1394_CANCEL_ON_BUS_RESET     = (1 << 0),
        CMD1394_OVERRIDE_ADDR           = (1 << 1),
        CMD1394_OVERRIDE_MAX_PAYLOAD    = (1 << 2),
        CMD1394_DISABLE_ADDR_INCREMENT  = (1 << 3),
        CMD1394_BLOCKING                = (1 << 4),
        CMD1394_OVERRIDE_SPEED          = (1 << 5)
} cmd1394_flags_t;

Packet Data

The asynchronous command structure is used for issuing a variety of requests. And each kind of request requires a different set of parameters. There are four kinds of requests as shown below:

quadlet requests

The following structure is used for both write quadlet requests and read quadlet requests. A target driver issues write quadlet requests using t1394_write() and issues read quadlet requests using t1394_read().
/* Asynchronous Command (Data Quadlet) */
typedef struct cmd1394_quadlet {
        uint32_t                quadlet_data;
} cmd1394_quadlet_t;
For t1394_write(), the quadlet_data field contains the 4 bytes to be written. For t1394_read(), the quadlet_data contains the 4-byte read data from the requested address.

block requests

The following structure is used for read block and write block requests. A target driver issues write block requests using t1394_write() and issues read block requests using t1394_read().
/* Asynchronous Command (Data Block) */
typedef struct cmd1394_block {
        mblk_t                  *data_block;
        size_t                  blk_length;
        size_t                  bytes_transferred;
        uint_t                  max_payload;
} cmd1394_block_t;

lock requests

The following structure is used for 32-bit lock requests:
/* Asynchronous Command (Lock Cmd - 32 bit) */
typedef struct cmd1394_lock32 {
        uint32_t                old_value;
        uint32_t                data_value;
        uint32_t                arg_value;
        uint_t                  num_retries;
        cmd1394_lock_type_t     lock_type;
} cmd1394_lock32_t;
The following structure is used for 64-bit lock requests:
/* Asynchronous Command (Lock Cmd - 64 bit) */
typedef struct cmd1394_lock64 {
        uint64_t                old_value;
        uint64_t                data_value;
        uint64_t                arg_value;
        uint_t                  num_retries;
        cmd1394_lock_type_t     lock_type;
} cmd1394_lock64_t;
The Framework supports a large number of lock operations, each of which uses the lock values somewhat differently. The value fields are described here in a general way.
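As one concrete case, here is a sketch of the compare-swap operation defined by IEEE 1394, modelled on a local variable rather than a remote node. The `lock32_values_t` stand-in, `compare_swap32()`, and the mapping of fields (arg_value is compared, data_value is stored, old_value receives the previous contents) are my assumptions for illustration; byte ordering and retries are ignored.

```c
#include <stdint.h>

/* Stand-in for the three value fields of cmd1394_lock32_t. */
typedef struct lock32_values {
        uint32_t        old_value;      /* contents before the operation */
        uint32_t        data_value;     /* value to store on a match */
        uint32_t        arg_value;      /* value expected at the target */
} lock32_values_t;

/*
 * Model of compare-swap as performed at the destination: the previous
 * contents are always reported back, and the target is updated only
 * if it held the expected value.
 */
void
compare_swap32(uint32_t *target, lock32_values_t *v)
{
        v->old_value = *target;
        if (*target == v->arg_value)
                *target = v->data_value;
}
```

The caller can tell whether the swap took effect by comparing old_value with arg_value afterwards.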

Error Codes

There are two contexts under which an error may be reported.

An error might occur when the command is initially issued due to a faulty parameter or another immediately detected error. For example, if t1394_write() is called with a NULL data_block message pointer for a block write, it will return an error status of DDI_FAILURE, and the command's cmd_result is set to CMD1394_ENULL_MBLK. If no error is found, the function return value is DDI_SUCCESS and the command's cmd_result reflects the current status of the transaction(s).

An error might also occur during the ensuing bus transaction(s). In this case, the Framework sets the command's cmd_result appropriately and the command is completed as described above.
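The caller-side pattern for these two contexts can be sketched roughly as follows. Everything prefixed with model_ or MODEL_ is a stand-in I made up so the example compiles outside the kernel: model_write() plays the part of t1394_write(), and the enum values are placeholders, not the real CMD1394_* values.

```c
#include <stddef.h>

/* Placeholder values for illustration; the real definitions live in the
 * Solaris DDI and 1394 headers. */
#define MODEL_DDI_SUCCESS       0
#define MODEL_DDI_FAILURE       (-1)

enum model_result {
        MODEL_NOSTATUS,         /* stands in for CMD1394_NOSTATUS */
        MODEL_ENULL_MBLK        /* stands in for CMD1394_ENULL_MBLK */
};

typedef struct model_cmd {
        void            *data_block;    /* NULL models the missing mblk */
        int             cmd_result;
} model_cmd_t;

/* Stand-in for t1394_write(): rejects a block write with no data block. */
int
model_write(model_cmd_t *cmd)
{
        if (cmd->data_block == NULL) {
                cmd->cmd_result = MODEL_ENULL_MBLK;
                return (MODEL_DDI_FAILURE);
        }
        cmd->cmd_result = MODEL_NOSTATUS;       /* issued, not yet completed */
        return (MODEL_DDI_SUCCESS);
}

/*
 * Caller-side pattern: returns 1 if the command was issued (its final
 * cmd_result arrives later, at completion), or 0 if it failed
 * immediately, in which case *why holds the cmd_result to report.
 */
int
model_issue(model_cmd_t *cmd, int *why)
{
        if (model_write(cmd) != MODEL_DDI_SUCCESS) {
                *why = cmd->cmd_result;
                return (0);
        }
        return (1);
}
```

The point of the split is that a DDI_FAILURE return means no completion will follow, so cmd_result has to be examined right away, while after DDI_SUCCESS the interesting cmd_result is the one seen at completion time.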

Error Codes for Immediately Detected Failures

After a target driver gets DDI_FAILURE from t1394_read(), t1394_write(), or t1394_lock(), the command's cmd_result can be one of the following:

For non-blocking commands, the command cmd_result after a target driver gets DDI_SUCCESS can be CMD1394_NOSTATUS.

Error Codes for Transactions

Outgoing Asynchronous Requests (Part 2 of 2)... upcoming

OK, so that's a ton of detail for one post. And that's just to talk about how the command structure is used for outgoing requests. But I'm gonna stop here for now and finish this discussion of outgoing asynch requests with another post in a couple of days to talk about the interfaces themselves.


[1]It's important to note that it is possible, due to various timing conditions, for the Framework to call the completion_callback() routine before the interface call returns. So target drivers using completion_callback() need to handle this possible sequence of events.
[2]Why polling? Why when we have callbacks and blocking commands already? Well the primary reason is for a purpose not yet fulfilled... being able to do a crash dump to a 1394 disk. When the system is crashed and dumping to disk, interrupts are disabled... which makes both blocking and callbacks impossible. If you're actually reading this, and you want a project, take a look at what it'd take to enable polling throughout the entire 1394 Software Framework and to enable dump to 1394 disk.
[3]It's a really big no-no to issue blocking commands while operating in the interrupt context, e.g. in a callback, as this could cause a deadlock.
[4]These defines can be found in ieee1394.h
[5]A word of caution: A target driver should be careful when sending requests using CMD1394_OVERRIDE_ADDR. Sending requests to devices other than its own could adversely impact the performance or function of the destination node.
(2005-07-09 20:57:00.0) Permalink

20050619 Sunday June 19, 2005

The 1394 Software Framework - Attach, Detach, and Events

Continuing my discussion of the Solaris 1394 Software Framework, in this post I'm going to go into some detail on the methods by which a 1394 target driver can register and deregister itself with the core framework. And, in the process, I will also touch on the Solaris event notification mechanisms leveraged by the framework.

t1394_attach()

All 1394 target drivers must register themselves with the framework in order to operate properly. The 1394 framework will make an association with the device driver instance for the target and the corresponding HAL driver instance (called the 'parent') for the adapter to which it is attached.

In addition, the 1394 framework will allocate resources (internally) to track the target driver state (in the target driver 'handle') and return useful information about the current state of the target device on the 1394 bus and DMA and/or interrupt properties of the parent HAL.[1]

/*
 * Function:    t1394_attach()
 * Input(s):    dip                     The dip given to the target driver
 *                                          in its attach() routine
 *              version                 The version of the target driver -
 *                                          T1394_VERSION_V1
 *              flags                   The flags parameter is unused (for now)
 *
 * Output(s):   attachinfo              Used to pass info back to target,
 *                                          including bus generation, local
 *                                          node ID, dma attribute, etc.
 *              t1394_hdl               The target "handle" to be used for
 *                                          all subsequent calls into the
 *                                          1394 Software Framework
 *
 * Description: t1394_attach() registers the target (based on its dip) with
 *              the 1394 Software Framework.  It returns the bus_generation,
 *              local_nodeID, iblock_cookie and other useful information to
 *              the target, as well as a handle (t1394_hdl) that will be used
 *              in all subsequent calls into this framework.
 */
/* ARGSUSED */
int
t1394_attach(
    dev_info_t            *dip,         /* supplied by the target */
    int                   version,      /* supplied by the target */
    uint_t                flags,        /* supplied by the target */
    t1394_attachinfo_t    *attachinfo,  /* filled in by the framework */
    t1394_handle_t        *t1394_hdl)   /* returned to the target */

During a target driver's attach processing, it calls t1394_attach() to register with the 1394 Software Framework. The Framework initializes any necessary internal data structures and returns a t1394_hdl which the target driver uses with all other calls into the Framework. The Framework also returns additional information in attachinfo, which is needed by some target driver implementations.

Context:

Should be called only from base kernel context.

Parameters:

Return Values:

t1394_detach()

Of course, for every registration a 1394 target driver should also deregister (when complete and detaching). This gives the Framework the opportunity to reclaim all the internally allocated resources that had been set aside for tracking the target state.
/*
 * Function:    t1394_detach()
 * Input(s):    t1394_hdl               The target "handle" returned by
 *                                          t1394_attach()
 *              flags                   The flags parameter is unused (for now)
 *
 * Output(s):   DDI_SUCCESS             Target successfully detached
 *              DDI_FAILURE             Target failed to detach
 *
 * Description: t1394_detach() unregisters the target from the 1394 Software
 *              Framework.  t1394_detach() can fail if the target has any
 *              allocated commands that haven't been freed.
 */
/* ARGSUSED */
int
t1394_detach(t1394_handle_t *t1394_hdl, uint_t flags)

The target driver calls t1394_detach() to deregister from the 1394 Software Framework. Typically the target calls this from its detach(9E) routine.
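The register/deregister lifecycle, including the rule that detach can fail while commands are still allocated, can be modeled in a few lines. This is a user-space sketch under my own assumptions: model_target_t, model_attach() and model_detach() are hypothetical stand-ins for the handle state and for t1394_attach()/t1394_detach(), not the Framework's actual bookkeeping.

```c
#include <stddef.h>

#define MODEL_DDI_SUCCESS       0
#define MODEL_DDI_FAILURE       (-1)

/* Stand-in for the state the Framework keeps behind a target handle. */
typedef struct model_target {
        int     registered;
        int     cmds_allocated;         /* commands allocated but not yet freed */
} model_target_t;

/* Stand-in for t1394_attach(): mark the target as registered. */
int
model_attach(model_target_t *tp)
{
        if (tp == NULL || tp->registered)
                return (MODEL_DDI_FAILURE);
        tp->registered = 1;
        tp->cmds_allocated = 0;
        return (MODEL_DDI_SUCCESS);
}

/* Stand-in for t1394_detach(): refuse while commands are still allocated. */
int
model_detach(model_target_t *tp)
{
        if (tp == NULL || !tp->registered || tp->cmds_allocated > 0)
                return (MODEL_DDI_FAILURE);
        tp->registered = 0;
        return (MODEL_DDI_SUCCESS);
}
```

In a real driver the practical consequence is that detach(9E) should free every command it allocated before deregistering, and should itself fail if the deregistration fails.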

Context:

Should be called only from base kernel context.

Parameters:

Return Values:

Events

Target drivers may register callbacks for general 1394 Software Framework events by using the Solaris Event Framework. All calls to the Event Framework must be performed before the call to t1394_attach() is performed. For details about the Solaris Event Framework, see:

The following events are supported by the 1394 Framework:

     void (*handler)(dev_info_t  *dip,  ddi_eventcookie_t cookie,  void  *arg,
         void *impl_data);

The callback impl_data provided to the eventcalls associated with each of these events is a t1394_localinfo_t *, as described above.

Note: Within an event callback function, a target driver shouldn't invoke any procedure that blocks or sleeps. For example, an event callback function shouldn't issue any outgoing asynch request that has the CMD1394_BLOCKING flag set. (Yep, again... more on this in a later blog entry).
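Here is a rough sketch of a handler with the same shape as the prototype above that follows the "don't block" rule: it copies what it needs out of impl_data, sets a flag, and returns. The model_ types are stand-ins so it compiles outside the kernel, the two localinfo fields (bus_generation, local_nodeID) are my assumption about what that structure carries, and model_softstate_t is a hypothetical per-instance driver structure passed in as arg.

```c
#include <stddef.h>

/* Stand-ins so the sketch compiles outside the kernel (assumptions). */
typedef struct model_dev_info model_dev_info_t;
typedef void *model_eventcookie_t;
typedef struct model_localinfo {
        unsigned int    bus_generation;
        unsigned int    local_nodeID;
} model_localinfo_t;

/* Hypothetical per-instance driver state, passed as the handler's arg. */
typedef struct model_softstate {
        unsigned int    bus_generation;
        unsigned int    local_nodeID;
        int             event_pending;  /* picked up later, outside the callback */
} model_softstate_t;

/*
 * Same shape as the handler prototype above.  It only copies data and
 * sets a flag: nothing here blocks or sleeps.
 */
void
model_event_handler(model_dev_info_t *dip, model_eventcookie_t cookie,
    void *arg, void *impl_data)
{
        model_softstate_t *sp = arg;
        model_localinfo_t *li = impl_data;

        (void) dip;
        (void) cookie;
        if (sp == NULL || li == NULL)
                return;
        sp->bus_generation = li->bus_generation;
        sp->local_nodeID = li->local_nodeID;
        sp->event_pending = 1;
}
```

Any follow-up work that might block (a blocking asynch request, for example) would then be done from some other context that notices event_pending.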

And for next time?

OK. So obviously there's more I could say here about the details of the Framework's implementation for tracking and coordinating target drivers and their events, but as I've said before, I want to first go through the Framework at a high level.

Next time, I'm going to go over the outgoing asynch interfaces: command structure allocation, command completion mechanisms (event-driven, blocking, polling), command types (read, write, lock), quadlet requests, block requests, etc. That's a ton of stuff to cover, so maybe it won't all get into a single blog entry, but anyway that's where I'm headed next.


[1]Both t1394_attach() and t1394_detach() can be found in t1394.c
[2]I've worked with Mark for many years. He's a good guy, very bright, and he was one of us original four designers of the Solaris 1394 Software Framework. His contribution was essentially the entire OpenHCI-compliant HAL driver (hci1394).
[3]OK, so... flags. Why do we have it all over the place even though we don't often use it, you might be asking? Well the cynic among you might quote me some Emerson, but in reality this is more about our desire to be able to accommodate backwards compatibility. This goal of being able to have the design move forward and evolve, while still accommodating older versions of the software is part of Sun culture. It is an imperative of design in Solaris, especially for driver frameworks like these. Of course, without more than one version of a framework like this (this one is essentially unchanged from when it was initially putback), a designer must consult a crystal ball (or draw on experience if you prefer). And sometimes you're gonna come up with flags that don't do anything (yet!)
(2005-06-19 09:16:00.0) Permalink

20050615 Wednesday June 15, 2005

InfiniBand HCA driver missing from OpenSolaris?

Yeah, that's right. Unfortunately, Sun is not yet able to open up our source code for our Solaris InfiniBand HCA driver. (One of my colleagues, Steve Rust touches on this in his most recent blog entry.) Although we wrote all the code ourselves, we did it with access to info that we got under NDA. So we're still under obligation not to disclose anything. I sincerely hope we will soon be able to open it up too, because there is some really interesting code in there that Steve R. and I are really proud of. For now, though, I guess it is among those few OpenSolaris drivers which you can get only as a binary.

The driver itself is called tavor and it basically started out as my baby (Steve R. owns it now). After my work on the Solaris 1394 Software Framework (and a handful of aborted or "development only" projects with InfiniBand HCA's), I finally got an opportunity in early 2002 to design and implement my own driver, from the ground up. The driver was to be for the Mellanox InfiniHost MT23108 HCA device, which was going to be the central I/O component in a SPARC-based blade server platform (which we never ultimately shipped).

But although the driver started out life with a very specific purpose for a very specific (and since canceled) platform, we (the engineers) anticipated a value from the beginning if it could work well with plug-in cards. And today, a plug-in card is still the primary mechanism for adding InfiniBand to a system.

It took about a year and a half of design/implementation/testing before it was ready for putback into Solaris (Steve Rust's blog says August 6th, 2003 and I'll trust him, since he was our 'gatekeeper' for the entire Solaris InfiniBand Framework putback). Subsequent to that putback, there were bug fixes (obviously), enhancements for x86 and AMD64 support, the userland access support, and (most recently) support for Shared Receive Queues (SRQ) and for the new Mellanox InfiniHost III Ex MT25208 HCA device.

The latter half of that work above was done by Steve R. and was done subsequent to the Solaris 10 release. (I had the "project lead" role, he did all the hard work.) But if you want to check out the fruits of Steve R's latest work - check out Solaris Express 04/05 for the latest 'tavor' bits.

I know we're both extremely proud of this code (and really do wish we could show it off). And it's got some really fun stuff in it: handling userland access to HCA resources (i.e. OS bypass for lower latency), extreme configurability (honestly probably too configurable), a fancy mechanism for keeping track of "Work Request Identifiers" (for which I recently received US Patent #6,901,463), and a cool queue pair number allocation/reuse scheme for which Steve R. and I have a patent pending.

But anyway, I probably sound like a tease, since the driver isn't yet available in source form. But, if you're interested in InfiniBand, there's still plenty of really excellent code in OpenSolaris to check out. (Check Steve Rust's latest blog entry "InfiniBand Support in OpenSolaris" for a good starter.)

And if you've got an InfiniBand HCA card (from any of a number of vendors - Sun, TopSpin, Mellanox, etc.), then you can see this driver attached to your hardware and you can use it. (Matter of fact, like I said, this same 'tavor' driver will also attach to and operate on the latest generation of Mellanox's PCI-Express-capable InfiniHost III Ex MT25208 card. So, if you've got a system with PCI-Express - there are a few out there and I know Sun's got plenty coming - then you can get some really kick-ass performance out of our IB stack.)

Also, if you want to read more about our Solaris IB stuff, here's a few blogs by some other colleagues of mine:

There's a ton of other engineers (dozens, literally) who've contributed to the Solaris InfiniBand Framework. But maybe they're a little shy? I can't seem to find blogs by any of them. Anyway, they should all be proud too. And I'm sure that they are happy to have you folks able to see their code now in OpenSolaris.

Shoot me a comment (below) if you've used our IB software, or if you've been poking through the code. I'm very curious to hear from folks on the other end about what they think of our work.

(2005-06-15 18:14:00.0) Permalink Comments [3]

20050614 Tuesday June 14, 2005

OpenSolaris - The 1394 Software Framework

When I started at Sun about seven years ago, the Solaris 1394 Software Framework was my first project. A team of four, we designed, implemented, tested and putback to Solaris 8 (Update 2) in about a year and a half. Since then the code has been transitioned to others (like Alan Perry and Artem Kachitchkine), who continue to maintain and extend its functionality even today. Now, almost six years later, I finally get a chance to share this code with the world (and, hopefully, at least some interested readers.)

So I figured I'd start by giving a brief overview of the stack and the features that it provides and then talk a bit about the source files for the modules: how they're organized, what's in them, etc. Then (over the course of several posts) I'll get into the specifics of how to use each of the interfaces, gotchas for potential developers working with the source, some of the little bits of which I'm most proud, and maybe some discussion of what could be improved or extended in the existing code (and I'll be interested to hear what others think). This will not be an intro to the IEEE 1394 specification or the technology, nor will it be an introduction to writing Solaris device drivers (though I'll try to help anyone in any way I can). What follows will assume a certain familiarity with IEEE 1394 and with writing Solaris device drivers.

The Solaris 1394 Software Framework



The framework itself consists of a central module (s1394) called the "1394 Services Layer", an OpenHCI-compliant HAL driver (hci1394), and numerous target drivers (currently, av1394, scsa1394, and dcam1394 - which I'll say more about later.) If you are familiar with Solaris's SCSA framework (for SCSI drivers), you'll recognize this stacking of modules. This arrangement is a typical way to abstract hardware-specific details (below the Services Layer) from the target drivers (above the Services Layer).

The Solaris 1394 Software Framework Device Driver Interface provides a set of kernel interface routines to facilitate access to devices on an IEEE 1394 bus. The interface routines are intended for use by 1394 device drivers, also referred to as target drivers.

There are two kinds of target drivers: class drivers and vendor-specific drivers. Class drivers adhere to a general standard for a particular kind of device and can drive any vendor's device that adheres to the same standard. For example, a class driver for the IEEE 1394 Digital Conferencing Camera specification can drive video conferencing cameras manufactured by a variety of vendors (even though each may have a different set of features). Vendor-specific drivers are built to drive a specialized non-standard device. The 1394 Software Framework supports both kinds of target drivers.

Features of the 1394 Framework

The 1394 Software Framework provides several features to support target drivers using the IEEE 1394 bus.

Asynchronous I/O

There are two sides of asynchronous I/O: issuing outgoing requests and handling incoming requests.

Outgoing Requests - The 1394 Framework provides the ability to send the basic set of IEEE 1394 asynchronous requests: read, write and lock. In addition to the IEEE 1394 defined set of lock request options, the Framework lock request interface also provides a set of bit and arithmetic functions.

For asynchronous requests, the 1394 Framework automatically determines the device's destination ID, sends the request using the local host's 1394 hardware interface, tracks the status of the request until the transaction completes, and supplies response information as needed.

Target drivers choose whether to: 1) block while waiting for the transaction to complete, 2) poll on the request completion status, or 3) have the 1394 Framework call a specified callback routine when the transaction completes. Using the poll and callback mechanisms, target drivers can issue several outstanding requests and poll or be notified for each completion.

Incoming Requests - The primary role of the 1394 Framework with respect to incoming asynchronous requests is to dispatch the request to the appropriate target driver or to handle the request on behalf of the appropriate target driver.

To support this, the 1394 Framework provides an allocation mechanism that target drivers use to reserve ranges of addresses within the 48-bit local node address space. Target drivers have several options available when allocating 1394 address space including the ability to specify a kernel virtual buffer to map to the allocated space. When the 'destination_offset' of an incoming request falls within an allocated address range, the 1394 Framework fulfills the request if possible, notifies the target driver if desired, and sends the response. Target drivers may also allocate 1394 address space with the characteristic that an incoming request to that space will be handled by hardware. In this case hardware directly accesses host memory bound to that address space, and transmits the appropriate response.
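The dispatch decision described above (does the incoming 'destination_offset' fall within a range some target driver has allocated?) boils down to a range lookup. Below is a minimal sketch of that idea, not the Framework's actual data structure: model_range_t, its owner tag, and model_dispatch() are hypothetical, and a flat array stands in for whatever the Services Layer really uses.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_ADDR_MASK 0x0000FFFFFFFFFFFFULL   /* 48-bit node address space */

/* One allocated range; `owner` is a hypothetical tag for the target driver. */
typedef struct model_range {
        uint64_t        base;
        uint64_t        length;
        int             owner;
} model_range_t;

/*
 * Return the owner of the range containing `offset`, or -1 if the offset
 * falls in no allocated range.
 */
int
model_dispatch(const model_range_t *ranges, size_t nranges, uint64_t offset)
{
        size_t i;

        offset &= MODEL_ADDR_MASK;
        for (i = 0; i < nranges; i++) {
                if (offset >= ranges[i].base &&
                    offset - ranges[i].base < ranges[i].length)
                        return (ranges[i].owner);
        }
        return (-1);
}
```

Writing the bound check as a subtraction (offset - base < length) avoids overflow when a range sits near the top of the address space.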

Isochronous I/O

Due to the potentially large volume of isochronous data and the critical isochronous timing needs, the 1394 Software Framework provides a mechanism designed to reduce call overhead and to maximize throughput.

Before starting isochronous I/O, a target driver sets up the overall sequence and structure of receive or transmit buffers, indicating other needs such as when the 1394 Framework should invoke a target driver's callback. Once isochronous I/O is started, the target driver can focus most of its time on handling the data.

The 1394 Framework mechanism for configuring isochronous I/O is the Isochronous Transfer Language (IXL). The IXL is a hardware-independent set of control blocks that the target driver uses to direct isochronous DMA. The 1394 Software Framework converts the hardware-independent IXL into the appropriate DMA directives for the local host 1394 interface hardware. So if the hardware changes, the impact to the target driver is minimal or non-existent.

The 1394 Software Framework also facilitates peer-to-peer communication by tracking all target drivers with an interest in a particular isochronous stream, allocating a channel number and bandwidth as needed, and coordinating the target driver notification of stream starts and stops.

Bus Reset, Isochronous Resource Manager, Bus Manager

In addition to complying with the IEEE 1394 requirements for bus reset processing, such as cancelling pending asynchronous requests, the 1394 Framework provides several bus reset related features.

One of the most severe effects of a bus reset is the re-enumeration of all the nodes on the bus. The 1394 Software Framework assesses the post bus reset topology and determines the new node_IDs for all target driver instances. It can then reissue any uncompleted outgoing asynchronous requests on behalf of the issuing target driver, and each target driver can continue on without concern for their new node number.

As part of the topology evaluation, the 1394 Software Framework also creates a speed map to determine the maximum packet speed between any two nodes. The Framework uses the speed map to select the most efficient speed for target driver outgoing asynchronous requests.

In addition to the topology map and speed map which are part of 1394 bus manager duties, the 1394 Software Framework also contends for isochronous resource manager and bus manager. If it is bus manager, the Framework will ensure that the root is cycle master capable and optimize the gap count.

Hotplug

Another aspect of the Framework's topology evaluation is that it determines which devices, if any, have been removed from the bus and which ones have been added to the bus. For removed devices, the 1394 Software Framework calls into the Solaris Hotplug Framework to notify it that the device is offline. For added devices, the Framework reads the device's configuration ROM to determine the pertinent information, the Global Unique ID and often the Unit_Spec_Id and Unit_Sw_Version, and creates the Solaris "/devices" node using the Solaris Hotplug Framework interfaces.

Building a target driver

To ensure that the Solaris 1394 Software Framework is loaded, target drivers must link with a dynamic dependency on the Framework misc module. This is done using the '-N' flag with ld:

        ld -r -dy -Nmisc/s1394 -o target target1.o target2.o

1394 Device /devices pathname

Solaris device entries for IEEE 1394 devices are created based on the device's global unique ID. The format of the name uses a prefix of "unit@" followed by the GUID in hexadecimal. An example /devices pathname for device A above is as follows ("tdA" is target driver A's minor name):

        /devices/pci@1f,4000/firewire@4,2/unit@0800460200000016,0:tdA
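For illustration, the "unit@" component of that pathname can be produced from a 64-bit GUID like this. It's a small sketch: model_unit_name() is a hypothetical helper (not a Framework routine), and the trailing ",0" is simply copied from the example pathname.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the "unit@..." component for a 64-bit GUID: 16 lower-case hex
 * digits, zero-padded.  The trailing ",0" is copied from the example
 * pathname; what it encodes is not covered here.
 * Returns the snprintf() result.
 */
int
model_unit_name(char *buf, size_t len, uint64_t guid)
{
        return (snprintf(buf, len, "unit@%016" PRIx64 ",0", guid));
}
```

Calling it with the GUID 0x0800460200000016 gives the same "unit@0800460200000016,0" component seen in the example.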

Adding a driver

Although the /devices name for the device is based on the GUID, the device driver itself is bound to the device(s) based on the first pair from the following list to exist in the device's configuration ROM:

        1. Unit_Spec_Id, Unit_Sw_Version
        2. Node_Spec_Id, Node_Sw_Version
        3. Node_Vendor_Id, Node_Hw_Version
        4. Module_Spec_Id, Module_Sw_Version
        5. Module_Vendor_Id, Module_Hw_Version

For further information on the layout of configuration ROM and the meaning of these values, refer to IEEE 1212-1994 Section 8 and IEEE 1394-1995 Section 8.3.2.5. For specific information about configuration ROM for a particular device class, refer to the device class specification.
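The "first pair to exist" rule amounts to walking the list in priority order and stopping at the first hit. Here's a minimal sketch of that selection, assuming the configuration ROM has already been parsed into a table; model_pair_t and model_pick_binding() are hypothetical names of mine, not Framework code.

```c
#include <stddef.h>
#include <stdint.h>

/* One candidate pair from the configuration ROM (hypothetical layout). */
typedef struct model_pair {
        const char      *name;          /* e.g. "Unit_Spec_Id, Unit_Sw_Version" */
        int             present;        /* nonzero if both entries exist in the ROM */
        uint32_t        id;
        uint32_t        version;
} model_pair_t;

/*
 * `pairs` must be ordered by priority, highest first (as in the numbered
 * list).  Returns the first pair present, or NULL if the ROM has none.
 */
const model_pair_t *
model_pick_binding(const model_pair_t *pairs, size_t npairs)
{
        size_t i;

        for (i = 0; i < npairs; i++) {
                if (pairs[i].present)
                        return (&pairs[i]);
        }
        return (NULL);
}
```

So a device whose ROM carries both a Node_Spec_Id pair and a Node_Vendor_Id pair, but no Unit_Spec_Id pair, would be bound using the Node_Spec_Id pair.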

After parsing configuration ROM and locating one of the pairs as shown above, the Solaris 1394 Software Framework provides the information to the hotplug framework. If a driver is configured to bind to the designated pair, the /devices and /dev entries are created and the driver's attach() routine is invoked. For example, to add a driver for a video conferencing camera which adheres to the 1394 Digital Camera Draft 1.04 (note that hexadecimal letters must be in lower case):

        add_drv -n -i \"firewire00a02d,000100\" tdA

Where firewire is the hardware interface, 00a02d is the Unit_Spec_Id and 000100 is the Unit_Sw_Version.
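If you'd rather generate that alias string than type it, the layout is easy to reproduce. A small sketch, assuming both values are the usual 24-bit configuration ROM quantities; model_alias() is a hypothetical helper, not part of add_drv or the Framework.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Build the alias string: "firewire", then the two 24-bit values as six
 * lower-case hex digits each, separated by a comma.
 * Returns the snprintf() result.
 */
int
model_alias(char *buf, size_t len, uint32_t spec_id, uint32_t sw_version)
{
        return (snprintf(buf, len, "firewire%06x,%06x",
            (unsigned int)(spec_id & 0xffffff),
            (unsigned int)(sw_version & 0xffffff)));
}
```

The %06x conversion takes care of both the zero padding and the lower-case requirement.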

How the source is organized

All the source and headers for the Solaris 1394 Framework can be found under:

        usr/src/uts/common/io/1394
        usr/src/uts/common/sys/1394

The files themselves break down this way:

        * t1394.c - 1394 Target Driver Interfaces
        * s1394.c - 1394 Services Layer Initialization and Cleanup Routines
        * s1394_addr.c - 1394 Address Space Routines
        * s1394_asynch.c - 1394 Services Layer Asynch Communications Routines
        * s1394_bus_reset.c - 1394 Services Layer Bus Reset Routines
        * s1394_csr.c - 1394 Services Layer CSR and Config ROM Routines
        * s1394_dev_disc.c - 1394 Services Layer Device Discovery Routines
        * s1394_hotplug.c - 1394 Services Layer Hotplug Routines
        * s1394_isoch.c - 1394 Services Layer Isoch Communications Routines
        * s1394_misc.c - 1394 Services Layer Miscellaneous Routines
        * nx1394.c - 1394 Services Layer Nexus Support Routines
        * h1394.c - 1394 Services Layer HAL Interfaces
        * t1394_errmsg.c - Utility function that targets can use to convert an
                           error code into a printable string.
        * s1394_cmp.c - 1394 Services Layer Connection Management Procedures Support Routines
        * s1394_fa.c  - 1394 Services Layer Fixed Address Support Routines
                        (Currently used only for FCP support)
        * s1394_fcp.c - 1394 Services Layer FCP Support Routines
        * t1394.h
            This is the primary header file for the 1394 Framework and it includes
            all other header files listed below.  In addition, it contains all 1394
            Framework interface routine prototypes as well as all data structure and
            defines beginning with the "t1394_" prefix.  (n.b. This one's pretty
            well-commented, if I say so myself.)
        * cmd1394.h
            This file contains all structures and defines for handling asynchronous
            commands.
        * id1394.h
            This file contains all structures and defines for managing a local
            isochronous DMA resource.
        * ieee1394.h
            This file contains general IEEE 1394 defines.
        * ieee1212.h
            This file contains general IEEE 1212 defines.
        * ixl1394.h
            This file contains all structures and defines for utilizing IXL programs.
        * h1394.h
           This file contains the structure and error codes used to communicate between
           the HAL and the rest of the 1394 Software Framework
        * s1394.h
           This file contains all of the structures used (internally) by the 1394
           Software Framework.
        * s1394_impl.h
           This file contains typedefs and defines used by all 1394 Software Framework
           files.

The source for our OpenHCI-compliant HAL driver can be found in:

        usr/src/uts/common/io/1394/adapters
        usr/src/uts/common/sys/1394/adapters

The files here are numerous, so I will hold off saying more about this driver until some later entries.

        hci1394_extern.c       hci1394_misc.c         hci1394.c
        hci1394_ioctl.c        hci1394_ohci.c         hci1394.conf
        hci1394_async.c        hci1394_isr.c          hci1394_s1394if.c
        hci1394_attach.c       hci1394_ixl_comp.c     hci1394_tlabel.c
        hci1394_buf.c          hci1394_ixl_isr.c      hci1394_tlist.c
        hci1394_csr.c          hci1394_ixl_misc.c     hci1394_vendor.c
        hci1394_detach.c       hci1394_ixl_update.c
        hci1394_isoch.c        hci1394_q.c

        hci1394.h              hci1394_extern.h       hci1394_state.h
        hci1394_async.h        hci1394_ioctl.h        hci1394_tlabel.h
        hci1394_buf.h          hci1394_isoch.h        hci1394_tlist.h
        hci1394_csr.h          hci1394_ixl.h          hci1394_tnf.h
        hci1394_def.h          hci1394_ohci.h         hci1394_vendor.h
        hci1394_descriptors.h  hci1394_q.h            hci1394_drvinfo.h
        hci1394_rio_regs.h

And the source for the existing target drivers (mentioned above) can be found in:

        usr/src/uts/common/io/1394/targets/av1394
        usr/src/uts/common/io/1394/targets/scsa1394

OK... So what's next?

So my basic plan is to continue with some discussion of driver attach() and detach() interfaces (see t1394_attach() and t1394_detach() in the source) and basic 1394 event processing. Then to move on to talk about the asynch interfaces, the isoch interfaces, and finally some of the miscellaneous interface routines. At this point, I thought I'd change the focus from a description of the interfaces and how to use them to a more detailed examination of some specific bits of internals code.

But I'm open to suggestions too. If you've read this far and feel like you may be interested in reading more, going through the code yourself, sharing your thoughts and comments, and end up with something specific you'd like to hear about... lemme know.

(2005-06-14 09:00:00.0) Permalink

20041130 Tuesday November 30, 2004

So you actually wanted to use your iPod with Solaris 10?!

Well... so it turns out that there's actually a Solaris bug that prevents the iPod from mounting properly (without a little coaxing). Let's just say we're working on it.

Update (01/18/2005): I have just confirmed that the workaround described below is completely unnecessary for Solaris 10 on x86. The 'pcfs' bug and workaround below only affect Solaris SPARC. If you are using Solaris SPARC, continue reading. Otherwise skip to "But wait... there's more..." below.

But if you're reading this, then you probably actually wanted to try out your iPod with Solaris 10 now. So I've included some details below on the workaround I'm using.

Problem
The iPod contains two partitions: the 1st is a small one used by iPod firmware, the 2nd is the rest of the disk. Originally, both are marked FAT32 and both are marked primary. But unfortunately Solaris 'pcfs' doesn't see the 2nd partition (which is, of course, a bug).

The 1st partition's signature is in the first sector:

  # dd if=/dev/rdsk/c5t0d0p0 count=1 | od -c
  1+0 records in
  1+0 records out
  0000000 353   X 220   M   S   D   O   S   5   .   0 002  \0  \b  \0 002
  0000020 002  \0  \0  \0  \0 370  \0  \0   ?  \0 377  \0  \0  \0  \0  \0
  0000040   _ 340 276 001  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
  0000060  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
  0000100 002  \0   )   R   6 213 250   i   P   o   d  \0   M   E
  0000120           F   A   T   3   2               3 311 216 321 274 364
  .......

Now the first part of the workaround is basically to modify the "FAT32" signature to something different, e.g. replacing 'F' with 'C' will be sufficient. As a result, 'pcfs' will ignore the 1st partition and will skip to the 2nd.
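The byte change itself is tiny. Here is a sketch of it done on an in-memory copy of the first sector, so nothing touches the disk. It assumes the "FAT32" string sits at byte offset 82, which is where the file-system type field lives in a standard FAT32 boot sector (and matches the od output above); model_patch_signature() is a hypothetical helper. Keep a copy of the original sector somewhere before writing anything back.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_SECTOR_SIZE       512
#define MODEL_FSTYPE_OFFSET     82      /* file-system type string in a FAT32 boot sector */

/*
 * If the sector carries the "FAT32" string at the expected offset, change
 * its first letter to `repl` and return 0.  Otherwise leave the buffer
 * untouched and return -1.
 */
int
model_patch_signature(unsigned char *sector, size_t len, char repl)
{
        if (len < MODEL_SECTOR_SIZE)
                return (-1);
        if (memcmp(sector + MODEL_FSTYPE_OFFSET, "FAT32", 5) != 0)
                return (-1);
        sector[MODEL_FSTYPE_OFFSET] = (unsigned char)repl;
        return (0);
}
```

Because the function refuses to touch a sector that doesn't already say "FAT32", running it twice on the same buffer is harmless.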

But wait... there's more...

The iPod's music repository is also located in a hidden directory. And, by default, Solaris 'pcfs' will not show hidden directories. However, there is an undocumented 'hidden' option to override this behavior. So your mount command will typically look something like this:

  # mount -F pcfs -o hidden /dev/dsk/c5t0d0s0:1 /mnt/ipod

Update, cont'd. (01/18/2005): If you are using Solaris 10 on x86, then you need only to mount the 2nd partition with:

  # mount -F pcfs -o hidden /dev/dsk/c5t0d0p2 /mnt/ipod
Nothing else, no additional hacking of the first sector, etc. For Solaris 10 on x86, all of the rest of the hacks I described here are unnecessary.

Summary
So basically what I did to my iPod was this:

At this point everything is ready to roll. And I, personally, just leave my iPod in this 'hacked' state. It makes it easier going from work to home. And I haven't had any problems using it in this state on Windows, Solaris, or Linux, with iTunes, or in the car. But if you're uncomfortable and want to undo, you can always do this at any time:
  dd if=restoreoriginal_iPodhack of=/dev/rdsk/c5t0d0s0 count=1

So that's it. No big deal really. Hopefully we'll get this Solaris bug fixed soon and none of us will have to go through this convoluted process just to use our iPods. But until then... I hope this workaround helps.

BTW... with the help of some gtkpod developers, the latest version (v0.85) now compiles and runs unmodified on Solaris 10. Pretty cool. Thanks guys! Try it out for yourself.

(2004-11-30 18:02:59.0) Permalink Comments [4]

20041028 Thursday October 28, 2004

2004 World Series Champion Boston Red Sox!

Congratulations, 2004 Boston Red Sox!

You did it! And the entire New England region is so proud of you. Thank you!





(2004-10-27 21:25:45.0) Permalink

20041021 Thursday October 21, 2004

Incredible, stunning...

Congratulations, 2004 Boston Red Sox!

As a season ticket holder, I was fortunate enough to see you play at home at Fenway in Games 3, 4, and 5. And while Game 3 was probably the 2nd most painful loss I've been witness to in my lifetime (second only to last year's Game 7, *ugh*), I was there till the finish, in the frigid cold. (After all, no matter how painful... how many playoff games does one get to go to in one's lifetime?)

In the bleachers, I found myself able to stretch out... all the while hoping, praying that you would come back from an over-10-run deficit. Alas, it did not happen and Fenway Park emptied in relative silence.

But I can honestly say that I consider myself truly blessed for the opportunity I had to share the drama of Games 4 and 5 with you men and the thousands that turned out at Fenway that night.

Game 4, postponed one day because of rain, was miraculous. On the verge of being swept in four games by your hated rival, the New York Yankees, you came to the ballpark ready to play. I came consoling myself with a single thought... "any reason they can't win tonight? no" And down to your last three outs, you pulled it together and scratched across a run against arguably history's best closer: a walk, a stolen base, a base hit. Tie ball game! Then three more incredibly intense innings stretched into the next day's cold early hours before Papi's heroics. Incredible!

Then Game 5, starting in the evening on the same day Game 4 had ended. "Any reason they can't win this game tonight?", I said, "no". We came to the ballpark so very tired from the previous evening's ballgame. But you gave us another game for the ages!

After Derek Jeter hit that killer triple down the right field line, giving them a two-run lead, we were silent, stunned, all of us. But your pitching and defense prevailed and after an inning or so all 35,000+ of us were right back in it. But you didn't blink, you didn't doubt yourselves for a second. A blast by Papi and another manufactured run on a sacrifice fly by Varitek, and we were headed to extra innings again. Stand up, sit down, stand up, sit down, Go Sox! Into the 14th inning it went....

Then Papi came to the plate with Damon in scoring position and put on THE BEST at-bat I have ever seen. And this I will remember for a VERY long time. Foul ball after foul ball sprayed off, more foul balls, more foul balls, a LONG drive down the right field line... could it be? no, another foul, another foul. And then... a flare, dropped into center field, Damon comes across... and pandemonium in Boston. High-fives and hugs with complete strangers, we all couldn't care less. We felt what I felt... blessed.

So you went back to New York and you did it. Game 6, Bellhorn, Schilling... amazing. And then, last night, Game 7... I'm still stunned. You did it, and now we're going to a World Series at Fenway Park!

It's not over yet. Not by any means. You still have four more games to win. But we'll be there, all 35,000+ of us, cheering. We do believe, and we wish you all the strength and wisdom in the world to keep believing in yourselves.

Thanks, guys.

(2004-10-21 18:04:42.0) Permalink

20041011 Monday October 11, 2004

More iPod working on Solaris 10

OK, so I got gtkpod working. Pretty good app. Nice and easy to use.

It wasn't too hard to build on Solaris 10, actually, but it didn't exactly work right out of the box either (i.e. not just 'configure; make; make install'). No big deal though. It just needed to be hacked up a little to get it running.

I started by downloading the latest release gtkpod-0.80-2 and found that I also needed to download and build libid3tag. So I grabbed libid3tag-0.15.1b and it built without any changes. Then I started to build gtkpod-0.80-2 and found that I needed to make a few changes, mostly minor.

A couple of the files ('clientserver.c' and 'file.c') make calls to flock() that really oughta be calls to fcntl() for Solaris. And in 'info.c' there's a call to 'df -k -P', where '-P' is supposedly POSIX output format. That extra argument causes problems (and is unnecessary) for Solaris, so I took it out. And in 'misc.c' the function called which() seemed to be causing me some problems (but this could just be a problem with my environment.) Not sure though, so... when in doubt, hack it out.
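
The real change was to gtkpod's C sources, which aren't reproduced here. As a rough illustration of the same idea in Python, the standard `fcntl` module exposes both styles: `fcntl.flock()` for the BSD-style call and `fcntl.lockf()`, which wraps the fcntl() record-locking call. The helper names `lock_whole_file` and `unlock_whole_file` are hypothetical, and this is only a sketch of swapping a whole-file flock() for an fcntl()-based lock.

```python
import errno
import fcntl


def lock_whole_file(fd: int, blocking: bool = True) -> bool:
    """Take an exclusive fcntl()-style lock on the whole file.

    Returns False if the lock is held elsewhere and blocking is False.
    The descriptor must be open for writing to take an exclusive lock.
    """
    flags = fcntl.LOCK_EX
    if not blocking:
        flags |= fcntl.LOCK_NB
    try:
        # Default len=0, start=0 covers the file from its start to its end.
        fcntl.lockf(fd, flags)
        return True
    except OSError as exc:
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return False
        raise


def unlock_whole_file(fd: int) -> None:
    """Release the lock taken by lock_whole_file()."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
```

One difference to keep in mind when making this kind of swap: fcntl() locks belong to the process, and closing any descriptor for the file drops them, so code that opens and closes the same file in several places may need a second look.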

The only other things I had to do were also probably related to my environment. Couple minor changes in the makefiles: invalid compiler flag, some issues with duplicate/conflicting header files, and needing to add 'libnsl' and 'libsocket' in with the final link. No biggie. After all that was done, I was ready to play.

As I said above, this is a pretty good app. I didn't build it with any AAC support yet (see libmp4v2 though), so I can't listen to my handful of iTunes songs. (Plus there's that whole DRM issue, which I won't go into here.) But I am able to listen to all my MP3's (all ripped from CD's I own.)

(2004-10-11 18:07:11.0) Permalink Comments [5]

20041007 Thursday October 07, 2004

iPod working on Solaris 10

Got my iPod working with my Solaris desktop. Very cool.

Sometime around the end of July, the support for 1394 mass storage devices (CD-ROMs, DVD-ROMs, Zipdisks, and devices like the iPod - which is just a 1394 hard drive) was integrated into Solaris 10. I run the latest S10 build on my desktop, of course, so I've got the new Solaris 1394 Mass Storage driver, but most of you will have to wait for the next Solaris Express release (any day now).

The new 'scsa1394' driver implements the Serial Bus Protocol 2 (SBP-2) specification, which allows 1394 mass storage devices, like the iPod, to look and act like any other disk that you're used to using. This driver joins the existing collection of drivers known collectively as the Solaris 1394 Software Framework. The framework provides support for FireWire on SPARC and x86 systems and supports both 1394 digital video (DV) camera devices and 1394 conferencing camera ("webcam") devices, in addition to the new 1394 mass storage devices.

So, anyway, I plug my iPod into one of the FireWire ports on my system (it's immediately recognized after the hotplug), mount the drive, 'cd' and/or 'ls' to see the music files (and all the other hidden tidbits in there) and run xmms to listen to my tunes while I'm in my office at work.

Next step, I figure, is to try to get something like gtkpod working on Solaris 10 so that I can add/remove files from the iPod's database, use my playlists, and have a nicer GUI interface. But it's actually pretty cool as it is. I'll post again, though, when I get it working.

In short though, if you use Solaris and you love your iPod, stay tuned for some really cool new stuff coming to your desktop.

(2004-10-07 18:39:30.0) Permalink Comments [12]

20041006 Wednesday October 06, 2004

So proud of her

I can't say enough about how proud I am of my Mom! And now she's published.

The book, called "They Change Their Sky: The Irish in Maine", is a collection of ten essays that present a story of the experiences of Maine's Irish from their first arrival in Maine, through the hey-day of Irish immigration in the mid-19th and early-20th centuries, and concluding with the modern experience. My Mom's essay is called "The Irish Experience in Lewiston, Maine 1850-1880". It is thoroughly researched and a very interesting read. (And if you don't believe me, check this out)

The majority of the material for her essay is based on research she had done as an undergrad at Bates College (in Lewiston, ME) and as a grad student at University of Virginia. She actually grew up in Maine, but she's not Irish. She was a history major, though, and (if I do say so myself) she's pretty smart.

The book is being published by the University of Maine Press, so don't expect to rush out and buy the book on amazon.com just yet. But, if you're interested, you can order a copy here. Proceeds from the sale of the book help support the Maine Irish Heritage Center, located in Portland, ME.

(2004-10-06 17:12:53.0) Permalink

20041002 Saturday October 02, 2004

Introduction

I've been a software engineer in the Platform I/O Software group on the East Coast for 6+ years. Generally speaking, we do Solaris device drivers and device driver frameworks for mid-sized and (lately) blade servers. It's always interesting work and stressful schedules, but at the same time we're always getting to work on the "bleeding edge" with some of the newest, hottest technologies. Some examples of things I've worked on in my time at Sun:

These days I'm keeping myself pretty busy here at work doing bringup on prototype hardware. There's always something new and exciting to learn about (e.g. AMD64 hardware, PCI-Express, HyperTransport, InfiniBand, etc.)

I came to Sun pretty much directly out of school (M.S. in Computer Science - University of Maryland at College Park & B.S. in Computer and Systems Engineering, Rensselaer Polytechnic Institute). And, yes, Sun has had its ups and downs over the last six years... but I have an excellent manager (ooh! brownie points!) and I can honestly say that I love being able to come in every morning to work on this stuff.

So, with any luck, my blog will give a little taste of what it is that I enjoy so much about my work.

(2004-10-02 13:48:33.0) Permalink