The HyperTrap
Alexandre Chartre's Weblog
20070816 Thursday August 16, 2007
Split PCI with LDoms
With LDoms, you have the ability to partition hardware resources so that you can create multiple domains which have direct access to the hardware of the system. This is very similar to the dynamic system domains feature available on Sun's mid-range and high-end systems such as the Sun Fire E20K/E25K or the new Sun SPARC Enterprise M9000 or M8000. The main difference is that with LDoms the partitioning is done by the software through the hypervisor layer while it is done by the hardware for dynamic system domains.

I/O domains

Logical domains which have direct access to the hardware are called I/O domains. You always have at least one I/O domain: the primary domain, which is the first domain created on the system. You can then create additional I/O domains by removing some hardware resources from the primary domain and assigning them to another domain. The number of I/O domains you can create depends on the hardware resources available, and therefore on the type of system you are using.

PCI buses on Sun Fire T2000

Let's look at the Sun Fire T2000 Server for a concrete example. On this system, the smallest hardware resource you can assign to a domain is an entire PCI bus. The Sun Fire T2000 has only two PCI buses, so you can create a maximum of two I/O domains.

The two PCI buses of the Sun Fire T2000 server are initially assigned to the primary domain. The buses are identified as pci@780 (or bus_a) and pci@7c0 (or bus_b).

Both buses have two network interfaces, but the other resources are not so evenly spread: pci@7c0 (bus_b) has all the internal disks, the DVD-ROM and 4 PCI slots, while pci@780 (bus_a) has only one PCI slot.

Creating an I/O domain with bus pci@7c0 (bus_b) is therefore straightforward, because that bus provides all the basic hardware resources you need (i.e. a disk and a network interface). With bus pci@780 (bus_a), however, you get network interfaces but no disk. To create an I/O domain with pci@780 (bus_a), you have to add a PCI-E card (either a Fibre Channel or a SCSI host adapter) in PCI-E slot 0 to get access to some storage devices. You also have to ensure that the card you add can be used to boot the system.

Configuration of the primary domain

Initially both PCI buses are assigned to the primary domain. You can verify this with the "ldm list-bindings" command:

primary# ldm list-bindings primary
...
    IO:    pci@780 (bus_a)
           pci@7c0 (bus_b)
...
However, to be able to split the PCI buses, the primary domain must use devices from only one PCI bus. Most of the time that will be bus pci@7c0 (bus_b), because the system disk of the primary domain is an internal disk.
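One way to see which bus a device sits on is to look at its physical device path: on Solaris, entries under /dev are symbolic links into /devices, and the first pci@... component of that path names the bus. The following is a minimal shell sketch of that idea. The function names root_bus_of and all_on_bus are hypothetical helpers, and the paths mentioned in the comments are illustrative, not copied from a real system.

```shell
# Print the root PCI bus (e.g. pci@7c0) of a physical device path.
# Works with or without a leading /devices or ../../devices prefix.
root_bus_of() {
    printf '%s\n' "$1" | sed -n 's|^[^@]*/\(pci@[0-9a-f]*\)/.*|\1|p'
}

# Read device paths on stdin, one per line, and succeed only if every
# one of them sits under the expected root bus given as $1.
all_on_bus() {
    expected=$1
    while IFS= read -r p; do
        if [ "$(root_bus_of "$p")" != "$expected" ]; then
            echo "not on $expected: $p" >&2
            return 1
        fi
    done
    return 0
}

# Illustrative use (paths are made up for the example):
#   printf '%s\n' /devices/pci@7c0/pci@0/disk@0 | all_on_bus pci@7c0
```

To feed it real paths, "ls -l" on the disk and network device nodes used by the primary domain shows the /devices target of each link.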

After checking that the primary domain is only using devices from bus pci@7c0 (bus_b), you can remove bus pci@780 (bus_a) from the configuration of the primary domain. This can be done using the "ldm remove-io" command:

primary# ldm remove-io bus_a primary
The reconfiguration is not immediate: you have to reboot the primary domain for the removal of pci@780 (bus_a) to take effect. After the primary domain has rebooted, you can check that it now owns only bus pci@7c0 (bus_b):
primary# ldm list-bindings primary
...
    IO:    pci@7c0 (bus_b)
...
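The check above can also be scripted. This is a rough sketch that assumes the "IO:" layout shown in this post (a bus path followed by its pseudonym in parentheses, with continuation lines for further buses); other LDoms releases may print the section differently. The function names io_buses and owns_only are hypothetical.

```shell
# Read "ldm list-bindings" output on stdin and print one bus pseudonym
# per line (e.g. bus_b). Assumes the IO: layout shown in this post.
io_buses() {
    awk '
        /^[ \t]*IO:/ { in_io = 1; sub(/^[ \t]*IO:/, "") }  # IO section starts
        in_io {
            if (match($0, /\([^)]*\)/))        # pseudonym in parentheses
                print substr($0, RSTART + 1, RLENGTH - 2)
            else
                in_io = 0                      # any other line ends the section
        }
    '
}

# Succeed only if the bindings on stdin list exactly the one bus in $1.
owns_only() {
    [ "$(io_buses)" = "$1" ]
}

# Intended use on the control domain:
#   ldm list-bindings primary | owns_only bus_b && echo "primary owns only bus_b"
```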

Configuration of the alternate I/O domain

Now that PCI bus pci@780 (bus_a) is available, you can assign it to another domain. To do so, you just have to use the "ldm add-io" command while configuring your alternate domain:

primary# ldm create alternate
primary# ldm set-vcpu 4 alternate
primary# ldm set-mem 4G alternate
primary# ldm add-io bus_a alternate
This creates an alternate I/O domain with 4 virtual CPUs, 4GB of memory and the PCI bus pci@780 (bus_a). After the alternate domain is configured, it can be started like a regular domain with the "ldm bind" and "ldm start" commands:
primary# ldm bind alternate
primary# ldm start alternate
When the alternate domain is bound, you can check that it is using bus pci@780 (bus_a):
primary# ldm list-bindings alternate
...
    IO:    pci@780 (bus_a)
...
You can then connect to the console of that domain to install it. The installation can be done over the network with a "boot net", just as for a regular SPARC system.
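The configuration steps above can be collected into a small script. The sketch below only wraps the ldm subcommands already used in this post. The helper names run and make_io_domain, and the RUN variable, are hypothetical. By default it prints the commands instead of executing them, so the sequence can be reviewed first.

```shell
# Echo a command, or execute it when RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# make_io_domain NAME VCPUS MEMORY BUS
# Create a domain, give it CPUs, memory and a PCI bus, then bind and
# start it. Stops at the first command that fails.
make_io_domain() {
    name=$1 vcpus=$2 mem=$3 bus=$4
    run ldm create "$name"            &&
    run ldm set-vcpu "$vcpus" "$name" &&
    run ldm set-mem "$mem" "$name"    &&
    run ldm add-io "$bus" "$name"     &&
    run ldm bind "$name"              &&
    run ldm start "$name"
}

# Dry run:   make_io_domain alternate 4 4G bus_a
# Real run:  RUN=1 make_io_domain alternate 4 4G bus_a
```

The bus must already have been removed from the primary domain, and the primary domain rebooted, before running this for real.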

Differences on Sun Fire T1000

You can set up the same configuration on a Sun Fire T1000 Server. The Sun Fire T1000 has two PCI buses similar to those of the Sun Fire T2000: pci@780 (bus_a) and pci@7c0 (bus_b). However, the Sun Fire T1000 has no PCI-E or PCI-X slots on bus pci@7c0 (bus_b). Fortunately it still has PCI-E slot 0 on bus pci@780 (bus_a), which can take an FC or SCSI host adapter to connect some storage for the alternate domain.

Virtual I/O Failover

Once you have more than one I/O domain, you can configure virtual I/O failover for guest domains. Check out Narayan's blog for details: Part One and Part Two.


Aug 16 2007, 10:31:30 AM PDT

Comments:

[Trackback] The "Logical Domain" (LDOM) virtualization technology, available on the new USparc T1/T2-based Sun servers with a hypervisor, is relatively new (LDOM 1.0) and, in the area of availability, still has room for optimization...

Posted by Otmanix Blog on August 16, 2007 at 02:28 PM PDT #

Hi,

Very useful stuff...

Have a query on implementing VxVm root mirroring for guest domains? Can you please post VxVm root mirroring over a logical domain such as...

1. exporting a VxVm mirrored volume to a guest domain
2. Performing a VxVm root mirroring inside a guest domain

Vinod

Posted by Vinod K on September 05, 2007 at 07:13 AM PDT #
