Wednesday Jul 30, 2008

vdbench

You can now download vdbench 4.07 for some heavy duty storage performance loading and modeling, as well as swat 3.00 for a nice tnfprobe and prex view of what's going on with your storage subsystem. These tools work nicely across a variety of platforms (notably Solaris and Windows .. vdbench also has Mac OS X, AIX, HP-UX, and Linux support) - they're both Java based (vdbench uses a C library for speedier workload generation) .. but be forewarned .. you'll need to be careful with vdbench, as you're wielding the power to write directly to any raw device you can see from the OS, which can overwrite existing data and labels (ie: think dd on steroids) - so take proper safety precautions when testing (or just use files on filesystems)!!
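For the "just use files on filesystems" route, a minimal sketch of a vdbench parameter file might look like the following. The file path, sizes, rates, and the sd1/wd1/run1 names are hypothetical values picked for illustration, and the keywords are from my recollection of the vdbench parameter syntax, so check them against the documentation shipped with your version before relying on it:

```shell
# Write a small, file-backed vdbench parameter file (no raw devices involved).
parmfile=$(mktemp)
cat > "$parmfile" <<'EOF'
* Storage definition: a 1 GB file on a filesystem (hypothetical path), not a raw device.
sd=sd1,lun=/tmp/vdbench_testfile,size=1g
* Workload definition: 8 KB transfers, 70% reads, fully random.
wd=wd1,sd=sd1,xfersize=8k,rdpct=70,seekpct=100
* Run definition: 100 IOPS for 30 seconds, reporting every second.
rd=run1,wd=wd1,iorate=100,elapsed=30,interval=1
EOF
echo "wrote $parmfile"
# To run it from the vdbench install directory (assumed layout):
#   ./vdbench -f "$parmfile"
```

Pointing lun= at a plain file keeps a typo from landing on a disk label, at the cost of measuring the filesystem as well as the storage underneath it.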

UPDATE: vdbench is now here, and the latest SWAT 3.02 is over here .. oh, and here's the official support statement:

“Swat and Vdbench are tools delivered and supported by the Sun Microsystems, Inc. Strategic Application Engineering (SAE) – Storage Performance Benchmarking Group (SPBG). It is the responsibility of SPBG to maintain, support, and enhance these tools, not the official Sun Service department. Additionally, the tools are supported for internal Sun use and Sun partners only – not the end users.” This is the official statement of support and its purpose is to make clear to the end user that Sun does not support these tools in its typical product fashion. However, if the tools are used in cooperation with Sun and/or one of its partners, for example in a sales situation, or when the tools are used to resolve a customer performance problem, then the tools will be supported by SPBG via the Sun field representative.

Wednesday Mar 26, 2008

I've been playing with a thumper and a couple of fibre channel cards for the past couple of weeks to build a simple storage array appliance .. thought I'd share the setup procedure for COMSTAR that Sumit has nicely put together .. in the example below, I'm exporting zvols from the thumper (killer) to my initiator host (tim) over a plain fibre channel connection (FC private loop) .. observe:

killer:~ root# mdb -k
Loading modules: [ unix genunix specfs dtrace cpu.generic cpu_ms.AuthenticAMD.15
 uppc pcplusmp scsi_vhci ufs ip hook neti sctp arp usba fctl nca lofs md cpc ran
dom crypto zfs fcip logindmux nsctl ptm sppp ]
> ::devbindings -q qlc
ffffff04de43a050 pci1077,2422, instance #0 (driver name: qlc)
ffffff04de437d48 pci1077,2422, instance #1 (driver name: qlc)
> ^D

killer:~ root# update_drv -a -i '"pci1077,2422"' qlt

<..reboot..>

killer:~ root# mdb -k
Loading modules: [ unix genunix specfs dtrace cpu.generic cpu_ms.AuthenticAMD.15
 uppc pcplusmp scsi_vhci ufs ip hook neti sctp arp usba fctl nca lofs md cpc ran
dom crypto zfs smbsrv fcip fcp logindmux nsctl sdbc sv ptm ii sppp rdc ]
> ::devbindings -q qlt
ffffff04de43a050 pci1077,2422, instance #0 (driver name: qlt)
ffffff04de437d48 pci1077,2422, instance #1 (driver name: qlt)
> ^D

killer:~ root# svcadm enable stmf
killer:~ root# svcs stmf 
STATE          STIME    FMRI
online         17:04:35 svc:/system/device/stmf:default

killer:~ root# stmfadm list-target -v
Target: wwn.210000E08B9E5134
    Operational Status: Online
    Provider Name     : qlt
    Alias             : qlt0,0
    Sessions          : 1
        Initiator: wwn.210000E08B9EE333
            Alias: -
            Logged in since: Thu Feb 21 17:50:40 2008
Target: wwn.210100E08BBE5134
    Operational Status: Online
    Provider Name     : qlt
    Alias             : qlt1,0
    Sessions          : 1
        Initiator: wwn.210100E08BBEE333
            Alias: -
            Logged in since: Thu Feb 21 17:50:40 2008

killer:~ root# zfs list                  
NAME               USED  AVAIL  REFER  MOUNTPOINT
bigpool           11.1T  3.15T  28.8K  /bigpool
bigpool/vol1       100G  3.16T  98.1G  -
bigpool/vol2         1T  3.18T  1002G  -
bigpool/vol3        10T  3.36T  9.80T  -
rootpool          55.0G   402G  23.5K  /rootpool
rootpool/rootfs   5.04G   402G  5.04G  legacy
rootpool/testvol    50G   403G  49.1G  -
scratch            106K   457G    18K  /scratch
killer:~ root# sbdadm create-lu /dev/zvol/rdsk/bigpool/vol1
Created the following LU:

              GUID                    DATA SIZE           SOURCE
--------------------------------  -------------------  ----------------
6000ae4080000000000047be01940001      107374116864     /dev/zvol/rdsk/bigpool/vol1
killer:~ root# sbdadm create-lu /dev/zvol/rdsk/bigpool/vol2

Created the following LU:

              GUID                    DATA SIZE           SOURCE
--------------------------------  -------------------  ----------------
6000ae4080000000000047be01ae0002      1099511562240    /dev/zvol/rdsk/bigpool/vol2
killer:~ root# sbdadm create-lu /dev/zvol/rdsk/bigpool/vol3

Created the following LU:

              GUID                    DATA SIZE           SOURCE
--------------------------------  -------------------  ----------------
6000ae4080000000000047be01b00003  10995116212224       /dev/zvol/rdsk/bigpool/vol3
killer:~ root# stmfadm list-lu -v
LU Name: 6000AE4080000000000047BE01940001
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/bigpool/vol1
    View Entry Count  : 0
LU Name: 6000AE4080000000000047BE01AE0002
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/bigpool/vol2
    View Entry Count  : 0
LU Name: 6000AE4080000000000047BE01B00003
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/bigpool/vol3
    View Entry Count  : 0
killer:~ root# stmfadm add-view -?

Usage:  stmfadm add-view [OPTIONS] 
        OPTIONS:
                -n, --lun  
                -t, --target-group  
                -h, --host-group  
killer:~ root# stmfadm add-view -n 0 6000AE4080000000000047BE01940001
killer:~ root# stmfadm add-view -n 1 6000AE4080000000000047BE01AE0002
killer:~ root# stmfadm add-view -n 2 6000AE4080000000000047BE01B00003
killer:~ root# stmfadm list-lu -v
LU Name: 6000AE4080000000000047BE01940001
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/bigpool/vol1
    View Entry Count  : 1
LU Name: 6000AE4080000000000047BE01AE0002
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/bigpool/vol2
    View Entry Count  : 1
LU Name: 6000AE4080000000000047BE01B00003
    Operational Status: Online
    Provider Name     : sbd
    Alias             : /dev/zvol/rdsk/bigpool/vol3
    View Entry Count  : 1
now back on tim
tim:~ root# cfgadm -al -o show_SCSI_LUN c4 c5
Ap_Id                          Type         Receptacle   Occupant     Condition
c4                             fc-private   connected    configured   unknown
c4::210000e08b9e5134,0         disk         connected    configured   unknown
c4::210000e08b9e5134,1         disk         connected    configured   unknown
c4::210000e08b9e5134,2         disk         connected    configured   unknown
c5                             fc-private   connected    configured   unknown
c5::210100e08bbe5134,0         disk         connected    configured   unknown
c5::210100e08bbe5134,1         disk         connected    configured   unknown
c5::210100e08bbe5134,2         disk         connected    configured   unknown

tim:~ root# echo | format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c3t0d0 
          /pci@0,0/pci1022,7458@11/pci1000,3060@4/sd@0,0
       1. c3t1d0 
          /pci@0,0/pci1022,7458@11/pci1000,3060@4/sd@1,0
       2. c3t2d0 
          /pci@0,0/pci1022,7458@11/pci1000,3060@4/sd@2,0
       3. c3t3d0 
          /pci@0,0/pci1022,7458@11/pci1000,3060@4/sd@3,0
       4. c6t6000AE4080000000000047BE01AE0002d0 
          /scsi_vhci/disk@g6000ae4080000000000047be01ae0002
       5. c6t6000AE4080000000000047BE01B00003d0 
          /scsi_vhci/disk@g6000ae4080000000000047be01b00003
       6. c6t6000AE4080000000000047BE01940001d0 
          /scsi_vhci/disk@g6000ae4080000000000047be01940001
Specify disk (enter its number): Specify disk (enter its number): 
tim:~ root# zpool create z1 c6t6000AE4080000000000047BE01940001d0
tim:~ root# zpool create z2 c6t6000AE4080000000000047BE01AE0002d0
tim:~ root# zpool create z3 c6t6000AE4080000000000047BE01B00003d0
tim:~ root# df -h
Filesystem             size   used  avail capacity  Mounted on
/dev/dsk/c3t0d0s0       66G   5.1G    61G     8%    /
/devices                 0K     0K     0K     0%    /devices
/dev                     0K     0K     0K     0%    /dev
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                    54G   1.0M    54G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
sharefs                  0K     0K     0K     0%    /etc/dfs/sharetab
/usr/lib/libc/libc_hwcap2.so.1
                        66G   5.1G    61G     8%    /lib/libc.so.1
fd                       0K     0K     0K     0%    /dev/fd
swap                    54G    44K    54G     1%    /tmp
swap                    54G    40K    54G     1%    /var/run
z1                      98G    18K    98G     1%    /z1
z2                    1000G    18K  1000G     1%    /z2
z3                     9.8T     1K   9.8T     1%    /z3

We'll talk about performance a little later, as well as some other details for building an easy appliance out of this setup in another entry .. stay tuned!
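With more than a handful of zvols, typing the GUIDs into stmfadm add-view by hand gets old. As a rough sketch (not part of the original procedure), the GUID lines printed by sbdadm create-lu in the transcript above can be turned into the matching add-view commands. The sample lines are copied from that transcript; numbering LUNs from 0 in input order and upper-casing the GUID simply mirror what was typed there, and are assumptions rather than requirements I've verified:

```shell
# Result lines from `sbdadm create-lu`, copied from the transcript above.
sbd_output='6000ae4080000000000047be01940001      107374116864     /dev/zvol/rdsk/bigpool/vol1
6000ae4080000000000047be01ae0002      1099511562240    /dev/zvol/rdsk/bigpool/vol2
6000ae4080000000000047be01b00003  10995116212224       /dev/zvol/rdsk/bigpool/vol3'

# Keep only lines whose first field is a 32-digit hex GUID, then print one
# add-view command per LU, assigning LUN numbers 0, 1, 2, ... in input order.
view_cmds=$(printf '%s\n' "$sbd_output" |
  awk '$1 ~ /^[0-9a-fA-F]+$/ && length($1) == 32 {
         printf "stmfadm add-view -n %d %s\n", n++, toupper($1)
       }')

# Print the commands for review; nothing is executed here.
printf '%s\n' "$view_cmds"
```

The commands are only printed, so they can be eyeballed before being pasted onto the target host.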

Friday Feb 29, 2008

One thing I appreciate in ZFS is the ability to quickly check system limits, or at least 64 bit code completeness, since making sparse volumes or sparse files is pretty easy. As most shells support the bitshift operator in arithmetic expansion, this is just a couple of trivial one-liners:

To make a sickly huge 8EB sparse file (and yes, that's 8 exabytes) on a zfs filesystem:

killer:bigpool jone# mkfile -n $(((1<<63)-512)) /bigpool/sickfile
And now to make a block aligned 8EB sparse volume:
killer:bigpool jone# zfs create -s -V $(((1<<63)-512)) bigpool/sickvol
let's take a look
killer:bigpool jone# ls -lh sickfile
-rw------T   1 root     root        8.0E Feb 29 17:11 sickfile

killer:bigpool root# format -e /dev/zvol/rdsk/bigpool/sickvol
selecting /dev/zvol/rdsk/bigpool/sickvol
No defect list found
[disk formatted, no defect list found]


FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        fdisk      - run the fdisk program
        repair     - repair a defective sector
        show       - translate a disk address
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> ver

Volume name = <        >
ascii name  = 
bytes/sector    =  512
sectors = 18014398509481855
accessible sectors = 18014398509481855
Part      Tag    Flag     First Sector                 Size                 Last Sector
  0   reserved    wm                34            8388608.00TB                  18014398509481855    

format>
egad! have i just boiled the oceans?? of course, actually filling a real volume like this is another story, since even at an unrealistic sustained 10GB/s it'd take you over 27 years .. enjoy!
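For the curious, the fill-time figure can be checked with the same kind of shell arithmetic. This is a sketch under a few assumptions of mine: the shell does 64 bit signed integer math, the size is rebuilt from 512-byte sectors so no intermediate value leaves that range, a year is 365 days, and "10GB/s" is tried both as 10 GiB/s and as 10 decimal GB/s:

```shell
# Size used above: 2^63 - 512 bytes, i.e. (2^54 - 1) sectors of 512 bytes.
sector=512
sectors=$(( (1 << 54) - 1 ))
bytes=$(( sectors * sector ))

# Seconds in a 365-day year.
year=$(( 365 * 24 * 3600 ))

# Binary reading of the rate: 10 GiB/s.
secs_binary=$(( bytes / (10 * (1 << 30)) ))
years_binary=$(( secs_binary / year ))

# Decimal reading of the rate: 10 GB/s.
secs_decimal=$(( bytes / 10000000000 ))
years_decimal=$(( secs_decimal / year ))

echo "size:         $bytes bytes"
echo "at 10 GiB/s:  $years_binary full years"
echo "at 10 GB/s:   $years_decimal full years"
```

Read as 10 GiB/s this comes to 27 full years and change, which matches the figure above; read as decimal gigabytes it is closer to 29.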

Friday Feb 15, 2008

forgive me father for i have sinned .. it's been over 2 years since my last blog entry - i must confess that it's just one of those things that's easy to put off for another day, but that day never comes until you seize it .. (carpe diem and all that corpus crapus) ..

anyhoo - there's a lot of exciting things that've been happening more recently in our storage software stack that i hope to delve into in the coming days, minutes, weeks, months and years .. but perhaps i should [re]-introduce myself and what i've been doing here the past 10+ years for the illustrious fireball in the sky:

year 1: internal system administration for the sun java centers (ah, the old professional service oriented java consulting days) .. this was an interesting exercise in promoting corporate communication and cooperation with a very small budget .. we focused a lot on building up aliases for cooperation and had a few machines that we found in various closets and through other sources to collaborate on some projects and internal development .. (let me just say that i thought i'd never see the day when sun consultants were welcome to tout whatever laptop and operating system they so desired, and our source would be completely opened)

years 2-10: various professional services engagements across the country .. ah from network solutions at their high time when they were split, to many hours at AT&T Wireless, AT&T corporate, many gov't jobs, hospitals, media companies, financial svc industries, and more .. always a blast implementing, testing, filing bugs, fixing, working around, proving and breaking many things

ever since the LSC acquisition back in 2001 (bringing us a wealth of experience from Control Data along with SAM-QFS) i've always had a special place in my heart for the filesystem and storage problem .. particularly as it relates to storage performance and global access issues .. it's been fun to see ZFS take the stage and integrate some lovely cache related goodness (ah - slab allocator on the VOP) and watch as the system folk took on a greater interest in storage issues (it's much more than an accessory you know) .. so i guess that's where i've really been focusing much of my efforts over the years .. for those of you i've had the pleasure to work with over the years - it's always been a treat to hear of the problems and issues you're trying to solve as well as your experiences with sun (i have many of my own)

This blog copyright 2009 by jone