Eric Kustarz's Weblog


Monday August 07, 2006

 vq_max_pending

As part of its I/O scheduling, ZFS has a tunable called 'zfs_vdev_max_pending'. It limits the number of I/Os ZFS will have outstanding to each leaf vdev at any one time. This is NOT a per-filesystem or per-pool maximum. The default is currently 35. That is a good number for today's disk drives; however, it is not a good number for a storage array that is really composed of many disks but is exported to ZFS as a single device.
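To put a rough number on why 35 is low for an array, here is a minimal sketch of the arithmetic. It assumes the array spreads outstanding I/Os evenly across its disks, which a real array may not do; the function name and the 10-disk figure are just for illustration.

```python
def per_disk_queue_depth(max_pending: int, disks_behind_vdev: int) -> float:
    """Average outstanding I/Os per physical disk behind one leaf vdev.

    Assumes (for illustration only) that the device spreads the
    outstanding I/Os evenly over its disks.
    """
    if disks_behind_vdev <= 0:
        raise ValueError("disks_behind_vdev must be positive")
    return max_pending / disks_behind_vdev


# A plain disk drive gets the whole limit to itself.
single_disk = per_disk_queue_depth(35, 1)

# A hypothetical 10-disk array exported as one device shares the same limit.
ten_disk_array = per_disk_queue_depth(35, 10)

print(single_disk, ten_disk_array)
```

Under that assumption, each disk in the array sees only a tenth of the queue depth that a standalone drive would.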

This limit is a really good thing when you have a heavy I/O load as described in Bill's "ZFS vs. The Benchmark" blog.

But if you've created, say, a two-device mirrored pool where each device is really a 10-disk storage array, and you think that ZFS just isn't doing enough I/O for you, here's a script to see if that's true:

#!/usr/sbin/dtrace -s

/* the function returned an I/O to issue (arg1 is the return value) */
vdev_queue_io_to_issue:return
/arg1 != NULL/
{
        @c["issued I/O"] = count();
}

/* the function returned NULL, so nothing was issued on this call */
vdev_queue_io_to_issue:return
/arg1 == NULL/
{
        @c["didn't issue I/O"] = count();
}

/* on every attempt, record how many I/Os are already pending on this vdev queue */
vdev_queue_io_to_issue:entry
{
        @avgers["avg pending I/Os"] = avg(args[0]->vq_pending_tree.avl_numnodes);
        @lquant["quant pending I/Os"] = quantize(args[0]->vq_pending_tree.avl_numnodes);
        @c["total times tried to issue I/O"] = count();
}

/* same statistics, restricted to attempts with more than 349 I/Os pending */
vdev_queue_io_to_issue:entry
/args[0]->vq_pending_tree.avl_numnodes > 349/
{
        @avgers["avg pending I/Os > 349"] = avg(args[0]->vq_pending_tree.avl_numnodes);
        @quant["quant pending I/Os > 349"] = lquantize(args[0]->vq_pending_tree.avl_numnodes, 33, 1000, 1);
        @c["total times tried to issue I/O where > 349"] = count();
}

/* bail after 5 minutes */
tick-300sec
{
        exit(0);
}

If you see the "avg pending I/Os" hitting your vq_max_pending limit, then raising the limit would be a good thing. The limit used to be set per vdev, but there is now a single global variable, zfs_vdev_max_pending, that changes it for all vdevs:
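As a rough sketch of that rule of thumb: take the "avg pending I/Os" value the script prints and compare it with the limit. The 90% threshold and the function name here are my own assumptions for illustration, not something ZFS defines.

```python
def limit_is_saturated(avg_pending: float, max_pending: int,
                       threshold: float = 0.9) -> bool:
    """True if the average pending count sits at or near the limit.

    threshold is an arbitrary illustrative cutoff (90% of the limit),
    not a value taken from ZFS.
    """
    return avg_pending >= threshold * max_pending


# Hypothetical readings of "avg pending I/Os" against the default limit of 35.
print(limit_is_saturated(34.2, 35))
print(limit_is_saturated(12.0, 35))
```

An average pinned near the limit suggests the queue is the bottleneck; an average well below it suggests raising the limit won't buy you much.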

heavy# mdb -kw
Loading modules: [ unix genunix specfs dtrace cpu.generic cpu_ms.AuthenticAMD.15 uppc pcplusmp scsi_vhci ufs ip hook neti sctp arp usba fctl nca lofs zfs random nfs cpc fcip logindmux ptm sppp ipc ]
> zfs_vdev_max_pending/E
zfs_vdev_max_pending:
zfs_vdev_max_pending:           35              
> zfs_vdev_max_pending/W 0t70
zfs_vdev_max_pending:           0x23            =       0x46
> zfs_vdev_max_pending/E
zfs_vdev_max_pending:
zfs_vdev_max_pending:           70              
>

The above changes the maximum number of pending requests per leaf vdev from 35 to 70.
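If the mixed number bases in that session look odd: mdb treats numbers as hexadecimal by default, so the '0t' prefix is what marks 70 as decimal, and the write echoes the old and new values in hex. A quick sketch to check the conversion (the helper name is made up for illustration):

```python
def mdb_write_echo(old_value: int, new_value: int) -> str:
    """Format an old/new pair the way the /W echo above displays it (hex)."""
    return f"{old_value:#x} = {new_value:#x}"


# Decimal 35 -> 70, as written with "0t70" in the session above.
print(mdb_write_echo(35, 70))
```

This prints 0x23 = 0x46, matching the echo from the write.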

Having people tune variables is never desirable, and we'd like 'vq_max_pending' (among others) to be set dynamically; see: 6457709 vdev_knob values should be determined dynamically.



(2008-03-03 14:19:42.0/2006-08-07 11:22:50.0)

