From My Brain to Your Browser
Jeff Victor's Blog
Thursday Mar 29, 2007
Virtualization HCLs

Did you know that Solaris Containers has the largest HCL of any server virtualization solution?

Here are three examples:

  1. Solaris 10 HCL: 790 x86/x64 systems + 75 SPARC systems = 865 total systems (March 28, 2007)
  2. VMware: 305 x86/x64 systems (March 21, 2007)
  3. Xen publishes specific component requirements (e.g. "1.5GHz single CPU minimum") instead of an HCL

Note that the Solaris Containers functionality is available on all Solaris 10 systems, and is exactly the same on all hardware architectures.

Is that metric relevant? Many factors should affect your virtualization choice. One of them is hardware choice: "does my choice of server virtualization technology limit my choice of hardware platform?"

The data points above show sufficient choice in commodity hardware for most people, but Containers maximizes your choice, and only Containers is supported on multiple hardware architectures.

Posted at 12:27PM Mar 29, 2007 by Jeffrey Victor in Solaris 10 Containers  |  Comments[1]

Thursday Mar 22, 2007
Spawning 0.5kZ/hr (Part 3)

Two previous blogs described my quest to create and boot 500 zones on one system as efficiently as possible, given my hardware constraints. But my original goal was testing the sanity of the limit of 8,191 zones per Solaris instance. Is the limit too low, or absurdly high? Running 500 zones on a sufficiently large system seemed reasonable if the application load was sufficiently small per zone. How about 1,000 zones?

Modifying my scripts to create the 501st through 1,000th zones was simple enough. The creation of those 500 additional zones went very smoothly. Booting 1,000 zones seemed too easy...until somewhere in the 600s. Further zones didn't boot, or booted into administrative mode.

Several possible obstacles occurred to me, but a quick check of Richard and Jim's new Solaris Internals edition helped me find the maximum number of processes currently allowed on the system. The value was a bit over 16,000. And those 600+ zones were using them all up. A short entry in the global zone's /etc/system file increased the maximum number of processes to 25,000:

set max_nprocs=25000

Unfettered by a limit on the number of concurrent processes, I re-booted all the zones. More than 900 booted, but the same behavior returned: many zones did not boot properly. The running zones were not using all 25,000 PID slots. To re-diagnose the problem I first verified that I could create 25,000 processes with a "limited fork bomb." I was temporarily stumped until I remembered a conversation I had with some students in my LISA'06 class "Managing Resources with Solaris 10 Containers." One of them had experienced a problem on a very large Sun computer that was running hundreds of applications, though they weren't using Containers.

They found that they were being limited by the amount of software thread (LWP) stack space in the kernel. LWP stack pages are one of the portions of kernel memory that are pageable. Space for pageable kernel memory is allocated when the system boots and cannot be re-sized while the kernel is running.

The default size depends on the hardware architecture. For 64-bit x86 systems the default is 2GB. The kernel tunable which controls this is segkpsize, which represents the number of kernel memory pages that are pageable. When these pages are all in use, new LWPs (threads) cannot be created.
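The relationship between a page count and the size of pageable kernel memory can be sketched with a little arithmetic. This is only an illustration: it assumes a 4KB base page size (the usual value on 64-bit x86), and the function names are made up for this example.

```python
# Assumption: 4 KB base page size, the usual value on 64-bit x86 systems.
PAGE_SIZE_BYTES = 4096
GB = 1024 ** 3


def gb_to_segkp_pages(size_gb, page_size=PAGE_SIZE_BYTES):
    """Number of pages needed to cover size_gb of pageable kernel memory."""
    return int(size_gb * GB) // page_size


def segkp_pages_to_gb(pages, page_size=PAGE_SIZE_BYTES):
    """Size in GB covered by the given number of pages."""
    return pages * page_size / GB


default_pages = gb_to_segkp_pages(2)      # the 2GB default mentioned above
doubled_pages = default_pages * 2

print(default_pages)                      # 524288
print(doubled_pages)                      # 1048576
print(segkp_pages_to_gb(doubled_pages))   # 4.0
```

Under that page-size assumption, the 2GB default corresponds to 524,288 pages, and doubling it gives 1,048,576 pages, or 4GB.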

With over 900 zones running, prstat(1M) showed over 77,000 LWPs in use. To test my guess that segkpsize was limiting my ability to boot 1,000 zones, I added the following line to /etc/system and re-booted:

set segkpsize=1048576

This doubles the amount of pageable kernel memory to 4GB on AMD64 systems. With that, booting my 1,000 zones was boring, as it should be. :-)

Final statistics for 1,000 running zones included:

Conclusions:

  1. Zones are extremely efficient, lightweight virtual server environments. Hundreds of them can run simultaneously on a larger (>=4 processor) system.
  2. At this point, a limit of 8,191 zones is very reasonable. Future systems might be able to handle more, and Solaris shouldn't get in the way...

    Footnotes:
    Limited fork bomb: I wrote a program which created a fixed number of processes, with a short interval between forks. This allowed me to find the maximum number of processes that the system could create, but also allowed me to terminate the "limited fork bomb" and regain control of the system.
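
    A "limited fork bomb" along these lines might look like the following Python sketch. It is not the program I used; the function name, the parameters and the child lifetime are assumptions chosen for illustration. It forks up to a fixed number of idle children with a short pause between forks, stops early if the system refuses to create another process, and then terminates every child so that control of the system is regained.

```python
import os
import signal
import time


def limited_fork_bomb(max_procs, interval=0.01, child_lifetime=30):
    """Fork up to max_procs idle children and return how many were created.

    Hypothetical helper for illustration only. Children just sleep, so they
    occupy process slots without doing any work.
    """
    children = []
    try:
        for _ in range(max_procs):
            try:
                pid = os.fork()
            except OSError:
                # fork() failed, typically because a process limit was hit.
                break
            if pid == 0:
                # Child: idle for a bounded time, then exit without cleanup.
                try:
                    time.sleep(child_lifetime)
                finally:
                    os._exit(0)
            children.append(pid)
            time.sleep(interval)  # short interval between forks
        return len(children)
    finally:
        # Parent: terminate and reap every child that was created.
        for pid in children:
            try:
                os.kill(pid, signal.SIGTERM)
            except ProcessLookupError:
                pass
        for pid in children:
            os.waitpid(pid, 0)


if __name__ == "__main__":
    print("processes created:", limited_fork_bomb(20))
```

    Start with a small count, as in the usage line above, and raise it gradually; the returned value is smaller than the requested count when the system's process limit is reached first.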

Posted at 10:15AM Mar 22, 2007 by Jeffrey Victor in Solaris 10 Containers  | 

Thursday Mar 15, 2007
Spawning 0.5kZ/hr (Part 2)

As I said last time, zone-clone/ZFS-clone is time- and space-efficient. That entry looked briefly at cloning zones. Now let's look at the integration of zone-clones and ZFS-clones.

Enter Z^2 Clones

Instead of copying every file from the original zone to the new zone, a clone of a zone that 'lives' in a ZFS file system is actually a clone of a snapshot of the original zone's file system. As you might imagine, this is fast and small. When you use zone-clone to install a zone, most of the work is merely copying zone-specific files around. Because all of the files start out identical from one zone to the next, and because each new zone is a clone of a snapshot of an existing zone, there is very little disk activity, and very little additional disk space is used.
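
To make the workflow concrete, here is a small Python sketch that only builds and prints the commands for cloning a series of zones; it does not run them. It is not the script I used. The zone names, the /zones path and the helper name are hypothetical. It assumes the source zone is installed and halted, and that the zone paths live on ZFS in a Solaris build where zoneadm clone uses ZFS clones.

```python
import shlex


def clone_commands(source_zone, new_zone, zone_root="/zones"):
    """Return the two commands that configure and clone one zone.

    Hypothetical helper: names and paths are placeholders.
    """
    zonepath = f"{zone_root}/{new_zone}"
    return [
        # Configure the new zone from the source zone's configuration,
        # giving it its own zonepath.
        ["zonecfg", "-z", new_zone,
         f"create -t {source_zone}; set zonepath={zonepath}"],
        # Install the new zone by cloning the (halted) source zone.
        ["zoneadm", "-z", new_zone, "clone", source_zone],
    ]


if __name__ == "__main__":
    # Dry run: print the commands for three example zones.
    for i in range(1, 4):
        for cmd in clone_commands("template", f"z{i:03d}"):
            print(" ".join(shlex.quote(arg) for arg in cmd))
```

A real script would also need to give each zone its own network settings, if any, and check each command's exit status before moving on.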

But how fast is the process of cloning, and how small is the new zone?

I asked myself those questions, and then used a Sun Fire X4600 with eight AMD Opteron 854's and 64GB of RAM to answer them. Unfortunately the system only has its internal disk drives. The disk drive was the bottleneck most of the time. I created a zpool from one disk slice on that drive, which is neither robust nor efficient. But it worked.

Creating the first zone took 150 seconds, including creating the ZFS file system for the zone, and used 131MB in the zpool. Note that this is much smaller than the disk space used by other virtualization solutions. Creating the next nine zones took less than 50 seconds, and used less than 20MB, total, in the zpool.

The length of time to create additional zones gradually increased. Creation of the 200th through 500th zones averaged 8.2 seconds each. Also, the disk space used gradually increased per zone. After booting each zone several times, they each used 6MB-7MB of disk space. The disk space used per zone increased as each zone made its own changes to configuration files. But the final rate of creation was 489 zones per hour.

But will they run? And are they as efficient at memory usage as they are at disk usage?

I booted them from a script, sequentially. This took roughly 10 minutes. Using the "memstat" tool of mdb, I found that each zone uses 36MB of RAM. This allowed all 500 zones to run very comfortably in the 64GB on this system. This small amount was due to the model used by sparse-root zones: a program that is running in multiple zones shares the program's text pages.
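
The memory budget is easy to check with a few lines of arithmetic. This sketch assumes the per-zone cost stays at the measured 36MB as the zone count grows, which is a simplification; the function name is made up for illustration.

```python
MB_PER_ZONE = 36   # measured per-zone RAM usage from above
RAM_GB = 64        # RAM in the test system


def zones_ram_gb(zone_count, mb_per_zone=MB_PER_ZONE):
    """Estimated RAM in GB, assuming a constant per-zone cost."""
    return zone_count * mb_per_zone / 1024


for n in (500, 1000):
    used = zones_ram_gb(n)
    print(f"{n} zones: {used:.1f} GB ({used / RAM_GB:.0%} of {RAM_GB} GB)")
# 500 zones: 17.6 GB (27% of 64 GB)
# 1000 zones: 35.2 GB (55% of 64 GB)
```

By this estimate, idle zones alone leave well over half of the system's RAM free at 500 zones.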

The scalability of performance was also excellent. A quick check of CPU usage showed that all 500 zones used less than 2% of the eight CPUs in the system. Of course, there weren't any applications running in the zones, but just try to run 500 guest operating systems in your favorite hypervisor-based virtualization product...

But why stop there? 500 zones not enough for you? Nah, me neither. How about 1,000 zones? That sounds like a good reason for a "Part 3."

Conclusion

New features added recently to Solaris zones improve on their excellent efficiency:

  1. making a copy of a zone was reduced from 30 minutes to 8 seconds
  2. disk usage of zones decreased from 100MB to 7MB
  3. memory usage stayed extremely low - roughly 36MB per zone
  4. CPU usage by unbooted zones is zero, and by running zones (with no applications) is negligible

So, maybe computers hate me for pushing them out of their comfort zone. Or maybe it's something else.

Posted at 02:58PM Mar 15, 2007 by Jeffrey Victor in Technology  |