Friday April 17, 2009
Sun Blade X6270 & INTEL XEON X5560 on OpenSolaris create the ultimate Directory Server

Sun Blade 6000 Modular System
As you can see in this video, the ten-rack-unit Sun Blade 6000 system provides a very dense ten-blade environment that can run a mix of Solaris, Linux and Windows on a choice of SPARC, AMD or INTEL processors. You get up to double the memory and I/O capacity of competing blades, using industry-standard PCIe ExpressModules. In a few days we are announcing the most powerful of all blades: the Sun Blade X6270, based on the new INTEL XEON X5560 processor (code name Nehalem).

Sun Blade X6270
The blade that I tested came equipped with two INTEL XEON X5560 processors (code name Nehalem) running at 2.8 GHz and 24 Gbytes of 1066 MHz DDR3 memory. As I wanted the lowest memory latency possible, I borrowed six 1333 MHz DDR3 4 Gbyte DIMMs (X5870A) from engineering. By placing them in strategic slots (banks 2, 5 and 8 of each socket), I made sure they would effectively run at 1333 MHz, producing an ideal Directory Server environment. (You may rightly observe that I would get a small performance boost by upgrading to the XEON X5570, most likely between 2 and 4%.)
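To get a rough feel for what the faster DIMMs buy, here is a minimal back-of-the-envelope sketch of theoretical peak memory bandwidth per socket. It assumes that banks 2, 5 and 8 each sit on their own memory channel (three channels per socket) and that each channel is 64 bits (8 bytes) wide; the function name is made up for this illustration, and measured bandwidth will be lower than these peak figures.

```python
# Theoretical peak DDR3 bandwidth per socket (a sketch, not a measurement).
# Assumptions: 3 memory channels per socket, 8 bytes transferred per channel
# per memory transfer.

BYTES_PER_TRANSFER = 8      # 64-bit wide channel (assumption)
CHANNELS_PER_SOCKET = 3     # one DIMM in each of banks 2, 5 and 8 (assumption)


def peak_bandwidth_gb_s(mega_transfers_per_s, channels=CHANNELS_PER_SOCKET):
    """Return the theoretical peak bandwidth in GB/s for one socket."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER * channels / 1000


if __name__ == "__main__":
    for speed in (1066, 1333):
        print(f"DDR3-{speed}: {peak_bandwidth_gb_s(speed):.1f} GB/s per socket")
```

Under these assumptions, moving from 1066 MHz to 1333 MHz DIMMs raises the theoretical peak by about 25%.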
Here are the details of the configuration:
System Configuration: SUN MICROSYSTEMS SUN BLADE X6270 SERVER MODULE
BIOS Configuration: American Megatrends Inc.
BMC Configuration: IPMI 1.5 (KCS: Keyboard Controller Style)
==== Processor Sockets ====================================
Version Location Tag
-------------------------------- --------------------------
Intel(R) Xeon(R) CPU X5560 @ 2.80GHz CPU 1
Intel(R) Xeon(R) CPU X5560 @ 2.80GHz CPU 2
==== Memory Device Sockets ================================
Type Status Set Device Locator Bank Locator
----------- ------ --- ------------------- ----------------
other in use 0 D2 BANK2
other in use 0 D5 BANK5
other in use 0 D8 BANK8
other in use 0 D2 BANK2
other in use 0 D5 BANK5
other in use 0 D8 BANK8
FLASH in use 0
==== On-Board Devices =====================================
Zoar 2x GbE.
Zoar 2x GbE.
==== Upgradeable Slots ====================================
ID Status Type Description
--- --------- ---------------- ----------------------------
0 in use PCI Express PCIE0
1 available PCI Express PCIE1
2 available PCI Express PCIE2
3 available PCI Express PCIE3
4 available PCI Express PCIE4
INTEL Xeon X5500 processors
This new family of XEON processors is based on the Intel Processors Microarchitecture (see diagram below). Using a 45nm manufacturing process, each 263 sq. mm quad-core, dual-thread chip has 731 million transistors, 256 KB of L1 cache, 1 MB of L2 cache and 8 MB of L3 cache. The DDR3-1333 memory controller is key to obtaining extreme performance from memory-intensive applications. I recently tested an in-memory database on this chip for an undisclosed customer in Silicon Valley, and it showed more than 30 times the throughput of any relational database software.

Note on BIOS settings: INTEL's overall recommendations on BIOS settings for the TPC-E benchmarks were used for this test. In a nutshell, the following parameters were enabled: NUMA, HyperThreading, MLC Spatial & Streamer prefetchers and DCU IP & Streamer prefetchers. RTID was kept at the default value of 24-16-24.
Sun DSEE 6.3.1
The Sun Java System Directory Server Enterprise Edition provides a central repository for storing and managing identity profiles and access information. Leading the directory market, it is a secure, highly available and scalable product, just updated with release 6.3.1.
This latest update fixes replication issues in mixed DS 5.2 and 6.x topologies; on Directory Proxy Server, it improves support for virtualization and includes additional performance-related enhancements. This patch release also improves the overall quality and robustness of deployments. More information is available in the Release Notes.

OpenSolaris
The OpenSolaris Operating System, a single distribution for desktop, server and HPC deployments, is based on the Solaris kernel and created through community collaboration at opensolaris.org. It combines Solaris technologies and tools with modern desktop features and applications developed by open source communities such as GNOME, Mozilla and the Free Software Foundation. LiveCD installation and the new network-based OpenSolaris Image Packaging System (IPS) simplify and speed up installation and integration with third-party applications. OpenSolaris is fully supported, with OpenSolaris Subscriptions available from Sun ranging from email support to 24/7 production support.
For this test, we are using OpenSolaris build 109, which includes some of the engineering work done by Sun and INTEL to optimize Solaris on INTEL XEON X5500 systems:
# cat /etc/release
Solaris Express Community Edition snv_109 X86
Copyright 2009 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 23 February 2009
Benchmarketing a directory server
That's right, this is not a typo. Benchmarketing is different from benchmarking! You can find various claims of high LDAP search performance on the Internet today. However, most of those performance numbers are obtained on very small directories. It is not uncommon for white paper authors to test an LDAP directory as small as 50,000 entries and consider it relevant. See the white paper "Measurement and Analysis of LDAP Performance" for an example. Engineers also use common tricks to increase LDAP performance, including disabling the directory logs (or writing them to memory), returning only one attribute or none, and/or querying only a portion of the index tree. (Note that I could not resist trying it, and under these conditions I was able to get more than 55,000 LDAP searches/s on the X6270.)
What's happening at our customer sites is very different. An average Sun DSEE directory has around 10 million entries, of which 10% are accessed (i.e. 1 million user ids). And of course, we cannot use any of the tricks described above in a 24x7 production environment. The following benchmark results were obtained using production-ready tunables.
iGenLDAPs – An LDAP Search benchmark
The iGenLDAPs benchmark is based on SLAMD, a load simulation framework initially developed at Sun and now available as a java.net project. SLAMD is multi-client, multi-process and multi-threaded, making it the most scalable LDAP load simulator on the market. As mentioned, we are querying a 10 million entry “ou=People” directory, 10% of which is accessed, using DSEE 6.3.1. The directory is configured to use a maximum of 20 Gigabytes of RAM, which is enough to cache the entire index and data. All 12 attributes are returned to the client, a fully loaded Sun Fire X4450 server connected to the blade via a private 1 Gbit network. A Sun StorageTek 6140 array hosts the directory on five 15,000 rpm FC disks (RAID 1+0), with one RAID controller with 1 Gbyte of cache and one 4 Gbit/second link.
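To make the "10% accessed" access pattern concrete, here is a minimal sketch of how a load generator could pick search filters so that only one tenth of a 10 million entry directory is ever queried. This is not SLAMD code: the `user.N` naming scheme, the `uid` filter and the function name are assumptions made purely for illustration.

```python
import random

TOTAL_ENTRIES = 10_000_000   # directory size stated in the text
ACCESS_PERCENT = 10          # share of entries that are actually queried


def make_filter_picker(total=TOTAL_ENTRIES, percent=ACCESS_PERCENT, seed=None):
    """Return (hot_count, pick) where pick() yields a random search filter.

    Only the first `hot_count` entries are ever targeted, which models a
    directory where 10% of the entries receive all the traffic.
    """
    hot_count = total * percent // 100
    rng = random.Random(seed)

    def pick():
        # Hypothetical naming scheme: entries are uid=user.0 .. user.N-1
        return f"(uid=user.{rng.randrange(hot_count)})"

    return hot_count, pick


if __name__ == "__main__":
    hot_count, pick = make_filter_picker(seed=2009)
    print(f"{hot_count} entries in the accessed set")
    print("sample filters:", [pick() for _ in range(3)])
```

Each generated filter would then be used for a search under “ou=People”, with all attributes requested.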
Here are the results:
iGenLDAPs - 1800 seconds (30m 0s)
| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| 49647440 | 34239.614 | 1711980.069 | 2713.15 | 0.040 |
Exceptions Caught
| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| 0 | 0.000 | 0.000 | 0.000 | 0.000 |
Entries Returned
| Total | Avg Value | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| 49647450 | 1.000 | 34239.621 | 1711980.103 | 0.000 | 0.000 |
Search Time (ms)
| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| 2873937 | 4964747 | 0.579 | 171198.172 | 0.05 | 0.048 |
Under similar conditions, the fastest I had ever obtained before this test was about 23,000 searches/s! Please try it at home and see if you can beat the X6270 numbers. And if you can, let us know!
But you can (rightfully) argue that a standard directory deployment does not process only LDAP searches but a mix of various LDAP calls. This is why I also provide you with iGenLDAPsm numbers.
iGenLDAPsm – An LDAP SiteMinder simulation
The iGenLDAPsm simulation provides a mechanism for simulating the load that Netegrity SiteMinder places on a directory server when it is using that server to authenticate users. In particular, this job simulates the requests that SiteMinder issues to the directory server when password services are enabled. While the performance of individual Modify or Authenticate operations is interesting, iGenLDAPsm provides a very realistic way of determining your REAL LDAP capacity. Each transaction generates 11 LDAP operations: 1 Authentication, 1 Bind, 1 Modify and 8 Searches. As you can see below, we obtained a peak of 4180 iGenLDAPsm transactions per second, corresponding to 45,980 LDAP operations per second! On a single X6270 blade....
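The conversion from transactions to LDAP operations is simple arithmetic, sketched below using the per-transaction mix given in the text. The dictionary and function names are only illustrative.

```python
# LDAP operations generated by one iGenLDAPsm transaction, as described above.
OPS_PER_TRANSACTION = {
    "authentication": 1,
    "bind": 1,
    "modify": 1,
    "search": 8,
}


def ldap_ops_per_second(transactions_per_second):
    """Convert a transaction rate into the equivalent LDAP operation rate."""
    return transactions_per_second * sum(OPS_PER_TRANSACTION.values())


if __name__ == "__main__":
    print(ldap_ops_per_second(4180))  # 45980
```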
iGenLDAPsm - 1800 seconds (30m 0s)
Authentication Attempts
| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| 4807641 | 4180.557 | 20902.783 | 57.34 | -0.062 |
Successful Authentications
| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| 4805644 | 4178.817 | 20894.087 | 54.89 | -0.061 |
Failed Authentications
| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| 0 | 0.000 | 0.000 | 0.000 | 0.000 |
Authentication Time (ms)
| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| 228998500 | 480564 | 47.652 | 20894.087 | 1.62 | 0.321 |
Bind Operations Performed
| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| 4807032 | 4180.026 | 20900.130 | 56.66 | -0.062 |
Bind Time (ms)
| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| 2721396 | 480700 | 0.566 | 20900.000 | 0.016 | 0.060 |
Modify Operations Performed
| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| 4806293 | 4179.383 | 20896.913 | 55.65 | -0.061 |
Modify Time (ms)
| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| 29812733 | 480600 | 6.203 | 20895.652 | 0.19 | -0.068 |
Search Operations Performed
| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| 38452462 | 33436.922 | 167184.609 | 447.43 | -0.061 |
Initial Search Time (ms)
| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| 5000859 | 4807211 | 10.403 | 20900.913 | 0.375 | 0.015 |
Subsequent Search Time (ms)
| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| 146366814 | 33643572 | 4.351 | 146276.391 | 0.189 | 0.171 |
Conclusion
A Sun Blade 6000 can host as many as ten blades, which gives us a potential 459,800 LDAP operations per second on a fully loaded modular system.
And it is easy to federate ten instances of the Directory Server: you can use the Data Distribution feature of the Directory Proxy Server. Amazing technology!
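As an illustration of the idea, here is a minimal sketch of spreading entries across ten directory instances and of the linear extrapolation behind the 459,800 figure. It is not how Directory Proxy Server implements data distribution: the CRC32-based routing and the function names are assumptions chosen for the example, and the chassis number is an upper bound that assumes perfectly even load and no proxy overhead.

```python
import zlib

BLADES = 10               # blades in a fully loaded Sun Blade 6000
OPS_PER_BLADE = 45_980    # LDAP operations/s measured on one X6270


def instance_for(uid, instances=BLADES):
    """Map a user id to one of the directory instances (hypothetical scheme).

    A stable hash keeps a given uid on the same instance for every request.
    """
    return zlib.crc32(uid.encode("utf-8")) % instances


def chassis_ops_per_second(blades=BLADES, ops_per_blade=OPS_PER_BLADE):
    """Linear extrapolation: assumes every blade sustains the same rate."""
    return blades * ops_per_blade


if __name__ == "__main__":
    for uid in ("user.1", "user.2", "user.3"):
        print(uid, "-> instance", instance_for(uid))
    print(chassis_ops_per_second(), "LDAP operations/s (upper bound)")
```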
See you next time in the wonderful world of benchmarking....
Not that amazing. Quite unimpressed by Nehalem. Sun can do better than that. Surely, I'm not the only one who knows better. :)
Am I reading right? No Flash? This has got to be amazing with ZIL then... Why not test with flash?
Posted by -=armaud=- on April 21, 2009 at 10:05 AM PDT #
Benoit...nice work! What would the numbers look like with OpenSSO? Could you pull that off for us and let us know your results?
Posted by Nicholas Wooler on May 12, 2009 at 01:26 PM PDT #