Rationale
Long have we heard that the new Nehalem-based Sun x86 systems would bring a significant performance boost over the AMD Opterons still ruling the land to this day. The whole idea of this test was to see how that translates, in the particular case of Directory Services and even more specifically of OpenDS, into throughput, response time and all the good things we (meaning the seriously loony LDAP geeks) like to look at...
Bird's Eye View
On this single blade, OpenDS achieves over 93,000 search operations per second and over 17,000 modification operations per second. Under lighter but still significant load, with throughput always above 70,000 ops/sec, OpenDS delivers sub-millisecond response times in most tests (three of the four lighter-load search runs below average under 1 ms, the fourth 1.080 ms).
Sounds too good to be true? Then read further...
To sum it up as Benoit did in his post, this would give you, in a fully populated Sun Blade 6000 chassis, the ability to process almost A MILLION REQUESTS PER SECOND in a well integrated, highly available and easily manageable package. It does NOT get any better from any vendor out there as of today.
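For the skeptics, here is the back-of-the-envelope arithmetic behind that chassis figure. This is only a sketch: the count of 10 blades per chassis is an assumption on my part (it is not stated in this post), linear scaling across blades is assumed too, and the per-blade rate is the heavy-load exact search result reported further down.

```python
# Back-of-the-envelope chassis throughput.
# ASSUMPTION: 10 blades in a fully populated chassis (not stated in this post).
BLADES_PER_CHASSIS = 10
# Single-blade rate from the heavy-load exact search test in this post.
SEARCHES_PER_SEC_PER_BLADE = 93_660.281


def chassis_throughput(blades: int, per_blade_rate: float) -> float:
    """Aggregate rate, assuming perfectly linear scaling across blades."""
    return blades * per_blade_rate


if __name__ == "__main__":
    total = chassis_throughput(BLADES_PER_CHASSIS, SEARCHES_PER_SEC_PER_BLADE)
    print(f"{total:,.0f} searches/sec")
```

Under those assumptions the total lands a little short of one million, which is where the "almost" comes from.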
Special thanks to Benoit Chaffanjon and his team for making this equipment available to us on short notice. Their support, responsiveness and in-depth knowledge of all things benchmark are what make them top-notch and an indispensable component of our success.
The Meat
Maybe you have already heard about Benoit's previous benchmark of DSEE (6.3.1) on Nehalem. If you haven't, read it, it'll give you all the background you need to read these results here. I tried to stick as much as I could to his bench, and I think I did a pretty good job at that. The main intentional difference between our two benches is that in his, searches only span 1 million entries of the 10-million-entry database. In mine, searches span the whole 10 million entries. In practice, he's right to do his benchmarks the way he does, as it better reflects the reality of how most customers end up consuming data, but mine is more stressful on the system.
Setup
Hardware
Software
Tunings
Hardware
None
Software
Solaris
- Cap the ZFS ARC size to ( SYSTEM MEMORY * 0.95 ) - OPENDS JVM HEAP SIZE
- Disable ZFS cache flush since the storage takes care of that for us and has persistent cache (4GB of NVRAM)
- Put ZFS ZIL on a dedicated SSD
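The ARC cap rule above can be sketched in a few lines of Python. Treat this as an illustration only: the memory and heap sizes are hypothetical (this post does not give them), and the `/etc/system` tunable names `zfs_arc_max` and `zfs_nocacheflush` are the ones I would expect on Solaris of that era, not something spelled out in the tunings above.

```python
GIB = 1024 ** 3


def arc_cap_bytes(system_memory_bytes: int, jvm_heap_bytes: int) -> int:
    """(SYSTEM MEMORY * 0.95) - OPENDS JVM HEAP SIZE, in bytes."""
    # Integer arithmetic avoids floating-point rounding on large byte counts.
    cap = system_memory_bytes * 95 // 100 - jvm_heap_bytes
    if cap <= 0:
        raise ValueError("JVM heap leaves no room for the ZFS ARC")
    return cap


def etc_system_lines(cap_bytes: int) -> list[str]:
    """Render /etc/system entries for the ARC cap and the cache-flush tuning."""
    return [
        f"set zfs:zfs_arc_max = {cap_bytes:#x}",
        "set zfs:zfs_nocacheflush = 1",
    ]


if __name__ == "__main__":
    # HYPOTHETICAL sizes: 48 GiB of RAM and a 16 GiB OpenDS JVM heap.
    cap = arc_cap_bytes(48 * GIB, 16 * GIB)
    print("\n".join(etc_system_lines(cap)))
```

Only disable the cache flush if, as here, the storage has a persistent (NVRAM-backed) cache.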
Other things to consider doing:
- Use jumbo frames if returning whole entries, YMMV depending on your most frequent access patterns. I didn't try it this time around for lack of time, but it should help reduce the network overhead. As we'll see later, OpenDS on this blade can max out a gigabit Ethernet connection.
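To put the gigabit remark in perspective, here is a rough sketch of how many bytes each response could occupy on the wire before a single gigabit link saturates at the whole-entry search rate measured below. The raw line rate of 125,000,000 bytes per second ignores Ethernet, IP and TCP framing, and the actual entry size is not given in this post, so read the result as an upper bound rather than a measurement.

```python
# Raw gigabit line rate in bytes per second, ignoring all framing overhead.
GIGABIT_BYTES_PER_SEC = 1_000_000_000 // 8


def bytes_per_response_at_saturation(
    ops_per_sec: float,
    link_bytes_per_sec: int = GIGABIT_BYTES_PER_SEC,
) -> float:
    """Average wire bytes per response at which the link would be full."""
    if ops_per_sec <= 0:
        raise ValueError("ops_per_sec must be positive")
    return link_bytes_per_sec / ops_per_sec


if __name__ == "__main__":
    # Heavy-load exact search returning the whole entry (rate from this post).
    budget = bytes_per_response_at_saturation(84_205.573)
    print(f"{budget:.0f} bytes per response would fill the link")
```

The budget comes out just under 1.5 kB per response, so it does not take a very large entry to fill the pipe at these rates.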
Java
With very high volumes like the ones we are seeing here, say above 80k ops/sec, you will likely want to bump request handlers and worker threads up a notch to cope with the frenzy of the traffic. When you do so, the 32-bit JVM will quickly become too small no matter what tunings you try. Even though the 64-bit JVM is not as space efficient for cache and all other aspects of memory access, it will provide an extremely stable environment for OpenDS even under heavy client traffic. I have been able to attach 10,000 hyper-clients (as in clients continuously sending traffic with no pause between requests) to OpenDS without a problem.
To cut to the chase, the settings:
OpenDS
| Setting | Value |
| --- | --- |
| Worker Threads | 32 |
| Connection Handlers | 16 |
As I have said previously, you may want to dial these values up or down depending on a couple of factors:
- How many clients you have at peak
- How quickly your client applications open their connections (bursts or ramped up?)
- How frantic a client is on each connection on average
If you have 5,000 clients opening 100 connections all at once, you will likely want to have more connection handlers to be able to cope with the suddenness of the pattern. This will however come at a performance cost (that we have yet to appropriately profile) under more normal circumstances.
If you have a few frantic clients, these values will be about right, though you may want to bump up the number of worker threads a bit. This too is suboptimal under normal circumstances.
Note: regardless of the access pattern, these settings will be adequate to serve whatever load you throw at the server, I'm only pointing out ways to improve the performance a bit. In particular, this advice will help keep the request backlog on a leash.
Import
Importing our 10M entries took 14'59" (899 seconds), which averages out to roughly 11,120 entries per second.
Search Performance
These tests mainly aim at determining the maximum throughput that can be achieved. As such, they tend to load the server with an artificially high number of concurrent clients, inflating the response time compared to what can be expected under more normal production conditions. For each test, the Lighter Load results show what the response time looks like with lighter loads and lower overall throughput.
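The link between concurrency, throughput and response time can be made concrete with Little's law (requests in flight = throughput x time in system). The sketch below applies it to a few of the throughput and average response time pairs reported in the tables that follow. The client counts themselves are not given in this post, so the results are estimates of the concurrency each run implies, nothing more.

```python
def in_flight_requests(ops_per_sec: float, avg_response_ms: float) -> float:
    """Little's law: average requests in flight = rate * time in system."""
    return ops_per_sec * (avg_response_ms / 1000.0)


# (ops/sec, average response time in ms), copied from the result tables.
RESULTS = {
    "exact search, 1 attribute, heavy": (93660.281, 2.665),
    "exact search, 1 attribute, lighter": (92274.374, 1.080),
    "exact search, whole entry, heavy": (84205.573, 2.371),
    "exact search, whole entry, lighter": (71792.843, 0.693),
    "modifications": (17334.250, 8.073),
}

if __name__ == "__main__":
    for name, (rate, latency_ms) in RESULTS.items():
        estimate = in_flight_requests(rate, latency_ms)
        print(f"{name}: about {estimate:.0f} requests in flight")
```

For example, the heavy-load exact search works out to roughly 250 requests in flight, against roughly 100 for its lighter-load counterpart at nearly the same throughput: past the saturation point, extra clients mostly buy you extra latency.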
Exact Search
Return 1 Attribute
Heavy Load, Maximum Throughput
Actual Duration: 1839 seconds (30m 39s)

Searches Completed

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 169056808 | 93660.281 | 468301.407 | 5590.951 | -0.004 |

Exceptions Caught

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 0 | 0.000 | 0.000 | 0.000 | 0.000 |

Entries Returned

| Total | Avg Value | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 169056809 | 1.000 | 93660.282 | 468301.410 | 0.000 | 0.000 |

Search Time (ms)

| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 450590169 | 169056809 | 2.665 | 468301.410 | 0.189 | -0.006 |
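If you want to check how the columns in these result tables relate to each other, the sketch below recomputes a few of them from the heavy-load exact search numbers above. The interpretation is my reading of the figures, not something the load generator documents here: the average duration is the total duration divided by the total count, and the ratio of Avg/Interval to Avg/Second gives the length of the collection interval.

```python
def avg_duration_ms(total_duration_ms: int, total_count: int) -> float:
    """Average response time: summed response times over operation count."""
    return total_duration_ms / total_count


def interval_seconds(avg_per_interval: float, avg_per_second: float) -> float:
    """Length of one statistics collection interval, in seconds."""
    return avg_per_interval / avg_per_second


def measured_seconds(count: int, avg_per_second: float) -> float:
    """Time window actually covered by the per-second average."""
    return count / avg_per_second


if __name__ == "__main__":
    # Numbers from the heavy-load exact search (return 1 attribute) run.
    print(f"avg duration: {avg_duration_ms(450590169, 169056809):.3f} ms")
    print(f"interval: {interval_seconds(468301.407, 93660.281):.3f} s")
    print(f"measured window: {measured_seconds(169056808, 93660.281):.0f} s")
```

The measured window comes out a little shorter than the 1839 seconds of actual duration, presumably because the start and end of the run are left out of the averages.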
Lighter Load
Searches Completed

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 28143684 | 92274.374 | 461371.869 | 3791.935 | -0.040 |

Exceptions Caught

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 0 | 0.000 | 0.000 | 0.000 | 0.000 |

Entries Returned

| Total | Avg Value | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 28143684 | 1.000 | 92274.374 | 461371.869 | 0.000 | 0.000 |

Search Time (ms)

| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 30399915 | 28143685 | 1.080 | 461371.885 | 0.055 | 0.023 |
Return Whole Entry
Heavy Load, Maximum Throughput
Actual Duration: 1839 seconds (30m 39s)

Searches Completed

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 151991059 | 84205.573 | 421027.864 | 5264.386 | -0.006 |

Exceptions Caught

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 0 | 0.000 | 0.000 | 0.000 | 0.000 |

Entries Returned

| Total | Avg Value | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 151991061 | 1.000 | 84205.574 | 421027.870 | 0.000 | 0.000 |

Search Time (ms)

| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 360407639 | 151991065 | 2.371 | 421027.881 | 0.183 | 0.022 |
Lighter Load
Searches Completed

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 21896817 | 71792.843 | 358964.213 | 4125.281 | -0.020 |

Exceptions Caught

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 0 | 0.000 | 0.000 | 0.000 | 0.000 |

Entries Returned

| Total | Avg Value | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 21896817 | 1.000 | 71792.843 | 358964.213 | 0.000 | 0.000 |

Search Time (ms)

| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 15177289 | 21896817 | 0.693 | 358964.213 | 0.047 | 0.023 |
Sub Scope Search
Return 1 Attribute
Heavy Load, Maximum Throughput
Actual Duration: 1838 seconds (30m 38s)

Searches Completed

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 169252464 | 93768.678 | 468843.391 | 6339.082 | -0.012 |

Exceptions Caught

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 0 | 0.000 | 0.000 | 0.000 | 0.000 |

Entries Returned

| Total | Avg Value | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 169252464 | 1.000 | 93768.678 | 468843.391 | 0.000 | 0.000 |

Search Time (ms)

| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 270122894 | 169252465 | 1.596 | 468843.393 | 0.140 | 0.022 |
Lighter Load
Searches Completed

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 24902860 | 81648.721 | 408243.607 | 4020.767 | -0.011 |

Exceptions Caught

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 0 | 0.000 | 0.000 | 0.000 | 0.000 |

Entries Returned

| Total | Avg Value | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 24902860 | 1.000 | 81648.721 | 408243.607 | 0.000 | 0.000 |

Search Time (ms)

| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 15166324 | 24902860 | 0.609 | 408243.607 | 0.039 | 0.023 |
Return Whole Entry
Heavy Load, Maximum Throughput
Actual Duration: 1839 seconds (30m 39s)

Searches Completed

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 152888061 | 84702.527 | 423512.634 | 6003.399 | -0.008 |

Exceptions Caught

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 0 | 0.000 | 0.000 | 0.000 | 0.000 |

Entries Returned

| Total | Avg Value | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 152888064 | 1.000 | 84702.529 | 423512.643 | 0.000 | 0.000 |

Search Time (ms)

| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 270188257 | 152888064 | 1.767 | 423512.643 | 0.154 | 0.013 |
Lighter Load
Searches Completed

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 22151207 | 72626.908 | 363134.541 | 3680.320 | -0.007 |

Exceptions Caught

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 0 | 0.000 | 0.000 | 0.000 | 0.000 |

Entries Returned

| Total | Avg Value | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 22151207 | 1.000 | 72626.908 | 363134.541 | 0.000 | 0.000 |

Search Time (ms)

| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 15179772 | 22151207 | 0.685 | 363134.541 | 0.041 | 0.023 |
Modifications Performance
Modifications Performed

| Count | Avg/Second | Avg/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- |
| 15687496 | 17334.250 | 86671.249 | 2048.109 | 0.015 |

Modify Time (ms)

| Total Duration | Total Count | Avg Duration | Avg Count/Interval | Std Dev | Corr Coeff |
| --- | --- | --- | --- | --- | --- |
| 126643779 | 15687499 | 8.073 | 86671.265 | 1.435 | -0.201 |
Nice writeup, those numbers are incredible! Were there any additional dsconfig settings used besides the thread and connection handlers? Also, it would be nice if you could add the JVM tuning settings that were applied.
Posted by Bill Hathaway on June 24, 2009 at 07:52 PM MDT #
Arnaud, those numbers are amazing... You forgot to tell the story where you've OC'd the machine and removed the contention point, i.e. the Access Log, to break the 100 K mark... and how you've burned down the machine ;-)
This would have been some story as well ! We're like stunts ! Don't try this at home :-)
Posted by Ludo on June 24, 2009 at 08:20 PM MDT #
Ludo, this machine had not been overclocked. This is a stock Sun Blade x6270.
The 100k mark has been broken on a prototype, I have not shared those results externally. I may in the near future though.
Posted by arnaud on June 25, 2009 at 09:13 AM MDT #
I Tested OpenDS:
a. Server OpenDS, Ubuntu: 320GB HDD, 4GB RAM, 4 CHIP x 2.2GHz
b. Test Bind & Search command, using .NET DirectoryEntry (C#), search by attribute UID:
DirectorySearcher theSearcher = new DirectorySearcher(entry);
theSearcher.SearchScope = SearchScope.Subtree;
theSearcher.Filter = string.Format("(&(uid={0}))", userName);
c. Run 10 threads, each thread 1000 search commands
THE RESULT: Average 60 SEARCH COMMANDS/SECOND
The results are TOO SLOW. Is something wrong with my test?
Posted by DVD on September 11, 2009 at 01:43 AM MDT #
Hey DVD, thanks for posting about your experience. Honestly, irrespective of the data you loaded in OpenDS, I totally understand your feeling: this is WAY too slow. In all honesty though, I think you ought to test your OpenDS a couple of different ways before concluding the problem is on OpenDS' side. For example, what was the CPU utilization? Where is your test program running? If it's running on the same machine, isn't the test program the bottleneck rather than OpenDS?
Other information that would be useful to me is the number of entries you have in the database and how big each entry is on average (like 1kiB, 10kiB, 100kiB?).
60 searches / sec on OpenDS would probably be achieved on an old Pentium III machine.
Let me know if you'd like help evaluating your issue. I know from experience that your machine should be able to give you a lot more than what you currently get.
cheers
-=arnaud=-
Posted by arnaud on September 14, 2009 at 02:36 PM MDT #