Friday August 25, 2006 | cn=Directory Manager All about Directory Server |
LDAP/Directory Server Standards and Specifications

One of the great things about LDAP is that it's a very standards-based protocol. All LDAP clients and servers speak the same language, and as long as they stick to the published standards then it doesn't matter what API the client is using or which server it's talking to. Compare that with relational databases and SQL. Every vendor has their own flavor of SQL and uses a different mechanism for talking to the database. You can't swap out one relational database for another and expect all of your applications to continue working as if nothing had happened. This is much more possible with LDAP.

Virtually everything that you need to know about LDAP (especially LDAPv3) is publicly available in the form of either RFCs or Internet Drafts. However, this information is not always as easy to find as it should be. There are thousands of RFCs covering all kinds of technologies, and it's not all that simple to figure out which ones are related to LDAP or directory technologies. It's even harder with Internet Drafts because they are in flux, and drafts are designed to expire within six months of being published. When they expire, the IETF deletes them from their site so they can be really hard to find. Each revision of a draft is numbered, so in some cases you can just try incrementing the number to see if there is a newer version, but if a draft has stalled and no new revision is available then you can be out of luck.

Since it's obviously pretty important to keep up to date on all of these specifications when you're developing a directory server, we have compiled our own collection of the various RFCs and Internet Drafts that are related to directory technologies. Since OpenDS is now public, we have all of these specifications in the documentation section of the project site at https://opends.dev.java.net/public/standards/index.html.
What's also handy about this list is that it indicates whether we intend to support each of these specifications in OpenDS at some point, as well as whether it is already in place. I should point out that a specification currently marked "No" isn't necessarily the final word on the matter. We're open to suggestion, and if you can provide a convincing argument as to why we should support something that's currently marked "No", then we can reconsider it. The best way to voice your opinion on things like this is by sending a message to one of our mailing lists. The users@opends.dev.java.net list is a good one for these kinds of topics. Note that if you do intend to post to the lists, then it's easiest if you subscribe first, since posts from non-subscribers are subject to moderation, and you may not see all of the replies because the reply-to address is automatically set to the address of the mailing list.

This list of specifications is an ongoing work, as we try to keep it up to date as new documents (or new revisions of existing drafts) are published. If you know of any specifications that aren't included on this list, then please let us know.

Posted by cn_equals_directory_manager ( Aug 25 2006, 03:41:01 PM CDT )

The truth about nsslapd-search-tune

The Directory Server has a configuration attribute named nsslapd-search-tune that isn't really addressed in our documentation, and it's not well understood by a lot of the people who know about it. It can certainly be useful under the right circumstances, but it's not a panacea and probably shouldn't be used if it isn't needed. I'll try to clear things up here to help people better understand what this configuration attribute actually does and under what conditions it should (or should not) be used.

For the purposes of this discussion, two of the most important things that happen during the course of search processing in the Directory Server are:
1. Using the indexes to build a candidate list of entry IDs for the entries that might match the search filter.
2. Retrieving each entry on the candidate list and checking it against the filter before sending it to the client.
An astute observer might ask why we need the second of these steps if we have confidence in our indexing mechanism. The main reason is that the candidate list has the potential to contain entry IDs for entries that don't actually match the search filter. This could happen if the search filter contained at least one component that wasn't indexed and therefore we can't be sure that all of the entries identified in the candidate list match the criteria associated with that filter component. It could also occur if an optimization in our search filter processing caused the server to short-circuit out of the index processing portion before we had evaluated all of the filter components. There are also a few other special cases that can cause this to happen, but it ultimately all boils down to the fact that we can be sure that the candidate list contains all of the possible matches, but we can't necessarily be sure that all of the entries in the candidate list actually match the provided search criteria. Most of the time, the fact that we do this secondary evaluation on each candidate entry doesn't really cause any problems. Determining whether a particular entry matches a given filter is generally a very fast operation and the benefits that it provides far outweigh any added cost that might be incurred. Of course, there's always an exception to the rule and in this case one exception happens to be an indexed equality filter component that targets an attribute with a very large number of values. This most commonly manifests itself in the form of static groups with lots of members (e.g., getting into at least the tens or hundreds of thousands of members). In these kinds of entries, there can be a notable performance hit that results from that secondary evaluation, and it can be the case that skipping it can make things significantly faster. 
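To make the two steps concrete, here is a minimal Python sketch of a toy search over three entries. Everything in it (ENTRIES, INDEXES, build_candidates, matches, search) is a hypothetical name invented for illustration, attributes are simplified to a single value each, and it is not how the Directory Server is actually implemented. It only shows why an unindexed filter component can leave non-matching entry IDs in the candidate list, and how the secondary check removes them.

```python
# Toy model of the two-phase search described above. All names are
# hypothetical; this is not the Directory Server's actual code.

ENTRIES = {
    1: {"cn": "My Group", "member": "uid=john.doe,ou=People,dc=example,dc=com"},
    2: {"cn": "Other Group", "member": "uid=john.doe,ou=People,dc=example,dc=com"},
    3: {"cn": "My Group", "member": "uid=jane.roe,ou=People,dc=example,dc=com"},
}

# Only "member" is indexed in this toy setup; "cn" is deliberately unindexed.
INDEXES = {
    "member": {
        "uid=john.doe,ou=People,dc=example,dc=com": {1, 2},
        "uid=jane.roe,ou=People,dc=example,dc=com": {3},
    }
}


def build_candidates(components):
    """Step 1: intersect the ID lists of the indexed components only."""
    candidates = set(ENTRIES)  # start from every entry ID
    for attr, value in components:
        if attr in INDEXES:
            candidates &= INDEXES[attr].get(value, set())
        # An unindexed component cannot narrow the list, so it is skipped.
    return candidates


def matches(entry, components):
    """Step 2: check one entry against every component of an AND filter."""
    return all(entry.get(attr) == value for attr, value in components)


def search(components):
    """Run both steps and return the IDs of the entries that really match."""
    candidates = build_candidates(components)
    return sorted(i for i in candidates if matches(ENTRIES[i], components))


and_filter = [
    ("cn", "My Group"),
    ("member", "uid=john.doe,ou=People,dc=example,dc=com"),
]
print(build_candidates(and_filter))  # candidate IDs, a superset of the matches
print(search(and_filter))            # IDs that survive the secondary check
```

In this toy data the candidate list contains entries 1 and 2 because only the member component could be resolved through an index, while the secondary check leaves just entry 1.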
I should point out that this performance hit is especially evident for large static groups because not only are there large numbers of values, but processing them may require a lot of DN normalization, which can become expensive.

This is where the nsslapd-search-tune configuration attribute comes into play. It can be used to change the way that the server processes search operations such that if it is possible to ensure that all of the filter components were indexed, and if we are confident that the candidate list doesn't contain any entries that don't actually match the criteria, then we can skip the potentially costly secondary evaluation. For example, consider the following very simple search filter:

    (member=uid=john.doe,ou=People,dc=example,dc=com)

This is a very common type of search that might be used to identify the set of static groups in which the user uid=john.doe,ou=People,dc=example,dc=com is a member. In this type of filter (unless that user is or has been a member of enough groups to hit the ALLIDs threshold), the candidate list should be exactly the set of entries that match that filter, and the process of actually checking each of those entries against the filter doesn't add any value. Similarly, let's take this slightly more complicated filter:

    (&(cn=My Group)(member=uid=john.doe,ou=People,dc=example,dc=com))

This is a fairly common way for clients to determine if the user uid=john.doe,ou=People,dc=example,dc=com is a member of the group named "My Group". Even though it's a logical AND of multiple components, if we know that indexes were used to obtain the ID lists for each of those components, then their intersection should exactly equal the ID list for entries that match the entire filter. However, in this case there's more that could happen to make it infeasible to rely purely on the candidate list.
One case is that one of the components could be unindexed or have hit the ALLIDs threshold, in which case the resulting candidate list would only take into account the other filter component (meaning that it could include IDs for entries that don't match the component that couldn't be resolved through an index). Another could be an optimization in our filter processing code that causes the index evaluation to stop after the first component (meaning that the candidate list could include IDs for entries that don't match the second).

The nsslapd-search-tune configuration attribute can be used to allow the server to skip this secondary filter evaluation and just rely on the candidate list when it's possible to ensure that it doesn't contain any non-matching entries. If you do wish to use it, then it should go in the "cn=config,cn=ldbm database,cn=plugins,cn=config" entry. Its value is an integer that is interpreted as a bitmap, meaning that different bits are taken to mean different things. The most important bits are:
As I mentioned above, the value is a bitmap, which means that you can just add the values of each of the components together to get what you want. For example, if you want to enable all of these options, you would use a value of 57 (1+8+16+32=57). In most deployments, however, it is probably advisable to leave out the "8" option, leaving you with a value of 49 (1+16+32=49).

So what's the problem with the "8" option? It has to do with a very tiny race condition that can arise if the secondary filter test is skipped. As noted above, normal search processing is done by first building the candidate list, then retrieving each entry, checking it against the filter, and sending it to the client if it matches. It is possible that in the split-second between the time that the candidate list is constructed and the time that an entry on that list is retrieved and returned to the client, that entry could be modified so that it no longer matches the filter criteria. With the secondary filter check in place, the server would see that the entry no longer matches and wouldn't send it to the client, but if that secondary test is skipped then it is possible that the server could return an entry that no longer matches. In most cases, this isn't a problem since the entry did match the filter at the time that the server started processing the search, but it does have the potential to cause a problem if the client does try to do some validation on the entries that it gets back and "freaks out" in some way if it finds one that doesn't match. The chance that the required conditions will occur is extremely small, and the chance that the client will notice and care about it is even smaller. Nevertheless, if you're concerned about it and want to avoid that possibility, then you can just leave option "8" out of the bitmap.
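The bitmap arithmetic above can be sketched in a few lines of Python. The option names below are placeholders I made up (only the numeric values 1, 8, 16, and 32 come from the discussion above), and combine and is_set are hypothetical helpers, not part of any server tooling.

```python
# Placeholder names: only the numeric bit values come from the text above.
OPTION_1 = 1
OPTION_8 = 8
OPTION_16 = 16
OPTION_32 = 32


def combine(*options):
    """OR the given bit values together into a single bitmap value."""
    value = 0
    for option in options:
        value |= option
    return value


def is_set(value, option):
    """Return True if every bit of 'option' is present in 'value'."""
    return (value & option) == option


everything = combine(OPTION_1, OPTION_8, OPTION_16, OPTION_32)
without_8 = combine(OPTION_1, OPTION_16, OPTION_32)

print(everything)                   # 57
print(without_8)                    # 49
print(is_set(without_8, OPTION_8))  # False
```

Because each option occupies its own bit, adding the values and OR-ing them give the same result, which is why the simple sums 1+8+16+32 and 1+16+32 work.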
It's important to note that the nsslapd-search-tune configuration attribute has been around in one form or another for quite some time (it was around in a more limited form in the 4.x server), but there were code changes required to make it "safer" to use (i.e., to reduce the chance of false positives). If you're running the 5.1 version of the server, then those fixes were not integrated until the 5.1SP4 release. If you're running the 5.2 version of the server, then they were not integrated until the 5.2patch2 release. If you're running something older than that, then you really should consider upgrading for a number of reasons, but until you do it's probably best to avoid using nsslapd-search-tune.

It's also important to note that this configuration attribute is not a panacea, and it can slow things down in some cases because of the search optimizations that it can disable. I would advise against using it unless you are seeing a performance problem in dealing with entries containing an attribute with a large number of values (like big static groups). If you're not sure if a particular use case might benefit from nsslapd-search-tune, then open a support case and we can help you figure that out, and we may be able to provide other tuning recommendations that may also help.

Posted by cn_equals_directory_manager ( Aug 17 2006, 09:01:32 AM CDT )

OpenDS, Java, and Performance

I've seen a few questions, comments, and concerns recently about how OpenDS will perform. In particular, many of them have focused on the fact that OpenDS is written in Java. I wanted to take this opportunity to address them here.

If you've heard anything about Sun's Directory Server in the last several years, it's hard to miss the fact that performance and scalability are very important to us. We've been continually working on improving this in our current Sun Java System Directory Server, and it is one of the main goals of OpenDS to be even faster and more scalable.
If we really thought that Java was going to significantly hold us back, then the project probably wouldn't have even gotten off the ground.

First, though, I should say that we haven't yet spent a lot of time on performance optimization, primarily because it's too early in the process. Since there's still a lot more code to be written, anything that we do to optimize performance now might get negated by changes that we make later when adding other features. We'd rather get things working first and then see what we need to do to make it as fast as possible. However, I will say that we're definitely writing the code with performance and scalability in mind, and we have run a few performance tests that do show that we're in pretty good shape. For example, our modify performance is about twice as fast as the best that I've seen out of Directory Server 6. Our LDIF import code is also faster and more scalable. Search performance right now can range from a little slower to a lot faster, but we also know of a few things that can be improved to give us some significant boosts. I do find it a little hard to resist the temptation to jump into performance analysis, but holding off at least allows development to progress more quickly.

So how does Java fit into all of this? In many ways, it does play a very big role because the JVM is what is actually executing the code. The performance of the JVM has improved dramatically over time, and it keeps getting better. This works very much in our favor, since switching to a faster JVM can give OpenDS a notable performance boost without changing a line of code. Further, new releases of Java can introduce new features and capabilities that we may be able to use to our advantage. For example, the Java 5 release added a number of libraries for better concurrency that allow us to reduce locking and improve performance and scalability, and it improved upon the NIO features added in the 1.4 release for fast and scalable network communication.
Back in the Java 1.1 or 1.2 days, we probably would have been crazy to attempt something like this and expect decent performance, but things have come a long way since then. Note that I am talking primarily about Sun's JVM implementation, since I don't have a lot of experience with those from other vendors. It may or may not be the case that other JVMs share the same level of performance, but Sun's implementation certainly shows what's possible. Other things to consider with regard to Java and performance include:
However, there are also a number of improvements in the area of performance and scalability that are not directly related to our use of Java. We have made a number of architectural changes based on our experiences with the current Directory Server that allow us to avoid problems that we've encountered in the past. We have focused heavily on avoiding the need to acquire locks wherever possible, and to minimize the scope of those locks that are necessary. We have changed the way in which we store entries in the backend database to make them much more efficient to encode and decode so that it is still possible to get very good performance even when not using an entry cache, and we've changed our approach to entry caching to allow for different implementations that may be suited for different kinds of use cases. The algorithms used for interacting with indexes in the backend can allow us to be more efficient and reduce the number of entries that we need to look at in several cases. We've increased the parallelization in several areas like connection handling and backend processing to be faster and more scalable. As development continues, we'll add other improvements throughout the code.

So to summarize, the performance today is pretty impressive and we know that there are ways to make it better. Java is not going to be a performance inhibitor, and it's only going to keep getting better. You can get it now to try it out for yourself, or watch our development to see how we progress in the future. If you still have concerns, or if you have ideas about how we can make it better, then feel free to participate on our mailing lists.

Posted by cn_equals_directory_manager ( Aug 11 2006, 02:29:21 PM CDT )

Interesting new features in OpenDS

As we've stated in the past, our intention is to make OpenDS better than any other directory server. This is true on multiple fronts, including performance, scalability, reliability, supportability, and feature set.
All of the features that we want to include in OpenDS have been entered into the issue tracker, but I thought that I would highlight some of the more interesting new features here.

Configuration Features
Backup/Restore and Import/Export Features
Backend Features
Entry Cache Features
Connection Handler Features
Security Features
Password Policy Features
Logging Features
Monitoring/Alerting Features
Note that these aren't all of the new features that we're planning. We've got some other gems that we're thinking about as well. As I mentioned above, you can check our issue tracker for what should be a pretty complete list. If you have other ideas for features that you don't see in the list, then feel free to add them to the issue tracker yourself (anyone with the User role or higher in the project can file new issues), and if you're so inclined then you could even help us implement them.

I should point out that not everything listed in the issue tracker will necessarily be implemented in OpenDS, and if it is included then it might not be in the initial release. We're certainly working hard to make the server as feature-rich as possible, but we're not yet at the point at which we can make any guarantees about what will be available and when.

Posted by cn_equals_directory_manager ( Aug 07 2006, 01:45:13 PM CDT )

Introducing the OpenDS Directory Service

The Sun Java System Directory Server has a distinguished heritage and a proven track record, with thousands of customers and billions of entries deployed. However, the codebase is over ten years old and its origins are from a time when performance, scalability, and feature set requirements were very different from what we're seeing today and expect to see in the future. It has also grown quite complex, and making a change to one area of the code may require an in-depth understanding of several other components. We're still working on improving this code, and the upcoming Directory Server 6.0 release will be the best we've had yet, but we are also preparing for the future and we think that an open development model needs to be a big part of that.

Last year, we decided to start from scratch, designing a new server from the ground up.
Drawing from our years of experience, customer feedback, and some of our own ideas, we began working on what we hope will become even more enduring and successful than our current Directory Server. We're calling the new codebase OpenDS, and last Friday we rather quietly pushed the code out to https://opends.dev.java.net/.

OpenDS is an open source Directory Service written entirely in Java. I say "Directory Service" because we will include more than just the core LDAP-accessible database. Much like our current Directory Server Enterprise Edition, we'll also include directory proxy functionality (including virtual directory and data distribution capabilities), the ability to synchronize with Active Directory and potentially other sources, and various client-side tools. To date, our development has primarily focused on the Directory Server itself, and we have basic support for all core LDAPv3 operations and for several extensions like controls, extended operations, and SASL mechanisms. However, there are still a number of components to be added, like the access control subsystem, virtual attribute capabilities, and administrative interfaces, and there is significant work to be done in other areas like password policy and logging.

We very much wish to have an open development process, and we welcome community participation. You can provide comments and feedback, file bugs and enhancement requests, participate in mailing lists, or even write code. I'm sure that there will be a lot of questions about OpenDS, and our FAQ (I suppose in this case, that's "Frequently Anticipated Questions") at https://opends.dev.java.net/public/docs/OpenDS-FAQ.html aims to address many of them. If you still have other questions, then check out other parts of the site like our Documentation Depot or join our mailing lists. We'd love to have you help us achieve our goal of making OpenDS better than any other directory product out there.
Posted by cn_equals_directory_manager ( Aug 01 2006, 03:33:21 PM CDT )