Red Hat Directory Server 7.1 Administrator's Manual

Chapter 10: Managing Indexes (page 387)

About Indexes

In the redesigned index, the storage manager has visibility into the fine-grain index structure, which optimizes transaction logging so that only the bytes actually changed need to be logged for any given index modification. The BerkeleyDB feature provides ID list semantics, which are implemented by the storage manager. The enhanced Berkeley API supports duplicate index keys, inserting and deleting individual IDs stored against a common key, and an optimized mechanism to retrieve the complete ID list for a given key.

This means the storage manager has direct knowledge of the application’s intent when changes are made to ID lists. As a result:

• For long ID lists, the number of bytes written to the transaction log for any update to the list is significantly reduced, from the maximum ID list size (8 Kbytes) to twice the size of one ID (4 bytes).
• For short ID lists, storage efficiency, and in most cases performance, is improved because only the storage manager metadata needs to be stored, not the ID list metadata.
• The average number of database pages marked as dirty per ID insert or delete operation is very small because a large number of duplicate keys fit into each database page.

For each entry ID list, there is a size limit that is globally applied to all index keys managed by the server. This limit used to be called the All IDs Threshold. It capped how large a single entry ID list could get, because maintaining large ID lists in memory can affect performance. When a list reached a certain predetermined size, the search acted as if the index contained the entire directory.

The All IDs Threshold was difficult to set well, and a poor setting hurt performance. If the threshold was too low, too many searches examined every entry in the directory. If it was too high, too many large ID lists had to be maintained in memory.

The problems addressed by the All IDs Threshold are no longer present because of the efficiency of entry insertion, modification, and deletion in the BerkeleyDB design. The All IDs Threshold is removed for database write operations, and every ID list is now maintained accurately.

Since loading a long ID list from the database can still significantly reduce search performance, the configuration parameter nsslapd-idlistscanlimit sets a limit on the number of IDs that are read before a key is considered to equal the entire directory. nsslapd-idlistscanlimit is analogous to the All IDs Threshold, but it applies only to the behavior of the search, not to the content of the database. See “Overview of the Searching Algorithm,” on page 392, for more information on searching and indexes.
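
The search-time effect of nsslapd-idlistscanlimit can be illustrated with a small model. The following Python sketch is not the server's implementation. The names read_id_list, candidates, and ALLIDS are hypothetical, the index is modeled as a plain dictionary, and the exact boundary behavior (a list is treated as the whole directory only when it holds more IDs than the limit) is an assumption made for illustration.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical sentinel: the key is treated as matching the entire directory.
ALLIDS = None


def read_id_list(index: Dict[str, List[int]], key: str,
                 scan_limit: int) -> Optional[List[int]]:
    """Read the entry IDs stored under `key`.

    Stops and returns ALLIDS as soon as more than `scan_limit` IDs
    would have to be read. The stored index itself is never modified.
    """
    ids: List[int] = []
    for entry_id in index.get(key, ()):
        if len(ids) >= scan_limit:
            # Limit exceeded: give up on the index for this key.
            return ALLIDS
        ids.append(entry_id)
    return ids


def candidates(index: Dict[str, List[int]], key: str, scan_limit: int,
               all_entry_ids: Iterable[int]) -> List[int]:
    """Return the entry IDs a search would have to examine for `key`."""
    ids = read_id_list(index, key, scan_limit)
    if ids is ALLIDS:
        # Behaves like an unindexed search: every entry is a candidate.
        return list(all_entry_ids)
    return ids


if __name__ == "__main__":
    # Toy index: one short ID list and one long ID list.
    index = {
        "sn=smith": [3, 7],
        "objectclass=person": [1, 2, 3, 4, 5, 6, 7, 8],
    }
    print(candidates(index, "sn=smith", 4, range(1, 9)))
    print(candidates(index, "objectclass=person", 4, range(1, 9)))
```

In this model the long ID list stays complete in the index after the lookup. Only the search falls back to examining every entry, which mirrors the distinction drawn above between search behavior and database content.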