A Tagged Index Object for use in the Common Indexing Protocol
RFC 2654        Tagged Index Object for use in CIP       August 1999

   5. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . .13
   5.1 The original database . . . . . . . . . . . . . . . . . . . .13
   5.1.1 "complete" consistency based full update  . . . . . . . . .14
   5.1.2 "tag" consistency based full update . . . . . . . . . . . .14
   5.1.3 "unique" consistency based full update  . . . . . . . . . .15
   5.2 First update  . . . . . . . . . . . . . . . . . . . . . . . .16
   5.2.1 "complete" consistency based incremental update . . . . . .16
   5.2.2 "tag" consistency based incremental update  . . . . . . . .17
   5.2.3 "unique" consistency based incremental update . . . . . . .17
   5.3 Second update . . . . . . . . . . . . . . . . . . . . . . . .18
   5.3.1 "complete" consistency based incremental update . . . . . .18
   5.3.2 "tag" consistency based incremental update  . . . . . . . .19
   5.3.3 "unique" consistency based incremental update . . . . . . .20
   6. Aggregation  . . . . . . . . . . . . . . . . . . . . . . . . .21
   6.1 Aggregation of Tagged Index Objects . . . . . . . . . . . . .21
   7. Security Considerations  . . . . . . . . . . . . . . . . . . .21
   8. References . . . . . . . . . . . . . . . . . . . . . . . . . .22
   9. Authors' Addresses . . . . . . . . . . . . . . . . . . . . . .23
   Full Copyright Statement  . . . . . . . . . . . . . . . . . . . .24

1. Introduction

   The Common Indexing Protocol (CIP) as defined in [1] proposes a
   mechanism for distributing searches across several instances of a
   single type of search engine to create a global directory. CIP
   provides a scalable, flexible scheme to tie individual databases
   into distributed data warehouses that can scale gracefully with the
   growth of the Internet. CIP provides a mechanism for meeting these
   goals that is independent of the access method that is used to
   access the data that underlies the indices.
   Separate from CIP is the definition of the Index Object that is
   used to contain the information that is exchanged among Index
   Servers. One such Index Object that has already been defined is the
   Centroid that is derived from the Whois++ protocol [2].

   The Centroid does not meet all the requirements for the exchange of
   index information amongst information servers. For example, it does
   not support the notion of incremental updates natively. For
   information servers that contain millions of records in their
   database, constant exchange of complete dredges of the database is
   bandwidth intensive. The Tagged Index Object is specifically
   designed to support the exchange of index update information. This
   design comes at the cost of an increase in the size of the index
   object being exchanged. The Centroid is also not tailored to always
   be able to give boolean answers to queries. In the Centroid Model,
   "an index server will take a query in standard Whois++ format,
   search its collections of centroids and other forward information,
   determine which servers hold records which may fill that query, and
   then

Hedberg, et al.               Experimental



