MARS - a Message Archiving and Retrieval Service
NWG/RFC 744                                   JS5 8-Jan-78 21:59  42857
MARS - A Message Archiving & Retrieval Service

II.  Using MARS
     ----------

A.  Message Indexing
    ----------------

For each message, a vector of parsed tokens is created.  The parsed
tokens are collected by the message-field in which they occurred, to
be used as "indexes", i.e., values of inverted fields, by the
Datacomputer.  The Filer indexes essentially without analysis, with
the following exceptions:

   -- Each distinguishable section of the message is indexed
      separately; each header line is a separate inversion domain,
      as is the body of the message.

   -- The header lines which contain ARPANET addresses are analyzed
      in order to index separately on mailbox and host.

   -- The date-field is parsed and converted to the standard Tenex
      internal date/time format, which is better adapted for
      less-than/greater-than comparisons, as in retrievals which
      specify a date range.

   -- One-character words in both the subject-field and the
      message-text field are arbitrarily discarded.

   -- Two-character words in the message-text field are arbitrarily
      discarded.

   -- Hyphenated phrases, i.e., words bound together by hyphens, are
      retained intact.

   -- All message formats which conform to RFC 733 standards are
      accommodated.  The minimum requirements are: a date-field, a
      from-field, and a blank line between the message-header and
      message-body.
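
The indexing rules above can be illustrated with a minimal Python
sketch.  It is not the Filer's actual code.  The function and domain
names (index_message, "from.mailbox", "text"), the set of address
fields, the accepted date layouts, and the simplified address pattern
are all assumptions made for illustration, and a Python datetime
stands in for the Tenex internal date/time value.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumptions for this sketch, not taken from the RFC:
ADDRESS_FIELDS = {"from", "to", "cc", "sender"}       # fields split into mailbox/host
DATE_FORMATS = ("%d-%b-%y %H:%M", "%d %b %Y %H%M")    # accepted date layouts

# A word is a run of letters/digits; hyphen-joined runs stay in one token.
WORD = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")
# Simplified "mailbox at host" / "mailbox@host" matcher, not a full RFC 733 parser.
ADDRESS = re.compile(r"([\w.-]+)\s*(?:@|\bat\b)\s*([\w.-]+)", re.IGNORECASE)


def parse_date(value):
    """Convert a date-field to a value that supports </> comparisons."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            pass
    raise ValueError(f"unrecognized date: {value!r}")


def words(text, min_len):
    """Tokenize text, discarding words shorter than min_len characters."""
    return [w for w in WORD.findall(text) if len(w) >= min_len]


def index_message(message):
    """Return {inversion domain: [index values]} for one message."""
    header, blank, body = message.partition("\n\n")
    if not blank:
        raise ValueError("no blank line between header and body")
    index = defaultdict(list)
    seen = set()
    for line in header.splitlines():
        name, colon, value = line.partition(":")
        if not colon:
            continue  # continuation lines are ignored in this sketch
        field = name.strip().lower()
        seen.add(field)
        if field in ADDRESS_FIELDS:
            # Index mailbox and host separately.
            for mailbox, host in ADDRESS.findall(value):
                index[field + ".mailbox"].append(mailbox)
                index[field + ".host"].append(host)
        elif field == "date":
            index[field].append(parse_date(value))
        else:
            # One-character words are dropped from the subject only.
            index[field].extend(words(value, 2 if field == "subject" else 1))
    if not {"date", "from"} <= seen:
        raise ValueError("a date-field and a from-field are required")
    # One- and two-character words are dropped from the message text.
    index["text"].extend(words(body, 3))
    return dict(index)
```

Under these assumptions, a message with the subject "A test of
re-indexing" and the body "It is a well-known fact." yields the
subject values test, of, re-indexing and the text values well-known,
fact.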