ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages
RFC 1842            ASCII/Chinese Character Encoding         August 1995

Table of Contents

   1.  Introduction
   2.  Description
   3.  Formal Syntax
   4.  MIME Considerations
   5.  Background Information
   6.  References
   7.  Acknowledgements
   8.  Security Considerations
   9.  Authors' Addresses
   10. Appendix: List of Software Implementing HZ Representation

1. Introduction

   Chinese (and other East Asian language) characters are encoded with
   multiple bytes to guarantee sufficient coding space for the large
   number of glyphs these languages contain. With the proliferation of
   internetwork traffic around the world, it has become necessary to
   define ways to facilitate the transfer of text in multiple-byte
   character-set languages (hereafter "Chinese text") over the Internet.

   Any mechanism whose purpose is to transfer Chinese text over the
   Internet must address two layers of concern. The first is the
   application layer, in which the applications concerned should be able
   to recognize the encoding of the text and/or discern the different
   character sets that might be mixed in the text, and handle it
   accordingly. The second layer is the actual transport of Chinese text
   from point A to point B over the Internet. Because the prevailing
   mail transport protocol used over the Internet, the Simple Mail
   Transfer Protocol (SMTP), was originally designed for the ASCII
   character set only, many Internet mail agents are not 8-bit clean.
   This poses a challenge for any attempt to implement a mechanism for
   transporting Chinese text over the Internet.
   Here we describe a mechanism for the transmission of Chinese text
   over IP networks. This mechanism, the HZ representation, has been
   implemented by various software packages dealing with multi-language
   support and has been tested on USENET newsgroups and other types of
   Internet forums over the last two years. The test results show that
   the HZ representation can pass through almost all existing mail
   delivery agents without being corrupted.

   The HZ representation currently handles the GB2312-80 Chinese
   character set only. Further expansion to other Chinese encoding
   systems and to other East Asian languages is under consideration.

Wei, et al                   Informational
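   The 8-bit-clean concern raised in the Introduction can be
   illustrated with a small sketch. It relies on the "gb2312" and "hz"
   codecs that ship with Python's standard library. The sample string
   and the helper names strip_high_bit and is_seven_bit are assumptions
   made for illustration and are not part of this document. The sketch
   only shows that raw GB2312 bytes need the eighth bit while their HZ
   form does not; it is not a reference implementation of HZ.

```python
# Illustrative sketch only: the helpers and sample text are hypothetical.
# The "gb2312" and "hz" codecs are part of Python's standard library.


def strip_high_bit(data: bytes) -> bytes:
    """Simulate a mail agent that is not 8-bit clean by clearing bit 7."""
    return bytes(b & 0x7F for b in data)


def is_seven_bit(data: bytes) -> bool:
    """Return True when every byte fits in 7 bits (plain ASCII range)."""
    return all(b < 0x80 for b in data)


text = "中文"  # sample GB2312-80 characters (assumed example)

raw_gb = text.encode("gb2312")  # raw GB2312: every byte has bit 7 set
hz = text.encode("hz")          # HZ form: ASCII printable bytes only

print("raw GB2312 is 7-bit:", is_seven_bit(raw_gb))
print("HZ is 7-bit:        ", is_seven_bit(hz))

# A 7-bit channel leaves the HZ bytes untouched but damages raw GB2312.
print("HZ survives:        ", strip_high_bit(hz) == hz)
print("raw GB2312 survives:", strip_high_bit(raw_gb) == raw_gb)

# The HZ bytes decode back to the original text.
print("round trip ok:      ", hz.decode("hz") == text)
```

   Clearing the eighth bit is only one way a mail agent can damage
   8-bit data, so this models the problem rather than any particular
   agent's behavior.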



