What document(s) define(d) the Unicode standard?
Unicode's character set is defined in parallel by two co-operating bodies: the Unicode Consortium, which publishes the Unicode Standard, and ISO/IEC JTC 1/SC 2/WG 2, which maintains ISO/IEC 10646, the Universal Multiple-Octet Coded Character Set (UCS).
The relationship between Unicode and the ISO UCS has been the subject of recurring discussions. Unicode can be seen either as a special implementation of the ISO UCS or as its underlying idea. The fact is that the two standards have converged and benefitted from each other. They specify identical character codes, so Unicode and UCS can be used synonymously for all practical purposes.
"The Unicode Standard, Version 1.0" was first published as a book (ISBN 0-201-56788-1 from Addison-Wesley, out of print) in 1991. The first Unicode-incompatible ISO WG draft DIS-10646.1:1990 (no longer available) had been voted down by the ISO plenary in favor of the linear Unicode proposal.
The two proposals were subsequently merged. In 1993 the result was published as the Unicode-compatible international standard ISO/IEC 10646 Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane (price code XN = 351 CHF for 754 pages, with much less descriptive text but sharper images on brighter paper than the Unicode standard), together with an aligned Unicode 1.1 (Unicode Technical Report # 4, diff to Unicode 1.0).
Amendments 1..7 (UTF-16, UTF-8, reordered Hangul syllables, ...) to ISO/IEC 10646-1 were reflected in the book "The Unicode Standard, Version 2.0" (ISBN 0-201-48345-9) in 1996. The Chinese ideographs are now included instead of being outsourced into a second volume. The book reads like an encyclopedia of the world's scripts and is definitely a bargain at 59 USD for 930 A4 pages. Much of its content is available online (character properties database, textual descriptions, cross references, illustrative glyph charts, even the full text of some chapters: TOC, 1, 3.11). Not yet online are the illustrated texts describing some of the more complicated algorithms, such as the Devanagari and Tamil rendering rules, and the implementation hints; the online material is also not bundled as nicely as the book.
Unicode 2.1 (Unicode Technical Report # 8, diff to Unicode 2.0) was released in 1998. It fixed a number of errors and added U+20AC EURO SIGN for the new European currency and U+FFFC OBJECT REPLACEMENT CHARACTER as a placeholder for images and other embedded objects.
Recent ISO-10646-1 amendments (such as Ethiopic and others in the pipeline) shall be reflected in Unicode 3.0, scheduled for publication in the near future (1999?). Besides extending the existing scripts (including 6,582 new CJK Unified Ideographs, Extension A), there shall be new support for Braille, Canadian Aboriginal Syllabics, Cherokee, Ethiopic, Khmer, Mongolian, Myanmar, Ogham, Runic, Sinhala, Syriac, Thaana and Yi. Unicode 3.0 shall still be limited to BMP characters and may be accompanied by a revised second edition of ISO-10646-1 to reduce the spaghetti of incremental amendments.
Some later version after Unicode 3.0, in the next millennium, shall be the first to include defined characters in the non-BMP planes. A second part, ISO-10646-2, shall comprehensively define the contents of all non-BMP planes.
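To make the BMP / non-BMP distinction concrete, here is a minimal Python sketch. The helper names `plane_of` and `is_bmp` are made up for this illustration, and the sketch assumes the 17 planes (U+0000 to U+10FFFF) that are reachable through UTF-16; it is not taken from either standard's text.

```python
# Illustrative sketch only: plane_of and is_bmp are hypothetical helper names.
# Assumption: code points are limited to the 17 planes reachable via UTF-16.

MAX_CODE_POINT = 0x10FFFF


def plane_of(code_point: int) -> int:
    """Return the plane number (0 = BMP) that a code point belongs to."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError(f"code point out of range: {code_point:#x}")
    # Each plane holds 0x10000 (65,536) code points.
    return code_point >> 16


def is_bmp(code_point: int) -> bool:
    """True if the code point lies in the Basic Multilingual Plane."""
    return plane_of(code_point) == 0


def utf16_length(code_point: int) -> int:
    """Number of bytes the code point occupies in UTF-16 (big-endian, no BOM)."""
    return len(chr(code_point).encode("utf-16-be"))


if __name__ == "__main__":
    for cp in (0x20AC, 0xFFFC, 0x10000):
        print(f"U+{cp:04X}: plane {plane_of(cp)}, "
              f"BMP={is_bmp(cp)}, UTF-16 bytes={utf16_length(cp)}")
```

BMP characters such as U+20AC fit into a single 16-bit UTF-16 unit, while any non-BMP code point needs a surrogate pair, which is why the byte count doubles from plane 1 onwards.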
The Unicode website also provides an official version history now.