ISO/IEC JTC1/SC2/WG2 N2
2000-04-13

Universal Multiple-Octet Coded Character Set
International Organization for Standardization
Organisation Internationale de Normalisation
Международная организация по стандартизации

Doc Type: Working Group Document
Title: Proposal to encode New Tai Lü in the BMP of the UCS
Source:, EGT (IE)
Status: Expert Contribution
Action: For consideration by JTC1/SC2/WG2 and UTC
Date: 2000-04-13

This document requests the New Tai Lü script to be added to the UCS and contains the proposal summary form. THIS IS ONLY A DRAFT.

A. Administrative
1. Title
   Proposal to encode New Tai Lü in the BMP of the UCS.
2. Requester's name
   , EGT (WG2 member for Ireland).
3. Requester type
   Expert contribution.
4. Submission date
   2000-04-13
5. Requester's reference
6a. Completion
   This is a DRAFT proposal.
6b. More information to be provided?

B. Technical -- General
1a. New script? Name?
   New Tai Lü.
1b. Addition of characters to existing block? Name?
2. Number of characters
   56
3. Proposed category
   Category A.
4. Proposed level of implementation and rationale
   Level 2; combining characters are used in the Brahmic fashion.
5a. Character names included in proposal?
5b. Character names in accordance with guidelines?
5c. Character shapes reviewable?
   Yes (see below).
6a. Who will provide computerized font?
   and James Kass.
6b. Font currently available?
6c. Font format?
   TrueType.
7a. Are references (to other character sets, dictionaries, descriptive texts, etc.) provided?
7b. Are published examples (such as samples from newspapers, magazines, or other sources) of use of proposed characters attached?
8. Does the proposal address other aspects of character data processing?
   Yes, see Unicode properties below.

C. Technical -- Justification
1. Contact with the user community?
   Zhang Gongjin, Professor of Linguistics, Director of Institute for Kam-Tai Studies, Central University for Nationalities, Beijing.
2. Information on the user community?
   XXXXXX.
3a. The context of use for the proposed characters?
   Used to write Tai Lü.
3b. Reference
   See bibliography.
4a. Proposed characters in current use?
4b. Where?
   In day-to-day contexts.
5a. Characters should be encoded entirely in BMP?
5b. Rationale
   Accordance with the Roadmap.
6. Should characters be kept in a continuous range?
7a. Can the characters be considered a presentation form of an existing character or character sequence?
7b. Where?
7c. Reference
8a. Can any of the characters be considered to be similar (in appearance or function) to an existing character?
8b. Where?
8c. Reference
9a. Combining characters or use of composite sequences included?
9b. List of composite sequences and their corresponding glyph images provided?
   Yes, see below.
10. Characters with any special properties such as control function, etc. included?
E. Proposal

The New Tai Lü script was developed as a simplification of the Lanna script used to represent XXXXX languages spoken in XX, China, Laos, Thailand etc. It differs from the Lanna script in that it simplifies and regularizes the consonant repertoire, and in that it makes use of spacing vowel signs which appear before or after the consonants they modify, where Lanna makes use of both spacing vowel signs and nonspacing vowel signs which appear above or below the consonants they modify. Vowel signs in New Tai Lü are, however, still considered combining characters, and follow their base consonants in the text stream.

VIRAMA is also used to create a limited set of conjunct consonants, most of which are considered unique letters of the alphabet. New Tai Lü also makes use of a limited set of final consonants, which are modified with a hook; this hook is understood to be the VIRAMA and, as in other Brahmic scripts, is encoded, when visible, as VIRAMA + ZERO-WIDTH NON-JOINER. One nonspacing mark is also used with three consonants to form unique letters of the alphabet. A number of true consonant clusters (liquids and nasals) are also formed with the VIRAMA.

Issue: One vowel sign E could be decomposed to e + e. Is there an advantage to encoding it separately?

Table A below presents the set of letters modified with VIRAMA. Most of these follow the usual Brahmic model, but six of them are spacing vowel signs modified by a final -y glide; structurally (in terms of the encoding) these are identical to the consonants; it is only the reading rules which differ. So pha (xx13) + VIRAMA (xx2f) + ya (xx0a) yields the conjunct phya, while the vowel sign ā (xx21) + VIRAMA (xx2f) + ya (xx0a) yields āy.

Table B below lists all the permissible finals (omitting the tone marks X and X, whose use is uncertain at present). Table C below lists these finals in what is assumed to be alphabetical order. This needs to be verified.

Issue: The use of the tone marks is unknown. Consonants come in two classes, high tone and low tone; presumably these tone marks appear in syllable-final position and modify the contours of the base tones.

Issue: Is there any script-specific punctuation?

Issue: What is the Abbreviation mark at xx1e? Apparently it can also take a subscript -wa.

Issue: Are both European digits and New Tai Lü digits in use?

Issue: I need sample printed texts.
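The encoding model described in this section (consonant + VIRAMA + consonant for a conjunct, consonant + VIRAMA + ZERO-WIDTH NON-JOINER for a visible final) can be sketched in a few lines of Python. This is only an illustration: because the row is not yet assigned, the sketch works with offsets inside the row "xx", and the helper names `final_consonant`, `conjunct` and `show` are hypothetical, not part of the proposal.

```python
# Offsets within the proposed (still unassigned) row "xx".
# The VIRAMA position is the one cited in Table A of this proposal.
VIRAMA = 0x2F        # NEW TAI LU VIRAMA
ZWNJ = "\u200C"      # ZERO WIDTH NON-JOINER, already encoded in the UCS


def final_consonant(consonant):
    """Visible final: consonant + VIRAMA + ZWNJ."""
    return [consonant, VIRAMA, ZWNJ]


def conjunct(first, second):
    """Conjunct: first consonant (or vowel sign) + VIRAMA + second consonant."""
    return [first, VIRAMA, second]


def show(sequence):
    """Render a sequence in the notation used in Table A."""
    parts = []
    for item in sequence:
        if item == ZWNJ:
            parts.append("ZWNJ 200C")
        else:
            parts.append(f"xx{item:02x}")
    return " + ".join(parts)


print(show(final_consonant(0x01)))   # the Table A row for FINAL K
print(show(conjunct(0x13, 0x0A)))    # the Table A row for CONJUNCT PHYA HIGH
```

Keeping the ZWNJ outside the row offsets reflects the fact that it is an existing UCS character rather than one of the 56 proposed characters.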
Table A. New Tai Lü Conjuncts

Unique letters of the alphabet
xx00 + xx2b                   = NEW TAI LU LETTER A LOW
xx1a + xx2f + xx05            = NEW TAI LU LETTER NGA HIGH
xx1a + xx2f + xx10            = NEW TAI LU LETTER NA HIGH
xx1a + xx2f + xx15            = NEW TAI LU LETTER MA HIGH
xx1a + xx2f + xx18            = NEW TAI LU LETTER VA HIGH
xx1a + xx2f + xx19            = NEW TAI LU LETTER LA HIGH
xx1c + xx2b                   = NEW TAI LU LETTER DA LOW
xx1d + xx2b                   = NEW TAI LU LETTER BA LOW

Final consonants
xx01 + xx2f + ZWNJ 200C       = NEW TAI LU FINAL K
xx05 + xx2f + ZWNJ 200C       = NEW TAI LU FINAL NG
xx10 + xx2f + ZWNJ 200C       = NEW TAI LU FINAL N
xx15 + xx2f + ZWNJ 200C       = NEW TAI LU FINAL M
xx18 + xx2f + ZWNJ 200C       = NEW TAI LU FINAL V
xx1c + xx2f + ZWNJ 200C       = NEW TAI LU FINAL D
xx1d + xx2f + ZWNJ 200C       = NEW TAI LU FINAL B

Conjuncts
xx13 + xx2f + xx0a            = NEW TAI LU CONJUNCT PHYA HIGH
xx14 + xx2f + xx0a            = NEW TAI LU CONJUNCT PHYA LOW
xx18 + xx2f + xx0a            = NEW TAI LU CONJUNCT OOY
xx21 + xx2f + xx0a            = NEW TAI LU CONJUNCT AAY
xx22 + xx2f + xx0a            = NEW TAI LU CONJUNCT IIY
xx24 + xx2f + xx0a            = NEW TAI LU CONJUNCT UUY
xx28 + xx2f + xx0a            = NEW TAI LU CONJUNCT OY
xx29 + xx2f + xx0a            = NEW TAI LU CONJUNCT UWY
xx08 + xx2f + xx10            = NEW TAI LU CONJUNCT SNA HIGH
xx09 + xx2f + xx10            = NEW TAI LU CONJUNCT SNA LOW
xx0f + xx2f + xx10            = NEW TAI LU CONJUNCT THNA LOW
xx08 + xx2f + xx15            = NEW TAI LU CONJUNCT SMA HIGH
xx09 + xx2f + xx15            = NEW TAI LU CONJUNCT SMA LOW
xx01 + xx2f + xx18            = NEW TAI LU CONJUNCT KWA HIGH
xx02 + xx2f + xx18            = NEW TAI LU CONJUNCT KWA LOW
xx03 + xx2f + xx18            = NEW TAI LU CONJUNCT XWA HIGH
xx04 + xx2f + xx18            = NEW TAI LU CONJUNCT XWA LOW
xx07 + xx2f + xx18            = NEW TAI LU CONJUNCT CWA LOW
xx08 + xx2f + xx18            = NEW TAI LU CONJUNCT SWA HIGH
xx09 + xx2f + xx18            = NEW TAI LU CONJUNCT SWA LOW
xx0b + xx2f + xx18            = NEW TAI LU CONJUNCT YWA LOW
xx0d + xx2f + xx18            = NEW TAI LU CONJUNCT TWA LOW
xx1e + xx2f + xx18            = NEW TAI LU CONJUNCT LWE [???]
xx01 + xx2f + xx19            = NEW TAI LU CONJUNCT KLA HIGH
xx02 + xx2f + xx19            = NEW TAI LU CONJUNCT KLA LOW
xx04 + xx2f + xx19            = NEW TAI LU CONJUNCT XLA LOW
xx08 + xx2f + xx19            = NEW TAI LU CONJUNCT SLA HIGH
xx09 + xx2f + xx19            = NEW TAI LU CONJUNCT SLA LOW
xx0e + xx2f + xx19            = NEW TAI LU CONJUNCT THLA HIGH
xx13 + xx2f + xx19            = NEW TAI LU CONJUNCT PHLA HIGH
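Since the forms in Table A are composite sequences rather than separately encoded characters, a rendering or analysis process has to recognize them in the text stream. The following Python sketch shows one possible approach, a greedy longest-match lookup over a handful of rows copied from Table A. The function name `segment`, the use of row offsets in place of real code positions, and the choice of greedy matching are all assumptions made for illustration; the proposal does not prescribe any algorithm.

```python
ZWNJ = 0x200C  # ZERO WIDTH NON-JOINER; all other values are offsets in row "xx"

# A few rows copied from Table A (not the complete table).
TABLE_A_SAMPLE = {
    (0x00, 0x2B): "NEW TAI LU LETTER A LOW",
    (0x1A, 0x2F, 0x05): "NEW TAI LU LETTER NGA HIGH",
    (0x01, 0x2F, ZWNJ): "NEW TAI LU FINAL K",
    (0x13, 0x2F, 0x0A): "NEW TAI LU CONJUNCT PHYA HIGH",
    (0x01, 0x2F, 0x18): "NEW TAI LU CONJUNCT KWA HIGH",
}


def segment(codes):
    """Split a list of codes into Table A forms, longest match first.

    Codes that do not start a known sequence are passed through one at a
    time in the "xxNN" notation of this proposal.
    """
    longest = max(len(key) for key in TABLE_A_SAMPLE)
    result = []
    i = 0
    while i < len(codes):
        for length in range(longest, 0, -1):
            key = tuple(codes[i:i + length])
            if key in TABLE_A_SAMPLE:
                result.append(TABLE_A_SAMPLE[key])
                i += length
                break
        else:
            # No sequence starts here: emit the single code unchanged.
            result.append(f"xx{codes[i]:02x}")
            i += 1
    return result


# "kak", assuming the base consonant is KA HIGH at xx01:
print(segment([0x01, 0x01, 0x2F, ZWNJ]))
```

In this example the first xx01 stays a plain base consonant, while the second xx01 is absorbed into the FINAL K sequence together with the VIRAMA and the ZWNJ.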
Table B. New Tai Lü Combining Sequences

ka + ÿ
ka: + ÿ
ki + ÿ + ÿ
ki: + ÿ
ku + ÿ
ku: + ÿ
ke + ÿ + ÿ
ke: + ÿ
ke + ÿ + ÿ
ke: + ÿ
ko + ÿ + ÿ
ko: + ÿ
ko + ÿ + ÿ
ko: + ÿ
k + ÿ + ÿ
k : + ÿ
k« + ÿ + ÿ + ÿ
k«: + ÿ + ÿ

ª kaj + ªÿ
Ð ka:j + ÿ + ZWNJ + Š
Ñ kuj + ÿ + ZWNJ + Š
Ò koj + + ZWNJ + Š
Ó koj + ÿ + ZWNJ + Š
Ô k j + ÿ + ZWNJ + Š
Õ k«j + ÿ + ÿ + ZWNJ + Š

Û kwa + ÿ +
ï kla + ÿ +

æ kak + + ÿ + ZWNJ
æ ka:k + ÿ + + ÿ + ZWNJ
æ kik + ÿ + + ÿ + ZWNJ
æ kuk + ÿ + + ÿ + ZWNJ
æ kek + ÿ + + ÿ + ZWNJ
æ kek + ÿ + + ÿ + ZWNJ
æ kok + ÿ + + ÿ + ZWNJ
æ kok + ÿ + + ÿ + ZWNJ
æ k k + ÿ + + ÿ + ZWNJ
æ k«k + ÿ + ÿ + + ÿ + ZWNJ

ã kan + + ÿ + ZWNJ
ã ka:n + ÿ + + ÿ + ZWNJ
ã kin + ÿ + + ÿ + ZWNJ
ã kun + ÿ + + ÿ + ZWNJ
ã ken + ÿ + + ÿ + ZWNJ
ã ken + ÿ + + ÿ + ZWNJ
ã kon + ÿ + + ÿ + ZWNJ
ã kon + ÿ + + ÿ + ZWNJ
ã k N + ÿ + + ÿ + ZWNJ
ã k«n + ÿ + ÿ + + ÿ + ZWNJ

ä kan + + ÿ + ZWNJ
ä ka:n + ÿ + + ÿ + ZWNJ
ä kin + ÿ + + ÿ + ZWNJ
ä kun + ÿ + + ÿ + ZWNJ
ä ken + ÿ + + ÿ + ZWNJ
ä ken + ÿ + + ÿ + ZWNJ
ä kon + ÿ + + ÿ + ZWNJ
ä kon + ÿ + + ÿ + ZWNJ
ä k n + ÿ + + ÿ + ZWNJ
ä k«n + ÿ + ÿ + + ÿ + ZWNJ

å kam + + ÿ + ZWNJ
å ka:m + ÿ + + ÿ + ZWNJ
å kim + ÿ + + ÿ + ZWNJ
å kum + ÿ + + ÿ + ZWNJ
å kem + ÿ + + ÿ + ZWNJ
å kem + ÿ + + ÿ + ZWNJ
å kom + ÿ + + ÿ + ZWNJ
å kom + ÿ + + ÿ + ZWNJ
å k m + ÿ + + ÿ + ZWNJ
å k«m + ÿ + ÿ + + ÿ + ZWNJ

â kaw + + ÿ + ZWNJ
â ka:w + ÿ + + ÿ + ZWNJ
â kiw + ÿ + + ÿ + ZWNJ
â kew + ÿ + + ÿ + ZWNJ
â kew + ÿ + + ÿ + ZWNJ
â k«w + ÿ + ÿ + + ÿ + ZWNJ

ç kat + œ + ÿ + ZWNJ
ç ka:t + ÿ + œ + ÿ + ZWNJ
ç kit + ÿ + œ + ÿ + ZWNJ
ç kut + ÿ + œ + ÿ + ZWNJ
ç ket + ÿ + œ + ÿ + ZWNJ
ç ket + ÿ + œ + ÿ + ZWNJ
ç kot + ÿ + œ + ÿ + ZWNJ
ç kot + ÿ + œ + ÿ + ZWNJ
ç k t + ÿ + œ + ÿ + ZWNJ
ç k«t + ÿ + ÿ + œ + ÿ + ZWNJ

è kap + + ÿ + ZWNJ
è ka:p + ÿ + + ÿ + ZWNJ
è kip + ÿ + + ÿ + ZWNJ
è kup + ÿ + + ÿ + ZWNJ
è kep + ÿ + + ÿ + ZWNJ
è kep + ÿ + + ÿ + ZWNJ
è kop + ÿ + + ÿ + ZWNJ
è kop + ÿ + + ÿ + ZWNJ
è k p + ÿ + + ÿ + ZWNJ
è k«p + ÿ + ÿ + + ÿ + ZWNJ
Table C. New Tai Lü Ordering

ka + ÿ
æ kak + + ÿ + ZWNJ
ã kan + + ÿ + ZWNJ
ä kan + + ÿ + ZWNJ
å kam + + ÿ + ZWNJ
â kaw + + ÿ + ZWNJ
ç kat + œ + ÿ + ZWNJ
è kap + + ÿ + ZWNJ

ka: + ÿ
æ ka:k + ÿ + + ÿ + ZWNJ
ã ka:n + ÿ + + ÿ + ZWNJ
ä ka:n + ÿ + + ÿ + ZWNJ
å ka:m + ÿ + + ÿ + ZWNJ
â ka:w + ÿ + + ÿ + ZWNJ
ç ka:t + ÿ + œ + ÿ + ZWNJ
è ka:p + ÿ + + ÿ + ZWNJ

ki + ÿ + ÿ
æ kik + ÿ + + ÿ + ZWNJ
ã kin + ÿ + + ÿ + ZWNJ
ä kin + ÿ + + ÿ + ZWNJ
å kim + ÿ + + ÿ + ZWNJ
â kiw + ÿ + + ÿ + ZWNJ
ç kit + ÿ + œ + ÿ + ZWNJ
è kip + ÿ + + ÿ + ZWNJ

ki: + ÿ

ku + ÿ
æ kuk + ÿ + + ÿ + ZWNJ
ã kun + ÿ + + ÿ + ZWNJ
ä kun + ÿ + + ÿ + ZWNJ
å kum + ÿ + + ÿ + ZWNJ
ç kut + ÿ + œ + ÿ + ZWNJ
è kup + ÿ + + ÿ + ZWNJ

ku: + ÿ

ke + ÿ + ÿ
æ kek + ÿ + + ÿ + ZWNJ
ã ken + ÿ + + ÿ + ZWNJ
ä ken + ÿ + + ÿ + ZWNJ
å kem + ÿ + + ÿ + ZWNJ
â kew + ÿ + + ÿ + ZWNJ
ç ket + ÿ + œ + ÿ + ZWNJ
è kep + ÿ + + ÿ + ZWNJ

ke: + ÿ

ke + ÿ + ÿ
æ kek + ÿ + + ÿ + ZWNJ
ã ken + ÿ + + ÿ + ZWNJ
ä ken + ÿ + + ÿ + ZWNJ
å kem + ÿ + + ÿ + ZWNJ
â kew + ÿ + + ÿ + ZWNJ
ç ket + ÿ + œ + ÿ + ZWNJ
è kep + ÿ + + ÿ + ZWNJ

ke: + ÿ

ko + ÿ + ÿ
æ kok + ÿ + + ÿ + ZWNJ
ã kon + ÿ + + ÿ + ZWNJ
ä kon + ÿ + + ÿ + ZWNJ
å kom + ÿ + + ÿ + ZWNJ
ç kot + ÿ + œ + ÿ + ZWNJ
è kop + ÿ + + ÿ + ZWNJ

ko: + ÿ

ko + ÿ + ÿ
æ kok + ÿ + + ÿ + ZWNJ
ã kon + ÿ + + ÿ + ZWNJ
ä kon + ÿ + + ÿ + ZWNJ
å kom + ÿ + + ÿ + ZWNJ
ç kot + ÿ + œ + ÿ + ZWNJ
è kop + ÿ + + ÿ + ZWNJ

ko: + ÿ

k + ÿ + ÿ
æ k k + ÿ + + ÿ + ZWNJ
ã k N + ÿ + + ÿ + ZWNJ
ä k n + ÿ + + ÿ + ZWNJ
å k m + ÿ + + ÿ + ZWNJ
ç k t + ÿ + œ + ÿ + ZWNJ
è k p + ÿ + + ÿ + ZWNJ

k : + ÿ

k« + ÿ + ÿ + ÿ
æ k«k + ÿ + ÿ + + ÿ + ZWNJ
ã k«n + ÿ + ÿ + + ÿ + ZWNJ
ä k«n + ÿ + ÿ + + ÿ + ZWNJ
å k«m + ÿ + ÿ + + ÿ + ZWNJ
â k«w + ÿ + ÿ + + ÿ + ZWNJ
ç k«t + ÿ + ÿ + œ + ÿ + ZWNJ
è k«p + ÿ + ÿ + + ÿ + ZWNJ

k«: + ÿ + ÿ

Where do these sort?

ª kaj + ªÿ
Ð ka:j + ÿ + ZWNJ + Š
Ñ kuj + ÿ + ZWNJ + Š
Ò koj + + ZWNJ + Š
Ó koj + ÿ + ZWNJ + Š
Ô k j + ÿ + ZWNJ + Š
Õ k«j + ÿ + ÿ + ZWNJ + Š
Û kwa + ÿ +
ï kla + ÿ +
NEW TAI LÜ
[Code chart: columns xx0, xx1, xx2, xx3; rows 0 to F. Glyph images not reproduced.]
G = 00 P = 00
Proposal for the Universal Character Set
NEW TAI LÜ

hex  Name
00   NEW TAI LU LETTER A HIGH
01   NEW TAI LU LETTER KA HIGH
02   NEW TAI LU LETTER KA LOW
03   NEW TAI LU LETTER XA HIGH
04   NEW TAI LU LETTER XA LOW
05   NEW TAI LU LETTER NGA LOW
06   NEW TAI LU LETTER CA HIGH
07   NEW TAI LU LETTER CA LOW
08   NEW TAI LU LETTER SA HIGH
09   NEW TAI LU LETTER SA LOW
0A   NEW TAI LU LETTER YA HIGH
0B   NEW TAI LU LETTER YA LOW
0C   NEW TAI LU LETTER TA HIGH
0D   NEW TAI LU LETTER TA LOW
0E   NEW TAI LU LETTER THA HIGH
0F   NEW TAI LU LETTER THA LOW
10   NEW TAI LU LETTER NA LOW
11   NEW TAI LU LETTER PA HIGH
12   NEW TAI LU LETTER PA LOW
13   NEW TAI LU LETTER PHA HIGH
14   NEW TAI LU LETTER PHA LOW
15   NEW TAI LU LETTER MA LOW
16   NEW TAI LU LETTER FA HIGH
17   NEW TAI LU LETTER FA LOW
18   NEW TAI LU LETTER VA LOW
19   NEW TAI LU LETTER LA LOW
1A   NEW TAI LU LETTER HA HIGH
1B   NEW TAI LU LETTER HA LOW
1C   NEW TAI LU LETTER DA HIGH
1D   NEW TAI LU LETTER BA HIGH
1E   NEW TAI LU ABBREVIATION LAE
1F
20   NEW TAI LU VOWEL SIGN A
21   NEW TAI LU VOWEL SIGN AA
22   NEW TAI LU VOWEL SIGN II
23   NEW TAI LU VOWEL SIGN U
24   NEW TAI LU VOWEL SIGN UU
25   NEW TAI LU VOWEL SIGN EE
26   NEW TAI LU VOWEL SIGN E
27   NEW TAI LU VOWEL SIGN OO
28   NEW TAI LU VOWEL SIGN O
29   NEW TAI LU VOWEL SIGN UW
2A   NEW TAI LU VOWEL SIGN AI
2B   NEW TAI LU TONE MODIFIER
2C   NEW TAI LU TONE MARK ONE
2D   NEW TAI LU TONE MARK TWO
2E
2F   NEW TAI LU VIRAMA
30   NEW TAI LU DIGIT ZERO
31   NEW TAI LU DIGIT ONE
32   NEW TAI LU DIGIT TWO
33   NEW TAI LU DIGIT THREE
34   NEW TAI LU DIGIT FOUR
35   NEW TAI LU DIGIT FIVE
36   NEW TAI LU DIGIT SIX
37   NEW TAI LU DIGIT SEVEN
38   NEW TAI LU DIGIT EIGHT
39   NEW TAI LU DIGIT NINE
3A
3B
3C
3D
3E
3F

Group 00 Plane 00 Row XX
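As a cross-check on the names list, the repertoire can be written out as data and counted. The Python sketch below is illustrative only: it assumes that positions 1F, 2E and 3A to 3F carry no names, an alignment inferred from the positions cited elsewhere in this proposal (the abbreviation mark at xx1e, the nonspacing mark at xx2b, the VIRAMA at xx2f) and from the stated total of 56 characters. The variable names are hypothetical.

```python
# Offsets are positions within the proposed, still unassigned, row "xx".
LETTERS = (
    "A HIGH,KA HIGH,KA LOW,XA HIGH,XA LOW,NGA LOW,CA HIGH,CA LOW,"
    "SA HIGH,SA LOW,YA HIGH,YA LOW,TA HIGH,TA LOW,THA HIGH,THA LOW,"
    "NA LOW,PA HIGH,PA LOW,PHA HIGH,PHA LOW,MA LOW,FA HIGH,FA LOW,"
    "VA LOW,LA LOW,HA HIGH,HA LOW,DA HIGH,BA HIGH"
).split(",")
VOWEL_SIGNS = "A AA II U UU EE E OO O UW AI".split()
DIGITS = "ZERO ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE".split()

names = {}
for offset, letter in enumerate(LETTERS):                # 00..1D
    names[offset] = f"NEW TAI LU LETTER {letter}"
names[0x1E] = "NEW TAI LU ABBREVIATION LAE"
for offset, vowel in enumerate(VOWEL_SIGNS, start=0x20):  # 20..2A
    names[offset] = f"NEW TAI LU VOWEL SIGN {vowel}"
names[0x2B] = "NEW TAI LU TONE MODIFIER"
names[0x2C] = "NEW TAI LU TONE MARK ONE"
names[0x2D] = "NEW TAI LU TONE MARK TWO"
names[0x2F] = "NEW TAI LU VIRAMA"
for offset, digit in enumerate(DIGITS, start=0x30):       # 30..39
    names[offset] = f"NEW TAI LU DIGIT {digit}"

# The summary form states 56 characters; the list should agree.
print(len(names))
for offset in sorted(names):
    print(f"xx{offset:02x}  {names[offset]}")
```

Under these assumptions the count matches item B.2 of the summary form, and the vowel sign positions agree with the ones used in Table A (for example xx21 for AA and xx28 for O).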