<!doctype HTML 

PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">

<html>



<head><base href="https://www.unicode.org/reports/tr28/tr28-3.html">





<link rel="stylesheet" href="http://www.unicode.org/reports/reports.css" type="text/css">

<meta name="GENERATOR" content="Microsoft FrontPage 5.0">

<meta name="ProgId" content="FrontPage.Editor.Document">

<style type=text/css><!--



td.n         { text-align: Center; vertical-align: top }



td.q         { text-align: Center; width: 48px; border: 1px solid #FFFFFF;  }



tt.n         { font-size: 75% }



-->



</style>

<title>UAX #28: Unicode 3.2</title>

</head>



<body>



<table class="header" width="100%" cellspacing="0" cellpadding="0">

  <tr>

    <td class="icon"><a href="http://www.unicode.org"><img

align="middle"

      alt="[Unicode]" border="0"

      src="http://www.unicode.org/webscripts/logo60s2.gif" width="34"

      height="33"></a>&nbsp;&nbsp;<a class="bar"

      href="http://www.unicode.org/unicode/reports">Technical

Reports</a></td>

  </tr>

  <tr>

    <td class="gray">&nbsp;</td>

  </tr>

</table>

<div class="body">

<h2 align="center">Unicode Standard Annex #28</h2>

<h1 align="center">Unicode 3.2</h1>

<table border="1" cellpadding="2" width="100%">

  <tr>

    <td height="23" valign="TOP" width="20%">Version</td>

    <td valign="TOP" height="23">Unicode 3.2.0</td>

  </tr>

  <tr>

    <td height="23" valign="TOP">Authors</td>

    <td valign="TOP" height="23">Members of the Editorial Committee</td>

  </tr>

  <tr>

    <td height="23" valign="TOP">Date</td>

    <td valign="TOP" height="23">2002-03-27</td>

  </tr>

  <tr>

    <td height="23" valign="TOP">This Version</td>

    <td valign="TOP" height="23">

    <a href="http://www.unicode.org/unicode/reports/tr28/tr28-3.html">

    http://www.unicode.org/unicode/reports/tr28/tr28-3</a></td>

  </tr>

  <tr>

    <td height="23" valign="TOP">Previous Version</td>

    <td valign="TOP" height="23">

    N/A</td>

  </tr>

  <tr>

    <td height="23" valign="TOP">Latest Version</td>

    <td valign="TOP" height="23"><a href="http://www.unicode.org/unicode/reports/tr28">

    http://www.unicode.org/unicode/reports/tr28</a></td>

  </tr>

  <tr>

    <td height="23" valign="TOP">Tracking Number</td>

    <td valign="TOP" height="23"><a href="#tracking_number">3</a></td>

  </tr>

</table>

<h3><i>Summary</i></h3>

<p><i><em>This document defines Version 3.2 of the Unicode Standard.&nbsp;</em></i></p>

<h3><i>Status</i></h3>

<p><i>This document has been reviewed by Unicode members and other interested 

parties, and has been approved by the Unicode Technical Committee as a <b>

Unicode Standard Annex</b>. It is a stable document and may be used as reference 

material or cited as a normative reference from another document.</i></p>

<blockquote>

  <i><b>A Unicode Standard Annex (UAX)</b> forms an integral part of the Unicode Standard, but is published as a separate document. Note that conformance to a version of the Unicode Standard includes conformance to its Unicode Standard Annexes. The version number of a UAX document corresponds to the version number of the Unicode Standard at the last point that the UAX document was updated.</i></blockquote>

<p><i>A list of current Unicode Technical Reports can be found at

<a href="http://www.unicode.org/unicode/reports/">

http://www.unicode.org/unicode/reports/</a>. For more information about versions 

of the Unicode Standard, see

<a href="http://www.unicode.org/unicode/standard/versions/">

http://www.unicode.org/unicode/standard/versions/</a>.</i></p>

<p><i>The <a href="#references">References</a> provide related information that 

is useful in understanding this document. Please mail corrigenda and other 

comments to the author(s).</i></p>

<h3><i>Contents</i></h3>

<ul>

  <li><a href="#description">I Description</a></li>

  <li><a href="#conformance">II Conformance</a>

  <ul>

    <li><a href="#3_1_conformance">3.1 Conformance Requirements (revision)</a></li>

    <li><a href="#3_6_decomposition">3.6 Decomposition (revision)</a></li>

    <li><a href="#3_9_special_character_properties">3.9 Special Character 

    Properties (revision)</a></li>

    <li><a href="#3_11_conjoining_jamo_behavior">3.11 Conjoining Jamo Behavior 

    (revision)</a></li>

    <li><a href="#4_2_combining_classes_normative">4.2 Combining Classes—Normative 

    (revision)</a></li>

  </ul>

  </li>

  <li><a href="#general_structure_and_guidelines">III General Structure and Guidelines</a>

  <ul>

    <li><a href="#2_2_unicode_design_principles">2.2 Unicode Design Principles 

    (addition) </a></li>

    <li><a href="#5_15_locating_text_element_boundaries">5.15 Locating Text 

    Element Boundaries (revision)</a></li>

  </ul>

  </li>

  <li><a href="#block">IV Block Descriptions</a>

  <ul>

    <li><a href="#6_1_general_punctuation">6.1 General Punctuation (addition)</a></li>

    <li><a href="#7_2_greek">7.2 Greek (revision)</a></li>

    <li><a href="#8_2_arabic">8.2 Arabic (addition)</a></li>

    <li><a href="#9_15_khmer">9.15 Khmer (addition)</a></li>

    <li><a href="#9_16_philippine_scripts">9.16 Philippine Scripts (new section)</a></li>

    <li><a href="#10_1_han">10.1 Han (addition)</a></li>

    <li><a href="#10_3_katakana">10.3 Katakana&nbsp;(addition)</a></li>

    <li><a href="#10_4_hangul">10.4 Hangul (addition)</a></li>

    <li><a href="#11_4_mongolian">11.4 Mongolian (addition)</a></li>

    <li><a href="#12_4_mathematical_operators">12.4 Mathematical Operators (additions)</a></li>

    <li><a href="#12_5_technical_symbols">12.5 Technical Symbols (additions)</a></li>

    <li><a href="#12_7_miscellaneous_symbols_and_dingbats">12.7 Miscellaneous 

    Symbols and Dingbats (new subsection, revision and addition)</a></li>

    <li><a href="#12_12_standardized_variants_of_mathematical_symbols">12.12 

    Standardized Variants of Mathematical Symbols (new section)</a></li>

    <li><a href="#13_2_layout_controls">13.2 Layout Controls (additions)</a></li>

    <li><a href="#13_7_variation_selectors">13.7 Variation Selectors (new 

    section)</a></li>

    <li><a href="#14.1_character_names_list">14.1 Character Names List 

    (addition)</a></li>

  </ul>

  </li>

  <li><a href="#charts">V Code Charts</a></li>

  <li><a href="#errata">VI Errata</a></li>

  <li><a href="#database">VII Unicode Character Database Changes</a></li>

  <li><a href="#relation">VIII Relation to 10646</a></li>

  <li><a href="#references">IX References and Sources</a></li>

  <li><a href="#Modifications">X Modifications</a></li>

</ul>

<hr align="LEFT">

<h2 class="bb"><a name="description">I Description</a></h2>

<p>Unicode 3.2 is a minor version of the Unicode Standard. It overrides certain 

features of Unicode 3.1, and adds a significant number of coded characters.&nbsp;</p>

<h3>Recommended Citation Format for Unicode 3.2</h3>

<table border="1" cellspacing="0" cellpadding="4">

  <tr>

    <td width="100%">

    <p class="small">The Unicode Consortium. The Unicode Standard, Version 3.2.0 
    is defined by <i>The Unicode Standard, Version 3.0</i> (Reading, MA, 
    Addison-Wesley, 2000. ISBN 0-201-61633-5), as amended by <i>Unicode 
    Standard Annex #27: Unicode 3.1</i> (<a 
    href="http://www.unicode.org/unicode/reports/tr27/">http://www.unicode.org/reports/tr27/</a>) 
    and by <i>Unicode Standard Annex #28: Unicode 3.2</i> (<a 
    href="http://www.unicode.org/reports/tr28/">http://www.unicode.org/reports/tr28/</a>).</p></td>

  </tr>

</table>

<h3>Formal Definition of Unicode 3.2</h3>

<p>The Unicode Standard, Version 3.2.0 is defined by the following list.&nbsp; 

The version numbering and the role of each component are explained in

<a href="http://www.unicode.org/unicode/standard/versions/">Versions of The 

Unicode Standard</a>. The symbols in the change status column are explained in 

the <a href="#ChangeStatusKey">key</a> below. A summary of modifications in the 

Unicode Character Database for this version can be found in

<a href="http://www.unicode.org/Public/3.2-Update/UnicodeCharacterDatabase-3.2.0.html">

UnicodeCharacterDatabase-3.2.0.html</a>, together with a list of which data 

files contain normative vs. informative data.&nbsp;</p>

<blockquote>

  <table border="0" cellspacing="0" class="noborder" style="border-collapse: collapse" cellpadding="0">

      <tr>

      <th align="left" colspan="4" class="noborder">Major Reference</th>

    </tr>

    <tr>

      <th align="left" class="noborder"></th>

      <td colspan="2" class="noborder"></td>

      <td class="noborder">The Unicode Consortium.

      <a href="http://www.unicode.org/unicode/uni2book/u2.html">The Unicode 

      Standard, Version 3.0</a><br>

      Reading, MA, Addison-Wesley Developers Press, 2000. ISBN 0-201-61633-5.</td>

    </tr>

    <tr>

      <th align="left" colspan="4" class="noborder">Minor References</th>

    </tr>

    <tr>

      <td class="noborder"></td>

      <td colspan="2" class="noborder"></td>

      <td class="noborder">UAX #27: Unicode 3.1</td>

    </tr>

    <tr>

      <td class="noborder"></td>

      <td colspan="2" class="noborder"></td>

      <td class="noborder">UAX #28: Unicode 3.2</td>

    </tr>

    <tr>

      <th align="left" colspan="4" class="noborder">Update Reference</th>

    </tr>

    <tr>

      <td class="noborder"></td>

      <td colspan="2" class="noborder"></td>

      <td class="noborder"><b>n/a</b></td>

    </tr>

    <tr>

      <th align="left" colspan="4" class="noborder">

      <a href="http://www.unicode.org/unicode/reports/">Unicode Standard Annexes</a></th>

    </tr>

    <tr>

      <td class="noborder"></td>

      <td colspan="2" class="noborder"></td>

      <td class="noborder">
      <a href="http://www.unicode.org/unicode/reports/tr9/tr9-10.html">UAX 

      #9:&nbsp;The Bidirectional Algorithm, V3.2.0</a><br>

      <a href="http://www.unicode.org/unicode/reports/tr11/tr11-10.html">UAX 

      #11:&nbsp;East Asian Width, V3.2.0</a><br>

      <a href="http://www.unicode.org/unicode/reports/tr13/tr13-9.html">UAX #13: 

      Unicode Newline Guidelines, V3.2.0</a><br>

      <a href="http://www.unicode.org/unicode/reports/tr14/tr14-12.html">UAX 

      #14: Line Breaking Properties, V3.2.0</a><br>

      <a href="http://www.unicode.org/unicode/reports/tr15/tr15-22.html">UAX 

      #15: Unicode Normalization Forms, V3.2.0</a><br>

      <a href="http://www.unicode.org/unicode/reports/tr19/tr19-9.html">UAX #19: 

      UTF-32, V3.2.0</a><br>

      <a href="http://www.unicode.org/unicode/reports/tr21/tr21-5.html">UAX #21: 

      Case Mappings, V3.2.0</a></td>

    </tr>

    <tr>

      <th align="left" colspan="4" class="noborder">Unicode Character Database</th>

    </tr>

    <tr>

      <td class="noborder"></td>

      <td colspan="2" class="noborder"></td>

      <th align="left" class="noborder"><a href="http://www.unicode.org/Public/3.2-Update">

      http://www.unicode.org/Public/3.2-Update</a>, or<br>

      <a href="ftp://www.unicode.org/Public/3.2-Update/">

      ftp://www.unicode.org/Public/3.2-Update/</a></th>

    </tr>

    <tr>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <th colspan="2" align="left" class="noborder">Documentation</th>

    </tr>

    <tr>

      <td class="noborder"><i>T</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/DerivedProperties-3.2.0.html">

      DerivedProperties-3.2.0.html</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>T</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder"><a href="http://www.unicode.org/Public/3.2-Update/Index-3.2.0.txt">

      Index-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>T</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/NamesList-3.2.0.html">NamesList-3.2.0.html</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>T</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/PropList-3.2.0.html">

      PropList-3.2.0.html</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>T</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder"><a href="http://www.unicode.org/Public/3.2-Update/ReadMe-3.2.0.txt">

      ReadMe-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>T</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/UnicodeCharacterDatabase-3.2.0.html">

      UnicodeCharacterDatabase-3.2.0.html</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>T</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/UnicodeData-3.2.0.html">

      UnicodeData-3.2.0.html</a></td>

    </tr>

    <tr>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <th colspan="2" align="left" class="noborder">Core Data</th>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/ArabicShaping-3.2.0.txt">ArabicShaping-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/BidiMirroring-3.2.0.txt">BidiMirroring-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder"><a href="http://www.unicode.org/Public/3.2-Update/Blocks-3.2.0.txt">Blocks-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/CompositionExclusions-3.2.0.txt">CompositionExclusions-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/EastAsianWidth-3.2.0.txt">

      EastAsianWidth-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>T</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder"><a href="http://www.unicode.org/Public/3.2-Update/Jamo-3.2.0.txt">

      Jamo-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/LineBreak-3.2.0.txt">

      LineBreak-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/NamesList-3.2.0.txt">

      NamesList-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>N</i></td>

      <td class="noborder">&nbsp;</td>

      <td class="noborder">&nbsp;</td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/NormalizationCorrections-3.2.0.txt">

      NormalizationCorrections-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>N</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/PropertyAliases-3.2.0.txt">

      PropertyAliases-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>N</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/PropertyValueAliases-3.2.0.txt">

      PropertyValueAliases-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder"><a href="http://www.unicode.org/Public/3.2-Update/PropList-3.2.0.txt">

      PropList-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder"><a href="http://www.unicode.org/Public/3.2-Update/Scripts-3.2.0.txt">

      Scripts-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/SpecialCasing-3.2.0.txt">

      SpecialCasing-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>N</i></td>

      <td class="noborder">&nbsp;</td>

      <td class="noborder">&nbsp;</td>

      <td class="noborder"><a href="http://www.unicode.org/Public/3.2-Update/StandardizedVariants-3.2.0.html">

      StandardizedVariants-3.2.0.html</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/UnicodeData-3.2.0.txt">

      UnicodeData-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">
      <a href="http://www.unicode.org/Public/3.2-Update/Unihan-3.2.0.txt">Unihan-3.2.0.txt</a> 
      (very large file, see
      <a href="http://www.unicode.org/Public/3.2-Update/Unihan-3.2.0.zip">
      Unihan-3.2.0.zip</a>)</td>

    </tr>

      <tr>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <th colspan="2" align="left" class="noborder">Derived Data</th>

      </tr>
      <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/CaseFolding-3.2.0.txt">CaseFolding-3.2.0.txt</a></td>

      </tr>
      <tr>

      <td class="noborder"><i>N</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/DerivedAge-3.2.0.txt">DerivedAge-3.2.0.txt</a></td>

      </tr>
      <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/DerivedCoreProperties-3.2.0.txt">

      DerivedCoreProperties-3.2.0.txt</a></td>

      </tr>
      <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/DerivedNormalizationProps-3.2.0.txt">DerivedNormalizationProps-3.2.0.txt</a></td>

      </tr>

    <tr>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <th colspan="2" align="left" class="noborder">Extracted Data</th>

    </tr>





    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/extracted/DerivedBidiClass-3.2.0.txt">

      DerivedBidiClass-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/extracted/DerivedBinaryProperties-3.2.0.txt">

      DerivedBinaryProperties-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/extracted/DerivedCombiningClass-3.2.0.txt">

      DerivedCombiningClass-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/extracted/DerivedDecompositionType-3.2.0.txt">

      DerivedDecompositionType-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/extracted/DerivedEastAsianWidth-3.2.0.txt">

      DerivedEastAsianWidth-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/extracted/DerivedGeneralCategory-3.2.0.txt">

      DerivedGeneralCategory-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/extracted/DerivedJoiningGroup-3.2.0.txt">

      DerivedJoiningGroup-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/extracted/DerivedJoiningType-3.2.0.txt">

      DerivedJoiningType-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/extracted/DerivedLineBreak-3.2.0.txt">

      DerivedLineBreak-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/extracted/DerivedNumericType-3.2.0.txt">

      DerivedNumericType-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/extracted/DerivedNumericValues-3.2.0.txt">

      DerivedNumericValues-3.2.0.txt</a></td>

    </tr>

    <tr>

      <td class="noborder"></td>

      <td class="noborder"></td>

      <th colspan="2" align="left" class="noborder">Conformance Test Data</th>

    </tr>

    <tr>

      <td class="noborder"><i>D</i></td>

      <td class="noborder"></td>

      <td class="noborder">&nbsp;&nbsp;</td>

      <td class="noborder">

      <a href="http://www.unicode.org/Public/3.2-Update/NormalizationTest-3.2.0.txt">

      NormalizationTest-3.2.0.txt</a></td>

    </tr>

  </table>

  <p><b><a name="ChangeStatusKey">Key:</a></b></p>

  <table border="1" cellspacing="0" cellpadding="2">

    <tr>

      <td><i>N</i></td>

      <td>New in this release</td>

    </tr>

    <tr>

      <td><i>D</i></td>

      <td>Data change (possibly also format/text change)</td>

    </tr>

    <tr>

      <td><i>F</i></td>

      <td>Data format change (possibly also text change)</td>

    </tr>

    <tr>

      <td><i>T</i></td>

      <td>Text annotation change</td>

    </tr>

    <tr>

      <td><i>-</i></td>

      <td>Unchanged</td>

    </tr>

  </table>

</blockquote>

<p>The list of contributory data files constituting the Unicode Standard, 

Version 3.2 can also be found online at

<a href="http://www.unicode.org/standard/versions/enumeratedversions.html">Enumerated Versions</a>.</p>

<h3>New Character Allocations</h3>

<p>The primary feature of Unicode 3.2 is the addition of 1016 new encoded 

characters. These additions consist of several Philippine scripts, a large 

collection of mathematical symbols, and small sets of other letters and 

symbols.&nbsp;</p>

<p>All of the newly encoded characters in Unicode 3.2 are additions to the Basic 

Multilingual Plane (BMP).&nbsp;</p>

<p>Complete introductions to the newly encoded scripts and symbols can be found 

in <a href="#block">Article IV, Block Descriptions</a>, below.&nbsp;</p>
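<p>As an informal illustration (not part of the standard), the repertoire described above can be inspected with Python's standard <tt>unicodedata</tt> module, which exposes a frozen Unicode 3.2.0 database as <tt>unicodedata.ucd_3_2_0</tt>. The helper names and the sample code points are chosen for this sketch only: U+1700 lies in the Tagalog block, one of the Philippine scripts, and U+0221 is a code point that was first assigned in a later version.</p>

```python
import unicodedata

# Frozen snapshot of the Unicode 3.2.0 character database shipped with CPython.
ucd32 = unicodedata.ucd_3_2_0

def is_assigned_in_3_2(ch: str) -> bool:
    """True if `ch` has a general category other than Cn (unassigned) in 3.2.0."""
    return ucd32.category(ch) != "Cn"

def in_bmp(ch: str) -> bool:
    """True if the code point lies on plane 0, the Basic Multilingual Plane."""
    return ord(ch) // 0x10000 == 0

tagalog_a = "\u1700"   # TAGALOG LETTER A, new in Unicode 3.2
later_char = "\u0221"  # assigned only in a later version of the standard

print(ucd32.unidata_version)
print(ucd32.name(tagalog_a), is_assigned_in_3_2(tagalog_a), in_bmp(tagalog_a))
print(is_assigned_in_3_2(later_char))
```

<p>Querying the versioned database rather than the module-level functions matters here, because the module-level functions reflect whatever newer Unicode version the Python build ships with.</p>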

<h3>Additional Features of Unicode 3.2</h3>

<p>Unicode 3.2 also features amended contributory data files, to bring the data 

files up to date against the expanded repertoire of characters. A summary of the 

revisions to the data files can be found in <a href="#database">Article VII, 

Unicode Character Database Changes</a>.&nbsp;</p>

<p>All outstanding errata and corrigenda to the Unicode Standard are included in 

this specification. Major corrigenda having a bearing on conformance to the 

standard are listed in <a href="#conformance">Article II, Conformance</a>. Other 

minor errata are listed in <a href="#errata">Article VI, Errata</a>.&nbsp;</p>

<p>Most notable among the corrigenda to the Standard is a further tightening of 

the definition of UTF-8, to eliminate irregular UTF-8 and to bring the Unicode 

specification of UTF-8 more completely into line with other specifications of 

UTF-8.&nbsp;</p>

<p>The former UTR #21, Case Mappings, has been upgraded in status to a Unicode 
Standard Annex in Unicode 3.2. This means that
<a href="http://www.unicode.org/unicode/reports/tr21/tr21-5.html">UAX #21, Case 
Mappings</a>, is now formally a part of the Unicode Standard.</p>

<h3>Conventions Used in this Document</h3>

<p>The sections of this document are referred to as “articles” to prevent 

confusion with references to sections of <i>The Unicode Standard, Version 3.0</i>. 

In addition, the articles in this document are numbered with Roman numerals, to 

highlight the distinction. The word “section” always refers to sections of <i>

The Unicode Standard, Version 3.0</i> or to a new section of the standard which 

is added by this document. Page numbers also refer to <i>

<a href="http://www.unicode.org/unicode/uni2book/u2.html">The Unicode Standard, Version 3.0</a></i>.</p>

<p>New or replacement text for the standard is indicated with <u>underlined</u> 

text, when this new text is a corrigendum of an existing section of the 

standard.</p>

<p>Deleted text from the standard is indicated with <strike>struck-through</strike> 

text.</p>

<p>Where entire new sections or subsections are added to the 
standard, as with the block descriptions for newly encoded scripts or symbol 
sets, new section numbers are provided that interleave reasonably with the 
existing sections of the published Unicode 3.0 book. The text of these added 
sections is not underlined, since the sections are entirely new.</p>

<p>In this document, unambiguous dates of the current common era, such as 1999, 

are unlabeled. In cases of ambiguity, CE is used. Dates before the common era 

are labeled with BCE.</p>

<h2 class="bb"><a name="conformance">II Conformance</a></h2>

<h3><a name="3_1_conformance">3.1 Conformance Requirements (revision)</a></h3>

<h3>Elimination of Irregular Sequences&nbsp;</h3>

<p>Before Unicode 3.2, the definitions of transformation formats such as UTF-8 
allowed conformant processes to interpret certain sequences called <i>irregular</i> 
sequences. An irregular sequence is one that would be produced by transforming 
a supplementary code point as if it were a sequence of two surrogate code 
points.</p>

<p>To tighten the definitions, Unicode 3.2 makes such irregular sequences 
illegal.</p>

<p>Note: Some implementations of UTF-8 might still interpret irregular 
sequences; for those, a separate compatibility encoding scheme, to be 
distinguished from UTF-8, may be used. See <a href="http://www.unicode.org/reports/tr26/">Unicode Technical Report #26, “Compatibility 
Encoding Scheme for UTF-16: 8-Bit (CESU-8).”</a> However, CESU-8 is neither 
intended nor recommended as an encoding for open information exchange.</p>
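<p>As an informal sketch (not part of the standard), the following Python fragment builds the six-byte irregular form of a supplementary code point by transforming each UTF-16 surrogate as if it were an ordinary three-byte UTF-8 sequence, and contrasts it with the four-byte UTF-8 form. The helper names are hypothetical; Python's built-in strict <tt>utf-8</tt> codec is used only to show that such a sequence is rejected.</p>

```python
def surrogate_pair(cp: int) -> tuple:
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if cp not in range(0x10000, 0x110000):
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

def three_byte_form(unit: int) -> bytes:
    """Apply the three-byte UTF-8 bit layout to a 16-bit value, with no validity check."""
    return bytes([0xE0 | (unit >> 12),
                  0x80 | ((unit >> 6) & 0x3F),
                  0x80 | (unit & 0x3F)])

def irregular_form(cp: int) -> bytes:
    """Six-byte irregular sequence: each surrogate is transformed separately."""
    high, low = surrogate_pair(cp)
    return three_byte_form(high) + three_byte_form(low)

cp = 0x10000
well_formed = chr(cp).encode("utf-8")  # four bytes: F0 90 80 80
irregular = irregular_form(cp)         # six bytes:  ED A0 80 ED B0 80

try:
    irregular.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True

print(well_formed.hex(), irregular.hex(), rejected)
```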

<p>Terminology to distinguish <i>ill-formed</i>, <i>illegal</i>, and <i>

irregular</i> code unit sequences is no longer needed. There are no <i>irregular</i> 

code unit sequences, and thus all <i>ill-formed</i> code unit sequences are <i>

illegal</i>. It is illegal to emit or interpret any <i>ill-formed</i> code unit 

sequence. Unicode 4.0 will revise the terminology and conformance clauses in 

light of this. For Unicode 3.2, only the minimal changes required of the text 

are noted here.</p>

<p><i><b>Change C12 in Unicode 3.1 to:</b></i></p>

<table class="noborder" style="border-collapse: collapse" cellpadding="0" cellspacing="0">

  <tr>

    <td valign="top" align="center" class="noborder">C12</td>

    <td valign="top" align="left" class="noborder">(a) When a process generates data in a Unicode 

    Transformation Format, it shall not emit ill-formed code unit sequences.<br>

    (b) When a process interprets data in a Unicode Transformation Format, it 

    shall treat <strike>illegal</strike> <u>ill-formed</u> code unit sequences 

    as an error condition.<br>

    (c) A conformant process shall not interpret <strike>illegal</strike> <u>

    ill-formed</u> UTF code unit sequences as characters.<br>

    <strike>(d) Irregular UTF code unit sequences shall not be used for encoding 

    any other information.</strike></td>

  </tr>

</table>
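<p>One way a process might honor the revised clause is sketched below in Python. This is an illustration under stated assumptions, not a normative implementation: the function names and the error type are hypothetical, and Python's strict <tt>utf-8</tt> codec stands in for the well-formedness check.</p>

```python
class IllFormedSequenceError(ValueError):
    """Hypothetical error type signalling an ill-formed code unit sequence."""

def generate_utf8(text: str) -> bytes:
    """C12(a): never emit ill-formed output; refuse input that cannot be encoded."""
    try:
        return text.encode("utf-8", errors="strict")
    except UnicodeEncodeError as exc:
        # A lone surrogate has no well-formed UTF-8 representation.
        raise IllFormedSequenceError(str(exc)) from exc

def interpret_utf8(data: bytes) -> str:
    """C12(b) and (c): report ill-formed input as an error, never as characters."""
    try:
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise IllFormedSequenceError(str(exc)) from exc

def outcome(func, arg):
    """Demonstration helper: return the result, or the string 'error' on failure."""
    try:
        return func(arg)
    except IllFormedSequenceError:
        return "error"

print(outcome(generate_utf8, "A\u1700"))         # well-formed output
print(outcome(generate_utf8, "\ud800"))          # lone surrogate: refused
print(outcome(interpret_utf8, b"\xc0\x80"))      # non-shortest form: error
print(outcome(interpret_utf8, b"\xed\xa0\x80"))  # encoded surrogate: error
```

<p>Raising an error, rather than substituting or silently dropping bytes, is a design choice of this sketch; the clause requires only that ill-formed sequences be treated as an error condition and not be interpreted as characters.</p>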

<p><i><b>Change the fifth note after C12 in Unicode 3.1 to:</b></i></p>

<ul>

  <li>Conformant processes cannot interpret <strike>illegal</strike> <u>

  ill-formed</u> code unit sequences. However, the conformance clauses do not, 

  for example, prevent utility programs from operating on “mangled” text. For 

  example, a UTF-8 file could have had CRLF sequences introduced at every 80 

  bytes by a bad mailer program. This could result in some UTF-8 byte sequences 

  being interrupted by CRLFs, producing ill-formed byte sequences. This mangled 

  text is no longer UTF-8. It is permissible for a conformant program to repair 

  such text, recognizing that the mangled text was originally well-formed UTF-8 

  byte sequences. However, such repair of mangled data is a special case, and 

  must not be used in circumstances where it would cause security problems.</li>

</ul>
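<p>The mangled-text scenario in the note above can be sketched in Python as follows. The sketch assumes that the bad mailer inserts CRLF after exactly every 80 bytes and that the repairing program knows this; the function names are hypothetical, and real repairs would need to confirm how the text was damaged before touching it.</p>

```python
CRLF = b"\r\n"
WIDTH = 80  # wrap width assumed for this illustration

def mangle(data: bytes, width: int = WIDTH) -> bytes:
    """Simulate the bad mailer: insert CRLF after every `width` bytes."""
    chunks = [data[i:i + width] for i in range(0, len(data), width)]
    return CRLF.join(chunks)

def repair(data: bytes, width: int = WIDTH) -> bytes:
    """Undo `mangle`: drop the CRLF that follows each run of `width` bytes."""
    step = width + len(CRLF)
    out = bytearray()
    for start in range(0, len(data), step):
        block = data[start:start + step]
        if block[width:] == CRLF:
            block = block[:width]
        out += block
    return bytes(out)

# 'a' followed by 60 two-byte characters: one character straddles byte 80.
original = ("a" + "\u00e9" * 60).encode("utf-8")
mangled = mangle(original)

try:
    mangled.decode("utf-8")
    still_utf8 = True
except UnicodeDecodeError:
    still_utf8 = False

repaired = repair(mangled)
print(still_utf8, repaired == original)
```

<p>The repair removes only the CRLFs at the known insertion points, so CRLFs that were part of the original text elsewhere are left alone. As the note states, such repair is a special case and is not appropriate where it could cause security problems.</p>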

<p><i><b>Change Table 3.1B after C12 in Unicode 3.1 by splitting the row 

U+1000..U+FFFF to exclude the surrogate code points:</b></i></p>

<div align="center">

  <center>

  <table cellspacing="0" cellpadding="4" border="1">

    <caption><b>Table 3.1B. Legal UTF-8 Byte Sequences</b></caption>

    <tr>

      <th style="BACKGROUND-COLOR: #990000" width="10%" bgcolor="#cccccc">

      <font color="#ffffff">&nbsp;Code Points</font></th>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><font color="#ffffff">

      1st Byte</font></th>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><font color="#ffffff">

      2nd Byte</font></th>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><font color="#ffffff">

      3rd Byte</font></th>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><font color="#ffffff">

      4th Byte</font></th>

    </tr>

    <tr>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><tt>

      <font color="#ffffff">U+0000..U+007F</font></tt></th>

      <td width="10%"><tt>00..7F</tt></td>

      <td width="10%"><tt>&nbsp;</tt></td>

      <td width="10%"><tt>&nbsp;</tt></td>

      <td width="10%"><tt>&nbsp;</tt></td>

    </tr>

    <tr>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><tt>

      <font color="#ffffff">U+0080..U+07FF</font></tt></th>

      <td width="10%"><tt>C2..DF</tt></td>

      <td width="10%"><tt>80..BF&nbsp;</tt></td>

      <td width="10%"><tt>&nbsp;</tt></td>

      <td width="10%"><tt>&nbsp;</tt></td>

    </tr>

    <tr>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><tt>

      <font color="#ffffff">U+0800..U+0FFF</font></tt></th>

      <td width="10%"><tt>E0</tt></td>

      <td width="10%"><tt><u>A0</u>..BF</tt></td>

      <td width="10%"><tt>80..BF&nbsp;</tt></td>

      <td width="10%"><tt>&nbsp;</tt></td>

    </tr>

    <tr>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><tt>

      <font color="#ffffff"><u>U+1000..U+CFFF</u></font></tt></th>

      <td width="10%"><tt>E1..EC</tt></td>

      <td width="10%"><tt>80..BF</tt></td>

      <td width="10%"><tt>80..BF&nbsp;</tt></td>

      <td width="10%"><tt>&nbsp;</tt></td>

    </tr>

    <tr>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><tt>

      <font color="#ffffff"><u>U+D000..U+D7FF</u></font></tt></th>

      <td width="10%"><tt>ED</tt></td>

      <td width="10%"><tt>80..<u>9F</u></tt></td>

      <td width="10%"><tt>80..BF&nbsp;</tt></td>

      <td width="10%"><tt>&nbsp;</tt></td>

    </tr>

    <tr>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><tt>

      <font color="#ffffff"><u>U+D800..U+DFFF</u></font></tt></th>

      <td width="40%" colspan="4"><tt>ill-formed</tt></td>

    </tr>

    <tr>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><tt>

      <font color="#ffffff"><u>U+E000..U+FFFF</u></font></tt></th>

      <td width="10%"><tt>EE..EF</tt></td>

      <td width="10%"><tt>80..BF</tt></td>

      <td width="10%"><tt>80..BF&nbsp;</tt></td>

      <td width="10%"><tt>&nbsp;</tt></td>

    </tr>

    <tr>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><tt>

      <font color="#ffffff">U+10000..U+3FFFF</font></tt></th>

      <td width="10%"><tt>F0</tt></td>

      <td width="10%"><tt><u>90</u>..BF</tt></td>

      <td width="10%"><tt>80..BF</tt></td>

      <td width="10%"><tt>80..BF</tt></td>

    </tr>

    <tr>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><tt>

      <font color="#ffffff">U+40000..U+FFFFF</font></tt></th>

      <td width="10%"><tt>F1..F3</tt></td>

      <td width="10%"><tt>80..BF</tt></td>

      <td width="10%"><tt>80..BF</tt></td>

      <td width="10%"><tt>80..BF</tt></td>

    </tr>

    <tr>

      <th style="BACKGROUND-COLOR: #990000" width="10%"><tt>

      <font color="#ffffff">U+100000..U+10FFFF</font></tt></th>

      <td width="10%"><tt>F4</tt></td>

      <td width="10%"><tt>80..<u>8F</u></tt></td>

      <td width="10%"><tt>80..BF&nbsp;</tt></td>

      <td width="10%"><tt>80..BF</tt></td>

    </tr>

  </table>

  </center>

</div>
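<p>As an illustration only, the byte ranges in Table 3.1B can be transcribed into a small checker. The following Python sketch is not part of the standard; the names <tt>TABLE_3_1B</tt> and <tt>is_well_formed_utf8</tt> are hypothetical, and the code assumes the whole byte sequence is available in memory.</p>

```python
# Each row: (first-byte range, [ranges for the 2nd, 3rd, 4th bytes]),
# transcribed from Table 3.1B. The surrogate row has no entry, so any
# attempt to encode U+D800..U+DFFF is rejected.
TABLE_3_1B = [
    ((0x00, 0x7F), []),
    ((0xC2, 0xDF), [(0x80, 0xBF)]),
    ((0xE0, 0xE0), [(0xA0, 0xBF), (0x80, 0xBF)]),
    ((0xE1, 0xEC), [(0x80, 0xBF), (0x80, 0xBF)]),
    ((0xED, 0xED), [(0x80, 0x9F), (0x80, 0xBF)]),
    ((0xEE, 0xEF), [(0x80, 0xBF), (0x80, 0xBF)]),
    ((0xF0, 0xF0), [(0x90, 0xBF), (0x80, 0xBF), (0x80, 0xBF)]),
    ((0xF1, 0xF3), [(0x80, 0xBF), (0x80, 0xBF), (0x80, 0xBF)]),
    ((0xF4, 0xF4), [(0x80, 0x8F), (0x80, 0xBF), (0x80, 0xBF)]),
]


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if every byte sequence in data matches a row of the table."""
    i = 0
    while i < len(data):
        for (lo, hi), trail in TABLE_3_1B:
            if lo <= data[i] <= hi:
                break
        else:
            return False  # first byte is not allowed by any row
        if i + len(trail) >= len(data) and trail:
            return False  # sequence is truncated
        for k, (tlo, thi) in enumerate(trail, start=1):
            if not tlo <= data[i + k] <= thi:
                return False
        i += 1 + len(trail)
    return True


print(is_well_formed_utf8("a\u00e9\u20ac".encode("utf-8")))  # well-formed text
print(is_well_formed_utf8(b"\xed\xa0\x80"))                  # encoded surrogate
print(is_well_formed_utf8(b"\xe2\r\n\x82\xac"))              # CRLF-mangled sequence
```

<p>The last call mirrors the “mangled” mailer case described in the note after C12: a CRLF inserted inside a three-byte sequence leaves bytes that no row of the table accepts.</p>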

<h3><a name="3_6_decomposition">3.6 Decomposition</a> (revision)</h3>

<p>The text of D21 is replaced by the following text:</p>

<p>D21 <i>Compatibility decomposable character</i>: a character whose compatibility 

decomposition is not identical to its canonical decomposition. It may also be 

known as a <i>compatibility precomposed</i> character or a <i>compatibility 

composite</i> character.</p>

<ul>

  <li>For example:&nbsp;

  <ul>

    <li>U+00B5 MICRO SIGN has no canonical decomposition mapping, so its 

    canonical decomposition is the same as the character itself. It has a 

    compatibility decomposition to U+03BC GREEK SMALL LETTER MU. Because MICRO 

    SIGN has a compatibility decomposition that is not equal to its canonical 

    decomposition, it is a compatibility decomposable character.</li>

    <li>U+03D3 GREEK UPSILON WITH ACUTE AND HOOK SYMBOL canonically decomposes 

    to the sequence &lt;U+03D2&nbsp; GREEK UPSILON WITH HOOK SYMBOL, U+0301 

    COMBINING ACUTE ACCENT&gt;. That sequence has a compatibility decomposition of 

    &lt;U+03A5 GREEK CAPITAL LETTER UPSILON, U+0301 COMBINING ACUTE ACCENT&gt;. 

    Because GREEK UPSILON WITH ACUTE AND HOOK SYMBOL has a compatibility 

    decomposition that is not equal to its canonical decomposition, it is a 

    compatibility decomposable character.</li>

  </ul>

  </li>

  <li>This should not be confused with the term “compatibility character”, which 

  is discussed in <i>Section 2.2,</i> <i>Unicode Design Principles</i> in <i>The 

  Unicode Standard, Version 3.0</i>.</li>

  <li>Compatibility composites are a subset of compatibility  characters included 

  in the Unicode Standard to represent distinctions in other base standards. 

  They support transmission and processing of legacy data. Their use is 

  discouraged other than for legacy data or other special circumstances.</li>

  <li>Replacing a compatibility decomposable character by its compatibility decomposition may 

  lose round-trip convertibility with a base standard.</li>

</ul>

<p>Add the following new text after D23:</p>

<p>D23a <i>Canonical decomposable character</i>: a character which is not 

identical to its canonical decomposition. It may also be known as a <i>canonical 

precomposed</i> character or a <i>canonical composite</i> character.</p>

<ul>

  <li>For example: U+00E0 LATIN SMALL LETTER A WITH GRAVE is a canonical 

  decomposable character because its canonical decomposition is to the two 

  characters U+0061 LATIN SMALL LETTER A and U+0300 COMBINING GRAVE ACCENT. 

  U+212A KELVIN SIGN is a canonical decomposable character because its canonical 

  decomposition is to U+004B LATIN CAPITAL LETTER K.</li>

</ul>
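<p>The two definitions can be checked against the examples above with a short Python sketch. It is illustrative only: the function names are hypothetical, and Python's <tt>unicodedata</tt> module reflects a later version of the Unicode Character Database than 3.2, which is assumed not to matter for these particular characters.</p>

```python
import unicodedata


def is_canonical_decomposable(ch: str) -> bool:
    # D23a: the character is not identical to its canonical decomposition.
    return unicodedata.normalize("NFD", ch) != ch


def is_compatibility_decomposable(ch: str) -> bool:
    # D21: the compatibility decomposition is not identical to the
    # canonical decomposition.
    return unicodedata.normalize("NFKD", ch) != unicodedata.normalize("NFD", ch)


for ch in ("\u00B5", "\u03D3", "\u00E0", "\u212A"):
    print(
        "U+%04X" % ord(ch),
        unicodedata.name(ch),
        "canonical:", is_canonical_decomposable(ch),
        "compatibility:", is_compatibility_decomposable(ch),
    )
```

<p>Under these definitions U+03D3 satisfies both predicates, while U+00B5 satisfies only the second and U+00E0 and U+212A satisfy only the first.</p>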

<h3><a name="3_9_special_character_properties">3.9 Special Character Properties</a> 

(revision)</h3>

<h3>Replacing ZWNBSP with Word Joiner</h3>

<p>The character U+2060 has been added to the standard to allow unambiguous 

expression of the word-joining semantics. U+2060 WORD JOINER is now the 

preferred character to express the word-joining semantics implied by the ZWNBSP. 

The availability of U+2060 makes it unnecessary to use U+FEFF as a zero-width 

non-breaking space, allowing U+FEFF to be used solely with the semantic of BOM. 

For more information, see the subsection on “Word Joiner” in <i>

<a href="#13_2_layout_controls">Section 13.2, Layout Controls</a></i> in this 

document.</p>

<p>Note: Implementers are strongly encouraged to use the word joiner 

whenever word-joining semantics are intended.</p>
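<p>One possible migration step is sketched below in Python. It is an assumption-laden illustration, not a required procedure: the helper name <tt>prefer_word_joiner</tt> is hypothetical, a leading U+FEFF is assumed to be a BOM and is left alone, and every other U+FEFF is assumed to have been intended as a zero-width non-breaking space.</p>

```python
ZWNBSP = "\uFEFF"       # ZERO WIDTH NO-BREAK SPACE, also used as the BOM
WORD_JOINER = "\u2060"  # WORD JOINER


def prefer_word_joiner(text: str) -> str:
    """Keep a leading U+FEFF (assumed BOM); replace later ones with U+2060."""
    if text.startswith(ZWNBSP):
        return ZWNBSP + text[1:].replace(ZWNBSP, WORD_JOINER)
    return text.replace(ZWNBSP, WORD_JOINER)


sample = "\uFEFFfoo\uFEFFbar"
print([hex(ord(c)) for c in prefer_word_joiner(sample)])
```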

<h3>Additions to Properties</h3>

<p>A number of characters which have special character properties have been added in the Unicode Standard, Version 3.2. To reflect this, the following changes 

are made to the special character properties listing, on pages 48-50 of <i>The 

Unicode Standard, Version 3.0</i>:</p>

<p>In the entry for “Line boundary control”, add:</p>

<p>205F MEDIUM MATHEMATICAL SPACE<br>

2060 WORD JOINER</p>

<p>Change the name of the “Joining” entry to “Cursive joining and ligation 

control”.</p>

<p>Add a new entry called “Grapheme joining” after the renamed entry for 

“Cursive joining and ligation control” and add to that new entry:</p>

<p>034F COMBINING GRAPHEME JOINER</p>

<p>Add a new entry called “Mathematical expression formatting” after the entry 

“Bidirectional ordering” and add to that new entry:</p>

<p>2061 FUNCTION APPLICATION<br>

2062 INVISIBLE TIMES<br>

2063 INVISIBLE SEPARATOR</p>

<p dir="ltr">Change the name of the “Alternate formatting” entry to “Deprecated 

alternate formatting”.</p>

<p dir="ltr">Change the name of the “Syriac abbreviation” entry to “Prefixed 

format control” and add to that entry:</p>

<p dir="ltr">06DD ARABIC END OF AYAH</p>

<p>Change the name of the “Indic dead-character formation” entry to 

“Brahmi-derived script dead-character formation” and add to that entry:</p>

<p>1714 TAGALOG SIGN VIRAMA<br>

1734 HANUNOO SIGN PAMUDPOD</p>

<p>Change the name of the “Mongolian variant selectors” entry to “Mongolian 

variation selectors”.</p>

<p>After the “Mongolian variation selectors” entry add a new entry “Generic 

variation selectors” and add to that new entry:</p>

<p>FE00 VARIATION SELECTOR-1<br>

FE01 VARIATION SELECTOR-2<br>

FE02 VARIATION SELECTOR-3<br>

FE03 VARIATION SELECTOR-4<br>

FE04 VARIATION SELECTOR-5<br>

FE05 VARIATION SELECTOR-6<br>

FE06 VARIATION SELECTOR-7<br>

FE07 VARIATION SELECTOR-8<br>

FE08 VARIATION SELECTOR-9<br>

FE09 VARIATION SELECTOR-10<br>

FE0A VARIATION SELECTOR-11<br>

FE0B VARIATION SELECTOR-12<br>

FE0C VARIATION SELECTOR-13<br>

FE0D VARIATION SELECTOR-14<br>

FE0E VARIATION SELECTOR-15<br>

FE0F VARIATION SELECTOR-16</p>

<h3>Application of Combining Marks</h3>

<p>Formally speaking, combining marks apply to the preceding grapheme cluster. In most cases, this is the same as 

applying to the preceding base character. However, in two circumstances there is 

a difference: </p>

<ul>

  <li><i>Hangul syllables</i> </li>

  <li><i>Enclosing combining marks</i></li>

</ul>

<p><b><i>Hangul Syllables.</i> </b>Where a grapheme cluster contains a Hangul 

syllable, the combining mark applies to the entire syllable. For example, in the 

following sequence the <i>grave</i> is applied to the entire Hangul syllable, 

not just the last jamo:</p>

<ul>

  <li>U+1100 HANGUL CHOSEONG KIYEOK </li>

  <li>U+1161 HANGUL JUNGSEONG A </li>

  <li>U+0300 COMBINING GRAVE ACCENT</li>

</ul>

<p><b><i>Enclosing Combining Marks.</i> </b>These marks enclose the entire 

preceding grapheme cluster. For example, in the following sequence the entire 

Hangul syllable is circled, not just part of it:</p>

<ul>

  <li>U+1100 HANGUL CHOSEONG KIYEOK </li>

  <li>U+1161 HANGUL JUNGSEONG A </li>

  <li>U+20DD COMBINING ENCLOSING CIRCLE</li>

</ul>

<p>This is also true of grapheme clusters composed of elements linked by a 

Grapheme_Link or <i>combining grapheme joiner</i>. For example, the entire conjunct is circled in the following 

sequence: </p>

<ul>

  <li>U+0915 DEVANAGARI LETTER KA </li>

  <li>U+094D DEVANAGARI SIGN VIRAMA </li>

  <li>U+0922 DEVANAGARI LETTER DDHA </li>

  <li>U+20DD COMBINING ENCLOSING CIRCLE</li>

</ul>

<p>On the other hand, where elements are linked by a Grapheme_Link or combining 

grapheme joiner, <i>

non-enclosing</i> combining marks <i>only</i> apply to the last base character. 

For example, in the following sequence the <i>nukta</i> applies to the 

immediately preceding <i>ddha</i>, not to the entire cluster:</p>

<ul>

  <li>U+0915 DEVANAGARI LETTER KA </li>

  <li>U+094D DEVANAGARI SIGN VIRAMA </li>

  <li>U+0922 DEVANAGARI LETTER DDHA </li>

  <li>U+093C DEVANAGARI SIGN NUKTA</li>

</ul>

<p>For more information, see the subsection on “Combining Grapheme Joiner” in

<i><a href="#13_2_layout_controls">Section 13.2, Layout Controls</a></i> in this 

document.</p>

<h3><a name="3_11_conjoining_jamo_behavior">3.11 Conjoining Jamo Behavior</a> 

(revision)</h3>

<p>The following text replaces the text and tables for this section on pages 

52-53 of <i>The Unicode Standard, Version 3.0</i>:</p>

<p>The Unicode Standard contains both a large set of precomposed modern Hangul 

syllables and a set of conjoining Hangul jamo, which can be used to encode 

archaic syllable blocks as well as modern syllable blocks. This section 

describes how to:</p>

<ul>

  <li>Determine the syllable boundaries in a sequence of conjoining jamo 

  characters</li>

  <li>Compose jamo characters into precomposed Hangul syllables</li>

  <li>Determine the canonical decomposition of precomposed Hangul syllables</li>

  <li>Algorithmically determine the names of precomposed Hangul syllables</li>

</ul>

<p>For more information, see the “Hangul Syllables” and “Hangul Jamo” 

subsections in <i>Section 10.4, Hangul</i> in <i>The Unicode Standard, Version 

3.0</i>. Hangul syllables are a special case of grapheme clusters.</p>

<p>The jamo characters can be classified into three sets of characters: <i>

choseong</i> (leading consonants, or syllable-initial characters), <i>jungseong</i> 

(vowels, or syllable-peak characters), and <i>jongseong</i> (trailing 

consonants, or syllable-final characters). In the following discussion, these 

jamo are abbreviated as <i>L</i> (leading consonant), <i>V</i> (vowel), and <i>T</i> 

(trailing consonant); syllable breaks are shown by <i>middle dots</i> “·”; 

non-syllable breaks are shown by “×”; combining marks are shown by M; and non-jamo 

are shown by <i>X</i>.</p>

<p>In the following discussion, a <i>syllable</i> refers to a sequence of Korean 

characters that should be grouped into a single cell for display. This is 

different from a <i>precomposed Hangul syllable</i>, which consists of any of 

the characters in the range U+AC00..U+D7A3. Note that a syllable may contain a 

precomposed Hangul syllable <i>plus</i> other characters.</p>

<h3>Syllable Boundaries</h3>

<p>In rendering, a sequence of jamos is displayed as a series of syllable 

blocks. The following rules specify how to divide up an arbitrary sequence of 

jamos (including nonstandard sequences) into these syllable blocks. In these 

rules, a <i>choseong filler</i> (<i>L<sub>f </sub></i>) is treated as a <i>

choseong</i> character, and a <i>jungseong filler</i> (<i>V</i><i><sub>f </sub>

</i>) is treated as a <i>jungseong</i>.</p>

<p>The precomposed Hangul syllables are of two types: <i>LV</i> or <i>LVT</i>. 

In determining the syllable boundaries, the LV behave as if they were a sequence 

of jamo L V, and the LVT behave as if they were a sequence of jamo L V T.</p>

<p>Within any sequence of characters, a syllable break never occurs between the 

pairs of characters shown in <i>Table 3-5</i>. In all other cases, there is a 

syllable break before and after any jamo or precomposed Hangul syllable. Note 

that like other characters, any combining mark between two conjoining jamos 

prevents the jamos from forming a syllable.</p>

<p align="center"><b>Table 3-5. Hangul Syllable No-Break Rules</b></p>

<div align="center">

  <center>

  <table border="2" cellpadding="2" cellspacing="0">

    <tr>

      <td colspan="2"><b>Do Not Break Between</b></td>

      <td><b>Examples</b></td>

    </tr>

    <tr>

      <td>L</td>

      <td>L, V, or precomposed<br>

      Hangul syllable</td>

      <td>L × L<br>

      L × V<br>

      L × LV<br>

      L × LVT</td>

    </tr>

    <tr>

      <td>V or LV</td>

      <td>V or T&nbsp;</td>

      <td>V × V<br>

      V × T<br>

      LV × V<br>

      LV × T</td>

    </tr>

    <tr>

      <td>T or LVT</td>

      <td>T</td>

      <td>T × T<br>

      LVT × T</td>

    </tr>

    <tr>

      <td>Jamo or<br>

      precomposed<br>

      Hangul syllable</td>

      <td>Combining marks</td>

      <td>L × M<br>

      V × M<br>

      T × M<br>

      LV × M<br>

      LVT × M</td>

    </tr>

  </table>

  </center>

</div>
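<p>The no-break rules of Table 3-5 can be sketched as follows. This Python fragment is an illustration under stated assumptions: characters are assumed to have already been mapped to the class labels <tt>L</tt>, <tt>V</tt>, <tt>T</tt>, <tt>LV</tt>, <tt>LVT</tt>, <tt>M</tt> and <tt>X</tt> used in this section (fillers count as <tt>L</tt> or <tt>V</tt>), and the function names are hypothetical. Because the rules only place breaks before and after jamo or precomposed syllables, adjacent non-Hangul items are simply left in the same group.</p>

```python
HANGUL = {"L", "V", "T", "LV", "LVT"}


def no_break(a: str, b: str) -> bool:
    """True if Table 3-5 forbids a syllable break between classes a and b."""
    if b == "M":                      # jamo or precomposed syllable x mark
        return a in HANGUL
    if a == "L":
        return b in {"L", "V", "LV", "LVT"}
    if a in {"V", "LV"}:
        return b in {"V", "T"}
    if a in {"T", "LVT"}:
        return b == "T"
    return False


def split_syllables(classes):
    """Group a sequence of class labels into syllables."""
    groups = []
    for c in classes:
        if groups:
            prev = groups[-1][-1]
            # A break occurs before or after any Hangul item unless the
            # pair is listed in Table 3-5.
            breaks = (prev in HANGUL or c in HANGUL) and not no_break(prev, c)
            if not breaks:
                groups[-1].append(c)
                continue
        groups.append([c])
    return groups


print(" · ".join("".join(g) for g in split_syllables(list("LLTTVVTTVVLLVV"))))
print(split_syllables(["L", "LVT", "T"]))
print(split_syllables(["LV", "M", "V"]))
```

<p>The last call shows the effect noted above: a combining mark between two jamo keeps them from forming one syllable.</p>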

<p>Note that even in normalization form NFC, a syllable may contain a 

precomposed Hangul syllable in the middle. An example is “L LVT T”. Each 

well-formed modern Hangul syllable, however, can be represented in the form L V 

T? (that is one L, one V and optionally one T), and is a single character in NFC.</p>

<p>For information on the behavior of Hangul compatibility jamo in syllables, 

see <i>Section 10.4, Hangul</i> in <i>The Unicode Standard, Version 3.0</i>.</p>

<h3>Standard Korean Syllables</h3>

<p>A standard Korean syllable block is composed of a sequence of one or more <i>

L</i> followed by a sequence of one or more <i>V</i> and a sequence 

of zero or more <i>T</i>. A sequence of nonstandard syllable blocks can be 

transformed into a sequence of standard Korean syllable blocks by inserting <i>

choseong</i> fillers (<i>L<sub>f </sub></i>) and <i>jungseong</i> fillers (<i>V<sub>f

</sub></i>).</p>

<p>Using regular expression notation, a standard Korean syllable is thus of the 

form:</p>

<p>L+ V+ T*</p>

<p>The transformation of a string of text into standard Korean syllables is 

performed by determining the syllable breaks as explained in the subsection on 

“Syllable Boundaries” earlier in this section, then inserting one or two fillers 

as necessary to transform each syllable into a standard Korean syllable. Thus:</p>

<p>L ^V → L V<sub>f</sub> ^V<br>

^L V → ^L L<sub>f</sub> V<br>

^V T → ^V L<sub>f</sub> V<sub>f</sub> T</p>

<p>where ^X indicates a character that is not X, or the absence of a character.</p>

<p><i><b>Examples.</b></i> In <i>Table 3-6</i>, the first row shows syllable 

breaks in a standard sequence, the second row shows syllable breaks in a 

nonstandard sequence, and the third row shows how the sequence in the second row 

could be transformed into standard form by inserting fillers into each syllable.

</p>

<p align="center"><b>Table 3-6. Syllable Break Examples</b></p>

<div align="center">

  <center>

  <table border="2" cellpadding="2" cellspacing="0">

    <tr>

      <td align="left">

      <p align="left">No.&nbsp;</td>

      <td align="left">Sequence</td>

      <td align="left">&nbsp;</td>

      <td align="left">Sequence with Syllable Breaks Marked</td>

    </tr>

    <tr>

      <td align="left">

      <p align="left">1&nbsp;</td>

      <td align="left">

      <p align="left">LVTLVLVLV<sub>f</sub>L<sub>f</sub>VL<sub>f</sub>V<sub>f</sub>T</td>

      <td align="left">→&nbsp;</td>

      <td align="left">LVT · LV · LV · LV<sub>f</sub> · L<sub>f</sub>V · L<sub>f</sub>V<sub>f</sub>T</td>

    </tr>

    <tr>

      <td align="left">

      <p align="left">2</td>

      <td align="left">LLTTVVTTVVLLVV</td>

      <td align="left">→</td>

      <td align="left">LL · TT · VVTT · VV · LLVV</td>

    </tr>

    <tr>

      <td align="left">

      <p align="left">3</td>

      <td align="left">LLTTVVTTVVLLVV</td>

      <td align="left">→&nbsp;</td>

      <td align="left">LLV<sub>f</sub> · L<sub>f</sub>V<sub>f</sub>TT · L<sub>f</sub>VVTT 

      · L<sub>f</sub>VV · LLVV</td>

    </tr>

  </table>

  </center>

</div>
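<p>The filler-insertion step can be sketched in Python as below. The sketch assumes the text has already been divided into syllables, each given as a list of the labels <tt>L</tt>, <tt>V</tt>, <tt>T</tt>, <tt>Lf</tt> and <tt>Vf</tt>; the function names are hypothetical, and precomposed syllables and combining marks are deliberately left out.</p>

```python
import re

LEADS = {"L", "Lf"}
VOWELS = {"V", "Vf"}

# The form L+ V+ T*, with fillers counted as L or V.
STANDARD = re.compile(r"(?:Lf?)+(?:Vf?)+T*")


def is_standard(syllable) -> bool:
    return STANDARD.fullmatch("".join(syllable)) is not None


def to_standard(syllable):
    """Insert a choseong and/or jungseong filler where one is missing."""
    out = list(syllable)
    if not out or out[0] not in LEADS:
        out.insert(0, "Lf")           # no leading consonant: add L filler
    i = 0
    while i < len(out) and out[i] in LEADS:
        i += 1
    if i == len(out) or out[i] not in VOWELS:
        out.insert(i, "Vf")           # no vowel after the leads: add V filler
    return out


for syl in (["L", "L"], ["T", "T"], ["V", "V", "T", "T"], ["V", "V"], ["L", "L", "V", "V"]):
    fixed = to_standard(syl)
    print("".join(syl), "->", "".join(fixed), is_standard(fixed))
```

<p>A syllable needs at most two fillers, which corresponds to the three rewrite rules given above.</p>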

<h3><a name="4_2_combining_classes_normative">4.2 Combining Classes—Normative</a> (revision)</h3>

<p>Remove the entry for U+06DD ARABIC END OF AYAH from <i>Table 4-3, Combining 

Classes</i> on page 80 of <i>The Unicode Standard, Version 3.0</i>.</p>

<h3>Unicode Standard Annex #15, “Unicode Normalization Forms” (revision)</h3>

<p>In Corrigendum #3 the canonical mapping for U+F951 has been corrected. For 

more information, see <a href="http://www.unicode.org/unicode/reports/tr15/">Unicode 

Standard Annex #15, “Unicode Normalization Forms”</a>.</p>

<h2 class="bb"><a name="general_structure_and_guidelines">III General Structure 

and Guidelines</a></h2>

<h3><a name="2_2_unicode_design_principles">2.2 Unicode Design Principles</a> 

(addition) </h3>

<p>Add the following text to page 18 of<i> The Unicode Standard, Version 3.0 </i>just before 

the subsection on “Convertibility”:</p>

<p><i><b>Decompositions</b></i></p>

<p>Precomposed characters are formally known as decomposables, because they have

decompositions to one or more other characters. There are two types of

decompositions:</p>

<ul>

  <li><b>Canonical.</b> The character and its decomposition should be treated as

    essentially equivalent.</li>

  <li><b>Compatibility. </b>The decomposition may remove some information

    (typically formatting information) that is important to preserve in

    particular contexts. By definition, compatibility decomposition is a 

  superset of canonical decomposition.</li>

</ul>

<p>Thus there are three types of characters, based on their decomposition

behavior:</p>

<ul>

  <li><b>Nondecomposable. </b>The character has no decomposition: neither

    canonical nor compatibility.</li>

  <li><b>Canonical Decomposable. </b>The character has a distinct canonical

    decomposition.</li>

  <li><b>Compatibility Decomposable. </b>The character has a distinct

    compatibility decomposition.</li>

</ul>

<p>The following figure illustrates these three types. The solid arrows indicate canonical decompositions, and the dotted arrows indicate compatibility decompositions. If an arrow loops back and points to the character itself, that indicates that there is no decomposition of that type (other than in the trivial sense of a character 

“decomposing” to itself).</p>



<p>The figure illustrates two important things to keep in mind:</p>



  <ul>

    <li>Decompositions may be to single characters <i>or</i> to

    sequences of characters. Decompositions to a single character,

    also known as <i>singleton decompositions,</i> are seen

    for the <i>ohm sign</i> and the <i>halfwidth katakana

    ka</i> in the figure. Because of examples like these,

    decomposable characters in Unicode do not always consist of

    obvious, separate parts; one can only know their status by

    examining the data tables for the standard.</li>

    <li>There are a very small number of characters that are both

    canonical <i>and</i> compatibility decomposable. The

    example shown in the figure is for the Greek hooked upsilon

    symbol with an acute accent. It has a canonical decomposition

    to one sequence and a compatibility decomposition to a different

    sequence.</li>

</ul>



  <p>For more precise definitions of some of these terms, see <i>Chapter 3,

Conformance</i> in <i>The Unicode Standard, Version 3.0</i>.</p>

<div align="center">

  <center>

  <table border="1" cellspacing="0" cellpadding="8" style="page-break-before:always">

    <tr>

      <th colspan="2" style="text-align: center"><font size="4">Nondecomposables</font>

        <p>

        <img border="0" src="nondecomp.gif" alt="nondecomposable example" width="289" height="179"></th>

    </tr>

    <tr>

      <th style="text-align: center"><font size="4">Canonical Decomposables</font>

        <p>

        <img border="0" src="cdecomp.gif" alt="canonical decomposable example" width="289" height="179"></p>

        <p>

        <img border="0" src="cdecomp2.gif" alt="canonical decomposable example" width="289" height="179"></p>

        <p>

        <img border="0" src="ckdecomp.gif" alt="canonical decomposable example" width="289" height="179"></th>

      <th style="text-align: center"><font size="4">Compatibility Decomposables</font>

        <p>

        <img border="0" src="kdecomp.gif" alt="compatibility decomposable example" width="289" height="179"></p>

        <p>

        <img border="0" src="kdecomp2.gif" alt="compatibility decomposable example" width="289" height="179"></p>

        <p>

        <img border="0" src="ckdecomp.gif" alt="compatibility decomposable example" width="289" height="179"></th>

    </tr>

  </table>

  </center>

</div>
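<p>Since a character's status can only be known from the data tables, a quick way to inspect them is sketched below in Python. The helper name <tt>describe</tt> is hypothetical; it reads the raw decomposition mapping exposed by Python's <tt>unicodedata</tt> module, which tracks a later version of the Unicode Character Database than 3.2 and is assumed to agree for these characters.</p>

```python
import unicodedata


def describe(ch: str):
    """Return (kind, code points) for the raw decomposition mapping of ch."""
    raw = unicodedata.decomposition(ch)
    if not raw:
        return ("none", [])
    fields = raw.split()
    if fields[0].startswith("<"):     # a tag such as <compat> or <narrow>
        kind, fields = "compatibility", fields[1:]
    else:
        kind = "canonical"
    return (kind, [int(f, 16) for f in fields])


for ch in ("\u2126", "\uFF76", "\u03D3", "\u03D2", "a"):
    kind, cps = describe(ch)
    print("U+%04X" % ord(ch), kind, ["U+%04X" % cp for cp in cps])
```

<p>The raw mapping for U+03D3 is canonical; its separate compatibility decomposition arises because U+03D2, the first character of that mapping, itself has a compatibility mapping.</p>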

<h3><a name="5_15_locating_text_element_boundaries">5.15 Locating Text Element 

Boundaries</a> (revision)</h3>

<p>Add the following text after bullet item 6 on page 125 of<i> The Unicode 

Standard, Version 3.0</i>:<i><br>

<br>

</i>The rules are applied in order. That is, there is an implicit “otherwise” at 

the front of each rule following the first. It is possible to construct 

alternate sets of such rules that are fully equivalent; that is, they have the 

same effect.</p>

<p>Note: The rules for default grapheme cluster boundaries, default word boundaries and default sentence 

boundaries are in the process of being superseded by a new 

<a href="http://www.unicode.org/unicode/reports/tr29/">Unicode Technical 

Report #29, Text Boundaries</a>.</p>

<h2 class="bb"><a name="block">IV Block Descriptions</a></h2>

<p>Note: The numbering used here for block descriptions and revised text follows

<i>The Unicode Standard, Version 3.0</i> for ease of cross-reference.</p>

<h3><a name="6_1_general_punctuation">6.1 General Punctuation</a> (addition)</h3>

<p><i><b>Invisible Operators</b></i>. In mathematics some operators or 

punctuation are often implied, but not displayed. U+2063 INVISIBLE SEPARATOR or

<i>invisible comma</i> is intended for use in index expressions and other 

mathematical notation where two adjacent variables form a list and are not 

implicitly multiplied. In mathematical notation, commas are not always 

explicitly present, but need to be indicated for symbolic calculation software 

to help it disambiguate a sequence from a multiplication. For example, the 

double <i><sub>ij</sub></i> subscript in the variable <i>a<sub>ij</sub></i> 

means <i>a<sub>i</sub></i><sub>, <i>j </i></sub>— that is, the <i>i</i> and <i>j</i> 

are separate indices and not a single variable with the name <i>ij</i> or even 

the product of <i>i</i> and <i>j</i>. Accordingly, to represent the implied list 

separation in the subscript <i><sub>ij</sub></i> one can insert a nondisplaying

<i>invisible separator</i> between the <i>i</i> and the <i>j</i>. In addition, 

use of the invisible comma would hint to a math layout program to typeset a 

small space between the variables.</p>

<p>Similarly an expression like <i>mc</i><sup>2</sup> implies that the mass <i>m</i> 

multiplies the square of the speed <i>c</i>. To represent the implied 

multiplication in <i>mc</i><sup>2</sup>, one inserts a nondisplaying U+2062 

INVISIBLE TIMES between the <i>m</i> and the <i>c</i>. A related case is the use 

of U+2061 FUNCTION APPLICATION for an implied function dependence as in <i>f</i>(<i>x</i> 

+ <i>y</i>). To indicate that this is the function <i>f</i> of the quantity <i>x</i> 

+ <i>y</i> and not the expression <i>fx</i> + <i>fy</i>, one can insert the 

nondisplaying <i>function application symbol</i> between the <i>f</i> and the 

left parenthesis.&nbsp;</p>

<p>Another example is the expression <i>f <sup>ij</sup></i>(cos(<i>ab</i>)), 

which means the same as <i>f<sup>ij</sup></i>(cos(<i>a</i>×<i>b</i>)), where × 

represents <i>multiplication</i>, not the <i>cross product</i>. Note that the 

spacing between characters may also depend on whether the adjacent variables are 

part of a list or are to be concatenated, that is, multiplied.</p>
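<p>The following Python sketch shows how the three invisible operators might appear in plain text and how a program could make them visible for debugging. The visible stand-ins and the name <tt>make_visible</tt> are hypothetical choices for this illustration; subscripts and layout are outside its scope.</p>

```python
import unicodedata

FUNCTION_APPLICATION = "\u2061"
INVISIBLE_TIMES = "\u2062"
INVISIBLE_SEPARATOR = "\u2063"

# Hypothetical visible stand-ins, chosen only for this illustration.
VISIBLE = {
    ord(FUNCTION_APPLICATION): "@",
    ord(INVISIBLE_TIMES): "*",
    ord(INVISIBLE_SEPARATOR): ",",
}


def make_visible(text: str) -> str:
    """Replace each invisible operator with its visible stand-in."""
    return text.translate(VISIBLE)


indices = "i" + INVISIBLE_SEPARATOR + "j"       # two separate indices
energy = "m" + INVISIBLE_TIMES + "c\u00B2"      # implied multiplication
call = "f" + FUNCTION_APPLICATION + "(x + y)"   # f applied to (x + y)

for sample in (indices, energy, call):
    print(make_visible(sample))

for ch in (FUNCTION_APPLICATION, INVISIBLE_TIMES, INVISIBLE_SEPARATOR):
    print("U+%04X" % ord(ch), unicodedata.name(ch))
```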

<p>A more complete discussion of mathematical notation can be found in

<a href="http://www.unicode.org/reports/tr25/">Proposed Draft Unicode Technical Report #25, “Unicode Support 

for Mathematics.”</a></p>

<p><i><b>Commercial Minus.</b></i> U+2052 COMMERCIAL MINUS SIGN is used in 

commercial or tax-related forms or publications in several European countries, 

including Germany and the Scandinavian countries. The string “./.” appears to be used as a 

fallback representation for this character.</p>

<p>The symbol may also appear as a marginal note in letters, denoting 

enclosures. One variation replaces the top dot with a digit indicating the 

number of enclosures.</p>

<p>An additional usage of the sign appears in the Finno-Ugric Phonetic Alphabet 

(FUPA), where it marks a structurally-related borrowed element of different 

pronunciation. In Finland and a number of other European countries, the dingbats 

<img src="U-2052.jpg" alt="U+2052" width="21" height="19"> and

<img src="U-2713.jpg" alt="U+2713" width="16" height="19"> are used for “correct” and “incorrect” 

respectively in marking a student’s paper. This contrasts with American 

practice, for example, where 

<img src="U-2713.jpg" alt="U+2713" width="16" height="19"> and

<img src="U-2717.jpg" alt="U+2717" width="12" height="19"> can be used for “correct” and “incorrect” 

respectively in the same context.</p>

<h3>CJK Symbols and Punctuation: U+3000–U+303F (update and addition)</h3>

<p>On page 155 of <i>The Unicode Standard, Version 3.0</i> update the first full 

paragraph as follows:</p>

<p>This block encodes punctuation marks and symbols <strike>primarily </strike>

used by writing systems that employ Han ideographs. Most of these characters are 

found in East Asian standards.</p>

<p>Add a new paragraph on page 155 of <i>The Unicode Standard, Version 3.0</i> to 

follow the paragraph on U+3006: </p>

<p>U+3008, U+3009 angle brackets are unambiguously wide. The Unicode Standard 

encodes different characters for use in other contexts, such as mathematics. 

There are other characters in this block that have the same characteristics, 

including double angle brackets, tortoise shell brackets, and white square 

brackets.</p>

<h3><a name="7_2_greek">7.2 Greek</a> (revision)</h3>

<h3>Representative Glyphs for Greek Phi</h3>

<p>With Unicode 3.0 and the concurrent second edition of ISO/IEC 10646-1, the 

representative glyphs for U+03C6 GREEK SMALL LETTER PHI and U+03D5 GREEK PHI SYMBOL 

were swapped. In ordinary Greek text, the character U+03C6 is used exclusively, 

although this character has considerable glyphic variation, sometimes 

represented with a glyph more like the representative glyph shown for U+03C6 

(the “loopy” form) and less often with a glyph more like the representative 

glyph shown for U+03D5 (the “straight” form).</p>

<p>For mathematical and technical use, the straight form of the small phi is an 

important symbol and needs to be consistently distinguishable from the loopy 

form. The straight form phi glyph is used as the representative glyph for the 

symbol phi at U+03D5 to satisfy this distinction.</p>

<p>The reversed assignment of representative glyphs in versions of the Unicode 

Standard prior to Unicode 3.0 had the problem that the character explicitly 

identified as the mathematical symbol did not have the straight form of the 

character that is the preferred glyph for that use. Furthermore, it made it 

unnecessarily difficult for general purpose fonts supporting ordinary Greek text 

to also add support for Greek letters used as mathematical symbols. This 

resulted from the fact that many of those fonts already used the loopy form 

glyph for U+03C6, as preferred for Greek body text; to support the phi symbol as 

well, they would have had to disrupt glyph choices already optimized for Greek 

text.</p>

<p>When mapping symbol sets or SGML entities to the Unicode Standard, it is 

important to make sure that codes or entities that require the straight form of 

the phi symbol be mapped to U+03D5 and not to U+03C6. Mapping to the latter 

should be reserved for codes or entities that represent the small phi as used in 

ordinary Greek text.</p>

<p>Fonts used primarily for Greek text may use either glyph form for U+03C6, but 

fonts that also intend to support technical use of the Greek letters should use 

the loopy form to ensure appropriate contrast with the straight form used for 

U+03D5. </p>

<h3><a name="8_2_arabic">8.2 Arabic</a> (addition)</h3>

<p><b><i>End of Ayah. </i></b>U+06DD ARABIC END OF AYAH<i> </i>graphically 

encloses a sequence of zero or more digits (of General Category Nd) that follow 

it in the data stream. The enclosure terminates with any non-digit. For behavior 

of a similar prefixed formatting control, see the discussion of the Syriac 

Abbreviation Mark in <i>Section 8.3, Syriac</i> in <i>The Unicode Standard, 

Version 3.0</i>.</p>
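<p>A renderer needs to find the run of digits that U+06DD encloses. The Python sketch below illustrates one way to do so; the function name <tt>enclosed_digits</tt> is hypothetical, and the General Category test relies on Python's <tt>unicodedata</tt> module.</p>

```python
import unicodedata

END_OF_AYAH = "\u06DD"


def enclosed_digits(text: str, pos: int) -> str:
    """Return the digits (General Category Nd) enclosed by U+06DD at pos."""
    if text[pos] != END_OF_AYAH:
        raise ValueError("no ARABIC END OF AYAH at this position")
    end = pos + 1
    # The enclosure terminates with any non-digit (or the end of the text).
    while end < len(text) and unicodedata.category(text[end]) == "Nd":
        end += 1
    return text[pos + 1:end]


sample = "\u06DD\u0661\u0662\u0663 \u0664"
print([hex(ord(c)) for c in enclosed_digits(sample, 0)])
```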

<h3><a name="9_15_khmer">9.15 Khmer</a> (addition)</h3>

<p><b><i>Characters Whose Use is Discouraged.</i> </b>The use of the following characters 

is discouraged; they are being 

considered for possible deprecation in a future version of the Standard. These 

characters should be avoided in the normal representation of Khmer text:</p>

<p>17A3 KHMER INDEPENDENT VOWEL QAQ<br>

17A4 KHMER INDEPENDENT VOWEL QAA<br>

17B4 KHMER VOWEL INHERENT AQ<br>

17B5 KHMER VOWEL INHERENT AA<br>

17D3 KHMER SIGN BATHAMASAT<br>

17D8 KHMER SIGN BEYYAL</p>

<p>For transliteration of Pali/Sanskrit, U+17A2 KHMER LETTER QA is recommended instead of 

U+17A3 KHMER INDEPENDENT VOWEL QAQ, and the sequence &lt;U+17A2 KHMER LETTER QA, U+17B6 

KHMER VOWEL SIGN AA&gt; is recommended instead of 

U+17A4 KHMER INDEPENDENT VOWEL QAA.</p>
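<p>The list and the transliteration recommendation above could be applied in a text-checking tool roughly as follows. This is a sketch under stated assumptions: the names <tt>find_discouraged</tt> and <tt>apply_translit_replacements</tt> are hypothetical, and the replacement is only appropriate for Pali/Sanskrit transliteration, since no substitution is recommended here for the other four characters.</p>

```python
# Code points whose use is discouraged, taken from the list above.
DISCOURAGED = {0x17A3, 0x17A4, 0x17B4, 0x17B5, 0x17D3, 0x17D8}

# Replacements recommended above for Pali/Sanskrit transliteration only.
TRANSLIT_REPLACEMENTS = {
    0x17A3: "\u17A2",          # QAQ becomes KHMER LETTER QA
    0x17A4: "\u17A2\u17B6",    # QAA becomes QA followed by VOWEL SIGN AA
}

def find_discouraged(text):
    """Return (index, 'U+XXXX') for every discouraged character in text."""
    return [(i, f"U+{ord(ch):04X}")
            for i, ch in enumerate(text) if ord(ch) in DISCOURAGED]

def apply_translit_replacements(text):
    """Substitute the recommended sequences for U+17A3 and U+17A4."""
    return text.translate(TRANSLIT_REPLACEMENTS)
```

<p>Reporting and replacing are kept separate so that a tool can flag all six characters while rewriting only the two for which a replacement is recommended.</p>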

<p>The use of U+17D3 KHMER SIGN BATHAMASAT is not recommended for representation of Khmer lunar dates; 
a separate proposal for the full representation of Khmer lunar dates is 
under development.</p>

<p>U+17D8 KHMER SIGN BEYYAL is not recommended for representing the Khmer word meaning “etc.”; 
that word should be spelled out with a sequence of signs and letters instead.</p>

<p><i><b>Combined Vowels</b></i>. The Khmer language uses two dependent 
vowel signs whose Unicode representation consists of a sequence of two code 
points. These are <i>khmer vowel sign srak om</i>, represented by the sequence 
&lt;U+17BB KHMER VOWEL SIGN U, U+17C6 KHMER SIGN NIKAHIT&gt;, and <i>khmer vowel sign 
srak aam</i>, represented by the sequence &lt;U+17B6 KHMER VOWEL SIGN AA, U+17C6 
KHMER SIGN NIKAHIT&gt;. The <i>nikahit</i> represents the final nasalization of the 
vowel, shown by the “m” in the transliteration. These dependent vowels are treated as units for the purposes of enumerating 
the “letters” of Khmer and, most importantly, for collation. Representing these vowels 
by a sequence of two Unicode code points may be unexpected for Khmer 
implementers. It is therefore important to ensure that 
these sequences are treated as units when implementing Khmer.</p>

<p><i><b>Subscript Letters.</b></i> The Unicode encoding of the Khmer script 
uses an independent (and invisible) <i>coeng</i> sign to indicate that the 
following consonant is subscripted, by analogy with the virama model employed 
for representing conjuncts in Indian scripts. Subscripted independent vowels are 
encoded in the same manner. This approach uses an artificial <i>coeng</i> sign 
character that does not exist as a letter or sign in the Khmer script, and 
therefore departs from the ordinary way that 
Khmer is conceived of and taught to native Khmer speakers. Consequently, 
the encoding may not be intuitive to a native user of the Khmer writing system. Ordinarily, units 
such as <i>khmer consonant sign coeng ka</i> are conceived of as independent and 
unitary subscript letters, rather than as the result of conjunct formation.</p>

<p>To aid Khmer script users, the table “Additional Khmer Character Names” 
gives a full listing of the Khmer subscript letters, together with names for them that follow 
preferred Khmer practice. The combined vowel characters are also included in 
that table. Although the Unicode encoding represents both the 
subscripts and the combined vowels with a pair of code points, each pair must 
be treated<i> as a unit</i> for most processing purposes; in other words, it 
must function as if it had been encoded as a single character.</p>
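<p>One way to treat the subscripts and combined vowels described above as units is to split text into such units before further processing. The following is only a sketch: the name <tt>khmer_units</tt> is hypothetical, the range U+1780..U+17B3 after the <i>coeng</i> is an assumption that is broader than the subscripts listed in the chart, and the result is not a full syllable or grapheme cluster segmentation.</p>

```python
import re

# Alternatives are tried left to right, so the two-code-point units win
# over the single-code-point fallback.
_UNIT = re.compile(
    "\u17D2[\u1780-\u17B3]"   # coeng + consonant or independent vowel (assumed range)
    "|[\u17BB\u17B6]\u17C6"   # VOWEL SIGN U or AA followed by NIKAHIT
    "|.",                     # any other single code point
    re.DOTALL,
)

def khmer_units(text):
    """Split text into a list of units, keeping each pair together."""
    return _UNIT.findall(text)
```

<p>Counting, indexing, or comparing over the resulting list rather than over individual code points keeps each subscript and combined vowel intact. A <i>coeng</i> that is not followed by a letter in the assumed range is returned as a unit by itself.</p>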

<p>A full Khmer script chart is also provided, showing <i>

all</i> of the Khmer characters preferred for modern Khmer usage, including the 

subscripts and combined vowels. This chart is better for didactic purposes in 

representing the Khmer script and its Unicode encoding. By contrast, the main 

Unicode code chart does not reflect the modern reading rules for Khmer, and 

thereby can give a misleading picture of the structure of the script.</p>

<div align="center">

  <center>

  <table width="75%">

    <caption>

      <b>

      <font size="3">Khmer Script Chart</font> </b>

    </caption>

    <tr>

      <th style="text-align:center" colspan="10">Consonants</th>

    </tr>

    <tr>

      <td width="10%" style="text-align: center">
      <img alt="1780" src="images/U1780.gif" width="52" height="62"><br>

        <tt>1780</tt></td>

      <td width="10%" style="text-align: center">
      <img alt="1781" src="images/U1781.gif" width="52" height="62"><br>

        <tt>1781</tt></td>

      <td width="10%" style="text-align: center">
      <img alt="1782" src="images/U1782.gif" width="52" height="62"><br>

        <tt>1782</tt></td>

      <td width="10%" style="text-align: center">
      <img alt="1783" src="images/U1783.gif" width="52" height="62"><br>

        <tt>1783</tt></td>

      <td width="10%" style="text-align: center">
      <img alt="1784" src="images/U1784.gif" width="52" height="62"><br>

        <tt>1784</tt></td>

      <td width="10%" style="text-align: center">
      <img alt="1785" src="images/U1785.gif" width="52" height="62"><br>

        <tt>1785</tt></td>

      <td width="10%" style="text-align: center">
      <img alt="1786" src="images/U1786.gif" width="52" height="62"><br>

        <tt>1786</tt></td>

      <td width="10%" style="text-align: center">
      <img alt="1787" src="images/U1787.gif" width="52" height="62"><br>

        <tt>1787</tt></td>

      <td width="10%" style="text-align: center">
      <img alt="1788" src="images/U1788.gif" width="52" height="62"><br>

        <tt>1788</tt></td>

      <td width="10%" style="text-align: center">
      <img alt="1789" src="images/U1789.gif" width="52" height="62"><br>

        <tt>1789</tt></td>

    </tr>

    <tr>

      <td style="text-align: center"><img alt="178A" src="images/U178A.gif" width="52" height="62"><br>

        <tt>178A</tt></td>

      <td style="text-align: center"><img alt="178B" src="images/U178B.gif" width="52" height="62"><br>

        <tt>178B</tt></td>

      <td style="text-align: center"><img alt="178C" src="images/U178C.gif" width="52" height="62"><br>

        <tt>178C</tt></td>

      <td style="text-align: center"><img alt="178D" src="images/U178D.gif" width="52" height="62"><br>

        <tt>178D</tt></td>

      <td style="text-align: center"><img alt="178E" src="images/U178E.gif" width="52" height="62"><br>

        <tt>178E</tt></td>

      <td style="text-align: center"><img alt="178F" src="images/U178F.gif" width="52" height="62"><br>

        <tt>178F</tt></td>

      <td style="text-align: center"><img alt="1790" src="images/U1790.gif" width="52" height="62"><br>

        <tt>1790</tt></td>

      <td style="text-align: center"><img alt="1791" src="images/U1791.gif" width="52" height="62"><br>

        <tt>1791</tt></td>

      <td style="text-align: center"><img alt="1792" src="images/U1792.gif" width="52" height="62"><br>

        <tt>1792</tt></td>

      <td style="text-align: center"><img alt="1793" src="images/U1793.gif" width="52" height="62"><br>

        <tt>1793</tt></td>

    </tr>

    <tr>

      <td style="text-align: center"><img alt="1794" src="images/U1794.gif" width="52" height="62"><br>

        <tt>1794</tt></td>

      <td style="text-align: center"><img alt="1795" src="images/U1795.gif" width="52" height="62"><br>

        <tt>1795</tt></td>

      <td style="text-align: center"><img alt="1796" src="images/U1796.gif" width="52" height="62"><br>

        <tt>1796</tt></td>

      <td style="text-align: center"><img alt="1797" src="images/U1797.gif" width="52" height="62"><br>

        <tt>1797</tt></td>

      <td style="text-align: center"><img alt="1798" src="images/U1798.gif" width="52" height="62"><br>

        <tt>1798</tt></td>

      <td style="text-align: center"><img alt="1799" src="images/U1799.gif" width="52" height="62"><br>

        <tt>1799</tt></td>

      <td style="text-align: center"><img alt="179A" src="images/U179A.gif" width="52" height="62"><br>

        <tt>179A</tt></td>

      <td style="text-align: center"><img alt="179B" src="images/U179B.gif" width="52" height="62"><br>

        <tt>179B</tt></td>

      <td style="text-align: center"><img alt="179C" src="images/U179C.gif" width="52" height="62"><br>

        <tt>179C</tt></td>

      <td style="text-align: center"><img alt="179D" src="images/U179D.gif" width="52" height="62"><br>

        <tt>179D</tt></td>

    </tr>

    <tr>

      <td style="text-align: center"><img alt="179E" src="images/U179E.gif" width="52" height="62"><br>

        <tt>179E</tt></td>

      <td style="text-align: center"><img alt="179F" src="images/U179F.gif" width="52" height="62"><br>

        <tt>179F</tt></td>

      <td style="text-align: center"><img alt="17A0" src="images/U17A0.gif" width="52" height="62"><br>

        <tt>17A0</tt></td>

      <td style="text-align: center"><img alt="17A1" src="images/U17A1.gif" width="52" height="62"><br>

        <tt>17A1</tt></td>

      <td style="text-align: center"><img alt="17A2" src="images/U17A2.gif" width="52" height="62"><br>

        <tt>17A2</tt></td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

    </tr>

    <tr>

      <th style="text-align:center" colspan="10">Independent Vowels</th>

    </tr>

    <tr>

      <td style="text-align: center"><img alt="17A5" src="images/U17A5.gif" width="52" height="62"><br>

        <tt>17A5</tt></td>

      <td style="text-align: center"><img alt="17A6" src="images/U17A6.gif" width="52" height="62"><br>

        <tt>17A6</tt></td>

      <td style="text-align: center"><img alt="17A7" src="images/U17A7.gif" width="52" height="62"><br>

        <tt>17A7</tt></td>

      <td style="text-align: center"><img alt="17A9" src="images/U17A9.gif" width="52" height="62"><br>

        <tt>17A9</tt></td>

      <td style="text-align: center"><img alt="17AA" src="images/U17AA.gif" width="52" height="62"><br>

        <tt>17AA</tt></td>

      <td style="text-align: center"><img alt="17AB" src="images/U17AB.gif" width="52" height="62"><br>

        <tt>17AB</tt></td>

      <td style="text-align: center"><img alt="17AC" src="images/U17AC.gif" width="52" height="62"><br>

        <tt>17AC</tt></td>

      <td style="text-align: center"><img alt="17AD" src="images/U17AD.gif" width="52" height="62"><br>

        <tt>17AD</tt></td>

      <td style="text-align: center"><img alt="17AE" src="images/U17AE.gif" width="52" height="62"><br>

        <tt>17AE</tt></td>

      <td style="text-align: center"><img alt="17AF" src="images/U17AF.gif" width="52" height="62"><br>

        <tt>17AF</tt></td>

    </tr>

    <tr>

      <td style="text-align: center"><img alt="17B0" src="images/U17B0.gif" width="52" height="62"><br>

        <tt>17B0</tt></td>

      <td style="text-align: center"><img alt="17B1" src="images/U17B1.gif" width="52" height="62"><br>

        <tt>17B1</tt></td>

      <td style="text-align: center"><img alt="17B3" src="images/U17B3.gif" width="52" height="62"><br>

        <tt>17B3</tt></td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

    </tr>

    <tr>

      <th style="text-align:center" colspan="10">Dependent Vowel Signs</th>

    </tr>

    <tr>

      <td style="text-align: center"><img alt="17B6" src="images/U17B6.gif" width="52" height="62"><br>

        <tt>17B6</tt></td>

      <td style="text-align: center"><img alt="17B7" src="images/U17B7.gif" width="52" height="62"><br>

        <tt>17B7</tt></td>

      <td style="text-align: center"><img alt="17B8" src="images/U17B8.gif" width="52" height="62"><br>

        <tt>17B8</tt></td>

      <td style="text-align: center"><img alt="17B9" src="images/U17B9.gif" width="52" height="62"><br>

        <tt>17B9</tt></td>

      <td style="text-align: center"><img alt="17BA" src="images/U17BA.gif" width="52" height="62"><br>

        <tt>17BA</tt></td>

      <td style="text-align: center"><img alt="17BB" src="images/U17BB.gif" width="52" height="62"><br>

        <tt>17BB</tt></td>

      <td style="text-align: center"><img alt="17BC" src="images/U17BC.gif" width="52" height="62"><br>

        <tt>17BC</tt></td>

      <td style="text-align: center"><img alt="17BD" src="images/U17BD.gif" width="52" height="62"><br>

        <tt>17BD</tt></td>

      <td style="text-align: center"><img alt="17BE" src="images/U17BE.gif" width="52" height="62"><br>

        <tt>17BE</tt></td>

      <td style="text-align: center"><img alt="17BF" src="images/U17BF.gif" width="52" height="62"><br>

        <tt>17BF</tt></td>

    </tr>

    <tr>

      <td style="text-align: center"><img alt="17C0" src="images/U17C0.gif" width="52" height="62"><br>

        <tt>17C0</tt></td>

      <td style="text-align: center"><img alt="17C1" src="images/U17C1.gif" width="52" height="62"><br>

        <tt>17C1</tt></td>

      <td style="text-align: center"><img alt="17C2" src="images/U17C2.gif" width="52" height="62"><br>

        <tt>17C2</tt></td>

      <td style="text-align: center"><img alt="17C3" src="images/U17C3.gif" width="52" height="62"><br>

        <tt>17C3</tt></td>

      <td style="text-align: center"><img alt="17C4" src="images/U17C4.gif" width="52" height="62"><br>

        <tt>17C4</tt></td>

      <td style="text-align: center"><img alt="17C5" src="images/U17C5.gif" width="52" height="62"><br>

        <tt>17C5</tt></td>

      <td style="text-align: center">
      <img alt="17BB 17C6" src="images/17BB17C6.gif" width="52" height="62"><br>

        <tt>17BB<br>

        17C6</tt></td>

      <td style="text-align: center"><img alt="17C6" src="images/U17C6.gif" width="52" height="62"><br>

        <tt>17C6</tt></td>

      <td style="text-align: center">
      <img alt="17B6 17C6" src="images/17B617C6.gif" width="52" height="62"><br>

        <tt>17B6<br>

        17C6</tt></td>

      <td style="text-align: center"><img alt="17C7" src="images/U17C7.gif" width="52" height="62"><br>

        <tt>17C7</tt></td>

    </tr>

    <tr>

      <th style="text-align:center" colspan="10">Subscript Characters</th>

    </tr>

    <tr>

      <td style="text-align: center">
      <img alt="17D2 1780" src="images/17D21780.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1780</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1781" src="images/17D21781.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1781</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1782" src="images/17D21782.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1782</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1783" src="images/17D21783.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1783</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1784" src="images/17D21784.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1784</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1785" src="images/17D21785.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1785</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1786" src="images/17D21786.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1786</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1787" src="images/17D21787.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1787</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1788" src="images/17D21788.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1788</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1789" src="images/17D21789.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1789</tt></td>

    </tr>

    <tr>

      <td style="text-align: center">
      <img alt="17D2 178A" src="images/17D2178A.gif" width="58" height="70"><br>

        <tt>17D2<br>

        178A</tt></td>

      <td style="text-align: center">
      <img alt="17D2 178B" src="images/17D2178B.gif" width="58" height="70"><br>

        <tt>17D2<br>

        178B</tt></td>

      <td style="text-align: center">
      <img alt="17D2 178C" src="images/17D2178C.gif" width="58" height="70"><br>

        <tt>17D2<br>

        178C</tt></td>

      <td style="text-align: center">
      <img alt="17D2 178D" src="images/17D2178D.gif" width="58" height="70"><br>

        <tt>17D2<br>

        178D</tt></td>

      <td style="text-align: center">
      <img alt="17D2 178E" src="images/17D2178E.gif" width="58" height="70"><br>

        <tt>17D2<br>

        178E</tt></td>

      <td style="text-align: center">
      <img alt="17D2 178F" src="images/17D2178F.gif" width="58" height="70"><br>

        <tt>17D2<br>

        178F</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1790" src="images/17D21790.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1790</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1791" src="images/17D21791.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1791</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1792" src="images/17D21792.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1792</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1793" src="images/17D21793.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1793</tt></td>

    </tr>

    <tr>

      <td style="text-align: center">
      <img alt="17D2 1794" src="images/17D21794.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1794</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1795" src="images/17D21795.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1795</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1796" src="images/17D21796.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1796</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1797" src="images/17D21797.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1797</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1798" src="images/17D21798.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1798</tt></td>

      <td style="text-align: center">
      <img alt="17D2 1799" src="images/17D21799.gif" width="58" height="70"><br>

        <tt>17D2<br>

        1799</tt></td>

      <td style="text-align: center">
      <img alt="17D2 179A" src="images/17D2179A.gif" width="58" height="70"><br>

        <tt>17D2<br>

        179A</tt></td>

      <td style="text-align: center">
      <img alt="17D2 179B" src="images/17D2179B.gif" width="58" height="70"><br>

        <tt>17D2<br>

        179B</tt></td>

      <td style="text-align: center">
      <img alt="17D2 179C" src="images/17D2179C.gif" width="58" height="70"><br>

        <tt>17D2<br>

        179C</tt></td>

      <td style="text-align: center">
      <img alt="17D2 179D" src="images/17D2179D.gif" width="58" height="70"><br>

        <tt>17D2<br>

        179D</tt></td>

    </tr>

    <tr>

      <td style="text-align: center">
      <img alt="17D2 179E" src="images/17D2179E.gif" width="58" height="70"><br>

        <tt>17D2<br>

        179E</tt></td>

      <td style="text-align: center">
      <img alt="17D2 179F" src="images/17D2179F.gif" width="58" height="70"><br>

        <tt>17D2<br>

        179F</tt></td>

      <td style="text-align: center">
      <img alt="17D2 17A0" src="images/17D217A0.gif" width="58" height="70"><br>

        <tt>17D2<br>

        17A0</tt></td>

      <td style="text-align: center">
      <img alt="17D2 17A2" src="images/17D217A2.gif" width="58" height="70"><br>

        <tt>17D2<br>

        17A2</tt></td>

      <td style="text-align: center">
      <img alt="17D2 17A7" src="images/17D217A7.gif" width="58" height="70"><br>

        <tt>17D2<br>

        17A7</tt></td>

      <td style="text-align: center">
      <img alt="17D2 17AB" src="images/17D217AB.gif" width="58" height="70"><br>

        <tt>17D2<br>

        17AB</tt></td>

      <td style="text-align: center">
      <img alt="17D2 17AF" src="images/17D217AF.gif" width="58" height="70"><br>

        <tt>17D2<br>

        17AF</tt></td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

    </tr>

    <tr>

      <th style="text-align:center" colspan="10">Various Signs</th>

    </tr>

    <tr>

      <td style="text-align: center"><img alt="17C8" src="images/U17C8.gif" width="52" height="62"><br>

        <tt>17C8</tt></td>

      <td style="text-align: center"><img alt="17CB" src="images/U17CB.gif" width="52" height="62"><br>

        <tt>17CB</tt></td>

      <td style="text-align: center"><img alt="17CC" src="images/U17CC.gif" width="52" height="62"><br>

        <tt>17CC</tt></td>

      <td style="text-align: center"><img alt="17CD" src="images/U17CD.gif" width="52" height="62"><br>

        <tt>17CD</tt></td>

      <td style="text-align: center"><img alt="17CE" src="images/U17CE.gif" width="52" height="62"><br>

        <tt>17CE</tt></td>

      <td style="text-align: center"><img alt="17CF" src="images/U17CF.gif" width="52" height="62"><br>

        <tt>17CF</tt></td>

      <td style="text-align: center"><img alt="17D0" src="images/U17D0.gif" width="52" height="62"><br>

        <tt>17D0</tt></td>

      <td style="text-align: center"><img alt="17D1" src="images/U17D1.gif" width="52" height="62"><br>

        <tt>17D1</tt></td>

      <td style="text-align: center"><img alt="17D4" src="images/U17D4.gif" width="52" height="62"><br>

        <tt>17D4</tt></td>

      <td style="text-align: center"><img alt="17D5" src="images/U17D5.gif" width="52" height="62"><br>

        <tt>17D5</tt></td>

    </tr>

    <tr>

      <td style="text-align: center"><img alt="17D6" src="images/U17D6.gif" width="52" height="62"><br>

        <tt>17D6</tt></td>

      <td style="text-align: center"><img alt="17D7" src="images/U17D7.gif" width="52" height="62"><br>

        <tt>17D7</tt></td>

      <td style="text-align: center"><img alt="17D9" src="images/U17D9.gif" width="52" height="62"><br>

        <tt>17D9</tt></td>

      <td style="text-align: center"><img alt="17DA" src="images/U17DA.gif" width="52" height="62"><br>

        <tt>17DA</tt></td>

      <td style="text-align: center"><img alt="17DC" src="images/U17DC.gif" width="52" height="62"><br>

        <tt>17DC</tt></td>

      <td style="text-align: center"><img alt="17DB" src="images/U17DB.gif" width="52" height="62"><br>

        <tt>17DB</tt></td>

      <td style="text-align: center"><img alt="17C9" src="images/U17C9.gif" width="52" height="62"><br>

        <tt>17C9</tt></td>

      <td style="text-align: center"><img alt="17CA" src="images/U17CA.gif" width="52" height="62"><br>

        <tt>17CA</tt></td>

      <td style="text-align: center">&nbsp;</td>

      <td style="text-align: center">&nbsp;</td>

    </tr>

    <tr>

      <th style="text-align:center" colspan="10">Digits</th>

    </tr>

    <tr>

      <td style="text-align: center"><img alt="17E0" src="images/U17E0.gif" width="52" height="62"><br>

        <tt>17E0</tt></td>

      <td style="text-align: center"><img alt="17E1" src="images/U17E1.gif" width="52" height="62"><br>

        <tt>17E1</tt></td>

      <td style="text-align: center"><img alt="17E2" src="images/U17E2.gif" width="52" height="62"><br>

        <tt>17E2</tt></td>

      <td style="text-align: center"><img alt="17E3" src="images/U17E3.gif" width="52" height="62"><br>

        <tt>17E3</tt></td>

      <td style="text-align: center"><img alt="17E4" src="images/U17E4.gif" width="52" height="62"><br>

        <tt>17E4</tt></td>

      <td style="text-align: center"><img alt="17E5" src="images/U17E5.gif" width="52" height="62"><br>

        <tt>17E5</tt></td>

      <td style="text-align: center"><img alt="17E6" src="images/U17E6.gif" width="52" height="62"><br>

        <tt>17E6</tt></td>

      <td style="text-align: center"><img alt="17E7" src="images/U17E7.gif" width="52" height="62"><br>

        <tt>17E7</tt></td>

      <td style="text-align: center"><img alt="17E8" src="images/U17E8.gif" width="52" height="62"><br>

        <tt>17E8</tt></td>

      <td style="text-align: center"><img alt="17E9" src="images/U17E9.gif" width="52" height="62"><br>

        <tt>17E9</tt></td>

    </tr>

  </table>

  </center>

</div>

<p>&nbsp;</p>

<center>

<b>Additional Khmer Character Names</b>



<table>

<tr><td align="center" style="vertical-align: middle"><b>Glyph</b></td>

  <td align="center" style="vertical-align: middle"><b>Code</b></td>

  <td align="center" style="vertical-align: middle"><b>Name</b></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17BB17C6.gif" alt="17BB,17C6" width="52" height="62"></td>

  <td style="vertical-align: middle">17BB 17C6</td>

  <td style="vertical-align: middle"><i>khmer vowel sign srak om</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17B617C6.gif" alt="17B6,17C6" width="52" height="62"></td>

  <td style="vertical-align: middle">17B6 17C6</td>

  <td style="vertical-align: middle"><i>khmer vowel sign srak am</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21780.gif" alt="17D2,1780" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1780</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng ka</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21781.gif" alt="17D2,1781" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1781</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng kha</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21782.gif" alt="17D2,1782" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1782</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng ko</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21783.gif" alt="17D2,1783" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1783</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng kho</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21784.gif" alt="17D2,1784" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1784</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng ngo</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21785.gif" alt="17D2,1785" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1785</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng ca</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21786.gif" alt="17D2,1786" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1786</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng cha</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21787.gif" alt="17D2,1787" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1787</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng co</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21788.gif" alt="17D2,1788" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1788</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng cho</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21789.gif" alt="17D2,1789" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1789</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng nyo</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2178A.gif" alt="17D2,178A" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 178A</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng da</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2178B.gif" alt="17D2,178B" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 178B</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng ttha</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2178C.gif" alt="17D2,178C" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 178C</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng do</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2178D.gif" alt="17D2,178D" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 178D</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng ttho</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2178E.gif" alt="17D2,178E" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 178E</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng na</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2178F.gif" alt="17D2,178F" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 178F</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng ta</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21790.gif" alt="17D2,1790" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1790</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng tha</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21791.gif" alt="17D2,1791" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1791</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng to</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21792.gif" alt="17D2,1792" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1792</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng tho</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21793.gif" alt="17D2,1793" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1793</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng no</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21794.gif" alt="17D2,1794" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1794</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng ba</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21795.gif" alt="17D2,1795" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1795</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng pha</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21796.gif" alt="17D2,1796" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1796</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng po</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21797.gif" alt="17D2,1797" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1797</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng pho</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21798.gif" alt="17D2,1798" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1798</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng mo</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D21799.gif" alt="17D2,1799" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 1799</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng yo</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2179A.gif" alt="17D2,179A" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 179A</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng ro</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2179B.gif" alt="17D2,179B" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 179B</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng lo</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2179C.gif" alt="17D2,179C" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 179C</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng vo</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2179D.gif" alt="17D2,179D" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 179D</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng sha</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2179E.gif" alt="17D2,179E" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 179E</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng ssa</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D2179F.gif" alt="17D2,179F" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 179F</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng sa</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D217A0.gif" alt="17D2,17A0" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 17A0</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng ha</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D217A2.gif" alt="17D2,17A2" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 17A2</td>

  <td style="vertical-align: middle"><i>khmer consonant sign coeng qa</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D217A7.gif" alt="17D2,17A7" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 17A7</td>

  <td style="vertical-align: middle"><i>khmer vowel sign coeng qu</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D217AB.gif" alt="17D2,17AB" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 17AB</td>

  <td style="vertical-align: middle"><i>khmer vowel sign coeng ry</i></td></tr>

<tr><td style="vertical-align: middle">

  <img src="images/17D217AF.gif" alt="17D2,17AF" width="58" height="70"></td>

  <td style="vertical-align: middle">17D2 17AF</td>

  <td style="vertical-align: middle"><i>khmer vowel sign coeng qe</i></td></tr>

</table>



</center>

<p>&nbsp;</p>

<h3><a name="9_16_philippine_scripts">9.16 Philippine Scripts</a> (new section)&nbsp;</h3>

<h3>Tagalog: U+1700..U+171F<br>

Hanunóo: U+1720..U+173F<br>

Buhid: U+1740..U+175F<br>

Tagbanwa: U+1760..U+177F</h3>

<p>The first of these four scripts, Tagalog, is no longer used, whereas the 

other three, Hanunóo, Buhid, and Tagbanwa, are living scripts of the 

Philippines. South Indian scripts of the Pallava dynasty made their way to the 

Philippines, although the exact route is uncertain. They may have been 

transported by way of the Kavi scripts of Western Java between the 10th and 14th 

centuries CE.&nbsp;</p>

<p>There are written accounts of the Tagalog script by Spanish missionaries, and 

documents in Tagalog dating from the mid-1500s. The first book in this script 

was printed in Manila in 1593. While the Tagalog script was used to write 

Tagalog, Bisaya, Ilocano, and other languages, it fell out of normal use by the 

mid-1700s; modern Tagalog language is now written in the Latin script.&nbsp;</p>

<p>The three living scripts, Hanunóo, Buhid, and Tagbanwa, are related to 

Tagalog, but may not be directly descended from it. The Hanunóo and the Buhid 

peoples live in Mindoro, while the Tagbanwa live in Palawan. Hanunóo enjoys the 

most use; it is widely used to write love poetry, a popular pastime among the 

Hanunóo. Tagbanwa is less used.</p>

<h3>Principles of the Scripts</h3>

<p>The Philippine scripts share features with the other Brahmi-derived scripts 

to which they are related.</p>

<p><i><b>Consonant Letters.</b></i> Philippine scripts have consonants 

containing an inherent <i>-a</i> vowel, which may be modified by the addition of 

vowel signs or canceled (killed) by the use of a virama-type mark.</p>

<p><b><i>Independent Vowel Letters.</i></b> Philippine scripts have null 

consonants which are used to write syllables that start with a vowel.</p>

<p><i><b>Dependent Vowel Signs.</b></i> The vowel <i>-i</i> is written with a 

mark above the associated consonant, and the vowel <i>-u</i> with an identical 

mark below. The mark is known in Tagalog as <i>kudlit</i> “diacritic,” <i>tuldik</i> 

“accent,” or <i>tildok</i> “dot,” and <i>ulitan</i> “diacritic” in Tagbanwa. The 

Philippine scripts employ only the two vowel signs <i>i</i> and <i>u</i>, which 

are also used to stand for the vowels <i>e</i> and <i>o</i> respectively.</p>

<p><i><b>Virama.</b></i> Though all languages normally written with the 

Philippine scripts have syllables ending in consonants, not all of the scripts 

have a mechanism for expressing the canceled <i>-a</i>. As a result, in those 

orthographies, the final consonants are unexpressed. Francisco Lopez introduced 

a cross-shaped <i>virama</i> in his 1620 catechism in the Ilocano language, but 

this innovation did not seem to find favor with native users, who seem to have 

considered the script adequate without it (they preferred

<img src="kakapi-1.jpg" alt="image for kakapi" width="52" height="14"> <i>kakapi</i> to

<img src="kakampi-2.jpg" alt="image for kakampi" width="68" height="14"> <i>kakampi</i>). A similar 

reform for the Hanunóo script seems to have been better received. The Hanunóo <i>

pamudpod</i> was devised by Antoon Postma, who went to the Philippines from the 

Netherlands in the mid-1950s. In traditional orthography,

<img src="si-apu-1.jpg" alt="image for si apu ba upada" width="116" height="17"> <i>si apu ba upada</i> 

is, with the <i>pamudpod</i>, rendered more accurately as

<img src="si-aypud-2.jpg" alt="image for si aypud bay upadan" width="205" height="20"> <i>si aypud bay 

upadan</i>; the Hanunóo pronunciation is <i>si aypod bay upadan</i>. The Tagalog

<i>virama</i> and Hanunóo <i>pamudpod</i> cancel only the inherent <i>-a</i>. No 

conjunct consonants are employed in the Philippine scripts.</p>
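<p>As a minimal sketch (in Python, which is not part of this annex), the two Tagalog spellings discussed above can be built from code points. The code point values are those of the Tagalog block in the Unicode Character Database; the helper name <code>spell</code> is hypothetical, and the sequences are an assumption about what the two images show.</p>

```python
import unicodedata

# Tagalog code points (names per the Unicode Character Database).
KA, PA, MA = "\u1703", "\u1709", "\u170B"
VOWEL_SIGN_I = "\u1712"
VIRAMA = "\u1714"  # cancels the inherent -a of the preceding consonant


def spell(*parts):
    """Join consonants, vowel signs, and viramas in logical order."""
    return "".join(parts)


# Traditional orthography: the syllable-final m is left unexpressed.
kakapi = spell(KA, KA, PA + VOWEL_SIGN_I)

# With the virama: MA + VIRAMA writes a bare m before pi.
kakampi = spell(KA, KA, MA + VIRAMA, PA + VOWEL_SIGN_I)

for ch in kakampi:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

<p>The vowel sign and the virama each follow the consonant they modify in the stored text, regardless of where the mark is drawn.</p>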

<p><i><b>Directionality.</b></i> The Philippine scripts are read from left to 

right in horizontal lines running from top to bottom. They may be written or 

carved either in that manner, or in vertical lines running from bottom to top, 

moving from left to right. In the latter case, the letters are written sideways 

so they may be read horizontally. This method of writing is probably due to the 

medium and writing implements used. Text is often scratched with a sharp 

instrument onto beaten strips of bamboo which are held pointing away from the 

body and worked from the proximal to distal ends, in columns from left to right.</p>

<p><i><b>Rendering.</b></i> In Tagalog and Tagbanwa, the vowel signs simply rest 

over or under the consonants. In Hanunóo and Buhid, however, special ligatures 

are often formed as shown in the following tables.</p>

<center>

<table style="page-break-before:always; border-collapse:collapse" class="noborder" cellpadding="0" cellspacing="0"> <tr>

    <td class="noborder">

    <p align="center"><b>Hanunóo</b></td>

    <td class="noborder">

    <p align="center"><b>Buhid</b></td>

  </tr>

  <tr>

    <td class="noborder">

    <img border="0" src="phil1.jpg" alt="Table for Hanunoo" width="240" height="448"></td>

    <td class="noborder"><img border="0" src="phil2.jpg" alt="Table for Buhid" width="240" height="448"></td>

  </tr>

</table>

</center>

<p><i><b>Punctuation.</b></i> Punctuation has been unified for the Philippine 

scripts. In the Hanunóo block, U+1735 PHILIPPINE SINGLE PUNCTUATION and U+1736 

PHILIPPINE DOUBLE PUNCTUATION are encoded. Tagalog makes use only of the latter; 

Hanunóo, Buhid, and Tagbanwa make use of both of them. </p>

<h3><a name="10_1_han">10.1 Han</a> (addition)</h3>

<h3>CJK Compatibility Ideographs (addition)&nbsp;</h3>

<p>Unicode 3.2 adds 59 new ideographs to the Compatibility Ideographs block. 

These new compatibility ideographs are found from U+FA30 to U+FA6A. They are 

included in the Unicode Standard to provide full round-trip compatibility with 

the ideographic repertoire of JIS X 0213:2000 and should not be used for any 

other purpose.</p>
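<p>The range and count above can be checked with a short Python sketch. It also illustrates a consequence that the text does not spell out: each of these characters has a canonical decomposition to a unified ideograph, so any normalization form replaces it. The names <code>NEW_COMPAT</code> and <code>unified_form</code> are hypothetical, and the result depends on the Unicode data shipped with the Python runtime.</p>

```python
import unicodedata

# Range stated in the text: U+FA30 through U+FA6A inclusive.
NEW_COMPAT = [chr(cp) for cp in range(0xFA30, 0xFA6A + 1)]


def unified_form(ch):
    """Return the canonical (NFD) replacement for a compatibility ideograph."""
    return unicodedata.normalize("NFD", ch)


print(len(NEW_COMPAT))  # number of code points in the range
for ch in NEW_COMPAT[:3]:
    print(f"U+{ord(ch):04X} normalizes to U+{ord(unified_form(ch)):04X}")
```

<p>Because normalization does not preserve these code points, data that must round-trip to JIS X 0213 should not be normalized in transit.</p>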

<h3><a name="10_3_katakana">10.3 Katakana</a>&nbsp;(addition)</h3>

<h3>Katakana Phonetic Extensions (addition)&nbsp;</h3>

<p>Katakana Phonetic Extensions: U+31F0..U+31FF</p>

<p>These extensions to the Katakana syllabary are all “small” variants. They are 

used in Japan for phonetic transcription of Ainu and other languages.</p>

<h3><a name="10_4_hangul">10.4 Hangul</a> (addition)&nbsp;</h3>

<h3>Hangul Compatibility Jamo</h3>

<p>When Hangul compatibility jamo are transformed with a compatibility 

normalization form, NFKD or NFKC, the characters are converted to the 

corresponding conjoining jamo characters. Where the characters are intended to 

remain in separate syllables after such transformation, they may require 

separation from adjacent characters. This can be done by inserting any 

non-Korean character.</p>

<ul>

  <li>U+200B ZERO WIDTH SPACE is recommended where a line break is to be 

  allowed between the characters.</li>

  <li>U+2060 WORD JOINER can be used where the characters are not to break 

  across lines.</li>

</ul>

<p>For example, the table below illustrates how two Hangul compatibility jamo 

can be separated in display, even after transforming with NFKD or NFKC.</p>

<center>

<table border="1" cellspacing="0" cellpadding="4">

  <caption><b>Separating Jamo Characters</b></caption>

  <tr>

    <th width="25%" style="text-align: center">Original</th>

    <th width="25%" style="text-align: center">&nbsp;NFKD</th>

    <th width="25%" style="text-align: center">&nbsp;NFKC</th>

    <th width="25%" style="text-align: center">Display</th>

  </tr>

  <tr>

    <td class="n">

    <table>

      <tr>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-3131" alt="U+3131"><br>

        <tt class="n">3131</tt></td>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-314F" alt="U+314F"><br>

        <tt class="n">314F</tt></td>

      </tr>

    </table>

    </td>

    <td class="n">

    <table>

      <tr>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-1100" alt="U+1100"><br>

        <tt class="n">1100</tt></td>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-1161" alt="U+1161"><br>

        <tt class="n">1161</tt></td>

      </tr>

    </table>

    </td>

    <td class="n">

    <table>

      <tr>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-AC00" alt="U+AC00"><br>

        <tt class="n">AC00</tt></td>

      </tr>

    </table>

    </td>

    <td class="n"><img src="http://www.unicode.org/cgi-bin/refglyph?24-AC00" 

    alt="Glyph for U+AC00"></td>

  </tr>

  <tr>

    <td class="n">

    <table>

      <tr>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-3131" alt="U+3131"><br>

        <tt class="n">3131</tt></td>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-200B" alt="U+200B"><br>

        <tt class="n">200B</tt></td>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-314F" alt="U+314F"><br>

        <tt class="n">314F</tt></td>

      </tr>

    </table>

    </td>

    <td class="n">

    <table>

      <tr>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-1100" alt="U+1100"><br>

        <tt class="n">1100</tt></td>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-200B" alt="U+200B"><br>

        <tt class="n">200B</tt></td>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-1161" alt="U+1161"><br>

        <tt class="n">1161</tt></td>

      </tr>

    </table>

    </td>

    <td class="n">

    <table>

      <tr>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-1100" alt="U+1100"><br>

        <tt class="n">1100</tt></td>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-200B" alt="U+200B"><br>

        <tt class="n">200B</tt></td>

        <td class="q">

        <img src="http://www.unicode.org/cgi-bin/refglyph?24-1161" alt="U+1161"><br>

        <tt class="n">1161</tt></td>

      </tr>

    </table>

    </td>

    <td class="n"><img src="http://www.unicode.org/cgi-bin/refglyph?24-3131" 

    alt="Glyph for U+3131"><img 

    src="http://www.unicode.org/cgi-bin/refglyph?24-314F" 

    alt="Glyph for U+314F"></td>

  </tr>

</table>

</center>
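<p>The behavior in the table can be reproduced with Python's standard <code>unicodedata</code> module. This is only an illustrative sketch; the variable and function names are hypothetical, and the output depends on the Unicode data shipped with the Python runtime.</p>

```python
import unicodedata

KIYEOK = "\u3131"  # HANGUL LETTER KIYEOK (compatibility jamo)
A = "\u314F"       # HANGUL LETTER A (compatibility jamo)
ZWSP = "\u200B"    # ZERO WIDTH SPACE


def code_points(text):
    """Format a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


# One row per input: original | NFKD | NFKC
for original in (KIYEOK + A, KIYEOK + ZWSP + A):
    nfkd = unicodedata.normalize("NFKD", original)
    nfkc = unicodedata.normalize("NFKC", original)
    print(code_points(original), "|", code_points(nfkd), "|", code_points(nfkc))
```

<p>Without the separator, NFKC composes the two conjoining jamo into a single syllable; with it, the jamo stay apart in both forms.</p>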

<p><br>

</p>

<h3><a name="11_4_mongolian">11.4 Mongolian</a> (addition)</h3>

<h3>Standardized Variants of Mongolian Characters (addition)&nbsp;</h3>

<p>Like Arabic letters, Mongolian letters have various presentation forms 

depending on their positions in words. There are additional linguistic 

constraints that result in variations that must be employed in specific 

contexts, creating the need for several Mongolian-specific variation selectors, 

which are encoded at U+180B, U+180C, and U+180D.</p>

<p>The table of standardized variants in the Unicode Character Database found at

<a href="http://www.unicode.org/Public/3.2-Update/StandardizedVariants-3.2.0.html">

http://www.unicode.org/Public/3.2-Update/StandardizedVariants-3.2.0.html</a> 

provides a description of the variant appearances corresponding to the use of 

appropriate variation selectors with all allowed base Mongolian characters. Only 

some presentation forms of the base Mongolian characters used with the Mongolian 

free variation selectors produce variant appearances. These combinations are 

exhaustively listed and described in the table. All combinations not listed in 

the table are unspecified and are reserved for future standardization; no 

conformant process may interpret them as standardized variants.</p>
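<p>A process that honors this rule might first extract each base character and selector pair, then consult the table before treating the pair as a variant. The Python sketch below assumes the caller has already loaded the listed pairs from the standardized variants table; the function names and the <code>listed</code> parameter are hypothetical, and no entries of the real table are reproduced here.</p>

```python
# The three Mongolian free variation selectors named in the text.
MONGOLIAN_FVS = {"\u180B", "\u180C", "\u180D"}


def variation_sequences(text):
    """Return (base, selector) pairs for each selector that follows a base.

    A selector at the start of the text, or one that follows another
    selector, has no base character and is skipped.
    """
    pairs = []
    for prev, ch in zip(text, text[1:]):
        if ch in MONGOLIAN_FVS and prev not in MONGOLIAN_FVS:
            pairs.append((prev, ch))
    return pairs


def is_standardized(pair, listed):
    """True only if the pair appears in the caller-supplied table."""
    return pair in listed
```

<p>Pairs for which <code>is_standardized</code> returns false would be left uninterpreted instead of being given an ad hoc appearance.</p>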

<p>For more information, see <i><a href="#13_7_variation_selectors">Section 

13.7, Variation Selectors</a></i>, later in this document.</p>

<h3><a name="12_4_mathematical_operators">12.4 Mathematical Operators</a> 

(additions)</h3>

<p>In addition to the symbols in these blocks, mathematical and scientific 

notation makes frequent use of arrows, punctuation characters, letterlike 

symbols, geometrical shapes and other miscellaneous and technical symbols. For 

additional information on all the mathematical operators and other symbols, see

<a href="http://www.unicode.org/reports/tr25/">Proposed Draft Unicode Technical Report #25, “Unicode Support 

for Mathematics.”</a></p>

<p>Other symbols used in mathematical and scientific notation can be found in 

the Geometric Shapes block. For an extensive discussion of mathematical 

alphanumeric symbols, see <i>Section 12.2, Letterlike Symbols</i> in <i>The 

Unicode Standard, Version 3.0</i>.</p>

<h3>Supplements to Mathematical Operators and Arrows</h3>

<p>The Unicode Standard defines a number of additional blocks to supplement the 

repertoire of mathematical operators and arrows. These additions are intended to 

extend the Unicode repertoire sufficiently to cover the needs of such 

applications as MathML, modern mathematical formula editing and presentation 

software, and symbolic algebra systems.</p>

<p><i><b>Standards.</b></i> MathML, an XML application, is intended to support 

the full legacy collection of the ISO mathematical entity sets. Accordingly, the 

repertoire of mathematical symbols for the Unicode Standard has been 

supplemented by the full list of mathematical entity sets in ISO TR 9573-13, <i>

Public entity sets for mathematics and science</i>. Additional repertoire was 

provided from the amalgamated collection of the STIX Project (Scientific and 

Technical Information Exchange). That collection includes, but is not limited 

to, symbols gleaned from mathematical publications by experts of the American 

Mathematical Society and symbol sets provided by Elsevier Publishing and by the 

American Physical Society.</p>

<p><i><b>Semantics.</b></i> The same mathematical symbol may have different 

meanings in different subdisciplines or different contexts. The Unicode Standard 

only encodes a single character for a single symbolic form. For example, the “+” 

symbol normally denotes addition in a mathematical context, but might refer to 

concatenation in a computer science context dealing with strings, or 

incrementation, or have any number of other functions in given contexts. It is 

up to the application to distinguish such meanings according to the appropriate 

context. Where information is available about the usage (or usages) of 

particular symbols, it has been indicated in the character annotations in 

<i>Chapter 14, Code Charts</i> in <i>The Unicode Standard, Version 3.0</i>.</p>

<h3>Supplemental Mathematical Operators: U+2A00–U+2AFF</h3>

<p>This block contains many additional symbols to supplement the collection of 

mathematical operators.</p>

<h3>Miscellaneous Mathematical Symbols-A: U+27C0–U+27EF</h3>

<p>This block contains symbols used mostly as operators or delimiters in 

mathematical notation.</p>

<p><i><b>Mathematical Brackets.</b></i> The mathematical white square brackets, 

angle brackets, and double angle brackets encoded at U+27E6..U+27EB are intended 

for ordinary mathematical use of these particular bracket types. They are 

unambiguously narrow, for use in mathematical and scientific notation, and 

should be distinguished from the corresponding wide forms of white square 

brackets, angle brackets, and double angle brackets used in CJK typography. (See 

the CJK Symbols and Punctuation block.) Note especially that the “bra” and “ket” 

angle brackets, U+2329 LEFT-POINTING ANGLE BRACKET and U+232A RIGHT-POINTING 

ANGLE BRACKET, are now deprecated for use with mathematics because of their 

canonical equivalence to CJK angle brackets, which is likely to result in 

unintended spacing problems if used in mathematical formulae.</p>

<h3>Miscellaneous Mathematical Symbols-B: U+2980–U+29FF</h3>

<p>This block contains miscellaneous symbols used for mathematical notation, 

including fences and other delimiters. Some of the symbols in this block may 

also be used as operators in some contexts.</p>

<p><b><i>Wiggly Fence</i></b>. U+29D8 LEFT WIGGLY FENCE has a superficial 

similarity to U+FE34 PRESENTATION FORM FOR VERTICAL WAVY LOW LINE. The latter is a 

wiggly sidebar character, intended for legacy support as a style of underlining 

character in a vertical text layout context; it has a compatibility mapping to 

U+005F LOW LINE. This represents a very different usage from the standard use of 

fence characters in mathematical notation.</p>
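<p>The difference shows up under compatibility normalization. In the Python sketch below, both characters are looked up by name so that no code point is assumed; the helper name <code>nfkc</code> is hypothetical.</p>

```python
import unicodedata

fence = unicodedata.lookup("LEFT WIGGLY FENCE")
sidebar = unicodedata.lookup("PRESENTATION FORM FOR VERTICAL WAVY LOW LINE")


def nfkc(ch):
    """Apply compatibility composition (NFKC) to a single character."""
    return unicodedata.normalize("NFKC", ch)


# The sidebar folds to U+005F LOW LINE; the fence is left unchanged.
print(f"U+{ord(sidebar):04X}", nfkc(sidebar) == "\u005F")
print(f"U+{ord(fence):04X}", nfkc(fence) == fence)
```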

<h3>Supplemental Arrows-A: U+27F0–U+27FF</h3>

<p>This block contains a small additional set of arrows to supplement the main 

set in the Arrows block.</p>

<p><i><b>Long Arrows.</b></i> The long arrows encoded in the range 

U+27F5..U+27FF map to standard SGML entity sets supported by MathML. Long arrows 

represent distinct semantics from their short counterparts, rather than mere 

stylistic glyph differences. For example, the shorter forms of arrows are often 

used in connection with limits, whereas the longer ones are associated with 

mappings. The use of the long arrows is so common that they were assigned entity 

names in the ISOAMSA entity set, one of the suite of mathematical symbol entity 

sets covered by the Unicode Standard.</p>

<h3>Supplemental Arrows-B: U+2900–U+297F</h3>

<p>This block contains a large additional repertoire of arrows to round out the 

main set in the Arrows block.</p>

<h3><a name="12_5_technical_symbols">12.5 Technical Symbols</a> (additions)</h3>

<h3>Miscellaneous Technical: U+2300-U+23FF (additions)</h3>

<p><b><i>Keytop Labels.</i></b> [to precede “Crops and Quine Corners”] Where 

possible, keytop labels have been unified with other symbols of like appearance, 

for example U+21E7 UPWARDS WHITE ARROW to indicate the shift key. While symbols 

such as U+2318 PLACE OF INTEREST SIGN and U+2388 HELM SYMBOL are generic symbols 

that have been adapted to use on keytops, other symbols specifically follow ISO/IEC 

9995-7.</p>

<p><b><i>Angle Brackets.</i></b> [to follow “Crops and Quine Corners”] U+2329 

LEFT-POINTING ANGLE BRACKET and U+232A RIGHT-POINTING ANGLE BRACKET have long 

been canonically equivalent to the CJK punctuation characters, U+3008 LEFT ANGLE 

BRACKET and U+3009 RIGHT ANGLE BRACKET, respectively. This canonical equivalence 

implies that the use of the latter (CJK) code points is preferred, and that 

U+2329 and U+232A are also “wide” characters. (See <a href="http://www.unicode.org/reports/tr11/"><i>Unicode 

Standard Annex #11, “East Asian Width”</i></a>, for the 

definition of the East Asian wide property.) Because of this fact, the use of 

U+2329 and U+232A is deprecated for mathematics and technical publication, where 

the wide property of the characters has the potential for interfering with 

proper formatting of mathematical formulae. Instead, use the angle brackets 

specifically provided for mathematics: U+27E8 MATHEMATICAL LEFT ANGLE BRACKET 

and U+27E9 MATHEMATICAL RIGHT ANGLE BRACKET. See <i>

<a href="#12_4_mathematical_operators">Section 12.4, Mathematical Operators</a>

</i>earlier in this document<i>.</i></p>
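<p>Both effects, the canonical equivalence and the wide property, can be observed with Python's <code>unicodedata</code> module. The sketch below also shows one possible way to switch existing text to the mathematical brackets; the names <code>TO_MATH</code> and <code>to_math_brackets</code> are hypothetical, and the property values printed depend on the Unicode data shipped with the Python runtime.</p>

```python
import unicodedata

# Deprecated-for-mathematics brackets mapped to the mathematical ones.
TO_MATH = str.maketrans({"\u2329": "\u27E8", "\u232A": "\u27E9"})


def to_math_brackets(text):
    """Replace U+2329/U+232A with U+27E8/U+27E9; leave other text alone."""
    return text.translate(TO_MATH)


# Canonical equivalence: NFC turns the technical brackets into CJK ones.
print(unicodedata.normalize("NFC", "\u2329") == "\u3008")
print(unicodedata.normalize("NFC", "\u232A") == "\u3009")

# East Asian Width property values for the three left brackets.
for ch in ("\u2329", "\u3008", "\u27E8"):
    print(f"U+{ord(ch):04X}", unicodedata.east_asian_width(ch))
```

<p>Any such replacement has to run before normalization; afterwards the original code points are indistinguishable from the CJK punctuation.</p>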

<p><i><b>Symbol Pieces.</b></i> [to follow “APL Functional Symbols”] The 

characters in the range U+239B..U+23B3, plus U+23B7, comprise a set of bracket 

and other symbol fragments for use in mathematical typesetting. These pieces 

originated in older font standards, but have been used in past mathematical 

processing as characters in their own right to make up extra-tall glyphs for 

enclosing multi-line mathematical formulae. Mathematical fences are ordinarily 

sized to the content that they enclose. However, in creating a large fence, the 

glyph is not scaled proportionally; in particular the displayed stem weights 

must remain compatible with the accompanying smaller characters. Thus, simple 

scaling of font outlines cannot be used to create tall brackets. Instead, a 

common technique is to build up the symbol from pieces. In particular, the 

characters U+239B LEFT PARENTHESIS UPPER HOOK through U+23B3 SUMMATION BOTTOM 

represent a set of glyph pieces for building up large versions of the fences (, 

), [, ], {, and }, and of the large operators ∑ and ∫. These brace and operator 

pieces are compatibility characters. They should not be used in stored 

mathematical text, but are often used in the data stream created by display and 

print drivers.</p>

<p>The following table shows which pieces are intended to be used together to 

create specific symbols.</p>

<p align="center"><b>Use of Symbol Pieces</b></p>

<div align="center">

  <table border="2" cellpadding="2" cellspacing="0">

    <tr>

      <td>&nbsp;</td>

      <td>2-row</td>

      <td>3-row</td>

      <td>5-row</td>

    </tr>

    <tr>

      <td>Summation </td>

      <td>23B2, 23B3 </td>

      <td>&nbsp;</td>

      <td>&nbsp;</td>

    </tr>

    <tr>

      <td>Integral</td>

      <td>2320, 2321</td>

      <td>2320, 23AE, 2321</td>

      <td>2320, 3×23AE, 2321</td>

    </tr>

    <tr>

      <td>Left Parenthesis</td>

      <td>239B, 239D</td>

      <td>239B, 239C, 239D</td>

      <td>239B, 3×239C, 239D</td>

    </tr>

    <tr>

      <td>Right Parenthesis</td>

      <td>239E, 23A0</td>

      <td>239E, 239F, 23A0</td>

      <td>239E, 3×239F, 23A0</td>

    </tr>

    <tr>

      <td>Left Bracket&nbsp;</td>

      <td>23A1, 23A3</td>

      <td>23A1, 23A2, 23A3</td>

      <td>23A1, 3×23A2, 23A3</td>

    </tr>

    <tr>

      <td>Right Bracket&nbsp;&nbsp;&nbsp;</td>

      <td>23A4, 23A6</td>

      <td>23A4, 23A5, 23A6</td>

      <td>23A4, 3×23A5, 23A6</td>

    </tr>

    <tr>

      <td>Left Brace</td>

      <td>23B0, 23B1</td>

      <td>23A7, 23A8, 23A9</td>

      <td>23A7, 23AA, 23A8, 23AA, 23A9</td>

    </tr>

    <tr>

      <td>Right Brace&nbsp;&nbsp;&nbsp;</td>

      <td>23B1, 23B0</td>

      <td>23AB, 23AC, 23AD</td>

      <td>23AB, 23AA, 23AC, 23AA, 23AD</td>

    </tr>

  </table>

</div>

<p>For example, an instance of U+239B can be positioned relative to instances of 

U+239C and U+239D to form an extra-tall (three or more line) left parenthesis. 

The center sections encoded here are meant to be used only with the top and 

bottom pieces encoded adjacent to them because the segments are usually 

graphically constructed within the fonts so that they match perfectly when 

positioned at the same <i>x</i> coordinates.</p>
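<p>For the parentheses and square brackets, the pattern is a top piece, zero or more extension pieces, and a bottom piece. The Python sketch below generates the sequence for any height of two rows or more; the names <code>PIECES</code> and <code>tall_fence</code> are hypothetical. Braces are left out because they also need a middle piece, and the sketch says nothing about glyph positioning, which remains the job of the renderer.</p>

```python
# (top, extension, bottom) pieces for fences that share one extension piece.
PIECES = {
    "left parenthesis": (0x239B, 0x239C, 0x239D),
    "right parenthesis": (0x239E, 0x239F, 0x23A0),
    "left bracket": (0x23A1, 0x23A2, 0x23A3),
    "right bracket": (0x23A4, 0x23A5, 0x23A6),
}


def tall_fence(kind, rows):
    """Return the code points, top to bottom, for a fence of the given height."""
    if rows < 2:
        raise ValueError("a built-up fence needs at least two rows")
    top, extension, bottom = PIECES[kind]
    return [top] + [extension] * (rows - 2) + [bottom]


for rows in (2, 3, 5):
    pieces = tall_fence("left parenthesis", rows)
    print(rows, " ".join(f"{cp:04X}" for cp in pieces))
```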

<p><i><b>Vertical Square Brackets.</b></i> The vertical square brackets, U+23B4 

TOP SQUARE BRACKET and U+23B5 BOTTOM SQUARE BRACKET, are compatibility 

characters for legacy applications emulating certain terminals. They are 

intended for those terminal applications only, for limited use in 

vertically-oriented bracketed expressions. U+23B6 BOTTOM SQUARE BRACKET OVER TOP 

SQUARE BRACKET is used when a single character cell is both the end of one such 

expression and the start of another. These compatibility characters should not 

be confused with the general need for rotated <i>glyphs</i> for parentheses, 

brackets, braces, and quotation marks for vertically rendered CJK text. Such 

rotations should be handled by fonts and rendering software, rather than by 

separate encoding of each rotated glyph as a character. See further discussion 

in <i>Section 6.1, General Punctuation</i> in <i>The Unicode Standard, Version 

3.0.</i></p>

<p><i><b>Terminal Graphics Characters.</b></i> In addition to the box-drawing 

characters in the Box Drawing block, a small number of additional vertical or 

horizontal line characters are encoded in the Miscellaneous Technical symbols 

block to complete the set of compatibility characters needed for applications 

which need to emulate various old terminals. The horizontal scan line 

characters, U+23BA HORIZONTAL SCAN LINE-1 through U+23BD HORIZONTAL SCAN LINE-9, 

in particular, represent characters that were encoded in character ROM for use 

with 9-line character graphic cells. Horizontal scan line characters are encoded 

for scan lines 1, 3, 7, and 9. The horizontal scan line character for scan line 

5 is unified with U+2500 BOX DRAWINGS LIGHT HORIZONTAL.</p>
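<p>A terminal emulator could therefore map scan line numbers to characters with a small table such as the Python sketch below. The names <code>SCAN_LINE_CHARS</code> and <code>scan_line_char</code> are hypothetical; the code points are those given in the text, with the box-drawing character filling the slot for scan line 5.</p>

```python
import unicodedata

# Scan line number in a 9-line cell -> character to emit.
SCAN_LINE_CHARS = {
    1: "\u23BA",
    3: "\u23BB",
    5: "\u2500",  # unified with BOX DRAWINGS LIGHT HORIZONTAL
    7: "\u23BC",
    9: "\u23BD",
}


def scan_line_char(line):
    """Return the character for scan line 1, 3, 5, 7, or 9."""
    if line not in SCAN_LINE_CHARS:
        raise ValueError(f"no character is encoded for scan line {line}")
    return SCAN_LINE_CHARS[line]


for line in sorted(SCAN_LINE_CHARS):
    ch = scan_line_char(line)
    print(line, f"U+{ord(ch):04X}", unicodedata.name(ch))
```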

<p><i><b>Dental Symbols. </b></i>The symbols from U+23BE to U+23CC form a 

set of symbols from JIS X 0213 for use in dental notation.</p>

<p><i><b>Standards. </b></i>This block contains a large number of symbols from 

ISO/IEC 9995-7:1994, <i>Information technology—Keyboard layouts for text and 

office systems—Part 7: Symbols used to represent functions</i>. </p>

<h3><a name="12_7_miscellaneous_symbols_and_dingbats">12.7 Miscellaneous Symbols 

and Dingbats</a> (new subsection, revision and addition)</h3>

<h3>Recycling Symbols (new subsection in Miscellaneous Symbols: U+2600-U+26FF)</h3>

<p><i><b>Plastic Bottle Material Code System</b></i>. The seven numbered logos 

encoded from U+2673 to U+2679 

<img src="PBMCS.jpg" 

alt="images for U+2673 to U+2679" width="211" height="31"> are from “The Plastic Bottle Material Code 

System,” introduced in 1988 by the Society of the Plastics Industry (SPI) (see

<a href="http://www.socplas.org">http://www.socplas.org</a>). This set 

consistently uses thin, two-dimensional curved arrows suitable for use in 

plastics molding. In actual use, the symbols often are combined with an 

abbreviation of the material class below the triangle. Such abbreviations are 

not universal, therefore they are not present in the representative glyphs in <i>Chapter 14, Code Charts</i> in <i>The Unicode Standard, Version 3.0</i>.</p>

<p><i><b>Recycling Symbol for Generic Materials</b></i>. An unnumbered plastic 

resin code symbol U+267A <img src="U-267A.jpg" width="33" height="30" 

alt="U+267A"> RECYCLING SYMBOL FOR GENERIC MATERIALS is not formally part of the 

SPI system, but is found in many fonts. Occasional use of this symbol as a 

generic materials code symbol can be found in the field, usually with a text 

legend below, but sometimes also surrounding (or overlaid by) other text or 

symbols. Sometimes, the UNIVERSAL RECYCLING SYMBOL is substituted for the 

generic symbol in this context.</p>

<p><i><b>Universal Recycling Symbol.</b></i> Unicode encodes two common glyph 

variants of this symbol, U+2672 <img src="U-2672.jpg" width="38" height="31" 

alt="U+2672"> UNIVERSAL RECYCLING SYMBOL and U+267B <img src="U-267B.jpg" 

width="35" height="32" alt="U+267B"> BLACK UNIVERSAL RECYCLING SYMBOL. Both are 

used to indicate that the material is recyclable. The white form is the 

traditional version of the symbol, but the black form is sometimes substituted, 

presumably because the thin outlines of the white form do not always reproduce 

well.</p>

<p><b><i>Paper Recycling Symbols.</i></b> The two paper recycling symbols U+267C

<img src="U-267C.jpg" width="32" height="30" alt="U+267C"> RECYCLED PAPER SYMBOL 

and U+267D <img src="U-267D.jpg" width="33" height="29" alt="U+267D"> 

PARTIALLY-RECYCLED PAPER SYMBOL can be used to distinguish fully and partially 

recycled fiber content in paper products or packaging. They are usually 

accompanied by additional text.</p>

<h3>Dingbats: U+2700-U+27BF (revision)&nbsp;</h3>

<p>The following text replaces the text on Dingbats on pages 305-306 of <i>The 

Unicode Standard, Version 3.0</i>:</p>

<p>The Dingbats are derived from a well-established set of glyphs, the ITC Zapf 

Dingbats series 100, which comprises the industry standard “Zapf Dingbat” font 

currently available in most laser printers. Other series of dingbat glyphs also 

exist, but are not encoded in the Unicode Standard because they are not widely 

implemented in existing hardware and software as character-encoded fonts. The 

order of the Dingbats block basically follows the PostScript encoding.</p>

<p><b><i>Unifications.</i></b> Where a dingbat from the ITC Zapf Dingbats series 

100 could be unified with a generic symbol widely used in other contexts, only 

the generic symbol was encoded. This accounts for the encoding gaps in the 

Dingbats block. Examples of such unifications include card suits, BLACK STAR, 

BLACK TELEPHONE, and BLACK RIGHT-POINTING INDEX (see “Miscellaneous Symbols”); 

BLACK CIRCLE and BLACK SQUARE (see “Geometric Shapes”); white encircled numbers 

1 to 10 (see “Enclosed Alphanumerics”); and several generic arrows (see 

“Arrows”). Those four entries appear elsewhere in this section.</p>

<p>In other instances, other glyphs from the ITC Zapf Dingbats series 100 glyphs 

have come to be recognized as having applicability as generic symbols, despite 

having originally been encoded in the Dingbats block. For example, the series of 

negative (black) circled numbers 1 to 10 are now treated as generic symbols for 

this sequence, the continuation of which can be found in “Enclosed Alphanumerics”. 

Other examples include U+2708 AIRPLANE and U+2709 ENVELOPE, which have definite 

semantics independent of the specific glyph shape, and which therefore should be 

considered generic symbols, rather than as symbols representing only the Zapf 

Dingbat glyph shapes.</p>

<p>For many of the remaining characters in the Dingbats block, their semantic 

value is primarily their shape; unlike characters that represent letters from a 

script, there is no well-established range of typeface variations for a dingbat 

that will retain its identity and therefore its semantics. It would be incorrect 

to arbitrarily replace U+279D TRIANGLE-HEADED RIGHTWARDS ARROW with any other 

right arrow dingbat or with any of the generic arrows from the Arrows block 

(U+2190..U+21FF). But exact shape retention for the glyphs is not always 

required in order to maintain the relevant distinctions. For example, ornamental 

characters such as U+2741 EIGHT PETALLED OUTLINE BLACK FLORETTE have been 

successfully implemented in font faces other than Zapf Dingbats with glyph 

shapes that are similar, but not identical, to the ITC Zapf Dingbats series 100.</p>

<p>The following guidelines are provided for font developers wishing to support 

this block of characters. Characters showing large sets of contrastive glyph 

shapes in the Dingbats block, and in particular the various arrow shapes at 

U+2794..U+27BE, should have glyphs that are closely modeled on the ITC Zapf 

Dingbats series 100, which are shown as representative glyphs in the code charts. The 

same applies to the various stars, asterisks, and snowflakes, drop-shadowed 

squares, checkmarks, and x’s, many of which are ornamental, and have an 

elaborate name describing their glyph.</p>

<p>Where the above does not apply, or where dingbats have more generic 

applicability as a symbol, their glyphs do not need to match the representative 

glyphs in the code charts in every detail.</p>

<h3>Ornamental Brackets (addition to Dingbats: U+2700-U+27BF)</h3>

<p><b><i>Ornamental Brackets.</i></b> The 14 ornamental brackets encoded at 

U+2768..U+2775 are a late addition to the set of Zapf Dingbats encoded in the 

Unicode Standard. Although they have always been included in Zapf Dingbats 

fonts, they were unencoded in PostScript versions of the fonts on some 

platforms, and hence were omitted from the original set encoded in Unicode. They 

have been added for compatibility and consistency in handling of the cmaps for 

current versions of the fonts.</p>
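<p>As a rough illustration of the range given above, the following Python sketch 
enumerates U+2768..U+2775 and prints each character's general category and name. 
It assumes the <tt>unicodedata</tt> module of the Python standard library, which 
reflects whatever later version of the Unicode Character Database ships with the 
interpreter rather than Unicode 3.2 itself; the helper name <tt>describe</tt> is 
hypothetical.</p>

```python
import unicodedata

# The 14 ornamental brackets occupy U+2768..U+2775 (inclusive).
ORNAMENTAL_BRACKETS = [chr(cp) for cp in range(0x2768, 0x2775 + 1)]


def describe(chars):
    """Return (code point label, general category, name) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in chars
    ]


rows = describe(ORNAMENTAL_BRACKETS)
for label, category, name in rows:
    print(label, category, name)
```

<p>The listing alternates opening and closing punctuation, which is consistent 
with the characters being paired brackets.</p>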

<h3><a name="12_12_standardized_variants_of_mathematical_symbols">12.12 

Standardized Variants of Mathematical Symbols</a> (new section)</h3>

<p>These mathematical variants are all produced with the addition of U+FE00 

VARIATION SELECTOR-1 (VS1) to mathematical operator base characters. Only the 

valid, recognized combinations are listed in the table of standardized variants. 

All combinations not listed here are unspecified and are reserved for future 

standardization; no conformant process may interpret them as standardized 

variants.</p>

<h3>Change in Representative Glyphs for U+2278 and U+2279</h3>

<p>In Unicode 3.2 the representative glyphs for U+2278 NEITHER LESS-THAN NOR 

GREATER-THAN and U+2279 NEITHER GREATER-THAN NOR LESS-THAN are changed from using a vertical cancellation to using a 

slanted cancellation. This change was made to match the long standing canonical decompositions for these characters, which use 

U+0338 COMBINING LONG SOLIDUS OVERLAY. Irrespective of this change to the 

representative glyphs, the symmetric forms using the vertical stroke are 

acceptable glyph variants. Using U+2278 or U+2279 with VS1 will request these 

variants explicitly, as will using U+2276 LESS-THAN OR GREATER-THAN or U+2277 

GREATER-THAN OR LESS-THAN with U+20D2 COMBINING LONG VERTICAL LINE 

OVERLAY. Unless fonts are  

created with the intention to add support for both forms (via VS1 for the  

upright forms), there is no need to revise the glyphs in existing fonts; the  

glyphic range implied by using the base character code alone encompasses both  

shapes.</p>

<p>For more information, see <i><a href="#13_7_variation_selectors">Section 

13.7, Variation Selectors</a></i>, later in this document.</p>

<h3><a name="13_2_layout_controls">13.2 Layout Controls</a> (additions)</h3>

<h3>Combining Grapheme Joiner (U+034F) (addition)&nbsp;</h3>

<p>The <i>combining grapheme joiner</i> is used to indicate that adjacent characters 

belong to the same grapheme cluster. Grapheme clusters are sequences of one or 

more encoded characters that correspond to what users think of as characters. 

They include, but are not limited to, combining character sequences such as (g + 

°), digraphs such as Slovak “ch”, or sequences with letter modifiers such as k<sup>w</sup>. 

Grapheme cluster boundaries are important for collation, regular expressions, 

and counting “character” positions within text. The Unicode Standard provides a 

determination of where the default grapheme boundaries fall in a string of 

characters. This algorithm can be customized for specific locales. </p>

<p>Note: The rules for default grapheme cluster boundaries, default word boundaries and default sentence 

boundaries are in the process of being superseded by a new 

<a href="http://www.unicode.org/unicode/reports/tr29/">Unicode Technical 

Report #29, Text Boundaries</a>.</p>

<p>There are circumstances where even the locale-specific determination of 

grapheme boundaries may need to be further tailored on a local basis. These 

include:</p>

<ul>

  <li>Determining the placement of combining accents that should apply to a 

  sequence of base characters, rather than a single base character.</li>

  <li>Distinguishing in collation between sequences of characters that are 

  normally considered a grapheme in a particular language, and that same 

  sequence in foreign words.</li>

</ul>

<p>The character U+034F COMBINING GRAPHEME JOINER has been added to prevent 

inappropriate grapheme breaks. The properties of this character are specified so 

as to work well with current software for such processes as grapheme-cluster 

determination, line-break, and collation. In terms of grapheme determination it 

functions like the Indic <i>viramas</i>. Thus a sequence of two characters 

joined by a combining grapheme joiner functions as a single grapheme.</p>

<p>The grapheme joiner prevents line breaking between adjacent characters; 

however, where the prevention of line breaking is the only desired effect, the word joiner should be used 

instead (see <a href="http://www.unicode.org/reports/tr14/">Unicode Standard Annex #14, “Line Breaking 

Properties”</a>). In collation, the grapheme joiner should be ignored unless it 

specifically occurs within a tailored collation element mapping. Thus it is 

given a completely ignorable collation element in the default collation table, 

like NULL (see <a href="http://www.unicode.org/reports/tr10/">Unicode Technical Standard #10, “Unicode 

Collation Algorithm”</a> and also ISO/IEC 14651). However, it can be entered 

into the tailoring rules for any given language, using the UCA and ISO/IEC 14651 

tailoring capabilities.</p>

<p>For rendering, the grapheme joiner is an invisible combining character with 

canonical class of zero. It can bind adjacent characters into a base for 

combining marks in circumstances described in “Applications of Combining Marks” 

in <i><a href="#3_9_special_character_properties">Section 3.9, Special Character 

Properties (revision)</a></i> in this document. For 

any specified repertoire, implementation support for this capability can be 

provided by means of ligature tables in the font, or by means of special 

placement rules (see

<a href="http://partners.adobe.com/asn/developer/opentype/main.html">

http://partners.adobe.com/asn/developer/opentype/main.html</a>). Some display 

engines may be able to supply runtime generative support. As with other 

combining marks, there is considerable latitude for display depending on the 

environment (such as the choice of font). </p>
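<p>The properties mentioned above can be inspected programmatically. The 
following is a minimal Python sketch, assuming the standard library 
<tt>unicodedata</tt> module (which carries a later version of the Unicode 
Character Database than 3.2). The function name <tt>cgj_properties</tt> and the 
sample string are illustrative assumptions, not part of the Standard.</p>

```python
import unicodedata

CGJ = "\u034F"  # COMBINING GRAPHEME JOINER


def cgj_properties():
    """Collect the properties of U+034F that the text above relies on."""
    return {
        "name": unicodedata.name(CGJ),
        "general_category": unicodedata.category(CGJ),
        "canonical_combining_class": unicodedata.combining(CGJ),
    }


props = cgj_properties()
print(props)

# Hypothetical sample: c + CGJ + h, marking "ch" as one unit.
sample = "c" + CGJ + "h"
# Normalization leaves the joiner in place; it is not silently dropped.
print(unicodedata.normalize("NFC", sample) == sample)
```

<p>Because the joiner survives normalization, a process that tailors collation 
or grapheme boundaries can still find it after the text has been normalized.</p>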

<p>The combining grapheme joiner must not be confused with the <i>zero width 

joiner,</i> or the <i>word joiner,</i> which have very different functions. In 

particular, inserting a <i>combining grapheme joiner</i> between two characters 

has no effect on their ligation or cursive joining behavior.</p>

<h3>Word Joiner (U+2060) (addition)&nbsp;</h3>

<p>In Unicode 3.1.1 and before, the codepoint U+FEFF serves two very different 

purposes:</p>

<ul>

  <li>It is used as a zero-width non-breaking space (ZWNBSP), with applicability 

  across a wide range of scripts and usages.</li>

  <li>It is also used as a signature, with a very specific use at the start of 

  files or streams. See <i>Section 2.7, Special Character and Noncharacter 

  Values</i>, <i>Section 3.8, Transformations</i>, and <i>Section 13.6, Specials</i> 

  in <i>The Unicode Standard, Version 3.0</i>.</li>

</ul>

<p>If U+FEFF had only the semantic of a signature codepoint, it could be freely 

deleted from text without affecting the interpretation of the rest of the text. 

Carelessly appending files together, for example, can result in a signature 

codepoint in the middle of text. Unfortunately, U+FEFF also has significance as 

a character. As a ZWNBSP, it indicates that line breaks are not allowed between 

the adjoining characters. Thus U+FEFF impacts the interpretation of text, and 

cannot be freely deleted. The overloading of semantics for this codepoint has 

caused problems for programs and protocols.</p>

<p>The new character U+2060 WORD JOINER has the same semantics in all cases as 

U+FEFF, except that it <i>cannot</i> be used as a signature. That is, the 

function of the character is to indicate that the two adjacent characters should 

not be broken across lines. See the GL category in <a href="http://www.unicode.org/reports/tr14/">Unicode 

Standard Annex #14, “Line Breaking Properties”</a>. In other contexts the 

character should be ignored.</p>

<p>Unicode 3.2 implementations should support this new character, but also 

support the ZWNBSP semantic of U+FEFF.</p>

<p>Note: Implementers are strongly encouraged to use the word joiner 

whenever word joining semantics is intended.</p>
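<p>One possible way for a process to separate the two roles of U+FEFF is 
sketched below in Python. It rests on an assumption that is not stated in this 
document: only a U+FEFF at the very start of the text is treated as a signature, 
and every other occurrence is treated as ZWNBSP and rewritten as U+2060. The 
function name <tt>normalize_feff</tt> is hypothetical.</p>

```python
BOM = "\uFEFF"          # signature, and (historically) ZWNBSP
WORD_JOINER = "\u2060"  # WORD JOINER


def normalize_feff(text):
    """Drop one leading U+FEFF as a signature; convert the rest to U+2060.

    Assumption: only a U+FEFF at the very start of the text is a signature.
    """
    if text.startswith(BOM):
        text = text[1:]
    return text.replace(BOM, WORD_JOINER)


# Carelessly appending two signed files leaves a U+FEFF in mid-text.
first = BOM + "alpha"
second = BOM + "beta"
joined = normalize_feff(first + second)
print([f"U+{ord(ch):04X}" for ch in joined if not ch.isascii()])
```

<p>The interior U+FEFF in this example began life as a signature, but the 
process cannot know that, so the sketch conservatively keeps the no-break 
semantics. This is the ambiguity that the separate word joiner character is 
meant to remove.</p>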

<p>The word joiner must not be confused with the <i>zero width joiner</i> or the

<i>combining grapheme joiner,</i> which have very different functions. In 

particular, inserting a <i>word joiner</i> between two characters has no effect on 

their ligating or cursive joining behavior.</p>

<h3>Ligatures and Latin Typography (addition)</h3>

<p>It is the task of the rendering system to select a ligature (where ligatures 

are possible) as part of the task of creating the most pleasing line layout. 

Fonts that provide more ligatures give the rendering system more options.</p>

<p>However, defining the locations where ligatures are possible cannot be done 

by the rendering system, because there are many languages in which this depends 

not on simple letter pair context but on the meaning of the word in question.&nbsp;</p>

<p>ZWJ and ZWNJ are to be used for the latter task, marking the non-regular 

cases where ligatures are required or prohibited. This is different from 

selecting a degree of ligation for stylistic reasons. Such selection is best 

done with style markup. See

<a href="http://www.unicode.org/unicode/reports/tr20/">Unicode Technical Report 

#20, “Unicode in XML and other Markup Languages”</a> for more information.</p>

<h3><a name="13_7_variation_selectors">13.7 Variation Selectors</a> (new 

section)</h3>

<p>Unicode characters can be represented by a wide variety of glyphs, as

discussed in <i>Chapter 2, General Structure</i> in <i>The Unicode Standard,

Version 3.0</i>. Occasionally the need arises in text

processing to restrict or change the set of glyphs that are to be used to

represent a character. Normally such changes are indicated by choice of font or

style in rich-text documents. In special circumstances, such a variation from

the normal range of appearance needs to be expressed side-by-side in the same

document in plain-text contexts, where it is impossible or inconvenient to

exchange formatted text. For example, in languages employing the Mongolian

script, sometimes a specific variant range of glyphs is needed for a specific

textual purpose for which the range of “generic” glyphs is considered

inappropriate. The variation selectors are used when characters have essentially

the same semantic.</p>

<p>Variation selectors provide a mechanism for specifying a restriction on the

set of glyphs that are used to represent a particular character. They also

provide a mechanism for specifying variants, such as for CJK Ideographs and

Mongolian, that have essentially the same semantic but have substantially

different ranges of glyphs. A variation sequence, which always consists of a

base character followed by the variation selector, may be specified as part of

the Unicode Standard. That sequence is referred to as a <i>variant</i> of the

base character. The variation selector affects <i>only</i> the appearance of the

base character,* and only in the variation sequences defined in this Standard.

The variation selector is <i>not</i> used as a general code extension mechanism:</p>

<blockquote>

  <p><i>Only the variation sequences specifically defined in the Unicode

  Character Database in the file <a href="http://www.unicode.org/Public/3.2-Update/StandardizedVariants-3.2.0.html">StandardizedVariants.html</a>

  are sanctioned for standard use; in all other cases the variation selector

  cannot change the visual appearance of the preceding base character from what

  it would have had, in the absence of the variation selector.</i></p>

</blockquote>

<p>The base character in a variation sequence is never a combining character or

a decomposable character.* The variation selectors themselves are combining marks

of combining class 0, and are default ignorable characters. Thus if the

variation sequence is not supported, the variation selector should be invisible

and ignored. As with all default ignorable characters, this does not preclude

modes or environments where the variation selectors should be given visible

appearance. For example, a “Show Hidden” mode could reveal the presence of

such characters with specialized glyphs, or a particular environment could use or

require a visual indication of a base character (such as a wavy underline) to

show that it is part of a standardized variation sequence that cannot be

supported by the current font.</p>
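<p>The fallback behavior described above can be sketched as follows in Python. 
This is only an illustration under stated assumptions: the set of supported 
sequences stands in for whatever a real font or rendering engine reports, the 
function name <tt>drop_unsupported_selectors</tt> is hypothetical, and only the 
selectors U+FE00..U+FE0F are handled.</p>

```python
# Variation selectors U+FE00..U+FE0F (the Mongolian free variation
# selectors are left out of this sketch).
VARIATION_SELECTORS = {chr(cp) for cp in range(0xFE00, 0xFE10)}


def drop_unsupported_selectors(text, supported_sequences):
    """Remove variation selectors that do not complete a supported sequence."""
    out = []
    previous = ""
    for ch in text:
        if ch in VARIATION_SELECTORS and previous + ch not in supported_sequences:
            # Unsupported: the selector is ignored, the base is kept as is.
            continue
        out.append(ch)
        previous = ch
    return "".join(out)


# Hypothetical font that supports exactly one variation sequence.
SUPPORTED = {"\u2229\uFE00"}
print(ascii(drop_unsupported_selectors("A\uFE00 \u2229\uFE00", SUPPORTED)))
```

<p>A real renderer would more likely keep the selector in the backing store and 
simply give it no glyph; stripping it, as here, is just the easiest way to show 
that the base character is unaffected.</p>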

<p>The standardization or support of a particular variation sequence does <i>not</i>

limit the set of glyphs that can be used to represent the base character alone.

If a user <i>requires</i> a visual distinction between a character and a

particular variant of that character, then fonts must be used to make that

distinction. The existence of a variation sequence does not preclude the later

encoding of a new character with a distinct semantic and a similar or

overlapping range of glyphs.</p>



<blockquote>* Note: Just before publication, an inconsistency was discovered between the 

above principles and the standardization of the two variant sequences &lt;2278, 

FE00&gt; and &lt;2279, FE00&gt; because U+2278 and U+2279 are in fact decomposable 

characters. Those variant sequences denote glyph variants of these mathematical 

symbols with a vertical line instead of a slanted line as the diacritic to indicate the negation.

<p>The sequence &lt;2278, FE00&gt; is canonically equivalent to &lt;2276, 0338, FE00&gt;, and 

the sequence &lt;2279, FE00&gt; is canonically equivalent to &lt;2277, 0338, FE00&gt;. So 

that these equivalent sequences are given equivalent rendering treatment, the 

use of U+FE00 would have to be interpreted—exceptionally—as 

defining a variant appearance for the <i>entire</i> sequence.</p>

  <p>Because a combining vertical line overlay, U+20D2 COMBINING LONG VERTICAL LINE 

OVERLAY, is also available in the Standard, an alternate way of explicitly 

indicating these particular variants already exists. That alternative mechanism 

is a safer and more stable way to indicate the distinction, as the inherent 

complications in allowing variation selectors to follow combining marks may 

require future corrective action to remove the exceptional variant sequences 

&lt;2278, FE00&gt; and &lt;2279, FE00&gt; from the table.</p>

</blockquote>
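<p>The canonical equivalences discussed in the note can be checked with a short 
Python sketch. It assumes the standard library <tt>unicodedata</tt> module, 
whose data comes from a later version of the Unicode Character Database; the 
helper name <tt>nfd_code_points</tt> is hypothetical.</p>

```python
import unicodedata

VS1 = "\uFE00"  # VARIATION SELECTOR-1


def nfd_code_points(text):
    """Return the NFD form of text as a list of code point labels."""
    return [f"{ord(ch):04X}" for ch in unicodedata.normalize("NFD", text)]


print(nfd_code_points("\u2278" + VS1))        # precomposed base + VS1
print(nfd_code_points("\u2276\u0338" + VS1))  # already decomposed
print(nfd_code_points("\u2276\u20D2"))        # alternative with U+20D2
```

<p>The first two calls yield the same list, so after decomposition the selector 
follows the combining overlay rather than the base symbol. The third spelling 
contains no variation selector at all.</p>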



<h3><a name="14.1_character_names_list">14.1 Character Names List</a> (addition)</h3>

<p>Add the following text to the end of <i>Section 14.1, Character Names List</i> 

on page 335, <i>The Unicode Standard, Version 3.0</i>:</p>

<h3>Subheads</h3>

<p>The character names list contains a number of informative subheads which help divide up the list into smaller sublists of similar characters. For example, in the Miscellaneous Symbols block, U+2600..U+26FF, there are subheads for 

“Astrological symbols”, “Chess symbols”, and so on. Such subheads are editorial and informative, and should not be taken as providing any definitive, normative status information about characters in the sublists they mark, nor about any constraints on what characters could be encoded in the future at reserved code points within their ranges. 

The subheads are subject to change.</p>

<h2 class="bb"><a name="charts">V Code Charts</a></h2>

<p>The following code charts contain the characters added in Unicode 3.2.&nbsp;They 

are shown together with the characters that were part of Unicode 3.1. New 

characters are shown on a yellow background in these code charts.</p>

<ul>

  <li><a href="http://www.unicode.org/charts/PDF/U32-0180.pdf">Latin Extended-B</a></li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-0300.pdf">Combining 

  Diacritical Marks</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-0370.pdf">Greek and Coptic</a>

  </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-0400.pdf">Cyrillic</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-0500.pdf">Cyrillic 

  Supplement</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-0600.pdf">Arabic</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-0780.pdf">Thaana</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-10A0.pdf">Georgian</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-1700.pdf">Tagalog</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-1720.pdf">Hanunoo</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-1740.pdf">Buhid</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-1760.pdf">Tagbanwa</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2000.pdf">General 

  Punctuation</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2070.pdf">Superscripts and 

  Subscripts</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-20A0.pdf">Currency Symbols</a>

  </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-20D0.pdf">Combining 

  Diacritical Marks for Symbols</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2100.pdf">Letterlike 

  Symbols</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2190.pdf">Arrows</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2200.pdf">Mathematical 

  Operators</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2300.pdf">Miscellaneous 

  Technical</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2460.pdf">Enclosed 

  Alphanumerics</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2580.pdf">Block Elements</a>

  </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-25A0.pdf">Geometric Shapes</a>

  </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2600.pdf">Miscellaneous 

  Symbols</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2700.pdf">Dingbats</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-27C0.pdf">Miscellaneous 

  Mathematical Symbols-A</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-27F0.pdf">Supplemental 

  Arrows-A</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2900.pdf">Supplemental 

  Arrows-B</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2980.pdf">Miscellaneous 

  Mathematical Symbols-B</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-2A00.pdf">Supplemental 

  Mathematical Operators</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-3000.pdf">CJK Symbols and 

  Punctuation</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-3040.pdf">Hiragana</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-30A0.pdf">Katakana</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-31F0.pdf">Katakana Phonetic 

  Extensions</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-3200.pdf">Enclosed CJK 

  Letters and Months</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-A490.pdf">Yi Radicals</a>

  </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-F900.pdf">CJK Compatibility 

  Ideographs</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-FE70.pdf">Arabic 

  Presentation Forms-B</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-FE00.pdf">Variation 

  Selectors</a> </li>

  <li><a href="http://www.unicode.org/charts/PDF/U32-FE30.pdf">CJK Compatibility 

  Forms</a></li>

</ul>

<p>&nbsp; </p>

<blockquote>

  <table border="1" width="85%" cellpadding="3" cellspacing="0">

    <tr>

      <td width="85%" height="15">

      <p align="center"><b><i><u>Code Charts Notice:</u></i></b> </p>

      <p>Annotations for many characters have been added or revised throughout 

      the code charts. These are not mentioned explicitly in the list above. 

      Please see <a href="http://www.unicode.org/charts">

      http://www.unicode.org/charts</a> for a list of all code charts.</td>

    </tr>

  </table>

</blockquote>

<h2 class="bb"><a name="errata">VI Errata</a></h2>

<p>This article contains errata rolled up since the publication of <i>The 

Unicode Standard, Version 3.1</i>. These errata are listed by date in the table 

below. For prior errata from Unicode 3.1, see the errata listed in <i>Unicode 
Standard Annex #27: Unicode 3.1</i> (<a href="http://www.unicode.org/reports/tr27/#errata">http://www.unicode.org/reports/tr27/#errata</a>).</p>

<table border="1">

  <tr>

    <th width="20%">Date&nbsp;</th>

    <th width="85%">Summary&nbsp;</th>

  </tr>

  <tr>

    <td width="20%">2002 February 26</td>

    <td width="85%">Corrigendum #3: U+F951 Normalization posted.<br>

    NOTE: This corrigendum is incorporated in, and superseded by, this document.

    </td>

  </tr>

  <tr>

    <td width="20%">2002 January 18</td>

    <td width="85%">In UAX #27: Unicode 3.1, in Article IV, Guidelines under the 

    subsection Unassigned Code Points, “U+FFFC” should instead read “U+FFFB” in 

    the following sentence:<br>

    To allow a greater degree of compatibility across versions of the standard, 

    the ranges of U+2060..U+206F, U+FFF0..U+FFFB, and U+E0000..U+E0FFF are 

    reserved for format and control characters (General Category = Cf).</td>

  </tr>

  <tr>

    <td width="20%" valign="top">2001 September 25</td>

    <td width="80%">The character U+0B83 TAMIL SIGN VISARGA is actually a 

    stand-alone character, not a combining character. This character&#39;s General 

    Category has been changed from “Mc” to “Lo” in accordance with this. The 

    glyph on the left below shows the character in previous charts; the glyph on 

    the right shows the character as it should appear (without a dotted circle). See 
    <a href="http://www.unicode.org/charts/PDF/U32-0B80.pdf">http://www.unicode.org/charts/PDF/U32-0B80.pdf</a>.

    <p><img border="0" src="Tamil0B83-before.jpg" alt="prior U+0B83" width="149" height="171">&nbsp;&nbsp;&nbsp; 

    <img border="0" 

    src="Tamil0B83-after.jpg" alt="corrected U+0B83" width="149" height="171">

    </td>

  </tr>

  <tr>

    <td width="20%" valign="top">2001 April 25</td>

    <td width="80%">On p. 500, in the Unicode names list in TUS 3.0, the glyph 

    for U+2032 was omitted. It is shown correctly in the code chart on page 498; 
    see also <a href="http://www.unicode.org/charts/PDF/U2000.pdf">http://www.unicode.org/charts/PDF/U2000.pdf</a>.</td>

  </tr>

</table>

<h2 class="bb"><a name="database">VII Unicode Character Database Changes</a></h2>

<p>The main change to the <a href="http://www.unicode.org/Public/3.2-Update/">

Unicode Character Database for Unicode 3.2</a> is the extension of the data 

files to cover the additions to the character repertoire. This most importantly affects 

UnicodeData.txt, LineBreak.txt, and EastAsianWidth.txt, each of which has been 

extended to cover all the newly encoded characters. Also, an updated informative 

NamesList.txt file is provided to cover the new repertoire.</p>

<p><b><i>Property and Property Value Aliases.</i></b> The PropertyAliases and 

PropertyValueAliases files contain recommended UCD property identifiers 

and property value identifiers. These identifiers can be used for XML formats of 

UCD data, for regular-expression property tests, and other programmatic textual 

descriptions of Unicode data. In comparing identifiers, case differences should 

not be significant, and the presence or absence of an underbar should be 

ignored. The identifiers in the PropertyAliases and PropertyValueAliases files 

are normative in the following sense: </p>

<blockquote>

  <p>Where the identifiers are used to refer to Unicode properties or property 

  values, they can only be used&nbsp;in accordance with the Unicode Character 

  Database semantics.</p>

</blockquote>

<p>This does not prevent implementations from using other identifiers to refer 

to Unicode properties or property values. For example, there is nothing to prevent 

the use of French translations of the identifiers.</p>
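<p>The comparison rule for identifiers (ignore case, ignore underbars) can be 
sketched in Python as follows. The function names <tt>loose_key</tt> and 
<tt>same_identifier</tt> are hypothetical; this is one plausible reading of the 
rule, not a normative algorithm.</p>

```python
def loose_key(identifier):
    """Fold case and drop underbars, per the comparison rule above."""
    return identifier.replace("_", "").casefold()


def same_identifier(a, b):
    """True if two property identifiers match under loose comparison."""
    return loose_key(a) == loose_key(b)


print(same_identifier("Bidi_Mirrored", "bidimirrored"))
print(same_identifier("Soft_Dotted", "SOFTDOTTED"))
print(same_identifier("Soft_Dotted", "Deprecated"))
```

<p>Computing a key once per identifier, rather than comparing pairwise, lets the 
same function serve as a dictionary key when looking up aliases.</p>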

<p><b><i>Blocks.</i></b> The normative blocks defined in Blocks.txt have been 

adjusted slightly, in accordance with Unicode Technical Committee decisions.</p>

<ul>

  <li>Every block starts and ends on a column boundary. That is, the last digit 

  of the first code point in the block is always 0, and the last digit of the 

  final code point in the block is always F.</li>

  <li>Every block is contiguous. That is, if any two code points are in the same 

  block, then all intermediate code points are in that block. </li>

</ul>
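<p>The two block constraints lend themselves to a simple check. In the Python 
sketch below, each block is stored as a single (first, last) pair, which makes 
contiguity hold by construction; the column-boundary rule is then tested 
arithmetically. The dictionary layout and function names are illustrative 
assumptions, and only ranges named in this document are used.</p>

```python
def is_well_formed_block(first, last):
    """Check the column-boundary rule for a block's first and last code points."""
    starts_on_column = first % 16 == 0x0
    ends_on_column = last % 16 == 0xF
    return starts_on_column and ends_on_column and last > first


def block_of(code_point, blocks):
    """Return the name of the block containing code_point, or None."""
    for name, (first, last) in blocks.items():
        if code_point in range(first, last + 1):
            return name
    return None


# A few ranges named in this document; the dict layout is illustrative.
BLOCKS = {
    "Arrows": (0x2190, 0x21FF),
    "Miscellaneous Symbols": (0x2600, 0x26FF),
    "Dingbats": (0x2700, 0x27BF),
}
print(all(is_well_formed_block(lo, hi) for lo, hi in BLOCKS.values()))
print(block_of(0x2768, BLOCKS))
```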

<p>The block property values are listed in the Blocks datafile, and are not 

repeated in the PropertyValueAliases datafile. (Block property values should be 

used with caution; for more information see

<a href="http://www.unicode.org/reports/tr18/tr18-6d2.html">Unicode Technical 

Report #18, “Unicode Regular Expression Guidelines”</a>, Annex A.)</p>

<p>The notes for SpecialCasing.txt have been updated, and the rules for casing 

involving dotted letters (i, j) have been reformulated more generically.</p>

<p>An updated Index.txt has been provided, to make it easier to locate the newly 

added characters, particularly for mathematics.</p>

<h3>New Properties</h3>

<p>The following new property files have been added:</p>

<ul>

  <li>PropertyValueAliases and PropertyAliases: These contain recommended UCD 

  property names and property value names. These names can be used for XML 

  formats of UCD data, for regular-expression property tests, and other 

  programmatic textual descriptions of Unicode data.</li>

  <li>DerivedAge: This file shows when various code points were designated in 

  successive versions of the Unicode Standard.</li>

  <li>NormalizationCorrections: This file contains any corrections required
   to maintain backwards compatibility for normalization. Currently it
   lists code point differences for 
  <a href="http://www.unicode.org/versions/corrigendum3.html">Corrigendum #3: U+F951 Normalization</a>.</li>

</ul>

<p>Other new properties include:</p>

<ul>

  <li>Grapheme_Base, Grapheme_Extend, Grapheme_Link: For programmatic 
  determination of grapheme cluster boundaries.</li>

  <li>IDS_Binary_Operator, IDS_Trinary_Operator, Radical, Unified_Ideograph: For 

  a machine-readable list of Ideographic Description Sequences.</li>

  <li>Default_Ignorable_Code_Point: For programmatic determination of 

  default-ignorable code points. These code points are to be ignored by processes that do not explicitly support them. This 

  permits programs to be compatible with future assignments of such characters. 

  Ordinarily they are invisible, have no glyph, and have no advance width.</li>

  <li>Deprecated: For a machine-readable list of deprecated characters. No 

  characters will ever be removed from the standard, but the use of deprecated 
  characters is strongly 

  discouraged.</li>

  <li>Soft_Dotted: Characters with a “soft dot”, like <i>i</i> or <i>j</i>. An 

  accent placed on these characters causes the dot to disappear.</li>

  <li>Logical_Order_Exception: There are a small number of characters (in the 

  Thai and Lao scripts) that do not use logical order. These characters require 

  special handling in most processing.</li>

</ul>

<p>For more information on these new properties, see the relevant documentation 

in the Unicode Character Database.</p>

<p>Note: For consistency with the property naming conventions, the property <i>

BidiMirrored</i> has been renamed to <i>Bidi_Mirrored</i> (see 

DerivedBinaryProperties.txt). Also the property <i>Comp_Ex</i> has been renamed 

to <i>Full_Composition_Exclusion</i> (see DerivedNormalizationProps.txt).</p>

<h3>File Name Length Restriction</h3>

<p>For cross-platform interoperability, the file names will be restricted to no 

more than 31 characters in length. Due to this change in policy, 

DerivedNormalizationProps.txt is the new file name for the file formerly known 

as DerivedNormalizationProperties.txt.</p>
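<p>The effect of the restriction on the two file names mentioned above can be 
verified with a few lines of Python. The constant and function names are 
hypothetical.</p>

```python
MAX_FILE_NAME_LENGTH = 31  # limit stated above


def fits_limit(file_name):
    """True if the file name is within the 31-character restriction."""
    return MAX_FILE_NAME_LENGTH >= len(file_name)


OLD_NAME = "DerivedNormalizationProperties.txt"
NEW_NAME = "DerivedNormalizationProps.txt"
for name in (OLD_NAME, NEW_NAME):
    print(name, len(name), fits_limit(name))
```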

<p>The documentation files for the Unicode Character Database have been updated 

to reflect the additions of new property files and new character properties to 

existing files, and the new file name length restriction.</p>

<h2 class="bb"><a name="relation">VIII Relation to ISO/IEC 10646</a></h2>

<p>ISO/IEC 10646 is a multi-part standard. Part 1, published as ISO/IEC 

10646-1:2000(E), covers the Architecture and Basic Multilingual Plane. Part 2, 

published as ISO/IEC 10646-2:2001(E), covers the supplementary planes. Amendment 

1 to Part 1 makes a few modifications to the architecture of 10646 and adds 

about a thousand characters to the BMP.&nbsp;</p>

<p>Unicode 3.2 contains all of the characters of Amendment 1, including the two 

characters of Amendment 1 that had already been added to Unicode 3.1. With the 

publication of Amendment 1 to ISO/IEC 10646-1:2000 and the Unicode Standard, 

Version 3.2, the two standards are fully synchronized.&nbsp;</p>

<p>The Unicode Consortium and ISO/IEC JTC1/SC2/WG2 are committed to maintaining 

the synchronization between the two standards.&nbsp;</p>

<p>Notable among the architectural changes to ISO/IEC 10646 approved in 

Amendment 1 are:&nbsp;</p>

<ul>

  <li>The range of characters available for private use has been restricted to 

  those characters accessible via UTF-16, and the intent not to encode 

  characters past Plane 16 has been clarified. This guarantees the 

  interoperability of UTF-8 and UTF-16, and the equivalence of UTF-32 and UCS-4.</li>

  <li>The definition of UCS short identifiers has been modified and UCS sequence 

  identifiers have been added. This brings 10646 in line with Unicode 

  conventions for representing characters and sequences of characters.</li>

  <li>The clause reserving characters for internal use has been updated, so that 

  the 10646 specification is in line with the Unicode specification of 

  noncharacters, including the noncharacters at U+FDD0..U+FDEF.</li>

</ul>
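<p>The consequences of these changes can be sketched in code. The function names 
below are hypothetical. The sketch assumes the usual Unicode conventions: 
code points run from U+0000 to U+10FFFF (the end of Plane 16), the 
noncharacters are U+FDD0..U+FDEF plus the last two code points of every plane, 
and supplementary code points are represented in UTF-16 as a surrogate pair.</p>

```python
# Illustrative sketch only; names are hypothetical, not from either standard.
MAX_CODE_POINT = 0x10FFFF  # last code point of Plane 16


def is_noncharacter(cp: int) -> bool:
    """U+FDD0..U+FDEF, plus U+nFFFE and U+nFFFF in each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_private_use(cp: int) -> bool:
    """Private use ranges, all of which are reachable through UTF-16."""
    return (
        0xE000 <= cp <= 0xF8FF          # BMP private use area
        or 0xF0000 <= cp <= 0xFFFFD     # Plane 15
        or 0x100000 <= cp <= 0x10FFFD   # Plane 16
    )


def to_utf16(cp: int) -> list:
    """Encode one code point as a list of UTF-16 code units."""
    if not 0 <= cp <= MAX_CODE_POINT:
        raise ValueError("code point is past Plane 16")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if cp < 0x10000:
        return [cp]
    offset = cp - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]
```

<p>Under these assumptions, every code point that either standard can assign 
has a UTF-16 form, which is why the restriction to Plane 16 makes the encoding 
forms interoperable.</p>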

<h2 class="bb"><a name="references">IX References and Sources</a></h2>

<h3>Standards and Specifications</h3>

<p>ISO/IEC 9573-13: International Organization for Standardization. <i>

Information technology—SGML support facilities—Techniques for using SGML—Part 

13: Public entity sets for mathematics and science.</i> [Geneva], 1991. (ISO/IEC 

TR 9573-13:1991).</p>

<p>ISO/IEC 9995-7: <i>Information technology—Keyboard layouts for text and 

office systems—Part 7: Symbols used to represent functions</i>. [Geneva], 1994. 

(ISO/IEC 9995-7:1994).</p>

<p>ISO/IEC 14651: International Organization for Standardization. <i>Information 

technology—International string ordering and comparison—Method for comparing 

character strings and description of the common template tailorable ordering</i>. 

[Geneva], 2001. (ISO/IEC 14651:2001).</p>

<p>JIS X 0213: Japanese Industrial Standards Committee. <i>7 bitto oyobi 8 bitto 

no 2 baito jouhou koukan you fugouka kakuchou kanji shuugou</i> (<i>7-bit and 

8-bit double byte coded extended KANJI sets for information interchange</i>). 

Tokyo, 2000. (JIS X 0213:2000).</p>

<h3>Other References and Sources</h3>

<p> <i>Doctrina christiana: the first book printed in the Philippines, Manila 

1593.</i>&nbsp; A facsimile of the copy in the Lessing J. 

Rosenwald Collection...with an introductory essay by Edwin Wolf II.  Washington, DC, Library of Congress, 

1947.</p>

<p>Kuipers, Joel C., and Ray McDermott. “Insular Southeast Asian Scripts.” In <i>The World’s Writing Systems</i>. 

Edited by Peter T. Daniels and William Bright. New York, Oxford University Press, 

1996. ISBN 0-19-507993-0.</p>

<p>Santos, Hector. <i>The Living Scripts</i>. Los Angeles: Sushi Dog Graphics, 

1995. (Ancient Philippine Scripts Series; 2).<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

User’s guide accompanying <i>Computer Fonts, Living Scripts</i> software.</p>

<p>Santos, Hector. <i>Our Living Scripts</i>. January 31, 1997. <br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

<a href="http://www.bibingka.com/dahon/living/living.htm">http://www.bibingka.com/dahon/living/living.htm

</a>

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

     Part of his <i>A Philippine Leaf</i>.</p>

<p>Santos, Hector. <i>The Tagalog Script</i>. Los Angeles: Sushi Dog Graphics, 

1994. (Ancient Philippine Scripts Series; 1).

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

User’s guide accompanying <i>Tagalog Script Fonts</i> software.</p>

<p>Santos, Hector. <i>The Tagalog Script</i>. October 26, 1996. <br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

<a href="http://www.bibingka.com/dahon/tagalog/tagalog.htm">http://www.bibingka.com/dahon/tagalog/tagalog.htm</a>

<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

     Part of his <i>A Philippine Leaf</i>.</p>

<p>STIPUB Consortium. STIX (Scientific and Technical Information Exchange) 

Project.&nbsp;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

<a href="http://www.ams.org/STIX/">http://www.ams.org/STIX/</a></p>

<h2><a name="Modifications">X Modifications</a></h2>

<p>The following summarizes modifications from the previous version of this 

document. Modifications to this document will be limited to repairing 

straightforward typographical and production errors. Updates in content will be 

carried out via a future version of the Unicode Standard, published in a 

separate document.</p>

<table cellspacing="4" cellpadding="0" width="100%" border="0" class="noborder" style="border-collapse: collapse">

  <tr>

    <td valign="top" width="1" class="noborder"><a name="tracking_number">3</a></td>

    <td valign="top" class="noborder">

    <ul>

      <li>None</li>

    </ul>

    </td>

  </tr>

</table>

<hr align="LEFT">

<p><font size="-1">Copyright © 2001-2002 Unicode, Inc. All Rights Reserved. The 

Unicode Consortium makes no expressed or implied warranty of any kind, and 

assumes no liability for errors or omissions. No liability is assumed for 

incidental and consequential damages in connection with or arising out of the 

use of the information or programs contained or accompanying this technical 

report.</font></p>

<p><font size="-1">Unicode and the Unicode logo are trademarks of Unicode, Inc., 

and are registered in some jurisdictions.</font></p>

</div>

</body>


</html>