<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head><base href="https://www.unicode.org/reports/tr56/tr56-5.html">
<link rel="stylesheet" href="https://www.unicode.org/reports/reports-v2.css" type="text/css">
<title>UTR #56: Unicode Cuneiform Sign Lists</title>
<style type="text/css">
.xsux-highlight {
text-decoration: underline;
color: red;
}
@font-face {
font-family: "OB Freie";
src: url("fonts/OBFreie-Regular.ttf");
}
@font-face {
font-family: "Oracc RSP";
src: url("fonts/Oracc-RSP.ttf");
}
@font-face {
font-family: "CuneiformNAOutline";
src: url("fonts/CuneiformNA.ttf");
}
@font-face {
font-family: "Nabuninuaihsus";
src: url("fonts/Nabuninuaihsus.ttf");
}
.obfreie {
font-family: 'OB Freie';
font-size: 110%;
}
.cuneiformnaoutline {
font-family: CuneiformNAOutline;
font-size: 90%;
}
.nabuninuaihsus {
font-family: Nabuninuaihsus;
font-weight: normal;
}
.oraccrsp {
font-family: 'Oracc RSP';
font-size: 120%;
}
</style>
</head>
<body>
<table class="header">
<tr>
<td class="icon" style="width:38px; height:35px">
<a href="https://www.unicode.org/">
<img border="0" src="https://www.unicode.org/webscripts/logo60s2.gif" align="middle"
alt="[Unicode]" width="34" height="33">
</a>
</td>
<td class="icon" style="vertical-align:middle">
<a class="bar"> </a>
<a class="bar" href="https://www.unicode.org/reports/"><font size="3">Technical Reports</font></a>
</td>
</tr>
<tr>
<td colspan="2" class="gray"> </td>
</tr>
</table>
<div class="body">
<h2 align="center">Unicode Technical Report #56</h2>
<h1>Unicode® Cuneiform Sign Lists</h1>
<table class="simple" width="90%">
<tr>
<td valign="top" width="20%">Editors</td>
<td valign="top">
Robin Leroy 𒉭 (<a href="mailto:eggrobin@unicode.org">eggrobin@unicode.org</a>)
</td>
</tr>
<tr>
<td valign="top" width="20%">Date</td>
<td valign="top">2025-10-29</td>
</tr>
<tr>
<td valign="top" width="20%">This Version</td>
<td valign="top">
<a href="https://www.unicode.org/reports/tr56/tr56-5.html">https://www.unicode.org/reports/tr56/tr56-5.html</a>
</td>
</tr>
<tr>
<td valign="top" width="20%">Previous Version</td>
<td valign="top">
<a href="https://www.unicode.org/reports/tr56/tr56-3.html">https://www.unicode.org/reports/tr56/tr56-3.html</a>
</td>
</tr>
<tr>
<td valign="top" width="20%">Latest Version</td>
<td valign="top"><a href="https://www.unicode.org/reports/tr56/">https://www.unicode.org/reports/tr56/</a></td>
</tr>
<tr>
<td valign="top" width="20%">Latest Proposed Update</td>
<td valign="top">
<a href="https://www.unicode.org/reports/tr56/proposed.html">https://www.unicode.org/reports/tr56/proposed.html</a>
</td>
</tr>
<tr>
<td valign="top" width="20%">Revision</td>
<td valign="top"><a href="#Modifications">5</a></td>
</tr>
</table>
<h4><a name="Summary" href="#Summary">Summary</a></h4>
<p>
<i>
This document outlines the need for ancillary data in the use of the Sumero-Akkadian Cuneiform
script, and describes how the Oracc Sign List provides that data.
</i>
</p>
<h4>Status</h4>
<!-- NOT YET APPROVED
<p class="changed">
<i>
This is a<b><font color="#ff3333"> draft </font></b>document
which may be updated, replaced, or superseded by other documents at
any time. Publication does not imply endorsement by the Unicode
Consortium. This is not a stable document; it is inappropriate to
cite this document as other than a work in progress.
</i>
</p>
END NOT YET APPROVED -->
<!-- APPROVED -->
<p>
<i>
This document has been reviewed by Unicode members and other
interested parties, and has been approved for publication by the
Unicode Consortium. This is a stable document and may be used as
reference material or cited as a normative reference by other
specifications.
</i>
</p>
<!-- END APPROVED -->
<blockquote>
<p>
<i>
<b>A Unicode Technical Report (UTR)</b> contains
informative material. Conformance to the Unicode Standard does not
imply conformance to any UTR. Other specifications, however, are
free to make normative references to a UTR.
</i>
</p>
</blockquote>
<p>
<i>
Please submit corrigenda and other comments with the online reporting
form [<a href="https://www.unicode.org/reporting.html">Feedback</a>]. Related information that is useful in
understanding this document is found in the <a href="#References">References</a>.
For the latest version of the Unicode Standard see [<a href="https://www.unicode.org/versions/latest/">Unicode</a>].
For a list of current Unicode Technical Reports see [<a href="https://www.unicode.org/reports/">Reports</a>].
For more information about versions of the Unicode Standard, see [<a href="https://www.unicode.org/versions/">Versions</a>].
</i>
</p>
<h4 class="contents">Contents</h4>
<!--TOC-->
<ul class="toc">
<li>
1 <a href="#Introduction">Introduction</a>
</li>
<li>
2 <a href="#Principles_of_Cuneiform_Encoding">Principles of Cuneiform Encoding</a>
<ul class="toc">
<li>
2.1 <a href="#Cuneiform_Signs">Cuneiform Signs</a>
<ul class="toc">
<li>
2.1.1 <a href="#Transliteration">Transliteration</a>
</li>
<li>
2.1.2 <a href="#Numerals">Numerals</a>
</li>
</ul>
</li>
<li>
2.2 <a href="#Sequences">Sequences</a>
</li>
<li>
2.3 <a href="#Mergers_and_Splits">Mergers and Splits</a>
<ul class="toc">
<li>
2.3.1 <a href="#Mergers_and_Splits_of_Sequences">Mergers and Splits of Sequences</a>
</li>
</ul>
</li>
<li>
2.4 <a href="#Representative_Glyphs">Representative Glyphs</a>
</li>
<li>
2.5 <a href="#Sign_Names">Sign Names</a>
</li>
<li>
2.6 <a href="#Ligatures">Ligatures</a>
<ul class="toc">
<li>
2.6.1 <a href="#Discretionary_Ligatures">Discretionary Ligatures</a>
</li>
</ul>
</li>
</ul>
</li>
<li>
3 <a href="#The_Oracc_Sign_List">The Oracc Sign List</a>
</li>
<li>
<a href="#References">References</a>
</li>
<li>
<a href="#Acknowledgements">Acknowledgements</a>
</li>
<li>
<a href="#Modifications">Modifications</a>
</li>
</ul>
<!--TOC-->
<!--end TOC-->
<h2>1 <a name="Introduction" href="#Introduction">Introduction</a></h2>
<p>
The Unicode Standard formally establishes the character identity of cuneiform signs by means of their names
and representative glyphs in the code charts; see D2 in <cite>Section 3.3, <a href="https://www.unicode.org/versions/latest/core-spec/#G19002">Semantics</a></cite>, in [<a href="#Unicode">Unicode</a>].
However, while the identity of abstract characters is well-established in the cuneiform script, the abstract
characters are not usually referred to by standardized names, and the glyphic ranges of the abstract characters
are vast and overlapping.
</p>
<p>
In practice, implementations of the script require an association of sequences of code points with entries in
the classical sign lists that establish abstract character identity, and with the sign values which provide the
usual names of these signs. Similar reliance on ancillary data may be found in other large scripts; see for
instance Unicode Standard Annex #38, “Unicode Han Database (Unihan)” [<a href="#UAX38">UAX38</a>].
</p>
<p>
This document briefly discusses the approach to the complexities of cuneiform sign identity taken by the
encoding; it then describes the sign list maintained by the Open Richly Annotated Cuneiform Project (Oracc)
which provides the ancillary data necessary to the effective use of the encoded script.
</p>
<h2>2 <a name="Principles_of_Cuneiform_Encoding" href="#Principles_of_Cuneiform_Encoding">Principles of Cuneiform Encoding</a></h2>
<h3>2.1 <a name="Cuneiform_Signs" href="#Cuneiform_Signs">Cuneiform Signs</a></h3>
<p>
Assyriologists have published many <em>sign lists</em>, that is, classifications of the repertoire of cuneiform signs; these
are numbered lists of signs, each illustrated with its glyphic range in the area and time period of interest, and
often associated with a representative glyph from the Neo-Assyrian period and with the phonetic and
logographic values of the sign.
The sign lists play a similar role to the <em>sources</em> used in the CJKV or Tangut encodings.
</p>
<p>
Examples of such sign lists include [<a href="#aBZL">aBZL</a>], [<a href="#BAU">BAU</a>], [<a href="#ELLes">ELLes</a>], [<a href="#HZL">HZL</a>], [<a href="#KWU">KWU</a>], [<a href="#LAK">LAK</a>], [<a href="#MÉA">MÉA</a>], [<a href="#MZL">MZL</a>], [<a href="#PTACE">PTACE</a>], [<a href="#RÉC">RÉC</a>], [<a href="#RSP">RSP</a>], [<a href="#ŠL">ŠL</a>], and [<a href="#ZATU">ZATU</a>].
Notably, [<a href="#ŠL">ŠL</a>] and [<a href="#MÉA">MÉA</a>] use the same numbering; however, the other sign lists have different numbering schemes.
</p>
<p>
The glyphic range of a sign is stylistic, encompassing for instance variation between lapidary inscriptions and
cursive on clay tablets, regional variation, and variation between time periods.
This is illustrated in <a href="#Glyphs-𒈾">Figure 1</a>,
which shows glyphs given in [<a href="#MÉA">MÉA</a>] for the sign NA 𒈾 in three styles:
</p>
<ul>
<li>Old Babylonian lapidary (a)</li>
<li>Old Babylonian cursive (b)</li>
<li>Neo-Assyrian (c)</li>
</ul>
<p>Distinct glyphs
for the same sign are not used contrastively, nor do they co-occur in texts that use a consistent style. In
particular, for a given sign, the various phonetic and logographic values are not distinguished by contrasting
glyphs.
</p>
<p class="caption">Figure 1. <a name="Glyphs-𒈾" href="#Glyphs-𒈾">Glyphs for the sign NA 𒈾</a>.</p>
<div class="center">
<img src="images/NA-MÉA.png" alt="Three different glyphs for the sign 𒈾.">
</div>
<p>
These signs are the abstract characters of the cuneiform script. See also point 5 in [<a href="#ICE">ICE</a>].
This approach makes it possible to encode texts known from multiple copies
(so-called <em>composite texts</em>) that use different styles but consistent spellings,
or to use encoded text to refer to the signs diachronically,
as in dictionaries or sign lists covering broad timespans.
</p>
<h4>2.1.1 <a name="Transliteration" href="#Transliteration">Transliteration</a></h4>
<p>
Texts are often published in transliterated form; the scheme for transliteration (and for the notation of sign
values) originates with Thureau-Dangin’s [<a href="#Syllabaire">Syllabaire</a>]. It uses numeric subscripts to distinguish homophones;
the numbering of homophones is kept consistent across sign lists.
</p>
<p>
Note that accents can be used interchangeably with numbers (ú for u₂, ù for u₃), and additional information
about the interpretation of signs is conveyed by capitalization and styling; a discussion of the specifics of
assyriological transliteration is out of scope for this document.
</p>
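<p>
The equivalence between accents and subscript numbers can be sketched as a small normalization step.
This is an illustration only: the function names are hypothetical, the acute-to-₂ and grave-to-₃ mapping follows the examples ú and ù given above,
and the restriction to accents that sit on a vowel is an assumption made so that other diacritics (as in š) are left alone.
</p>

```python
import unicodedata

ACUTE, GRAVE = "\u0301", "\u0300"
# Assumption: acute stands for index 2 and grave for index 3, as in ú = u₂ and ù = u₃.
SUBSCRIPT = {ACUTE: "₂", GRAVE: "₃"}
VOWELS = set("aeiuAEIU")


def accents_to_subscripts(value: str) -> str:
    """Rewrite one sign value, e.g. 'ú' becomes 'u₂'; unaccented values pass through."""
    decomposed = unicodedata.normalize("NFD", value)
    kept = []
    index = ""
    for ch in decomposed:
        if ch in SUBSCRIPT and kept and kept[-1] in VOWELS and not index:
            index = SUBSCRIPT[ch]  # remember the index and drop the accent
        else:
            kept.append(ch)  # keeps letters and other marks such as the caron of š
    return unicodedata.normalize("NFC", "".join(kept)) + index


def normalize_transliteration(text: str) -> str:
    """Apply the rewrite to every hyphen-separated sign value."""
    return "-".join(accents_to_subscripts(value) for value in text.split("-"))


print(normalize_transliteration("ib-bu-ú"))
```

<p>
Normalizing to subscripts before any lookup means that a single table keyed by numbered values serves both notations.
</p>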
<p>
Thanks to this numbering, a transliteration uniquely determines the sequence of signs of the original text. For
example, the transliterations <i>ib-bu-u</i>₂ and <i>ib-bu-u</i> of distinct spellings of Akkadian <i>ibbû</i> “they named” are
unambiguously transliterations of the sequences of signs 𒅁𒁍𒌑 and 𒅁𒁍𒌋, respectively. Note that
while they share the phonetic value /u/, the signs U₂ 𒌑 and U 𒌋 are not stylistic variants of each other: they
have distinct sets of values and meanings; for instance, 𒌑 means “grass” and 𒌋 means the number 10,
meanings that are not shared with the other sign.
</p>
<p>
This relation between transliteration and abstract characters means that encoded cuneiform texts can
normally be
automatically generated from transliterated corpora. The reverse is not true; for instance, the sign 𒀸 might
be transliterated <i>aš</i>, <i>ina</i>, or <i>dil</i>, depending on context.
</p>
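<p>
The asymmetry described above can be made concrete with a toy lookup table.
The sketch below is not a real sign list: the table holds only the values mentioned in this section, and the names
<code>SIGN_FOR_VALUE</code>, <code>to_cuneiform</code>, and <code>values_of</code> are hypothetical.
</p>

```python
# Toy value-to-sign table limited to the examples in this section (assumption:
# a real implementation would load these associations from a sign list).
SIGN_FOR_VALUE = {
    "ib": "𒅁",
    "bu": "𒁍",
    "u₂": "𒌑",
    "u": "𒌋",
    "aš": "𒀸",
    "ina": "𒀸",
    "dil": "𒀸",
}


def to_cuneiform(transliteration: str) -> str:
    """Each numbered value names exactly one sign, so this direction is a plain lookup."""
    return "".join(SIGN_FOR_VALUE[value] for value in transliteration.split("-"))


def values_of(sign: str) -> list:
    """The reverse direction yields several candidates; context must choose among them."""
    return sorted(value for value, s in SIGN_FOR_VALUE.items() if s == sign)


print(to_cuneiform("ib-bu-u₂"), to_cuneiform("ib-bu-u"))
print(values_of("𒀸"))
```

<p>
The forward function returns a single string, whereas the reverse function can only return a list of candidate values.
</p>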
<div>
<p>
There are occasional exceptions where a typical transliteration does not
suffice to determine the cuneiform text. An example is the Eblaite version
of the sign DIRI; DIRI is normally the sequence 𒋛𒀀 SI.A, but is written
𒀀𒋛 A.SI in Ebla instead, while still being transliterated diri or dirig
in the literature on Ebla.
When generating cuneiform from transliterations, either information about
the provenience of the text should be taken into account to disambiguate
these cases, or the transliterations should be adjusted to disambiguate.
For instance, the Oracc Digital Corpus of Cuneiform Lexical Texts uses
the transliteration dirig(A.SI) to unambiguously represent Eblaite dirig.
</p>
</div>
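<p>
One possible way to handle the DIRI exception when generating cuneiform is sketched below.
The function name and the <code>provenience</code> parameter are hypothetical; only the two sign orders and the
explicit form dirig(A.SI) come from the text above.
</p>

```python
DIRI_DEFAULT = "𒋛𒀀"  # SI.A, the usual writing
DIRI_EBLA = "𒀀𒋛"     # A.SI, the Eblaite writing


def diri_signs(value: str, provenience: str = None) -> str:
    """Pick the sign order for diri/dirig.

    An explicit qualifier in the transliteration wins; otherwise the
    provenience of the text (hypothetical metadata field) decides.
    """
    if value == "dirig(A.SI)":
        return DIRI_EBLA
    if value in ("diri", "dirig"):
        return DIRI_EBLA if provenience == "Ebla" else DIRI_DEFAULT
    raise ValueError("not a DIRI value: " + value)


print(diri_signs("dirig"))
print(diri_signs("dirig", provenience="Ebla"))
print(diri_signs("dirig(A.SI)"))
```

<p>
Both disambiguation strategies mentioned above are covered: metadata about the text, or an adjusted transliteration.
</p>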
<p>
A machine-readable format for cuneiform transliteration exists to facilitate such automatic processing of
transliterated corpora. See [<a href="#ATF">ATF</a>].
</p>
<div>
<h4>2.1.2 <a name="Numerals" href="#Numerals">Numerals</a></h4>
<p>
The transliteration of numbers is less standardized. Transliterations
that merely record the numeric value without also indicating the type of
sign used cannot generally be used to automatically produce cuneiform
text: in such a transliteration, 𒀸 and 𒁹 could both be transliterated as
“1”.
</p>
<p>
Other transliterations record the type of numeral, often together with
an interpretation as part of a metrological system.
For instance, in [<a href="#ATF">ATF</a>], 𒁹 could be transliterated as
1(barig) if it is a volume measure, or as 1(diš) if it is a count;
𒀸 could be transliterated as 1(iku) as an area measure, or as 1(aš) as
a count.
These transliterations can be used to automatically produce cuneiform
text.
However, conventions differ as to whether the actual numeric value or only
the multiplicity of the sign is recorded in the transliteration:
[<a href="#ATF">ATF</a>] uses “1(u) 5(aš)” to transliterate 15 written
𒌋𒐃, whereas other systems use “10(U) 5(AŠ)”.
For corpora where the sexagesimal place value system is dominant, in
particular in the first millennium, [<a href="#ATF">ATF</a>] allows for
the sexagesimal places to be written in a so-called diš-less notation,
wherein 1 implicitly represents 1(diš) 𒁹.
Each sexagesimal place is a decimal number in the range 1–60, which
corresponds to one or two cuneiform signs: 10 represents 𒌋, and 32
represents the sequence 𒌍𒈫.
Note that even in corpora that use diš-less notation, other types of
numerals are transliterated in a qualified form, so that the type of
numeric sign used remains unambiguous: the same text may have 15 ANŠE for
<span class="nabuninuaihsus">𒌋𒐊𒀲</span> (15 donkey-loads) and
1(u) 5(aš) GUN for <span class="nabuninuaihsus">𒌋𒐃𒄘𒌦</span> (15 talents).
See the <cite><a href="https://oracc.museum.upenn.edu/doc/help/editinginatf/metrology/index.html">Metrology</a></cite> page in [<a href="#ATF">ATF</a>].
Implementers should document what conventions they expect for numeric
transliterations.
</p>
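<p>
The two conventions discussed above, qualified numerals and diš-less sexagesimal places, can be sketched as follows.
This is a minimal illustration under stated assumptions: the table contains only signs quoted in this section, the function names are hypothetical,
and the parser handles only tokens of the form count(unit), not the full [<a href="#ATF">ATF</a>] syntax.
</p>

```python
import re

# (multiplicity, unit) pairs limited to the signs quoted in this section.
NUMERAL_SIGNS = {
    (1, "u"): "𒌋",
    (3, "u"): "𒌍",
    (1, "diš"): "𒁹",
    (2, "diš"): "𒈫",
    (5, "diš"): "𒐊",
    (1, "aš"): "𒀸",
    (5, "aš"): "𒐃",
}

TOKEN = re.compile(r"(\d+)\(([^)]+)\)")


def qualified_to_cuneiform(text: str) -> str:
    """Convert qualified numerals such as '1(u) 5(aš)' sign by sign."""
    return "".join(
        NUMERAL_SIGNS[(int(count), unit)] for count, unit in TOKEN.findall(text)
    )


def dishless_place(place: int) -> str:
    """Convert one diš-less sexagesimal place: tens become N(u), units N(diš)."""
    tens, units = divmod(place, 10)
    signs = ""
    if tens:
        signs += NUMERAL_SIGNS[(tens, "u")]
    if units:
        signs += NUMERAL_SIGNS[(units, "diš")]
    return signs


print(qualified_to_cuneiform("1(u) 5(aš)"))
print(dishless_place(10), dishless_place(32), dishless_place(15))
```

<p>
Note that the same number 15 yields different signs in the two notations, which is why the unit qualifier cannot be dropped outside diš-less contexts.
</p>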
<blockquote>
<b>Note:</b> The Numeric_Value property of cuneiform signs corresponds to
the multiplicity of the sign, rather than the numeric value represented,
which depends on the metrological system. The sign U 𒌋 thus has
Numeric_Value=1, rather than Numeric_Value=10.
See <cite><a href="https://www.unicode.org/versions/Unicode17.0.0/core-spec/chapter-11/#G26965">Cuneiform Numerals</a></cite> in <cite>Section 11.1.2, Cuneiform Numbers and Punctuation</cite>, of [<a href="#Unicode">Unicode</a>].
</blockquote>
<p>
An additional complication when producing cuneiform text from
transliterations of numeric expressions is that some variant stacking
patterns for cuneiform numerals are separately encoded, even though they
are rarely marked in transliteration.
For instance, a transliteration 4(diš) can correspond to either U+12409 𒐉
or U+1243C 𒐼; likewise 7(diš), 8(diš), and 9(diš) can correspond to either
𒐌, 𒐍, 𒐎, or to 𒑂, 𒑄, 𒑆.
The stacking pattern used primarily depends on the period and style;
the style with rows of at most three wedges is more common in the
Neo-Assyrian period, the style with two rows is more common in the Ur III
period. When automatically generating cuneiform text from
transliterations of Neo-Assyrian texts, 4(diš) should therefore generally
be taken to correspond to 𒐼 rather than 𒐉.
</p>
<p>
There are some corpora where a contrast is recorded in transliteration
between the 𒐼 and 𒐉 families of stacking patterns; these co-occur in some
Ur III texts where the 𒐼 family is used in scratch calculations and the 𒐉
family is used in results. In that case, the 𒐼 family is transliterated
as a variant, thus 4(diš@v) in [<a href="#ATF">ATF</a>].
This convention is reflected in [<a href="#OSL">OSL</a>], as well as in the
character names: U+1243C 𒐼 is CUNEIFORM NUMERIC SIGN FOUR VARIANT FORM LIMMU, whereas
U+12409 𒐉 is plain CUNEIFORM NUMERIC SIGN FOUR DISH.
</p>
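<p>
The choice between the two stacking patterns for 4(diš) can be sketched as a small decision function.
The function name and the period labels are hypothetical; the code points, the @v convention, and the character names are those cited above,
and the names are read back from Python's standard <code>unicodedata</code> module.
</p>

```python
import unicodedata

FOUR_DISH = "\U00012409"   # 𒐉
FOUR_LIMMU = "\U0001243C"  # 𒐼


def four_dish(token: str, period: str) -> str:
    """Choose a stacking pattern for a transliterated 4(diš).

    An explicit variant marker always selects the variant form; an unmarked
    token falls back to the pattern that is usual for the period.
    """
    if token == "4(diš@v)":
        return FOUR_LIMMU
    if token == "4(diš)":
        return FOUR_LIMMU if period == "Neo-Assyrian" else FOUR_DISH
    raise ValueError("unsupported token: " + token)


for sign in (FOUR_DISH, FOUR_LIMMU):
    print(sign, unicodedata.name(sign), unicodedata.numeric(sign))
```

<p>
Both characters carry the same numeric value, so the distinction is purely one of stacking pattern.
</p>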
<p>
The main reason for the disunification of stacking patterns, which would
normally be a stylistic distinction, is the representability of sign lists
that distinguish them, but otherwise present all signs in a consistent
style; in particular, [<a href="#MZL">MZL</a>], whose cuneiform text is in
Neo-Assyrian style, assigns different sign list numbers and sometimes
different values to the variant stacking patterns: 𒐼 is number 860 with
the value limmu, and 𒐉 is number 852 with the value limmu₅.
Since that need does not extend to earlier periods, the stacking patterns
used in the Early Dynastic period are not separately encoded, and the
default versions of numeric signs should be used in these periods.
For instance, the character U+12399 𒎙 should be used for Early Dynastic
2(u), even though the two stylus impressions are normally stacked
vertically rather than horizontally in Early Dynastic tablets:
the character U+12399 has the glyph <span class="oraccrsp">𒎙</span> in
the Early Dynastic font [<a href="#OFS-RSP">OFS-RSP</a>].
</p>
</div>
<h3>2.2 <a name="Sequences" href="#Sequences">Sequences</a></h3>
<p>
Some signs can be analysed in most styles as a sequence of other signs written one after the other, and some
sequences of signs have special values unrelated to their components; for instance, the sign GEME₂ 𒊩𒆳 is
written like the sign SAL 𒊩 followed by the sign KUR 𒆳, even as these signs change across styles; the
sign DIRI 𒋛𒀀 is written as SI 𒋛 followed by A 𒀀.
</p>
<p>
In cases where a sign can be analysed as a sequence
both in the third millennium and in the Neo-Assyrian style, that sign is normally
not separately encoded; the corresponding sequence should be used to represent this abstract
character. If the analysis as a sequence is applicable only
in the third millennium, but not in Neo-Assyrian, or only in Neo-Assyrian, but not in the third millennium, the character is generally encoded
atomically; examples of both are given in <cite>Section 2.3.1, <a href="#Mergers_and_Splits_of_Sequences">Mergers and Splits of Sequences</a></cite>. See also items 2 and 5 in [<a href="#Principles">Principles</a>], and <em>Complex and Compound Signs</em> in
<cite>Section 11.1, <a href="https://www.unicode.org/versions/latest/core-spec/#G26852">Sumero-Akkadian</a></cite>,
of [<a href="#Unicode">Unicode</a>]. An
exception is made for signs that were taught as basic syllables as part of
the early scribal curriculum, such as those in the
sign exercises Syllable Alphabets A and B (known to the scribes by their
incipits 𒈨𒈨 ME-ME and 𒀀𒀀 A-A) or 𒌅𒋫𒋾 (TU-TA-TI); these basic syllables are then
used later in the curriculum to describe pronunciations of more complex signs
in sign lists such as Aa or Ea.
The basic syllables have been encoded atomically, and should not be
represented as sequences.
For instance, according to the other encoding principles,
the sign 𒅇 U₃ could be represented as the sequence 𒅆𒁳
IGI.DIB, or the sign 𒊻 UZ as 𒊺𒄷 ŠE.ḪU, but they are atomically
encoded.
See also item 4 in [<a href="#Changes">Changes</a>].
Note that the sequences can appear in cuneiform text when they are not
read as the basic syllables:
</p>
<div align="center">
<table class="subtle">
<tr><th>Cuneiform</th><th>Transliteration</th><th>Translation</th><th>Representation of underlined text</th></tr>
<tr><td>𒉭​<span class="xsux-highlight">𒊻</span>𒄷</td>
<td>nunuz <span class="xsux-highlight">uz</span><sup>mušen</sup></td>
<td>duck eggs</td><td rowspan="2" style="vertical-align:middle">𒊻 UZ</td></tr>
<tr><td>𒄿𒍪<span class="xsux-highlight">𒊻</span>𒍪</td>
<td><i>i-zu-<span class="xsux-highlight">uz</span>-zu</i></td>
<td>they will divide</td></tr>
<tr><td>𒑏​𒐈​𒋡​<span class="xsux-highlight">𒊺​𒄷</span>​𒊺</td>
<td>1(ban₂) 3(diš) sila₃ <span class="xsux-highlight">še mušen</span> niga</td>
<td>1 ban 3 sila (~13 l) of barley for the fattened birds</td><td rowspan="3" style="vertical-align:middle">𒊺 ŠE followed by 𒄷 ḪU=MUŠEN</td></tr>
<tr><td>𒁕<span class="xsux-highlight">𒊺𒄷</span>𒌝</td><td>da-<span class="xsux-highlight">še-ḫu</span>-um</td><td>(a name)</td></tr>
<tr><td>𒊭​𒆷​𒄿𒉺𒀸<span class="xsux-highlight">𒊺𒄷</span></td><td><i>ša la i-pa-aš-<span class="xsux-highlight">še-ḫu</span></i></td><td>that cannot be soothed</td></tr>
</table>
</div>
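<p>
A validation step for the basic-syllable rule illustrated in the table above might look like the following sketch.
The names are hypothetical and the tables cover only the two examples U₃ and UZ; a real check would need the complete list of atomically encoded basic syllables.
</p>

```python
# Atomic characters for basic syllables, and the sequences that must not
# be used to represent them (only the two examples from this section).
BASIC_SYLLABLES = {"u₃": "𒅇", "uz": "𒊻"}
DISCOURAGED_SEQUENCE = {"u₃": "𒅆𒁳", "uz": "𒊺𒄷"}


def is_misencoded(value: str, encoded: str) -> bool:
    """True when a basic syllable is spelled with its component sequence."""
    return DISCOURAGED_SEQUENCE.get(value) == encoded


print(is_misencoded("uz", "𒊺𒄷"))     # basic syllable written as a sequence
print(is_misencoded("uz", "𒊻"))       # atomic character, as required
print(is_misencoded("še-ḫu", "𒊺𒄷"))  # genuine sequence of two signs
```

<p>
The check depends on the reading, not on the characters alone: the sequence 𒊺𒄷 is legitimate whenever it is not read as the basic syllable.
</p>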
<p>
In all styles of cuneiform some signs that are analysed as sequences diverge in appearance
from their components.
Fonts targeting specific styles
should include ligatures for these sequences
as appropriate. This is discussed in <cite>Section 2.6, <a href="#Ligatures">Ligatures</a></cite>.
</p>
<blockquote><b>Note:</b> While signs encoded as sequences
are generally signs that originated as sequences, this is not always the
case; some sequences are reanalyses that are not consistent with the
earlier forms of the sign.
For example, the sign 𒄘𒃼 IDIGNA, the name of the river Tigris, is
encoded as the sequence GU₂.GAR₃, and the related sign 𒈦𒄘𒃼 DALLA,
meaning “bright” or “fierce”, as MAŠ.IDIGNA=MAŠ.GU₂.GAR₃; this analysis
is only applicable starting in the late third millennium:
the glyph for Early Dynastic IIIb
<span class="oraccrsp">𒈦𒄘𒃼</span> DALLA does not have a recognizable
<span class="oraccrsp">𒃼</span> GAR₃, as illustrated here by the font
[<a href="#OFS-RSP">OFS-RSP</a>].
</blockquote>
<h3>2.3 <a name="Mergers_and_Splits" href="#Mergers_and_Splits">Mergers and Splits</a></h3>
<p>
Some signs have distinct glyphs in the styles of earlier periods, but identical glyphs in those of later periods;
such occurrences are called <em>mergers</em>. Conversely, some signs have identical glyphs in the styles of earlier
periods, but distinct glyphs in those of later periods; such occurrences are called <em>splits</em>.
</p>
<p>
When encoding texts written in styles where the glyphs of merged or split signs are identical, the character
corresponding to the correct sign value should be used, so that the encoding of a text is independent of the
style in which it is written.
</p>
<p>
<a href="#Split-and-Merger">Figure 2</a> illustrates splits and mergers affecting four signs; note that a sign can be affected both by a split and
a merger, as is the case of TI₂ 𒎗, which splits from DIN 𒁷 and merges with ḪI 𒄭.
The source of the hand copy shown is given in each cell of the table.
</p>
<p class="caption">
Figure 2. <a name="Split-and-Merger" href="#Split-and-Merger">
Mergers and splits of 𒊹, 𒄭, 𒎗, and 𒁷.
</a>
</p>
<div align="center">
<table class="subtle">
<tr><th></th><th>Early Dynastic IIIa</th><th>Ur III</th><th>Old Assyrian</th><th>Middle Assyrian</th></tr>
<tr>
<th>𒊹 ŠAR₂</th>
<td><img src="images/EDIIIa-ŠAR₂.png" alt="𒊹 (circular)"><br>[<a href="#P010576">P010576</a>]</td>
<td><img src="images/UrIII-ŠAR₂.png" alt="𒊹 (square, like 𒄭)"><br>[<a href="#P142296">P142296</a>]</td>
<td></td>
<td><img src="images/MA-ŠAR₂.png" alt="𒊹 (four diagonal wedges, like 𒎗)"><br>[<a href="#P281820">P281820</a>]</td>
</tr>
<tr>
<th>𒄭 ḪI</th>
<td><img src="images/EDIIIa-ḪI.png" alt="𒄭 (square)"><br>[<a href="#P225950">P225950</a>]</td>
<td><img src="images/UrIII-ḪI.png" alt="𒄭 (square)"><br>[<a href="#P142296">P142296</a>]</td>
<td><img src="images/OA-ḪI.png" alt="𒄭 (box enclosed by two horizontal wedges and two Winkelhaken)"><br>[<a href="#P360975">P360975</a>]</td>
<td><img src="images/MA-ḪI.png" alt="𒄭 (four diagonal wedges, like 𒎗)"><br>[<a href="#P282017">P282017</a>]</td>
</tr>
<tr>
<th>𒎗 TI₂</th>
<td></td>
<td><img src="images/UrIII-TI₂.png" alt="𒎗 (shaped like an arrow, like 𒁷)"><br>[<a href="#P142296">P142296</a>]</td>
<td><img src="images/OA-TI₂.png" alt="𒎗 (four diagonal wedges)"><br>[<a href="#P360975">P360975</a>]</td>
<td><img src="images/MA-TI₂.png" alt="𒎗 (four diagonal wedges)"><br>[<a href="#P282017">P282017</a>]</td>
</tr>
<tr>
<th>𒁷 DIN</th>
<td><img src="images/EDIIIa-DIN.png" alt="𒁷 (shaped like an arrow)"><br>[<a href="#P225950">P225950</a>]</td>
<td><img src="images/UrIII-DIN.png" alt="𒁷 (shaped like an arrow)"><br>[<a href="#P103303">P103303</a>]</td>
<td></td>
<td><img src="images/MA-DIN.png" alt="𒁷 (a vertical wedge between two Winkelhaken, atop a horizontal wedge)"><br>[<a href="#P282017">P282017</a>]</td>
</tr>
</table>
</div>
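<p>
The rule that encoding follows the sign value rather than the glyph can be illustrated with the four signs of Figure 2.
In this sketch the lower-case values, the style labels, and the function names are assumptions chosen for illustration;
the groups of coinciding glyphs are read off the descriptions in Figure 2.
</p>

```python
SIGN_FOR_VALUE = {"šar₂": "𒊹", "ḫi": "𒄭", "ti₂": "𒎗", "din": "𒁷"}

# Groups of signs whose glyphs coincide in a given style, per Figure 2.
IDENTICAL_GLYPHS = {
    "Ur III": [{"𒊹", "𒄭"}, {"𒎗", "𒁷"}],
    "Middle Assyrian": [{"𒊹", "𒄭", "𒎗"}],
}


def encode(value: str) -> str:
    """The encoded character depends only on the value, never on the style."""
    return SIGN_FOR_VALUE[value]


def look_alike(a: str, b: str, style: str) -> bool:
    """True when two distinct characters share a glyph in the given style."""
    return any(a in group and b in group for group in IDENTICAL_GLYPHS.get(style, []))


ti2, hi, din = encode("ti₂"), encode("ḫi"), encode("din")
print(ti2 != hi, look_alike(ti2, hi, "Middle Assyrian"))
print(ti2 != din, look_alike(ti2, din, "Ur III"))
```

<p>
Because <code>encode</code> takes no style argument, the encoded text stays the same when a composite text is rendered in a different style.
</p>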
<p>
This diachronic approach to the encoding means that characters newly
encoded to represent a contrast present in some styles may need to be supported in
fonts where that contrast is absent.
For instance, after the sign 𒎌 MEŠ was encoded in Unicode Version 7.0
to represent the contrast with the sequence me-eš in Neo-Assyrian styles,
as illustrated in
<cite>Section 2.3.1, <a href="#Mergers_and_Splits_of_Sequences">Mergers and Splits of Sequences</a></cite>,
fonts for Old Babylonian styles had to be updated to support newly
encoded Akkadian texts, even though the plural marker MEŠ looks
identical to the sequence of syllables me-eš in Old Babylonian.
</p>
<p>
See also item 11 in [<a href="#Principles">Principles</a>], as well as <cite>Mergers and Splits</cite> in <cite>Section 11.1, <a href="https://www.unicode.org/versions/latest/core-spec/#G26852">Sumero-Akkadian</a></cite>, of [<a href="#Unicode">Unicode</a>].
</p>
<h4>2.3.1 <a name="Mergers_and_Splits_of_Sequences" href="#Mergers_and_Splits_of_Sequences">Mergers and Splits of Sequences</a></h4>
<p>
A special case of mergers and splits is that of signs that look like sequences of other signs in some styles, but
have a different appearance (and are sometimes even used contrastively with the corresponding sequence) in
other styles. When such a sign has a distinctive
appearance throughout the third millennium or in the Neo-Assyrian style, it is generally not considered as a sequence as described in <cite>Section 2.2, <a href="#Sequences">Sequences</a></cite>, and is separately encoded. The special treatment of the Neo-Assyrian style
is due to its status as the index form in most classical reference works.
Fonts catering to more cursive styles may need to include many ligatures,
as described in <cite>Section 2.6, <a href="#Ligatures">Ligatures</a></cite>.
</p>
<p>
For example, the sign MEŠ 𒎌 (an Akkadian plural marker) originally looks like the sequence of syllables <i>me-eš</i> 𒈨𒌍, but their appearance diverges in Neo-Assyrian styles, as shown in <a href="#𒈨𒌍-vs-𒎌">Figure 3</a>. This is a split.
</p>
<blockquote>
<b>Note:</b> As in the single-character case, the term <em>split</em>
refers to the divergence of the visual representations of two fixed
character sequences, here 𒈨𒌍 and 𒎌.
That term does not refer to the phenomenon of a sign becoming a sequence
of signs; indeed 𒎌 instead arose by two pre-existing signs coalescing
into one.
</blockquote>
<p class="caption">Figure 3. <a name="𒈨𒌍-vs-𒎌" href="#𒈨𒌍-vs-𒎌">The sequence <i>me-eš</i> <span class="nabuninuaihsus">𒈨𒌍</span> and the sign MEŠ <span class="nabuninuaihsus">𒎌</span> on the Neo-Assyrian prism</a> [<a href="#P422664">P422664</a>].</p>
<div class="center">
<img src="images/ME-EŠ-vs-MEŠ.png" alt="The sequence of signs 𒈨𒌍 and the sign 𒎌, on the same document.">
</div>
<p>
As an example of a merger, the sign 𒋁, whose Sumerian readings include
šeš₂ “to anoint” and še₈ “to weep”, initially looks distinct
from the sequence of unrelated signs SIKI.LAM 𒋠𒇴, the first of which means
“hair” and the latter a kind of tree; this is the case in the reference glyphs.
However, in later styles, the sign ŠEŠ₂ <span class="nabuninuaihsus">𒋁</span> has the same appearance as the
sequence SIKI.LAM <span class="nabuninuaihsus">𒋠𒇴</span>.
</p>
<blockquote>
<b>Note:</b> The term <em>merger</em>
refers to the convergence of the visual representations of two fixed
character sequences, here 𒋁 and 𒋠𒇴.
As far as the scribes were concerned, the sign 𒋁 had broken up into a
sequence of signs.
</blockquote>
<p>
While the diachronic character identity used for the cuneiform encoding
generally matches the understanding scribes had of character identity in
their own script, there are discrepancies as scribes were not aware of
mergers long past, let alone future splits. For example, some lexical
texts explicitly describe the sign ŠEŠ₂ 𒋁 as being made up of the
sequence 𒋠𒇴; see [<a href="#P467315.r.i.22">P467315.r.i.22</a>].
</p>
<h3>2.4 <a name="Representative_Glyphs" href="#Representative_Glyphs">Representative Glyphs</a></h3>
<p>
As mentioned in <cite>Section 2.1, <a href="#Cuneiform_Signs">Cuneiform Signs</a></cite>, sign lists typically use a Neo-Assyrian style for their reference
glyphs, even when illustrating a different style.
</p>
<p>
However, because many signs are merged in the Neo-Assyrian style, this was an impractical choice for the
reference glyphs in the code charts; instead these reference glyphs are primarily in an Ur III style, where most
signs are distinct; where a sign is unattested in the Ur III period, or where signs appear identical in the Ur III
period, a different style was chosen for the sake of distinctiveness of the reference glyphs. For example, the
reference glyph for ŠAR₂ 𒊹 is in an Early Dynastic style, because that sign merges with ḪI 𒄭 by the Ur III
period; the reference glyph for TI₂ 𒎗 is in a style that is Old Assyrian or newer, because it has not yet split from
DIN 𒁷 in the Ur III period.
</p>
<p>
See also item 7 in [<a href="#Principles">Principles</a>], as well as <cite>Fonts</cite> in <cite>Section 11.1, <a href="https://www.unicode.org/versions/latest/core-spec/#G26852">Sumero-Akkadian</a></cite>, of [<a href="#Unicode">Unicode</a>].
</p>
<h3>2.5 <a name="Sign_Names" href="#Sign_Names">Sign Names</a></h3>
<p>
The names of the signs are generally based on a structural analysis of the signs, rather than on the common
sign values; thus 𒄠 is described as GUD×KUR (𒄞×𒆳, meaning 𒆳 inscribed inside 𒄞), rather than
AM. Note that this structural analysis may not be evident in all styles; see <a href="#Neo-Assyrian-𒄠-𒄞-𒆳">Figure 4</a>.
</p>
<p class="caption">Figure 4. <a name="Neo-Assyrian-𒄠-𒄞-𒆳" href="#Neo-Assyrian-𒄠-𒄞-𒆳">Neo-Assyrian glyphs for AM 𒄠, GUD 𒄞, and KUR 𒆳 from</a> [<a href="#MÉA">MÉA</a>].</p>
<div class="center">
<img src="images/Neo-Assyrian-AM-GUD-KUR.png" alt="Neo-Assyrian glyphs for AM 𒄠, GUD 𒄞, and KUR 𒆳.">
</div>
<p>In some styles, the sign may even have a different
structure from the one described by the name, as shown in <a href="#𒉋=𒉈×𒉽">Figure 5</a>, where
U+1224B 𒉋 CUNEIFORM SIGN NE SHESHIG (left) instead appears like NE×PAP 𒉈×𒉽. For
comparison, the appearance of the sign NE 𒉈 on the same artifact is shown on the right.</p>
<p class="caption">Figure 5.
<a name="𒉋=𒉈×𒉽" href="#𒉋=𒉈×𒉽">The signs BIL₂ 𒉋 and NE 𒉈 on the stele of Hammurapi</a> [<a href="#P249253">P249253</a>].</p>
<table align="center" class="subtle">
<tr>
<td>
<img src="images/BIL₂=NE×PAP.png" alt="The sign 𒉋 on the stele of Hammurapi.">
</td>
<td>
<img src="images/OB monumental NE.png" alt="The sign 𒉈 on the stele of Hammurapi.">
</td>
</tr>
</table>
<p>
See also item 8 in [<a href="#Principles">Principles</a>].
</p>
<div>
<h3>2.6 <a id="Ligatures" href="#Ligatures">Ligatures</a></h3>
<p>
All styles of cuneiform require ligatures for
some character sequences in order to properly capture the appearance of
compound signs.
As the analysis of signs as sequences takes into account their appearance in
the Neo-Assyrian style, that style requires fewer ligatures.
For example, the sign U₅ 𒄷𒋛, whose meanings include “to ride”, is
encoded as the sequence ḪU.SI. In some Early Dynastic styles and in the Neo-Assyrian style, no
ligature is needed for this sign. However, in the style of Old
Babylonian literary texts, a ligature should be used to capture the
appearance of the U₅ sign. This is illustrated in <a href="#𒄷𒋛-ligature">Figure 6</a>,
which shows the sequence 𒄷𒋛 as displayed in an Old Babylonian literary font [<a href="#OBF">OBF</a>] and a
Neo-Assyrian font [<a href="#OFS-NAO">OFS-NAO</a>].
</p>
<p class="caption">Figure 6. <a name="𒄷𒋛-ligature" href="#𒄷𒋛-ligature">The
text 𒄷+𒋛=𒄷𒋛 shown with two cuneiform fonts.</a></p>
<div align="center">
<table class="noborder">
<tr><td style="vertical-align:middle;text-align:right">[<a href="#OBF">OBF</a>]</td>
<td style="text-align:right;font-size:600%"><span class="obfreie">𒄷</span>+<span class="obfreie">𒋛</span>=</td>
<td class="obfreie" style="font-size:600%">𒄷𒋛</td></tr>
<tr><td style="vertical-align: middle; text-align: right ">[<a href="#OFS-NAO">OFS-NAO</a>]</td>
<td style="text-align:right;font-size:600%"><span class="cuneiformnaoutline">𒄷</span>+<span class="cuneiformnaoutline">𒋛</span>=</td>
<td style="font-size:600%"><span class="cuneiformnaoutline">𒄷𒋛</span></td></tr>
</table>
</div>
<p>
The same ligatures that occur within a sign encoded as a sequence can
also occur when that sequence corresponds to multiple signs.
For instance, in the Hellenistic period, the sign 𒋛𒀀 DIRI is ligated,
but that same ligature is used in occurrences that are read si-a; in the Ur III period, the sequence <span class="obfreie">𒌝‌𒈨</span> um-me is typically ligated as <span class="obfreie">𒌝𒈨</span>.
Note that while some transliterations use a single value for these
sign sequences, such as sa₅ for si-a or eme₂ for
um-me, this practice is neither consistent nor strongly correlated with
ligation.
</p>
<p>
Even the Neo-Assyrian style requires a few ligatures.
Some are classically analysed as ligatures between separate signs, such
as the very frequent
<span class="nabuninuaihsus">𒀸</span>+<span class="nabuninuaihsus">𒋩</span>=<span class="nabuninuaihsus">𒀸𒋩</span>
<i>aš-šur</i>. Others are analysed as compound signs, such as
<span class="nabuninuaihsus">𒌋</span>+<span class="nabuninuaihsus">𒌆</span>=<span class="nabuninuaihsus">𒌋𒌆</span>
<i>dul</i>(U.TUG₂), or variably transliterated as sequences
or single signs, such as <span class="nabuninuaihsus">𒇧𒇧</span> nenni,
often transliterated BUL.BUL, where BUL is <span class="nabuninuaihsus">𒇧</span>.
</p>
<p>
In order to prevent a ligature between two signs,
U+200C ZERO WIDTH NON-JOINER can be used;
see <cite><a href="https://www.unicode.org/versions/latest/core-spec/#G22789">Non-joiner</a></cite> in <cite>Section 23.2.2, Cursive Connection and Ligatures</cite>,
of [<a href="#Unicode">Unicode</a>].
When generating cuneiform text from transliterations, a zero width non-joiner
should be inserted only where the transliteration marks an exceptional
lack of joining. Since many ligatures occur not only within compound
signs, but also between signs that are separately transliterated without
the ligation being marked in the transliteration, it is not advisable to
systematically prevent ligatures wherever the transliteration indicates
a sign boundary with a hyphen or a dot.
</p>
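<p>
The recommendation above can be sketched as follows. This is only an illustration under stated assumptions:
the function name <code>encode_signs</code> and its parameter <code>unjoined_after</code> are hypothetical,
and how a transliteration marks an exceptional lack of joining is deliberately left open.
The sketch shows that ordinary sign boundaries contribute nothing to the encoded text,
while a zero width non-joiner is emitted only at explicitly marked positions.
</p>

```python
ZWNJ = "\u200C"  # U+200C ZERO WIDTH NON-JOINER


def encode_signs(signs, unjoined_after=frozenset()):
    """Concatenate encoded signs, inserting ZWNJ only at marked positions.

    `signs` is a list of already-encoded signs (each a string of one or
    more code points). `unjoined_after` (a hypothetical parameter) holds
    the indices i for which the transliteration explicitly marks that
    sign i is not joined to the following sign.
    """
    pieces = []
    last = len(signs) - 1
    for i, sign in enumerate(signs):
        pieces.append(sign)
        # Never emit a trailing ZWNJ after the final sign.
        if i in unjoined_after and i != last:
            pieces.append(ZWNJ)
    return "".join(pieces)


um_me = ["𒌝", "𒈨"]
ligated = encode_signs(um_me)                        # font may ligate
unligated = encode_signs(um_me, unjoined_after={0})  # ligature prevented
```

<p>
Defaulting to no non-joiners at all reflects the advice that hyphens and dots in the transliteration
should not, by themselves, suppress ligation.
</p>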
<p>
Ligatures can occasionally occur across signs that are analyzed as being
part of separate words; for instance, in Early Dynastic IIIb Ŋirsu,
illustrated here by the font [<a href="#OFS-RSP">OFS-RSP</a>], the
signs <span class="oraccrsp">𒊕</span> SAŊ and <span class="oraccrsp">𒅅</span> ŊAL₂ are ligated in <span class="oraccrsp">𒄥</span> <span class="oraccrsp">𒊕𒅅</span> gur saŋ ŋal₂, a unit of volume.
While, for searchability, it is generally preferable to separate words
when generating cuneiform text, if interword ligatures are desired,
the space between ligated words should be suppressed.
</p>
</div>
<h3>2.6.1 <a id="Discretionary_Ligatures" href="#Discretionary_Ligatures">Discretionary Ligatures</a></h3>
<p>
On occasion, some sequences of signs may be combined in a ligature for
stylistic effect, without that ligature being used systematically. This
is illustrated in <a href="#𒀭𒂗-ligature">Figure 7</a>,
where the signs 𒀭 and 𒂗 are ligated on the inscription on the left, but not on the inscription on the right,
even though the inscriptions are in consistent styles which could be expected to be covered by the same font.
Such ligatures are not usually distinguished in transliteration from the corresponding sequences,
so that both inscriptions would be transliterated ᵈsuen or ᵈEN.ZU; they do not carry distinct semantics.
They are not separately encoded; it is left to the font to display these if desired, possibly based on the presence of a zero-width joiner; see <cite><a href="https://www.unicode.org/versions/latest/core-spec/#G22742">Joiner</a></cite> in <cite>Section 23.2.2, Cursive Connection and Ligatures</cite>, of [<a href="#Unicode">Unicode</a>], and item 2 in [<a href="#Principles">Principles</a>].
When one needs to convey the ligature in transliteration, a plus sign is used, thus ᵈ⁺EN.ZU for the ligated example in Figure 7. When converting transliteration to cuneiform plain text, such a plus sign should
be mapped to U+200D ZERO WIDTH JOINER.
</p>
<p class="caption">Figure 7. <a name="𒀭𒂗-ligature" href="#𒀭𒂗-ligature">The name of the god Sîn, 𒀭𒂗𒍪.</a></p>
<table align="center" class="subtle">
<tr><th>[<a href="#P226934">P226934</a>]</th><th>[<a href="#P232275">P232275</a>]</th></tr>
<tr>
<td>
<img src="images/Ligated-ᵈsuen.png" alt="𒀭𒂗𒍪. The 𒀭 is in the top-left of the 𒂗.">
</td>
<td>
<img src="images/Non-ligated-ᵈsuen.png" alt="𒀭𒂗𒍪; no ligaturing.">
</td>
</tr>
</table>
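<p>
The mapping of the transliteration plus sign to U+200D ZERO WIDTH JOINER described above can be sketched as follows.
The three-entry table <code>SIGNS</code>, the function name <code>tokens_to_cuneiform</code>, and the assumption
that the transliteration has already been split into tokens are all hypothetical and serve only this example;
a real converter would draw on a complete sign list.
</p>

```python
ZWJ = "\u200D"  # U+200D ZERO WIDTH JOINER

# Hypothetical three-entry table for this example only.
SIGNS = {"d": "𒀭", "EN": "𒂗", "ZU": "𒍪"}


def tokens_to_cuneiform(tokens, signs=SIGNS):
    """Map sign tokens to cuneiform; '+' becomes ZWJ, '.' and '-' vanish."""
    out = []
    for token in tokens:
        if token == "+":
            out.append(ZWJ)  # explicitly marked discretionary ligature
        elif token in (".", "-"):
            continue  # ordinary sign boundary: nothing is inserted
        else:
            out.append(signs[token])
    return "".join(out)


plain = tokens_to_cuneiform(["d", "EN", ".", "ZU"])
ligated = tokens_to_cuneiform(["d", "+", "EN", ".", "ZU"])
```

<p>
Apart from the joiner, both results contain the same sign sequence, so removing U+200D from the ligated
form yields the plain form.
</p>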
<h2>3 <a name="The_Oracc_Global_Sign_List"></a><a name="The_Oracc_Sign_List" href="#The_Oracc_Sign_List">The Oracc Sign List</a></h2>
<p>
The Oracc Sign List [<a href="#OSL">OSL</a>] (formerly Oracc Global Sign List, OGSL) associates signs with their encoding, with their values, and with their
numbers in various sign lists; it can therefore be used to
automatically produce encoded versions of transliterated texts as described in
<cite>Section 2.1.1, <a href="#Transliteration">Transliteration</a></cite>,
to build input methods based on transliteration, and
to look up the glyphic range of a sign in various styles.
</p>
<p>
The Oracc Sign List is available as the machine-readable file
<a href="https://github.com/oracc/osl/blob/master/00lib/osl.asl">https://github.com/oracc/osl/blob/master/00lib/osl.asl</a>.
A specification of the structure of that file may be found at [<a href="#ASL">ASL</a>].
</p>
<p>
The Oracc Sign List treats the Unicode encoding as a sign list,
and establishes a concordance with the other sign lists.
However, while multiple OSL signs may share the same number in the classical sign lists,
a code point corresponds to at most one OSL sign.
This is a consequence of the principles described in
<cite>Section 2.3, <a href="#Mergers_and_Splits">Mergers and Splits</a></cite>.
</p>
<p>
For example, the signs 𒁆 BALAG and 𒂀 DUB₂ both correspond to sign number 565 in [<a href="#MZL">MZL</a>]
because they merge after the Ur III period,
but they are encoded separately as they are distinct in earlier styles.
</p>
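<p>
The asymmetry between the two concordances can be illustrated with a small sketch.
The records below are hypothetical and do not follow the [<a href="#ASL">ASL</a>] file format;
they only restate the BALAG and DUB₂ example: indexing by sign list number may yield several signs,
whereas indexing by encoded text must yield at most one.
</p>

```python
from collections import defaultdict

# Hypothetical records for illustration; this is not the ASL file format.
SIGNS = [
    {"name": "BALAG", "text": "𒁆", "MZL": 565},
    {"name": "DUB₂", "text": "𒂀", "MZL": 565},
]

by_mzl = defaultdict(list)  # sign list number -> possibly several signs
by_text = {}                # encoded text -> at most one sign

for sign in SIGNS:
    by_mzl[sign["MZL"]].append(sign["name"])
    # A code point corresponds to at most one sign, so a collision
    # would indicate an error in the data.
    if sign["text"] in by_text:
        raise ValueError(f"{sign['text']} assigned to two signs")
    by_text[sign["text"]] = sign["name"]
```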
<p>
Not all signs in the OSL correspond to a Unicode code point.
Some signs are encoded as sequences, as described in
<cite>Section 2.2, <a href="#Sequences">Sequences</a></cite>;
the OSL documents the appropriate sequence.
Other signs have no documented encoding.
Some of them may be candidates for encoding;
however, as the OSL is a working dataset,
others may eventually be found to be misreadings,
to be duplicates or variants of already-encoded signs,
or to otherwise be unencodable.
</p>
<p>
Indeed, some signs in the OSL, including some that are encoded in Unicode,
are marked as deprecated, because they are the result of errors in the
classification of cuneiform signs.
</p>
<p>
Some of these errors occurred as part of the encoding process.
For example, the sign DUB×EŠ₂ 𒁿 does not exist;
sign number 243 in [<a href="#MZL">MZL</a>] is named DUB׊E,
but that was misread during encoding as DUB׊È
(with a spurious grave accent).
The grave accent is equivalent to subscript 3, and
še₃ and eš₂ are values of the same sign 𒂠,
so the misreading DUB×ŠÈ was
encoded as DUB×EŠ₂.
</p>
<p>
Others are errors in earlier scholarship that were spotted after encoding.
For example, the sign DUB׊E 𒍶, which represents sign number 243 in [<a href="#MZL">MZL</a>],
does not exist; it was listed in [<a href="#MZL">MZL</a>]
based on a misreading of actual tablets in [<a href="#gaz₃">gaz₃</a>];
the sign appearing on these tablets should have been read GUM׊E 𒄤.
</p>
<h2><a name="References" href="#References">References</a></h2>
<table class="noborder" cellpadding="4">
<tr>
<td class="nb" valign="top">[<a name="aBZL" href="#aBZL">aBZL</a>]</td>
<td class="nb" valign="top">
Catherine Mittermayer. <cite>Altbabylonische Zeichenliste der sumerisch-literarischen Texte</cite>. 2006.
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="ASL" href="#ASL">ASL</a>]</td>
<td class="nb" valign="top">
Steve Tinney. “ASL/OSL File Format”. <cite>Oracc Sign List.</cite> The OSL Project, 2024.
<br>
<a href="http://oracc.org/osl/asloslfileformat/">http://oracc.org/osl/asloslfileformat/</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="ATF" href="#ATF">ATF</a>]</td>
<td class="nb" valign="top">
Steve Tinney & Eleanor Robson. “Working with ATF to edit texts”.
<cite>Oracc: The Open Richly Annotated Cuneiform Corpus</cite>.<br>
<a href="http://oracc.org/doc/help/editinginatf/index.html">http://oracc.org/doc/help/editinginatf/index.html</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="BAU" href="#BAU">BAU</a>]</td>
<td class="nb" valign="top">
Eric Burrows, <cite>Archaic Texts</cite> (Ur Excavations Texts 2; London 1935)
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="Changes" href="#Changes">Changes</a>]</td>
<td class="nb" valign="top">
Steve Tinney, Rationale for changes to N2664R.<br>
UTC document <a href="https://www.unicode.org/L2/L2004/04080-n2664r-delta.pdf">L2/04-080</a>.
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="ELLes" href="#ELLes">ELLes</a>]</td>
<td class="nb" valign="top">
Pietro Mander, “Lista dei segni dei testi lessicali di Ebla”, in
<cite>Materiali epigrafici di Ebla</cite> 3, pp. 285-382. 1981.
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="gaz₃" href="#gaz₃">gaz₃</a>]</td>
<td class="nb" valign="top">
Miguel Civil, “Bloc-notes: sa-gazₓ(DUB׊E)--ak.”, in
<cite>Revue d’Assyriologie et d’archéologie orientale</cite> 60, p. 92. 1966.
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="HZL" href="#HZL">HZL</a>]</td>
<td class="nb" valign="top">
Christel Rüster & Erich Neu, <cite>Hethitisches Zeichenlexikon</cite> (Harrassowitz Verlag 1989)
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="KWU" href="#KWU">KWU</a>]</td>
<td class="nb" valign="top">
Nikolaus Schneider, <cite>Die Keilschriftzeichen der Wirtschaftsurkunden von Ur III</cite> (Rome
1935)
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="LAK" href="#LAK">LAK</a>]</td>
<td class="nb" valign="top">
Anton Deimel, <cite>Liste der archaischen Keilschriftzeichen von Fara</cite> (Wissenschaftliche
Veröffentlichungen der Deutschen Orient-Gesellschaft 40; Berlin 1922)
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="MÉA" href="#MÉA">MÉA</a>]</td>
<td class="nb" valign="top">
René Labat, <cite>Manuel d'épigraphie akkadienne</cite> (6th ed. Paris 1988)
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="MZL" href="#MZL">MZL</a>]</td>
<td class="nb" valign="top">
Rykle Borger, <cite>Mesopotamisches Zeichenlexikon</cite> (Alter Orient und Altes Testament 305;
Ugarit-Verlag 2003)
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="ICE" href="#ICE">ICE</a>]</td>
<td class="nb" valign="top">
Dean A. Snyder. “Cuneiform: From Clay Tablet to Computer”.<br>
UTC document <a href="https://www.unicode.org/L2/L2000/00398-Cuneiform.txt">L2/00-398</a>.
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="OBF" href="#OBF">OBF</a>]</td>
<td class="nb" valign="top">
Corvin R. Ziegeler,
<cite>Old Babylonian Freie</cite>, Version 2.0.0. November 2024.<br>
<a href="http://dx.doi.org/10.17169/refubium-44983">http://dx.doi.org/10.17169/refubium-44983</a><br>
<a href="https://github.com/crzfub/OB-Freie/releases/tag/v.2.0.0">https://github.com/crzfub/OB-Freie/releases/tag/v.2.0.0</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="OFS-NAO" href="#OFS-NAO">OFS-NAO</a>]</td>
<td class="nb" valign="top">
Steve Tinney,
<cite>Oracc NA Outline</cite>. 2008.<br>
<a href="http://oracc.org/osl/OraccCuneiformFonts/ofs-nao/index.html">http://oracc.org/osl/OraccCuneiformFonts/ofs-nao/index.html</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="OFS-RSP" href="#OFS-RSP">OFS-RSP</a>]</td>
<td class="nb" valign="top">
Steve Tinney,
<cite>Oracc RSP</cite>. 2025.<br>
<a href="http://oracc.org/osl/OraccCuneiformFonts/ofs-rsp/index.html">http://oracc.org/osl/OraccCuneiformFonts/ofs-rsp/index.html</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="OGSL"></a><a name="OSL" href="#OSL">OSL</a>]</td>
<td class="nb" valign="top">
Niek Veldhuis, Steve Tinney, et al. “Oracc Sign List”.
<cite>Oracc: The Open Richly Annotated Cuneiform Corpus.</cite><br>
<a href="http://oracc.org/osl/">http://oracc.org/osl/</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P010576" href="#P010576">P010576</a>]</td>
<td class="nb" valign="top">
“CDLI Lexical 000014, Ex. 013 & 000027, Ex. 14 Artifact Entry.” 2001.
Cuneiform Digital Library Initiative (CDLI).
December 4, 2001.<br>
<a href="https://cdli.earth/P010576">https://cdli.earth/P010576</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P103303" href="#P103303">P103303</a>]</td>
<td class="nb" valign="top">
“AUCT 1, 458 Artifact Entry.” 2001.
Cuneiform Digital Library Initiative (CDLI).
December 20, 2001.<br>
<a href="https://cdli.earth/P103303">https://cdli.earth/P103303</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P142296" href="#P142296">P142296</a>]</td>
<td class="nb" valign="top">
“YOS 04, 232 Artifact Entry.” (2001) 2023.
Cuneiform Digital Library Initiative (CDLI).
February 1, 2023.<br>
<a href="https://cdli.earth/P142296">https://cdli.earth/P142296</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P225950" href="#P225950">P225950</a>]</td>
<td class="nb" valign="top">
“CDLI Lexical 000010, Ex. 014 Artifact Entry.” 2003.
Cuneiform Digital Library Initiative (CDLI).
August 19, 2003.<br>
<a href="https://cdli.earth/P225950">https://cdli.earth/P225950</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P226934" href="#P226934">P226934</a>]</td>
<td class="nb" valign="top">
“RIME 3/2.01.04.22, Ex. 01 Artifact Entry.” (2003) 2023.
Cuneiform Digital Library Initiative (CDLI).
June 14, 2023.<br>
<a href="https://cdli.earth/P226934">https://cdli.earth/P226934</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P232275" href="#P232275">P232275</a>]</td>
<td class="nb" valign="top">
“RIME 3/1.01.07, St B Witness Artifact Entry.” (2003) 2023.
Cuneiform Digital Library Initiative (CDLI).
June 14, 2023.<br>
<a href="https://cdli.earth/P232275">https://cdli.earth/P232275</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P249253" href="#P249253">P249253</a>]</td>
<td class="nb" valign="top">
“RIME 4.03.06.Add21, Ex. 01 Artifact Entry.” (2004) 2023.
Cuneiform Digital Library Initiative (CDLI).
June 15, 2023.<br>
<a href="https://cdli.earth/P249253">https://cdli.earth/P249253</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P281820" href="#P281820">P281820</a>]</td>
<td class="nb" valign="top">
“BAM 3, 314 Artifact Entry.” 2005.
Cuneiform Digital Library Initiative (CDLI).
November 11, 2005.<br>
<a href="https://cdli.earth/P281820">https://cdli.earth/P281820</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P282017" href="#P282017">P282017</a>]</td>
<td class="nb" valign="top">
“KAJ 002 Artifact Entry.” 2005.
Cuneiform Digital Library Initiative (CDLI).
November 11, 2005.<br>
<a href="https://cdli.earth/P282017">https://cdli.earth/P282017</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P360975" href="#P360975">P360975</a>]</td>
<td class="nb" valign="top">
“AAA 1/3, 01 Artifact Entry.” 2007.
Cuneiform Digital Library Initiative (CDLI).
February 13, 2007.<br>
<a href="https://cdli.earth/P360975">https://cdli.earth/P360975</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P422664" href="#P422664">P422664</a>]</td>
<td class="nb" valign="top">
“RINAP 5/1 Ashurbanipal 010, Ex. 001 Artifact Entry.” (2011) 2023. Cuneiform Digital Library Initiative (CDLI).
February 1, 2023.<br>
<a href="https://cdli.earth/P422664">https://cdli.earth/P422664</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="P467315.r.i.22" href="#P467315.r.i.22">P467315.r.i.22</a>]</td>
<td class="nb" valign="top">
Niek Veldhuis, et al. YOS 01, 53, reverse i 22. “Digital Corpus of Cuneiform Lexical Texts”.
<cite>Oracc: The Open Richly Annotated Cuneiform Corpus.</cite><br>
<a href="http://oracc.org/dcclt/P467315.210">http://oracc.org/dcclt/P467315.210</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="Principles" href="#Principles">Principles</a>]</td>
<td class="nb" valign="top">
Michael Everson & Karljürgen Feuerherm.
“Basic principles for the encoding of Sumero-Akkadian Cuneiform”.<br>
UTC document <a href="https://www.unicode.org/L2/L2003/03162-n2585-cuneiform.pdf">L2/03-162</a>.
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="PTACE" href="#PTACE">PTACE</a>]</td>
<td class="nb" valign="top">
Amalia Catagnoti, “La paleografia dei testi dell’amministrazione e della cancelleria di Ebla”.
<cite>Quaderni di Semitistica</cite> 30. 2010.
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="RÉC" href="#RÉC">RÉC</a>]</td>
<td class="nb" valign="top">
François Thureau-Dangin, <cite>Recherches sur l'origine de l'écriture cunéiforme</cite> (Paris 1898)
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="RSP" href="#RSP">RSP</a>]</td>
<td class="nb" valign="top">
Yvonne Rosengarten, <cite>Répertoire commenté des signes présargoniques sumériens de Lagash</cite> (Paris 1967)
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="ŠL" href="#ŠL">ŠL</a>]</td>
<td class="nb" valign="top">
Anton Deimel, <cite>Šumerisches Lexikon</cite> (Rome 1925/1950)
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="Syllabaire" href="#Syllabaire">Syllabaire</a>]</td>
<td class="nb" valign="top">
François Thureau-Dangin, <cite>Le Syllabaire Accadien</cite> (Paris 1926)
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="Unicode" href="#Unicode">Unicode</a>]</td>
<td class="nb" valign="top">
<i>The Unicode Standard</i><br>
Latest version:<br>
<a href="https://www.unicode.org/versions/latest/">https://www.unicode.org/versions/latest/</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="UAX38" href="#UAX38">UAX38</a>]</td>
<td class="nb" valign="top">
<i>Unicode Standard Annex #38:</i> <i>Unicode Han Database (Unihan)</i><br>
Latest version:<br>
<a href="https://www.unicode.org/reports/tr38/">https://www.unicode.org/reports/tr38/</a>
</td>
</tr>
<tr>
<td class="nb" valign="top">[<a name="ZATU" href="#ZATU">ZATU</a>]</td>
<td class="nb" valign="top">
Margret W. Green and Hans J. Nissen, <cite>Zeichenliste der Archaischen Texte aus Uruk</cite>
(Archaische Texte aus Uruk 2; Berlin 1987)
</td>
</tr>
</table>
<h2><a name="Acknowledgements" href="#Acknowledgements">Acknowledgements</a></h2>
<p>Robin Leroy authored the bulk of the text, under direction from the Unicode Technical Committee.</p>
<p>
Thanks also to the following people for their feedback or contributions to this document:
Deborah Anderson,
Peter Constable,
Karljürgen Feuerherm,
Asmus Freytag,
Sara Manasterska,
Roozbeh Pournader,
Erica Scarpa,
Steve Tinney,
Niek Veldhuis,
Ken Whistler,
Ben Yang,
Corvin Ziegeler.
</p>
<h2><a name="Modifications" href="#Modifications">Modifications</a></h2>
<div>
<p>The following summarizes modifications from the previous revision of this document.</p>
<p><b>Revision 5</b></p>
<ul>
<li><b>Reissued</b></li>
<li>Section 2.1.1, <a href="#Transliteration">Transliteration</a>: Added a
discussion of cases where usual transliterations are not sufficient to
determine the cuneiform text.</li>
<li>Added Section 2.1.2, <a href="#Numerals">Numerals</a>: A discussion of
practices in numeric transliteration, the disunification of stacking
patterns, and the implications for generating cuneiform text.</li>
<li>Section 2.2, <a href="#Sequences">Sequences</a>: Significantly
reworded to better reflect the nuances of the encoding model.</li>
<li>Section 2.3.1, <a href="#Mergers_and_Splits_of_Sequences">Mergers and Splits of Sequences</a>:
Reworded to take ligatures into account.</li>
<li>Added Section 2.6, <a href="#Ligatures">Ligatures</a>: A discussion of
non-discretionary ligatures.</li>
<li>Section 2.6.1, <a href="#Discretionary_Ligatures">Discretionary Ligatures</a>:
Added a recommendation to map transliteration + to ZWJ.</li>
</ul>
</div>
<hr width="50%">
<p class="copyright">© 2023–2025 Unicode, Inc. This publication is protected by copyright, and permission must be obtained from Unicode, Inc. prior to any reproduction, modification, or other use not permitted by the <a href="https://www.unicode.org/copyright.html">Terms of Use</a>. Specifically, you may make copies of this publication and may annotate and translate it solely for personal or internal business purposes and not for public distribution, provided that any such permitted copies and modifications fully reproduce all copyright and other legal notices contained in the original. You may not make copies of or modifications to this publication for public distribution, or incorporate it in whole or in part into any product or publication without the express written permission of Unicode.</p>
<p class="copyright">Use of all Unicode Products, including this publication, is governed by the Unicode <a href="https://www.unicode.org/copyright.html">Terms of Use</a>. The authors, contributors, and publishers have taken care in the preparation of this publication, but make no express or implied representation or warranty of any kind and assume no responsibility or liability for errors or omissions or for consequential or incidental damages that may arise therefrom. This publication is provided “AS-IS” without charge as a convenience to users.</p>
<p class="copyright">Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the United States and other countries.</p>
</div>
</body>
</html>