<HTML> 
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<HEAD><TITLE>The numbers list: Notes and warnings</TITLE></HEAD> 

<BODY BGCOLOR="#FFFFFF" TEXT="#000000"> 

<table width="100%" border=0 cellspacing="0" cellpadding="0">
<tr bgcolor="#EEC25A">

<td width="70%"><IMG  Align=Top SRC="numheader.gif"></td>
<td><A HREF="default.html"><img align="right" src="homegold.gif" border=0 alt="Home"></A></td>

</tr><tr bgcolor="#EEC25A">

<td colspan="2"><h2>The numbers list: Notes and warnings </h2>

</table>

<h2>Numbers 2.0!</h2>

As of September 2016, <a href="numbers.shtml">the numbers page</a> has been entirely redone.  The major changes:

<ul>
<li>It uses Unicode to represent the numbers far more accurately.
<li>As a corollary, I can represent native writing systems for many languages.
<li>It uses Javascript to allow you to customize the results (e.g. view just one family).
<li>Families are color-coded to help distinguish parts of the world.
</ul> 

And on my end, the source file has also been upgraded, which allows far easier updating.  (The input is actually a raw text file; the display html is built on the fly in Javascript.)
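<p>As a rough illustration of that text-to-HTML step, here is a minimal sketch.  It is not the actual code behind the page: the tab-separated line format (family, language, then space-separated numbers) and the function names <code>escapeHtml</code>, <code>parseLine</code> and <code>buildRows</code> are all assumptions made up for the example.

```javascript
// Escape the characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// ASSUMED line format (hypothetical, not the real source file):
//   family <TAB> language <TAB> numbers separated by spaces
function parseLine(line) {
  const [family = "", language = "", numbers = ""] = line.split("\t");
  return { family, language, numbers: numbers.split(" ").filter(Boolean) };
}

// Turn the raw text into table rows, optionally keeping just one family.
function buildRows(rawText, familyFilter) {
  return rawText
    .split("\n")
    .filter(line => line.trim() !== "")        // skip blank lines
    .map(parseLine)
    .filter(entry => !familyFilter || entry.family === familyFilter)
    .map(entry => {
      const cells = [entry.language, ...entry.numbers]
        .map(cell => "<td>" + escapeHtml(cell) + "</td>")
        .join("");
      return "<tr>" + cells + "</tr>";
    })
    .join("\n");
}

// Tiny sample in the assumed format.
const sampleText =
  "Indo-European\tEnglish\tone two three\n" +
  "Uralic\tFinnish\tyksi kaksi kolme\n";

console.log(buildRows(sampleText, "Uralic"));
// → <tr><td>Finnish</td><td>yksi</td><td>kaksi</td><td>kolme</td></tr>
```

<p>Keeping the data as plain text and generating the markup at display time is what makes a filter like "view just one family" cheap: in this sketch it is a single extra argument rather than a separate page.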


<h2>How do I use it?</h2>

If it's not clear: push the <b>List</b> button to get a list of numbers!  You must have Javascript enabled.

<p>You can limit the list by using the other controls, allowing you to concentrate on just the data you want.

<p>The page makes very full use of Unicode.  It looks great on my Mac.  If you get a lot of square boxes instead of characters, you probably have bad Unicode support.  Try installing the <a href="http://www.linuxlibertine.org">Linux Libertine</a> or <a href="http://software.sil.org/gentium/">Gentium</a> fonts.  If that's not possible, I've kept around the old non-Unicode pages; <a href="#below">see below</a>.


<h2>Symbols and conventions</h2>

The <b>colored headings</b> indicate language families.  A name in brackets (e.g. [Andean]) is speculative.

<p>Languages with over a <b>million speakers</b> are named in <b>boldface</b>.  The data (from David Crystal) is a couple of decades old but it still indicates the major languages.

<p>If the name starts with <b>+</b>, the language is <b>extinct</b>.  (That means no <i>native</i> speakers; of course people often learn ancient languages for scholarship or other reasons.)  Languages are disappearing at a truly alarming rate, and my sources on this are getting old, so probably many languages are listed as alive when they're really not.

<p>Names in <i>italics</i> are <b>dialects</b> or other variants.  Don't take this as very important; <a href="#dialects">see below</a>.

<p>As should be obvious, if you see a notation like <b>[5] po [1]</b>, that means to substitute the names for 5 and 1 into the expression.  

<p>Less obviously: if you see <b>… [2]</b>, that means that the number is formed just like the number on its left, only using [2] rather than [1].  E.g. if you see that 6 is <b>[5] so ɣitne [1]</b> and 7 is <b>… [2]</b>, that's equivalent to <b>[5] so ɣitne [2]</b>.  It saves a lot of space.
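<p>If it helps to see the two notations spelled out mechanically, here is a small sketch.  The helper names <code>expandTemplate</code> and <code>resolveShorthand</code> are invented for illustration, the words ONE, TWO and FIVE are placeholders rather than real data, and the sketch assumes the shorthand always swaps the <i>last</i> bracketed number of the entry to its left.

```javascript
// Hypothetical helper: substitute number names into a template
// such as "[5] po [1]".  `names` maps a numeral to the word for it.
function expandTemplate(template, names) {
  return template.replace(/\[(\d+)\]/g, (match, digits) => {
    const word = names[Number(digits)];
    // Leave the placeholder untouched if the name is unknown.
    return word === undefined ? match : word;
  });
}

// Hypothetical helper: resolve the "… [2]" shorthand against the
// template of the number on its left.
// ASSUMPTION: the shorthand replaces the last [n] in that template.
function resolveShorthand(entry, leftTemplate) {
  const m = /^…\s*(\[\d+\])$/.exec(entry.trim());
  if (!m) return entry;                 // already a full template
  // Match the one [n] that has no other [n] after it, and swap it.
  return leftTemplate.replace(/\[\d+\](?!.*\[\d+\])/, m[1]);
}

// Placeholder words only; a real entry would use the language's own names.
const placeholderNames = { 1: "ONE", 2: "TWO", 5: "FIVE" };

const six = "[5] so ɣitne [1]";
const seven = resolveShorthand("… [2]", six);

console.log(seven);                                   // → [5] so ɣitne [2]
console.log(expandTemplate(seven, placeholderNames)); // → FIVE so ɣitne TWO
```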

<p>A number preceded by <b>*</b> is a reconstructed form.


<!--
<p>Superscript numbers indicate a numbered toneme (e.g. <sup><font size=-1>1</font></sup> = first tone) 
<br>Appended numbers give tonal contours directly (e.g. <font size=-1>35</font> = high rising)
-->


<h2>Sources</h2>

The <a href="sources.htm"><B>Sources Page</b></a> gives the sources for each language (and also lists languages I don't have, and connects the languages to other wide-scale classifications: Ruhlen, Voegelin &amp; Voegelin, Campbell, and the <A href="http://ethnologue.com">Ethnologue</a>).  

<p>I dearly appreciate everyone who's sent me numbers; but I want to particularly salute those whose kindness and hard work have been extraordinary: <a href="http://hometown.aol.com/nahali/myhomepage/profile.html"><b>Jarel Deaton</b></a> of Ohio, who is single-handedly responsible for more than a quarter of the numbers seen here; <a href="http://euslchan.tripod.com/index.html"><b>Eugene S.L. Chan</b></a> of Hong Kong, who sent me his
entire Austronesian database; and <b>Carl Masthay</b> of St. Louis and <b>Pavel Petrov</b> of Kaliningrad, who sent me their enormous, worldwide collection of numbers.  

<p>Special thanks to Claudia Griffith and the staff of the SIL Library in Duncanville, Texas, whose wonderful hospitality made a week of research in the summer of 2004 both pleasant and productive.

<p>You may also enjoy:
<ul>
<li><a href="lang8.html"><b>How languages are classified</b></a>, from the <a href="langfaq.html">sci.lang faq</a>.
<LI><a href="http://members.tripod.com/~rjschellen/IENums.htm"><b>Rick Schellen</b>'s page of the numbers in over 400 Indo-European <b>dialects</b>.</a>
<li><a href="http://www.elite.net/~runner/jennifers/">Jennifer Runner's page</a> on common expressions in many languages.
</ul>


<h2>Some caveats</h2>


<p>There are often complications (e.g. declension of numbers, or different series of numbers for different purposes), 
and I haven't had room for them here.

<p>If you want to trace relationships, numbers may be misleading, as they are easily borrowed.  Conversely, related languages may have numbers that aren't cognate; they may have innovated the names in different ways.

<p>The standard orthography or standard dialect may have changed since my source on a language was published.

<p>Hundreds of millions of English speakers agree that the numbers are one, two, three, etc.  But not all languages are standardized in this way.  For unwritten languages, different linguists' word lists may be strikingly different.  Their ears may not be attuned to the language; or there may be dialectal variation, or even sound change.  Here are a couple of examples, one from Asia, one from Africa:

<blockquote>
<table>
<tr><td><b>Bru</b>
<td>muəj
<td>ba:r
<td>paj
<td>po:n
<td>sə:ng
<td>təpat
<td>təpu:l
<td>təkual
<td>tikeas
<td>məncit

<tr><td><b>Bru</b>
<td>muoi
<td>bar
<td>pái
<td>poun
<td>sau'ng
<td>tapoât
<td>tapul
<td>takual
<td>takêh
<td>muoi chít

<tr><td><b>Gurma</b>
<td>yèn.dó
<td>lyé
<td>tà
<td>nâ
<td>mù
<td>lwọ̈bà
<td>lèle:
<td>nî
<td>pà:nì
<td>pyêgà

<tr><td><b>Gurma</b>
<td>n lè
<td>nlé
<td>nta
<td>nna
<td>nmu
<td>nluoba
<td>n lele
<td>nni
<td>n-ya
<td>ka piga
</table>
</blockquote>

I use <b>standard orthographies</b>, where there is one, rather than phonetic transcriptions.  This makes comparison a bit more difficult; but I prefer it, for two reasons.  First, it reduces errors; even if I can correctly interpret a source's phonetic description, there may be orthographic irregularities that make a straight transcription ludicrous.  Second, an orthography is generally closer to a <i>phonemic</i> representation, which is arguably what people have in their heads.  


<h2><a name="dialects">Languages and dialects</a></h2>

<p>People can get very excited about what's a <b>language</b> vs. what's a <b>dialect</b>.  There is nothing inherent in the language variety to tell us what it is.  Linguists in general use "language" to refer to a mutually intelligible group of dialects (but note that intelligibility can be partial).  

<p>Ordinary people generally call something a "language" if it has a prestigious standard form; but that's a fact about people's attitudes, not about language.  (Nonetheless, if there <i>is</i> a standard form, it will be on the list!)

<p>I generally rely on Voegelin &amp; Voegelin, or on the original source for the numbers, in deciding whether to list something as a dialect (italicized).  Some of my sources list multiple dialects; I usually try to pick the most widely spoken ones, and list others only if they're interestingly divergent.

<p>Corollary: please don't complain to me about what's a dialect or a language-- you're arguing about nothing.  (But feel free to send me additional dialects, or point out where I've messed up the names.)

<p>Especially in the Amerind sections, I sometimes list <b>older sources</b> which may be of historical interest.  




<h2>What's not here?</h2>

<p>How many languages <b>aren't</b> here?  Well, there are almost 5000 living languages listed in Ruhlen's volume; I have numbers for about 83% of them, so there are at least a thousand more.  (If the math doesn't seem to work out, note that I have plenty of dialects and conlangs not included in Ruhlen's list.)
There are about 200 languages with more than a million speakers, all of which are in the list.

<p>Am I going to do <b>higher numbers</b>?  Or zero?  Probably not, unless I do it for a subset of languages only.  Many of the sources don't even have numbers above ten.


<h2>How was this done?</h2>

People sometimes ask me how I accumulated all these numbers, or how to do this sort of research.

<p>The answer is simple: <b>libraries</b>.  I have access to a few good university libraries, and when I can I visit others.  You look in grammars, dictionaries, and books or journal articles surveying entire families.

<p>And, if possible, find others who've been bitten by the same bug!


<h2><a name="below">The old files</a></h2>

If you can't read the Unicode files, I've kept the oldest versions of the numbers pages, which use no Unicode at all.  (They do use the Latin-1 characters the web has always supported.)

<p><i>The following conventions apply only to the old files.</i>

<p><IMG  Align=left SRC="ipa.gif">  The picture shows the representations used for a number of IPA 
characters.  I haven't been able to retain all phonetic distinctions, and some have been lost-- for instance, the distinction
between a circumflex (â) and a hachek (ǎ).

<p>For African tonal languages, a macron <sup>-</sup> indicates a high level tone, not length, 
and is represented as <font color="#808080">_</font>.  
<font color="#808080">|</font> is another tone, usually low level.  For non-African languages, a macron indicates length and is written as a colon (:).

<p><u>?</u> indicates the glottal stop (but if my sources spell it as an apostrophe or q, I follow them)

<p><b>bold</b> indicates a character which was dotted in the original source-- usually an emphatic or retroflex consonant 

<p><i>italic</i> indicates open e and o and lax i and u, or a character that was italicized in the original source

<ul>
<LI>	<a href="euro.htm#ie">Indo-European</a>, 
	<a href="euro.htm#dravidian">Dravidian</a>, and 
	<a href="euro.htm#nahali">minor European</a> languages
<LI>	<a href="mide.htm#afro">Afro-Asiatic</a> and 
	<a href="mide.htm#caucasian">Caucasian</a> languages
<LI>	<a href="nilo.htm#nilo">Nilo-Saharan</a>, 
	<a href="nilo.htm#kordofanian">Kordofanian</a>, and 
	<a href="nilo.htm#khoisan">Khoisan</a> languages
<li>	<a href="niger.htm">Niger-Congo</a> languages
<li>	<a href="benu.htm">More Niger-Congo</a> languages,
	including Bantu
<li>	<a href="asia.htm">Uralic</a>, 
	<a href="asia.htm#nahali">Altaic</a>, 
	<a href="asia.htm#miao">Miao-Yao</a>, 
	<a href="asia.htm#tai">Tai</a>, 
	<a href="asia.htm#austro">Austro-Asiatic</a>, and 
	<a href="asia.htm#palaeo">other Asian</a> languages
<li>	<a href="sino.htm">Sino-Tibetan</a> languages
<li>	<a href="anes.htm">Austronesian</a> languages
<li>	North American Indian languages -  
	<a href="amer.htm#eskimo">Eskimo</a>,
	<a href="amer.htm#nadene">Na-den&eacute;</a>, 
	<a href="amer.htm#algonquian">Algic</a>, 
	<a href="amer.htm#keres">Keres</a>, 
	<a href="amer.htm#macro">Siouan</a>, 
	<a href="amer.htm#caddoan">Caddoan</a>, 
	<a href="amer.htm#iroquoian">Iroquoian</a>, 
	<a href="amer.htm#tanoan">Kiowa-Tanoan</a>,
	<a href="amer.htm#hokan">"Hokan"</a>,
	<a href="amer.htm#naiso">isolates</a>
<li>	Mesoamerican Indian languages - 
	<a href="came.htm#penutian">"Penutian"</a>,
	<a href="came.htm#aztecan">Uto-Aztecan</a>,
	<a href="came.htm#oto">Oto-Manguean</a>,
	<a href="came.htm#chibchan">Macro-Chibchan</a>,
	<a href="came.htm#paezan">Paezan</a>,
	<a href="came.htm#yano">Yanomaman</a>
<li>	South American Indian languages - 
	<a href="same.htm#andean">"Andean"</a>,
	<a href="same.htm#eskimo">"Equatorial"</a>,
	<a href="same.htm#eskimo">Tupi-Cariban</a>,
	<a href="same.htm#otomaco">Macro-Otomakoan</a>,
	<a href="same.htm#guamo">Guamo-Chapacuran</a>,
	<a href="same.htm#arawak">Macro-Arawakan</a>,
	<a href="same.htm#witoto">Bora-Witotoan</a>,
	<a href="same.htm#waikuru">Macro-Waikur&uacute;an</a>,
	<a href="same.htm#panoan">Macro-Panoan</a>,
	<a href="same.htm#ge">Macro-Ge</a>,
	<a href="same.htm#saiso">isolates</a>
<li>	<a href="newg.htm">Indo-Pacific languages</a>
<li>	<a href="aust.htm">Australian languages</a>
<li>	<a href="last.htm">Pidgins and creoles</a>
<li>	<a href="last.htm#conlang">Constructed languages</a>
</ul>

<hr>

<p><center><A HREF="default.html"><img src="home.gif" border=0 alt="Home"></A></center>


</BODY> 
</HTML>