<HTML>
<HEAD><TITLE>Deriving Proto-World </TITLE></HEAD>
<BODY>
<IMG Align=Top SRC="berries.jpg">
<H3>Deriving Proto-World with tools you probably have at home</H3>
Discussions of 'Proto-World' have gotten quite a bit of press lately-- not as much as Di's divorce, but about as much as any topic in historical linguistics ever gets.
<p>Is there anything to it? Very probably not-- which is a pity, because getting back to Proto-World sounds like a lot of fun, and now it seems like the only alternative is to wait for aliens to come by who had a tape recorder running one or two hundred thousand years ago.
<p>Hans Henrich Hock gave a talk at CLS 29 on Ruhlen and Greenberg's "world etymology" <i>maliq'a</i> 'swallow, throat', pointing out quite a few serious methodological problems. It may be worth repeating some of his points. To start with, <a name="maliqa">here are R&G's supporting citations</a>:
<table>
<tr><td>Proto-Afro-Asiatic</td> <td>Afro-Asiatic</td> <td>*mlg</td> <td>'suck, breast, udder'</td></tr>
<tr><td>Arabic</td> <td>Afro-Asiatic</td> <td>m-l-j</td> <td>'suck the breast'</td></tr>
<tr><td>Old Egyptian</td> <td>Afro-Asiatic</td> <td>mndy</td> <td>'woman's breast, udder'</td></tr>
<tr><td>Proto-Indo-European</td> <td>Indo-European</td> <td>*melg-</td> <td>'to milk'</td></tr>
<tr><td>English</td> <td>Indo-European</td> <td>milk</td> <td>'to milk, milk'</td></tr>
<tr><td>Latin</td> <td>Indo-European</td> <td>mulg-e:re</td> <td>'to milk'</td></tr>
<tr><td>Proto-Finno-Ugric</td> <td>Finno-Ugric</td> <td>*mälke</td> <td>'breast'</td></tr>
<tr><td>Saami</td> <td>Finno-Ugric</td> <td>mielga</td> <td>'breast'</td></tr>
<tr><td>Hungarian</td> <td>Finno-Ugric</td> <td>mell</td> <td>'breast'</td></tr>
<tr><td>Tamil</td> <td>Dravidian</td> <td>melku</td> <td>'to chew'</td></tr>
<tr><td>Malayalam</td> <td>Dravidian</td> <td>melluka</td> <td>'to chew'</td></tr>
<tr><td>Kurux</td> <td>Dravidian</td> <td>melkha:</td> <td>'throat'</td></tr>
<tr><td>Central Yupik</td> <td>Eskimo-Aleut</td> <td>melug-</td> <td>'to suck'</td></tr>
<tr><td>Proto-Amerind</td> <td></td> <td>*maliq'a</td> <td>'to swallow, throat'</td></tr>
<tr><td>Halkomelem</td> <td>Almosan</td> <td>m@lqw</td> <td>'throat'</td></tr>
<tr><td>Kwakwala</td> <td>Almosan</td> <td>m'lXw-'id</td> <td>'chew food for the baby'</td></tr>
<tr><td>Kutenai</td> <td>Almosan</td> <td>u'mqolh</td> <td>'to swallow'</td></tr>
<tr><td>Chinook</td> <td>Penutian</td> <td>mlqw-tan</td> <td>'cheek'</td></tr>
<tr><td>Takelma</td> <td>Penutian</td> <td>mülk'</td> <td>'to swallow'</td></tr>
<tr><td>Tfaltik</td> <td>Penutian</td> <td>milq</td> <td>'to swallow'</td></tr>
<tr><td>Mixe</td> <td>Penutian</td> <td>amu'ul</td> <td>'to suck'</td></tr>
<tr><td>Mohave</td> <td>Hokan</td> <td>malyaqe'</td> <td>'throat'</td></tr>
<tr><td>Walapei</td> <td>Hokan</td> <td>malqi'</td> <td>'throat, neck'</td></tr>
<tr><td>Akwa'ala</td> <td>Hokan</td> <td>milqi</td> <td>'neck'</td></tr>
<tr><td>Cuna</td> <td>Chibchan</td> <td>murki-</td> <td>'to swallow'</td></tr>
<tr><td>Quechua</td> <td>Andean</td> <td>malq'a</td> <td>'throat'</td></tr>
<tr><td>Aymara</td> <td>Andean</td> <td>malyq'a</td> <td>'throat'</td></tr>
<tr><td>Iranshe</td> <td>Macro-Tucanoan</td> <td>moke'i</td> <td>'neck'</td></tr>
<tr><td>Guamo</td> <td>Equatorial</td> <td>mirko</td> <td>'to drink'</td></tr>
<tr><td>Surinam</td> <td>Macro-Carib</td> <td>e'mo:kï</td> <td>'to swallow'</td></tr>
<tr><td>Faai</td> <td>Macro-Carib</td> <td>mekeli</td> <td>'nape of the neck'</td></tr>
<tr><td>Kaliana</td> <td>Macro-Carib</td> <td>imukulali</td> <td>'throat'</td></tr>
</table>
<p>Now, there's no denying that seeing such a list is suggestive, and that it
<b>seems</b> like there must be something in it. I'll maintain, however, that
this is simply self-delusion-- a consequence of the human ability to make
connections even in the face of near-random data.
<p>Take a closer look at the list; the rules for this game are evidently quite
lax. The vowels are completely ignored. The middle consonant varies
from <b>l</b> to <b>ly</b> to <b>lh</b> to <b>n</b> to <b>r</b> to zero. The end consonant ranges from <b>g</b>
to <b>j</b> to <b>d</b> to <b>k</b> to <b>q</b> to <b>q'</b> to <b>kh</b> to <b>k'</b> to <b>X</b> to zero. Switching around medial
consonants seems to be allowed; extra consonants and syllables can appear
where needed.
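<p>To make the laxity concrete, here is a rough Python sketch of a matcher that accepts just the variation described above. The names <i>MIDDLE</i>, <i>FINAL</i>, <i>skeleton</i> and <i>loose_match</i> are hypothetical, and the consonant classes are only one reading of the list, not criteria that R&G state anywhere.

```python
import re

# Assumed reading of the tolerated variation; not R&G's own criteria.
MIDDLE = "lnr"      # l, ly, lh, n, r  (y and h are dropped below)
FINAL = "gjdkqx"    # g, j, d, k, q, q', kh, k', X  (case-folded, marks dropped)

def skeleton(form):
    """Lowercase a form and keep only consonant letters other than h, w, y."""
    return re.sub(r"[^bcdfgjklmnpqrstvxz]", "", form.lower())

def loose_match(form):
    """True if the skeleton starts with m followed by a consonant from either class.

    That covers a missing middle or final consonant and a swapped pair;
    whatever follows is ignored, since extra syllables are allowed.
    """
    return re.match("m[" + MIDDLE + FINAL + "]", skeleton(form)) is not None

for word in ["maliq'a", "mndy", "u'mqolh", "imukulali", "monkey", "medal", "mother"]:
    print(word, skeleton(word), loose_match(word))
```

<p>Under these assumptions every form in the list above passes, but so do everyday English words like <i>monkey</i> and <i>medal</i>. A filter this generous can hardly fail to find something in any dictionary.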
<p>Observe the semantic variation as well: body parts ranging from neck to
nape to throat to breast to cheek; actions including swallowing, milking, drinking, chewing, and
sucking. Some defenders of Ruhlen & Greenberg make much of the probability of finding such lists
among given numbers of families; but notice that one can pretty much
pick and choose what languages from a family to include. If Greek
doesn't do it for you, try Latin; if Hebrew doesn't work, use Arabic.
<p>The truth is that lists like this are not hard to produce-- <i>au contraire.</i>
Just to demonstrate this I've taken a number of words at random in
<a name="chinesequechua">Chinese</a> and looked for 'cognates' in Quechua (and wherever else I could
think of one), using as best I could the level of phonetic and semantic variation evidenced in R&G's list above. If I had more dictionaries at hand I'd find you more.
<pre>Chinese ren 'person'
<br>Quechua runa 'person'
<br>
<br>Chinese ch'ung 'insect'
<br>Quechua chinchi 'type of insect'
<br>English chigger
<br>
<br>Chinese shui 'water'
<br>Quechua sut'u 'wet'
<br>French suée 'sweat'
<br>Greek hudor 'water'
<br>Dutch schuit 'boat'
<br>Turkish su 'water'
<br>
<br>Chinese shuohua 'talk'
<br>Quechua suka 'whistle'
<br>French charler 'chat'
<br>
<br>Chinese lao 'old'
<br>Quechua laqla 'old'
<br>Tok Psn. lapun 'old'
<br>
<br>Chinese nai 'breast'
<br>Quechua ñuñu 'breast'
<br>French néné 'breast'
<br>Bulgar. nenka 'breast'
<br>
<br>Chinese sheng 'rise'
<br>Quechua seqay 'rise'
<br>
<br>Chinese cheh 'this'
<br>Quechua chay 'that'
<br>French ce 'this/that'
<br>
<br>Chinese chihfan 'eat'
<br>Quechua chipay 'close mouth'
<br>French chef 'cook'
<br>
<br>Chinese chung 'middle'
<br>Quechua chawpi 'center'
<br>Italian centro 'center' (c = ch)
<br>
<br>Chinese ti 'earth'
<br>Quechua tiksimuyu 'earth'
<br>Spanish tierra 'earth'
<br>
<br>Chinese ch'ing 'please'
<br>Quechua hinay 'do thus'
<br>
<br>Chinese wang 'king'
<br>Quechua waminqa 'chief'
<br>
<br>Chinese you 'again'
<br>Quechua yapa 'addition'
<br>Spanish ya 'already'
<br>
<br>Chinese kung 'work'
<br>Quechua kunay 'carry'
<br>English gung-ho 'eager to work'
<br>
<br>Chinese ch'uan 'river'
<br>Quechua chumay 'dip in water'
<br>Spanish chupar 'drink, suck'
<br>Dutch schoon 'clean'
<br>
<br>Chinese lai 'come'
<br>Quechua riy 'go'
<br>French aller 'go'
<br>
<br>Chinese ai 'love'
<br>Quechua ayni 'mutual help'
<br>French aimer 'love'
<br>
<br>Chinese san 'mountain'
<br>Quechua senqa 'mountain peak'
<br>French chaîne 'mountain range'
<br>
<br>Chinese nü 'woman'
<br>Quechua ñusta 'princess'
<br>Dutch nuf 'aloof girl'
<br>Greek (gy)ne 'woman'
<br>Latin (femi)na 'woman'
<br>French nana 'woman'
<br>German -in fem. suffix
<br>
<br>Chinese ma 'mother'
<br>Quechua mama 'mother'
<br>French maman 'mother'
<br>
<br>Chinese nan 'difficult'
<br>Quechua nanaq 'painful'
<br>
<br>Chinese kei 'give'
<br>Quechua qoy 'give'
<br>Scots gie 'give'
</pre>
<P>Now, anyone can see that almost all of these correspondences are completely bogus. We know where French <i>suée</i> comes from, and it's <b>not</b> from Chinese. R&G really gain the benefit of obscurity here: how many of us can determine whether they are (unconsciously) playing the same kind of tricks with Tfaltik and Guamo as I am playing with Chinese and Quechua here? (Amerindian specialists, in fact, are quite skeptical about R&G's claims.)
<p>(By the way, if anyone thinks I'm using odd words or glosses, so are R&G. I can't even find their Quechua word <i>malq'a</i> in eight Quechua dictionaries; the usual Quechua word for 'throat' is <i>q'oto</i>. <i>Mallq'a</i> is the Aymara word for 'throat', but I don't know where the 'to swallow' gloss comes from; Aymara for 'to swallow' is <i>thataña</i>.)
<p>All this is intended to show how easy it is to find such spurious
correspondences. But that's not the end of it; Ruhlen & Greenberg
have the opposite problem as well: there's not only too <b>much</b> variation
in their list, there's too <b>little</b>. Languages that really are related
have diverged much more in 6000 years than some of R&G's words seem
to have diverged in at least 10,000.
<p>Hock uses Hindi and English as an example. The following words, for instance, are real cognates:
<pre>cakka: wheel
<br>pa:nch five
<br>si:~g horn
<br>chah six
<br>pissu: flea
</pre>
<p>Surely R&G would be embarrassed to pick words that far apart as cognates for Proto-World; with that level of phonetic resemblance, everything is related to everything. On the other hand, they might seize upon such a pair as Hindi <i>lu:t.</i> 'rob' and English 'loot'... but these are not cognates; English borrowed the word from Hindi. The actual English cognate of Hindi <i>lu:t.</i> is 'leaf'... which illustrates as well some of the semantic divergence that can occur in 6000 years.
<p>Applied to the Indo-European languages (which we know from careful comparative work to be related), R&G's mass comparison would yield large numbers of both false positives (<i>lu:t.</i> and loot, day and <i>dies</i>, have and <i>habere</i>) and false negatives (<i>cakka:</i> and wheel, <i>lu:t.</i> and leaf, date and dacha, milk and lettuce). Applied to unrelated languages, the method will generate long lists of bogus resemblances due to chance (as in my Quechua/Chinese comparison above).
<p>I'm tempted to say that the true cognate of <i>maliq'a</i> in English is 'malarkey'. Only the <A HREF="lang9.html#10">comparative method</a> can reveal whether any of the relationships postulated by R&G are real. But the comparative method takes time and patience, and so it's probably going to be at a disadvantage in the marketplace of ideas for a long time, compared to a method which offers quick answers of the type we want to hear.
<hr>
<h4>Or maybe Chinese does derive from Quechua?</h4>
When I first posted this stuff to the Net, one gentleman wondered aloud (wondered anet?) if I might have proved that Chinese and Quechua <b>are</b> related. Some days it's not worth getting out of bed.
<p>Similar words with similar meanings do <b>not</b> prove that languages are related. They might point to a relationship-- but they might also be due to borrowing ('gung ho' really is from Chinese); they might be due to universal processes like babytalk or onomatopoeia; and above all they may just be chance.
<p>This seems to be hard for some people to accept. Just look at <i>ren</i> and <i>runa</i>, or <i>gaijin</i> and <i>goyim</i>, they seem to think-- how could that possibly be due to chance?
<p>These people should be treated with respect. They are the people who made Las Vegas what it is today.
<p>What are the chances of finding <i>maliq'a</i>-style pseudo-cognates? Well, empirically, based on my experiences finding the above Quechua/Chinese list, the answer is "One half." That is, with a little ingenuity, and given languages with reasonably compatible phonologies, you can find a 'cognate' between two unrelated languages about once out of every two words you try.
<ul>
<li><a href="chance.htm">Here's a more rigorous calculation</a> of the chances of finding random matches between languages.
</ul>
<p>People sometimes offer statistical algorithms showing that this cannot possibly be; but a good rule of thumb is that when reality doesn't match your algorithm, you throw out the algorithm, not reality. Finding meaningless resemblances between languages <b>is</b> easy. If your probability estimates say it's not, you're doing something wrong-- most likely describing absurdly close matches when calculating your probabilities, but using absurdly loose ones when searching for cognates.
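<p>That mismatch can be put in numbers with a back-of-the-envelope sketch. It is not the calculation on the page linked above. It only assumes that each candidate word matches independently with some fixed probability, and every number in it is invented for illustration; <i>chance_of_match</i> is a hypothetical helper.

```python
def chance_of_match(p_word, n_candidates):
    """Probability that at least one of n independent candidates matches."""
    return 1 - (1 - p_word) ** n_candidates

# All four numbers below are invented for illustration only.
# Strict: one candidate word, and it must match almost exactly.
strict = chance_of_match(0.001, 1)
# Loose: 30 candidates (related meanings, synonyms), each judged by lax phonetics.
loose = chance_of_match(0.02, 30)

print(round(strict, 3), round(loose, 3))
```

<p>With these made-up inputs the strict estimate says a match is a one-in-a-thousand event, while the loose search succeeds a little under half the time. Quoting the first figure while searching by the second is the error described above.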
<p>Don't the probabilities become meaningful once you look at hundreds of words, or at many language families? Well, no. A bad methodology doesn't become more respectable just by repeating it. My Quechua/Chinese bogus cognates do not merit additional respect when I add to them a few more bogus cognates from Greek, Spanish, or French.
<p>Note that R&G's list does contain quite a few real cognates-- within families, which bulks up the list and adds to the impression of suggestive similarity without actually adding any more information. There's three Indo-European languages in the list, three Afro-Asiatic ones, three Finno-Ugric, three Dravidian, three Almosan, three Macro-Carib, four Penutian, three Hokan, and two Andean (well, Quechua and Aymara may not be related, but the two words cited certainly are). All in all there's 19 completely non-functional entries in the list, or more than half of the list.
<p>(For that matter, the situation gets worse rather than better for R&G
if recently proposed superfamilies are accepted. If Greenberg is right about Amerind, for instance, the <i>maliq'a</i> list is reduced to six cognates; if <A HREF="lang21.html#22">Nostratic</a> is accepted, it's reduced to <b>three</b>.)
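<p>The bookkeeping is easy to check against the table at the top of the page. The sketch below assumes that "non-functional" means every entry beyond one per named family, with the Proto-Amerind reconstruction counted as redundant too; the names <i>ROWS</i> and <i>AMERIND</i> are hypothetical. It doesn't attempt the Nostratic case, since that depends on which families one takes Nostratic to include.

```python
from collections import Counter

# Family label for each row of the table, in order; "" is the Proto-Amerind row.
ROWS = (
    ["Afro-Asiatic"] * 3 + ["Indo-European"] * 3 + ["Finno-Ugric"] * 3
    + ["Dravidian"] * 3 + ["Eskimo-Aleut"] + [""]
    + ["Almosan"] * 3 + ["Penutian"] * 4 + ["Hokan"] * 3 + ["Chibchan"]
    + ["Andean"] * 2 + ["Macro-Tucanoan"] + ["Equatorial"] + ["Macro-Carib"] * 3
)

# Families that Greenberg's proposal groups under Amerind in the table.
AMERIND = {"Almosan", "Penutian", "Hokan", "Chibchan", "Andean",
           "Macro-Tucanoan", "Equatorial", "Macro-Carib"}

per_family = Counter(f for f in ROWS if f)   # entries per named family
independent = len(per_family)                # one data point per family
redundant = len(ROWS) - independent          # everything beyond that

# Collapse the Amerind families (and the reconstruction) into a single group.
merged = {"Amerind" if (f in AMERIND or f == "") else f for f in ROWS}

print(len(ROWS), independent, redundant, len(merged))   # 32 13 19 6
```

<p>On that reading the 32 rows boil down to 13 independent data points, and to six if Amerind is taken as a single family.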
<p>Just to ram the point into the ground, here's another list of pseudo-cognates, this time between <a name="chineseenglish">English and Chinese</a> (and this time using pinyin, in case anyone thought I was playing some kind of trick by using Wade-Giles above).
<pre>baba 'daddy' papa
<br>bai 'white' fair (in color)
<br>ban 'remove' ban
<br>bao 'luxuriant foliage' bough
<br>bei 'low, vulgar, mean' base
<br>bei 'passive marker' by
<br>beihou 'behind' behind
<br>bengdai 'bandage' bandage
<br>bi 'pen' bic, biro
<br>bu 'book' book
<br>chang 'sing' chant
<br>chao 'stir-fry' chow
<br>chi 'eat' chew
<br>dadu 'bet' debt
<br>dage ren '12 people' dozen
<br>dai 'put on' tie 'fasten'
<br>dan 'dawn' dawn
<br>dao 'to' to
<br>dei 'must' duty, due
<br>dun 'ton' ton
<br>er 'ear' ear
<br>fazi 'way, means' fashion
<br>fei 'fly' fly
<br>feibo 'shabby, trifling' feeble
<br>feishi 'troublesome, fussy' fussy
<br>gang 'work collectively' gang 'group'
<br>gei 'give' give
<br>gouhe 'gully' gully
<br>gu (W-G ku) 'cow' cow
<br>guizi 'devil' ghost
<br>guo 'pass through' go
<br>hao 'hero' hero
<br>hong 'hum of crowd' hum
<br>huran 'suddenly' hurrying
<br>ji 'mock' jeer
<br>jiemei 'sisters' geminate
<br>jueding 'decide' judge
<br>junfa 'warlord' junta
<br>kan 'read' ken
<br>ken 'willing' can
<br>keneng 'possible' can
<br>kouyu 'spoken language' koine 'common language'
<br>kuai 'fast' quick
<br>kusi 'very similar' quasi
<br>lazhu 'hold fast' lasso
<br>lei 'flower bud' lei 'flower necklace'
<br>lian 'connect' line
<br>lianxi 'contact' link
<br>libie 'leave' leave
<br>long 'dragon' lion
<br>long 'grand' long
<br>loulie 'base, mean' lowly
<br>luedi 'conquer' loot
<br>ma 'mother' Ma
<br>ma 'horse' mare
<br>manbu 'stroll' mambo
<br>meili 'beauty' mellifluous
<br>meiju 'enumerate' measure
<br>mian 'face' mien
<br>miao 'mewing' mew, miaow
<br>moter 'model' model
<br>mubing 'raise troops' mobilize
<br>mutong 'shepherd' mutton
<br>nanti 'difficult, baffling' knotty
<br>naiyou 'cream' mayo
<br>pan 'plate' pan
<br>paxiu 'shy' bashful
<br>pei 'match, pair' pair
<br>pei 'compensate' pay
<br>po 'pour' pour
<br>sha 'shark' shark
<br>shafa 'sofa' sofa
<br>shan 'mountain' (mountain) chain
<br>shangai 'correct' change
<br>shange 'folk song' song
<br>shei 'who' she '3p fem. pron.'
<br>shenti 'health' sanity
<br>shezhi 'arrange, put' schedule
<br>shechi 'shooting' shoot
<br>shenshi 'gentleman' gentry
<br>shi 'eat' chew
<br>shi (pron. <i>shr</i>) 'true, real' sure
<br>shi 'Mrs, Madam' she
<br>shi 'see, examine' see
<br>shifu 'master, expert' chief
<br>shiming 'mission' scheme
<br>shu 'school' school
<br>shuo 'say, tell' show
<br>si 'silk' silk
<br>song 'give, send' send
<br>songge 'ballad' song
<br>soucha 'search' search
<br>sunzi 'grandson' son
<br>tamen 'they, them' them
<br>tai 'too' too
<br>ti 'tear' tear
<br>tie 'stick on' tie
<br>tou 'throw' throw
<br>toupi 'deep' deep
<br>wei 'weft' weft
<br>weida 'great' wide
<br>wen 'lukewarm' warm
<br>yun 'iron' iron
<br>xi 'drama, play' show
<br>xiang 'sound' sound
<br>xin 'suffering' sin
<br>xinshi 'new style' ginchy
<br>zeguo 'wetlands' soggy
<br>zhuan 'turn' turn
</pre>
<hr>
<A HREF="default.html">[Home]</A>
</BODY>
</HTML>