<HTML> 

<HEAD><TITLE>Deriving Proto-World </TITLE></HEAD> 

<BODY> 
<IMG  Align=Top SRC="berries.jpg">

<H3>Deriving Proto-World with tools you probably have at home</H3>

Discussions of 'Proto-World' have gotten quite a bit of press lately-- not as much as Di's divorce, but about as much as any topic in historical linguistics ever gets.  

<p>Is there anything to it?  Very probably not-- which is a pity, because getting back to Proto-World sounds like a lot of fun, and now it seems like the only alternative is to wait for aliens to come by who had a tape recorder running one or two hundred thousand years ago.

<p>Hans Henrich Hock gave a talk at CLS 29 on Ruhlen and Greenberg's "world etymology" <i>maliq'a</i> 'swallow, throat', pointing out quite a few serious methodological problems.  It may be worth repeating some of his points.  To start with, <a name="maliqa">here are R&G's supporting citations</a>:

<table>
<tr><td>Proto-Afro-Asiatic</td>	<td>Afro-Asiatic</td>	<td>*mlg</td>	<td>'suck, breast, udder'</td></tr>
<tr><td>Arabic</td>	<td>Afro-Asiatic</td>	<td>m-l-j</td>	<td>'suck the breast'</td></tr>
<tr><td>Old Egyptian</td>	<td>Afro-Asiatic</td>	<td>mndy</td>	<td>'woman's breast, udder'</td></tr>
<tr><td>Proto-Indo-European</td>	<td>Indo-European</td>	<td>*melg-</td>	<td>'to milk'</td></tr>
<tr><td>English</td>	<td>Indo-European</td>	<td>milk</td>	<td>'to milk, milk'</td></tr>
<tr><td>Latin</td>	<td>Indo-European</td>	<td>mulg-e:re</td>	<td>'to milk'</td></tr>
<tr><td>Proto-Finno-Ugric</td>	<td>Finno-Ugric</td>	<td>*m&auml;lke</td>	<td>'breast'</td></tr>
<tr><td>Saami</td>	<td>Finno-Ugric</td>	<td>mielga</td>	<td>'breast'</td></tr>
<tr><td>Hungarian</td>	<td>Finno-Ugric</td>	<td>mell</td>	<td>'breast'</td></tr>
<tr><td>Tamil</td>	<td>Dravidian</td>	<td>melku</td>	<td>'to chew'</td></tr>
<tr><td>Malayalam</td>	<td>Dravidian</td>	<td>melluka</td>	<td>'to chew'</td></tr>
<tr><td>Kurux</td>	<td>Dravidian</td>	<td>melkha:</td>	<td>'throat'</td></tr>
<tr><td>Central Yupik</td>	<td>Eskimo-Aleut</td>	<td>melug-</td>	<td>'to suck'</td></tr>
<tr><td>Proto-Amerind</td>	<td></td>	<td>*maliq'a</td>	<td>'to swallow, throat'</td></tr>
<tr><td>Halkomelem</td>	<td>Almosan</td>	<td>m@lqw</td>	<td>'throat'</td></tr>
<tr><td>Kwakwala</td>	<td>Almosan</td>	<td>m'lXw-'id</td>	<td>'chew food for the baby'</td></tr>
<tr><td>Kutenai</td>	<td>Almosan</td>	<td>u'mqolh</td>	<td>'to swallow'</td></tr>
<tr><td>Chinook</td>	<td>Penutian</td>	<td>mlqw-tan</td>	<td>'cheek'</td></tr>
<tr><td>Takelma</td>	<td>Penutian</td>	<td>m&uuml;lk'</td>	<td>'to swallow'</td></tr>
<tr><td>Tfaltik</td>	<td>Penutian</td>	<td>milq</td>	<td>'to swallow'</td></tr>
<tr><td>Mixe</td>	<td>Penutian</td>	<td>amu'ul</td>	<td>'to suck'</td></tr>
<tr><td>Mohave</td>	<td>Hokan</td>	<td>malyaqe'</td>	<td>'throat'</td></tr>
<tr><td>Walapei</td>	<td>Hokan</td>	<td>malqi'</td>	<td>'throat, neck'</td></tr>
<tr><td>Akwa'ala</td>	<td>Hokan</td>	<td>milqi</td>	<td>'neck'</td></tr>
<tr><td>Cuna</td>	<td>Chibchan</td>	<td>murki-</td>	<td>'to swallow'</td></tr>
<tr><td>Quechua</td>	<td>Andean</td>	<td>malq'a</td>	<td>'throat'</td></tr>
<tr><td>Aymara</td>	<td>Andean</td>	<td>malyq'a</td>	<td>'throat'</td></tr>
<tr><td>Iranshe</td>	<td>Macro-Tucanoan</td>	<td>moke'i</td>	<td>'neck'</td></tr>
<tr><td>Guamo</td>	<td>Equatorial</td>	<td>mirko</td>	<td>'to drink'</td></tr>
<tr><td>Surinam</td>	<td>Macro-Carib</td>	<td>e'mo:k&iuml;</td>	<td>'to swallow'</td></tr>
<tr><td>Faai</td>	<td>Macro-Carib</td>	<td>mekeli</td>	<td>'nape of the neck'</td></tr>
<tr><td>Kaliana</td>	<td>Macro-Carib</td>	<td>imukulali</td>	<td>'throat'</td></tr>
</table>

<p>Now, there's no denying that seeing such a list is suggestive, and that it 
<b>seems</b> like there must be something in it.  I'll maintain, however, that 
this is simply self-delusion-- a consequence of the human ability to make 
connections even in the face of near-random data.

<p>Take a closer look at the list; the rules for this game are evidently quite
lax.  The vowels are completely ignored.  The middle consonant varies
from <b>l</b> to <b>ly</b> to <b>lh</b> to <b>n</b> to <b>r</b> to zero.  The end consonant ranges from <b>g</b>
to <b>j</b> to <b>d</b> to <b>k</b> to <b>q</b> to <b>q'</b> to <b>kh</b> to <b>k'</b> to <b>X</b> to zero.  Switching around medial 
consonants seems to be allowed; extra consonants and syllables can appear 
where needed.  
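
<p>To get a feel for how permissive rules like these are, here is a minimal Python sketch.  The consonant classes and the function names <code>skeleton</code> and <code>loosely_matches</code> are illustrative assumptions read off the variation just described; they are not a procedure R&G state anywhere.

```python
# Coarse consonant classes; this table is an illustrative assumption,
# read off the variation described above, not a published procedure.
CLASSES = {
    "m": "M",                                   # initial consonant
    "l": "L", "r": "L", "n": "L",               # "middle" consonant
    "g": "K", "j": "K", "d": "K",               # "end" consonant
    "k": "K", "q": "K", "x": "K",
}

def skeleton(word):
    """Drop vowels and diacritics; map remaining consonants to classes."""
    return "".join(CLASSES[ch] for ch in word.lower() if ch in CLASSES)

def loosely_matches(word):
    """True if an m is followed, anywhere later, by a middle or end consonant.

    Either slot may be missing and their order is not checked, mirroring
    the 'zero' and 'switching around' allowances noted in the text.
    """
    s = skeleton(word)
    start = s.find("M")
    return start != -1 and any(c in "LK" for c in s[start + 1:])

# A sample of forms cited in the table, plus arbitrary English words.
cited = ["mlg", "mndy", "melg", "mell", "u'mqolh", "amu'ul", "moke'i", "imukulali"]
unrelated = ["monkey", "marriage", "mango", "market"]

print(all(loosely_matches(w) for w in cited))
print([w for w in unrelated if loosely_matches(w)])
```

<p>Under these assumed classes every sampled table form passes, and so do all four arbitrary English words, which is the point: a filter this loose excludes very little.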

<p>Observe the semantic variation as well: body parts ranging from neck to
nape to throat to breast to cheek; actions including swallowing, milking, drinking, chewing, and
sucking.  Some defenders of Ruhlen & Greenberg make much of the probability of finding such lists 
among given numbers of families; but notice that one can pretty much
pick and choose what languages from a family to include.  If Greek
doesn't do it for you, try Latin; if Hebrew doesn't work, use Arabic.

<p>The truth is that lists like this are not hard to produce-- <i>au contraire.</i>
Just to demonstrate this I've taken a number of words at random in 
<a name="chinesequechua">Chinese</a> and looked for 'cognates' in Quechua (and wherever else I could
think of one), using as best I could the level of phonetic and semantic variation evidenced in R&G's list above.   If I had more dictionaries at hand I'd find you more.

<pre>Chinese  ren     'person'
<br>Quechua  runa    'person'
<br>
<br>Chinese  ch'ung  'insect'
<br>Quechua  chinchi 'type of insect'
<br>English  chigger
<br>
<br>Chinese  shui    'water'
<br>Quechua  sut'u   'wet'
<br>French   su&eacute;e    'sweat'
<br>Greek    hudor   'water'
<br>Dutch    schuit  'boat'
<br>Turkish  su      'water'
<br>
<br>Chinese  shuohua 'talk'
<br>Quechua  suka    'whistle'
<br>French   charler 'chat'
<br>
<br>Chinese  lao     'old'
<br>Quechua  laqla   'old'
<br>Tok Psn. lapun   'old'
<br>
<br>Chinese  nai     'breast'
<br>Quechua  &ntilde;u&ntilde;u    'breast'
<br>French   n&eacute;n&eacute;    'breast'
<br>Bulgar.  nenka   'breast'
<br>
<br>Chinese  sheng   'rise'
<br>Quechua  seqay   'rise'
<br>
<br>Chinese  cheh    'this'
<br>Quechua  chay    'that'
<br>French   ce      'this/that'
<br>
<br>Chinese  chihfan 'eat'
<br>Quechua  chipay  'close mouth'
<br>French   chef    'cook'
<br>
<br>Chinese  chung   'middle'
<br>Quechua  chawpi  'center'
<br>Italian  centro  'center' (c = ch)
<br>
<br>Chinese  ti      'earth'
<br>Quechua  tiksimuyu 'earth'
<br>Spanish  tierra  'earth'
<br>
<br>Chinese  ch'ing  'please'
<br>Quechua  hinay   'do thus'
<br>
<br>Chinese  wang    'king'
<br>Quechua  waminqa 'chief'
<br>
<br>Chinese  you     'again'
<br>Quechua  yapa    'addition'
<br>Spanish  ya      'already'
<br>
<br>Chinese  kung    'work'
<br>Quechua  kunay   'carry'
<br>English  gung-ho 'eager to work'
<br>
<br>Chinese  ch'uan  'river'
<br>Quechua  chumay  'dip in water'
<br>Spanish  chupar  'drink, suck'
<br>Dutch    schoon  'clean'
<br>
<br>Chinese  lai     'come'
<br>Quechua  riy     'go'
<br>French   aller   'go'
<br>
<br>Chinese  ai      'love'
<br>Quechua  ayni    'mutual help'
<br>French   aimer   'love'
<br>
<br>Chinese  san     'mountain'
<br>Quechua  senqa   'mountain peak'
<br>French   cha&icirc;ne  'mountain range'
<br> 
<br>Chinese  n&uuml;      'woman'
<br>Quechua  &ntilde;usta   'princess'
<br>Dutch    nuf     'aloof girl'
<br>Greek    (gy)ne  'woman'
<br>Latin    (femi)na 'woman'
<br>French   nana    'woman'
<br>German   -in     fem. suffix
<br>
<br>Chinese  ma      'mother'
<br>Quechua  mama    'mother'
<br>French   maman   'mother'
<br>
<br>Chinese  nan     'difficult'
<br>Quechua  nanaq   'painful'
<br>
<br>Chinese  kei     'give'
<br>Quechua  qoy     'give'
<br>Scots    gie     'give'
</pre>

<P>Now, anyone can see that almost all of these correspondences are completely bogus.  We know where French <i>su&eacute;e</i> comes from, and it's <b>not</b> from Chinese.  R&G really gain the benefit of obscurity here: how many of us can determine whether they are (unconsciously) playing the same kind of tricks with Tfaltik and Guamo as I am playing with Chinese and Quechua here?  (Amerindian specialists, in fact, are quite skeptical about R&G's claims.)  

<p>(By the way, if anyone thinks I'm using odd words or glosses, so are R&G.  I can't even find their Quechua word <i>malq'a</i> in eight Quechua dictionaries; the usual Quechua word for 'throat' is <i>q'oto</i>.  <i>Mallq'a</i> is the Aymara word for 'throat', but I don't know where the 'to swallow' gloss comes from; Aymara for 'to swallow' is <i>thata&ntilde;a</i>.)

<p>All this is intended to show how easy it is to find such spurious 
correspondences.  But that's not the end of it; Ruhlen & Greenberg
have the opposite problem as well: there's not only too <b>much</b> variation
in their list, there's too <b>little</b>.  Languages that really are related
have diverged much more in 6000 years than some of R&G's words seem
to have diverged in at least 10,000.

<p>Hock uses Hindi and English as an example.  The following words, for instance, are real cognates:
<pre>cakka:       wheel
<br>pa:nch       five
<br>si:~g        horn
<br>chah         six
<br>pissu:       flea 
</pre>

<p>Surely R&G would be embarrassed to pick words that far apart as cognates for Proto-World; with that level of phonetic resemblance, everything is related to everything.  On the other hand, they might seize upon such a pair as Hindi <i>lu:t.</i> 'rob' and English 'loot'... but these are not cognates; English borrowed the word from Hindi.  The actual English cognate of Hindi <i>lu:t.</i> is 'leaf'... which illustrates as well some of the semantic divergence that can occur in 6000 years.  

<p>Applied to the Indo-European family (whose members we know from careful comparative work to be related), R&G's mass comparison would yield large numbers of both false positives (<i>lu:t.</i> and loot, day and <i>dies</i>, have and <i>habere</i>) and false negatives (<i>cakka:</i> and wheel, <i>lu:t.</i> and leaf, date and dacha, milk and lettuce).  Applied to unrelated languages, the method will generate long lists of bogus resemblances due to chance (as in my Quechua/Chinese comparison above).  

<p>I'm tempted to say that the true cognate of <i>maliq'a</i> in English is 'malarkey'.  Only the <A HREF="lang9.html#10">comparative method</a> can reveal whether any of the relationships postulated by R&G are real.  But the comparative method takes time and patience, so it will probably long remain at a disadvantage in the marketplace of ideas against a method that offers quick answers of the type we want to hear.

<hr>
<h4>Or maybe Chinese does derive from Quechua?</h4>

When I first posted this stuff to the Net, one gentleman wondered aloud (wondered anet?) if I might have proved that Chinese and Quechua <b>are</b> related.  Some days it's not worth getting out of bed.

<p>Similar words with similar meanings do <b>not</b> prove that languages are related.  They might point to a relationship-- but they might also be due to borrowing ('gung ho' really is from Chinese); they might be due to universal processes like babytalk or onomatopoeia; and above all they may just be chance. 

<p>This seems to be hard for some people to accept.  Just look at <i>ren</i> and <i>runa</i>, or <i>gaijin</i> and <i>goyim</i>, they seem to think-- how could that possibly be due to chance?

<p>These people should be treated with respect.  They are the people who made Las Vegas what it is today.

<p>What are the chances of finding <i>maliq'a</i>-style pseudo-cognates?  Well, empirically, based on my experiences finding the above Quechua/Chinese list, the answer is "One half."  That is, with a little ingenuity, and given languages with reasonably compatible phonologies, you can find a 'cognate' between two unrelated languages about once out of every two words you try.

<ul>
<li><a href="chance.htm">Here's a more rigorous calculation</a> of the chances of finding random matches between languages.
</ul>

<p>People sometimes offer statistical algorithms showing that this cannot possibly be; but a good rule of thumb is that when reality doesn't match your algorithm, you throw out the algorithm, not reality.  Finding meaningless resemblances between languages <b>is</b> easy.  If your probability estimates say it's not, you're doing something wrong-- most likely requiring absurdly close matches when calculating your probabilities, but using absurdly loose ones when searching for cognates.
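
<p>The mismatch between the two sets of criteria can be made concrete with a toy calculation.  This is only a sketch: the function name <code>chance_of_some_match</code> and all four input numbers are invented for illustration, and the tries are assumed to be independent.  Here <code>p_form</code> stands for the chance that a given candidate word passes the phonetic test, and <code>candidates</code> for how many words you allow yourself to try (nearby meanings, alternative languages in the family).

```python
def chance_of_some_match(p_form, candidates):
    """Probability that at least one of `candidates` independent tries
    passes a phonetic test that each try passes with probability p_form."""
    return 1 - (1 - p_form) ** candidates

# All four numbers below are made-up illustrations, not measurements.
strict = chance_of_some_match(p_form=1 / 1000, candidates=1)   # exact form, exact meaning
loose = chance_of_some_match(p_form=1 / 50, candidates=30)     # lax form, many tries

print(f"strict criteria: {strict:.3f}")
print(f"loose criteria:  {loose:.3f}")
```

<p>With these invented inputs the strict estimate is one in a thousand per word while the loose one comes out a little under one half, so an estimate computed one way says nothing about a search conducted the other way.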

<p>Don't the probabilities become meaningful once you look at hundreds of words, or at many language families?  Well, no.  A bad methodology doesn't become more respectable just by repeating it.  My Quechua/Chinese bogus cognates do not merit additional respect when I add to them a few more bogus cognates from Greek, Spanish, or French. 

<p>Note that R&G's list does contain quite a few real cognates-- within families, which bulks up the list and adds to the impression of suggestive similarity without actually adding any more information.  There are three Indo-European languages in the list, three Afro-Asiatic ones, three Finno-Ugric, three Dravidian, three Almosan, three Macro-Carib, four Penutian, three Hokan, and two Andean (well, Quechua and Aymara may not be related, but the two words cited certainly are).  All in all there are 19 completely non-functional entries in the list, or more than half of the list.

<p>(For that matter, the situation gets worse rather than better for R&G
if recently proposed superfamilies are accepted.  If Greenberg is right about Amerind, for instance, the <i>maliq'a</i> list is reduced to six cognates; if <A HREF="lang21.html#22">Nostratic</a> is accepted, it's reduced to <b>three</b>.) 
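
<p>The tallies above can be checked by counting rows of the table by family.  The sketch below rests on two labeled assumptions: the Proto-Amerind row, which has no family label in the table, is treated as a reconstruction rather than independent evidence; and Nostratic is taken to cover the four Old World families listed but not Eskimo-Aleut, a grouping chosen here because it reproduces the figure of three.

```python
# Family label for each of the 32 rows in the table above, in order.
# The Proto-Amerind row has no family label there; it is tagged
# "(Amerind)" here (an assumption) and not counted as a family of its own.
rows = (["Afro-Asiatic"] * 3 + ["Indo-European"] * 3 + ["Finno-Ugric"] * 3
        + ["Dravidian"] * 3 + ["Eskimo-Aleut"] + ["(Amerind)"]
        + ["Almosan"] * 3 + ["Penutian"] * 4 + ["Hokan"] * 3 + ["Chibchan"]
        + ["Andean"] * 2 + ["Macro-Tucanoan"] + ["Equatorial"]
        + ["Macro-Carib"] * 3)

OLD_WORLD = {"Afro-Asiatic", "Indo-European", "Finno-Ugric", "Dravidian"}
NOT_AMERIND = OLD_WORLD | {"Eskimo-Aleut"}

def independent(rows, merge=lambda fam: fam):
    """Count distinct groupings after applying a family-merging rule."""
    groups = {merge(fam) for fam in rows}
    groups.discard("(Amerind)")  # a reconstruction, not separate evidence
    return len(groups)

def amerind(fam):
    """Fold every New World family except Eskimo-Aleut into Amerind."""
    return fam if fam in NOT_AMERIND else "Amerind"

def amerind_and_nostratic(fam):
    """Additionally fold the four Old World families into Nostratic."""
    return "Nostratic" if fam in OLD_WORLD else amerind(fam)

total = len(rows)
families = independent(rows)
print(total, families, total - families)          # rows, families, redundant rows
print(independent(rows, amerind))                 # if Amerind is accepted
print(independent(rows, amerind_and_nostratic))   # if Nostratic is accepted too
```

<p>On these assumptions the 32 rows fall into 13 families, leaving 19 rows that add no independent evidence; accepting Amerind leaves six groupings, and accepting Nostratic as well leaves three.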

<p>Just to ram the point into the ground, here's another list of pseudo-cognates, this time between <a name="chineseenglish">English and Chinese</a> (and this time using pinyin, in case anyone thought I was playing some kind of trick by using Wade-Giles above).  

<pre>baba 'daddy'                 papa
<br>bai 'white'                  fair (in color)
<br>ban 'remove'                 ban
<br>bao 'luxuriant foliage'      bough
<br>bei 'low, vulgar, mean'      base
<br>bei 'passive marker'         by 
<br>beihou 'behind'              behind
<br>bengdai 'bandage'            bandage
<br>bi 'pen'                     bic, biro
<br>bu 'book'                    book
<br>chang 'sing'                 chant
<br>chao 'stir-fry'              chow
<br>chi 'eat'                    chew
<br>dadu 'bet'                   debt
<br>dage ren '12 people'         dozen
<br>dai 'put on'                 tie 'fasten'
<br>dan 'dawn'                   dawn
<br>dao 'to'                     to
<br>dei 'must'                   duty, due
<br>dun 'ton'                    ton
<br>er 'ear'                     ear
<br>fazi 'way, means'            fashion
<br>fei 'fly'                    fly
<br>feibo 'shabby, trifling'     feeble
<br>feishi 'troublesome, fussy'  fussy
<br>gang 'work collectively'     gang 'group'
<br>gei 'give'                   give
<br>gouhe 'gully'                gully
<br>gu (W-G ku) 'cow'            cow
<br>guizi 'devil'                ghost
<br>guo 'pass through'           go
<br>hao 'hero'                   hero
<br>hong 'hum of crowd'          hum
<br>huran 'suddenly'             hurrying
<br>ji 'mock'                    jeer
<br>jiemei 'sisters'             geminate
<br>jueding 'decide'             judge
<br>junfa 'warlord'              junta
<br>kan 'read'                   ken
<br>ken 'willing'                can
<br>keneng 'possible'            can
<br>kouyu 'spoken language'      koine 'common language'
<br>kuai 'fast'                  quick
<br>kusi 'very similar'          quasi
<br>lazhu 'hold fast'            lasso
<br>lei 'flower bud'             lei 'flower necklace'
<br>lian 'connect'               line
<br>lianxi 'contact'             link
<br>libie 'leave'                leave
<br>long 'dragon'                lion
<br>long 'grand'                 long
<br>loulie 'base, mean'          lowly
<br>luedi 'conquer'              loot
<br>ma 'mother'                  Ma
<br>ma 'horse'                   mare
<br>manbu 'stroll'               mambo
<br>meili 'beauty'               mellifluous
<br>meiju 'enumerate'            measure
<br>mian 'face'                  mien
<br>miao 'mewing'                mew, miaow
<br>moter 'model'                model
<br>mubing 'raise troops'        mobilize
<br>mutong 'shepherd'            mutton
<br>nanti 'difficult, baffling'  knotty
<br>naiyou 'cream'               mayo
<br>pan 'plate'                  pan
<br>paxiu 'shy'                  bashful
<br>pei 'match, pair'            pair
<br>pei 'compensate'             pay
<br>po 'pour'                    pour
<br>sha 'shark'                  shark
<br>shafa 'sofa'                 sofa
<br>shan 'mountain'              (mountain) chain 
<br>shangai 'correct'            change
<br>shange 'folk song'           song
<br>shei 'who'                   she '3p fem. pron.'
<br>shenti 'health'              sanity
<br>shezhi 'arrange, put'        schedule
<br>shechi 'shooting'            shoot
<br>shenshi 'gentleman'          gentry
<br>shi 'eat'                    chew
<br>shi (pron. <i>shr</i>) 'true, real' sure
<br>shi 'Mrs, Madam'             she
<br>shi 'see, examine'           see
<br>shifu 'master, expert'       chief
<br>shiming 'mission'            scheme
<br>shu 'school'                 school
<br>shuo 'say, tell'             show
<br>si 'silk'                    silk
<br>song 'give, send'            send
<br>songge 'ballad'              song
<br>soucha 'search'              search
<br>sunzi 'grandson'             son
<br>tamen 'they, them'           them
<br>tai 'too'                    too
<br>ti 'tear'                    tear
<br>tie 'stick on'               tie
<br>tou 'throw'                  throw
<br>toupi 'deep'                 deep
<br>wei 'weft'                   weft
<br>weida 'great'                wide
<br>wen 'lukewarm'               warm
<br>yun 'iron'                   iron
<br>xi 'drama, play'             show
<br>xiang 'sound'                sound
<br>xin 'suffering'              sin
<br>xinshi 'new style'           ginchy
<br>zeguo 'wetlands'             soggy
<br>zhuan 'turn'                 turn
</pre>

<hr>
<A HREF="default.html">[Home]</A>

</BODY> 
</HTML> 

