<HTML> <HEAD><TITLE>Yingzi</TITLE></HEAD> <BODY BGCOLOR="#DDDDFF"> <IMG Align=Top SRC="chinese.gif"> <H2>If English was written like Chinese</H2> <hr> <i>Also see the <a href="http://www.fatcow.com/edu/yingzi-be/">Belorussian translation</a> provided by <a href="http://www.fatcow.com/">Fatcow</a>.</i> <hr> <p>The English spelling system is such a pain, we might as well switch to <i>hanzi</i>-- Chinese characters. How should we go about it? <h3>Japanese style</h3> <p>One way would be to use hanzi directly, as the Japanese do. For instance, we'd write "work" as <img align=absmiddle src="gang.gif">, and "ruler" as <img align=absmiddle src="jun.gif">. Chinese and Japanese borrowings could be written using the original hanzi, e.g. "gung-ho" would be <img align=absmiddle src="gang.gif"><img align=absmiddle src="he.gif">, and "tycoon" would be <img align=absmiddle src="da.gif"><img align=absmiddle src="jun.gif">. <p>You can already see that this is going to be tricky. We've just given <img align=absmiddle src="gang.gif"> two readings, for instance-- /wrk/ and /gûng/-- and <img align=absmiddle src="jun.gif"> two as well-- /rulr/ and /kun/. <p>Proper names will be a problem as well. Again, Chinese, Japanese, and Korean names already have hanzi forms-- e.g. <img align=absmiddle src="fayename.gif"> for the name of the bodaciously cute singer <a href="http://www.tezcat.com/~markrose/faye.html">Faye Wong</a>-- but for English names we'd have no better recourse than to spell things out using the nearest Chinese syllables. For instance, Winston Churchill would be represented by hanzi that would be transliterated <i>Wensuteng Chuerqilu.</i> <h3>Chinese style</h3> <p>Maybe there's a better approach. Instead of using hanzi directly, let's invent a new system-- we'll call it <i>yingzi</i>, "English characters"-- that would work for English exactly as hanzi works for Chinese. 
<p>The basic principle will be, <b>one yingzi for a syllable with a particular meaning.</b> So <i>two</i>, <i>to</i>, and <i>too</i> will each have their own yingzi. (If we were creating a syllabary, by contrast, we'd write all three with the same symbol, the one for /tu/.) <p>Does that mean we need a completely separate symbol for each of the thousands of possible English syllables? Not at all. We can simplify the task enormously with one more principle: <b>syllables that rhyme can have yingzi that are variations on a theme</b>. <h3>Little pictures</h3> <p>You've been reading for half a page and are probably wondering why I haven't yet talked about pictograms. When do we get to draw little pictures? <p>Well, now's the time. Let's draw pictures. For instance: <table> <tr> <td><img align=absmiddle src="horse.gif"><br><i>horse</i> <td><img align=absmiddle src="mount.gif"><br><i>mount</i> <td><img align=absmiddle src="king.gif"><br><i>king</i> <td><img align=absmiddle src="man.gif"><br><i>man</i> <td><img align=absmiddle src="child.gif"><br><i>child</i> <td><img align=absmiddle src="bug.gif"><br><i>bug</i> <td><img align=absmiddle src="sun.gif"><br><i>sun</i> <td><img align=absmiddle src="moon.gif"><br><i>moon</i> <td><img align=absmiddle src="tree.gif"><br><i>tree</i> </tr></table> When the pictures are abstract we can call them "ideograms", but they still represent particular English morphemes: <table> <tr> <td><img align=absmiddle src="one.gif"><br><i>one</i> <td><img align=absmiddle src="not.gif"><br><i>un-</i> <td><img align=absmiddle src="per.gif"><br><i>per</i> </tr></table> Some of our pictures will be kind of clever. For instance, <img align=absmiddle src="woods.gif"> <i>woods</i> repeats the yingzi for <i>tree</i>, while <img align=absmiddle src="east.gif"> <i>east</i> is a little picture of the sun rising through the trees. <img align=absmiddle src="guilt.gif"> <i>guilt</i> is a picture of a man inside an enclosure. <p>Let's not go crazy, however. 
We only need a thousand or so, and we'll restrict ourselves to fairly simple, one-syllable words. We'll derive the vast majority of our yingzi from this basic stock of pictures. <h3>Phonetic classes</h3> <p>Basically each simple yingzi will be the basis for an open-ended set of yingzi, used for a set of <b>rhyming syllables</b>. For instance, the <i>king </i>character <img align=absmiddle src="king.gif"> will generate the family <i>king</i>, <i>thing, sing, sling, sting, shing(le).</i> <p>It would be awfully confusing to use <img align=absmiddle src="king.gif"> for all of these. Instead we'll use it only for <i>king</i>, which will be the <b>phonetic</b> for this set, and add little signs called <b>radicals</b> to distinguish the rest. Examples: <ul> <li><i>sing</i> will be <img align=absmiddle src="sing.gif">, formed by adding the <i>mouth</i> radical <li><i>sting</i> will be <img align=absmiddle src="sting.gif">, formed by adding the <i>bug</i> radical (since insects sting) <li><i>shing</i> (the first syllable in <i>shingle</i>) will be <img align=absmiddle src="shing.gif">, formed using the <i>roof </i>radical <li><i>sling</i> <img align=absmiddle src="sling.gif"> will be formed using the <i>spear</i> radical. </ul> When we add a radical, we scrunch up the yingzi so the whole thing still fits into a square. All characters, however complex, fit into the same size box. <p>"Rhyming" isn't quite accurate. We don't want each family of words to get <i>too</i> large; so we'll restrict a single family to either voiced or unvoiced initial consonants. <p>So, <i>bring, ring, Bing, wing, zing</i> will form a separate family of yingzi, based on the character <img align=absmiddle src="wing.gif"> <i>wing.</i> <h3>Overlaps and secondary derivations</h3> <p>The yingzi formed from a single phonetic will all rhyme; but not all syllables that rhyme will necessarily have the same yingzi. 
This is largely because we started with a set of pictograms chosen for their pictorial rather than phonetic qualities; but it also adds visual distinctions to the script, and thus aids the reader. (It rather burdens the writer; but heck, everyone does a lot more reading than writing.) <p>For instance, the phonetic <img align=absmiddle src="not.gif"> <i>un- </i>will be used for <i>fun, ton, pun, thun(der), Hun,</i> etc. But <i>sun</i> will have its own yingzi, <img align=absmiddle src="sun.gif">, and this will be used for <i>son, shun, stun, spun</i>. Thus <i>sun</i> plus the <i>man</i> radical makes <img align=absmiddle src="son.gif"> <i>son</i>, and <i>sun </i>with the <i>fight</i> radical is used for <img align=absmiddle src="shun.gif"> <i>shun</i>. <p>Moreover, a compound yingzi may itself be used as a phonetic with its own set of yingzi. The <i>shun</i> character <img align=absmiddle src="shun.gif">, for instance, will be used with the <i>work</i> radical to form <img align=absmiddle src="tion.gif"> <i>-tion</i>, used to spell this common suffix, as in <img align=absmiddle src="sect.gif"><img align=absmiddle src="tion.gif"> <i>section</i>. <h3>Radicals</h3> <p>Where do the radicals come from? For the most part they are either simple characters (e.g. <i>king, work</i>), or abbreviations of characters; for instance the character <img align=absmiddle src="net.gif"> <i>net</i> is abbreviated to <img align=absmiddle src="netr.gif"> when used as a radical. <p>The set of radicals is not unlimited; there is in fact a <b>fixed set</b> of 214 of them. The total number of yingzi that belong to one phonetic set is thus absolutely limited to 214. No set will actually have this number of yingzi, though some will have a few dozen. <p>(However, the potential number of yingzi is still unlimited, because we can always choose a compound yingzi as a new phonetic, and generate a new set of rhyming yingzi from it.) 
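<p>For readers who think in code, the radical-plus-phonetic scheme described so far can be modeled as a small recursive data structure. This is only an illustrative sketch: the class name <i>Yingzi</i>, its fields, and the helper methods are inventions for this example, and the only data used are the <i>sun</i>, <i>son</i>, <i>shun</i>, and <i>-tion</i> derivations given above.

```python
from dataclasses import dataclass
from typing import Optional

RADICAL_COUNT = 214  # size of the fixed radical set, as stated in the text


@dataclass(frozen=True)
class Yingzi:
    """Hypothetical model: a character is a phonetic base plus one radical."""
    reading: str                         # the syllable it spells, e.g. "shun"
    phonetic: Optional["Yingzi"] = None  # None for a simple pictogram
    radical: Optional[str] = None        # None for a simple pictogram

    def depth(self) -> int:
        """Number of radicals stacked on top of the original pictogram."""
        return 0 if self.phonetic is None else 1 + self.phonetic.depth()

    def root(self) -> "Yingzi":
        """The simple pictogram this character ultimately derives from."""
        return self if self.phonetic is None else self.phonetic.root()


# Derivations taken from the text.
sun = Yingzi("sun")
son = Yingzi("son", phonetic=sun, radical="man")
shun = Yingzi("shun", phonetic=sun, radical="fight")
tion = Yingzi("-tion", phonetic=shun, radical="work")  # compound used as phonetic

print(tion.depth(), tion.root().reading)
```

Each phonetic can take a given radical only once, which is why a single set is capped at RADICAL_COUNT members; the recursion through <i>phonetic</i> is what keeps the total number of characters open-ended.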
<p>Because the set of radicals is limited, a really good radical will not always be available to distinguish the yingzi in a rhyming set. We'll just choose the best one we can. In addition, when choosing radicals we will rely on the <b>etymological</b> meaning of a word, which may not always match its current meaning. For instance, the word <i>villain</i> originally meant <i>peasant</i>, and so the sign for <i>vill-</i> <img align=absmiddle src="vill.gif"> uses the <i>field</i> radical (added to the phonetic <i>bill</i>). <p>The yingzi that use a particular radical will form a class of their own-- a sort of <b>meaning class</b>. We can consider the entire English language to be divided into 214 meaning categories. For instance, every yingzi that uses the <i>bug</i> radical will have something to do (at least etymologically) with insects or reptiles. However, since the number of radicals is so limited, and because the choice of radical is sometimes quirky, the resulting sets will be rather vague and eccentric. <h3>Guessing at an unknown character</h3> <p>There will be tens of thousands of yingzi; but we must not let this frighten us. There are tens of thousands of conventional spellings, too, but despite what the wiseacres say, it would be absurd to say that there's no logic to English orthography at all. Likewise, the yingzi themselves are not the basic graphical units or <b>graphemes</b> of the writing system; the phonetics and radicals are. <p>Readers can make use of this fact to guess the pronunciation of an unknown character. For instance, <img align=absmiddle src="curse.gif"> is a straightforward combination of the <img align=absmiddle src="speak.gif"> <i>speech </i>radical with the phonetic <img align=absmiddle src="purse.gif"> <i>purse</i>. A type of speaking that rhymes with purse-- <i>curse</i>, of course. 
<p>Or, <img align=absmiddle src="wilt.gif">, a combination of the <i>plant</i> radical <img align=absmiddle src="plant.gif"> with the <i>guilt </i>phonetic <img align=absmiddle src="guilt.gif">. Something about plants that rhymes with guilt? This one is a bit harder-- <i>wilt</i>. <p><img align=absmiddle src="peach.gif"> -- a plant (radical <i> plant</i>) that rhymes with <img align=absmiddle src="speak.gif"> <i>speech</i>-- is easy: <i>peach</i>. But note that <i>speech</i>, which we used as a radical above, is used as a phonetic here. <p>Since there are many more phonetics than radicals, the <b>information content</b> of the radical is much less than that of the phonetic. If you knew only the radical for an unknown character, you could only narrow down the meaning to 1/214 of the lexicon; if you knew only the phonetic, you could narrow it down much further, since there are more than a thousand phonetics. <h3>Polysyllabic words</h3> <p>Where possible we will divide a word into morphemes. For instance <i>outsider</i> breaks into <i>out + side + -er</i>; <i>reshipment</i> is <i>re- + ship + -ment</i>. <p>How do we handle morphemes of more than one syllable? We simply create a yingzi for each syllable. For instance, <i>person</i> would be expressed as <img align=absmiddle src="pers.gif"><img align=absmiddle src="son.gif">. The first character is based on <img align=absmiddle src="per.gif"> <i>per</i>, with the addition of the <i>man</i> radical; the second is <img align=absmiddle src="sun.gif"> <i>sun</i> with the addition of the same radical. <p>A polysyllabic morpheme, in fact, can generally be recognized because all the syllables have the same radical. For instance, <img align=absmiddle src="insect.gif"> <i>insect</i> consists of <i>in </i>and <i> sect</i>, each with the addition of the <i>bug</i> radical. (Note that <i>sect</i> is itself a compound character, formed from the <i>rite</i> radical with the <i>specked</i> phonetic.) 
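<p>To put rough numbers on the earlier comparison between radicals and phonetics, here is a back-of-the-envelope calculation. It assumes, unrealistically, that every radical and every phonetic is equally likely, and it uses 1000 as a lower bound for "more than a thousand" phonetics; both are simplifications made for this sketch, not figures from any corpus.

```python
import math

RADICALS = 214    # size of the radical set given in the text
PHONETICS = 1000  # assumed lower bound for "more than a thousand" phonetics


def bits(choices: int) -> float:
    """Information gained by learning one of `choices` equally likely options."""
    return math.log2(choices)


radical_bits = bits(RADICALS)
phonetic_bits = bits(PHONETICS)

print(f"radical:  about {radical_bits:.2f} bits")
print(f"phonetic: about {phonetic_bits:.2f} bits")
```

Under these assumptions the phonetic carries a couple more bits than the radical, which is another way of saying it narrows the candidates several times further.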
<h3>Inflections</h3> <p>How about inflections that don't form a full syllable, such as plural <i>-s</i>? It would be pretty tiresome, even with the add-a-radical trick, to create thousands of yingzi for syllables that just happen to have a final <i>-s</i>. <p>Note, however, that the plural morpheme sometimes takes up its own syllable, as in <i>grasses</i>, <i>rashes</i>. So why not use the yingzi for <i>is</i>, which is <img align=absmiddle src="is.gif">? Of course, <i>is</i> and <i>-s</i> are both pretty common, so we should add a little dot to the character to represent final -s: <img align=absmiddle src="s.gif">. So <i>peach</i> is <img align=absmiddle src="peach.gif">, <i>peaches</i> is <img align=absmiddle src="peach.gif"><img align=absmiddle src="s.gif">; <i>sun</i> is <img align=absmiddle src="sun.gif">, <i>suns</i> is <img align=absmiddle src="sun.gif"><img align=absmiddle src="s.gif">. We can use a similar strategy for other inflections. <h3>Foreign words</h3> <p>Very old borrowings (e.g. the mass of words borrowed in medieval and Renaissance times from French and Latin) will be treated like native words. We've already seen examples like <i>peach, villain, insect,</i> and <i>person.</i> <p>Words borrowed more recently, however, won't get their own radical+phonetic compounds. Instead we'll represent them, syllable by syllable, using the nearest existing characters. For instance, <i>Peking</i> will be represented as <img align=absmiddle src="seeg.gif"><img align=absmiddle src="king.gif">. The first character is the first syllable of <i>pecan</i> (that is, <i>pe-;</i> phonetic <i>see</i>, radical <i>gourd</i>), and the second is the word <i>king. </i>The name <a name="fellini"><i>Fellini</i></a> will be written <img align=absmiddle src="fell.gif"><img align=absmiddle src="bean.gif"><img align=absmiddle src="knee.gif">, composed of the yingzi <i>fell, lean, knee</i>. 
(You may amuse yourself working out <a href="#answers">what the phonetics and radicals are</a> for these three characters.) <h3>Dictionaries</h3> <p>English dictionaries would no longer be arranged alphabetically, of course, since we're no longer using an alphabet. They'll be organized <b>by radical</b>. <p>The 214 radicals are ordered according to the number of strokes needed to draw them. Radicals of one stroke (e.g. <img align=absmiddle src="one.gif"> <i>one</i> or <img align=absmiddle src="per.gif"> <i>per</i> ) come first, followed by radicals of two strokes (e.g. <img align=absmiddle src="not.gif"> <i>un-</i>), and so on, up to monstrosities like <img align=absmiddle src="toad.gif"> <i>toad</i>, which has 20 strokes. <p>The section for each radical is also organized by stroke number. Under the <i>plant</i> radical, for instance, the first entry is <img align=absmiddle src="plant.gif"> <i>plant</i> itself, followed by characters with one extra stroke (like <img align=absmiddle src="dron.gif"><i> dron</i>, the last character in <i>rhododendron</i>), then characters with two extra strokes, and so on (up to <img align=absmiddle src="toads.gif">, the first character in <i>toadstool</i>). <p>Note that there are no main entries for what we're used to calling <b>words</b> at all. There wouldn't be a main entry for a word like <i>person</i>, for instance. There would be an entry for the <i>man</i> radical; under it a sub-entry for the character <i>per-</i>, and <i>person</i> would be listed as a sub-sub-entry under that. <h3>Thinking in yingzi </h3> <p>The nature of the writing system would encourage lexicographers (and English speakers) to think of everything in the language as <b>built out of yingzi</b>. 
There wouldn't seem to be a great difference between "words" like <i>storehouse</i>, <i>storage, restore </i>and "expressions" like <i>shoe store, store up, store detective, store manager</i>; or between <i>blackboard</i> and <i>black eye</i>, or between <i>alphabet</i> and <i>alpha male</i>. <p>Many morphemes that now live out a shadowy existence, forever bound to other morphemes, would take on an <b>independent existence</b>; for instance the <i>volve</i> in <i>revolve, evolve, involve, devolve</i>, which would have its own yingzi, and would seem as much a "word" or component of the language as the <i>match</i> in <i>rematch, mismatch, unmatch</i>. There would be a tendency to describe the meanings, vague or miscellaneous as they might be, for such characters. <p>This might seem sensible and even wise for a morpheme like <i>volve</i>, which after all derives from a real Latin root meaning <i>roll</i>; but there would be other, more <b>dubious applications</b>. For instance, the <i>son</i> in <i>person </i>was represented by <img align=absmiddle src="son.gif">, which happens to be the yingzi for <i>son.</i> It will be almost impossible not to assume that <i>person</i> derives from <i>son</i>; but historically it's just a coincidence; <i>person </i>derives from Latin and has nothing to do with <i>son</i>. <p>Worse yet, the -<i>cuit</i> of <i>biscuit</i> and <i>circuit</i> might be written with the same character (a derivative of <i>kit</i>), and a meaning sought for it-- perhaps 'round', since biscuits are round and circuits involve going round. Again, etymologically this is nonsense. <p>Words, perceived as compounds, might lend themselves to <b>abbreviation</b>. After all, why write two yingzi when one will do, especially if it unmistakably implies its partner? 
For instance, <i>language</i> would be a two-character word <img align=absmiddle src="lang.gif"><img align=absmiddle src="gwidge.gif">, each character defined only as part of this compound and used nowhere else in the language. If you've written <img align=absmiddle src="lang.gif"> <i>lang</i>, you must write <img align=absmiddle src="gwidge.gif"> <i>gwidge</i> next. You might as well just write <img align=absmiddle src="lang.gif"> <i>lang</i> and leave it at that. Ultimately of course <img align=absmiddle src="lang.gif"> will acquire a meaning of its own-- namely <i>language. </i>And for consistency's sake lexicographers might well give <i>gwidge</i> a meaning of its own as well-- namely, <i>language</i>. <p>The complexities of the writing system, the inherent interest of the pictorial elements, the cleverness inherent in graphic compounds like <img align=absmiddle src="woods.gif"> <i>woods</i> and the radical-phonetic system, and even sociological facts such as the time it takes to learn the system, and the fact that English speakers of all nations can use it whatever their native dialect, would also combine to give the writing system an <b>overwhelming character</b> of its own. It would be seen as more important than speech; there would even be a tendency to think of <b>words as derived from characters</b> rather than the other way around. <p>If someone asks where a word comes from, we (now) think of its original phonetic form; we say for instance that <i>language</i> comes from French<i> langage</i>, itself derived from Latin <i>lingua</i> 'tongue', which in turn comes from Proto-Indo-European <i>dnghu</i>. With the yingzi system, people would be tempted instead to give what we might call the <b>graphic etymology</b>. They'd say that <i>lang</i> derives from the <i>speech</i> radical and the <i>gang </i>phonetic, and that the latter is actually a picture of a gang-- a reduplication of the <i>man</i> character. That is indeed where <img align=absmiddle src="lang.gif"> comes from, but not <i>lang</i>, which did not derive from it! 
(But it wouldn't even be easy to make this point in yingzi-- how do you distinguish <i>lang</i> from <img align=absmiddle src="lang.gif"> if you can't even write "lang" without writing the character?) <h3>A word is a word is a word</h3> <p>Does all this mean that <b>words are cultural constructs</b> or that the concept of a word would no longer apply to English written in yingzi? Not at all. A <b>word</b> is still a useful linguistic concept-- or rather a series of overlapping concepts. By <i>word</i> linguists may mean one or all of the following: <ul> <li>a phonological unit-- e.g. something with one stress accent or one pitch contour; or a unit within which intervocalic stops get voiced. <li>the abstraction underlying a set of morphological forms (e.g. <i>write</i> underlying <i>write, writes, writing, written, wrote</i>). <li>an element which can stand alone (e.g. in response to a suitably chosen question), as suffixes or bound morphemes cannot. <li>a morphological unit you can't insert other morphemes into (e.g. <i>black dog</i> is not a word since you can change it to <i>black, tired dog</i>; but you can't turn <i>blackbird</i> into <i>blacktiredbird</i>) <li>an expression with a conventional meaning-- something that has to be defined in the mental lexicon (this sense is also called a <b>lexeme</b>). </ul> <p>A moment's thought should show that these definitions <b>may or may not coincide</b> even in English; and that even where they do they may not coincide with the <b>typographical</b> or lexicographical notion of a word. The latter idea-- roughly 'something with spaces around it'-- is of little interest to linguists since it depends on the writing system. That makes it useless for describing most of the languages of the world; and even for written languages it's pretty arbitrary, as this page should show. (Everything you know about writing English would change if we adopted yingzi instead.) 
<p>It's safe to say, however, that such definitions would seem fairly abstract in a yingzi system. <i>Word</i> might become a technical term, like <i>morpheme</i> or <i>lexeme</i>. Or it might be identified with a yingzi (a written character); or be abstracted into a more vaguely defined linguistic element, applicable to anything from a character to a compound to a whole phrase. <hr> <h3>Hey, did I just learn something about Chinese?</h3> <p>I've attempted in this sketch to lay out, by analogy, the nature and structure of the Chinese writing system. All of the concepts apply: <ul> <li>the limited role of pictograms <li>the clever compound pictures (indeed all three examples are from Chinese) <li>the phonetic-and-radical system (97% of Chinese characters work this way) <li>the inclusion of radicals as part of the character (rather than as separate symbols, as in cuneiform or hieroglyphic writing) <li>the relative information content of radicals and phonetics <li>compounds used as secondary phonetics <li>the handling of multisyllabic and foreign words <li>the handling of subsyllabic morphemes (the model here is Mandarin <i>-r</i>, represented by <i>ér</i>) <li>the organization of dictionaries (in fact, the graphic at the top of the page shows part of the radical index for a Chinese dictionary, organized by stroke count) <li>the psychological effects. </ul> <p>The <b>radicals</b> named are all also Chinese radicals. The <b>phonetics</b> are not, of course, since the phonetics in hanzi refer to the sounds of Chinese words, not English ones. But I tried to pick phonetics which would also be phonetics in Chinese (e.g. <i>sun, king, wing, tree, one, east, field, bill</i>). <p>There are differences, too. For instance, I haven't made any attempt to make my yingzi look like hanzi. <p>The phonetic sets of Chinese are not exactly based on rhymes. Karlgren explains that the hanzi belonging to one set had homorganic initial consonants (e.g. 
k, g), the same main vowel, and the same final consonant. <p>I've also <b>underreported the complexity</b> (and arguably the inefficiency) of the Chinese script in several important ways: <ul> <li>The phonetic sets in Chinese, though still useful, are two thousand years out of date. It's as if my yingzi phonetics had to rhyme in Proto-Germanic, not in modern English. <li>The scribes who devised hanzi often went wild adding radicals, creating multiple characters for what are etymologically the same root. <li>Four millennia have reduced the pictorial content of the hanzi primitives almost to nil. What the "pictograms" are pictures of is often evident only to the scholar. <li>Clear and precise handwriting is by no means a virtue in Chinese; the most admired style, <i>câoshu</i>, is highly simplified, suggesting rather than delineating the characters intended. <li>The People's Republic has simplified many of the traditional hanzi; and this reform has been accepted in Singapore but not in Taiwan or Hong Kong. It's as if the US had its own versions of a large fraction of English yingzi. </ul> <p>I also haven't gotten into the many additional complications engendered when hanzi were adopted by Japanese, Korean, or Vietnamese; for more on that see John DeFrancis's <i>The Chinese Language: Fact and Fantasy.</i> <p>In some respects, however, yingzi are <b>harder</b> than hanzi. For instance, English has many more multisyllabic morphemes than Chinese. Only about 10% of Chinese morphemes are more than one syllable long. Also, English has borrowed so much that it often has five or six morphemes where Chinese would have just one-- compare <i>wáng</i> vs. <i>king, regal, royal, regicide, Rex</i>, or <i>zì</i> vs. 
<i>word, verb, logograph, bon mot.</i> <p><i>--Mark Rosenfelder</i> <hr> <a href="http://www.zompist.com/default.html">[ Home ]</A> <ul> <li>Check out this <a href="http://zhongwen.com/">marvelously interactive Chinese dictionary</a>, a great way to explore how hanzi work. </ul> <hr> <a name="answers"><img align=absmiddle src="fell.gif"><img align=absmiddle src="bean.gif"><img align=absmiddle src="knee.gif"></a> <b><i><a href="#fellini">Fellini:</a></i></b> <i>fell</i> has the radical <i>vertical</i> and the phonetic <i>sell</i>; <i>lean</i> has the radical <i>stand</i> and the phonetic <i>bean</i>; and <i>knee</i> has the radical <i>body</i> and phonetic <i>tree.</i> </BODY></HTML>