<HTML>
<HEAD>
<TITLE>Sounds: The Sound Change Applier</TITLE>
<style>
h2
{color:#A60000;}
h3
{color:#C08700;}
h4
{color:#C08700;}
h5
{color:#C08700;}
h6
{color:#C08700;}
tt
{color:#A60000;
font-weight:bold;
font-family:"Gentium";}
</style>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000">
<table width="100%">
<tr><td bgcolor="#EEC25A">
<h2><br> <a href="kit.html"><img src="kit-gears.gif" border=0 align="absmiddle" height="53" width="60"></a> sounds: The Sound Change Applier</h2></td></tr>
</table>
This page describes a simple program which can apply a set
of sound changes to a lexicon. You can use <b><tt>sounds</tt></b> to help work
out a <b>reconstruction</b> for actual languages, to create plausible
descendants of a <b>conlang</b>, or in fact to make any structured set
of lexical changes to a database of words.
<blockquote><font color="red"><b>I suggest using <a href="sca2.html">the more powerful SCA²</a> instead.</b> It has many more features, works in your browser, and supports Unicode.</font></blockquote>
<p>The program is available in three forms. Please note: <b>right-click</b> on the links to
the executables, pick Save Target As, and save them to your disk.
<ul>
<li><a href="sounds.c.htm">C source code</a>, which you can compile
on any system (I've used it successfully on Windows, Unix, and MacOS),
and modify for your own use
<li><a href="sounds.exe">a Windows executable</a>.
<li>A Mac executable, which you can download either in <a href="RunSounds.bin">MacBinary II</a> or
<a href="RunSounds.hqx">BinHex (<tt>hqx</tt>) format</a>.
</ul>
<br><a href="kit.html">[ Back to the Language Construction Kit ] </a>
<br><a href="default.html">[ Back to Metaverse ]</a>
<hr>
<h3>Basic operation</h3>
<b><tt>sounds</tt></b> reads a text file containing a lexicon, applies a set of
sound changes described in another text file, and outputs the results
to a third file (and to the screen).
<p>For instance, <b><tt>sounds</tt></b> will take
the two <font color="#008080">input files</font> on the left
and produce <font color="#808000">the output file</font> on the right:
<blockquote>
<table>
<tr><td> <b><tt>latin.lex</tt></b>
<td> <b><tt>port.sc</tt></b>
<td> <b><tt>port.out</tt></b>
<tr><td><pre>
<tt><font color="#008080">lector</font></tt>
<tt><font color="#008080">doctor</font></tt>
<tt><font color="#008080">focus</font></tt>
<tt><font color="#008080">jocus</font></tt>
<tt><font color="#008080">districtus </font></tt>
<tt><font color="#008080">civitatem</font></tt>
<tt><font color="#008080">adoptare</font></tt>
<tt><font color="#008080">opera</font></tt>
<tt><font color="#008080">secundus </font></tt>
</pre>
<td><pre>
<tt><font color="#008080">V=aeiou</font></tt>
<tt><font color="#008080">C=ptcqbdgmnlrhs </font></tt>
<tt><font color="#008080">F=ie</font></tt>
<tt><font color="#008080">B=ou</font></tt>
<tt><font color="#008080">S=ptc</font></tt>
<tt><font color="#008080">Z=bdg</font></tt>
<tt><font color="#008080">s//_#</font></tt>
<tt><font color="#008080">m//_#</font></tt>
<tt><font color="#008080">e//Vr_#</font></tt>
<tt><font color="#008080">v//V_V</font></tt>
<tt><font color="#008080">u/o/_#</font></tt>
<tt><font color="#008080">gn/nh/_</font></tt>
<tt><font color="#008080">S/Z/V_V</font></tt>
<tt><font color="#008080">c/i/F_t</font></tt>
<tt><font color="#008080">c/u/B_t</font></tt>
<tt><font color="#008080">p//V_t</font></tt>
<tt><font color="#008080">ii/i/_</font></tt>
<tt><font color="#008080">e//C_rV </font></tt>
</pre>
<td><pre>
<tt><font color="#808000">leitor [lector]</font></tt>
<tt><font color="#808000">doutor [doctor]</font></tt>
<tt><font color="#808000">fogo [focus]</font></tt>
<tt><font color="#808000">jogo [jocus]</font></tt>
<tt><font color="#808000">distrito [districtus]</font></tt>
<tt><font color="#808000">cidade [civitatem]</font></tt>
<tt><font color="#808000">adotar [adoptare]</font></tt>
<tt><font color="#808000">obra [opera]</font></tt>
<tt><font color="#808000">segundo [secundus] </font></tt>
</pre>
</table></blockquote>
<h3><a name="command">The command line</a></h3>
The program can be run with command line parameters (all of which are optional):
<blockquote><tt><font color="#008080">
sounds latin port <i>control-parameters</i>
</font></tt></blockquote>
<p>The <b>control parameters</b> can be any of the following: <tt><font color="#008080">-p -b -l -f</font></tt>.
<p><tt><font color="#008080">-p</font></tt> tells <b><tt>sounds</tt></b> to <b>print out which rules apply </b>
to each word:
<tt><font color="#008080">
<br> s-> /_# applies to secundus at 7</font></tt>
<p><tt><font color="#008080">-b</font></tt> prints the output
with the original word in brackets (suitable for using as the basis of a lexicon with etymologies):
<tt><font color="#008080">
<br> leitor [lector]</font></tt>
<br>Without this parameter, the output looks like this:
<tt><font color="#008080">
<br> lector --> leitor </font></tt>
<p><tt><font color="#008080">-l</font></tt> (which overrides <tt><font color="#008080">-b</font></tt> if both are present)
omits the source word from the output, leaving <b>only output words</b>, like this:
<br> <tt><font color="#008080">
leitor</font></tt>
<br>The resulting output file is suitable for use as a new <tt>.lex</tt> file.
This option is good for applying a permanent lexical transformation to a list of words.
<p><tt><font color="#008080">-f</font></tt> directs output to the <b>output file only</b>,
rather than to both the screen and the file. This option is useful for very long vocabulary lists.
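<p>The three output styles described above can be summarized in a few lines of Python. This is only an illustrative sketch, not the program's actual C code; the function name <tt>format_result</tt> and its keyword arguments are made up for this example.

```python
def format_result(original, result, brackets=False, list_only=False):
    """Render one output line for a (source word, changed word) pair.

    brackets mimics -b and list_only mimics -l; as described above,
    -l takes priority over -b when both are given.
    """
    if list_only:
        return result
    if brackets:
        return f"{result} [{original}]"
    return f"{original} --> {result}"


print(format_result("lector", "leitor"))                 # lector --> leitor
print(format_result("lector", "leitor", brackets=True))  # leitor [lector]
print(format_result("lector", "leitor", brackets=True, list_only=True))  # leitor
```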
<p>The first two non-control parameters are taken as <b>filenames</b>:
the first gives the name of <a href="#lex">the <tt>.lex</tt> file</a>, containing the <b>lexicon</b>;
the second gives the name of <a href="#sc">the <tt>.sc</tt> file</a>, containing the <b>sound changes</b>
or lexical rules to apply. The extensions should be left off.
<ul>
<li>If <b>filenames</b> are given, the program will run once, against the
selected files, and then exit.
<li>If <b>no filenames</b> are given, the program will ask for the
input files, produce the output, and ask for more files, continuing
till you enter <tt><font color="#008080">q</font></tt> to quit.
</ul>
<b>Output</b> will be printed to the screen, and also written to a file <tt><i>name</i>.out</tt>
where <tt><i>name</i></tt> is the name of the <tt>.sc</tt> file:
in the example above, <tt>port.out</tt>.
<p>I find myself running the program multiple times, tweaking the rules
or the vocabulary in between runs.
<blockquote><font color="#000080"><h3>Common Windows problems</h3>
<p>To use command line parameters you have to have a <b>command line</b>.
That means running the program in a command window.
Look under Programs/Accessories and run Command Prompt.
Now type <b><tt>cd</tt></b> plus the name of the directory
where you downloaded <b><tt>sounds</tt></b>-- e.g. <b><tt>cd c:\downloads\</tt></b>.
As with all commands in the command prompt, hit Enter.
Now you can run <b><tt>sounds</tt></b> as described above.
<p>If <b><tt>sounds</tt></b> says your file <b>could not be read in</b> and you're sure
it's there-- you probably have file extensions turned off, and what you think
is a .lex or .sc file is really a Notepad (.txt) file. The easiest thing to do is to
re-save the file as a real .lex file. In Notepad, for instance, change the
"Save as type" dropdown to "All files" instead of "Text documents". Then
it won't add .txt to your file name. If you've done this right, the file won't
have the Notepad icon, and if you double-click it Windows will ask what app to open
it with; select Notepad.
</font></blockquote>
<h3><a name="lex">The <tt><b>.lex</b></tt> file</a></h3>
The <tt>.lex</tt> file is just a text file, consisting of a list of
words, one per line.
<h3><a name="sc">The <tt><b>.sc</b></tt> file</a></h3>
The key to the operation of <b><tt>sounds</tt></b>
is the <tt>.sc</tt> file.
This text file contains two things: definitions of <b>variables</b>, and a set
of <b>rules</b> or <b>sound changes</b>.
<p>Variable definitions should come first, one per line; then sound changes, one per line.
A line beginning with <tt><font color="#008080">*</font></tt>
will be taken as a <b>comment</b> and ignored.
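<p>To make the file layout concrete, here is a small Python sketch that separates a <tt>.sc</tt> file into variables and rules. It is not how <tt>sounds</tt> itself is written; in particular, telling the two kinds of line apart by looking for <tt>/</tt> and <tt>=</tt> is an assumption of this example, and <tt>parse_sc</tt> is a hypothetical name.

```python
def parse_sc(text):
    """Split the text of a .sc file into (variables, rules).

    Assumption for this sketch: a line containing "/" is a rule, and any
    other line containing "=" is a variable definition.
    """
    variables = {}
    rules = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("*"):
            continue  # skip blank lines and comments
        if "/" in line:
            rules.append(line)
        elif "=" in line:
            name, _, members = line.partition("=")
            variables[name] = members
    return variables, rules


sample = """\
* Latin to Portuguese (excerpt)
V=aeiou
S=ptc
Z=bdg
s//_#
S/Z/V_V
"""
variables, rules = parse_sc(sample)
print(variables)  # {'V': 'aeiou', 'S': 'ptc', 'Z': 'bdg'}
print(rules)      # ['s//_#', 'S/Z/V_V']
```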
<h4>Sound change format</h4>
Hopefully the format of the rules will be familiar to any linguist.
For instance, here's one sound change:
<br><tt><font color="#008080">
c/g/V_V </font></tt><br>
This rule says to change <tt><font color="#008080">c</font></tt> to
<tt><font color="#008080">g</font></tt> between vowels. (We'll see how to
generalize this rule below.)
<p>More generally, a sound change looks like this:
<br><tt><font color="#008080">
x/y/z </font></tt><br>
where <tt><font color="#008080">x</font></tt> is the thing to be changed,
<tt><font color="#008080">y</font></tt> is what it changes to,
and <tt><font color="#008080">z</font></tt> is the environment.
<p>The <tt><font color="#008080">z</font></tt> part must always contain
an underline <tt><font color="#008080">_</font></tt>, representing the
part that changes. That can be all there is, as in
<br><tt><font color="#008080">
gn/nh/_ </font></tt><br>
which tells the program to replace <tt><font color="#008080">gn</font></tt> with <tt><font color="#008080">nh</font></tt> unconditionally.
<p>The character <tt><font color="#008080">#</font></tt> represents the <b>beginning or end</b> of the word. So
<br><tt><font color="#008080">
u/o/_# </font></tt><br>
means to replace <tt><font color="#008080">u</font></tt> with <tt><font color="#008080">o</font></tt>, but only at the end of the word.
<p>The middle (<tt><font color="#008080">y</font></tt>) part can be <b>blank</b>, as in
<br><tt><font color="#008080">
s//_# </font></tt><br>
This means that <tt><font color="#008080">s</font></tt> is <b>deleted</b> when it ends a word.
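<p>The rule format so far (no variables yet) is simple enough to imitate in a few lines of Python. This is a rough sketch of the idea, not the program's real implementation: it pads the word with <tt>#</tt> on both sides and scans left to right, and that scanning order is an assumption. The names <tt>parse_rule</tt> and <tt>apply_rule</tt> are invented for the example, and <tt>magnus</tt> is just an illustrative word.

```python
def parse_rule(line):
    """Split 'x/y/z' into (target, replacement, before, after)."""
    target, replacement, environment = line.strip().split("/")
    if not target or environment.count("_") != 1:
        raise ValueError(f"malformed rule: {line!r}")
    before, after = environment.split("_")
    return target, replacement, before, after


def apply_rule(word, rule):
    """Apply one literal rule (no variables) to a word."""
    target, replacement, before, after = parse_rule(rule)
    s = "#" + word + "#"  # "#" marks both word boundaries
    i = 1
    while len(s) > i:
        end = i + len(target)
        if (s.startswith(target, i)
                and s[:i].endswith(before)
                and s.startswith(after, end)):
            s = s[:i] + replacement + s[end:]
            i += len(replacement)  # continue after the new material
        else:
            i += 1
    return s[1:-1]  # drop the boundary markers


print(apply_rule("magnus", "gn/nh/_"))   # manhus
print(apply_rule("secundus", "s//_#"))   # secundu
print(apply_rule("secundu", "u/o/_#"))   # secundo
```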
<h4>Variables</h4>
<p>The environment (the <tt><font color="#008080">z</font></tt> part) can contain <b>variables</b>, like <tt><font color="#008080">V</font></tt> above.
These are defined at the top of the file. I use capital letters
for this, though this is not a requirement. Variables can only be one character long.
You can define any variables needed to state your sound changes.
E.g. you could define <tt><font color="#008080">S</font></tt> to be any stop, or <tt><font color="#008080">K</font></tt> for any coronal, or whatever.
<p>So the variable definition and rule
<br><tt><font color="#008080">
F=ie
<br> c/i/F_t </font></tt><br>
means that <tt><font color="#008080">c</font></tt> changes to <tt><font color="#008080">i</font></tt> after a front vowel and before a <tt><font color="#008080">t</font></tt>.
<p>You can use variables in the first two parts as well. For instance,
suppose your <tt>.sc</tt> file contains
<br><tt><font color="#008080">
S=ptc
<br> Z=bdg
<br> S/Z/V_V </font></tt><br>
This means that the stops <tt><font color="#008080">ptc</font></tt>
change to their voiced equivalents <tt><font color="#008080">bdg</font></tt>
between vowels. In this usage, the variables must correspond one for one--
<tt><font color="#008080">p</font></tt> goes to <tt><font color="#008080">b</font></tt>,
<tt><font color="#008080">t</font></tt> goes to <tt><font color="#008080">d</font></tt>,
etc. Each character in the replacement variable (here <tt><font color="#008080">Z</font></tt>)
gives the transformed value of each character in the input variable (here <tt><font color="#008080">S</font></tt>).
Make sure the two variable definitions are the same length!
<p>A variable can also be set to a fixed value, or deleted. E.g.
<br><tt><font color="#008080">
Z//V_V </font></tt><br>
says to delete voiced stops between vowels.
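<p>Here is one way the variable handling could be sketched in Python. Treat it as an illustration under stated assumptions rather than a description of the real program: it only handles the one-for-one case when both the first and second parts are a single variable, it scans left to right, and the names <tt>symbols_match</tt> and <tt>apply_rule</tt> are invented.

```python
def symbols_match(pattern, text, variables):
    """True if every pattern symbol (variable or literal) matches the text."""
    if len(pattern) != len(text):
        return False
    for symbol, ch in zip(pattern, text):
        allowed = variables.get(symbol, symbol)  # a literal matches only itself
        if ch not in allowed:
            return False
    return True


def apply_rule(word, rule, variables):
    target, replacement, environment = rule.split("/")
    before, after = environment.split("_")
    if not target:
        raise ValueError("the part to be changed must not be empty")
    s = "#" + word + "#"
    i = 1
    while len(s) > i:
        end = i + len(target)
        start = i - len(before)
        if (start >= 0
                and symbols_match(target, s[i:end], variables)
                and symbols_match(before, s[start:i], variables)
                and symbols_match(after, s[end:end + len(after)], variables)):
            if target in variables and replacement in variables:
                # One-for-one: the n-th member of S becomes the n-th member of Z.
                position = variables[target].index(s[i])
                new = variables[replacement][position]
            else:
                new = replacement  # fixed value, or "" for deletion
            s = s[:i] + new + s[end:]
            i += len(new)
        else:
            i += 1
    return s[1:-1]


variables = {"V": "aeiou", "F": "ie", "S": "ptc", "Z": "bdg"}
print(apply_rule("focus", "S/Z/V_V", variables))    # fogus
print(apply_rule("lector", "c/i/F_t", variables))   # leitor
print(apply_rule("fogus", "Z//V_V", variables))     # fous
```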
<h4>Rule order</h4>
<p>Rules apply in the <b>order</b> they're listed. So, with the word <tt><font color="#008080">opera</font></tt> and the rules
<br><tt><font color="#008080">
p/b/V_V <br>
e//C_rV </font></tt>
<br>the first rule voices the <tt><font color="#008080">p</font></tt>, resulting in <tt><font color="#008080">obera</font></tt>; the second
deletes an <tt><font color="#008080">e</font></tt> between a consonant and an intervocalic <tt><font color="#008080">r</font></tt>,
resulting in <tt><font color="#008080">obra</font></tt>.
<p>The <a href="#command"><tt><font color="#008080">-p</font></tt> command line parameter</a>
can assist in debugging rules, since it causes the output to show exactly what rules applied to each word.
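<p>To see why order matters, here are the same two rules translated by hand into Python regular expressions (a sketch for illustration only; <tt>RULES</tt> and <tt>run</tt> are made-up names, and the variable definitions are the ones from <tt>port.sc</tt> above). Swapping the order gives a different result, because once the <tt>e</tt> is gone the <tt>p</tt> is no longer between vowels.

```python
import re

V = "aeiou"
C = "ptcqbdgmnlrhs"

# Each sounds rule, hand-translated to an equivalent regex substitution.
RULES = {
    "p/b/V_V": lambda w: re.sub(f"([{V}])p(?=[{V}])", r"\1b", w),
    "e//C_rV": lambda w: re.sub(f"([{C}])e(?=r[{V}])", r"\1", w),
}


def run(word, order):
    """Apply the named rules one after another, in the order given."""
    for name in order:
        word = RULES[name](word)
    return word


print(run("opera", ["p/b/V_V", "e//C_rV"]))  # obra
print(run("opera", ["e//C_rV", "p/b/V_V"]))  # opra
```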
<h4>Optional elements in the environment</h4>
<p>One or more elements in the environment can be marked as <b>optional</b> with parentheses. E.g.
<br><tt><font color="#008080">
u/ü/_C(C)F </font></tt><br>
says to change <tt><font color="#008080">u</font></tt> to <tt><font color="#008080">ü</font></tt>
when it's followed by one or two consonants and then a front vowel.
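<p>One way to think about an optional element is that the rule stands for several plain environments, one with the element and one without. The Python sketch below expands an environment that way; it is an illustration of the idea, not a claim about how <tt>sounds</tt> does it internally, and <tt>expand_optional</tt> is a hypothetical helper.

```python
import re
from itertools import product


def expand_optional(environment):
    """List every plain environment covered by one with (optional) parts."""
    pieces = re.split(r"(\([^()]*\))", environment)
    choices = []
    for piece in pieces:
        if piece.startswith("(") and piece.endswith(")"):
            choices.append(["", piece[1:-1]])  # either absent or present
        else:
            choices.append([piece])
    return ["".join(combo) for combo in product(*choices)]


print(expand_optional("_C(C)F"))  # ['_CF', '_CCF']
print(expand_optional("_#"))      # ['_#']
```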
<h3>How to use it</h3>
The program is simple-minded and yet powerful... in fact it's
powerful in part <i>because</i> it's simple-minded. You can do a lot
with these basic pieces.
<h4>Input orthography</h4>
<p>For instance, you may wonder whether the <tt>.lex</tt> file should be
based on spellings or phonemes. It doesn't matter: the program
applies its changes to whatever you give it. In my example I used
conventional spellings, but I could just as easily have used
a phonemic rendering. Similarly, I wrote the rules to output
orthographic Portuguese, simply to make for an easy example.
It would be better to output a phonetic representation.
This would help us realize that we really need a sound change
<br><tt><font color="#008080">
k/s/_F </font></tt><br>
that would handle the change from <tt><font color="#008080">civitatem</font></tt> with /k/ to <tt><font color="#008080">cidade</font></tt> with /s/.
<p>The program will handle whatever you put into the <tt>.lex</tt> and <tt>.sc</tt>
files, including accented characters. If the language you're working with requires a special font,
simply edit the source and output files with an editor, using that
font. This would allow you to use (say) an IPA font.
<p>To improve my Latin-to-Portuguese file, for instance, I would
certainly want to handle vowel length and stress. I might
use accented vowels for this. Of course the program knows
nothing about phonetics, so you have to remember to
define the variables to match how you've set up the <tt>.lex</tt>
file. If you use accented vowels, you will
want to change the definition of <tt><font color="#008080">V</font></tt>.
<h4>Using digraphs</h4>
<p>Though sound changes can refer to <b>digraphs</b>,
variables can't include them. So, for instance, the following rule is
intended to delete an <tt><font color="#008080">i</font></tt> onset following an intervocalic consonant:
<br><tt><font color="#008080">
i//VC_V </font></tt><br>
However, it won't affect (say) <tt><font color="#008080">achior</font></tt>,
because the <tt><font color="#008080">C</font></tt> will not match the digraph <tt><font color="#008080">ch</font></tt>.
You could write extra rules to handle the digraphs; but it's often more convenient
to use an orthography where every phoneme corresponds to a single character.
<p>You can write transformation rules at the beginning of your sound change list
to transform digraphs in the input file:
<br><tt><font color="#008080">
ph/f/_ </font></tt>
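<p>The digraph problem and the single-character workaround can be demonstrated in Python. In this sketch the rule <tt>i//VC_V</tt> is translated by hand into a regular expression, and <tt>x</tt> is an arbitrary stand-in for <tt>ch</tt>; it assumes <tt>x</tt> is not otherwise used in the orthography. The helper name <tt>delete_i</tt> is invented for the example.

```python
import re

V = "aeiou"
C = "ptcqbdgmnlrhs"


def delete_i(word, consonants):
    """Hand-translation of i//VC_V for a given set of consonants."""
    return re.sub(f"([{V}][{consonants}])i(?=[{V}])", r"\1", word)


# With the digraph intact, a single C cannot match "ch", so nothing happens.
print(delete_i("achior", C))  # achior

# Assumption: "x" is unused elsewhere, so it can stand in for "ch".
encoded = "achior".replace("ch", "x")  # axior
changed = delete_i(encoded, C + "x")   # axor
print(changed.replace("x", "ch"))      # achor
```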
<h4>Using <tt>sounds</tt> for conlang development</h4>
To create a child language from a parent, create a <tt>.lex</tt> file containing the
vocabulary of the parent, then a <tt>.sc</tt> file containing the sound changes you
want to apply. Now run <tt><b>sounds</b></tt> to generate the child language's vocabulary.
<p>For an example, you can download a
<a href="metaiun.lex">vocabulary of Methaiun</a>
and the <a href="kebreni.sc">sound changes for Kebreni</a> (<b>right-click!</b>).
You can compare this to the <a href="kebreni.htm">Kebreni grammar</a> in <a href="virtuver.htm">Virtual Verduria</a>.
<p>For me, there is a peculiar, intense pleasure in creating a daughter language with a particular feel to it, merely by altering the set of
sound changes. All I can think of to
compare it to is creating new animals indirectly, by mutating their DNA.
<p>What sort of sound changes should you use? You can examine the history of any
language family for ideas. Some common changes that can form part of your repertoire (with some sample <tt><b>sounds</b></tt> rules):
<ul>
<li> <b>Lenition</b>. Stops become fricatives; unvoiced consonants become voiced;
stops erode into glottal stops, or <tt><font color="#008080">h</font></tt>, or disappear.
The intervocalic position is especially prone to change.
<br><tt><font color="#008080">S/Z/V_V</font></tt><p>
<li> <b>Palatalization</b>. Consonants can palatalize before or after a front vowel
<tt><font color="#008080">i e</font></tt>, perhaps ending up as an affricate or fricative.
<br><tt><font color="#008080">k/ç/_F</font></tt><p>
<li> <b>Monophthongization</b>. Diphthongs tend to simplify. This rule is fun to
apply <i>after</i> letting the vanished sounds affect adjoining consonants.
<br><tt><font color="#008080">i//CV_C</font></tt><p>
<li> <b>Assimilation</b>. Consonants change to match the place or type of articulation
of an adjoining consonant.
<br><tt><font color="#008080">D=td<br>m/n/_D</font></tt><p>
<li> <b>Nasalization</b>. A nasal consonant can disappear, after nasalizing the previous vowel.
<br><tt><font color="#008080">Â=âêîôû <br>N=mn <br>V/Â/_N <br>N//Â_</font></tt><p>
<li> <b>Umlaut</b>. A vowel changes to match the frontness of the next vowel in the word.
<br><tt><font color="#008080">u/ü/_C(C)i</font></tt><p>
<li> <b>Vowel shifts</b>. One vowel can migrate into a free area of the vowel space, perhaps
dragging others behind it.
<br><tt><font color="#008080">a/&amp;/_ <br>o/a/_ <br>u/o/_ </font></tt><p>
<li> <b>Tonogenesis</b>. One way tones can originate is for voiced consonants
to induce the next vowel to be pronounced in a low pitch.
<br><tt><font color="#008080">Z=bdgzvmnlr <br>V=aiu <br>L=áíú <br>V/L/Z_ </font></tt><p>
<li> <b>Loss of unstressed syllables</b>.
<br><tt><font color="#008080">A=áéíóú <br>V//AC(C)_</font></tt><p>
<li> <b>Loss of final sounds</b>. This can really mess up your carefully worked out inflectional system.
<br><tt><font color="#008080">V//_#</font></tt><p>
</ul>
The beauty part of using <tt><b>sounds</b></tt> is that your language will illustrate
the Neo-Grammarian principle: sound changes apply uniformly whenever their
conditions are met. You may choose to edit the results by hand, however, to simulate
the complications of real languages. <b>Analogy</b> can regularize the grammar; words may be borrowed from <b>another
dialect</b> where different changes applied; words may be <b>reborrowed</b> from the
parent language by scholars.
<p>I pay particular attention to the havoc the sound changes are likely to wreak on the
<b>inflectional system</b>.
E.g. if a case distinction is maintained in some words and lost in others, it may spread
to the second category by analogy.
<p>Sound changes can also result in <b>homonyms</b>. For instance, if you voice intervocalic
consonants, <tt><font color="#008080">meta</font></tt> and <tt><font color="#008080">meda</font></tt>
will merge. You can simply live with this, but if the merger is particularly awkward,
the users of the language are likely to invent a new word to replace one of the homonyms.
E.g. Latin American Spanish has innovated <tt><font color="#008080">cocinar</font></tt>
'to cook', since the original <tt><font color="#008080">cocer</font></tt> has merged with
<tt><font color="#008080">coser</font></tt> 'to sew'.
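<p>If you want to check a long word list for such mergers, a short script can group the output words and report any that came from more than one source word. The Python sketch below is only an illustration: the voicing rule <tt>S/Z/V_V</tt> is translated by hand into a regular expression, <tt>find_homonyms</tt> is a made-up helper, and <tt>mela</tt> is an extra word added to show a non-merging case.

```python
import re
from collections import defaultdict


def voice_intervocalic(word):
    """Hand-translation of S/Z/V_V with S=ptc, Z=bdg, V=aeiou."""
    voiced = {"p": "b", "t": "d", "c": "g"}
    return re.sub(r"([aeiou])([ptc])(?=[aeiou])",
                  lambda m: m.group(1) + voiced[m.group(2)], word)


def find_homonyms(words, change):
    """Map each output word to its sources, keeping only the mergers."""
    groups = defaultdict(list)
    for word in words:
        groups[change(word)].append(word)
    return {out: sources for out, sources in groups.items() if len(sources) > 1}


print(find_homonyms(["meta", "meda", "mela"], voice_intervocalic))
# {'meda': ['meta', 'meda']}
```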
<h4>Using <tt>sounds</tt> to find spelling rules</h4>
I've also used <tt>sounds</tt> to model the spelling rules of English. Here the
input file lists the spellings of several thousand English words, and the "sound changes"
are rules for turning those spellings into a phonetic representation of how the words sound.
<p>Most people think English spelling is hopeless; but in fact the rules predict
the correct pronunciation of the word 60% of the time, and make only minor errors
(e.g. insufficient vowel reduction) another 35% of the time.
<ul>
<li><a href="spell.html">Here's a discussion of the rules</a>, including the
<tt>sounds</tt> input and output files.
</ul>
<hr>
<center><a href="default.html"><img border=0 src="home.gif" alt="Back to main page"></a></center>
</body></html>