Sunday, June 12, 2011

Taming the Tongue: Adventures in Spelling Reform

Warning: The following essay is extremely nerdy, and should be avoided by people who have absolutely no interest in experimental orthography, obsolete letters, and the obsessive splitting of figurative hairs.

The current system for writing the English language is a bizarre apparatus.  It is a strange contraption that promises simplicity, but delivers circumstances where words like rough, plough, and though do not rhyme, but newt, flute, and boot do.  You've heard this all before, of course: the self-evident justification for calls to reform our spelling.  We have a spelling that makes pronunciation only barely predictable: why not have something logical and straightforward?

Well, there's lots of very good reasons why not, but we're going to ignore them for a while, in pursuit of a Quixotic quest: the search for a workable phonetic alphabet for the English language, where every letter is pronounced the same way every single time.  We will establish a one-to-one correspondence between sound and symbol, and eliminate silent and unnecessary letters.  Together, we will dream the impossible dream.

The biggest difficulty in English spelling is usually the vowels, so at first I thought that I might transform the language into an abjad.  An abjad is a system that depicts only the consonants of a language, and either leaves out the vowels entirely or marks them with special lines.  However, a brief period of research convinced me that this was foolhardy.  An abjad works for Hebrew and the like because in those languages, related words tend to have the same consonants, and it's clear from context and a few superscript markings what form is required.  This doesn't work in English: without vowels, we can't tell words like bait from bat or abut.  So we'll keep our vowels, but trust me when I say we'll regret it.

My new working alphabet is, actually, two alphabets.  Even though an abjad is an entirely inappropriate form for an English writing system, I've decided to keep vowels and consonants separate. The effect of this is to separate a very easy job from a very difficult one, as we will see.  Along the way, we'll have help from Wikipedia and the International Phonetic Alphabet (IPA), a set of symbols designed to transcribe just about every possible noise the mouth can make.

Section One: Consonants

In this section:
  • The new list of consonants for our alphabet
  • An explanation of the sounds they make
  • An explanation of their order in the new alphabet
This section is likely to be the least interesting, because it is the most straightforward aspect of the process, but requires a lot of explanation.  English consonants are pretty much the same from dialect to dialect.  Occasionally they are articulated slightly differently, and sometimes dropped altogether (most often the "TH" consonants), but on the whole they all generally show up and can be counted upon to make more or less the same sound in all contexts.

To make our new alphabet more purely phonetic, I've decided to force each letter to carry only one sound value, and to eliminate digraphs: say goodbye to PH, SH, CH, and TH.  We don't need them!  In addition, I've called up some extra letters from the ranks of the International Phonetic Alphabet to fill in the gaps.  I'll be treating them in the text as if you can pronounce their names as easily as you might pronounce B or C, but in the list below, you can see the new letters' names spelled out.

(A quick note: I will occasionally use the words "voiced" and "unvoiced" to describe a consonant sound.  A voiced consonant is one that is pronounced with a full vibration of air in your throat, and an unvoiced consonant is pronounced without that vibration.  Compare the sounds of Z and S for the difference.)

B, b,
P, p,
M, m,
D, d,
T, t,
N, n, 
G, g,
K, k,
Ŋ, ŋ, (pronounced "eng")
V, v,
F, f,
Ð, ð, (pronounced "eth," with the th pronounced as in "breathe")
Þ, þ, (pronounced "thorn")
Z, z,
S, s,
Ʒ, ʒ, (pronounced "ezh," with the sound of "regime")
Ʃ, ʃ, (pronounced "esh")
X, x,
H, h,
J, j,
C, c,
L, l,
R, r,
W, w,
Y, y

Each of these letters now represents one, and only one, sound in the repertoire of English consonants.  For example, in our present system, the word of is spelled with an F, despite the fact that the consonant we hear is voiced; logic demands a V in its place, and the new alphabet is nothing if not logical.  Most of these letters are pronounced in the new alphabet as a seasoned English reader might expect, with a few exceptions.  G is now pronounced exclusively with a hard sound, as in gear, while C is now pronounced exclusively as "CH," as in the Italian word ciao.  All of the other familiar letters sound just like you learned in kindergarten.

As for the letter X, it is NOT pronounced like "KS," because that's silly and it's beyond silly that we have a letter for that.  Instead, we're taking our cues from the IPA and assigning it the role of depicting the "voiceless velar fricative."  This sound is not part of most English varieties, but it is essential for properly pronouncing words like loch and Chanukah.  In the interest of being inclusive to our Scottish and Jewish friends, we'll keep it around for special, inclusive occasions.

The new letters may look scary, but they are actually fairly simple to grasp.  Two of them, Ð and Þ (scroll up if you've forgotten their names), are actually old friends of ours, having been driven out of our alphabet centuries ago by the bizarre combination of T and H.  The sounds that initiate words like thy and thigh are, of course, different sounds (the first one is described as voiced, the second one is not), so it only makes sense that they should be written differently.  Medieval scribes mostly switched haphazardly between Ð and Þ, but we're going to follow the lead of the modern Icelandic language: Ð is for words like the and these and that, and Þ is for words like threat and thank and think.

Another new pair is Ʒ and Ʃ.  Ʒ looks like a strange number 3, but is actually a variant of Z, designed by Isaac Pitman for a phonetic shorthand alphabet.  The IPA uses it for a sound that English words almost never begin with (unless they're borrowed from French), but is found in the middle of several others: treasure, regime, and vision, for example.  Ʒ is another "voiced" consonant, and its voiceless companion is Ʃ.  Ʃ looks like the Greek letter Sigma, and it basically is, except that its lower case form, ʃ, is different.  In the IPA, Ʃ (or rather, ʃ) represents what we would usually write as "SH," if we were still interested in writing useless digraphs (and we aren't).

The last new letter, Ŋ, looks a little frilly and aristocratic at first glance.  However, it is a letter with a genuinely American heritage.  It was used by a genuine American, Mr. Benjamin Franklin, as part of his own quest to develop a phonetic alphabet that people would actually use.  Today, the IPA uses the lowercase form to represent the sound written with "NG" in words like sing and bang, just as Franklin intended.  The IPA also distinguishes between the sounds in singer and finger, noting that the latter has a hard G sound following Ŋ.  Since they show this by writing Ŋ followed by G, I don't see why we can't do the same.

Observant readers can see that the new list of consonants leaves one out from our old list: Q.  Why not Q?  Let me turn that around: why have Q in the first place?  It is totally equivalent to K, and its one special task can only be performed with the help of a vowel.  A vowel, ladies and gentlemen.  Given the choice between quit and kwit, the new alphabet clearly prefers the second.

Finally (yes, finally), a word about the order of the new alphabet.  We're all familiar with our ABCs, of course, and the idea of changing the time-honored order can seem arbitrary (or even like sacrilege).  However, there is still method in my madness.  Consonants are classified in two ways by linguists: the manner and place of articulation, or how and where the sound is produced in the mouth.  I've merely grouped the letters so that they fit together with the sounds they resemble.

The first nine letters, from B to Ŋ, are called stop consonants, which are easily divided into three groups: B P M, D T N, and G K Ŋ.  The three members of each group follow a pattern: the first and second are pronounced with the air stopped in the mouth (and are distinguished by being voiced or unvoiced), while the third blocks the mouth in the same place but lets the air out through the nose.  The first group is pronounced with the lips, the second with the tongue just behind the top teeth, and the third with the soft palate, so the order of the three groups seemed natural.

The next ten letters, from V to H, are known as fricatives.  The first eight of these occur in voiced and voiceless pairs, while X and H are both voiceless and have no voiced equivalents (well, X does, but it simply doesn't exist in any English dialect).  I have also arranged these pairs, more or less, in the order we meet them in the mouth, starting between the lips and teeth and retreating all the way back to the throat.

The next two letters in our new alphabet, J and C, represent another voiced and voiceless pair.  They are our language's only affricate consonants, so called because they begin like stops and then turn into fricatives.  With this in mind, they get their own special placement.

The last four consonants are known as approximants, and they behave in very strange ways: I like to call them the weird consonants, and their weirdness will cause trouble for us later.  The sounds represented by W and Y are known as semivowels, and the letters have traditionally been used as substitute vowels, but because we're devoted to the principle that one symbol can only stand for one sound, our new alphabet will allow their use only as consonants.  On that note, many words (such as cute) contain the Y sound, but rely on the letter U to convey it: that won't be the case in our new system. 
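For the programmers in the audience, the ordering described in this section can be sketched as a small Python table.  This is only an illustration: the names CONSONANT_GROUPS, VOICED_TO_VOICELESS, and alphabet_order are made up for the example, and the group labels simply paraphrase the descriptions above.

```python
# Consonants of the proposed alphabet, grouped the way this section
# describes them. All names here are hypothetical, chosen for the sketch.
CONSONANT_GROUPS = [
    ("stops, lips", ["b", "p", "m"]),
    ("stops, behind the top teeth", ["d", "t", "n"]),
    ("stops, soft palate", ["g", "k", "ŋ"]),
    ("fricatives", ["v", "f", "ð", "þ", "z", "s", "ʒ", "ʃ", "x", "h"]),
    ("affricates", ["j", "c"]),
    ("approximants", ["l", "r", "w", "y"]),
]

# Voiced letter -> voiceless partner, as paired in the text.
VOICED_TO_VOICELESS = {
    "b": "p", "d": "t", "g": "k",
    "v": "f", "ð": "þ", "z": "s", "ʒ": "ʃ",
    "j": "c",
}


def alphabet_order():
    """Flatten the groups into the new consonant order."""
    return [letter for _, letters in CONSONANT_GROUPS for letter in letters]


if __name__ == "__main__":
    order = alphabet_order()
    print("".join(order))
    print(len(order), "consonants; Q is gone:", "q" not in order)
```

Keeping the groups as an ordered list (rather than a plain set of letters) is what preserves the front-of-the-mouth to back-of-the-mouth arrangement.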

Section Two: Vowels

In this section:
  • A description of the types of full vowels found in English
  • A system for using the five vowel letters (A, E, I, O, and U) to consistently write these sounds.
  • A description of rhotic and reduced vowels.
Believe it or not, that was the easy part.  Assigning a single letter to every consonant sound in the English language was only a matter of forcing each letter to play only one role, and pilfering the IPA's list of symbols for letters to fill in the gaps. 

This approach, however, is massively insufficient with regard to vowels.  Having banished W and Y to the consonant side of the fence, we have only five letters to use for vowel sounds, and at least three times as many distinct sounds to convey with them.  The IPA is of little help: it mostly resorts to combining letters into awkward chains, or producing upside down or otherwise distorted versions of the letters we already have.  We've already introduced enough new consonant letters, so I see no need to make this any harder by bringing in nonsense like ɛ, ɒ, or ə.

The truth is, the biggest reason that people call for English spelling reform, the inconsistency of our vowel use, is the greatest challenge in producing a system for spelling English that makes any sense.  Vowels vary widely from dialect to dialect, and even within a dialect they tend to blend together at the edges.  English actually has three different kinds of vowel sounds, and each presents its own problems.

To begin, we have full vowels.  These are typically pronounced in fully stressed syllables and are the easiest to understand. Next come rhotic, or "R-Colored" vowels, which alter the properties of full vowels by mingling them with the sound of the consonant R (which belongs to our small class of weird consonants).  Finally, there are reduced vowels, which make their homes in unstressed syllables, defying our best efforts to pin them down.  We'll begin with the full vowels.

Checking the IPA's roll call of English sounds, I expected to find a list that I could appreciate as simply as the consonants.  What I found was illustrated with a series of lexical sets: a system devised by John C. Wells that represents the sorts of vowels found in English with a word that features that vowel.  The type words for the lexical sets of the full vowels read as follows:

palm, lot, trap, price, mouth, dress, face, kit, fleece, thought, choice, goat, foot, goose, strut


It was then that I knew my quest was hopeless, because I had run up against the limits of my own Southern California dialect.  After pronouncing each word carefully, I found that I could not distinguish between the vowels of palm, lot, and thought: each one sounded like the same broad "a." 

The goal of my new alphabet is to have one symbol for every sound, but this produces a paradox when applied across multiple dialects.  My pronunciation requires only one symbol for the vowels of palm, lot, and thought; but a different speaker might expect two or three symbols, reflecting his or her style of pronunciation.  I didn't know how to resolve it at first.

Then I remembered, this was MY alphabet.  Other dialects would have to take one for the team.  I devised a system of single and paired letters to represent each of the thirteen full vowels that I distinguished.  All full, non-rhotic vowels would be written with the following letters: A, AE, AI, AU, E, EE, I, II, OI, O, OU, UU, and U.  Using the consonant values I established in the previous section, the type words would now be spelled as follows:

Palm = Pam
Lot = Lat
Trap = Traep
Price = Prais
Mouth = Mauþ
Dress = Dres
Face = Fees
Kit = Kit
Fleece = Fliis
Thought = Þat
Choice = Cois
Goat = Got
Foot = Fout
Goose = Guus
Strut = Strut

To reiterate, for example, any word that rhymed with trap would, under the new system, spell the rhyming vowel with the letters AE.  Any word that rhymed with foot would spell the rhyming vowel with OU, and so on and so forth.

For those of you who DO distinguish between the vowels of palm, lot, and thought, you may use the letters OE for the vowel of lot and the letters OA for the vowel of thought (assuming those two sound different to you: I can't even imagine how).
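The vowel table above boils down to a lookup from each type word to its new spelling, which can be sketched in Python.  Treat this as a toy, not a real transcriber: VOWEL_SPELLING, UNMERGED, and spell are names invented for the example, and the input has to be broken into sounds by hand, with each vowel given as the upper-case type word it rhymes with.

```python
# Vowel spellings keyed by lexical-set type word, copied from the table
# above. Upper-case entries mark vowel slots; any other segment
# (a consonant letter) passes through unchanged.
VOWEL_SPELLING = {
    "PALM": "a", "LOT": "a", "THOUGHT": "a",
    "TRAP": "ae", "PRICE": "ai", "MOUTH": "au",
    "DRESS": "e", "FACE": "ee",
    "KIT": "i", "FLEECE": "ii",
    "CHOICE": "oi", "GOAT": "o",
    "FOOT": "ou", "GOOSE": "uu", "STRUT": "u",
}

# Optional spellings for speakers who keep LOT and THOUGHT apart from PALM.
UNMERGED = {"LOT": "oe", "THOUGHT": "oa"}


def spell(segments, merged=True):
    """Join hand-segmented sounds into a word in the new spelling."""
    table = dict(VOWEL_SPELLING)
    if not merged:
        table.update(UNMERGED)
    return "".join(table.get(segment, segment) for segment in segments)


if __name__ == "__main__":
    print(spell(["t", "r", "TRAP", "p"]))               # traep
    print(spell(["k", "w", "KIT", "t"]))                # kwit
    print(spell(["þ", "THOUGHT", "t"], merged=False))   # þoat
```

With the merged table, fifteen type words collapse onto thirteen spellings, which is exactly the compromise described above.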

The truth is, as consistent as this system is, it is a massive failure in its primary goal, to produce a phonetic spelling that every English reader can understand.  It's a contradiction in terms, not only because of the differences that already exist between dialects, but also because of changes that are still occurring.  Here on the west coast, I often hear people pronouncing pin and pen, or been and Ben, as though they had the same vowel.  This is maddening to me and I would do anything I could to stop it, but it simply cannot be stopped.

Continuing on with our difficulties, we encounter the rhotic, or R-Colored vowel.  R-Coloring is what its name implies: a change in the "color," or tone of a vowel when it is followed immediately by the letter R.  This sort of thing doesn't happen in most languages, but we're stuck with it.  It wouldn't be quite so bad, except for one problem: English dialects are broadly divided on rhotic accents.  Speakers of non-rhotic accents will simply drop the R unless it is followed directly by a vowel.  For example, the R in bears would not be pronounced, but the R in bury would be.

In some cases, non-rhotic speakers will even add an extra R to the end of words where it doesn't belong, if the next word begins with a vowel.  Consider the chorus of the Oasis song:

In a champagne supernova, champagne supernovaR in the sky...

Variations like this make rendering R-Colored vowels in a universal phonetic alphabet a serious pain.  But since we have to account for them, we may as well try. Every R-colored vowel has an ordinary vowel at its heart, so I propose that the logical thing to do is to simply add R to the most appropriate of the vowel combinations listed above.  Being the architect of this new alphabet, I declare that rhotic pronunciations will set the standard.

Finally, we have reduced vowels, which make their homes in the thin crevices of our language: unstressed syllables.  While not every unstressed syllable has a reduced vowel, most of them do.  The most common one is the neutral vowel, known as schwa: it is actually the most common vowel in the language, and its sound can (under the current system) be expressed by any of the vowel letters (even y).  It's basically that "uh" sound you make when you aren't sure what to say.

Schwa can be R-Colored, and it can also be accompanied by L, M, and N in such a way that they seem to be making consonant noises by themselves (like rhythm).  Schwa has its own symbol, ə, but it lacks an accepted capital form, so it seems we can't use it.  The best alternative is, I think, to force U into double duty, because out of all the full vowels, schwa most closely resembles the vowel in strut.  Of all our compromises in this section, this one seems pretty defensible (to me, at least).

Apart from schwa, we also encounter reduced versions of I (the last vowel in roses), O (the first vowel in omission), U (the last vowel in beautiful) and even a long I (the last vowel in happy).  It seems the best thing to do, then, is to use those letters and rely on the stress to convey the fact that they are reduced.

And there we have it, or the theory at any rate.  It's time to put the whole thing into practice.

Section Three: Sample Text

I hope you're ready, because things are about to get freaky.  I'm going to take some text and re-spell all of the words according to my bold new system.  The spelling will reflect my own pronunciation, and so the system as a whole falls miserably short of our universal goals.  Even so, here's a taste of what phonetic spelling reform might look like...

For skor aend seven yiirz ugo aur faðers brat forþ an ðis kantinent u nuu neeʃun, kunsiivd in liburtii, aend dedikeetid tuu ðu praposiʃun ðaet al men ar kriieetid iikwal.

Nau wii ar engeejd in u greet sivil wor, testiiŋ weður ðaet neeʃun, or aenii neeʃun, so kunsiivd aend so dedikeetid, kaen laŋ enduur. Wii ar met an u greet batulfiild uv ðaet wor. Wii haev kum tuu dedikeet u porʃun uv ðaet fiild, aez ee fainul restiiŋ plees for ðoz huu hiir geev ðeer laivs ðaet ðaet neeʃun mait liv. It iz altuugeður fitiiŋ aend prapur ðaet wii ʃud duu ðis.

But, in ee larjur sens, wii kaen nat dedikeet, wii kaen nat kansikreet, wii kaen nat halo ðis graund. Ðu breev men, liviiŋ aend ded, huu struguld hiir, haev kansekreetid it, far ubuv aur puur pauur tuu aed or ditraekt. Ðu wurld wil litul not, nor laŋ riimembur wut wii see hiir, but it kaen nevur forget wut ðee did hiir. It iz for us ðu liviiŋ, raeður, tuu bii dedikeetid hiir tuu ðii unfiniʃt wurk wic ðee huu fat hiir haev ðus far so noblii aedvaenst. It iz raeður for us tuu bii hiir dedikeetid tuu ðu greet taesk riimeeniiŋ biifor us—ðaet frum ðiiz anord ded wii teek inkriist devoʃun tuu ðaet caz for wic ðee geev ðu laest ful meʒur uv devoʃun—ðaet wii hiir hailii risalv ðaet ðiiz ded ʃael nat haev daid in veen—ðaet ðis neeʃun, undur Gad, ʃael haev ee nuu birþ uv friidum—aend ðaet guvurnment uv ðu piipul, bai ðu piipul, for ðu piipul, ʃael nat periʃ frum ðii urþ.


Dear God, what have I done?


Let's be honest: even I have trouble reading that.  If the Gettysburg Address weren't so well known, it would probably be difficult for anyone to follow the meaning, even after reading my description of the precise function of each letter.  It is a fairly accurate representation of the sounds of the English language (at least as I speak it)... but it isn't English.  At first glance, it looks more like Dutch.

I mentioned before that there are a number of good reasons not to have a logical English spelling, where every sound can be predicted from the letters.  I think the best reason has to do with etymology.  Consider the word nation, which I spelled out in the sample as neeʃun.  Nation is a prime offender against the principle of one sound, one symbol: why should an "SH" sound be spelled with "TI?"  But it is a descendant of the Latin word nationem.  When spelled nation, the word carries with it a whole history of evolving usage, but when spelled neeʃun, it is only a sound.

There are practical reasons to avoid total reform as well.  The whole body of English literature, from the humblest scribbles to the greatest masterpieces, would become unreadable to future generations, unless an effort was made to re-transcribe them all.  And even if someone made that effort, what pronunciations would they base the spelling on?  A phonetic spelling can only have value to a single community of speakers, but English is a global language of great variety.  It's changed a great deal over the centuries, and it is going to change some more.

Now, there may be a case for some minor reforms.  The world didn't stop spinning when Americans took the U out of colour, or changed plough into plow.  I can't imagine that a serious proponent of spelling reform would even think of taking my mad schemes seriously.  But given the massive success of English literature, and the widespread literacy of English speaking people, I can't think of any compelling reason that English spelling demands a change.  Consistency is the only fair point, and we've seen where that road leads.

At its core, written English is a different language from spoken English, and we shouldn't treat it as the same thing.  As much fun as I had reordering the alphabet to fit my whims, I'm all too happy to put the blocks back in their box.


  1. While this system may ultimately be too unwieldy to implement, I happened to note that the Gettysburg Address was 30 characters shorter thus written than it is in its original form, so there is that.
    However, even if multiple experts agreed and publicly stated that this system is more efficient than the current one, we already know how Americans would respond. Just look at the metric system. Hardly anyone can convert from yards to miles without help, while meters to kilometers is a breeze. Yet we continue to use our silly measurements which are rumored to be based on a certain royal's shoe just because it's easier than changing at this point.

  2. Yes, I agree with you about the metric system. As intuitive and practical as it is, there's no reason we shouldn't be adopting it for practical matters.

    When it comes to written language, though, I think there's something to be said for continuity with the past that overrides practicality. I think people also feel that continuity with the Imperial system of measurements. They actually use both systems side by side in the UK, and I don't see any problem with that.

  3. That's the point I was trying to make, actually. Continuity overrides practicality in this case because it's actually more practical to carry on as always than it is to attempt a complete overhaul of the system. I should have been more clear.

    I did not know that about the UK, but it makes sense. It would certainly be nice to see the metric and imperial systems side by side here on anything besides a ruler.