Comprehending Spelling

Bright, motivated, hard-working students often struggle to spell simple words like <been & says>.  And early in my training, one of my young students got my attention when he wrote *<hoam> for <home> and I realized that I had no way to help him understand why it wasn’t spelled with <oa>.  (The asterisk indicates that the written form inside the angle brackets is not standard—here, it’s noting a spelling error.) We were systematically working our way through all of the ways to spell “long o” and my student could isolate phonemes and spell words accurately as long as I specifically told him whether we were working with an <o-e> pattern as in <home, mole, rode> or an <oa> grapheme as in <foam, goal, road>. But that wasn’t going to help him when he was sitting at his desk trying to remember how to spell <home> and thousands of other words that seem like they could be spelled more than one way. 

Note that angle brackets <   > are used to refer to written letters or graphemes. Graphemes are the letters or combinations of letters that combine to form written words. So < cat > refers to a written form—the graphemes C A T, not the meaning or pronunciation of that word. 
Slash brackets /   / are used to refer to phonemes, which are the flexible units of pronunciation in spoken words that distinguish one word from another. So / k / refers to the phoneme that is the first segment of the spoken word cat, and / kæt / refers to the pronunciation of the word cat, using IPA symbols.

I didn’t know it then, but to understand why one specific grapheme represents a phoneme in a word, we need to look at the phoneme-grapheme relationships in that word within the context of morphological (structural) families and etymological relationships. This context for understanding the writing system is critical, especially for students who can’t easily memorize spellings by rote.

Unfortunately, though, you will hear people suggest that including morphology and etymology in literacy instruction might be helpful for older students but would probably be too much for younger students and would divert time from what are considered to be the essential, foundational concepts and skills that beginning students need most. To test that premise, let’s look at two confusing categories of words that students will encounter from their first day of school: words that are called “irregular” and words that can be spelled in more than one way, based on their pronunciation.  To understand the spelling of these words, everyone—even a very young student—needs to understand how structures and relationships influence our writing system.

Words called “irregular”

Many very common words are categorized in traditional instruction as “irregular,” including words like <says & been>.  In their first years of school, students are typically asked to memorize many of these words by rote.

But so-called irregular words are not actually irregular; they are simply misunderstood. One of the reasons for that is that short words like <says, been, does, any, exit & real> are often assumed to be simple (built from only one written morpheme, called an element) and their structures are ignored. But many short words are not simple. All of the words I’ve just listed are complex, meaning that they consist of more than one written element. When we stop to think about the spelling of <says>, that’s obvious.

says ➞ say + s

The word <says> is a third person, singular, present tense verb form, constructed by adding suffix <s> to the base element <say>. Many third person verbs are constructed this way. 

I walk; she walks.  <walk + s ➞ walks>
You run; he runs.  <run + s ➞ runs>
I drive; she drives.  <drive + s ➞ drives>
You play; she plays.  <play + s ➞ plays>
I say; he says.  <say + s ➞ says>

In all of the verbs listed above, the structural pattern is exactly the same. So why is the spelling <says> called “irregular” or “rule-breaking”? It’s called irregular because of an unexpected pronunciation of the spoken word. Says doesn’t rhyme with plays. But in fact, the construction <say + s ➞ says> is perfectly coherent. And an unexpected pronunciation does not make a spelling irregular. When we look at the structure of <says>, we can easily understand its spelling.

We see the same thing in the written verbal form <been>, which contains the base element <be> with the suffix <en> added to form the past participle. 

be + en ➞ been

This past participle suffix is found in a number of English verbs.

I will eat; I have eaten.  <eat + en ➞ eaten>
I will give; I have given.  <give/ + en ➞ given>
I will take; I have taken.  <take/ + en ➞ taken>
I will drive; I have driven.  <drive/ + en ➞ driven>
I will be; I have been.  <be + en ➞ been>

Notice that in <given, taken & driven>, the final unpronounced orthographic marker <e> in the base is replaced by the suffix <en> which begins with a vowel. (That replacement is signaled by the slash mark in the word sums above.) This is a consistent suffixing pattern in English. 

But also notice that the <e> in <be> is not replaced. That’s because we pronounce that <e>. It’s not a marker which can be replaced by a vowel suffix but is a grapheme representing the final phoneme in the spoken word be. We don’t replace a final <e> that’s part of a grapheme, so <been> is spelled by just adding the suffix <en> to the base <be>. Again, the pronunciation of been may be unexpected, but the spelling <be + en ➞  been> is perfectly coherent.  
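
The suffixing pattern in these word sums can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name resolve_word_sum is hypothetical, and it relies on the word-sum notation used above, where a slash after an element marks a final <e> that the following suffix replaces. The code does not decide for itself whether a final <e> is a replaceable marker or part of a grapheme; that judgment still comes from whoever writes the word sum.

```python
def resolve_word_sum(word_sum: str) -> str:
    """Join the elements of a word sum such as 'give/ + en' into a spelling.

    A trailing slash on an element marks a final <e> that is replaced by the
    next element, following the notation used in the word sums above.
    """
    elements = [part.strip() for part in word_sum.split("+")]
    spelling = ""
    for element in elements:
        if element.endswith("/"):
            # Remove the slash, then the marker <e> that it flags.
            element = element[:-1]
            if not element.endswith("e"):
                raise ValueError(f"slash does not follow a final <e>: {element!r}")
            element = element[:-1]
        spelling += element
    return spelling


print(resolve_word_sum("give/ + en"))  # given
print(resolve_word_sum("be + en"))     # been
print(resolve_word_sum("say + s"))     # says
```

Because <be + en> is written without a slash, the <e> in the base is kept and the result is <been>, which matches the analysis above.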

By analyzing the structure of <says & been> we can see that their spellings are not flawed; it turns out that the flaw is in the assumption that spelling is supposed to represent the pronunciation of words as directly as possible. 

The English writing system is morphophonemic, so the spelling of an English word signals both phonology (the phonemes or flexible, distinctive segments of pronunciation in spoken words) and morphology (the structural units that construct the meaning of words). 1 These two aspects of the orthography work together and allow us to comprehend text more quickly and easily than if words were spelled to represent only their pronunciation. 

And think about the fact that English words are spelled the same regardless of regional variations in pronunciation. If words were spelled primarily to represent pronunciation, American speakers might think that the spoken word been would make more sense if it were spelled *<bin> or *<ben> depending on which part of the country they lived in, but the spelling <been> might seem quite appropriate to a British speaker. Which spelling would the writing system use?

Our writing system facilitates written communication between every speaker of English, no matter where they live. To do that, it ignores minor variations in the pronunciation of specific words. When we think about that fact, and examine the spelling of words that are categorized as “irregular,” we see illustrations of a foundational, critical concept about English spelling.

English spelling is optimized for comprehension, not pronunciation. 

Written words do indeed represent meaningful aspects of pronunciation, but not necessarily as directly as possible. That’s an observable fact, and it allows the entire writing system to make sense. But it is rarely recognized as a coherent part of the overall system. The words <been & says> are categorized as “irregular” precisely because their spelling is so obviously not the most direct representation of their pronunciation.

However, if we look at the ways in which spelling allows English speakers to quickly comprehend words while reading, then we can understand why “irregular” words contain the specific graphemes they do. Graphemes in all words represent flexible segments of pronunciation (phonemes) but they must represent those phonemes while maintaining consistently spelled structural units across an entire morphological (structural) family of words. As we read, the specific graphemes in words give us signals about both phonemes and morphemes, which facilitates and deepens comprehension of written text. 

We can see this in the spelled word <says>, where the graphemes must represent the phonology of not only the word says, but all words that are in the same structural (morphological) family, including say and saying. And they need to do that while maintaining a consistent spelling of the base <say>.

It’s important to know that we can absolutely understand the phoneme-grapheme relationships in words like <says, been, does, any> and every other word that is mistakenly categorized as irregular.  And it’s often very useful to do that type of analysis. However, I’m not going to go into that right now, because I want to point out that many times that type of analysis is not even needed. Once students understand that words are formed by combining consistently spelled structural units, they can often use the “expected” phoneme-grapheme relationships in base elements like <say & be> and common suffixes like <s & en> to reconstruct the accurate spelling of a complex word like <says> or <been> when they want to spell it. 

By examining and analyzing written words, we can see for ourselves that words traditionally categorized as “irregular” are completely logical once we understand their structures, their relationships to other words which influence their spelling, and the phoneme-grapheme relationships within them. Rather than being a source of confusion and frustration, these words can be a point of entry for studying the coherence of the writing system, even with very young students. 

But there’s another pitfall of confusion that beginning students will face, and the spelling <be> is an example. Why is that word spelled <be> and not <bee>? Many words in English are categorized as “regular,” but may use different graphemes to spell the same phoneme. When students try to make sense of words by analyzing the phonemes in a spoken word in isolation, even advanced students who may know all the ways to spell a given phoneme have no way to know how to spell a particular word. 

Understanding multiple spellings

In my school district, kindergarteners learn to read and (hopefully) spell a set of commonly used words over the course of the year, including we, see, me, three, he & she.  Each of these spoken words ends with the same segment of pronunciation. So how is a student supposed to know to use a single <e> to spell <he, she, me & we> but a double <e> to spell <see & three>?

When we teach students to go directly from speech to spelling, there is nothing to help them with this task other than their rote memorization skills. So almost immediately, some students will start to fail. But no student needs to struggle with these words. There’s a convention that explains why we find <e> or <ee> in each of these words. This convention is one of the many ways that our writing system facilitates comprehension of written text. 

To understand this, we first need to know that English words can be roughly divided into two general categories: function words and lexical (content) words. Function words primarily serve grammatical functions and are often unstressed in speech. This category includes pronouns (me, her), auxiliary verbs (do, is), prepositions (in, up, toward), conjunctions (and, but), determiners (this, a) and a few adverbs. There are a number of spelling conventions in English related to function and content words; one is that function words can be spelled with just the graphemes needed to represent the phonemes. This means that they may be spelled with only one or two letters, although more may be needed to signal the pronunciation. Examples of one- or two-letter function words include I, a, an, he, me, we, in, to, of, up, so. There are only about 300 function words in English, but they are used constantly in our language, showing relationships between words, indicating grammatical structures, and generally gluing together lexical words.

Lexical or content words, on the other hand, are the words that carry the meaning of sentences more directly, and they are more likely to be stressed in speech. This category includes nouns (dog, egg, sweater, freedom, awe), verbs (run, catch), adjectives (friendly, peaceful, odd) and most adverbs (slowly, happily). These words, by convention, are spelled with at least three letters, and if there are several graphemes available to represent a phoneme, lexical words will often use a longer grapheme than a function word will. 

This simple convention which begins to differentiate the spelling of function and content words explains the spelling of words like <egg, odd, awe> and many others.  If spelling were based primarily on pronunciation, then *<eg, od & aw> would work just fine. But this spelling convention has evolved to make it easier to comprehend written text efficiently. The hypothesis is that lexical words contain more letters than function words, whenever possible, to visually emphasize meaningful content words in written text. 
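
For readers who like to see a convention written out as a check, here is a minimal Python sketch of the length convention just described. The function name fits_length_convention and the two category labels are assumptions made for illustration, and the sketch tests only letter count. It does not try to classify words, and it treats function and lexical as a simple two-way choice, which is a simplification.

```python
MIN_LEXICAL_LETTERS = 3  # the "at least three letters" convention for content words


def fits_length_convention(spelling: str, category: str) -> bool:
    """Check a spelling against the length convention for its category.

    category is either "function" or "lexical"; this two-way label is a
    simplifying assumption made for the sketch.
    """
    if category == "function":
        # Function words may use only the graphemes needed, even a single letter.
        return len(spelling) >= 1
    if category == "lexical":
        return len(spelling) >= MIN_LEXICAL_LETTERS
    raise ValueError(f"unknown category: {category!r}")


for spelling in ["egg", "eg", "odd", "od", "awe", "aw"]:
    print(spelling, fits_length_convention(spelling, "lexical"))
```

Run as written, the loop reports True for <egg, odd & awe> and False for *<eg, od & aw>, while a call such as fits_length_convention("he", "function") returns True.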

Be aware, however, that these two categories—function words and lexical words—are not either/or distinctions, but are really two ends of a spectrum; an individual word may fall somewhere on the continuum between function and content words. 

So now let’s look back at the written words <we, see, me, three, he & she> and see how we might categorize them to make sense of their spelling.

me, she, we, he: pronouns (function words)

see: verb (lexical)

three: a cardinal number (which is not strictly lexical or functional, but probably leans more towards the lexical end of the continuum)

With information about this one convention, we have a way to make sense of the shorter spelling <e> in the pronouns <me, she, we & he> and the longer spelling <ee> in the other two words. And can you now make sense of the spellings <be & bee>?  If we understand something, we are much more likely to remember it and be able to apply it. This makes all the difference for a dyslexic student (or anyone) who does not have good rote memorization skills.

But perhaps you’re thinking that concepts like pronouns and function words are too advanced for the youngest students. They’re not. Very young children know how to use the words me, she, he and we and can have fun playing with sentences where nouns are replaced by pronouns, or sentences where we’ve removed all of the function or lexical words.  Teachers in the early years can begin to use these terms in a very relaxed way. Although students will have many years before they need to master the details of the terminology, they will understand the concepts and begin their journey into literacy with the knowledge that all spelling makes sense. 

A full understanding of spelling deepens and strengthens comprehension 

Words that are called “irregular” and words that use different spellings for the same phoneme make it clear that we can’t understand our writing system by starting with the pronunciation of words in isolation.  English spelling represents the pronunciation of words, but it does so using flexible phoneme-grapheme relationships that allow for consistent spelling of structural units which build words. By maintaining the consistent spelling of base elements, prefixes and suffixes, our writing system makes it easier to quickly see relationships between words like <say & says, be & been, do & does>, and eventually <preside & president, normal & enormous, fast & fasten>.2

When we ignore morphology and etymology in our instruction, the clues to comprehension that are embedded in the writing system are hidden from view. This restricts students to a shallow understanding of the writing system and we inadvertently force them to do far more rote memorization than is needed.

There’s a lot of debate about the best way to teach students to read and write, including how to introduce the writing system to young students. For some reason, it’s assumed that showing students the actual structures of words and the ways in which the spelling of words reveals meaningful relationships to other words is too advanced or even unnecessary. So we teach students to analyze pronunciation, which is a moving target. As prefixes and suffixes are added to words, the pronunciation of those words constantly shifts. Meanwhile, the consistently spelled structures of those words are sitting right in front of us, ready to be used to anchor those pronunciations to something that is logical and coherent. The writing system itself shows us that analyzing written words provides the best foundation for learning written English.

And in case you’re wondering, it turns out that the spelling <home> is connected to the spelling of <hamlet>. Both come from the same historical root and have a sense of “settle, dwell.” By looking at their entertaining history, students who struggle with rote memorization can understand, and have a meaningful way to remember, why the single letter vowel grapheme makes sense in <home>, and they will also encounter the word <hamlet> and the idea of a diminutive suffix.3 And although I’ve known how to spell <home> for a very long time, when I was prompted to find out why it’s spelled with <o-e> and not <oa>, I learned why many towns are called hamlets. This type of study not only enhances learning for 5-year-olds and struggling students, but expands vocabulary understanding for all of us. Understanding how and why etymology influences spelling is a critical aspect of literacy study.

No one wants to condemn our youngest students to years of trying (and often failing) to memorize lists of words week after week. The wonderful truth is that every irregular word and every word that doesn’t make sense is an invitation to deepen our understanding of our language. There’s no better way to launch children into literacy than allowing them to understand how those puzzling words actually do make sense, by framing instruction in the real context of structures and relationships from the very beginning.

1 Venezky, Richard L. “English Orthography: Its Graphical Structure and Its Relation to Sound,” Reading Research Quarterly, Vol. 2, No. 3 (Spring 1967), pp. 75-105.

2 You may want to take a look at <fasten & hasten>, <enormous> & <presidential presidents>.

3 See information on <hamlet> at
And an update: when I published this essay a few hours ago, I included the word <bracelet> as an example of a word that has a <let> suffix. But I didn’t look it up first. It started nagging at me, and when I just checked, I discovered that <bracelet> probably does not include a suffix <let>. Both <let> and <et> are diminutive suffixes, but I needed to look at the history before I could be confident which one seemed more likely, and I didn’t do that. And I thought I had looked at <hamlet> carefully, but I hadn’t. As I reflect, I don’t think I’m comfortable listing either <hamlet> or <bracelet> as words that have <et> or <let> suffixes in present day English, at least not without some more investigation. This is a great example of why etymology matters; it’s a critical research tool to help us understand the morphology of present day words. And it’s also a great example of a couple of things that I’ve learned from my friend Pete Bowers: that juicy mistakes often teach us so much, and that it’s never a problem NOT to analyze a word as deeply as we can, but that we do want to avoid false analysis. I’ve updated this essay and the pdf to something that (I think!) is justified by the evidence I’ve had time to look over.

