See which words follow and break the “i before e” rule
I wanted to see how often there were exceptions to the spelling rule “i before e, except after c”. I found a website (wordfrequency.info) that had a list of the 5050 most common English words and decided to analyze it to see which words follow the rule and which do not. Below is a treemap that shows the words that follow the rule in green and those that do not in red. The size of each box represents how common the word is in regular American English usage (based on how frequently it shows up in the Corpus of Contemporary American English).
What we see is that while 81% of the 158 most common words with ‘e’ and ‘i’ adjacent to one another do follow the rule, the weighted percentage drops to around 60% once you take into account how frequently these words are used. This is because some very commonly used words do not follow the rule. If you were to count how many times you use words from this list, about 40% of the time you would likely be using words that break it. For example, the two most commonly used words with ‘e’ and ‘i’ adjacent (their and being) do not follow the rule, since they have the ‘e’ before the ‘i’ but not after a ‘c’.
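To make the difference between the plain count and the frequency-weighted share concrete, here is a minimal Python sketch. It is not the code used to build the chart. The function names `follows_rule` and `rule_shares` are made up for this illustration, the frequencies in `sample` are invented numbers rather than values from the corpus, and the choice to count a word as following the rule only if every ‘ie’ or ‘ei’ pair in it complies is an assumption.

```python
def follows_rule(word):
    """Check a word against "i before e, except after c".

    Returns True or False for words containing 'ie' or 'ei',
    and None when neither pair appears in the word.
    """
    word = word.lower()
    verdicts = []
    for i in range(len(word) - 1):
        pair = word[i:i + 2]
        if pair not in ("ie", "ei"):
            continue
        # The pair is "after c" only if a letter precedes it and that letter is 'c'.
        after_c = i > 0 and word[i - 1] == "c"
        # 'ei' is correct only after c; 'ie' is correct only when not after c.
        verdicts.append((pair == "ei") == after_c)
    # Assumption: every pair in the word must comply for the word to count.
    return all(verdicts) if verdicts else None


def rule_shares(freqs):
    """Return (unweighted, weighted) share of relevant words that follow the rule."""
    relevant = {w: f for w, f in freqs.items() if follows_rule(w) is not None}
    followers = [w for w in relevant if follows_rule(w)]
    unweighted = len(followers) / len(relevant)
    weighted = sum(relevant[w] for w in followers) / sum(relevant.values())
    return unweighted, weighted


# Invented frequencies for illustration only (not taken from the corpus).
sample = {"their": 60, "being": 40, "friend": 20, "believe": 15,
          "piece": 10, "receive": 5, "the": 500}

unweighted, weighted = rule_shares(sample)
print(f"By word count: {unweighted:.0%}, weighted by frequency: {weighted:.0%}")
```

In this toy sample most of the relevant words follow the rule, yet the weighted share is much lower because the two rule-breakers are also the most frequent words. That is the same effect described above, just with made-up numbers.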
I was inspired to look into this after seeing a tweet about the rule in the comic Pearls before Swine by @stephanpastis.
Perhaps the most unhelpful spelling rule ever. pic.twitter.com/sW0aud6rA3
— Stephan Pastis (@stephanpastis) March 9, 2021
I asked my kids, but they had never heard of the rule, so perhaps it isn’t taught in schools anymore.
Sources and Tools:
5 Responses to Visualizing the accuracy of the “i before e, except after c” spelling rule
This rule is actually taught in primary schools in the United Kingdom.
I really appreciate that you did this! But I have a quibble, perhaps two. I think your stats are thrown off by including words like “seeing” and “being.” I don’t think these words confuse people the way “weird” or “neighbor” do. Likewise, the rule for plurals (“change the Y to an I and add ES”) seems pretty straightforward and, to my mind, overrides the “I before E” rule.
Full rule – “I before E, except after C,
AND when pronounced A as in neighbor and weigh”
Albeit, eider is weird.
Fun idea. Ceiling is miscoded. It follows the rule.
Thanks for catching that. A bug in my code didn’t check for the c being the first letter. It’s fixed now.