I wanted to see how often there are exceptions to the spelling rule “i before e, except after c”. I found a website (wordfrequency.info) with a list of the 5050 most common English words and analyzed it to see which words follow the rule and which do not. Below is a treemap that shows the words that follow the rule in green and those that do not in red. The size of each box represents how common the word is in regular American English usage (based on how frequently it shows up in the Corpus of Contemporary American English).
What we see is that 81% of the 158 most common words with ‘e’ and ‘i’ adjacent to one another follow the rule, but once you take into account how frequently these words are used, the weighted percentage following the rule drops to around 60%. This is because some very commonly used words do not follow the rule. If you were to count how many times you use words from this list, about 40% of the time you would likely be using words that break it. For example, the two most commonly used words with ‘e’ and ‘i’ adjacent (their and being) do not follow the rule, since they have the ‘e’ before the ‘i’ even though the pair does not come after a ‘c’.
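To make the difference between the two percentages concrete, here is a minimal Python sketch of one way the counting could work. It is not the code behind the chart: the `classify` and `rule_shares` helpers are hypothetical names, the frequencies in the sample are made up for illustration, and a word with more than one e/i pair is assumed to break the rule if any of its pairs does.

```python
def classify(word):
    """Return 'follows', 'breaks', or None if the word has no adjacent e/i pair."""
    w = word.lower()
    verdicts = []
    for i in range(len(w) - 1):
        pair = w[i:i + 2]
        if pair not in ("ie", "ei"):
            continue
        after_c = i > 0 and w[i - 1] == "c"
        # The rule expects "ei" right after a 'c' and "ie" everywhere else.
        expected = "ei" if after_c else "ie"
        verdicts.append(pair == expected)
    if not verdicts:
        return None
    # Assumption: a single rule-breaking pair makes the whole word an exception.
    return "follows" if all(verdicts) else "breaks"


def rule_shares(freqs):
    """Return (unweighted, weighted) share of e/i words that follow the rule.

    freqs maps each word to a usage count; words without an e/i pair are ignored.
    """
    labeled = {w: classify(w) for w in freqs}
    labeled = {w: v for w, v in labeled.items() if v is not None}
    followers = [w for w, v in labeled.items() if v == "follows"]
    unweighted = len(followers) / len(labeled)
    weighted = sum(freqs[w] for w in followers) / sum(freqs[w] for w in labeled)
    return unweighted, weighted


# Made-up counts, chosen only to show how frequent exceptions pull the weighted share down.
sample = {"their": 50, "being": 30, "friend": 10, "receive": 5, "believe": 5, "the": 100}
unweighted, weighted = rule_shares(sample)
print(f"by word: {unweighted:.0%}, by usage: {weighted:.0%}")
```

In this toy sample, three of the five e/i words follow the rule, but because “their” and “being” carry most of the usage, the weighted share is much lower. The same effect is what pulls the real figure from 81% down to around 60%.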
I was inspired to look into this after seeing a tweet by @stephanpastis about the rule in his comic Pearls Before Swine.
Perhaps the most unhelpful spelling rule ever. pic.twitter.com/sW0aud6rA3
— Stephan Pastis (@stephanpastis) March 9, 2021
I asked my kids, but they had never heard of the rule, so perhaps it isn’t taught in schools anymore.
Sources and Tools: