I wanted to see how often there were exceptions to the spelling rule “i before e, except after c”. I found a website (wordfrequency.info) that had a list of the 5050 most common English words and decided to analyze it to see which words followed this rule and which did not. Below is a treemap that shows the words that follow the rule in green and those that do not in red. The size of each box represents how common the word is in regular American English usage (based on how frequently it shows up in the Corpus of Contemporary American English).
What we see is that while 81% of the 158 most common words with ‘e’ and ‘i’ adjacent to one another do follow the rule, the share drops to around 60% once you weight each word by how frequently it is used. This is because some very commonly used words do not follow the rule: if you were to count how many times you use words from this list, about 40% of the time you’d likely be using words that break it. For example, the two most commonly used words with ‘e’ and ‘i’ adjacent (their and being) do not follow the rule, since they have the ‘e’ before the ‘i’ but it doesn’t come after a ‘c’.
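To make the difference between the plain and the frequency-weighted percentage concrete, here is a minimal Python sketch. It is not the code behind the treemap: the function names are hypothetical, the tiny word list and its frequencies are invented for illustration, and I'm assuming a word counts as following the rule only if every ‘ie’ or ‘ei’ pair in it does (so ‘cie’ words like science count as exceptions).

```python
import re

def follows_rule(word):
    """True if every adjacent 'ie'/'ei' pair obeys 'i before e, except after c'."""
    word = word.lower()
    for match in re.finditer(r"ie|ei", word):
        after_c = match.start() > 0 and word[match.start() - 1] == "c"
        # 'ie' is expected unless it follows 'c'; 'ei' is expected only after 'c'.
        if (match.group() == "ie") == after_c:
            return False
    return True

def rule_shares(freqs):
    """Return (unweighted, frequency-weighted) share of words following the rule.

    `freqs` maps word -> usage count; words without 'ie'/'ei' are ignored.
    """
    relevant = {w: f for w, f in freqs.items() if re.search(r"ie|ei", w.lower())}
    followers = [w for w in relevant if follows_rule(w)]
    unweighted = len(followers) / len(relevant)
    weighted = sum(relevant[w] for w in followers) / sum(relevant.values())
    return unweighted, weighted

# Invented frequencies, for illustration only (not the wordfrequency.info data).
sample = {"their": 50, "being": 30, "friend": 10, "receive": 5, "believe": 5, "the": 100}
unweighted, weighted = rule_shares(sample)
print(f"unweighted: {unweighted:.0%}, weighted: {weighted:.0%}")
```

In this toy sample, three of the five relevant words follow the rule, but the two that don't (their and being) carry most of the usage, so the weighted share comes out much lower than the unweighted one. That is the same effect as in the real data, just exaggerated.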
I was inspired to look into this after seeing a tweet about the rule in the comic Pearls before Swine by @stephanpastis.
Perhaps the most unhelpful spelling rule ever. pic.twitter.com/sW0aud6rA3
— Stephan Pastis (@stephanpastis) March 9, 2021
I asked my kids, but they had never heard of the rule, so perhaps it isn’t taught in schools anymore.
Sources and Tools:
Please excuse my poor Spanish. I used Google Translate to write this in Spanish.
Here is the calculator that will calculate how long it takes to count to one million (or larger numbers) in Spanish.
There was a lot of interest in the calculator to estimate counting time (in English) to one million, one billion and up to one trillion. I decided to do the same for another popular language (Spanish). Here is the calculator that will calculate how long it takes to count to one million (or larger numbers) in Spanish. If you’d like to see this in Spanish, click here.
Updated: Lots of folks on Reddit pointed out some mistakes in the Spanish calculations, and helped me figure out the solutions, so the Spanish graphs are now updated. The Spanish calculator is now live!
Building off of the last post about Counting to One Million in English, I received some comments about looking at other languages. That seemed like a very good idea, so I looked at a list of the world’s most popular languages and saw Chinese and Spanish listed with English in the Top 3. Having a little experience with both of those, I set out to compare how long it’d take to count in each of these languages, if you had to pronounce every single number from one to one million.
Again, here’s the plot of the number of syllables per number for English. The longest number is seven hundred seventy-seven thousand seven hundred seventy-seven (20 syllables).
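The 20-syllable figure can be checked with a short Python sketch. This is my own reconstruction, not the code used for the plot: the function names are hypothetical, the syllable counts per word are my assumptions about typical American pronunciation (for example, "eleven" and "seventy" as three syllables, "million" as two), and numbers are read without "and".

```python
# Syllables in "one".."nineteen"; index 0 is a placeholder (nothing is spoken).
ONES = [0, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 3, 1, 2, 2, 2, 2, 3, 2, 2]
# Syllables in "twenty".."ninety", keyed by the tens digit.
TENS = {2: 2, 3: 2, 4: 2, 5: 2, 6: 2, 7: 3, 8: 2, 9: 2}
# Syllables in the scale word for each group of three digits:
# none, "thousand", "million", "billion".
SCALES = [0, 2, 2, 2]

def syllables_below_1000(n):
    """Syllables in the spoken form of 1..999, read without 'and'."""
    count = 0
    if n >= 100:
        count += ONES[n // 100] + 2  # "<digit> hundred"
        n %= 100
    if n >= 20:
        count += TENS[n // 10]
        n %= 10
    return count + ONES[n]

def syllables(n):
    """Syllables in the spoken form of n, for 1 <= n < 10**12."""
    count, group = 0, 0
    while n:
        n, chunk = divmod(n, 1000)
        if chunk:  # an all-zero group is not spoken
            count += syllables_below_1000(chunk) + SCALES[group]
        group += 1
    return count

print(syllables(777_777))
```

Under these assumptions the function returns 20 for 777,777, and scanning every number from one to one million with max(range(1, 1_000_001), key=syllables) picks out that same number as the longest.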