Posts for Tag: languages

Visualizing the accuracy of the “i before e, except after c” spelling rule

Posted In: Language

See which words follow and break the “i before e” rule

I wanted to see how often the spelling rule “i before e, except after c” has exceptions. I found a website (wordfrequency.info) with a list of the 5050 most common English words and analyzed it to see which words follow the rule and which do not. Below is a treemap that shows the words that follow the rule in green and those that do not in red. The size of each box represents how common the word is in regular American English usage (based on how frequently it appears in the Corpus of Contemporary American English).

What we see is that while 81% of the 158 most common words with ‘e’ and ‘i’ adjacent to one another do follow the rule, the picture changes once you take into account how frequently these words are used: the weighted percentage of words following the rule drops to around 60%. This is because some very commonly used words do not follow the rule. If you were to count how many times you use words from this list, about 40% of the time you would likely be using words that break it. For example, the two most commonly used words with ‘e’ and ‘i’ adjacent (their and being) do not follow the rule, since they have the ‘e’ before the ‘i’ but the pair does not come after a ‘c’.
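To make the difference between the two percentages concrete, here is a minimal Python sketch. The function name `rule_shares` and the frequencies in `sample` are made up for illustration and are not COCA figures. The unweighted share counts each word once, while the weighted share counts each word in proportion to how often it is used.

```python
def rule_shares(entries):
    """Return (unweighted, weighted) share of words that follow the rule.

    entries: iterable of (word, frequency, follows_rule) tuples.
    """
    entries = list(entries)
    total_words = len(entries)
    total_freq = sum(freq for _, freq, _ in entries)
    following = [(word, freq) for word, freq, ok in entries if ok]
    # Unweighted: every word counts once, no matter how common it is.
    unweighted = len(following) / total_words
    # Weighted: every word counts in proportion to its usage frequency.
    weighted = sum(freq for _, freq in following) / total_freq
    return unweighted, weighted


# Hypothetical frequencies, chosen only to show the effect (not real data).
sample = [
    ("their", 500, False),
    ("being", 300, False),
    ("friend", 100, True),
    ("believe", 60, True),
    ("receive", 40, True),
]
unweighted, weighted = rule_shares(sample)
print(f"unweighted: {unweighted:.0%}, weighted: {weighted:.0%}")
```

In this toy sample most of the words follow the rule, but the two rule-breakers are so common that the weighted share ends up much lower than the unweighted one. That is the same effect described above, just exaggerated.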

 

I was inspired to look into this after seeing a tweet about the rule in the comic Pearls Before Swine by @stephanpastis.

I asked my kids, but they had never heard of the rule, so perhaps it isn’t taught in schools anymore.
 

Sources and Tools:
I downloaded the word list from wordfrequency.info. The word list comes from the Corpus of Contemporary American English (COCA), a collection of English works across a wide variety of genres (spoken, fiction, popular magazines, newspapers, academic texts, TV and movie subtitles, blogs, and other web pages) from between 1990 and 2020. Each word in the list was then categorized with JavaScript as either fitting or breaking the rule. The visualization uses the plotly.js open source graphing library and HTML/CSS/JavaScript code for the interactivity and UI.
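The categorization step can be sketched roughly as follows. The original analysis was done in JavaScript and its exact logic isn't shown here, so this Python version is only an illustration. The function name `classify` is hypothetical, and so is the assumption that a word breaks the rule if any of its ‘ie’/‘ei’ pairs is in the wrong order.

```python
import re


def classify(word):
    """Return 'follows', 'breaks', or None if the word has no adjacent e/i pair.

    Assumption: a word breaks the rule if ANY of its "ie"/"ei" pairs
    is in the wrong order for its position.
    """
    word = word.lower()
    matches = list(re.finditer(r"ie|ei", word))
    if not matches:
        return None
    for m in matches:
        after_c = m.start() > 0 and word[m.start() - 1] == "c"
        # The rule expects "ei" right after a c, and "ie" everywhere else.
        expected = "ei" if after_c else "ie"
        if m.group() != expected:
            return "breaks"
    return "follows"


for w in ["their", "being", "friend", "receive", "science", "word"]:
    print(w, classify(w))
```

Under these assumptions, their and being are classified as breaking the rule, which matches the examples given above. A word like science also breaks it, because it has ‘ie’ directly after a ‘c’.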


How Long Would It Take You to Count to One Million?

Posted In: Counting | Fun | Math

Please excuse my poor Spanish. I used Google Translate to write this in Spanish.
Here is the calculator that will calculate how long it takes to count to one million (or larger numbers) in Spanish.

Counting to One Million, One Billion or One Trillion in Spanish

Posted In: Counting | Programming

There was a lot of interest in the calculator that estimates counting time (in English) to one million, one billion and up to one trillion. I decided to do the same for another popular language, Spanish. Here is the calculator that will calculate how long it takes to count to one million (or larger numbers) in Spanish. If you’d like to see this in Spanish click here.

Counting to One Million in Different Languages (Chinese, English and Spanish)

Posted In: Counting | Math

Updated: Lots of folks on Reddit pointed out some mistakes in the Spanish calculations, and helped me figure out the solutions, so the Spanish graphs are now updated. The Spanish calculator is now live!

Building off of the last post about Counting to One Million in English, I received some comments suggesting I look at other languages. That seemed like a very good idea, so I looked at a list of the world’s most popular languages and saw Chinese and Spanish listed with English in the top three. Having a little experience with both of those, I set out to compare how long it would take to count in each of these languages if you had to pronounce every single number from one to one million.

Again, here’s the plot of the number of syllables per number for English. The longest number is seven hundred seventy-seven thousand seven hundred seventy-seven (20 syllables).
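As a rough check on that figure, here is a small Python sketch that counts the syllables in the English name of any number from one to one million. The table and function names are my own, and it assumes American-style number names with no “and” (for example, “seven hundred seventy-seven”).

```python
# Syllables in the English names for 0-19 (zero is never spoken inside a
# larger number, so it contributes 0 here).
ONES = [0, 1, 1, 1, 1, 1, 1, 2, 1, 1,   # zero .. nine
        1, 3, 1, 2, 2, 2, 2, 3, 2, 2]   # ten .. nineteen
# Syllables in twenty, thirty, ..., ninety, keyed by the tens digit.
TENS = {2: 2, 3: 2, 4: 2, 5: 2, 6: 2, 7: 3, 8: 2, 9: 2}


def syllables_below_1000(n):
    """Syllable count for 0 <= n <= 999 (0 contributes nothing)."""
    count = 0
    if n >= 100:
        count += ONES[n // 100] + 2   # "<digit> hundred"
        n %= 100
    if n >= 20:
        count += TENS[n // 10]        # "twenty" .. "ninety"
        n %= 10
    return count + ONES[n]


def syllables(n):
    """Syllable count for the English name of n, for 1 <= n <= 1,000,000."""
    if n == 1_000_000:
        return 3                      # "one million"
    count = 0
    if n >= 1000:
        count += syllables_below_1000(n // 1000) + 2   # "... thousand"
        n %= 1000
    return count + syllables_below_1000(n)


print(syllables(777_777))
```

With these assumptions, `syllables(777_777)` returns 20, in line with the count quoted above. Seven and seventy are the only digit and tens names with more than the usual number of syllables, which is why a number made entirely of sevens comes out longest.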

Count to one million in English