Posts for Tag: data

Assembling the USA state-by-state with state-level statistics

Posted In: Maps

Watch the United States assemble state by state based on statistics of interest

Based on the earlier popularity of the country-by-country animation, this map lets you watch as the United States is built up one state at a time. This can be done along a wide range of statistical dimensions:

  • Name (alphabetical)
  • Abbreviation
  • Date of entry to the United States
  • State Population (2018)
  • Population per Electoral Vote (2018)
  • Population per House Seat (2018)
  • Land Area (square miles)
  • Population Density (ppl per sq mi) (2018)
  • State’s Highest Point
  • Highest Elevation (ft)
  • Mean Elevation (ft)
  • State’s Lowest Point
  • Lowest Point (ft)
  • Life Expectancy at Birth (yrs)
  • Median Age (yrs)
  • Percent with High School Education
  • Percent with Bachelor’s Degree
  • Residential Electricity Price (cents per kWh) (2018)
  • Gasoline Price ($/gal) Regular unleaded (2019)
  • State Gross Domestic Product GDP ($Million) (2018)
  • GDP per capita ($/capita)
  • Number of Counties (or subdivisions)
  • Average Daily Solar Radiation (kWh/m2)
  • Birth rate (per thousand population)
  • Avg Age of Mother at Birth
  • Annual Precipitation (in/yr)
  • Average Temperature (deg F)

These statistics can be sorted from small to large or vice versa to get a view of the US and its constituent states plus DC in a unique and interesting way. It’s a bit hypnotic to watch as the states appear and add to the country one by one.

    You can use this map to display all the states that have a higher life expectancy than Texas:
    select “Life expectancy”, sort from “high to low” and use the scroll bar to move to Texas and you’ll get a picture like this:
    States with higher life expectancy than Texas

    or this map to display all the states that have a higher population density than California:
    select “Population density”, sort from “high to low” and use the scroll bar to move to California and you’ll get a picture like this:
    States with higher population density than California
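The selection described above (sort by a statistic, then keep every state that appears before the chosen one) can be sketched in a few lines of Python. This is only an illustration and not the map's actual code; the function name `states_ranked_before` is hypothetical, and the values are placeholders rather than real statistics.

```python
def states_ranked_before(stats, target, high_to_low=True):
    """Return the states that appear before `target` once sorted by value."""
    ordered = sorted(stats, key=stats.get, reverse=high_to_low)
    return ordered[:ordered.index(target)]


# Placeholder values for illustration only; these are NOT real statistics.
example = {"State A": 81.0, "State B": 79.5, "State C": 78.0, "State D": 76.2}

# Sorted high to low, everything listed before "State C" has a higher value.
print(states_ranked_before(example, "State C"))
```

Sorting from low to high instead (`high_to_low=False`) returns the states with lower values than the chosen one.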

    I hope you enjoy exploring the United States through a number of different demographic, economic and physical characteristics through this data viz tool. And if you have ideas for other statistics to add, I will try to do so.

    Data and tools: Data was downloaded from a variety of sources:

    • Population https://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States_by_population
    • Admission to union https://simple.wikipedia.org/wiki/List_of_U.S._states_by_date_of_admission_to_the_Union
    • Educational attainment https://nces.ed.gov/programs/digest/d18/tables/dt18_104.88.asp
    • Highest points https://geology.com/state-high-points.shtml
    • Life expectancy https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_life_expectancy
    • Median Age http://www.statemaster.com/graph/peo_med_age-people-median-age
    • Land area https://statesymbolsusa.org/symbol-official-item/national-us/uncategorized/states-size
    • Mean elevation https://www.census.gov/library/publications/2011/compendia/statab/131ed/geography-environment.html
    • Electricity price https://www.chooseenergy.com/electricity-rates-by-state/
    • Gasoline price https://gasprices.aaa.com/state-gas-price-averages/
    • GDP https://www.bea.gov/data/gdp/gdp-state
    • Sunlight North America Land Data Assimilation System (NLDAS) Daily Sunlight (insolation) for years 1979-2011 on CDC WONDER Online Database, released 2013. Accessed at http://wonder.cdc.gov/NASA-INSOLAR.html on Jun 14, 2019 1:37:15 PM
    • Births United States Department of Health and Human Services (US DHHS), Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS), Division of Vital Statistics, Natality public-use data 2007-2017, on CDC WONDER Online Database, October 2018. Accessed at http://wonder.cdc.gov/natality-current.html on Jun 14, 2019 1:53:58 PM
    • Precipitation North America Land Data Assimilation System (NLDAS) Daily Precipitation for years 1979-2011 on CDC WONDER Online Database, released 2013. Accessed at http://wonder.cdc.gov/NASA-Precipitation.html on Jun 26, 2019 3:30:40 PM
    • Temperature http://www.usa.com/rank/us--average-temperature--state-rank.htm

    The map was created with the help of the open-source Leaflet JavaScript mapping library.

    Visualizing the 4% Rule, Trinity Study and Safe Withdrawal Rates

    Posted In: Financial Independence | Money

    Instructions for using the calculator:


    This calculator is designed to let you learn as you play with it. Tweaking inputs and assumptions and hovering and clicking on results will help you to really gain a feel for how withdrawal rates and market returns affect your chance of retirement success (i.e. making it through without running out of money).

    Inputs You Can Adjust:

    • Spending and initial balance – This will affect your withdrawal rate.  The withdrawal rate is really the only thing that is important (doubling spending and retirement savings will still yield the same success rate).
    • Asset allocation – Raise or lower your risk tolerance by holding more or less stock vs bonds
    • Adjust retirement length – This changes the number of historical cycles that can be used in the simulation (a longer retirement leaves fewer complete cycles), and a longer retirement also increases the risk of failure.
    • Add tax rates and investment fees – These put a drag on (i.e. lower) market returns and reduce success rates.

    Options for Visualization:

    • Display all cycles – this is the mess of spaghetti-like curves that shows all historical cycle simulations
    • Display percentiles – this aggregates the simulations into percentiles to show most likely outcomes
    • Hover/Click on legend years – this will allow you to highlight a single historical cycle (you can also use the arrow keys to step through historical cycles)
    • Bottom graph can show either the sequence of returns (with average returns in 5 year periods) for a single historical cycle or distributions of returns in our historical data (1871 to 2016) and a single historical cycle.  You can choose to look at returns for stocks, bonds or your specific asset allocation.
    • The graph on the right shows a histogram of the ending balance of each historical cycle and color codes them to show percentiles.
     

    What is the 4% Rule?

    The 4% rule is a “rule of thumb” relating to safe retirement withdrawals.  It states that if 4% of your retirement savings can cover one year’s worth of retirement spending (an alternative way to phrase it is that you have saved up 25 times your annual retirement spending), you have a high likelihood of having enough money to last a 30+ year retirement. A key point is that the probabilities shown here are just historical frequencies and not a guarantee of the future. However, if your plan has a high success rate (95+%) in these simulations, this implies that your retirement plan should be okay unless future returns are on par with some of the worst in history.

    The overall goal of this rule and analysis is identifying a “safe withdrawal rate” or SWR for retirement.  A withdrawal rate is the percentage of your money that you withdraw from your retirement savings each year.  If you’ve saved up $1 million and withdraw $100,000 each year, that is a 10% withdrawal rate.

    The “safe” part of the withdrawal rate relates to the fact that if your investments generally grow by more than your annual spending, then your retirement savings should last over the length of your retirement.  But average returns do not tell the whole story as the sequence of returns also plays a very important role, as will be discussed later.

    One way to test this is through a backtesting simulation which forms the basis for the “Trinity Study”.

    What is the Trinity Study?

    The “Trinity Study” is a paper and analysis of this topic entitled “Retirement Spending: Choosing a Sustainable Withdrawal Rate,” by Philip L. Cooley, Carl M. Hubbard, and Daniel T. Walz, three professors at Trinity University. This study is a backtesting simulation that uses historical data to see if a retirement plan (i.e. a withdrawal rate) would have survived under past economic conditions.  The approach is to take a “historical cycle”, i.e. a series of years from the past and test your retirement plan and see if it runs out of money (“fails”) or not (“survives”).

    How do you test withdrawal rate?

    Given that modern equity and bond market data only stretches back about 150 years, there is some, but not a huge amount of, data to use in this simulation.  One example of a 30-year historical cycle would be 1900 to 1930, and another is 1970 to 2000.  The Trinity Study and this calculator test withdrawal rates against all historical periods from 1871 until the present (e.g. 1871 to 1901, 1872 to 1902, 1873 to 1903, . . . 1986 to 2016).  Then, across these 115 different historical cycles, the calculator determines how many survived and how many failed.
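The backtesting idea can be sketched in Python as follows. This is a simplified illustration, not the calculator's actual code: it assumes spending is withdrawn at the start of each year, that returns are already adjusted for inflation, and it ignores taxes and fees. The function names and the short `SAMPLE_RETURNS` list are hypothetical; the numbers are made up purely to show the mechanics and are not historical data.

```python
def withdrawal_rate(annual_spending, initial_balance):
    """Fraction of the starting portfolio withdrawn each year."""
    return annual_spending / initial_balance


def cycle_survives(initial_balance, annual_spending, real_returns):
    """Test one historical cycle: True if the money never runs out."""
    balance = initial_balance
    for yearly_return in real_returns:
        balance -= annual_spending        # withdraw this year's spending
        if balance < 0:                   # could not cover spending: "fails"
            return False
        balance *= 1 + yearly_return      # remaining money grows or shrinks
    return True


def success_rate(initial_balance, annual_spending, real_returns, years):
    """Share of all rolling `years`-long cycles that survive."""
    n_cycles = len(real_returns) - years + 1
    cycles = [real_returns[i:i + years] for i in range(n_cycles)]
    survived = sum(cycle_survives(initial_balance, annual_spending, c)
                   for c in cycles)
    return survived / len(cycles)


# Made-up yearly real returns for illustration only (NOT historical data).
SAMPLE_RETURNS = [0.05, -0.10, 0.08, 0.02, -0.20, 0.15, 0.07, 0.03]

print(withdrawal_rate(100_000, 1_000_000))   # the 10% example from the text
print(success_rate(1_000_000, 40_000, SAMPLE_RETURNS, years=5))
print(success_rate(1_000_000, 250_000, SAMPLE_RETURNS, years=5))
```

Because only the ratio of spending to savings matters in this sketch, doubling both inputs gives the same success rate, which matches the note about withdrawal rate in the inputs list.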

    The thinking is that if your retirement plan can survive periods that include recessions, depressions, world wars, and periods of high inflation, then perhaps it can survive the next 30-50 years.

    The 4% rule that comes out of these studies basically states that a 4% withdrawal rate (e.g. $40,000 annual spending on a $1,000,000 retirement portfolio) will survive the vast majority of historical cycles (~96%).  If you raise your withdrawal rate, the rate of failure increases, while if you lower your withdrawal rate, your rate of failure decreases.

    The goal of this tool is to help you understand the mechanics of a historical cycle simulation like the one used in the Trinity Study and how the 4% rule came to be. This understanding can help you better plan for retirement with the uncertainty that goes along with planning 30+ years into the future. If you also want to see how longevity and life expectancy play a role in retirement planning, you can take a look at the Rich, Broke and Dead calculator.

    This post and tool is a work in progress. I have a number of ideas that I will implement and add to it to help improve the visualization and clarity of these concepts.

    Data source and Tools: Historical stock, bond and inflation data comes from Prof. Robert Shiller. JavaScript is used to create the interactive calculator and to run the simulations that test each historical cycle and aggregate the results. The results are graphed using the open-source Plot.ly JavaScript graphing library.

    4% rule trinity study

    How do Americans Spend Money? US Household Spending Breakdown by Education Level

    Posted In: Money
    How much do US households spend and how does it change with education level?

    This visualization is one of a series of visualizations that present US household spending data from the US Bureau of Labor Statistics. This one focuses on the education level of the primary resident, defined in the BLS documentation as the person who is first mentioned when the survey respondent is asked who in the household rents or owns the home.

    I obtained data from the US Bureau of Labor Statistics (BLS), based upon a survey of consumer households and their spending habits. This data breaks down spending and income into many categories that are aggregated and plotted in a Sankey graph.

    One of the key factors in financial health of an individual or household is making sure that household spending is equal to or below household income. If your spending is higher than income, you will be drawing down your savings (if you have any) or borrowing money. If your spending is lower than your income, you will presumably be saving money which can provide flexibility in the future, fund your retirement (maybe even early) and generally give you peace of mind.

    Instructions:

    • Hover (or on mobile click) on a link to get more information on the definition of a particular spending or income category.
    • Use the dropdown menu to look at averages for different groups of households based on the education level of the primary resident. This data breaks households into the following groups:
      • All
      • Less than HS graduate
      • High school graduate
      • HS grad + some college
      • Associate’s degree
      • Bachelor’s degree
      • Master’s, professional, doctoral degree

      The composition of households and income change as the education level of the primary resident changes, which in turn affects spending totals and individual categories.

    As stated before, one of the keys to financial security is spending less than your income. We can see that on average, income tends to increase with education level. Those with the highest incomes and greatest spending have advanced degrees, but they also save the most money.

    The group with the lowest education level (not finishing high school) has the lowest income and, on average, needs to borrow or draw down savings to fund its lifestyle.

    How does your overall spending compare with those that have the same education level as you? How about spending in individual categories like housing, vehicles, food, clothing, etc…?

    Probably one of the best things you can do from a financial perspective is to go through your spending and understand where your money is going. These Sankey diagrams are one way to do it and see it visually, but of course, you can also make a table or pie chart (honestly, whatever gets you to look at your income and expenses is a good thing).

    The main thing is to understand where your money is going. Once you’ve done this, you can be more conscious of what you are spending your money on, and then decide if you are spending too much (or too little) in certain categories. Having context for what other people spend money on is helpful as well, which is why it is useful to compare to these averages, even though the income level, regional cost of living, and household composition won’t look exactly the same as your household.

    **Click Here to view other financial-related tools and data visualizations from engaging-data**

    Here is more information about the Consumer Expenditure Surveys from the BLS website:

    The Consumer Expenditure Surveys (CE) collect information from the US households and families on their spending habits (expenditures), income, and household characteristics. The strength of the surveys is that it allows data users to relate the expenditures and income of consumers to the characteristics of those consumers. The surveys consist of two components, a quarterly Interview Survey and a weekly Diary Survey, each with its own questionnaire and sample.

    Data and Tools:
    Data on consumer spending was obtained from the BLS Consumer Expenditure Surveys, and aggregation and calculations were done using JavaScript and code modified from the SankeyMATIC plotting website. I aggregated many of the survey output categories to keep the graph legible; otherwise there would be 4x as many spending categories, all very small and difficult to read.

    household spending

    What are the highest mountains on Earth? Measuring from sea level vs center of earth

    Posted In: Geography

    The Highest Mountains On Earth Depend On How You Measure “High”

    Mount Everest is famous for being the highest mountain on Earth. The peak is an incredible 8,848 meters (29,029 ft) above sea level. But that is only one way to measure the height of a mountain. Chimborazo, a mountain in Ecuador, holds the distinction of being the mountain whose peak is the furthest from the center of the Earth. How is that possible? It is because the Earth is not a perfect sphere. Rather, due to the spinning of the Earth around its axis, the centrifugal force causes the equator to bulge out slightly. This flattened shape is called an oblate spheroid and makes the radius of the earth at the equator about 21 km (about 0.3%) larger than the radius at the poles. Mountains close to the equator will “start” further away from the center of the earth than those at higher latitudes.

    This graph plots over 800 of the highest mountains on Earth with their peak height above sea level on the x-axis and their peak distance from the center of the earth on the y-axis. Each point represents one mountain. The colors of the plots correspond to the latitude of the mountain. These mountains range from 3000 meters in height to 9000 meters in height. You can hover over a data point (or click on mobile) to get more information about the mountain. You can also switch from metric to imperial units with the button on the graph.

    For a given mountain range at a certain latitude, you can see that as the mountain heights above sea level increase, so does their distance from the center of the Earth. Mountains in the southern hemisphere are colored in blue, those around the equator are green and yellow, and those in the northern hemisphere are red and orange. The mountains with the highest peaks above sea level are shown on the right side of the graph in red and orange (mostly in the Himalaya), with Mt Everest as the rightmost point on the graph (nearly 9000 meters tall).

    Mountains with peaks the greatest distance from the center of the earth are found near the equator in light green/yellow and are found at the top of the graph. You’ll notice that a number of these mountains are higher than Mt Everest when looking at the distance from the center of the earth.

    The Himalayas are the “highest” mountains on earth if you are measuring height from sea level, while the Andes are the “highest” if you measure from the center of the earth.

     

    Calculating Distance from Earth’s Center to Mountain Peak

    The distance from the center of the Earth is calculated from the following formula:
    $$D_{mountain} = H_{mountain} + R_{lat}$$
    where $D_{mountain}$ is the distance from center of earth to the top of the mountain, $H_{mountain}$ is the mountain height above sea-level and $R_{lat}$ is the radius of earth at the mountain’s latitude. The height is data that was downloaded from a list of mountain heights.

    and the radius of the earth for a given latitude is calculated using the formula:
    $$R_{lat}=\sqrt{\frac{(a^2\cos(lat))^2+(b^2\sin(lat))^2}{(a\cos(lat))^2+(b\sin(lat))^2}}$$
    where $a$ and $b$ are the equatorial and polar radii (6378.137 km and 6356.752 km respectively).
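As a rough check of the two formulas above, here is a small Python sketch. The radii are the values given in the text; the function names are hypothetical, and the latitudes (about 27.99°N for Everest, about 1.47°S for Chimborazo) and Chimborazo's height (about 6,263 m) are approximate figures assumed for illustration rather than taken from this post's dataset.

```python
import math

A_KM = 6378.137   # equatorial radius a (from the text)
B_KM = 6356.752   # polar radius b (from the text)


def earth_radius_km(lat_deg):
    """Radius of the Earth (center to sea level) at a given latitude."""
    lat = math.radians(lat_deg)
    cos_lat, sin_lat = math.cos(lat), math.sin(lat)
    numerator = (A_KM**2 * cos_lat) ** 2 + (B_KM**2 * sin_lat) ** 2
    denominator = (A_KM * cos_lat) ** 2 + (B_KM * sin_lat) ** 2
    return math.sqrt(numerator / denominator)


def peak_distance_km(height_m, lat_deg):
    """D_mountain = H_mountain + R_lat, with the height given in meters."""
    return height_m / 1000 + earth_radius_km(lat_deg)


# Approximate heights and latitudes, assumed for illustration.
everest_km = peak_distance_km(8848, 27.99)
chimborazo_km = peak_distance_km(6263, -1.47)

print(f"Everest:    {everest_km:.1f} km from Earth's center")
print(f"Chimborazo: {chimborazo_km:.1f} km from Earth's center")
```

Under these assumptions Chimborazo's peak comes out roughly 2 km further from the center of the Earth than Everest's, even though Everest is about 2.6 km taller above sea level.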
     

    Earth Radius Calculator

    Here is a calculator for determining the radius of Earth at a given latitude:

    You can use this to calculate the distance from the center of the earth to sea level at your latitude.

     

    Data and Tools:
    Data on the heights of over 800 mountain peaks over 3000 meters in height was downloaded from Wikipedia. There ended up being a lot of Google searching and data cleaning to get it into a suitable format for plotting. The calculations were made with JavaScript and plotted using Plotly, the open-source JavaScript graphing library.

    Mountain Height Graph

    Visualizing The Growth of Atmospheric CO2 Concentration

    Posted In: Environment

    The current CO2 concentration in the atmosphere is over 400 parts per million (ppm). This is about 46% higher than pre-industrial levels (~280 ppm) in the early 1800s. The growing concentration of CO2 is a big concern because CO2 is the most prevalent greenhouse gas emitted by human activities, and it is increasing the temperature of the planet and leading to substantial changes in the Earth’s climate patterns.

    This graph visualizes the growth in CO2 concentration in the atmosphere (mainly from CO2 emissions due to human activities, such as burning fossil fuels for energy production, deforestation and other industrial processes). The graph starts at 1980, when CO2 concentration in the atmosphere was around 340 ppm. It has grown significantly since then.

    One of the interesting aspects of CO2 concentration is that it is not identical all around the globe, as it takes a while for the atmosphere to mix. The graph shows geographic differences in CO2 concentration as well as seasonal ups and downs that underlie an overall growing trend in annual average (mean) concentration.

    Seasonal trends in CO2 concentration occur due to differences in the amount of plant growth across different months. Spring and summer plant growth in the northern hemisphere causes a significant amount of photosynthesis, and CO2 absorption, relative to the fall and winter. This plant growth causes a very large amount of CO2 to be absorbed by plants and a noticeable reduction in the amount of CO2 in the atmosphere. The southern hemisphere spring and summer (northern hemisphere fall and winter) aren’t as obvious because there is much less land in the southern hemisphere and the land that is there is close to the tropics and green all year round.

    CO2 concentration can change by about 4-5 ppm due to the “breathing” of plants, which is pretty significant. The total weight of CO2 in the atmosphere is about 3 trillion tonnes, so 4-5 ppm is about 1% of this, or roughly 30 billion tonnes of CO2 removed by plant life each spring/summer.
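The back-of-the-envelope arithmetic above can be written out as a short Python sketch. It uses the round numbers quoted in this post (about 3 trillion tonnes of CO2 at roughly 400 ppm) and assumes that the mass of CO2 scales linearly with concentration; the function name is hypothetical.

```python
TOTAL_CO2_TONNES = 3e12    # approximate total CO2 in the atmosphere (from the text)
CONCENTRATION_PPM = 400    # approximate current concentration (from the text)


def tonnes_for_ppm(delta_ppm):
    """Mass of CO2 corresponding to a change of `delta_ppm`, assuming linear scaling."""
    return TOTAL_CO2_TONNES * delta_ppm / CONCENTRATION_PPM


for swing_ppm in (4, 5):
    share = swing_ppm / CONCENTRATION_PPM
    billions = tonnes_for_ppm(swing_ppm) / 1e9
    print(f"{swing_ppm} ppm is {share:.2%} of the total, about {billions:.1f} billion tonnes")
```

With these inputs, a 4 ppm swing works out to 30 billion tonnes and a 5 ppm swing to 37.5 billion tonnes, consistent with the "about 1%" estimate.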

    Data and Tools:

    Data comes from the US National Oceanic and Atmospheric Administration (NOAA). Data was downloaded using an automated Python script and the graphs were made using JavaScript and the open-source Plot.ly JavaScript engine.

    CO2 concentration graph

    How Much Does Each State Pay In Taxes?

    Posted In: Money

    Given that tax day has just passed, I thought it would be good to check out some data on taxes. The IRS provides a great resource on tax data that I’ve only just gotten into. I think I’ll be able to do more with this in the future. This one looks at how taxes paid vary by state and presents the data as a choropleth map (coloring states based on certain categories of tax data).

    You can choose from a number of different categories:

    • Mean Federal Tax Paid
    • Mean Adjusted Gross Income
    • Mean State/Local Tax
    • Mean Combined (Fed/State/Local) Tax
    • Percent Income from Dividends and Capital Gains
    • Percent of Returns with Itemized Deductions
    • Number of Tax Returns
    • Mean Federal Tax Rate
    • Mean State/Local Tax Rate
    • Mean Combined (Fed/State/Local) Rate
    • Total Federal Tax Liability

    I may add more categories in the future, so if you have ideas of tax data you want to see visualized let me know and I’ll see what I can do.

    For other tax-related tools and visualizations see my tax bracket calculator and visualization of marginal tax rates.

    **Click Here to view other financial-related tools and data visualizations from engaging-data**

    Data and Tools:

    Data on tax returns by state is from the IRS website in Excel format. The map was made using the Leaflet open-source mapping library. Data was compiled in Excel and calculations were made using JavaScript.

    How much each state paid in taxes
