Posts for Tag: US

Splitting the US by Population

Posted In: Geography | Maps
map of US split into 8 regions by population

This visualization lets you divide the US into 1, 2, 3, 4, 5, 8 or 10 segments of equal population along different dimensions. The divisions use counties as the building blocks (there are 3,143 in the US). There are many ways to make the divisions; this visualization lets you divide by geographic direction or by population density.
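As a rough illustration of the idea (the actual visualization is written in Javascript, and this is not its code), here is a minimal Python sketch that orders counties along one dimension and cuts the running population total into equal shares. The county names, longitudes and populations are made up, and the function name `split_by_population` is hypothetical.

```python
from typing import List, Tuple

# (name, longitude, population); illustrative values only, not Census data
County = Tuple[str, float, int]


def split_by_population(counties: List[County], n_segments: int) -> List[List[str]]:
    """Order counties west to east, then cut into segments of roughly equal population."""
    ordered = sorted(counties, key=lambda c: c[1])
    total = sum(c[2] for c in ordered)
    segments: List[List[str]] = [[] for _ in range(n_segments)]
    running = 0
    for name, _lon, pop in ordered:
        # Place the county by the midpoint of its slice of the cumulative population
        midpoint = running + pop / 2
        index = min(int(midpoint * n_segments / total), n_segments - 1)
        segments[index].append(name)
        running += pop
    return segments


counties = [("A", -120.0, 100), ("B", -110.0, 100), ("C", -100.0, 100), ("D", -90.0, 100)]
halves = split_by_population(counties, 2)
```

Because whole counties are the building blocks, the segments are only approximately equal; swapping the sort key (latitude, distance from a center, population density) gives the other kinds of division.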

Instructions

  • Select a dimension on which to divide up the country – there are geographic dimensions, like north to south or east to west, or by population density
  • For some geographic divisions (concentric rings or pie slices), you can choose the geographic center of the divisions
  • You can also choose the number and color scheme of the divisions
  • To show the divisions, either click the Animate Counties button or use the slider to add counties

If you can think of other interesting ways to divide up the US, please let me know and I can try to add them to this visualization.

Sources and Tools:
2018 county population data is from the US Census Bureau. The map is created using the Leaflet javascript mapping library, and the data wrangling, user interface and interactivity are created using HTML, CSS and Javascript code.


US Baby Name Popularity Visualizer

Posted In: Fun
baby name popularity visualizer

I added a share button (arrow button) that lets you send a graph for a specific name. It copies a custom URL to your clipboard, which you can paste into a message, tweet or email.

How popular is your name in US history?

Use this visualization to explore statistics about names, specifically the popularity of different names throughout US history (1880 until 2020). This is a useful tool for seeing the rise (and fall) of popularity of names. Look at names that we think of as old-fashioned, and names that are more modern.

Instructions

  • Start typing a name into the input box above and the visualization will show all the names that begin with those letters. The graph will show the historical popularity of all these names as an area graph.
  • You can hover (or click on mobile) to bring up a tooltip (popup) that shows you the exact number of births with that name for different years (or decades) and the name’s rank in that time period.
  • It’s best used on a computer (rather than a mobile or tablet device) so you can see the graph more clearly. Also, if you click on a name wedge, it will zoom into names that begin with those letters.
  • You can select different views (Boy names, Girl names or both), and look at either the raw number of births or a normalized popularity that accounts for the differing number of births throughout the period between 1880 and 2020.
  • If you click the share (arrow) button, it will copy the parameters of the current graph you are looking at and create a custom URL to share with others. It copies the link into your clipboard and your browser’s address (URL) bar.
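The share button boils down to encoding the current view as URL query parameters and reading them back when the page loads. Here is a minimal Python sketch of that round trip; the parameter names (`name`, `sex`, `norm`) and the example.com address are assumptions for illustration, not the ones the visualization actually uses.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_share_url(base_url: str, name: str, sex: str, normalized: bool) -> str:
    """Encode the current graph settings as a query string (hypothetical parameter names)."""
    params = {"name": name, "sex": sex, "norm": int(normalized)}
    return base_url + "?" + urlencode(params)


def read_share_url(url: str) -> dict:
    """Recover the settings from a shared URL; every value comes back as a string."""
    query = parse_qs(urlparse(url).query)
    return {key: values[0] for key, values in query.items()}


share_url = build_share_url("https://example.com/names/", "Jen", "girls", True)
settings = read_share_url(share_url)
```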

Isn’t there something out there like this already? Baby Name Wizard and Baby Name Voyager

This visualization is not my original idea, but rather a re-creation of the Baby Name Voyager (from the Baby Name Wizard website) created by Laura Wattenberg. The original visualization disappeared (for some unknown reason) from the web, and I thought it was a shame that we should be deprived of such a fun resource.

It started about a week ago, when I saw on Twitter that the Baby Name Wizard website was gone. Here’s the blog post from Laura. I hadn’t used it in probably a decade, but it took me back to the days well before I got into web programming and dataviz, when I first saw the Baby Name Voyager and thought how amazing it was that someone could even make such a thing. Everyone I knew played with it quite a bit when it first came out. It got me thinking that it should still be around, that I could probably build it now with my programming skills, and how cool that would be.

So I downloaded the frequency data for baby names from the US Social Security Administration and set to work trying to create a stacked area graph of baby names vs time. I started with my go-to library for fast dataviz (Plotly.js) but eventually ended up creating the visualization in d3.js, which is harder for me but made it very responsive. I’m not an expert in d3, but I know enough that, with some similar examples and lots of googling and Stack Overflow, I could create what I wanted.

I emailed Laura after creating a sample version, just to make sure it was okay to re-create it as a tribute to the Baby Name Wizard / Voyager and got the okay from her.

Where does the data come from?

Some info about Data (from SSA Baby Names Website):

All names are from Social Security card applications for births that occurred in the United States after 1879. Note that many people born before 1937 never applied for a Social Security card, so their names are not included in our data.

Name data are tabulated from the “First Name” field of the Social Security Card Application. Hyphens and spaces are removed, thus Julie-Anne, Julie Anne, and Julieanne will be counted as a single entry.
Name data are not edited. For example, the sex associated with a name may be incorrect.

Different spellings of similar names are not combined. For example, the names Caitlin, Caitlyn, Kaitlin, Kaitlyn, Kaitlynn, Katelyn, and Katelynn are considered separate names and each has its own rank.

All data are from a 100% sample of our records on Social Security card applications as of March 2021.

I did notice that there was a significant under-representation of male names in the early data (before 1910) relative to female names. In the normalized data, I set the data for each sex to 500,000 male and 500,000 female births per million total births, instead of using the actual data, which shows approximately twice as many female names as male names. I wasn’t sure why females would have higher rates of Social Security applications in the early 20th century.

Update: A helpful Redditor pointed me to this blog post, which explains some of the wonkiness of the early data. The gist of it is that Social Security cards and numbers weren’t really a thing until 1935. Thus the names of births in 1880 are actually those of 55-year-olds who applied for Social Security numbers, and since the numbers weren’t mandatory, the data doesn’t include everyone. My correction basically assumes that this data is a survey in which we got uneven samples of male and female respondents. It’s not perfect (unlike the later data), but it’s a decent representation of name distribution.
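The correction described above can be sketched in a few lines of Python: within each year, each sex is rescaled so that its names add up to 500,000 per million births. This is only an illustration with made-up counts; the function name `normalize_year` and the data layout are assumptions, not the actual script.

```python
def normalize_year(counts_by_sex: dict, per_sex_total: int = 500_000) -> dict:
    """Rescale one year's counts so each sex sums to per_sex_total.

    counts_by_sex looks like {"F": {name: births}, "M": {name: births}}.
    """
    normalized = {}
    for sex, counts in counts_by_sex.items():
        total = sum(counts.values())
        if total == 0:
            # No recorded births for this sex in this year; nothing to scale
            normalized[sex] = {}
            continue
        normalized[sex] = {
            name: births * per_sex_total / total for name, births in counts.items()
        }
    return normalized


# Made-up counts with twice as many recorded female births as male births
year_counts = {"F": {"Mary": 300, "Anna": 100}, "M": {"John": 150, "William": 50}}
normalized = normalize_year(year_counts)
```

After scaling, Mary and John end up with the same normalized popularity, even though the raw female counts are twice as large.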

Sources and Tools:
The biggest source of inspiration was, of course, Laura Wattenberg’s original Baby Name Voyager.

I downloaded the baby names from the Social Security website. Thanks to Michael W. Shackleford at the SSA for starting their name data reporting. I used a python script to parse and organize the historical data into the proper format for my javascript. The visualization is created using HTML, CSS and Javascript code (and the d3.js visualization library) to create interactivity and UI. Curran Kelleher’s area label d3 javascript library was a huge help for adding the names to the graph.
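For a sense of what the parsing step involves, here is a minimal Python sketch, not the actual script. It assumes the SSA national files contain one `Name,Sex,Count` line per name per year, and it builds a table that can be filtered by the letters typed into the input box. The function names and the sample counts are illustrative.

```python
from collections import defaultdict


def parse_year(text: str, year: int, table: dict) -> None:
    """Add one year of 'Name,Sex,Count' lines to table[(name, sex)][year]."""
    for line in text.strip().splitlines():
        name, sex, count = line.strip().split(",")
        table[(name, sex)][year] = int(count)


def names_with_prefix(table: dict, prefix: str, sex: str = None) -> dict:
    """Return the series for every name starting with prefix (case-insensitive)."""
    prefix = prefix.lower()
    return {
        key: series
        for key, series in table.items()
        if key[0].lower().startswith(prefix) and (sex is None or key[1] == sex)
    }


table = defaultdict(dict)
# Sample lines with illustrative counts
parse_year("Mary,F,7065\nMarie,F,500\nJohn,M,9655", 1880, table)
parse_year("Mary,F,6919\nJohn,M,8769", 1881, table)
matches = names_with_prefix(table, "mar")
```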


National Park 3D Elevation Models

Posted In: Geography | Maps
yosemite 3D model

Play with an interactive 3D model of some popular National Parks in the US

I wanted to try my hand at creating 3D elevation models and thought trying to model some of the popular (and some of my favorite) national parks would be a good starting point.

Instructions

Once a 3D elevation model is selected and shown, you can manipulate it in multiple ways:

  • Zoom – You can zoom in and out, though the method depends on the device you are using. Try scrolling or pinch to zoom. You can also select the magnifying glass in the toolbar and drag to zoom.
  • Rotate – You can rotate and change the angle of the model by clicking and dragging on the model. This is the default selection in the toolbar (circular arrow around z axis)
  • Pan – You can move the model around if you select the panning tool from the toolbar (arrows going in all directions)
  • Show contours – if you hover or click on part of the map, it can show all the areas of the model with the same elevation and the tooltip will show the geographic coordinates and elevation (you can toggle showing the tooltip if you select the tooltip bar)
  • Save image – click on the camera icon in the toolbar to save as png
  • Colors – you can change the color scale used to show elevation. You can also reverse the color scale.
  • Change vertical exaggeration – you can select how much the vertical height is exaggerated using the ‘Height Scale’ slider. You can change it from 1 (no exaggeration) to 11 (vertical scale is exaggerated by a factor of 11).
  • Change min elevation – you can select whether the minimum elevation is sea level or the lowest elevation in the park.

You can select a number of different parks from the drop down menu. If you have suggestions for additional parks, I may be able to add them to the list.

Note: the elevation files are data intensive, since the visualization downloads elevation data across, in some cases, many hundreds or thousands of square miles. To keep the data needs down, I’ve reduced the resolution of the elevation data. Though the original data is at 90 meter resolution (elevation is specified for every 90 x 90 m square in each park), I’ve averaged these squares together so that each park model has only tens of thousands of squares, regardless of the actual area of the park. This improves data loading and rendering times and makes the model more responsive.
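The averaging step can be sketched as block averaging over the elevation grid. This is a minimal pure-Python illustration, not the actual script; the function names and the target of 10,000 cells are assumptions chosen to match the "tens of thousands" figure above.

```python
import math


def choose_factor(rows: int, cols: int, target_cells: int = 10_000) -> int:
    """Pick a block size so the downsampled grid has at most about target_cells cells."""
    return max(1, math.ceil(math.sqrt(rows * cols / target_cells)))


def downsample(grid: list, factor: int) -> list:
    """Average factor x factor blocks of a 2D elevation grid (edge blocks may be smaller)."""
    rows, cols = len(grid), len(grid[0])
    result = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [
                grid[i][j]
                for i in range(r, min(r + factor, rows))
                for j in range(c, min(c + factor, cols))
            ]
            out_row.append(sum(block) / len(block))
        result.append(out_row)
    return result


# Tiny made-up grid of elevations in meters
grid = [
    [1, 1, 3, 3],
    [1, 1, 3, 3],
    [5, 5, 7, 7],
    [5, 5, 7, 7],
]
coarse = downsample(grid, 2)
```

A larger park gets a larger block size, which is why every model ends up with a similar number of cells regardless of area.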

Sources and Tools:
This visualization is written in HTML/CSS/Javascript. Digital elevation data is obtained from OpenTopography and uses Shuttle Radar Topography Mission GL3 (90 meter resolution). The elevation data is downloaded using the OpenTopography API and parsed in a python script, which downsamples the data to limit the number of elevation cells. The script also determines if a point is inside or outside of the park boundaries in order to create the elevation model. The 3D model is rendered using the Plotly open-source javascript graphing library.
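One common way to decide whether a point is inside a boundary is the ray-casting test, sketched below in Python. Whether the actual script uses this method is an assumption on my part; the square "park" boundary and the function names are purely illustrative.

```python
def point_in_polygon(x: float, y: float, polygon: list) -> bool:
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def mask_outside(grid: list, xs: list, ys: list, boundary: list) -> list:
    """Replace elevations outside the boundary with None so they are not drawn."""
    return [
        [z if point_in_polygon(x, y, boundary) else None for x, z in zip(xs, row)]
        for y, row in zip(ys, grid)
    ]


# Hypothetical square boundary and a 2 x 2 grid of cell centers
boundary = [(0, 0), (4, 0), (4, 4), (0, 4)]
masked = mask_outside([[10, 20], [30, 40]], xs=[2, 5], ys=[2, 5], boundary=boundary)
```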


Animation of Coronavirus Cases and Deaths in US

Posted In: Health | Maps

Visualize the large number of coronavirus cases and deaths in the US each day/hour in about 10 seconds

The rate of COVID-19 deaths and cases in the US is crazy high after the 2020 winter holidays and may still be going up. This visualization shows the number of COVID cases that occur in one hour or the COVID deaths that occur in one day, based on the average of the last five days. This is another attempt to show the true scale of how many cases and deaths the US is dealing with, since it is often hard to understand large numbers. I have also attempted to show the scale of US deaths/cases here and here. Unfortunately, there are so many people getting sick and dying that it’s hard to fathom just how many people this actually is.

The 5-day averaging was done to smooth out any peaks and troughs in data reporting due to weekends/holidays, since I noticed that some states were literally reporting zero COVID cases some days while reporting many hundreds or thousands of cases other days.
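Here is a minimal Python sketch of that smoothing: average the last five daily reports, then divide by 24 to get the cases shown for one hour. The daily numbers are made up (including the zero-report days), and the function names are illustrative rather than taken from the actual script.

```python
def trailing_average(daily_counts: list, window: int = 5) -> float:
    """Average of the most recent `window` daily reports."""
    recent = daily_counts[-window:]
    return sum(recent) / len(recent)


def cases_per_hour(daily_cases: list, window: int = 5) -> float:
    """Smoothed daily cases converted to an hourly rate."""
    return trailing_average(daily_cases, window) / 24


# Made-up reports for one state: two days report zero, the next day catches up
daily_cases = [1800, 0, 3000, 2400, 0, 6600]
hourly = cases_per_hour(daily_cases)
```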

The dots shown on the animation are located in the state that the cases or deaths occur but are randomly spread out within the state. This is done for visual clarity since if they were shown in their actual location, most of the dots would be overlapping in urban, high density areas. This approach lets you see which states have high COVID instances but still locate them by state.

You can share this animation by adding ?cat=deaths or ?cat=cases to the end of the URL, or by copying and sharing one of these links:

  • https://engaging-data.com/animation-of-us-covid/?cat=cases
  • https://engaging-data.com/animation-of-us-covid/?cat=deaths
Sources and Tools:
The coronavirus data comes from the covidtracking.com API. The data is parsed daily using a custom python script and visualizations are made using the open-source Leaflet javascript mapping library and the interface and animation are made using HTML/CSS/javascript.


Stimulus Check Calculator (Late 2020 & Early 2021)

Posted In: Government | Money
stimulus check calculator

How much money can you expect in your stimulus check?

Updated to include the $1,400 stimulus payment per adult and dependent in March 2021.
Use this stimulus check calculator to figure out how much you will receive in your third stimulus check.

On December 21, 2020, Congress passed a $900 billion stimulus package in response to the COVID pandemic. The bill authorizes economic assistance to Americans in the amount of $600 per person, subject to income limits. It also includes expanded unemployment benefits, rental assistance and an extension to the eviction ban. This calculator helps you calculate the amount of stimulus check that you can expect to receive based on your 2019 tax return filing status, adjusted gross income and number of dependents under 17.

Changing the inputs to the calculator will show you how your expected stimulus check amount will change. The graph shows, for a given filing status (single, married filing jointly or head of household), how the stimulus check amount will change as a function of income and number of children. You can share a URL with specific parameters included.

It sounds like some checks may even reach folks at the end of December, and many more will arrive in January 2021.

In March 2021, Congress passed the American Rescue Plan, which includes $1,400 payments per adult and dependent. The phase-out of this stimulus check is different: over a $10,000 income range starting at the phase-out threshold, the stimulus goes from 100% to 0%, no matter how many dependents you have. This changes things significantly, as you’ll see in the calculator.
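To make the difference between the two phase-outs concrete, here is a minimal Python sketch. It is not the calculator's code. The $150,000 threshold used in the example and the 5% reduction rate for the $600 check are assumptions based on news reports at the time, and the $10,000 range is the one mentioned above; treat all of them as adjustable parameters.

```python
def second_check(agi: float, people: int, threshold: float,
                 per_person: float = 600, reduction_rate: float = 0.05) -> float:
    """$600-per-person check, reduced by a fixed rate on income above the threshold.

    The 5% reduction_rate is an assumption, not a figure from this post.
    """
    full_amount = per_person * people
    reduction = max(0.0, agi - threshold) * reduction_rate
    return max(0.0, full_amount - reduction)


def third_check(agi: float, people: int, threshold: float,
                per_person: float = 1400, phase_out_range: float = 10_000) -> float:
    """$1,400-per-person check that falls from 100% to 0% across phase_out_range."""
    full_amount = per_person * people
    over = max(0.0, agi - threshold)
    fraction = max(0.0, 1.0 - over / phase_out_range)
    return full_amount * fraction


# Hypothetical family of four, $155,000 income, assumed $150,000 threshold
earlier = second_check(155_000, 4, 150_000)
later = third_check(155_000, 4, 150_000)
```

With the fixed-rate reduction, a larger family keeps receiving something at higher incomes; with the fixed-range phase-out, every household reaches zero at the same income.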

Sources and Tools:
The stimulus check calculator is made using javascript and the plotly open source graphing library. It is based on news reports of the expected stimulus amounts and income thresholds.

Election Results and Population Density

Posted In: Elections
election county population density

How do 2020 presidential election results correlate with population density?

The visualization I made about county election results and comparing land area to population size was very popular around the time of the 2020 presidential election. As the counties were represented by population, it was clear that Democratic-leaning areas on that map tended to grow in size, while Republican-leaning areas tended to shrink. This raised the question of exactly how population density correlates with election results.

Hover over (or click on) the bubbles to see information about the county.

It’s clear there is a very strong correlation between the vote margin and population density. Vote margin is the percentage amount by which one candidate beat the other in the county (0% means a tie, while 50% means that one candidate got 75% and the other got 25% of the vote share). Population density is calculated as people per square mile in the county and is shown in the graph on a log scale, where each major grid line is 10 times greater than the previous one. This is done because there is a one to two orders of magnitude difference between the densest counties (in New York City) and even moderately dense counties. There are also several counties with population density below 1 person per square mile (several in Alaska because of the size of their counties), but these are excluded from the graph.
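The two quantities plotted in the graph can be computed as follows. This is a small Python sketch with made-up numbers; it treats the margin as a two-candidate calculation, as in the 75% vs 25% example above, and the function names are illustrative.

```python
import math


def vote_margin(votes_a: int, votes_b: int) -> float:
    """Winner's lead as a percentage of the two-candidate vote."""
    total = votes_a + votes_b
    return abs(votes_a - votes_b) / total * 100


def log_density(population: int, area_sq_miles: float) -> float:
    """log10 of people per square mile; each whole step is a 10x change in density."""
    return math.log10(population / area_sq_miles)


margin = vote_margin(75, 25)        # one candidate gets 75%, the other 25%
position = log_density(10_000, 10)  # 1,000 people per square mile
```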

Richmond County, NY (i.e. the Borough of Staten Island) is the densest county that Trump won (17th densest in the country). The densest counties favored Biden quite heavily, as he won 45 of the 50 densest counties in the country, which also tend to have a fairly high population.

This second graph is a histogram that categorizes counties into discrete bins by population density. Note that the bins are on a log scale as well. You can toggle the graph to show the number of counties won by each candidate or the number of votes won in each of the population density bins. The black line shows the percentage of counties (or votes) won by the Democratic candidate (Joe Biden) in each of those bins.
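Binning on a log scale can be sketched like this in Python. The bin width (one factor of 10 per bin) is an assumption for illustration, since the actual graph may use finer bins, and the county figures are made up.

```python
import math
from collections import defaultdict


def bin_counties(counties: list) -> dict:
    """Group (density, dem_votes, rep_votes) tuples into log10 density bins.

    Bin k holds densities from 10**k up to (but not including) 10**(k + 1).
    """
    bins = defaultdict(lambda: {"dem_counties": 0, "rep_counties": 0,
                                "dem_votes": 0, "rep_votes": 0})
    for density, dem_votes, rep_votes in counties:
        key = math.floor(math.log10(density))
        winner = "dem_counties" if dem_votes > rep_votes else "rep_counties"
        bins[key][winner] += 1
        bins[key]["dem_votes"] += dem_votes
        bins[key]["rep_votes"] += rep_votes
    return dict(bins)


def dem_share_of_counties(bin_totals: dict) -> float:
    """Percentage of counties in a bin won by the Democratic candidate."""
    total = bin_totals["dem_counties"] + bin_totals["rep_counties"]
    return bin_totals["dem_counties"] / total * 100


# Made-up counties: (people per square mile, Democratic votes, Republican votes)
sample = [(5, 100, 300), (50, 200, 100), (80, 100, 150), (5000, 900, 100)]
binned = bin_counties(sample)
```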

Hover over (or click on) the bars to see information about each county bin.

These graphs make it pretty clear that low population density areas favor the Republican while denser areas favor the Democrat.

Data and Tools
The 2020 county-level election data is downloaded from the New York Times county election data API and processed using a python script. Population data used is for 2018. The visualization was created using the open-source plotly javascript graphing library.