Friday, February 3, 2012

The Census: Metropolitan Area Populations

The US Census provides data on the estimated size of metropolitan and micropolitan areas over time.  The Census collects exact counts only once every ten years; the figures for the years in between are estimates.  I enjoyed looking at this data set because enough of the listed cities and towns are familiar to me that I'm interested in just seeing which cities are grouped together, and how the Census chooses to name each group.  (The San Diego metropolitan area is listed as San Diego-Carlsbad-San Marcos.  I'm left curious: does this include Oceanside, the large city north of Carlsbad that adjoins the Marine Corps base?  Why did they choose to list Carlsbad, which, while a perfectly lovely town, isn't any more populous than the towns adjoining it?  Who made this decision?)  While these questions can't be answered by the data in the table, the data can be used to look for correlations between other data sets and either absolute population or population growth and decline.
[Image: the first few rows of the available data]

The numerical variable is the estimated population of each metropolitan or micropolitan area in a given year.  The obvious categorical variable is the city and state in which each area is located; this can be extended to group areas by region (like New England or the Southwest).  The year can be treated as another categorical variable.  Since an estimate is given for each year, the data can also be graphed as a time series.
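
As a rough sketch of that structure, here is how I'd picture the table in Python: one row per area with one estimate per year, flattened into (area, year, population) records so that each area becomes its own time series.  The area names, the numbers, and the helper names `to_long` and `series_for` are all made up for illustration; none of this is actual Census data.

```python
# Hypothetical wide-format table: one entry per metropolitan area,
# one population estimate per year. All values are invented placeholders.
wide = {
    "Metro A": {2000: 100_000, 2001: 102_000, 2002: 105_000},
    "Metro B": {2000: 50_000, 2001: 49_500, 2002: 49_000},
}


def to_long(table):
    """Flatten {area: {year: population}} into sorted (area, year, population) rows."""
    return [
        (area, year, population)
        for area, by_year in sorted(table.items())
        for year, population in sorted(by_year.items())
    ]


def series_for(table, area):
    """Return one area's time series as a list of (year, population) pairs."""
    return sorted(table[area].items())


long_rows = to_long(wide)
print(long_rows[0])
print(series_for(wide, "Metro B"))
```

In the long form, the area and the year act as the categorical columns and the population is the single numerical column, which is the shape most plotting tools expect for a time series.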

This data would be most interesting if analyzed in conjunction with other data.  For instance, I would like to see whether there is any correlation between economic measures (median income, percent of population below the poverty line, job growth or loss) and population, or between these same measures and the change in population over time (e.g., whether cities with high job loss are also the cities whose populations are shrinking).  I would also want to compare it with other measures of the general well-being of people in a city, like happiness, unemployment rates, age distributions, and education levels.  Comparisons like these could show whether a city's size or growth tracks how well its residents are doing.
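
To make the kind of comparison I have in mind concrete, here is a minimal sketch that computes each area's percent change in population and then its Pearson correlation with a second measure.  The job-growth figures and population numbers are invented placeholders, not real data, and `percent_change` and `pearson` are just names I chose.

```python
from math import sqrt


def percent_change(first, last):
    """Percent change from the first estimate to the last."""
    return 100.0 * (last - first) / first


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 long")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented placeholder values for four hypothetical metropolitan areas.
pop_start = [100_000, 50_000, 200_000, 80_000]
pop_end = [110_000, 48_000, 230_000, 80_800]
job_growth_pct = [2.0, -1.5, 3.0, 0.5]

pop_change = [percent_change(a, b) for a, b in zip(pop_start, pop_end)]
r = pearson(pop_change, job_growth_pct)
print(round(r, 3))
```

A value of r near 1 or -1 would only say that the two measures move together across cities; it wouldn't say which one drives the other.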

Welcome!

Welcome to my blog for CS349B, Quantifying the World!  I'll be posting writeups relevant to what I'm doing in the class.