Figure: The first few rows of the available data.
The numerical variable is the estimated population of each metropolitan area in a given year. The obvious categorical variable is the city and state in which each metropolitan area is located; this can be extended to group metropolitan areas by region (like New England or the Southwest). The year is another categorical variable. Since an estimate is given for each year, the data can be graphed as a time series.
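As a rough sketch of how the data could be arranged for a time series graph, the snippet below groups rows by metropolitan area and sorts each group by year. The row layout `(area, year, population)`, the area names, and all of the numbers are placeholders I made up for illustration; they are not taken from the actual data set.

```python
from collections import defaultdict

# Hypothetical rows: (area, year, estimated population).
# These values are invented for illustration only.
rows = [
    ("Area A", 2011, 1020),
    ("Area A", 2010, 1000),
    ("Area B", 2010, 500),
    ("Area B", 2011, 490),
]

def to_time_series(rows):
    """Group rows by area; return {area: [(year, population), ...]} sorted by year."""
    series = defaultdict(list)
    for area, year, population in rows:
        series[area].append((year, population))
    return {area: sorted(points) for area, points in series.items()}

series = to_time_series(rows)
for area, points in series.items():
    print(area, points)
```

Each area's list of `(year, population)` pairs could then be handed to whatever plotting tool is available to draw one line per metropolitan area.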
This data would be most interesting if analyzed in conjunction with other data. For instance, I would be interested to see whether economic measures (median income, percent of population below the poverty line, job growth or loss) correlate with population, or with the change in population over time (for example, whether cities with high job loss are also the cities whose populations are decreasing). I would also want to compare it with other measures of the general well-being of people in a city, such as happiness, unemployment rates, age distributions, and education rates. Such comparisons could show whether population size or growth tends to move together with economic conditions and quality of life.
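One way to check for the kind of relationship described above would be to compute a Pearson correlation coefficient between each area's percent change in population and an economic measure such as job growth. The sketch below does this with plain Python; the `population` and `job_growth` dictionaries, the area names, and every number in them are hypothetical stand-ins, since the economic data is not part of this data set.

```python
from math import sqrt

def percent_change(first, last):
    """Percent change from the first estimate to the last."""
    return 100.0 * (last - first) / first

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)

# Hypothetical data, invented for illustration only:
# (first-year population, last-year population) and job growth in percent.
population = {"Area A": (1000, 1100), "Area B": (500, 450), "Area C": (800, 820)}
job_growth = {"Area A": 3.0, "Area B": -2.0, "Area C": 0.5}

areas = sorted(population)
pop_change = [percent_change(*population[a]) for a in areas]
jobs = [job_growth[a] for a in areas]

r = pearson(pop_change, jobs)
print(f"correlation between population change and job growth: {r:.3f}")
```

A value of `r` near 1 or -1 would suggest a strong linear relationship and a value near 0 a weak one. Even a strong correlation would not by itself show that job loss causes population decline, or the reverse.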