Population synthesis
POLARIS simulates individual households and people, so it needs both as inputs. Such data is usually not directly available for privacy reasons. However, the US Census Bureau continuously surveys a sample of the US population through the American Community Survey (ACS) and each year releases a detailed 1% sample, the Public Use Microdata Sample (PUMS). To preserve privacy, all geographic information in PUMS, such as home location, is aggregated to Public Use Microdata Areas (PUMAs). In addition, more geographically detailed data is available for selected variables, such as the number of households or the number of people in a given age band per census tract. These are marginal distributions: each provides information on one (or a few) variables only, so the individual-level joint information is lost. A population synthesis process takes the fully cross-tabulated seed sample (PUMS), which is available only at a geographically aggregate level, and expands it so that it matches a selection of marginal distributions at a geographically detailed scale.
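To make the idea of expanding a seed to match marginals concrete, the sketch below uses iterative proportional fitting (IPF), a common technique for this kind of problem. This page does not state which fitting algorithm POLARIS uses, so IPF is an assumption here, and the function name `ipf`, the two-variable table, and all numbers are hypothetical and purely illustrative. The seed stands in for a PUMS cross-tabulation of two variables; the targets stand in for tract-level marginal totals.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Scale a 2-D seed table until its row and column sums match the targets.

    Illustrative sketch only: assumes the row and column targets have the
    same grand total and that every seed cell is positive.
    """
    table = [row[:] for row in seed]  # work on a copy, leave the seed intact
    for _ in range(max_iter):
        # Step 1: scale each row so that it sums to its row target.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Step 2: scale each column so that it sums to its column target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns match exactly after step 2, so only the rows need checking.
        err = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if err < tol:
            break
    return table


# Hypothetical seed: household counts by (size class x income class) from a sample.
seed = [[30.0, 10.0],
        [20.0, 40.0]]
# Hypothetical marginal totals for one census tract (both sum to 200).
row_targets = [120.0, 80.0]   # households per size class
col_targets = [90.0, 110.0]   # households per income class

fitted = ipf(seed, row_targets, col_targets)
for row in fitted:
    print([round(v, 2) for v in row])
```

The fitted table keeps the association between the two variables found in the seed while reproducing the tract-level totals. A full synthesizer would additionally have to handle several control variables at both household and person level and then draw discrete households from the fitted weights, none of which this sketch attempts.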
References
Auld, J., & Mohammadian, A. (2010). Efficient Methodology for Generating Synthetic Populations with Multiple Control Levels. Transportation Research Record, 2175(1), 138–147. https://doi.org/10.3141/2175-16