polaris.analyze.popsyn_analysis.PopsynComparator
- class polaris.analyze.popsyn_analysis.PopsynComparator(population: Population, sample_factor: float, linker_file: LinkerFile, geo_mapping: GeoMappingStrategy)
Bases: object
Compare POLARIS synthetic population output against control marginals.
Methods
- __init__(population, sample_factor, ...): Compare POLARIS synthetic population output against control marginals.
- attach_geo_fields(): Attach popsyn and survey region fields using the geo_mapping strategy.
- compare_synthpop_to_controls(control_data, ...)
- from_dir(result_dir, sample_factor, linker_file): Create PopsynComparator by loading population from a directory.
- generate_comparison_plots(geo_level, linker)
- load_population_from_dir(result_dir[, db_name]): Load population from a directory containing a demand database.
- plot_scatter(obs, synth[, title, plt_style, ...]): Plot synth vs obs (observed/control) with OLS regression line and y=x reference.
Attributes
- HHID = 'household'
- HH_SURVEY_ID = 'hhold'
- PERSONID = 'person'
- PERSON_SURVEY_ID = 'id'
- __init__(population: Population, sample_factor: float, linker_file: LinkerFile, geo_mapping: GeoMappingStrategy)
Compare POLARIS synthetic population output against control marginals.
- Args:
population: A Population object containing households, persons, and vehicles.
sample_factor: Sampling factor used in population synthesis.
linker_file: LinkerFile object with control data and seed data specifications.
geo_mapping: GeoMappingStrategy for mapping households to regions. Use USCensusMapping for US Census-based regions, or CustomMapping for non-US regions or alternative geographic hierarchies.
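To make the idea of comparing synthesized output against control marginals concrete, here is a minimal pandas sketch. It is not the PopsynComparator implementation. The helper name `compare_to_controls`, the toy data, and the way `sample_factor` is applied (dividing counts, on the assumption that the synthetic population is a `sample_factor` share of the full population) are all illustrative assumptions.

```python
import pandas as pd


def compare_to_controls(
    households: pd.DataFrame,
    controls: pd.Series,
    sample_factor: float,
    region_col: str = "popsyn_region",
) -> pd.DataFrame:
    """Compare synthesized household counts with control totals per region."""
    # Count synthesized households in each region.
    synth = households.groupby(region_col).size()
    # ASSUMPTION: the synthetic population is a sample_factor share of the
    # full population, so dividing scales counts back to control totals.
    synth_scaled = synth / sample_factor
    # Align both series on region; regions missing from one side become 0.
    out = pd.DataFrame({"control": controls, "synth": synth_scaled}).fillna(0.0)
    out["diff"] = out["synth"] - out["control"]
    return out


# Toy data (hypothetical): six synthesized households in three regions.
households = pd.DataFrame(
    {
        "household": [1, 2, 3, 4, 5, 6],
        "popsyn_region": ["A", "A", "A", "B", "B", "C"],
    }
)
controls = pd.Series({"A": 12.0, "B": 8.0, "C": 5.0})
result = compare_to_controls(households, controls, sample_factor=0.25)
```

In this toy case regions A and B match their controls exactly, while region C falls one household short after scaling.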
- replace_polaris_data_with_seed_data()
- summarise()
- classmethod from_dir(result_dir: PathLike, sample_factor: float, linker_file: LinkerFile, geo_mapping: GeoMappingStrategy | None = None, db_name: str | None = None, location_mode: LocationMode | str | None = None, block_to_puma: DataFrame | None = None, puma_to_tract_csv: PathLike | None = None) -> PopsynComparator
Create PopsynComparator by loading population from a directory.
This is a convenience method that loads the population from disk and creates a USCensusMapping strategy if geo_mapping is not provided.
- Args:
result_dir: Directory or file path containing demand database.
sample_factor: Sampling factor used in population synthesis.
linker_file: LinkerFile object with control data and seed data specifications.
geo_mapping: GeoMappingStrategy. If None, creates USCensusMapping with the provided location_mode and block_to_puma parameters.
db_name: Optional demand database filename pattern.
location_mode: (used only if geo_mapping is None) Location encoding format.
block_to_puma: (used only if geo_mapping is None) DataFrame for BLOCK mode.
puma_to_tract_csv: (used only if geo_mapping is None) Optional path to local PUMA-to-tract crosswalk CSV to avoid downloading.
- Returns:
PopsynComparator instance initialized with population from directory.
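A call to from_dir might look like the sketch below. The wrapper name `build_comparator` is hypothetical, and building the LinkerFile is left to the caller because its constructor is not documented here. Only arguments from the signature above are passed. The import sits inside the function so that the snippet can be defined without POLARIS installed.

```python
from os import PathLike
from typing import Any, Optional, Union


def build_comparator(
    result_dir: Union[str, PathLike],
    linker_file: Any,
    sample_factor: float,
    geo_mapping: Optional[Any] = None,
):
    """Create a PopsynComparator from a results directory (illustrative wrapper)."""
    # Imported lazily so this sketch does not require POLARIS at import time.
    from polaris.analyze.popsyn_analysis import PopsynComparator

    # With geo_mapping=None, the documentation above says a USCensusMapping
    # strategy is created from the remaining optional parameters.
    return PopsynComparator.from_dir(
        result_dir,
        sample_factor=sample_factor,
        linker_file=linker_file,
        geo_mapping=geo_mapping,
    )
```

For non-US regions, a CustomMapping instance would be passed as geo_mapping instead of relying on the default.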
- static load_population_from_dir(result_dir: PathLike, db_name: str | None = None) -> Population
Load population from a directory containing a demand database.
- Args:
result_dir: Directory or file path containing demand database.
db_name: Optional demand database filename pattern.
- Returns:
Population loaded from demand database.
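Because result_dir may be either a directory or a file path, and db_name is a filename pattern, the lookup can be pictured roughly as follows. This is a sketch of one plausible resolution rule, not the library's code. The helper `resolve_demand_db` and the default pattern `*-Demand.sqlite` are assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Optional, Union


def resolve_demand_db(result_dir: Union[str, Path], db_name: Optional[str] = None) -> Path:
    """Return the demand database for a file path or a directory plus pattern."""
    path = Path(result_dir)
    # A direct file path is used as-is.
    if path.is_file():
        return path
    # ASSUMPTION: the default pattern below is illustrative only.
    pattern = db_name or "*-Demand.sqlite"
    matches = sorted(path.glob(pattern))
    if not matches:
        raise FileNotFoundError(f"No demand database matching {pattern!r} in {path}")
    # Pick the first match in sorted order for a deterministic result.
    return matches[0]


# Demonstration with a throwaway directory and an empty placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "Example-Demand.sqlite"
    db.touch()
    found_from_dir = resolve_demand_db(tmp).name
    found_from_file = resolve_demand_db(db).name
    found_by_pattern = resolve_demand_db(tmp, db_name="Example-*.sqlite").name
```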
- attach_geo_fields()
Attach popsyn and survey region fields using the geo_mapping strategy.
Popsyn marginals are defined at the popsyn_region level (e.g., census tract), while popsyn results are recorded at the location, popsyn_region, or block level. Seeds are provided at the survey_region level (e.g., PUMA), so the survey region fields are attached here as well, which allows results to be compared at a more aggregate level.
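The two geographic levels can be illustrated with a small pandas sketch. It assumes a hypothetical crosswalk table that maps each popsyn_region (tract) to one survey_region (PUMA). The function `attach_regions` and the column layout are illustrative and do not reflect how GeoMappingStrategy is implemented.

```python
import pandas as pd


def attach_regions(households: pd.DataFrame, crosswalk: pd.DataFrame) -> pd.DataFrame:
    """Add a survey_region column by joining on popsyn_region."""
    # validate="many_to_one" raises if a popsyn_region maps to several
    # survey regions, which would silently duplicate households otherwise.
    return households.merge(
        crosswalk, on="popsyn_region", how="left", validate="many_to_one"
    )


# Toy data (hypothetical): three tracts nested in two PUMAs.
households = pd.DataFrame(
    {"household": [1, 2, 3], "popsyn_region": ["T1", "T2", "T3"]}
)
crosswalk = pd.DataFrame(
    {"popsyn_region": ["T1", "T2", "T3"], "survey_region": ["P1", "P1", "P2"]}
)

with_regions = attach_regions(households, crosswalk)
# Aggregating at the coarser level enables comparison against seed data.
per_survey_region = with_regions.groupby("survey_region").size()
```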
- extract_population_for_control_variables(linker)
- classmethod plot_scatter(obs: Series, synth: Series, title: str = '', plt_style: str = 'tableau-colorblind10', figsize: Tuple[int, int] = (5, 5), ax=None)
Plot synth vs obs (observed/control) with OLS regression line and y=x reference.
- Args:
obs: Observed/control values (x-axis).
synth: Synthesized values (y-axis). Must align with obs.
title: Plot title.
plt_style: Matplotlib style.
figsize: Figure size (only used if ax is None).
ax: Optional existing axes to draw on. If None, creates a new figure.
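For readers who want to see what such a plot involves, the sketch below draws a scatter of synthesized against observed values, an ordinary least squares line fitted with `numpy.polyfit`, and a dashed y=x reference. It is an independent illustration rather than the body of plot_scatter. The function name `scatter_with_fit` and its return values are assumptions, and style handling (plt_style) is left out for brevity.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def scatter_with_fit(obs: pd.Series, synth: pd.Series, title: str = "",
                     figsize=(5, 5), ax=None):
    """Scatter synth vs obs with an OLS line and a y=x reference line."""
    if ax is None:
        # figsize only matters when a new figure is created.
        _, ax = plt.subplots(figsize=figsize)
    x = obs.to_numpy(dtype=float)
    y = synth.to_numpy(dtype=float)
    # A degree-1 polynomial fit is an ordinary least squares line.
    slope, intercept = np.polyfit(x, y, deg=1)
    ax.scatter(x, y, alpha=0.7)
    lims = np.array([min(x.min(), y.min()), max(x.max(), y.max())])
    ax.plot(lims, slope * lims + intercept,
            label=f"OLS: y = {slope:.2f}x + {intercept:.2f}")
    ax.plot(lims, lims, linestyle="--", label="y = x")
    ax.set_xlabel("Observed / control")
    ax.set_ylabel("Synthesized")
    ax.set_title(title)
    ax.legend()
    return ax, slope, intercept


# Toy data (hypothetical): synthesized values sit 2 above the controls.
obs = pd.Series([10.0, 20.0, 30.0, 40.0])
synth = pd.Series([12.0, 22.0, 32.0, 42.0])
ax, slope, intercept = scatter_with_fit(obs, synth, title="Households by region")
```

A fitted line close to y=x indicates that the synthesized totals track the controls. A slope near 1 with a nonzero intercept, as in this toy data, points to a constant offset instead.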
- classmethod compare_synthpop_to_controls(control_data, synth_data, sample_fac, geo_col_index, compare_name, plt_style: str = 'tableau-colorblind10', figsize=(5, 5), ax=None)
- generate_comparison_plots(geo_level: int, linker)