KPI#

A number of standard KPIs are built into the standard POLARIS model run process. In general, each KPI is a custom SQL query that is run against the generated activities, and their consequent network effects, stored in the underlying Demand.sqlite database. These KPIs are integrated into the standard run because the SQL can be complicated and because non-final iteration databases are stored in a compressed .tar.gz format; running these queries as a post-process after a model run can therefore be quite time-consuming.

Metrics are defined in the ResultKPIs class, which has facilities for storing and retrieving each resulting metric from a disk-based cache located in the iteration data folder from which the KPIs were extracted. These metrics can be of any data type that is serializable via pickle; most of the metrics in the standard set are just small Pandas DataFrames.

For example, to extract the planned_modes metric for a given results directory we would use:

ResultKPIs.from_dir(dir).get_cached_kpi_value("planned_modes")

      mode  mode_count
0     BIKE         666
1      BUS       13280
2      HOV        3598
3  NO_MOVE        6650
4      SOV       62319
5     TAXI         740
6     WALK       41124

These dataframes can easily be combined using pd.concat after assigning each one a name:

def f(dir):
    return ResultKPIs.from_dir(dir).get_cached_kpi_value("planned_modes").assign(name=dir.parent.name)

pd.concat(f(project_dir / f"Chicago_iteration_{i}") for i in [1, 2, 3])

    mode  mode_count                 name
0   BIKE         666  Chicago_iteration_1
1    BUS       13280  Chicago_iteration_1
..   ...         ...                  ...
19  TAXI         740  Chicago_iteration_3
20  WALK       41124  Chicago_iteration_3

You can find out which metrics have been cached in a given run by calling kpi.available_metrics().

Adding new metrics#

To add a new metric to the standard set, define a new method on the ResultKPIs class with a name that starts with metric_. The remainder of the name after the metric_ prefix becomes the name by which users can access that metric. For example, the planned_modes metric in the example above is defined in the metric_planned_modes method.
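The naming convention can be sketched with a toy class. `ToyKPIs` and `get_kpi_value` below are illustrative stand-ins, not the real API:

```python
class ToyKPIs:
    """Toy stand-in for ResultKPIs, showing the metric_ naming convention."""

    def metric_planned_modes(self):
        # The real method would run SQL against Demand.sqlite; stub data here.
        return {"SOV": 62319, "WALK": 41124}

def get_kpi_value(kpis, name):
    """Resolve the user-facing metric name to its metric_ method and call it."""
    return getattr(kpis, f"metric_{name}")()

print(get_kpi_value(ToyKPIs(), "planned_modes"))
# → {'SOV': 62319, 'WALK': 41124}
```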

Standard cached metrics#

As part of our standard runs, the helper method cache_all_available_metrics is called as part of the asynchronous end-of-loop function. This helper method iterates through each method whose name starts with metric_ and evaluates it for the just-completed iteration. This means that as new metrics are implemented they are automatically included by default for all subsequent runs. It is possible to exclude some metrics from standard processing, but so far metrics have been generalisable enough to avoid special casing.
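The discovery step presumably amounts to iterating over method names, along these lines (a stdlib sketch of the mechanism, not the actual implementation):

```python
class ToyKPIs:
    """Toy class demonstrating convention-based metric discovery."""

    def metric_planned_modes(self):
        return {"SOV": 62319}

    def metric_vmt(self):
        return 123.4

    def cache_all_available_metrics(self):
        """Evaluate every metric_ method, keyed by its user-facing name."""
        cache = {}
        for attr in dir(self):
            if attr.startswith("metric_"):
                cache[attr[len("metric_"):]] = getattr(self, attr)()
        return cache
```

A newly added metric_ method is picked up with no registration step, which is why new metrics flow into all subsequent runs automatically.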

Comparing metrics#

In general there are two scenarios in which it is desirable to compare the metrics from multiple runs:

  1. Tracking the evolution of a metric across convergence iterations and

  2. Comparing a metric between two (or more) study scenarios

The KpiComparator class makes this easy to achieve in both situations. This is most easily shown with an example:

study_dir = Path(r"P:\ShareAndFileTransfer\ForJamie\my_study")
k = KpiComparator()
k.add_run(ResultKPIs.from_dir(study_dir / "scenario_00" / "Chicago_iteration_10"), 'scenario_00')
k.add_run(ResultKPIs.from_dir(study_dir / "scenario_01" / "Chicago_iteration_10"), 'scenario_01')
k.plot_everything()

The comparator object allows us to add runs and label them via add_run(kpi_object, label). Calling plot_everything will then perform the metric aggregation across all those runs and produce a set of graphs visualizing them. Like the cache_all_available_metrics method, plot_everything uses meta-programming to find all the plot_ methods that have been defined and calls them sequentially. If new visualizations are defined with this convention (either for new metrics or for alternate views of existing metrics), they will be automatically produced by the above code.

Plotting everything is helpful, but of course it is possible to only create the plots desired:

k.plot_pax_in_network()

[Figure: Passengers in Network]

The KpiComparator class has a large number of built-in plots to display information from different runs, but users will often want more specific information to be displayed. Each plot method already has parameters to help with this - for instance, we can look at Vehicle Miles Traveled (VMT) across different scenarios and see how these values progress and converge across iterations:

k.plot_vmt(mode=['SOV'], df_transform=add_cols, group_by='scenario', x='iter')

In this case the group_by column (‘scenario’) isn’t a standard column but is instead added to the data by the add_cols method, which is injected via the df_transform argument.

[Figure: Modified VMT]
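A df_transform function such as add_cols might look like the following. This version is an illustrative guess: it assumes run labels of the form `scenario_XX/Chicago_iteration_N` in a `name` column, whereas the real add_cols is study-specific:

```python
import pandas as pd

def add_cols(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical df_transform: derive 'scenario' and 'iter' columns from a
    run label like 'scenario_01/Chicago_iteration_10' (assumed format)."""
    parts = df["name"].str.split("/", expand=True)
    return df.assign(
        scenario=parts[0],
        iter=parts[1].str.extract(r"(\d+)$", expand=False).astype(int),
    )
```

Because df_transform receives the loaded DataFrame before plotting, any derived column it adds (here ‘scenario’ and ‘iter’) can then be referenced by the group_by and x arguments.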

A full list of plotting functions can be found in the KpiComparator class documentation. These functions should all work without any arguments, but also accept keyword arguments (written in the class documentation as kwargs) - below is a list of commonly used arguments.

Data Loading options:

  • df - allows the user to pass in their own pd.DataFrame instead of loading it from the cache. Useful when experimenting as it reduces runtime for subsequent calls.

  • df_transform - a function that takes a pd.DataFrame and returns a new pd.DataFrame - useful for adding extra columns.

  • df_filter - a dictionary of {column: expected_value} that is used to select rows from the loaded df. Currently supports only string, int, list[string] & list[int] as the expected value. Single values are treated as equals, lists are treated as in.

  • sort_key - sort the loaded df using the given column or callable.

  • limit_runs - limit to the first N runs in the comparator (positive int), the last N runs (negative int), or only those runs matching N (regex).

  • skip_cache - Do not use the cached value; instead call the metric_x methods directly. Useful if the implementation has changed since the cached results were generated.
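The df_filter semantics described above (scalar values compare for equality, lists become membership tests) can be sketched as follows. This is an illustrative reimplementation, assumed to be equivalent to the real behaviour:

```python
import pandas as pd

def apply_df_filter(df: pd.DataFrame, df_filter: dict) -> pd.DataFrame:
    """Sketch of df_filter: {column: expected_value} row selection,
    supporting str/int scalars (equality) and lists (membership)."""
    for col, expected in df_filter.items():
        if isinstance(expected, list):
            df = df[df[col].isin(expected)]   # list -> "in"
        else:
            df = df[df[col] == expected]      # scalar -> equals
    return df
```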

Plotting options:

  • hue - Allows a third data parameter to be mapped, using color alongside the x and y axes.

  • separate_legend - move the legend outside the main plot area

  • xlim - Sets the limit of x values to be displayed.

Specific options (not available on all plot methods):

  • x - Value to graph on the x-axis

  • group_by - column to group by

  • mode - Specifies the sub-set of trip modes that will be displayed.

An example notebook demonstrating these concepts can be found in the examples section.

API Overview#

result_kpis.ResultKPIs

This class provides an easy way to extract relevant metrics for a single simulation run of POLARIS.

kpi_comparator.KpiComparator

This class provides an easy way to group together multiple runs of POLARIS and compare their outputs.