API Reference#

GTFSEnergyPredictor (Object-Oriented Interface)#

The primary interface for transit energy prediction.

Bases: object

Predict transit bus energy consumption from GTFS data.

This class provides a complete workflow for RouteE-Transit, including:

- Loading and filtering GTFS data
- Adding deadhead trips (between trips and to/from depot)
- Matching shapes to road networks (OpenStreetMap by default)
- Adding road grade information
- Predicting energy consumption with RouteE-Powertrain models
- Adding HVAC energy impacts

The class is designed to be easily extended via inheritance. For example, a subclass can override network matching methods to use TomTom instead of OSM.

Typical usage:

>>> predictor = GTFSEnergyPredictor(
...     gtfs_path="data/gtfs",
...     # depot_path is optional - uses NTD depot locations by default
...     vehicle_models=["Transit_Bus_Battery_Electric"],
... )
>>> predictor.load_gtfs_data()
>>> # Block-level filtering keeps complete blocks for deadhead estimation
>>> predictor.filter_trips(date="2023-08-02", routes=["205"], use_block_filter=True)
>>> predictor.add_mid_block_deadhead()
>>> predictor.add_depot_deadhead()  # Uses NTD depot locations
>>> predictor.get_link_level_inputs()
>>> results = predictor.predict_energy()

For extending with custom network data:

>>> class TomTomEnergyPredictor(GTFSEnergyPredictor):
...     def _match_shapes_to_network(self, upsampled_shapes):
...         # Custom TomTom matching logic
...         return matched_shapes

gtfs_path#

Path to GTFS feed directory

Type:: Path

depot_path#

Path to depot shapefile directory

Type:: Path | None

n_processes#

Number of parallel processes to use

Type:: int

feed#

Loaded GTFS feed object

Type:: Feed | None

trips#

Trips DataFrame (initially all, can be filtered)

Type:: pd.DataFrame

shapes#

Shapes DataFrame for loaded trips

Type:: pd.DataFrame

matched_shapes#

Shapes matched to road network

Type:: pd.DataFrame

routee_inputs#

Link-level features for RouteE

Type:: pd.DataFrame

energy_predictions#

Energy predictions by vehicle model

Type:: dict[str, pd.DataFrame]

Initialize the GTFSEnergyPredictor.

Parameters:

gtfs_path -- Path to directory containing GTFS feed files
depot_path -- Path to directory containing depot shapefile (Transit_Depot.shp). If None (default), uses depot locations from the National Transit Database's "Public Transit Facilities and Stations - 2023" dataset. This dataset covers depot/facility locations for transit agencies across the United States. Data source: https://data.transportation.gov/stories/s/gd62-jzra
n_processes -- Number of parallel processes for processing. Defaults to CPU count.
compass_app -- An optional pre-initialized CompassApp instance.
output_dir -- Directory for saving results and caching the CompassApp graph. If None, results are not persisted to disk.
vehicle_models -- List of vehicle model names to use for energy prediction (e.g., ["Transit_Bus_Battery_Electric", "Transit_Bus_Diesel"]). If None, all supported models are used.
overwrite -- If True (default), regenerate the CompassApp graph and results even if cached outputs already exist in output_dir.

add_trip_times() → None[source]#: Add trip time columns to self.trips

static aggregate_inputs_by_link(trips_ext: DataFrame) → DataFrame[source]#: After map matching all trips, aggregate the data by road link.

filter_trips(date: str | None = None, routes: list[str] | None = None, use_block_filter: bool = False) → GTFSEnergyPredictor[source]#

Filter trips by date and/or routes.

This method can be called after load_gtfs_data() to restrict the analysis to specific dates or routes. Can be called multiple times to refine filters.

Parameters:

date (str, optional) -- Date to filter trips (format: "YYYY-MM-DD" or datetime object). If None, keeps all currently loaded trips.
routes (list[str], optional) -- List of route_short_name values to filter by. If None, keeps all currently loaded routes.
use_block_filter (bool, default=False) -- When True, uses block-level filtering via filter_blocks_by_route with route_method="exclusive". This means entire blocks are excluded if any trip in the block belongs to a route not in routes. This is appropriate when deadhead trips are being estimated, because we need complete blocks. When False (the default), trips are filtered purely at the trip level so that individual trips on the requested routes are always included regardless of what other routes share the same block.

Returns:

Self for method chaining.

Return type:

GTFSEnergyPredictor

Raises:

RuntimeError -- If GTFS data hasn't been loaded yet.
ValueError -- If no trips match the specified filters.
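The difference between trip-level and block-level filtering can be illustrated with a small plain-Python sketch. It is not the library's implementation: the trip records and the `filter_trip_level` and `filter_block_exclusive` helpers are hypothetical, and the exclusive rule is written only from the description of use_block_filter above.

```python
from collections import defaultdict

# Hypothetical trip records: (trip_id, block_id, route_short_name)
TRIPS = [
    ("t1", "b1", "205"),
    ("t2", "b1", "205"),
    ("t3", "b2", "205"),
    ("t4", "b2", "209"),  # block b2 also serves route 209
]


def filter_trip_level(trips, routes):
    """Keep every trip whose route is requested, ignoring blocks."""
    wanted = set(routes)
    return [t for t in trips if t[2] in wanted]


def filter_block_exclusive(trips, routes):
    """Keep only blocks in which every trip belongs to a requested route."""
    wanted = set(routes)
    routes_by_block = defaultdict(set)
    for _, block_id, route in trips:
        routes_by_block[block_id].add(route)
    # A block qualifies only if its set of routes is a subset of `wanted`
    complete = {b for b, r in routes_by_block.items() if r <= wanted}
    return [t for t in trips if t[1] in complete]


trip_level = filter_trip_level(TRIPS, ["205"])
block_level = filter_block_exclusive(TRIPS, ["205"])
```

In this sample, trip-level filtering keeps t1, t2, and t3, while exclusive block filtering drops all of block b2 because it also serves route 209.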

get_link_level_inputs() → GTFSEnergyPredictor[source]#

Match GTFS shapes to road network and prepare RouteE inputs.

This method performs the following steps:

1. Upsamples shapes to ~1 Hz GPS traces
2. Matches shapes to OpenStreetMap road network
3. Extends trips with stop and schedule information
4. Aggregates data at road link level
5. Optionally adds road grade information

Returns:: Self for method chaining
Raises:: RuntimeError -- If GTFS data hasn't been loaded yet

get_link_predictions(vehicle_model: str | None = None) → DataFrame[source]#

Get link-level energy predictions.

Parameters:: vehicle_model -- Specific model name, or None for all models
Returns:: DataFrame with predictions, or None if not yet computed

get_trip_predictions(vehicle_model: str | None = None) → DataFrame[source]#

Get trip-level energy predictions.

Parameters:: vehicle_model -- Specific model name, or None for all models
Returns:: DataFrame with predictions
Raises:: KeyError -- If predictions have not been generated yet

load_compass_app(buffer_deg: float = 0.05, n_processes: int | None = None, extra_geoms: list[GeoDataFrame | DataFrame] | None = None) → None[source]#

Initialize the CompassApp using the bounding box of the loaded shapes.

Parameters:

buffer_deg -- Buffer in degrees to add to the bounding box.
n_processes -- Number of processes for parallelism.
extra_geoms -- Optional list of GeoDataFrames or DataFrames with geometry to include in bbox.
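As a rough illustration of what buffer_deg does, the sketch below expands the bounding box of a set of points by a fixed number of degrees on every side. The `buffered_bbox` helper and the sample coordinates are hypothetical; how the library actually derives its bounding box from the loaded shapes is not shown in this reference.

```python
def buffered_bbox(points, buffer_deg=0.05):
    """Return (min_lon, min_lat, max_lon, max_lat) expanded by buffer_deg."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return (
        min(lons) - buffer_deg,
        min(lats) - buffer_deg,
        max(lons) + buffer_deg,
        max(lats) + buffer_deg,
    )


# Hypothetical (lon, lat) shape points
shape_points = [(-111.90, 40.75), (-111.85, 40.77), (-111.88, 40.70)]
bbox = buffered_bbox(shape_points)
```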

load_gtfs_data() → GTFSEnergyPredictor[source]#

Load GTFS data from the feed directory.

This method reads the complete GTFS feed. Use filter_trips() afterwards if you want to restrict to specific dates or routes.

Returns:: Self for method chaining

predict_energy(add_hvac: bool = False) → dict[str, DataFrame][source]#

Predict energy consumption by map matching once, then running CompassApp.run_calculate_path for each vehicle model.

This method:

1. Runs map matching ONCE to get road-level attributes (distance, speed, grade)
2. Extracts edge_ids from the map-matched paths
3. Runs CompassApp.run_calculate_path for each vehicle model with model_name

This is much more efficient than the previous approach of running map matching per vehicle model, since map matching is computationally expensive and the road attributes are the same regardless of vehicle type.

Energy modeling is handled entirely by RouteE-Compass's powertrain traversal models, eliminating the need for the nrel.routee.powertrain package.

Parameters:

add_hvac -- Whether to add HVAC energy consumption to trip-level results

Returns:

Dictionary with keys:

'link': DataFrame with link-level predictions for all models
'trip': DataFrame with trip-level predictions for all models
'<model_name>_link': Link-level predictions for specific model
'<model_name>_trip': Trip-level predictions for specific model

Return type:

dict[str, pd.DataFrame]

Raises:

RuntimeError -- If GTFS data hasn't been loaded yet
ValueError -- If vehicle model is not supported

run(*, date: str | None = None, routes: list[str] | None = None, add_mid_block_deadhead: bool = False, add_depot_deadhead: bool = False, add_hvac: bool = True, save_results: bool = True) → DataFrame[source]#

Run the complete energy prediction pipeline with a single method call.

This is a convenience method that chains together all processing steps:

1. Load GTFS data
2. Optionally filter trips (date, routes)
3. Optionally add deadhead trips (add_mid_block_deadhead, add_depot_deadhead)
4. Run map matching and predict energy consumption using CompassApp
5. Optionally save results (save_results)

For more control over individual steps, use the individual methods (load_gtfs_data, filter_trips, add_mid_block_deadhead, etc.).

Parameters:

date (str, optional) -- Filter trips to a specific service date (format: "YYYY-MM-DD" or "YYYY/MM/DD"). If None, all trips across all service dates are included.
routes (list[str], optional) -- Filter trips to specific route IDs. If None, all routes are included.
add_mid_block_deadhead (bool, default=False) -- Whether to add deadhead trips between consecutive revenue trips. When True and routes is specified, block-level filtering is used to ensure only blocks that exclusively serve the selected routes are included (required for correct deadhead estimation).
add_depot_deadhead (bool, default=False) -- Whether to add deadhead trips from/to depots at start/end of blocks. Uses the depot_path given at initialization; if depot_path is None, the default NTD depot locations are used (see depot_path above). When True and routes is specified, block-level filtering is used (see add_mid_block_deadhead).
add_hvac (bool, default=True) -- Whether to add HVAC energy consumption based on ambient temperature.
save_results (bool, default=True) -- Whether to save results to files.

Returns:

Trip-level energy predictions with columns for each vehicle model.

Return type:

pd.DataFrame

Examples

Simple usage - predict energy for all trips:

>>> predictor = GTFSEnergyPredictor(
...     gtfs_path="data/gtfs",
...     vehicle_models=["Transit_Bus_Battery_Electric", "Transit_Bus_Diesel"],
... )
>>> results = predictor.run()

Filter to specific date and routes:

>>> predictor = GTFSEnergyPredictor(
...     gtfs_path="data/gtfs",
...     vehicle_models=["Transit_Bus_Battery_Electric"],
...     output_dir="reports/saltlake",
... )
>>> results = predictor.run(date="2023-08-02", routes=["205", "209"])

Minimal processing (no deadhead, no HVAC):

>>> predictor = GTFSEnergyPredictor(
...     gtfs_path="data/gtfs",
...     vehicle_models=["Transit_Bus_Battery_Electric"],
... )
>>> results = predictor.run(
...     add_mid_block_deadhead=False,
...     add_depot_deadhead=False,
...     add_hvac=False,
...     save_results=False,
... )

save_results(output_dir: str | Path | None = None, save_geometry: bool = True, save_inputs: bool = False) → None[source]#

Save prediction results to CSV files.

Parameters:

output_dir -- Directory to save results. If None, uses self.output_dir, defaulting to the current working directory if that is also None.
save_geometry -- Whether to save link geometry separately
save_inputs -- Whether to save RouteE input features

Raises:

RuntimeError -- If no predictions have been generated yet

Network Routing#

routee.transit.create_deadhead_shapes(app: TransitCompassApp, df: GeoDataFrame, o_col: str = 'geometry_origin', d_col: str = 'geometry_destination', min_distance_m: float = 200.0) → tuple[DataFrame, DataFrame][source]#

Compute deadhead route shapes for unique origin-destination pairs.

This function identifies unique O-D pairs from the input DataFrame and routes only those pairs, significantly reducing the routing burden when many trips share the same O-D pair.

Parameters:

app (TransitCompassApp) -- The TransitCompassApp instance to use for routing.
df (gpd.GeoDataFrame) -- DataFrame with origin and destination geometry columns. Must include a 'block_id' column to identify each input row.
o_col (str, optional) -- Column name for origin geometries (default: "geometry_origin")
d_col (str, optional) -- Column name for destination geometries (default: "geometry_destination")
min_distance_m (float, optional) -- Minimum distance in meters between O and D to perform routing. O-D pairs closer than this use straight-line fallback. (default: 200.0)

Returns:

A tuple of (shapes_df, od_mapping_df):

shapes_df: DataFrame with columns ['shape_id', 'shape_pt_sequence', 'shape_pt_lon', 'shape_pt_lat', 'shape_dist_traveled']
od_mapping_df: DataFrame with columns ['block_id', 'od_key', 'shape_id'] mapping each input block_id to its assigned shape

Return type:

tuple[pd.DataFrame, pd.DataFrame]
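The deduplication idea described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's code: `haversine_m` and `plan_deadhead_routing` are hypothetical helpers, the O-D key is simply the coordinate pair, and the actual routing call through TransitCompassApp is replaced by a label saying what would be done for each pair.

```python
import math


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    r = 6_371_000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def plan_deadhead_routing(rows, min_distance_m=200.0):
    """Group rows by O-D pair and decide once per pair how to build its shape.

    rows: iterable of (block_id, (o_lon, o_lat), (d_lon, d_lat)).
    Returns (od_mapping, plan): od_mapping maps block_id -> od_key and
    plan maps od_key -> "route" or "straight_line".
    """
    od_mapping, plan = {}, {}
    for block_id, origin, dest in rows:
        od_key = (origin, dest)
        od_mapping[block_id] = od_key
        if od_key not in plan:  # each unique pair is evaluated only once
            dist = haversine_m(*origin, *dest)
            plan[od_key] = "route" if dist >= min_distance_m else "straight_line"
    return od_mapping, plan


# Hypothetical input rows
rows = [
    ("b1", (-111.90, 40.75), (-111.85, 40.77)),
    ("b2", (-111.90, 40.75), (-111.85, 40.77)),  # same O-D pair as b1
    ("b3", (-111.9000, 40.7500), (-111.9005, 40.7500)),  # well under 200 m apart
]
od_mapping, plan = plan_deadhead_routing(rows)
```

Three input rows produce only two entries in `plan`, so only one pair would be sent to the router and the short pair would use the straight-line fallback.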

HVAC Energy#

routee.transit.add_HVAC_energy(feed: Feed, trips_df: DataFrame, output_dir: Path | None = None) → DataFrame[source]#

Add HVAC energy consumption.

Parameters:

feed (gtfsblocks.Feed) -- GTFS feed object containing blocks DataFrame.
trips_df (pd.DataFrame) -- Trips on selected date and route, including deadhead trips.
output_dir (Path or None) -- Directory used to store downloaded TMY weather files (in a TMY/ subdirectory). If None, defaults to ~/cache/routee-transit/TMY.

Returns:

Updated trip-level energy prediction DataFrame with HVAC energy consumption per trip for each weather scenario (summer, winter, median).

Return type:

pd.DataFrame

GTFS Processing (Internal)#

routee.transit.gtfs_processing.add_stop_flags_to_shape(trip_shape_df: DataFrame, stop_times_ext: DataFrame) → GeoDataFrame[source]#

Attach stop information to a DataFrame of shape points for a specific trip.

Given a DataFrame of shape points (trip_shape_df) and a DataFrame of stop times (stop_times_ext) joined with stop locations, this function identifies which shape points correspond to stops for the trip and annotates them.

Parameters:

trip_shape_df (pd.DataFrame) -- DataFrame containing shape points for a single trip. Must include columns 'trip_id', 'shape_pt_lon', 'shape_pt_lat', and 'coordinate_id'.
stop_times_ext (pd.DataFrame) -- DataFrame containing stop times with extended information. Must include columns 'trip_id', 'stop_lon', and 'stop_lat'.

Returns:

The input DataFrame with an additional column 'with_stop', where 1 indicates the shape point is nearest to a stop, and 0 otherwise.

Return type:

pd.DataFrame

Notes

Uses a spatial join to find the nearest shape point for each stop.
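The nearest-point flagging can be sketched without any geospatial dependencies. The `flag_stops` helper below is hypothetical and uses squared planar distance in degrees as a simplification; the function documented here works on DataFrames and a spatial join instead.

```python
def flag_stops(shape_points, stops):
    """Return a 0/1 list marking the shape point nearest to each stop.

    Both inputs are lists of (lon, lat) tuples. Distances are squared
    planar distances in degrees, a simplification for this sketch.
    """
    with_stop = [0] * len(shape_points)
    for s_lon, s_lat in stops:
        nearest = min(
            range(len(shape_points)),
            key=lambda i: (shape_points[i][0] - s_lon) ** 2
            + (shape_points[i][1] - s_lat) ** 2,
        )
        with_stop[nearest] = 1
    return with_stop


# Hypothetical shape points and stops
shape_points = [(0.000, 0.0), (0.001, 0.0), (0.002, 0.0), (0.003, 0.0)]
stops = [(0.0001, 0.0001), (0.0029, -0.0001)]
flags = flag_stops(shape_points, stops)
```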

routee.transit.gtfs_processing.copy_transit_config(params: HookParameters, vehicle_models: list[str] | None = None) → None[source]#

Hook to copy the transit_energy.toml from package resources to the output directory.

Parameters:

params -- Parameters from generate_compass_dataset
vehicle_models -- Optional list of vehicle models to include. If None, all are included.

routee.transit.gtfs_processing.estimate_trip_timestamps(trip_shape_df: DataFrame) → DataFrame[source]#

Estimate timestamps for each shape point of a trip based on distance traveled.

Parameters:

trip_shape_df (pd.DataFrame) -- DataFrame containing trip shape data with columns:

- 'shape_dist_traveled': Cumulative distance traveled along the shape.
- 'start_time': Origin time (datetime) of the trip.
- 'end_time': Destination time (datetime) of the trip.

Returns:

Modified DataFrame with additional columns:

'segment_duration_delta': Estimated duration for each segment as timedelta.
'timestamp': Estimated timestamp for each segment.
'Datetime_nearest5': Timestamp rounded to the nearest 5 minutes.
'hour': Hour component of the rounded timestamp.
'minute': Minute component of the rounded timestamp.

Return type:

pd.DataFrame
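A minimal sketch of distance-based timestamp estimation follows, assuming each point's timestamp is interpolated linearly between start_time and end_time by its share of the total distance. The helpers `estimate_timestamps` and `round_to_nearest_5min` are hypothetical, and the tie-breaking rule (halves round up) is a choice made for this sketch only.

```python
from datetime import datetime, timedelta


def estimate_timestamps(dist_traveled, start_time, end_time):
    """Interpolate a timestamp for each point from its share of total distance."""
    total_dist = dist_traveled[-1] - dist_traveled[0]
    duration = end_time - start_time
    stamps = []
    for d in dist_traveled:
        frac = (d - dist_traveled[0]) / total_dist if total_dist else 0.0
        stamps.append(start_time + duration * frac)
    return stamps


def round_to_nearest_5min(ts):
    """Round a datetime to the nearest 5-minute mark (ties round up)."""
    step = 300  # seconds in 5 minutes
    seconds = ts.hour * 3600 + ts.minute * 60 + ts.second
    rounded = int(seconds / step + 0.5) * step
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(seconds=rounded)


# Hypothetical 20-minute trip with four shape points
start = datetime(2023, 8, 2, 8, 0, 0)
end = datetime(2023, 8, 2, 8, 20, 0)
stamps = estimate_timestamps([0.0, 1.0, 2.5, 4.0], start, end)
rounded = [round_to_nearest_5min(t) for t in stamps]
```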

routee.transit.gtfs_processing.extend_trip_traces(trips_df: DataFrame, matched_shapes_df: DataFrame, feed: Feed, add_stop_flag: bool = False, n_processes: int | None = 4) → DataFrame[source]#

Extend trip shapes with stop details and estimated timestamps from GTFS.

This function processes GTFS trip and shape data to:

Summarize stop times for each trip (first/last stop and times)
Merge stop time summaries into the trips DataFrame
Attach stop coordinates to stop times
Merge trip and shape data to create ordered trip traces
Optionally, attach stop indicators to shape trace points
Estimate timestamps for each trace point based on scheduled trip duration and distance

Parameters:

trips_df -- DataFrame containing trip information, including 'trip_id' and 'shape_id'.
matched_shapes_df -- DataFrame with shape points matched to trips, including 'shape_id' and 'shape_dist_traveled'.
feed -- GTFS feed object containing 'stop_times' and 'stops' DataFrames.
add_stop_flag -- If True, attaches stop indicators to shape trace points. Defaults to False.
n_processes -- Number of processes to run in parallel using multiprocessing. Defaults to 4.

Returns:

A single concatenated DataFrame with extended trace information for all trips, including estimated timestamps.

routee.transit.gtfs_processing.upsample_shape(shape_df: DataFrame) → DataFrame[source]#

Upsample a GTFS shape DataFrame to generate a roughly 1 Hz GPS trace.

Interpolates latitude, longitude, and distance traveled, assuming a constant speed. The function performs the following steps:

Calculates the distance between consecutive shape points using great-circle distance
Computes the cumulative distance traveled along the shape
Assigns timestamps based on constant speed (30 km/h)
Resamples and interpolates the shape to 1-second intervals
Returns DataFrame with interpolated coordinates, timestamps, and distances

Parameters:: shape_df -- DataFrame containing GTFS shape points with columns 'shape_pt_lat', 'shape_pt_lon', and 'shape_id'.
Returns:: Upsampled DataFrame with columns 'shape_pt_lat', 'shape_pt_lon', 'shape_dist_traveled', 'timestamp', and 'shape_id', sampled at 1 Hz.
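The steps above can be sketched in plain Python as follows. This is an illustration under assumptions, not the library's implementation: `haversine_m` and `upsample_shape_points` are hypothetical helpers, the input is a list of (lat, lon) tuples rather than a DataFrame, and positions are interpolated linearly within each segment.

```python
import math

SPEED_MPS = 30 / 3.6  # the constant 30 km/h described above, in m/s


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6_371_000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def upsample_shape_points(points, speed_mps=SPEED_MPS):
    """Interpolate (lat, lon) points to one sample per second.

    Requires at least two points. Returns (t_seconds, lat, lon, dist_m) tuples.
    """
    # Cumulative distance along the shape
    cum = [0.0]
    for (lat1, lon1), (lat2, lon2) in zip(points, points[1:]):
        cum.append(cum[-1] + haversine_m(lat1, lon1, lat2, lon2))

    total_seconds = int(cum[-1] / speed_mps)
    out, seg = [], 0
    for t in range(total_seconds + 1):
        d = t * speed_mps  # distance covered after t seconds at constant speed
        # Advance to the segment that contains distance d
        while seg < len(cum) - 2 and d > cum[seg + 1]:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        frac = (d - cum[seg]) / seg_len if seg_len else 0.0
        (lat1, lon1), (lat2, lon2) = points[seg], points[seg + 1]
        out.append((t, lat1 + frac * (lat2 - lat1), lon1 + frac * (lon2 - lon1), d))
    return out


# Hypothetical three-point shape, roughly 195 m long
shape = [(40.7500, -111.9000), (40.7510, -111.9000), (40.7510, -111.8990)]
trace = upsample_shape_points(shape)
```

At 30 km/h the vehicle covers about 8.3 m per second, so a shape of roughly 195 m yields a trace of a couple dozen one-second samples.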

routee.transit.gtfs_processing.write_gtfs_stops(params: HookParameters, feed: Feed) → None[source]#

Hook to write GTFS stop-to-edge mapping after dataset generation.

Parameters:

params -- Parameters from generate_compass_dataset
feed -- GTFS feed object

Deadhead Trips (Internal)#

routee.transit.mid_block_deadhead.create_mid_block_deadhead_stops(feed: Any, deadhead_trips: DataFrame) → tuple[DataFrame, DataFrame, DataFrame][source]#

Create stop_times and stops for mid-block deadhead trips.

Parameters:

feed (Any) -- GTFS feed object (e.g. result from read_in_gtfs).
deadhead_trips (pd.DataFrame) -- Deadhead trip records from create_mid_block_deadhead_trips().

Returns:

A (stop_times_df, stops_df, deadhead_ods_df) tuple where stop_times_df and stops_df can be merged into the GTFS feed, and deadhead_ods_df holds origin/destination geometry for routing.

Return type:

tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]

routee.transit.mid_block_deadhead.create_mid_block_deadhead_trips(trips_df: DataFrame, stop_times_df: DataFrame) → DataFrame[source]#

Create deadhead trips between consecutive trips for each block.

Parameters:

trips_df (pd.DataFrame) -- GTFS trips DataFrame (e.g. result from read_in_gtfs).
stop_times_df (pd.DataFrame) -- stop_times DataFrame from the feed returned by read_in_gtfs.

Returns:

DataFrame with created deadhead trips.

Return type:

pd.DataFrame
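One way to picture mid-block deadhead creation is to sort each block's trips by start time and connect every trip to the next one. The sketch below is hypothetical throughout: the `mid_block_deadheads` helper, the record fields, and the rule that no deadhead is created when the next trip starts at the stop where the previous one ended are all assumptions for illustration, not documented behavior.

```python
from collections import defaultdict


def mid_block_deadheads(trips):
    """Pair each trip with the next trip in its block.

    trips: list of dicts with block_id, trip_id, start_time, end_time,
    first_stop, last_stop (times as seconds since midnight).
    """
    by_block = defaultdict(list)
    for trip in trips:
        by_block[trip["block_id"]].append(trip)

    deadheads = []
    for block_id, block_trips in by_block.items():
        block_trips.sort(key=lambda t: t["start_time"])
        for prev, nxt in zip(block_trips, block_trips[1:]):
            # Assumption: no deadhead is needed when the next trip starts
            # at the stop where the previous one ended.
            if prev["last_stop"] == nxt["first_stop"]:
                continue
            deadheads.append(
                {
                    "block_id": block_id,
                    "from_stop": prev["last_stop"],
                    "to_stop": nxt["first_stop"],
                    "start_time": prev["end_time"],
                    "end_time": nxt["start_time"],
                }
            )
    return deadheads


# Hypothetical trips, deliberately out of order
trips = [
    {"trip_id": "t2", "block_id": "b1", "start_time": 31200, "end_time": 33000,
     "first_stop": "C", "last_stop": "A"},
    {"trip_id": "t1", "block_id": "b1", "start_time": 28800, "end_time": 30600,
     "first_stop": "A", "last_stop": "B"},
    {"trip_id": "t3", "block_id": "b1", "start_time": 33600, "end_time": 35400,
     "first_stop": "A", "last_stop": "B"},
]
deadheads = mid_block_deadheads(trips)
```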

routee.transit.depot_deadhead.create_depot_deadhead_stops(first_stops_gdf: GeoDataFrame, last_stops_gdf: GeoDataFrame, deadhead_trips: DataFrame) → tuple[DataFrame, DataFrame][source]#

Create stop_times and stops for deadhead trips from and to depots.

Parameters:

first_stops_gdf (gpd.GeoDataFrame) -- GeoDataFrame of first stops for each block, with geometry_origin (depot) and geometry_destination (first stop) columns. Result from infer_depot_trip_endpoints().
last_stops_gdf (gpd.GeoDataFrame) -- GeoDataFrame of last stops for each block, with geometry_origin (last stop) and geometry_destination (depot) columns. Result from infer_depot_trip_endpoints().
deadhead_trips (pd.DataFrame) -- Deadhead trip records from create_depot_deadhead_trips().

Returns:

A (stop_times_df, stops_df) tuple for the depot deadhead trips.

Return type:

tuple[pd.DataFrame, pd.DataFrame]

routee.transit.depot_deadhead.create_depot_deadhead_trips(trips_df: DataFrame, stop_times_df: DataFrame) → DataFrame[source]#

Create deadhead trips from and to depots for each block.

This function essentially creates rows for the trips.txt DataFrame. It does not generate shape traces for them (that is handled by other functions in this module).

Parameters:

trips_df (pd.DataFrame) -- Trips DataFrame for the selected date and route (e.g. result from read_in_gtfs).
stop_times_df (pd.DataFrame) -- stop_times DataFrame from the feed returned by read_in_gtfs.

Returns:

DataFrame with created deadhead trips.

Return type:

pd.DataFrame

routee.transit.depot_deadhead.get_default_depot_path() → Path[source]#

Return the default path to the FTA_Depot directory in the repository.

The default depot locations come from the National Transit Database's "Public Transit Facilities and Stations - 2023" dataset, which contains depot/facility locations for transit agencies across the United States.

Data source: https://data.transportation.gov/stories/s/gd62-jzra

Returns:: Path to the FTA_Depot directory containing Transit_Depot.shp
Return type:: Path

routee.transit.depot_deadhead.infer_depot_trip_endpoints(trips_df: DataFrame, feed: Any, path_to_depots: str | Path) → tuple[Any, Any][source]#

Add origin/destination depot geometry for each block.

Parameters:

trips_df (pd.DataFrame) -- trips_df of selected date and route (result from read_in_gtfs).
feed (Any) -- GTFS feed object (e.g. result from read_in_gtfs).
path_to_depots (str | Path) -- Path to a vector file (GeoJSON/Shapefile) containing depot point geometries.

Returns:

(first_stops_gdf, last_stops_gdf). Each GeoDataFrame contains the stop geometry (column 'stop_geometry') and the matched depot geometry (column 'depot_geometry').

Return type:

tuple[GeoDataFrame, GeoDataFrame]

Thermal Energy (Internal)#

routee.transit.thermal_energy.compute_HVAC_energy(start_hours: Series, end_hours: Series, power_array: ndarray[tuple[Any, ...], dtype[number]]) → ndarray[tuple[Any, ...], dtype[number]][source]#

Calculate HVAC + BTMS energy consumption between time intervals.

Parameters:

start_hours (array-like) -- fractional start times in hours
end_hours (array-like) -- fractional end times in hours (can be > 24)
power_array (array-like) -- hourly average power values [kW] for hours 0–23

Returns:

energy consumption [kWh] for each interval

Return type:

np.ndarray
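The interval calculation can be sketched for a single trip in plain Python. The `hvac_energy_kwh` helper is hypothetical and scalar, whereas the documented function is array-based; wrapping hours past 24 back to the start of the 24-value profile is an assumption based on the note that end_hours can exceed 24.

```python
import math


def hvac_energy_kwh(start_hour, end_hour, hourly_power_kw):
    """Integrate hourly average power [kW] over [start_hour, end_hour).

    Assumption: hours beyond 24 wrap around to the start of the profile.
    """
    energy = 0.0
    t = start_hour
    while t < end_hour:
        hour_index = math.floor(t)
        next_t = min(hour_index + 1, end_hour)  # end of the current clock hour
        energy += hourly_power_kw[hour_index % 24] * (next_t - t)
        t = next_t
    return energy


# Hypothetical profile: 2 kW before noon, 4 kW from noon onward
power = [2.0] * 12 + [4.0] * 12
morning = hvac_energy_kwh(11.5, 12.5, power)
overnight = hvac_energy_kwh(23.5, 24.5, power)
```

Both sample intervals span half an hour at 2 kW and half an hour at 4 kW, so each comes to 3 kWh.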

routee.transit.thermal_energy.download_tmy_files(county_ids: list[str], tmy_dir: Path) → None[source]#

Download and save TMY weather files for estimating thermal energy demand.

TMY stands for Typical Meteorological Year, a dataset that provides representative hourly weather data for a location over a synthetic year. Unlike Actual Meteorological Year (AMY) files, which reflect the observed conditions in a specific calendar year, TMY files are constructed by selecting typical months from multiple years of historical records. This approach smooths out unusual extremes and produces a “typical” climate profile, making TMY data well-suited for long-term energy modeling and system design studies.

This function downloads TMY files for all the supplied county IDs and saves them to tmy_dir.

Parameters:

county_ids (list[str]) -- List of US Census County IDs for which to download TMY files.
tmy_dir (Path) -- Directory where downloaded TMY CSV files are saved.

routee.transit.thermal_energy.fetch_counties_gdf() → GeoDataFrame[source]#

routee.transit.thermal_energy.get_hourly_temperature(county_id: str, scenario: str, tmy_dir: Path) → DataFrame[source]#

routee.transit.thermal_energy.load_thermal_lookup_table() → DataFrame[source]#
