API Reference

This section contains the API documentation auto-generated from docstrings.

GTFS

Bundle Manager

class anduin.gtfs.bundle.GTFSBundleManager(url=None, extract_dir=None)

Bases: object

Manages downloading and extracting GTFS bundles.

DEFAULT_URL = 'https://cdn.mbta.com/MBTA_GTFS.zip'

cleanup()

Remove the extracted GTFS data directory.

Return type:

None

download(dest_path=None)

Download the GTFS bundle.

Parameters:

dest_path (str | Path | None) – Destination path for the zip file. If None, uses a temp file.

Return type:

Path

Returns:

Path to the downloaded zip file.

download_and_extract(keep_zip=False)

Download and extract the GTFS bundle in one step.

Parameters:

keep_zip (bool) – If True, keeps the downloaded zip file in extract_dir.

Return type:

Path

Returns:

Path to the extraction directory.

extract(zip_path)

Extract a GTFS zip file.

Parameters:

zip_path (str | Path) – Path to the GTFS zip file.

Return type:

Path

Returns:

Path to the extraction directory.
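
The extraction step can be approximated with the standard library alone. The sketch below is not the GTFSBundleManager implementation; `extract_bundle` is a hypothetical helper, and the fallback to a temporary directory when no target is given is an assumption made for illustration.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_bundle(zip_path, extract_dir=None):
    """Unpack a GTFS zip and return the directory holding the .txt files."""
    zip_path = Path(zip_path)
    # Assumption: with no explicit target, a fresh temp directory is used.
    if extract_dir is not None:
        target = Path(extract_dir)
    else:
        target = Path(tempfile.mkdtemp(prefix="gtfs_"))
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target
```

Returning the directory rather than a file list matches the documented return type (Path) and lets the result be passed directly to loaders that take a gtfs_dir.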

Shapes

class anduin.gtfs.shapes.GTFSShapeLoader(gtfs_dir)

Bases: object

Loads GTFS shapes with associated route information.

load(route_types=None)

Load all shapes with their associated route information.

Parameters:

route_types (list[int] | None) – Optional list of GTFS route types to filter by (e.g., [3] for buses). If None, defaults to DEFAULT_ROUTE_TYPES (buses). Valid types: 0=Tram, 1=Subway, 2=Rail, 3=Bus, 4=Ferry

Return type:

dict[str, Shape]

Returns:

Dict mapping shape_id to Shape objects with points and route info. Only shapes whose routes match the specified types are included, and routes whose names start with an excluded prefix (e.g., shuttle routes) are dropped.

load_for_route(route_id)

Load shapes for a specific route.

Parameters:

route_id (str) – The route ID to filter by.

Return type:

list[Shape]

Returns:

List of Shape objects associated with the route.

load_for_route_type(route_type)

Load shapes for a specific route type.

Route types (GTFS standard):

  • 0 - Tram/Light rail
  • 1 - Subway/Metro
  • 2 - Rail
  • 3 - Bus
  • 4 - Ferry

Parameters:

route_type (int) – The GTFS route type to filter by.

Return type:

list[Shape]

Returns:

List of Shape objects for routes of that type.
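
In a GTFS feed, shapes are linked to routes through trips.txt, which carries both route_id and shape_id. The sketch below shows how a route-type filter could resolve to shape IDs. It is an illustration only, not the GTFSShapeLoader implementation, and `shape_ids_for_route_type` is a hypothetical name.

```python
import csv
from pathlib import Path


def shape_ids_for_route_type(gtfs_dir, route_type):
    """Return the shape_ids used by trips on routes of the given route_type."""
    gtfs_dir = Path(gtfs_dir)
    # Step 1: collect the route_ids whose route_type matches.
    with open(gtfs_dir / "routes.txt", newline="", encoding="utf-8-sig") as handle:
        route_ids = {
            row["route_id"]
            for row in csv.DictReader(handle)
            if int(row["route_type"]) == route_type
        }
    # Step 2: collect the shape_ids of trips on those routes.
    shape_ids = set()
    with open(gtfs_dir / "trips.txt", newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            # shape_id is optional in GTFS, so empty values are skipped.
            if row["route_id"] in route_ids and row.get("shape_id"):
                shape_ids.add(row["shape_id"])
    return shape_ids
```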

class anduin.gtfs.shapes.RouteInfo(route_id, route_short_name, route_long_name, route_type, route_color='', route_text_color='')

Bases: object

Route metadata from routes.txt.

route_color: str = ''
route_id: str
route_long_name: str
route_short_name: str
route_text_color: str = ''
route_type: int

class anduin.gtfs.shapes.Shape(shape_id, points=<factory>, routes=<factory>)

Bases: object

A complete shape with its points and associated route info.

points: list[ShapePoint]
routes: list[RouteInfo]
shape_id: str

class anduin.gtfs.shapes.ShapePoint(lat, lon, sequence, dist_traveled=None)

Bases: object

A single point in a shape.

dist_traveled: float | None = None
lat: float
lon: float
sequence: int

Stops

class anduin.gtfs.stops.Stop(stop_id, stop_name, lat, lon)

Bases: object

A GTFS stop.

lat: float
lon: float
stop_id: str
stop_name: str

class anduin.gtfs.stops.TripStopSequence(trip_id, route_id, shape_id, stops=<factory>)

Bases: object

Ordered stops for a trip with shape points.

route_id: str
shape_id: str
stops: list[tuple[int, Stop]]
trip_id: str

anduin.gtfs.stops.load_stops(gtfs_dir)

Load stops.txt into a dict keyed by stop_id.

Return type:

dict[str, Stop]
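
As a rough picture of what loading stops.txt into a dict involves, the sketch below reads the standard GTFS columns stop_id, stop_name, stop_lat and stop_lon. The `Stop` dataclass here is a local stand-in that mirrors the documented fields, `load_stops_sketch` is a hypothetical name, and skipping rows without coordinates is an assumption rather than documented behavior.

```python
import csv
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Stop:
    # Stand-in mirroring the documented fields; not imported from anduin.
    stop_id: str
    stop_name: str
    lat: float
    lon: float


def load_stops_sketch(gtfs_dir):
    """Read stops.txt and key each Stop by its stop_id."""
    stops = {}
    # utf-8-sig tolerates a byte-order mark at the start of the file.
    with open(Path(gtfs_dir) / "stops.txt", newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            # Assumption: rows without coordinates are skipped.
            if not row.get("stop_lat") or not row.get("stop_lon"):
                continue
            stops[row["stop_id"]] = Stop(
                stop_id=row["stop_id"],
                stop_name=row.get("stop_name", ""),
                lat=float(row["stop_lat"]),
                lon=float(row["stop_lon"]),
            )
    return stops
```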

Matching

Valhalla Map Matching

class anduin.matching.valhalla.MapMatchResult(shape_id, matched_points=<factory>, edges=<factory>, confidence=0.0, raw_response=<factory>)

Bases: object

Result of a map matching operation.

confidence: float = 0.0
edges: list[MatchedEdge]
matched_points: list[MatchedPoint]
raw_response: dict
shape_id: str

class anduin.matching.valhalla.MatchedEdge(edge_id, way_id, names=<factory>, length=0.0, speed=0.0, road_class='', use='', begin_shape_index=0, end_shape_index=0)

Bases: object

A matched edge from Valhalla.

begin_shape_index: int = 0
edge_id: int
end_shape_index: int = 0
length: float = 0.0
names: list[str]
road_class: str = ''
speed: float = 0.0
use: str = ''
way_id: int

class anduin.matching.valhalla.MatchedPoint(lat, lon, original_index, edge_index, distance_from_edge=0.0)

Bases: object

A matched point snapped to the road network.

distance_from_edge: float = 0.0
edge_index: int
lat: float
lon: float
original_index: int

class anduin.matching.valhalla.PyValhallaBackend(config_path=None, tile_extract=None)

Bases: object

Backend using pyvalhalla for direct C++ bindings (much faster).

trace_attributes(request)

Return type:

dict

class anduin.matching.valhalla.ValhallaBackend(*args, **kwargs)

Bases: Protocol

Protocol for Valhalla backends.

trace_attributes(request)

Execute a trace_attributes request.

Return type:

dict
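
Because the backend is described as a Protocol, any object with a compatible trace_attributes method should be usable in its place, which is convenient for tests that must not touch real tiles. The sketch below illustrates the idea with local stand-ins: `TraceBackend`, `CannedBackend` and `run_trace` are hypothetical names, and the request and response dicts are only a plausible subset of Valhalla's trace_attributes format.

```python
from typing import Protocol


class TraceBackend(Protocol):
    """Local stand-in for the documented backend protocol."""

    def trace_attributes(self, request: dict) -> dict: ...


class CannedBackend:
    """Test double that records requests and returns a fixed response."""

    def __init__(self, response):
        self._response = response
        self.requests = []

    def trace_attributes(self, request: dict) -> dict:
        self.requests.append(request)
        return self._response


def run_trace(backend: TraceBackend, request: dict) -> dict:
    # Structural typing: CannedBackend never subclasses TraceBackend.
    return backend.trace_attributes(request)


backend = CannedBackend({"edges": [{"way_id": 8614521}]})
request = {
    "shape": [{"lat": 42.3554, "lon": -71.0605}, {"lat": 42.3561, "lon": -71.0622}],
    "costing": "bus",
    "shape_match": "map_snap",
}
response = run_trace(backend, request)
```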

class anduin.matching.valhalla.ValhallaMapMatcher(costing='auto', backend=None, tile_extract=None, config_path=None)

Bases: object

Map matches GTFS shapes to road network edges using Valhalla.

classmethod from_pyvalhalla(tile_extract=None, config_path=None, costing='auto')

Create a matcher using pyvalhalla backend.

Parameters:
  • tile_extract (str | None) – Path to valhalla_tiles.tar file.

  • config_path (str | None) – Path to valhalla.json config file.

  • costing (str) – Costing model to use.

Return type:

ValhallaMapMatcher

Returns:

ValhallaMapMatcher configured with pyvalhalla backend.

get_edge_way_ids(result)

Extract unique OSM way IDs from the matched result.

Parameters:

result (MapMatchResult) – A MapMatchResult from a matching operation.

Return type:

list[int]

Returns:

List of unique OSM way IDs in order of traversal.
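
A matched path usually crosses several edges of the same OSM way, so the raw sequence of way IDs contains repeats. One way to obtain unique IDs in traversal order is shown below. Whether the library removes all repeats or only consecutive ones is not stated here, so the global de-duplication in this sketch is an assumption, and `unique_in_order` is a hypothetical helper.

```python
def unique_in_order(way_ids):
    """Drop repeated way IDs while keeping first-seen order."""
    # dict preserves insertion order, so the first occurrence wins.
    return list(dict.fromkeys(way_ids))


edge_way_ids = [10, 10, 11, 10, 12]
print(unique_in_order(edge_way_ids))  # [10, 11, 12]
```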

get_matched_geometry(result)

Extract the matched geometry as a list of coordinates.

Parameters:

result (MapMatchResult) – A MapMatchResult from a matching operation.

Return type:

list[tuple[float, float]]

Returns:

List of (lat, lon) tuples representing the matched path.

match_points(points, shape_id='custom', costing=None)

Map match a list of lat/lon points.

Parameters:
  • points (list[tuple[float, float]]) – List of (lat, lon) tuples.

  • shape_id (str) – Identifier for the result.

  • costing (str | None) – Override the default costing model.

Return type:

MapMatchResult

Returns:

MapMatchResult with matched edges and points.

match_shape(shape, costing=None)

Map match a GTFS shape to road network edges.

Parameters:
  • shape (Shape) – The Shape object to match.

  • costing (str | None) – Override the default costing model.

Return type:

MapMatchResult

Returns:

MapMatchResult with matched edges and points.

match_shapes(shapes, costing=None)

Map match multiple shapes.

Parameters:
  • shapes (list[Shape]) – List of Shape objects to match.

  • costing (str | None) – Override the default costing model.

Return type:

dict[str, MapMatchResult]

Returns:

Dict mapping shape_id to MapMatchResult.

Edge Lookup

class anduin.matching.edges.StopEdgeLookup(gtfs_dir, matcher)

Bases: object

Builds a lookup of edges between stop pairs along routes.

build()

Build the stop pair edge lookup from GTFS data.

Return type:

None

export_route_geojson(route_id, include_stops=True)

Export an entire route as a GeoJSON FeatureCollection.

Parameters:
  • route_id (str) – The route ID to export.

  • include_stops (bool) – Whether to include stop points.

Return type:

dict

Returns:

GeoJSON FeatureCollection dict.

export_stop_pair_geojson(from_stop_id, to_stop_id, route_id, include_stops=True)

Export a single stop pair as a GeoJSON FeatureCollection.

Parameters:
  • from_stop_id (str) – Origin stop ID.

  • to_stop_id (str) – Destination stop ID.

  • route_id (str) – The route ID.

  • include_stops (bool) – Whether to include stop points.

Return type:

dict

Returns:

GeoJSON FeatureCollection dict.

export_stop_sequence_geojson(stop_ids, route_id, include_stops=True)

Export a sequence of stops as a GeoJSON FeatureCollection.

Parameters:
  • stop_ids (list[str]) – Ordered list of stop IDs.

  • route_id (str) – The route ID.

  • include_stops (bool) – Whether to include stop points.

Return type:

dict

Returns:

GeoJSON FeatureCollection dict.

get_all_edges_for_route(route_id)

Get all unique edges used by a route in order.

Return type:

list[MatchedEdge]

get_edges_between_stops(from_stop_id, to_stop_id, route_id=None)

Get the edges between two stops.

Parameters:
  • from_stop_id (str) – Origin stop ID.

  • to_stop_id (str) – Destination stop ID.

  • route_id (str | None) – Specific route, or None to find any route.

Return type:

StopPairEdges | None

Returns:

StopPairEdges or None if not found.

get_edges_for_stop_sequence(stop_ids, route_id)

Get edges for a sequence of stops (e.g., a trip segment).

Parameters:
  • stop_ids (list[str]) – Ordered list of stop IDs.

  • route_id (str) – The route ID.

Return type:

list[StopPairEdges]

Returns:

List of StopPairEdges for each consecutive pair.
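
"Each consecutive pair" means that a sequence of n stops yields n - 1 stop pairs. The pairing itself can be sketched as follows; `consecutive_pairs` is a hypothetical helper and not part of the documented API.

```python
def consecutive_pairs(stop_ids):
    """Pair each stop with the one that follows it."""
    return list(zip(stop_ids, stop_ids[1:]))


print(consecutive_pairs(["A", "B", "C"]))  # [('A', 'B'), ('B', 'C')]
```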

get_route_stop_pairs(route_id)

Get all consecutive stop pairs for a route.

Return type:

list[tuple[str, str]]

get_stop(stop_id)

Get a stop by ID.

Return type:

Stop | None

get_way_ids_between_stops(from_stop_id, to_stop_id, route_id=None)

Get just the OSM way IDs between two stops.

Return type:

list[int]

save_geojson(geojson, path)

Save a GeoJSON dict to a file.

Parameters:
  • geojson (dict) – The GeoJSON dict to save.

  • path (str | Path) – Output file path.

Return type:

None

stop_pair_to_geojson_feature(stop_pair, route_id=None)

Convert a stop pair to a GeoJSON Feature.

Parameters:
  • stop_pair (StopPairEdges) – The StopPairEdges to convert.

  • route_id (str | None) – Optional route ID to include in properties.

Return type:

dict

Returns:

GeoJSON Feature dict with LineString geometry.
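
GeoJSON positions are ordered longitude first, then latitude, while get_matched_geometry above is documented as returning (lat, lon) tuples. The sketch below shows the swap needed when building a LineString from such points. It assumes (lat, lon) input, and `linestring_feature` is a hypothetical helper rather than the method's actual implementation.

```python
def linestring_feature(points_latlon, properties=None):
    """Build a GeoJSON LineString Feature from (lat, lon) tuples."""
    return {
        "type": "Feature",
        "properties": dict(properties or {}),
        "geometry": {
            "type": "LineString",
            # GeoJSON positions are [longitude, latitude].
            "coordinates": [[lon, lat] for lat, lon in points_latlon],
        },
    }


feature = linestring_feature(
    [(42.0, -71.0), (42.5, -71.5)],
    {"route_id": "1"},
)
```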

stop_to_geojson_feature(stop)

Convert a stop to a GeoJSON Feature.

Parameters:

stop (Stop) – The Stop to convert.

Return type:

dict

Returns:

GeoJSON Feature dict with Point geometry.

summary()

Get summary statistics.

Return type:

dict

class anduin.matching.edges.StopPairEdges(from_stop_id, to_stop_id, edges=<factory>, way_ids=<factory>, total_length=0.0, geometry=<factory>)

Bases: object

Edges connecting a pair of stops.

property edge_ids: list[int]
edges: list[MatchedEdge]
from_stop_id: str
geometry: list[tuple[float, float]]
to_stop_id: str
total_length: float = 0.0
way_ids: list[int]

Analysis

Shared Segments

class anduin.analysis.segments.RouteSegments(route, way_ids=<factory>, edge_ids=<factory>, total_length=0.0)

Bases: object

All segments used by a particular route.

edge_ids: list[int]
route: RouteInfo
total_length: float = 0.0
way_ids: list[int]

class anduin.analysis.segments.SegmentRoutes(way_id, edge_id, names=<factory>, road_class='', length=0.0, routes=<factory>)

Bases: object

Routes that share a particular segment.

edge_id: int
length: float = 0.0
names: list[str]
road_class: str = ''
property route_names: list[str]

Get short names of all routes using this segment.

routes: list[RouteInfo]
way_id: int

class anduin.analysis.segments.SharedSegmentAnalyzer(matcher)

Bases: object

Analyzes which routes share which road/rail segments.

add_shape(shape, match_result=None)

Add a shape and its routes to the analysis.

Parameters:
  • shape (Shape) – The GTFS shape with route information.

  • match_result (MapMatchResult | None) – Pre-computed match result, or None to compute it.

Return type:

None

add_shapes(shapes, match_results=None)

Add multiple shapes to the analysis.

Parameters:
  • shapes (list[Shape]) – List of GTFS shapes with route information.

  • match_results (dict[str, MapMatchResult] | None) – Pre-computed match results keyed by shape_id, or None.

Return type:

None

get_busiest_segments(limit=10)

Get segments with the most routes.

Parameters:

limit (int) – Maximum number of segments to return.

Return type:

list[SegmentRoutes]

Returns:

List of SegmentRoutes sorted by route count descending.

get_overlap_matrix(route_ids=None)

Compute pairwise route overlap as percentage of shared segments.

Parameters:

route_ids (list[str] | None) – Specific routes to analyze, or None for all routes.

Return type:

dict[str, dict[str, float]]

Returns:

Nested dict where result[route_a][route_b] = percentage of route_a segments that are also used by route_b.
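
The matrix is asymmetric: a short route that lies entirely on a longer one overlaps it fully, but not the other way round. The sketch below computes such a matrix from plain way-ID collections. It assumes segments are counted by way ID, percentages are on a 0 to 100 scale, and self-pairs are omitted; none of these details are confirmed by the reference, and `overlap_matrix` is a hypothetical function.

```python
def overlap_matrix(route_ways):
    """Percentage of each route's ways that another route also uses."""
    way_sets = {route_id: set(ways) for route_id, ways in route_ways.items()}
    matrix = {}
    for route_a, ways_a in way_sets.items():
        matrix[route_a] = {}
        for route_b, ways_b in way_sets.items():
            if route_a == route_b:
                continue  # assumption: self-overlap is omitted
            shared = len(ways_a & ways_b)
            # Guard against routes with no matched ways.
            matrix[route_a][route_b] = 100.0 * shared / len(ways_a) if ways_a else 0.0
    return matrix


matrix = overlap_matrix({"1": [1, 2, 3, 4], "2": [3, 4]})
print(matrix["1"]["2"])  # 50.0
print(matrix["2"]["1"])  # 100.0
```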

get_routes_for_segment(way_id)

Get all routes that use a specific segment.

Return type:

list[RouteInfo]

get_segment(way_id)

Get segment info and routes for a specific OSM way.

Return type:

SegmentRoutes | None

get_segments_for_route(route_id)

Get all segments used by a specific route.

Return type:

RouteSegments | None

get_segments_shared_by(route_ids)

Get segments shared by all specified routes.

Parameters:

route_ids (list[str]) – List of route IDs that must all share the segment.

Return type:

list[SegmentRoutes]

Returns:

List of SegmentRoutes used by all specified routes.

get_shared_segments(min_routes=2)

Get segments shared by at least N routes.

Parameters:

min_routes (int) – Minimum number of routes sharing the segment.

Return type:

list[SegmentRoutes]

Returns:

List of SegmentRoutes with at least min_routes routes.
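
Finding shared segments amounts to inverting the route-to-ways mapping and keeping the ways used by enough routes. The sketch below works on plain dicts instead of SegmentRoutes objects; `shared_segments` is a hypothetical function, and the ordering (most routes first, then way ID) is an assumption.

```python
from collections import defaultdict


def shared_segments(route_ways, min_routes=2):
    """Map way_id -> sorted route IDs for ways used by at least min_routes routes."""
    routes_by_way = defaultdict(set)
    for route_id, way_ids in route_ways.items():
        for way_id in way_ids:
            routes_by_way[way_id].add(route_id)
    shared = {
        way_id: sorted(routes)
        for way_id, routes in routes_by_way.items()
        if len(routes) >= min_routes
    }
    # Most-shared ways first; ties broken by way ID for stable output.
    return dict(sorted(shared.items(), key=lambda item: (-len(item[1]), item[0])))


route_ways = {"1": [100, 101], "39": [101, 102], "66": [101]}
print(shared_segments(route_ways))  # {101: ['1', '39', '66']}
```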

summary()

Get a summary of the analysis.

Return type:

dict

to_geojson_features()

Export shared segments as GeoJSON features.

Note: This returns feature properties only, because the analyzer does not hold the actual geometries. Combine the result with OSM data to build full features.

Return type:

list[dict]

Returns:

List of GeoJSON feature property dicts.

OSM Extraction

OSM way geometry extraction and route mapping.

class anduin.analysis.osm_extract.OSMWayExtractor(route_indexes_dir, osm_pbf_path)

Bases: object

Extracts OSM way geometries for transit routes.

This class reads route index GeoJSON files to identify which OSM ways are used by each route, then extracts the geometries for those ways from an OSM PBF file. The result is a mapping of OSM ways to the routes that use them, with full LineString geometries.

The output can be used for route overlap analysis, visualization, and understanding shared infrastructure across routes.

Example:
>>> extractor = OSMWayExtractor(
...     route_indexes_dir="data/route_indexes",
...     osm_pbf_path="data/osm/massachusetts-latest.osm.pbf"
... )
>>> extractor.build()
>>> geojson = extractor.to_geojson()
>>> extractor.save_geojson("output.geojson")

build()

Execute the full pipeline: load routes, then extract geometries.

Return type:

None

extract_way_geometries()

Extract geometries from OSM PBF for needed way IDs using pyosmium.

Uses a custom osmium handler to perform a single-pass extraction of way geometries from the OSM PBF file, only including ways that are referenced in the route index files.

Return type:

None

get_way(way_id)

Get geometry and route info for a specific way.

Parameters:

way_id (int) – The OSM way ID to retrieve.

Return type:

WayGeometry | None

Returns:

WayGeometry object if found, None otherwise.

get_ways_for_route(route_id)

Get all ways used by a specific route.

Parameters:

route_id (str) – The route ID to query.

Return type:

list[WayGeometry]

Returns:

List of WayGeometry objects for the given route.

load_route_way_mappings()

Load all route GeoJSON files and build way_id -> routes mapping.

Iterates through all route_*.geojson files in the route_indexes_dir, extracts way_ids from feature properties, and builds a mapping of which routes use each way.

Return type:

None

save_geojson(output_path)

Save GeoJSON to file.

Parameters:

output_path (str | Path) – Path where the GeoJSON file should be written.

Return type:

None

summary()

Get summary statistics about extracted ways and routes.

Return type:

dict

Returns:

Dictionary with statistics including total ways, routes, and information about missing or skipped ways.

to_geojson()

Export as GeoJSON FeatureCollection.

Returns a GeoJSON FeatureCollection where each feature represents an OSM way with its geometry and the list of routes that use it.

Return type:

dict

Returns:

GeoJSON dict with the following structure:

    {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {
                    "way_id": int,
                    "routes": [str, ...],
                    "tags": {str: str, ...}
                },
                "geometry": {
                    "type": "LineString",
                    "coordinates": [[lon, lat], ...]
                }
            }
        ]
    }
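
To make the structure concrete, the sketch below assembles a FeatureCollection of this form from way records. The `WayGeometry` dataclass is a local stand-in mirroring the documented fields, `ways_to_feature_collection` is a hypothetical function, and the sample way ID, route IDs and tags are made up for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class WayGeometry:
    # Stand-in mirroring the documented fields; not imported from anduin.
    way_id: int
    routes: list = field(default_factory=list)
    coordinates: list = field(default_factory=list)
    tags: dict = field(default_factory=dict)


def ways_to_feature_collection(ways):
    """Turn way records into a GeoJSON FeatureCollection dict."""
    features = []
    for way in ways:
        features.append({
            "type": "Feature",
            "properties": {
                "way_id": way.way_id,
                "routes": list(way.routes),
                "tags": dict(way.tags),
            },
            "geometry": {
                "type": "LineString",
                # Coordinates are documented as (lon, lat), so no swap is needed.
                "coordinates": [[lon, lat] for lon, lat in way.coordinates],
            },
        })
    return {"type": "FeatureCollection", "features": features}


way = WayGeometry(
    way_id=123,
    routes=["1", "CT1"],
    coordinates=[(-71.1, 42.3), (-71.2, 42.4)],
    tags={"highway": "primary"},
)
collection = ways_to_feature_collection([way])
text = json.dumps(collection)  # serializable with the standard json module
```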

class anduin.analysis.osm_extract.WayGeometry(way_id, routes=<factory>, coordinates=<factory>, tags=<factory>)

Bases: object

Represents an OSM way with its geometry and associated routes.

Variables:
  • way_id – The OSM way ID.

  • routes – List of route IDs that use this way.

  • coordinates – List of (lon, lat) tuples representing the way geometry.

  • tags – Dictionary of OSM tags (name, highway type, etc.).

coordinates: list[tuple[float, float]]
routes: list[str]
tags: dict[str, str]
way_id: int