Distance Calculation API
Distance matrix computation and geographic distance calculations.
Main Functions
- allocator.get_distance_matrix(points_from: ndarray, points_to: ndarray | None = None, method: str = 'euclidean', on_progress: Callable[[int, int, str | None], None] | None = None, **kwargs) → ndarray
Single entry point for all distance calculations.
- Parameters:
points_from – Source points (numpy array with shape [n, 2] where columns are [lon, lat])
points_to – Destination points (optional, defaults to points_from)
method – Distance calculation method:
  - 'euclidean': Local Euclidean distance
  - 'haversine': Great-circle distance
  - 'osrm': OSRM routing service (duration in seconds)
  - 'google': Legacy Google Distance Matrix API (deprecated, use 'google_routes')
  - 'google_routes': Google Routes API (recommended)
on_progress – Optional callback for progress reporting (current, total, message)
**kwargs – Method-specific arguments:
  - api_key: Required for the 'google' method (legacy)
  - credentials_file: Path to service account JSON for the 'google_routes' method
  - osrm_base_url: Custom OSRM server URL for the 'osrm' method
  - osrm_max_table_size: Chunk size for OSRM requests (default: 100)
  - duration: For Google methods, return duration instead of distance (default: True)
  - travel_mode: For 'google_routes', one of "DRIVE", "BICYCLE", "WALK", "TWO_WHEELER", "TRANSIT" (default: "DRIVE")
- Returns:
Distance matrix as numpy array with shape [len(points_from), len(points_to)]
- Raises:
ValueError – For invalid method or missing required parameters
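The on_progress callback receives (current, total, message). The sketch below shows one callback with that shape. `format_progress`, `print_progress`, and the stand-in driver `run_in_chunks` are hypothetical names used only to illustrate the call shape; how often the library actually invokes the callback, and with which messages, is not stated here.

```python
from typing import Callable, Optional

ProgressCallback = Callable[[int, int, Optional[str]], None]


def format_progress(current: int, total: int, message: Optional[str] = None) -> str:
    """Render one progress update as a single line of text."""
    percent = 100.0 * current / total if total else 100.0
    suffix = f" {message}" if message else ""
    return f"[{current}/{total}] {percent:5.1f}%{suffix}"


def print_progress(current: int, total: int, message: Optional[str] = None) -> None:
    """Callback matching the (current, total, message) shape of on_progress."""
    print(format_progress(current, total, message))


def run_in_chunks(total: int, on_progress: Optional[ProgressCallback] = None) -> int:
    """Hypothetical stand-in for a chunked computation that reports progress."""
    done = 0
    for step in range(1, total + 1):
        done = step  # real work for this chunk would happen here
        if on_progress is not None:
            on_progress(step, total, "chunk finished")
    return done


if __name__ == "__main__":
    run_in_chunks(3, on_progress=print_progress)
```

Given the signature above, such a callback would be passed as `allocator.get_distance_matrix(points, method='osrm', on_progress=print_progress)`.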
Distance Modules
Factory Module
Factory module for distance calculations - main entry point.
- allocator.distances.factory.get_distance_matrix(points_from: ndarray, points_to: ndarray | None = None, method: str = 'euclidean', on_progress: Callable[[int, int, str | None], None] | None = None, **kwargs) → ndarray
Single entry point for all distance calculations.
Parameters, return value, and raised exceptions are identical to those documented for allocator.get_distance_matrix under Main Functions.
Euclidean Distance
Euclidean distance calculations for geographic coordinates.
- allocator.distances.euclidean.euclidean_distance_matrix(points_from: ndarray, points_to: ndarray | None = None) → ndarray
Calculate euclidean distance matrix between two sets of lat/lon points.
- Parameters:
points_from – Source points (numpy array with shape [n, 2] where columns are [lon, lat])
points_to – Destination points (optional, defaults to points_from)
- Returns:
Distance matrix as numpy array with shape [len(points_from), len(points_to)]
- allocator.distances.euclidean.latlon2xy(lat: float, lon: float) → list[float]
Transform lat/lon to UTM coordinates.
- Parameters:
lat (float) – WGS latitude
lon (float) – WGS longitude
- Returns:
UTM x, y coordinate
- Return type:
[x, y]
- allocator.distances.euclidean.pairwise_distances(X: ndarray, Y: ndarray | None = None) → ndarray
Pairwise euclidean distance calculation.
- allocator.distances.euclidean.xy2latlog(x: float, y: float, zone_number: int, zone_letter: str | None = None) → tuple[float, float]
Transform UTM x, y coordinates to lat/lon coordinates.
- Parameters:
x (float) – UTM x coordinate
y (float) – UTM y coordinate
zone_number (int) – UTM zone number
zone_letter (str, optional) – UTM zone letter. Defaults to None.
- Returns:
WGS latitude, longitude
- Return type:
(lat, lon)
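xy2latlog needs a UTM zone number. As a sketch of where that number comes from, the standard rule assigns 6-degree-wide zones numbered 1 to 60 starting at 180° W. `utm_zone_number` is a hypothetical helper, not part of allocator, and it ignores the irregular zones around Norway and Svalbard.

```python
import math


def utm_zone_number(lon: float) -> int:
    """Standard 6-degree UTM zone for a longitude in [-180, 180].

    Ignores the irregular zones around Norway and Svalbard.
    """
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    # Zones are 6 degrees wide and numbered 1..60 starting at 180 W.
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    # lon == 180 would otherwise land in a non-existent zone 61.
    return min(zone, 60)


if __name__ == "__main__":
    print(utm_zone_number(100.5))
```

For the longitudes used in the usage examples on this page (around 100.5° E), this rule gives zone 47.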
Haversine Distance
Haversine distance calculations for geographic coordinates.
Uses Numba JIT compilation for high-performance distance matrix calculations.
- allocator.distances.haversine.haversine_distance_matrix(points_from: ndarray, points_to: ndarray | None = None) → ndarray
Calculate haversine distance matrix between two sets of lat/lon points.
- Parameters:
points_from – Source points (numpy array with shape [n, 2] where columns are [lon, lat])
points_to – Destination points (optional, defaults to points_from)
- Returns:
Distance matrix as numpy array with shape [len(points_from), len(points_to)]. Values are in meters.
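For reference, the haversine formula itself can be sketched in plain Python. This is not the library's Numba implementation. The Earth radius constant is an assumption (the value the library uses is not stated here), and `haversine_m` and `haversine_matrix` are hypothetical names.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in meters


def haversine_m(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance in meters between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 so rounding error cannot push asin out of its domain.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def haversine_matrix(points_from, points_to=None):
    """Nested-list matrix of distances; points are (lon, lat) pairs."""
    if points_to is None:
        points_to = points_from
    return [
        [haversine_m(lon1, lat1, lon2, lat2) for lon2, lat2 in points_to]
        for lon1, lat1 in points_from
    ]


if __name__ == "__main__":
    matrix = haversine_matrix([(100.5, 13.7), (100.6, 13.8)])
    print(f"{matrix[0][1]:.0f} m")
```

The point order follows the [lon, lat] column convention documented above.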
External APIs
External API integrations for distance calculations (OSRM, Google Maps).
- allocator.distances.external_apis.google_distance_matrix(X: ndarray, Y: ndarray | None = None, api_key: str | None = None, duration: bool = True, on_progress: Callable[[int, int, str | None], None] | None = None) → ndarray
Calculate distance matrix using Google Distance Matrix API.
Limitations for users of the standard API:
  - 2,500 free elements per day, calculated as the sum of client-side and server-side queries.
  - Maximum of 25 origins or 25 destinations per request.
  - 100 elements per request.
  - 100 elements per second, calculated as the sum of client-side and server-side queries.
For more information: https://developers.google.com/maps/documentation/distance-matrix/usage-limits
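The per-request limits above mean that a large origin/destination matrix has to be split into blocks. The sketch below shows one way to plan such blocks so that each request stays within 25 points per side and 100 elements. `iter_request_blocks` is a hypothetical helper, and how the library batches its own requests is not documented here.

```python
import math
from typing import Iterator, Tuple


def iter_request_blocks(
    n_origins: int,
    n_destinations: int,
    max_side: int = 25,
    max_elements: int = 100,
) -> Iterator[Tuple[range, range]]:
    """Yield (origin_indices, destination_indices) blocks within the limits."""
    # Roughly square blocks: 10 x 10 for the default limits.
    rows = max(1, min(max_side, math.isqrt(max_elements)))
    cols = max(1, min(max_side, max_elements // rows))
    for i in range(0, n_origins, rows):
        for j in range(0, n_destinations, cols):
            yield (
                range(i, min(i + rows, n_origins)),
                range(j, min(j + cols, n_destinations)),
            )


if __name__ == "__main__":
    for origins, destinations in iter_request_blocks(23, 12):
        print(list(origins)[:1], list(destinations)[:1], len(origins) * len(destinations))
```

Each yielded pair identifies one request; the per-second and per-day quotas would still need separate throttling.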
- allocator.distances.external_apis.google_routes_distance_matrix(X: ndarray, Y: ndarray | None = None, credentials_file: str | None = None, duration: bool = True, travel_mode: str = 'DRIVE', on_progress: Callable[[int, int, str | None], None] | None = None) → ndarray
Calculate distance matrix using Google Routes API (official client).
This uses the official google-maps-routing library which handles rate limiting and retries automatically.
Authentication: Set GOOGLE_APPLICATION_CREDENTIALS environment variable to your service account JSON file path, or pass credentials_file parameter.
To set up:
  1. Create a service account in Google Cloud Console
  2. Download the JSON key file
  3. Enable "Routes API" for your project
  4. Either set GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json or pass credentials_file="/path/to/key.json"
- Parameters:
X – Origin coordinates as numpy array with shape [n, 2] where columns are [lon, lat]
Y – Destination coordinates (optional, defaults to X)
credentials_file – Path to service account JSON file (optional if GOOGLE_APPLICATION_CREDENTIALS is set)
duration – If True, return duration in seconds; if False, return distance in meters
travel_mode – One of “DRIVE”, “BICYCLE”, “WALK”, “TWO_WHEELER”, “TRANSIT”
on_progress – Optional callback for progress reporting
- Returns:
Distance matrix as numpy array with shape [len(X), len(Y)]
- allocator.distances.external_apis.osrm_distance_matrix(X: ndarray, Y: ndarray | None = None, chunksize: int = 100, osrm_base_url: str | None = None, on_progress: Callable[[int, int, str | None], None] | None = None) → ndarray
Calculate a distance matrix of arbitrary size using OSRM.
Credits: https://github.com/stepankuzmin/distance-matrix
Note that the OSRM matrix contains travel durations in seconds, not distances.
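To make the OSRM call concrete, the sketch below builds a request URL for OSRM's table service, which takes coordinates as `lon,lat` pairs separated by semicolons plus index lists for sources and destinations. `osrm_table_url` is a hypothetical helper and the server address is a placeholder. No request is sent, and the URLs the library actually builds may differ.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (lon, lat)


def osrm_table_url(
    base_url: str,
    sources: Sequence[Point],
    destinations: Sequence[Point],
    profile: str = "driving",
) -> str:
    """Build an OSRM table-service URL for one block of sources/destinations."""
    points = list(sources) + list(destinations)
    coords = ";".join(f"{lon:.6f},{lat:.6f}" for lon, lat in points)
    # Sources come first in the coordinate list, destinations after them.
    src_idx = ";".join(str(i) for i in range(len(sources)))
    dst_idx = ";".join(str(i) for i in range(len(sources), len(points)))
    return (
        f"{base_url.rstrip('/')}/table/v1/{profile}/{coords}"
        f"?sources={src_idx}&destinations={dst_idx}"
    )


if __name__ == "__main__":
    print(osrm_table_url(
        "http://your-osrm-server:5000",
        [(100.5, 13.7)],
        [(100.6, 13.8), (100.7, 13.9)],
    ))
```

The table service responds with JSON whose `durations` field holds the matrix in seconds, which matches the note above.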
Usage Examples
Basic Distance Matrix
import pandas as pd
import allocator

# Geographic locations
locations = pd.DataFrame({
    'longitude': [100.5, 100.6, 100.7],
    'latitude': [13.7, 13.8, 13.9],
})

# get_distance_matrix expects an [n, 2] array with columns [lon, lat]
points = locations[['longitude', 'latitude']].to_numpy()

# Calculate distance matrix
distances = allocator.get_distance_matrix(points, method='haversine')

print(f"Distance matrix shape: {distances.shape}")
# Haversine values are in meters
print(f"Distance from point 0 to point 1: {distances[0, 1]:.2f} m")
Performance Comparison
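import time

# Convert the DataFrame to the [lon, lat] array the function expects
points = locations[['longitude', 'latitude']].to_numpy()

methods = ['euclidean', 'haversine']
for method in methods:
    # The first haversine call may include Numba JIT compilation time
    start = time.perf_counter()
    distances = allocator.get_distance_matrix(points, method=method)
    elapsed = time.perf_counter() - start
    print(f"{method}: {elapsed:.3f}s")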
Distance Methods
Euclidean Distance
Method: euclidean
Formula: Straight-line distance in Cartesian coordinates
Speed: Fastest
Accuracy: Approximate for geographic coordinates
Best for: Quick estimates, large datasets, non-geographic data
# Fast approximate distances
distances = allocator.get_distance_matrix(data, method='euclidean')
Haversine Distance
Method: haversine
Formula: Great-circle distance on Earth's surface
Speed: Fast
Accuracy: High for geographic coordinates
Best for: Most geographic applications (recommended)
# Accurate geographic distances
distances = allocator.get_distance_matrix(data, method='haversine')
OSRM Driving Distance
Method: osrm
Data source: OpenStreetMap road networks
Speed: Moderate (API calls)
Accuracy: Realistic driving times (the matrix holds durations in seconds)
Requirements: Internet connection, OSRM server access
# Realistic driving durations (seconds)
distances = allocator.get_distance_matrix(data, method='osrm')
Google Maps Distance
Method: google (legacy, deprecated) or google_routes (recommended)
Data source: Google Maps road data
Speed: Slow (API calls, rate limits)
Accuracy: High-quality driving durations or distances
Requirements: API key for google, service account credentials for google_routes, billing account
# Returns durations by default; pass duration=False for distances
distances = allocator.get_distance_matrix(
    data,
    method='google',
    api_key='your_api_key'
)
API Configuration
OSRM Configuration
# Custom OSRM server
distances = allocator.get_distance_matrix(
    data,
    method='osrm',
    osrm_base_url='http://your-osrm-server:5000'
)
Google Maps Configuration
# Google Maps with the API key read from an environment variable
# (set GOOGLEMAPS_API_KEY in your shell before running)
import os
api_key = os.environ['GOOGLEMAPS_API_KEY']
distances = allocator.get_distance_matrix(data, method='google', api_key=api_key)
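Return Format
All distance methods return a NumPy array where:
distances[i, j] = value from point i to point j
Diagonal elements are 0 when points_to is omitted (distance from a point to itself)
Matrix is symmetric for undirected distances (euclidean, haversine); routed results need not be symmetric
Units depend on the method: meters for haversine, seconds for osrm, and for google_routes seconds by default (duration=True) or meters with duration=False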
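As a sketch of how a returned matrix might be consumed, the snippet below converts a matrix given in meters (as haversine returns) to kilometers and finds each point's nearest neighbor. It uses plain nested lists and made-up values; `to_km` and `nearest_neighbors` are hypothetical helpers, not library functions.

```python
def to_km(matrix_m):
    """Convert a matrix of meters to kilometers."""
    return [[d / 1000.0 for d in row] for row in matrix_m]


def nearest_neighbors(matrix):
    """For each row i, return the column j != i with the smallest value."""
    result = []
    for i, row in enumerate(matrix):
        candidates = [(d, j) for j, d in enumerate(row) if j != i]
        result.append(min(candidates)[1] if candidates else None)
    return result


if __name__ == "__main__":
    # Made-up symmetric matrix in meters, for illustration only.
    meters = [
        [0.0, 1500.0, 4000.0],
        [1500.0, 0.0, 2000.0],
        [4000.0, 2000.0, 0.0],
    ]
    print(to_km(meters)[0][1])
    print(nearest_neighbors(meters))
```

The same indexing applies to a NumPy array; only the unit conversion depends on which method produced the matrix.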
Performance Characteristics
| Method | Speed | Accuracy | Use Case |
|---|---|---|---|
| euclidean | ★★★★★ | ★★☆☆☆ | Quick estimates, large datasets |
| haversine | ★★★★☆ | ★★★★★ | Geographic analysis (recommended) |
| osrm | ★★★☆☆ | ★★★★☆ | Realistic routing |
| google / google_routes | ★☆☆☆☆ | ★★★★★ | Production routing applications |
Integration with Other Functions
Distance methods are used automatically by other allocator functions:
# Clustering with specific distance method
clusters = allocator.cluster(data, distance='haversine')
# Routing with specific distance method
route = allocator.shortest_path(data, distance='euclidean')
# Assignment with specific distance method
assignments = allocator.assign(points, workers, distance='osrm')
Error Handling
try:
    distances = allocator.get_distance_matrix(data, method='google', api_key='your_api_key')
except Exception as e:
    print(f"API error: {e}")
    # Fallback to local method
    distances = allocator.get_distance_matrix(data, method='haversine')
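The same fallback idea can be generalized to an ordered list of attempts. The sketch below is library-agnostic: `first_successful` is a hypothetical helper, and the two stand-in functions only simulate a failing remote call and a working local one.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def first_successful(attempts: Sequence[Tuple[str, Callable[[], T]]]) -> Tuple[str, T]:
    """Run each (name, callable) in order and return the first result that succeeds."""
    errors: List[str] = []
    for name, compute in attempts:
        try:
            return name, compute()
        except Exception as exc:  # broad on purpose; narrow this in real code
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all methods failed: " + "; ".join(errors))


def failing_remote():
    """Stand-in for a remote API call that is unavailable."""
    raise ConnectionError("service unreachable")


def working_local():
    """Stand-in for a local computation."""
    return [[0.0, 1.0], [1.0, 0.0]]


if __name__ == "__main__":
    used, matrix = first_successful([("remote", failing_remote), ("local", working_local)])
    print(used, matrix)
```

With allocator, each callable could be a lambda wrapping `allocator.get_distance_matrix(points, method=...)`. Since osrm returns seconds and haversine returns meters, returning the name of the method that succeeded lets the caller interpret the units correctly.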
Advanced Usage
Custom Distance Functions
def custom_manhattan_distance(data):
    """Manhattan distance in kilometers."""
    # Convert to UTM coordinates for accurate metric calculation
    # ... custom implementation ...
    pass

# Use custom function with clustering
clusters = allocator.cluster(data, distance=custom_manhattan_distance)
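The core of such a function can be sketched as a pairwise Manhattan computation on planar coordinates. The sketch assumes the inputs are already projected x, y values in meters (for example UTM) and returns kilometers. `manhattan_matrix_km` is a hypothetical name, and the call signature that allocator.cluster expects from a custom callable should be checked against the library before wiring it in.

```python
def manhattan_matrix_km(xy_from, xy_to=None):
    """Pairwise Manhattan distance in kilometers.

    Inputs are sequences of (x, y) pairs in meters, assumed to be planar
    (already projected) coordinates.
    """
    if xy_to is None:
        xy_to = xy_from
    return [
        [(abs(x1 - x2) + abs(y1 - y2)) / 1000.0 for x2, y2 in xy_to]
        for x1, y1 in xy_from
    ]


if __name__ == "__main__":
    print(manhattan_matrix_km([(0.0, 0.0), (3000.0, 4000.0)]))
```

Projecting lon/lat to planar coordinates first matters because a degree of longitude covers a different ground distance at different latitudes.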
Batch Processing
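import numpy as np
import allocator

# Process large datasets in chunks (points is an [n, 2] array of [lon, lat])
def batch_distance_matrix(points, chunk_size=100, method='euclidean'):
    """Calculate the full distance matrix block by block."""
    n = len(points)
    distances = np.zeros((n, n))
    for i in range(0, n, chunk_size):
        rows = points[i:i + chunk_size]
        for j in range(0, n, chunk_size):
            cols = points[j:j + chunk_size]
            # Fill block (i, j), including the off-diagonal blocks
            distances[i:i + len(rows), j:j + len(cols)] = allocator.get_distance_matrix(
                rows, cols, method=method
            )
    return distances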
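Chunking bounds the size of each request, but the assembled result is still a dense n × n array. The sketch below estimates its memory footprint, assuming 8-byte float64 entries (the default dtype of np.zeros); `dense_matrix_bytes` and `max_square_size` are hypothetical helpers.

```python
import math


def dense_matrix_bytes(n_from, n_to=None, itemsize=8):
    """Bytes needed for a dense n_from x n_to matrix of itemsize-byte values."""
    if n_to is None:
        n_to = n_from
    return n_from * n_to * itemsize


def max_square_size(budget_bytes, itemsize=8):
    """Largest n such that an n x n matrix fits in budget_bytes."""
    return math.isqrt(budget_bytes // itemsize)


if __name__ == "__main__":
    n = 10_000
    print(f"{n} points -> {dense_matrix_bytes(n) / 1e6:.0f} MB")
    print(f"1 GB budget -> up to {max_square_size(10**9)} points")
```

Since memory grows with the square of the number of points, doubling the dataset quadruples the footprint.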