allocator package

Subpackages

Submodules

allocator.cluster module

Cluster geographic data points.

Args:

data: Input data (file path, DataFrame, numpy array, or list)
n_clusters: Number of clusters to create
method: Clustering method (‘kmeans’, ‘kahip’)
distance: Distance metric (‘euclidean’, ‘haversine’, ‘osrm’, ‘google’)
random_state: Random seed for reproducibility
**kwargs: Additional arguments for specific methods

Returns:

ClusterResult with labels, centroids, and metadata

Example:
>>> result = cluster('data.csv', n_clusters=5, method='kmeans')
>>> print(result.labels)  # Cluster assignments
>>> print(result.centroids)  # Cluster centers
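The distance argument lists ‘haversine’ among its metrics. As a rough illustration of what that metric measures, the sketch below computes the great-circle distance between two latitude/longitude pairs using only the standard library. The function name haversine_km and the Earth-radius constant are assumptions made for this example; the package's own implementation may differ.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant for this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude along the equator is roughly 111 km.
d = haversine_km(0.0, 0.0, 0.0, 1.0)
print(round(d, 1))
```

Unlike ‘euclidean’, this metric accounts for the curvature of the Earth, which matters once points are more than a few kilometres apart.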

allocator.cluster_haversine module

allocator.cluster_kahip module

allocator.cluster_kmeans module

allocator.compare_buffoon_kmeans module

allocator.distance_matrix module

allocator.mst module

allocator.shortest_path_gm module

allocator.shortest_path_mst_tsp module

allocator.shortest_path_ortools module

allocator.shortest_path_osrm module

allocator.sort_by_distance module

Sort points by distance to workers or vice versa.

Args:

points: Geographic points to assign (file path, DataFrame, numpy array, or list)
workers: Worker/centroid locations (file path, DataFrame, numpy array, or list)
by_worker: If True, sort points by worker; if False, sort workers by point
distance: Distance metric (‘euclidean’, ‘haversine’, ‘osrm’, ‘google’)
**kwargs: Additional distance-specific arguments

Returns:

SortResult with assignments and distance information

Example:
>>> result = sort_by_distance('points.csv', 'workers.csv')
>>> print(result.data)  # Points with worker assignments
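To make the idea of ranking by distance concrete, here is a minimal standard-library sketch that builds a point-by-worker Euclidean distance table and, for each point, orders the workers from nearest to farthest. The helper names distance_table and workers_by_distance are hypothetical, and the layout of the real SortResult may differ.

```python
import math


def distance_table(points, workers):
    """Euclidean distance from every point (rows) to every worker (columns)."""
    return [[math.dist(p, w) for w in workers] for p in points]


def workers_by_distance(points, workers):
    """For each point, list worker indices from nearest to farthest."""
    table = distance_table(points, workers)
    return [sorted(range(len(workers)), key=row.__getitem__) for row in table]


points = [(0.0, 0.0), (5.0, 5.0)]
workers = [(4.0, 4.0), (1.0, 0.0), (0.0, 3.0)]
ranking = workers_by_distance(points, workers)
print(ranking)  # [[1, 2, 0], [0, 2, 1]]
```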

allocator.utils module

allocator.utils.column_exists(df: DataFrame, col: str) -> bool

Check whether a column name exists in the DataFrame.

Args:

df: Pandas DataFrame.
col: Column name.

Returns:

True if the column exists, False otherwise.

allocator.utils.fixup_columns(cols: list[str | int]) -> list[str]

Replace index-location (integer) columns with names that carry a col prefix.

Args:

cols: List of original columns

Returns:

List of column names
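As an illustration of the renaming described above, the sketch below maps integer column labels to strings with a col prefix and leaves existing string labels untouched. The function name fixup_columns_sketch and the exact output format ("col0", "col1", ...) are assumptions; only the col prefix is stated in the documentation.

```python
def fixup_columns_sketch(cols):
    """Turn integer column labels into 'col<N>' strings; leave string labels alone."""
    return [f"col{c}" if isinstance(c, int) else c for c in cols]


names = fixup_columns_sketch([0, 1, "lat", "lon"])
print(names)  # ['col0', 'col1', 'lat', 'lon']
```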

Module contents

Allocator package for clustering and routing optimization.

Modern Pythonic API for geographic task allocation, clustering, and routing.

class allocator.ClusterResult(labels: ndarray, centroids: ndarray, n_iter: int, inertia: float | None, data: DataFrame, converged: bool, metadata: dict[str, Any])

Bases: object

Result of clustering operation.

centroids: ndarray
converged: bool
data: DataFrame
inertia: float | None
labels: ndarray
metadata: dict[str, Any]
n_iter: int

class allocator.ComparisonResult(results: dict[str, ClusterResult], statistics: DataFrame, metadata: dict[str, Any])

Bases: object

Result of algorithm comparison.

metadata: dict[str, Any]
results: dict[str, ClusterResult]
statistics: DataFrame

class allocator.RouteResult(route: list[int], total_distance: float, data: DataFrame, metadata: dict[str, Any])

Bases: object

Result of shortest path operation.

data: DataFrame
metadata: dict[str, Any]
route: list[int]
total_distance: float

class allocator.SortResult(data: DataFrame, distance_matrix: ndarray | None, metadata: dict[str, Any])

Bases: object

Result of sort by distance operation.

data: DataFrame
distance_matrix: ndarray | None
metadata: dict[str, Any]

allocator.assign_to_closest(points: str | DataFrame | ndarray | list, workers: str | DataFrame | ndarray | list, distance: str = 'euclidean', **kwargs) -> SortResult

Assign each point to its closest worker.

Args:

points: Geographic points to assign
workers: Worker/centroid locations
distance: Distance metric
**kwargs: Additional distance-specific arguments

Returns:

SortResult with point assignments
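The core idea of nearest-worker assignment can be sketched in a few lines of standard-library Python. The helper closest_worker below is hypothetical and uses plain Euclidean distance on coordinate tuples; it returns only worker indices, whereas the real function returns a SortResult.

```python
import math


def closest_worker(points, workers):
    """Return, for each point, the index of the nearest worker (Euclidean)."""
    assignments = []
    for p in points:
        nearest = min(range(len(workers)), key=lambda j: math.dist(p, workers[j]))
        assignments.append(nearest)
    return assignments


points = [(0.0, 0.0), (9.0, 9.0), (2.0, 1.0)]
workers = [(1.0, 1.0), (8.0, 8.0)]
assigned = closest_worker(points, workers)
print(assigned)  # [0, 1, 0]
```

Ties go to the worker with the lower index in this sketch, because min returns the first minimal element; how the package breaks ties is not documented here.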

allocator.cluster(data: str | DataFrame | ndarray | list, n_clusters: int = 3, method: str = 'kmeans', distance: str = 'euclidean', random_state: int | None = None, **kwargs) -> ClusterResult

Cluster geographic data points.

Args:

data: Input data (file path, DataFrame, numpy array, or list)
n_clusters: Number of clusters to create
method: Clustering method (‘kmeans’, ‘kahip’)
distance: Distance metric (‘euclidean’, ‘haversine’, ‘osrm’, ‘google’)
random_state: Random seed for reproducibility
**kwargs: Additional arguments for specific methods

Returns:

ClusterResult with labels, centroids, and metadata

Example:
>>> result = cluster('data.csv', n_clusters=5, method='kmeans')
>>> print(result.labels)  # Cluster assignments
>>> print(result.centroids)  # Cluster centers

allocator.get_logger(name)

Get a logger instance for a specific module.

Args:

name: Module name (typically __name__)

Returns:

Logger instance

allocator.kahip(data: DataFrame | ndarray, n_clusters: int = 3, distance: str = 'euclidean', n_closest: int = 15, balance_edges: bool = False, buffoon: bool = False, random_state: int | None = None, **kwargs) -> ClusterResult

KaHIP graph partitioning clustering.

Args:

data: Input data as DataFrame or numpy array
n_clusters: Number of clusters
distance: Distance metric
n_closest: Number of closest neighbors to connect in graph
balance_edges: Whether to balance edge weights
buffoon: Whether to use buffoon mode
random_state: Random seed for reproducibility
**kwargs: Additional arguments

Returns:

ClusterResult with clustering information

allocator.kmeans(data: DataFrame | ndarray | list, n_clusters: int = 3, distance: str = 'euclidean', max_iter: int = 300, random_state: int | None = None, **kwargs) -> ClusterResult

K-means clustering of geographic data.

Args:

data: Input data as DataFrame, numpy array, or list
n_clusters: Number of clusters
distance: Distance metric (‘euclidean’, ‘haversine’, ‘osrm’, ‘google’)
max_iter: Maximum iterations
random_state: Random seed for reproducibility
**kwargs: Additional distance-specific arguments

Returns:

ClusterResult with clustering information
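For readers unfamiliar with how n_clusters, max_iter, and random_state interact, the following is a minimal sketch of Lloyd's algorithm, the textbook k-means iteration, on 2-D tuples with Euclidean distance. It is not the package's implementation: the function kmeans_sketch, its tuple return value, and its initialization by random sampling are assumptions for illustration.

```python
import math
import random


def kmeans_sketch(points, n_clusters=3, max_iter=300, random_state=None):
    """Plain Lloyd's algorithm on 2-D points with Euclidean distance."""
    rng = random.Random(random_state)
    centroids = rng.sample(points, n_clusters)  # initial centers: random data points
    labels = [0] * len(points)
    for n_iter in range(1, max_iter + 1):
        # Assignment step: label each point with its nearest centroid.
        labels = [
            min(range(n_clusters), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's center
        if new_centroids == centroids:
            return labels, centroids, n_iter, True  # centers stopped moving
        centroids = new_centroids
    return labels, centroids, max_iter, False


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids, n_iter, converged = kmeans_sketch(points, n_clusters=2, random_state=42)
print(labels, converged)
```

The four returned values loosely mirror the labels, centroids, n_iter, and converged fields of ClusterResult. Fixing random_state makes the initial sample, and therefore the whole run, repeatable.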

allocator.setup_logging(level=20)

Set up logging configuration for the allocator package.

Args:

level: Logging level (DEBUG, INFO, WARNING, ERROR)
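The default level=20 is the numeric value of logging.INFO in the standard library. A rough sketch of how a setup/get-logger pair like this is commonly built on top of the logging module follows; the function names, the format string, and the logger name "my_module" are assumptions, not the package's actual code.

```python
import logging


def setup_logging_sketch(level=logging.INFO):
    """Configure the root logger with a simple timestamped format."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
        force=True,  # replace any handlers configured earlier (Python 3.8+)
    )


def get_logger_sketch(name):
    """Return the named logger; the same name always yields the same object."""
    return logging.getLogger(name)


setup_logging_sketch(logging.DEBUG)
log = get_logger_sketch("my_module")  # typically __name__ inside a module
log.debug("clustering started")
```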

allocator.shortest_path(data: str | DataFrame | ndarray | list, method: str = 'christofides', distance: str = 'euclidean', **kwargs) -> RouteResult

Find shortest path through geographic points (TSP).

Args:

data: Input data (file path, DataFrame, numpy array, or list)
method: TSP solving method (‘christofides’, ‘ortools’, ‘osrm’, ‘google’)
distance: Distance metric (‘euclidean’, ‘haversine’, ‘osrm’, ‘google’)
**kwargs: Additional method-specific arguments

Returns:

RouteResult with optimal route and total distance

Example:
>>> result = shortest_path('points.csv', method='ortools')
>>> print(result.route)  # Optimal visiting order
>>> print(result.total_distance)  # Total route distance

allocator.sort_by_distance(points: str | DataFrame | ndarray | list, workers: str | DataFrame | ndarray | list, by_worker: bool = False, distance: str = 'euclidean', **kwargs) -> SortResult

Sort points by distance to workers or vice versa.

Args:

points: Geographic points to assign (file path, DataFrame, numpy array, or list)
workers: Worker/centroid locations (file path, DataFrame, numpy array, or list)
by_worker: If True, sort points by worker; if False, sort workers by point
distance: Distance metric (‘euclidean’, ‘haversine’, ‘osrm’, ‘google’)
**kwargs: Additional distance-specific arguments

Returns:

SortResult with assignments and distance information

Example:
>>> result = sort_by_distance('points.csv', 'workers.csv')
>>> print(result.data)  # Points with worker assignments

allocator.tsp_christofides(data: DataFrame | ndarray, distance: str = 'euclidean', **kwargs) -> RouteResult

Solve TSP using Christofides algorithm (approximate).

Args:

data: Input data as DataFrame or numpy array
distance: Distance metric
**kwargs: Additional arguments

Returns:

RouteResult with approximate optimal route

allocator.tsp_google(data: DataFrame | ndarray, api_key: str, **kwargs) -> RouteResult

Solve TSP using Google Maps Directions API.

Args:

data: Input data as DataFrame or numpy array
api_key: Google Maps API key
**kwargs: Additional arguments

Returns:

RouteResult with route using Google’s road network

allocator.tsp_ortools(data: DataFrame | ndarray, distance: str = 'euclidean', **kwargs) -> RouteResult

Solve TSP using Google OR-Tools (exact for small problems).

Args:

data: Input data as DataFrame or numpy array
distance: Distance metric
**kwargs: Additional arguments

Returns:

RouteResult with optimal route
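To show what an exact answer means for a small problem, the sketch below finds the shortest tour by brute force, trying every visiting order with the standard library. This is deliberately not OR-Tools and does not scale past a handful of points. The helper names are hypothetical, and treating the route as a closed tour that returns to its start is an assumption about how total distance is measured.

```python
import itertools
import math


def route_length(points, route):
    """Total length of a closed tour visiting points in the given order."""
    legs = zip(route, route[1:] + route[:1])
    return sum(math.dist(points[i], points[j]) for i, j in legs)


def brute_force_tsp(points):
    """Try every tour that starts at point 0 and keep the shortest."""
    rest = range(1, len(points))
    best = min(
        ([0, *perm] for perm in itertools.permutations(rest)),
        key=lambda r: route_length(points, r),
    )
    return best, route_length(points, best)


# Corners of a 3 x 4 rectangle, listed out of order.
points = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0), (0.0, 4.0)]
route, total = brute_force_tsp(points)
print(route, total)  # [0, 2, 1, 3] 14.0
```

The returned pair corresponds loosely to the route and total_distance fields of RouteResult. The number of candidate tours grows factorially, which is why practical solvers are exact only for small inputs.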

allocator.tsp_osrm(data: DataFrame | ndarray, osrm_base_url: str | None = None, **kwargs) -> RouteResult

Solve TSP using OSRM trip service.

Args:

data: Input data as DataFrame or numpy array
osrm_base_url: Custom OSRM server URL
**kwargs: Additional arguments

Returns:

RouteResult with route using real road network