Contributing to Allocator¶

Thank you for your interest in contributing to allocator! This guide will help you get started.

Development Setup¶

Prerequisites¶

Python 3.11 or higher
Git
UV (recommended) or pip

Installation¶

Clone the repository

git clone https://github.com/geosensing/allocator.git
cd allocator

Install dependencies

# With UV (recommended)
uv sync --all-extras

# Or with pip
pip install -e ".[complete]"

Set up pre-commit hooks

pre-commit install

Code Standards¶

Linting and Formatting¶

We use modern Python tooling:

# Linting and formatting with ruff
uv run ruff check .
uv run ruff format .

# Type checking with mypy  
uv run mypy allocator/

# Traditional tools (if preferred)
uv run black .
uv run isort .

Testing¶

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=allocator --cov-report=html

# Run specific test files
uv run pytest tests/api/test_cluster_api.py

Documentation¶

# Build documentation locally
cd docs/
make html

# View documentation
open build/html/index.html

Contribution Workflow¶

Fork the repository on GitHub
Create a feature branch from main

git checkout -b feature/your-feature-name

Make your changes following our code standards
Add tests for new functionality
Update documentation as needed
Run the full test suite

uv run pytest
uv run ruff check .
uv run mypy allocator/

Commit with clear messages

git commit -m "Add clustering algorithm validation

- Add parameter validation for n_clusters
- Add tests for edge cases
- Update documentation examples"

Push and create a Pull Request

Code Style Guidelines¶

Python Code¶

Follow PEP 8 style guidelines
Use type hints for all function signatures
Write docstrings in Google/NumPy style
Maximum line length: 100 characters
Use meaningful variable and function names

Example Function¶

def cluster_locations(
    data: pd.DataFrame,
    n_clusters: int,
    distance: str = "haversine",
    random_state: Optional[int] = None,
) -> ClusterResult:
    """Cluster geographic locations into balanced groups.
    
    Args:
        data: DataFrame with longitude/latitude columns
        n_clusters: Number of clusters to create
        distance: Distance calculation method
        random_state: Random seed for reproducible results
        
    Returns:
        ClusterResult object with cluster assignments and metadata
        
    Raises:
        ValueError: If n_clusters is invalid
        KeyError: If required columns are missing
    """

Documentation¶

Write clear, concise documentation
Include practical examples for all public functions
Update relevant guides when adding features
Use markdown for all new documentation

Testing Guidelines¶

Test Structure¶

tests/
├── api/              # API function tests
├── core/             # Core algorithm tests  
├── distances/        # Distance calculation tests
├── integration/      # End-to-end tests
└── benchmarks/       # Performance tests

Writing Tests¶

import pytest
import pandas as pd
from allocator import cluster

def test_cluster_basic():
    """Test basic clustering functionality."""
    data = pd.DataFrame({
        'longitude': [100.5, 100.6, 100.7],
        'latitude': [13.7, 13.8, 13.9],
    })
    
    result = cluster(data, n_clusters=2)
    
    assert result['n_clusters'] == 2
    assert 'cluster' in result['data'].columns
    assert len(result['data']) == 3
    assert set(result['data']['cluster']) == {0, 1}

def test_cluster_invalid_input():
    """Test error handling for invalid input."""
    data = pd.DataFrame({'invalid': [1, 2, 3]})
    
    with pytest.raises(KeyError, match="longitude"):
        cluster(data, n_clusters=2)

Documentation Guidelines¶

API Documentation¶

All public functions must have complete docstrings
Include parameter types, return values, and examples
Document error conditions and edge cases

Examples¶

Provide realistic, practical examples
Use real-world data when possible
Include expected output
Show both basic and advanced usage

Guides¶

Write step-by-step tutorials for complex workflows
Include troubleshooting sections
Link related concepts and functions

Release Process¶

Version Numbering¶

We follow semantic versioning (SemVer):

Major (1.0.0): Breaking changes
Minor (1.1.0): New features, backward compatible
Patch (1.0.1): Bug fixes, backward compatible

Preparing a Release¶

Update version in pyproject.toml
Update CHANGELOG.md with release notes
Ensure all tests pass and documentation is current
Create release PR for review
Tag release after merge: git tag v1.1.0
Release to PyPI is automated via GitHub Actions

Getting Help¶

Questions: GitHub Discussions
Issues: GitHub Issues
Documentation: docs.allocator.ai
Examples: See examples/ directory

Areas for Contribution¶

We welcome contributions in these areas:

Core Features¶

New clustering algorithms
Additional routing methods
Performance optimizations
Better error handling

Documentation¶

More real-world examples
Tutorial improvements
API documentation enhancements
Translation support

Testing¶

Edge case coverage
Performance benchmarks
Integration tests
Property-based testing

Tooling¶

CLI improvements
Visualization enhancements
IDE integrations
CI/CD optimizations

Thank you for contributing to make allocator better for everyone!