Quick Start

Get started with geo-sampling in 5 minutes! This guide shows how to extract road segments and draw samples from them using both the command-line interface and the Python API.

Command-Line Interface (CLI)

Complete Workflow in One Command

Extract roads and create a sample for Singapore in a single command:

geo-sampling workflow "Singapore" "Central" \
    --sample-size 100 \
    --output singapore_sample.csv \
    --plot

Step-by-Step Approach

For more control, run extraction and sampling as separate commands:

# 1. Extract all roads
geo-sampling extract "India" "NCT of Delhi" \
    --output delhi_roads.csv

# 2. Create a random sample
geo-sampling sample delhi_roads.csv \
    --sample-size 1000 \
    --strategy random \
    --output delhi_sample.csv \
    --plot

# 3. Get information about a region
geo-sampling info "Thailand" "Bangkok"

Python API

One-Liner Convenience Function

import geo_sampling as gs

# Quick sampling for research
sample = gs.sample_roads_for_region(
    "Singapore", "Central",
    n=100,
    strategy="random",
    seed=42
)

# Plot the results
gs.quick_plot(sample, title="Singapore Road Sample")

Step-by-Step with Full Control

import geo_sampling as gs

# Extract roads from a region
extractor = gs.RoadExtractor("India", "NCT of Delhi")
roads = extractor.get_roads(road_types=["primary", "secondary"])

# Create sampler and generate sample
sampler = gs.RoadSampler(roads)
sample = sampler.random_sample(1000, seed=42)

# Save and visualize
sampler.save_csv(sample, "delhi_sample.csv")
gs.plot_road_segments(sample, title="Delhi Road Sample")
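The `seed` argument is what makes a sample repeatable. The library's internals are not shown in this guide, so the sketch below is only a stand-in built on Python's standard `random` module, with made-up segment IDs. It illustrates the general idea: the same seed applied to the same sampling frame, in the same order, yields the same draw.

```python
import random

def draw_sample(segment_ids, n, seed):
    """Draw n items without replacement using a dedicated, seeded generator."""
    rng = random.Random(seed)  # local generator; the global random state is untouched
    return rng.sample(segment_ids, n)

# Hypothetical sampling frame of 1,000 segment IDs (illustration only)
frame = list(range(1, 1001))

first = draw_sample(frame, 10, seed=42)
second = draw_sample(frame, 10, seed=42)

print(first == second)  # True: same seed and same frame give the same sample
```

If the frame changes between runs (for example, because the underlying road data was updated), the same seed will generally select different segments, so it helps to keep the extracted road file alongside the seed.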

What’s Next?

📖 Check out detailed examples for more complex use cases
🔧 Learn about advanced sampling strategies
📚 Browse the API reference for complete documentation

Understanding the Output

CSV File Structure

The output CSV contains these columns:

Column	Description
`segment_id`	Unique identifier for each road segment
`osm_id`	OpenStreetMap way ID
`osm_name`	Road name from OpenStreetMap
`osm_type`	Road type (primary, secondary, residential, etc.)
`start_lat`, `start_long`	Latitude and longitude of the segment's start point
`end_lat`, `end_long`	Latitude and longitude of the segment's end point

Sample Data

Here’s what a few rows look like:

segment_id,osm_id,osm_name,osm_type,start_lat,start_long,end_lat,end_long
1,way_123,Orchard Road,primary,1.3048,103.8318,1.3052,103.8322
2,way_124,Marina Bay Drive,trunk,1.2966,103.8558,1.2970,103.8562
3,way_125,Residential Street,residential,1.3100,103.8400,1.3104,103.8404
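Because the output is plain CSV, it can also be read without the library. The following is a minimal sketch using only the standard `csv` module; the `read_segments` helper is a hypothetical name introduced for illustration, and the rows are the sample rows shown above.

```python
import csv
import io

SAMPLE_CSV = """\
segment_id,osm_id,osm_name,osm_type,start_lat,start_long,end_lat,end_long
1,way_123,Orchard Road,primary,1.3048,103.8318,1.3052,103.8322
2,way_124,Marina Bay Drive,trunk,1.2966,103.8558,1.2970,103.8562
3,way_125,Residential Street,residential,1.3100,103.8400,1.3104,103.8404
"""

COORD_COLUMNS = ("start_lat", "start_long", "end_lat", "end_long")

def read_segments(handle):
    """Read segment rows as dicts, converting coordinate columns to float."""
    rows = []
    for row in csv.DictReader(handle):
        for col in COORD_COLUMNS:
            row[col] = float(row[col])
        rows.append(row)
    return rows

segments = read_segments(io.StringIO(SAMPLE_CSV))
for seg in segments:
    # Midpoint of each segment, a convenient single location for field visits
    mid_lat = (seg["start_lat"] + seg["end_lat"]) / 2
    mid_long = (seg["start_long"] + seg["end_long"]) / 2
    print(f'{seg["osm_name"]}: midpoint ({mid_lat:.4f}, {mid_long:.4f})')
```

To read a file on disk instead, pass an open file handle, for example `open("singapore_sample.csv", newline="")`.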

Working with Sample Data

Load and analyze your samples:

import geo_sampling as gs
import pandas as pd

# Load the CSV back into Python
segments = gs.load_segments_from_csv("singapore_sample.csv")
print(f"Loaded {len(segments)} segments")

# Convert to pandas DataFrame for analysis (optional)
sampler = gs.RoadSampler(segments)
df = sampler.to_dataframe()

# Analyze road type distribution
road_type_counts = df['osm_type'].value_counts()
print("Road type distribution:")
print(road_type_counts)
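Since each row carries start and end coordinates, approximate segment lengths can be derived as well. The sketch below is not part of the library; it applies the haversine formula on a spherical Earth, which is an approximation, to the sample rows from the Sample Data section and totals the length per road type.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, long) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# (osm_type, start_lat, start_long, end_lat, end_long) from the sample rows
rows = [
    ("primary", 1.3048, 103.8318, 1.3052, 103.8322),
    ("trunk", 1.2966, 103.8558, 1.2970, 103.8562),
    ("residential", 1.3100, 103.8400, 1.3104, 103.8404),
]

length_by_type = defaultdict(float)
for osm_type, *coords in rows:
    length_by_type[osm_type] += haversine_m(*coords)

for osm_type, metres in sorted(length_by_type.items()):
    print(f"{osm_type}: {metres:.1f} m")
```

This measures the straight line between the two endpoints, so it will understate the length of any segment that curves.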

Common Workflows

Research Study Design

import geo_sampling as gs

# 1. Get road summary to plan sample size
summary = gs.get_road_summary("Thailand", "Bangkok")
print(f"Total roads available: {summary['total_segments']:,}")
print("Road types:", list(summary['road_types']))

# 2. Extract and sample with stratification
sample = gs.sample_roads_for_region(
    "Thailand", "Bangkok",
    n=500,  # Sample size
    strategy="stratified",  # Maintain road type proportions
    road_types=["primary", "secondary", "tertiary"],  # Focus on major roads
    seed=42  # Reproducible results
)

# 3. Export for field work
sampler = gs.RoadSampler(sample)
sampler.save_csv(sample, "bangkok_fieldwork_sample.csv")

# 4. Create field maps
gs.plot_road_segments(sample, title="Bangkok Field Study Sites")
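To make "maintain road type proportions" concrete, here is a small standalone sketch of proportional stratified sampling. It is an illustration of the general technique, not the library's implementation: the `stratified_sample` function and the toy frame are assumptions introduced here, and leftover draws are assigned by largest fractional remainder.

```python
import random
from collections import defaultdict

def stratified_sample(items, key, n, seed=None):
    """Proportionally allocate n draws across strata defined by key(item)."""
    if n > len(items):
        raise ValueError("sample size exceeds sampling frame")
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)

    # Ideal (fractional) share per stratum, rounded down
    total = len(items)
    exact = {k: n * len(v) / total for k, v in strata.items()}
    alloc = {k: int(share) for k, share in exact.items()}

    # Give the remaining draws to the strata with the largest fractional remainders
    leftover = n - sum(alloc.values())
    by_remainder = sorted(strata, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1

    sample = []
    for k, members in strata.items():
        sample.extend(rng.sample(members, alloc[k]))
    return sample

# Toy frame: 10% primary, 30% secondary, 60% tertiary
frame = (
    [("primary", i) for i in range(100)]
    + [("secondary", i) for i in range(300)]
    + [("tertiary", i) for i in range(600)]
)
picked = stratified_sample(frame, key=lambda seg: seg[0], n=50, seed=42)

counts = defaultdict(int)
for road_type, _ in picked:
    counts[road_type] += 1
print(dict(counts))  # {'primary': 5, 'secondary': 15, 'tertiary': 30}
```

With proportional allocation, rare road types receive few draws. If a study needs a minimum number of segments per type, the sample size or the allocation rule has to be adjusted accordingly.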

Batch Processing Multiple Regions

#!/bin/bash
# Process multiple regions

REGIONS=("Bangkok" "Chiang Mai" "Phuket")

for region in "${REGIONS[@]}"; do
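    echo "Processing $region..."

    # Lowercase the name and replace spaces: "Chiang Mai" -> "chiang_mai" (needs bash 4+)
    slug="${region,,}"
    slug="${slug// /_}"

    geo-sampling workflow "Thailand" "$region" \
        --sample-size 200 \
        --strategy stratified \
        --output "${slug}_sample.csv" \
        --plot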
done
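The same loop can be driven from Python, which avoids depending on a particular shell. This is a sketch under assumptions: it uses only the `workflow` subcommand and flags shown above, the helper names are hypothetical, and output file names are lowercased with spaces replaced by underscores. By default it only builds the commands; pass `dry_run=False` to execute them.

```python
import subprocess

def build_command(country, region, sample_size=200, strategy="stratified"):
    """Build the argument list for one `geo-sampling workflow` run."""
    slug = region.lower().replace(" ", "_")
    return [
        "geo-sampling", "workflow", country, region,
        "--sample-size", str(sample_size),
        "--strategy", strategy,
        "--output", f"{slug}_sample.csv",
        "--plot",
    ]

def run_batch(country, regions, dry_run=True):
    """Run (or, with dry_run, just collect) one command per region."""
    commands = [build_command(country, r) for r in regions]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raises on the first failing region
    return commands

commands = run_batch("Thailand", ["Bangkok", "Chiang Mai", "Phuket"])
for cmd in commands:
    print(" ".join(cmd))
```

Passing the arguments as a list means region names containing spaces need no extra quoting.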

Tips for Success

1. Start Small

Begin with small administrative areas to test your workflow before scaling up.

2. Check Data Quality

Always inspect a few segments manually:

# Look at the first few segments
for i, seg in enumerate(sample[:3], start=1):
    print(f"Segment {i}: {seg.osm_name} ({seg.osm_type})")
    print(f"  From: {seg.start_lat:.4f}, {seg.start_long:.4f}")
    print(f"  To: {seg.end_lat:.4f}, {seg.end_long:.4f}")
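Manual inspection can be complemented by a few automated sanity checks. The sketch below is independent of the library and works on plain dicts keyed by the CSV column names; the `check_segment` helper and the test rows are hypothetical, and the checks chosen (coordinate ranges, zero-length segments, missing names) are only a starting point.

```python
def check_segment(seg):
    """Return a list of problems found in one segment row (empty list if none)."""
    problems = []
    for key in ("start_lat", "end_lat"):
        if not -90 <= seg[key] <= 90:
            problems.append(f"{key} out of range: {seg[key]}")
    for key in ("start_long", "end_long"):
        if not -180 <= seg[key] <= 180:
            problems.append(f"{key} out of range: {seg[key]}")
    if (seg["start_lat"], seg["start_long"]) == (seg["end_lat"], seg["end_long"]):
        problems.append("zero-length segment")
    if not seg.get("osm_name"):
        problems.append("missing road name")
    return problems

# Hypothetical rows: one clean, one with several problems
rows = [
    {"segment_id": 1, "osm_name": "Orchard Road",
     "start_lat": 1.3048, "start_long": 103.8318, "end_lat": 1.3052, "end_long": 103.8322},
    {"segment_id": 2, "osm_name": "",
     "start_lat": 91.0, "start_long": 103.8558, "end_lat": 91.0, "end_long": 103.8558},
]

report = {row["segment_id"]: check_segment(row) for row in rows}
for segment_id, problems in report.items():
    print(segment_id, problems or "ok")
```

An unnamed road is not necessarily an error, since many minor roads in OpenStreetMap have no name; treat that check as a flag for review rather than a failure.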

3. Validate Geographic Bounds

Plot samples to ensure they cover your intended study area:

gs.quick_plot(sample, title="Sample Coverage Check")
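A numeric check can back up the visual one. The following sketch, which does not use the library, computes the bounding box of a set of segments and lists any that fall outside an expected study area. The helper names, the stray segment, and the `STUDY_AREA` box are all hypothetical values for illustration; substitute bounds for your own region.

```python
def bounding_box(segments):
    """Return (south, west, north, east) covering every segment endpoint."""
    lats = [v for s in segments for v in (s["start_lat"], s["end_lat"])]
    longs = [v for s in segments for v in (s["start_long"], s["end_long"])]
    return min(lats), min(longs), max(lats), max(longs)

def outside_area(segments, south, west, north, east):
    """Return segments with at least one endpoint outside the given box."""
    def inside(lat, lon):
        return south <= lat <= north and west <= lon <= east
    return [
        s for s in segments
        if not (inside(s["start_lat"], s["start_long"])
                and inside(s["end_lat"], s["end_long"]))
    ]

# Two in-area segments plus one deliberately stray segment
segments = [
    {"segment_id": 1, "start_lat": 1.3048, "start_long": 103.8318,
     "end_lat": 1.3052, "end_long": 103.8322},
    {"segment_id": 2, "start_lat": 1.2966, "start_long": 103.8558,
     "end_lat": 1.2970, "end_long": 103.8562},
    {"segment_id": 99, "start_lat": 13.7563, "start_long": 100.5018,
     "end_lat": 13.7567, "end_long": 100.5022},
]

# Hypothetical study-area box: (south, west, north, east)
STUDY_AREA = (1.20, 103.60, 1.48, 104.10)

print(bounding_box(segments))
print([s["segment_id"] for s in outside_area(segments, *STUDY_AREA)])  # [99]
```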

4. Document Your Methodology

Save your sampling parameters for reproducibility:

import json
from datetime import date

metadata = {
    "country": "Singapore",
    "region": "Central",
    "sample_size": len(sample),
    "strategy": "stratified",
    "road_types": ["primary", "secondary"],
    "seed": 42,
    "date_created": date.today().isoformat(),  # e.g. "2024-01-15"
}

with open("sample_metadata.json", "w") as f:
    json.dump(metadata, f, indent=2)
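One optional addition, which is a suggestion rather than a library feature, is to store a checksum of the sample CSV in the metadata so that later edits to the file can be detected. The sketch below uses the standard `hashlib` module and a temporary directory with a stand-in CSV; the `file_sha256` helper and the extra metadata keys are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a real sample file
    csv_path = Path(tmp) / "sample.csv"
    csv_path.write_text("segment_id,osm_type\n1,primary\n")

    metadata = {
        "seed": 42,
        "sample_file": csv_path.name,
        "sha256": file_sha256(csv_path),
    }
    meta_path = Path(tmp) / "sample_metadata.json"
    meta_path.write_text(json.dumps(metadata, indent=2))

    # Later: reload the metadata and confirm the CSV has not changed
    reloaded = json.loads(meta_path.read_text())
    unchanged = reloaded["sha256"] == file_sha256(csv_path)

print(unchanged)  # True
```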

Troubleshooting

Empty results: Check that the region name exactly matches the name used in GADM. Run `geo-sampling info` to verify.

Too many segments: Filter by road type or choose a smaller administrative area to reduce the sampling frame.

Plotting issues: If you get visualization errors, make sure matplotlib is installed: `pip install matplotlib`
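For the empty-results case, a near miss in spelling is a likely cause. Assuming you have a list of valid region names for the country (the list below is a hypothetical stand-in, not output from the library), the standard `difflib` module can suggest the closest candidates:

```python
import difflib

def suggest_region(name, known_regions, cutoff=0.6):
    """Return up to three known region names that closely resemble `name`."""
    return difflib.get_close_matches(name, known_regions, n=3, cutoff=cutoff)

# Hypothetical list of valid names; in practice, take these from GADM
known = ["NCT of Delhi", "Uttar Pradesh", "Maharashtra"]

print(suggest_region("NCT of Dehli", known))  # ['NCT of Delhi']
```

The match is based on character similarity, so an abbreviated name such as "Delhi" may fall below the cutoff; lowering `cutoff` widens the search at the cost of more false suggestions.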

Next Steps

Ready for more advanced usage? Check out: