Geo Sampling Documentation
Randomly sample locations on streets for data collection and research
Geo-sampling is a Python package that helps researchers randomly sample street locations for data collection. Whether you’re studying potholes, surveying street conditions, or conducting urban research, this package provides a systematic approach to selecting representative road segments from OpenStreetMap data.
Features
✨ Simple CLI & Python API - Easy to use from command line or Python scripts
🌍 Global Coverage - Works with any country/region via OpenStreetMap
📊 Multiple Sampling Strategies - Random, stratified, and filtered sampling
🎯 Road Type Filtering - Focus on specific road types (highways, residential, etc.)
📈 Built-in Visualization - Plot samples on maps for validation
💾 CSV Export - Standard output format for analysis tools
Quick Start
Get started in 5 minutes with the complete workflow:
# Install the package
pip install geo-sampling
# Sample 100 road segments from Singapore
geo-sampling workflow "Singapore" "Central" \
--sample-size 100 \
--output singapore_sample.csv \
--plot
Sampling Strategy
This package implements a systematic approach to sampling street locations for data collection. The strategy ensures representative coverage of road networks for research purposes.
1. Sampling Frame
Get all the streets in the region of interest from OpenStreetMap. The package:
Downloads administrative boundary data from GADM in ESRI shapefile format
Identifies the geographic bounds of your region of interest
Extracts road data from BBBike.org for the bounded area
Processes the road network into manageable segments
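The bounding step above can be illustrated with a minimal sketch. It assumes the region boundary is already available as a list of (longitude, latitude) points; the function names and the coordinates are hypothetical and are not part of the package's API.

```python
def bounding_box(ring):
    """Return (min_lon, min_lat, max_lon, max_lat) for a ring of (lon, lat) points."""
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return min(lons), min(lats), max(lons), max(lats)


def in_bounding_box(point, bbox):
    """Check whether a (lon, lat) point falls inside the bounding box."""
    lon, lat = point
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


# Made-up boundary ring for illustration only.
ring = [(103.80, 1.27), (103.87, 1.27), (103.87, 1.33), (103.80, 1.33), (103.80, 1.27)]
bbox = bounding_box(ring)
print(bbox)
print(in_bounding_box((103.85, 1.30), bbox))
```

A bounding box is only a first cut: roads inside the box but outside the actual boundary polygon still need to be clipped in a later step.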
Administrative levels are hierarchical: cities are nested in states, which are nested in countries. You can sample at any administrative level depending on your research needs.
2. Sampling Design
Road Segmentation
Each street is split into 500-meter segments from end to end
Shorter streets (< 500 m) remain as single segments
Each segment is treated as a straight line between start and end points
Segments maintain OpenStreetMap metadata (road type, name, ID)
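The segmentation rules above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: a street is assumed to be a list of (lat, lon) points, distances use the haversine formula, cut points are linearly interpolated, and the leftover piece at the end of a street is kept as a shorter final segment. All function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def point_at(points, cum, d):
    """Linearly interpolate the point at distance d along the polyline."""
    for i in range(1, len(points)):
        if d <= cum[i]:
            span = cum[i] - cum[i - 1]
            t = 0.0 if span == 0 else (d - cum[i - 1]) / span
            (lat1, lon1), (lat2, lon2) = points[i - 1], points[i]
            return (lat1 + t * (lat2 - lat1), lon1 + t * (lon2 - lon1))
    return points[-1]


def split_street(points, segment_m=500.0):
    """Split a street into (start, end) pairs of at most segment_m meters."""
    cum = [0.0]  # cumulative distance at each vertex
    for p, q in zip(points, points[1:]):
        cum.append(cum[-1] + haversine_m(p, q))
    total = cum[-1]
    cuts = [0.0]
    while cuts[-1] + segment_m < total:
        cuts.append(cuts[-1] + segment_m)
    cuts.append(total)  # a street shorter than segment_m yields one segment
    return [(point_at(points, cum, a), point_at(points, cum, b))
            for a, b in zip(cuts, cuts[1:])]


# Made-up street of roughly 1.1 km running due north.
street = [(1.30, 103.80), (1.31, 103.80)]
segments = split_street(street)
short_segments = split_street([(1.300, 103.80), (1.301, 103.80)])
print(len(segments), len(short_segments))
```

Each returned pair holds only the start and end coordinates, which matches the straight-line treatment of segments described above.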
Road Type Classification
The package preserves OpenStreetMap road classifications:
trunk: National highways and major arterials
primary: Major roads connecting cities/towns
secondary: Important roads for regional traffic
tertiary: Roads connecting smaller settlements
residential: Roads in residential areas
unclassified: Minor public roads
service: Access roads to buildings/areas
Sampling Methods
Random sampling: Equal probability selection across all segments
Stratified sampling: Maintains proportional representation of road types
Filtered sampling: Restricts sampling to specific road types
Length-based sampling: Targets a specific total coverage distance
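To make the first three methods concrete, here is a minimal sketch using only the standard library. It assumes each segment is a dict with a "type" key; the function names and the toy data are hypothetical and do not reflect the package's actual API.

```python
import random
from collections import Counter, defaultdict


def random_sample(segments, n, seed=None):
    """Draw n segments with equal probability, without replacement."""
    return random.Random(seed).sample(segments, n)


def filtered_sample(segments, n, road_types, seed=None):
    """Draw n segments from the subset whose type is in road_types."""
    pool = [s for s in segments if s["type"] in road_types]
    return random.Random(seed).sample(pool, n)


def stratified_sample(segments, n, seed=None):
    """Draw from each road type in proportion to its share of all segments.

    Rounding the per-type quota means the total can differ slightly from n.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for s in segments:
        strata[s["type"]].append(s)
    sample = []
    for members in strata.values():
        quota = round(n * len(members) / len(segments))
        sample.extend(rng.sample(members, min(quota, len(members))))
    return sample


# Toy sampling frame: 60 residential, 30 primary, 10 trunk segments.
frame = ([{"id": i, "type": "residential"} for i in range(60)]
         + [{"id": 60 + i, "type": "primary"} for i in range(30)]
         + [{"id": 90 + i, "type": "trunk"} for i in range(10)])

simple = random_sample(frame, 20, seed=1)
only_major = filtered_sample(frame, 5, {"primary", "trunk"}, seed=1)
stratified = stratified_sample(frame, 20, seed=1)
counts = Counter(s["type"] for s in stratified)
print(counts)
```

Passing a fixed seed makes a draw reproducible, which helps when a sample has to be documented or regenerated later.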
3. Data Collection Framework
The output provides GPS coordinates and metadata for each sampled segment:
Start/end coordinates: Precise lat/long boundaries for data collection
Road metadata: Type, name, and OpenStreetMap ID for context
Segment ID: Unique identifier for tracking and quality control
These coordinates define the geographic areas where field data collection should occur, ensuring systematic coverage of the road network.
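As a rough illustration of working with such output, the sketch below parses a small CSV and computes a midpoint for each segment. The column names and the two rows are assumptions made up for this example; check the header of your own output file before relying on them.

```python
import csv
import io

# Assumed column names and made-up rows, for illustration only.
SAMPLE_CSV = """segment_id,osm_id,osm_name,osm_type,start_lat,start_long,end_lat,end_long
1,1001,Example Road,residential,1.3000,103.8000,1.3040,103.8010
2,1002,Sample Avenue,primary,1.3100,103.8200,1.3100,103.8244
"""

COORD_COLUMNS = ("start_lat", "start_long", "end_lat", "end_long")


def load_segments(handle):
    """Read segment rows and convert the coordinate columns to floats."""
    rows = []
    for row in csv.DictReader(handle):
        for key in COORD_COLUMNS:
            row[key] = float(row[key])
        rows.append(row)
    return rows


def midpoint(segment):
    """Approximate midpoint of a segment (fine for short distances)."""
    return ((segment["start_lat"] + segment["end_lat"]) / 2,
            (segment["start_long"] + segment["end_long"]) / 2)


segments = load_segments(io.StringIO(SAMPLE_CSV))
for seg in segments:
    lat, lon = midpoint(seg)
    print(seg["segment_id"], seg["osm_type"], f"{lat:.4f}, {lon:.4f}")
```

To read a real file instead of the inline string, pass an open file handle, for example `open("singapore_sample.csv", newline="")`, to `load_segments`.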
Documentation Sections
Example Gallery
🐍 Python API Examples
Learn the programmatic interface with complete examples
💻 CLI Usage Guide
Master the command-line interface for batch processing
🎯 Advanced Sampling
Sophisticated sampling strategies for complex research designs
📚 API Reference
Complete documentation of classes and functions
Support
📖 Documentation: You’re reading it!
🐛 Issues: GitHub Issues
💬 Discussions: GitHub Discussions
License
Released under the MIT License.
Built with ❤️ by Suriyan Laohaprapanon and Gaurav Sood