A reusable geospatial analysis framework for Point-of-Interest (POI) clustering, market saturation analysis, and placement optimization.
Forked from opengeos/python-geospatial and adapted as a Decision Support System for location-based businesses.
Location Intelligence Engine transforms raw POI data into strategic, actionable business insights.
It works with any POI dataset, for example:
- 🏥 Clinics → Identify underserved healthcare areas
- 🏪 Retail stores → Find competitive saturation zones
- ⚡ EV chargers → Optimize charging network placement
- 📚 Schools → Plan educational infrastructure
- 🏨 Hotels → Analyze hospitality market competition
- 🍕 Restaurants → Study dining density patterns
- Any POI type → Applicable to your use case
Core capabilities:
- 🗺️ Spatial preprocessing - Clean, validate, normalize geographic data
- 📍 DBSCAN clustering - Identify geographic concentration zones
- 📊 Market saturation analysis - Competition intensity by location & category
- 🎯 Placement optimization - Recommend best locations for new facilities
- 📈 Decision support - Actionable metrics for business planning
- 🖼️ Interactive visualization - Publication-ready HTML maps + PNG charts
location-intelligence-engine/
├── geo_preprocessing/ # Data cleaning & validation
│ └── clean_poi_data.py
├── spatial_analysis/ # Clustering & distance metrics
│ ├── spatial_clustering.py
│ └── accessibility_analysis.py
├── decision_support/ # Business intelligence layer
│ ├── market_saturation.py
│ ├── competition_analysis.py
│ └── placement_optimizer.py
├── visualization/ # Maps & charts
│ ├── create_heatmaps.py
│ └── generate_charts.py
├── examples/
│ ├── clinic_analysis/ # Healthcare use case
│ ├── template.ipynb # Generic template for any POI
│ └── data/
├── binder/ # Cloud-ready notebooks
└── README.md
Want to see it in action? Check out 4 complete examples across different industries:
- 🏥 Healthcare Clinic Analysis - 106 medical facilities, patient accessibility
- 🏪 Retail Store Analysis - 14 retail locations, market saturation
- ⚡ EV Charging Analysis - 12 charging stations, network coverage
- 🍕 Restaurants Analysis - 15 food venues, competitive dynamics
All use the same framework! Just swap your dataset. Start with any example that matches your domain: 📖 Examples Guide
git clone https://github.com/yourusername/location-intelligence-engine.git
cd location-intelligence-engine/binder/
conda env create -f environment.yml
conda activate geoai
Required columns:
name, category, latitude, longitude, rating (optional), review_count (optional)
Example:
"Clinic A", "Medical Center", 5.234, 100.523, 4.5, 120
"Dental Clinic B", "Dental", 5.240, 100.515, 4.2, 85
from geo_preprocessing import clean_poi_data
from spatial_analysis import spatial_clustering
from decision_support import market_saturation, placement_optimizer
import visualization  # needed for the output calls at the end of this snippet
# Load & clean data
poi_data = clean_poi_data.load_and_validate("your_data.csv")
# Cluster locations
clusters = spatial_clustering.dbscan_cluster(poi_data, eps_km=3, min_points=2)
# Analyze market
saturation = market_saturation.analyze(poi_data, clusters)
underserved = placement_optimizer.find_opportunities(poi_data, clusters)
# Generate outputs
visualization.create_heatmap(poi_data, clusters, output="maps/")
visualization.generate_charts(saturation, output="charts/")
Any dataset with these columns works with this engine:
| Column | Type | Required | Example |
|---|---|---|---|
| `name` | string | ✅ | "City Clinic" |
| `category` | string | ✅ | "Medical Center" |
| `latitude` | float | ✅ | 5.2345 |
| `longitude` | float | ✅ | 100.5234 |
| `rating` | float | ⭕ | 4.5 |
| `review_count` | int | ⭕ | 120 |
| `address` | string | ⭕ | "123 Main St" |
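The column contract above can be checked before any analysis runs. The following is a minimal sketch using only the standard library, not the engine's own loader; `read_poi_rows` is a hypothetical helper name, and it assumes the CSV has a header row with the column names from the table.

```python
import csv
import io

REQUIRED = ("name", "category", "latitude", "longitude")


def read_poi_rows(handle):
    """Parse POI rows from a CSV file handle and coerce column types."""
    reader = csv.DictReader(handle, skipinitialspace=True)
    missing = [col for col in REQUIRED if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")

    rows = []
    for raw in reader:
        row = {
            "name": raw["name"].strip(),
            "category": raw["category"].strip(),
            "latitude": float(raw["latitude"]),
            "longitude": float(raw["longitude"]),
        }
        # Optional columns are converted only when present and non-empty.
        if raw.get("rating"):
            row["rating"] = float(raw["rating"])
        if raw.get("review_count"):
            row["review_count"] = int(raw["review_count"])
        if raw.get("address"):
            row["address"] = raw["address"].strip()
        rows.append(row)
    return rows


# In-memory sample; the second row leaves both optional fields empty.
sample = io.StringIO(
    "name,category,latitude,longitude,rating,review_count\n"
    "City Clinic,Medical Center,5.2345,100.5234,4.5,120\n"
    "Dental Clinic B,Dental,5.240,100.515,,\n"
)
rows = read_poi_rows(sample)
```

Failing early on a missing required column keeps later steps from silently producing empty clusters.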
Output Contract:
All modules return standardized DataFrames:
# Clustering output
clusters: GeoDataFrame with columns
├── name, category, latitude, longitude
├── cluster_id # DBSCAN cluster assignment
├── nearest_distance # km to nearest facility
└── access_score # 0-100 accessibility metric
# Market analysis output
saturation: DataFrame with columns
├── category
├── count # Number of POIs in category
├── market_saturation # % of market
├── competition_level # Low/Medium/High
└── recommendation # ✅ / ⚠️ / 🔴
Data cleaning & validation
from geo_preprocessing import clean_poi_data
# Load CSV with validation
gdf = clean_poi_data.load_and_validate("data.csv")
# Clean specific issues
gdf = clean_poi_data.remove_duplicates(gdf)
gdf = clean_poi_data.validate_coordinates(gdf)
gdf = clean_poi_data.remove_outliers(gdf)
Handles:
- Missing/invalid coordinates
- Duplicate entries
- Invalid lat/lon ranges
- Category standardization
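To make the checks in this list concrete, here is a small dependency-free sketch of the same kinds of rules. It is not the code in `clean_poi_data.py`. The function name `clean_rows`, the duplicate rule (same name at the same rounded coordinates), and the title-case category normalization are all assumptions chosen for illustration.

```python
def clean_rows(rows):
    """Drop rows with unusable coordinates, then drop duplicate POIs."""
    seen = set()
    cleaned = []
    for row in rows:
        lat, lon = row.get("latitude"), row.get("longitude")
        # Missing or non-numeric coordinates cannot be placed on a map.
        if not isinstance(lat, (int, float)) or not isinstance(lon, (int, float)):
            continue
        # Valid ranges: latitude [-90, 90], longitude [-180, 180].
        # NaN fails both comparisons, so it is dropped here as well.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue
        # Assumed normalization: collapse whitespace and title-case the category.
        category = " ".join(str(row.get("category", "")).split()).title()
        # Assumed duplicate rule: same name (case-insensitive) at the same
        # coordinates rounded to 5 decimals (roughly 1 m).
        key = (str(row.get("name", "")).strip().lower(), round(lat, 5), round(lon, 5))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({**row, "category": category})
    return cleaned


raw_rows = [
    {"name": "Clinic A", "category": " medical  center", "latitude": 5.234, "longitude": 100.523},
    {"name": "clinic a", "category": "Medical Center", "latitude": 5.234, "longitude": 100.523},
    {"name": "Bad Lat", "category": "Dental", "latitude": 95.0, "longitude": 100.5},
    {"name": "No Coords", "category": "Dental", "latitude": None, "longitude": 100.5},
]
cleaned_rows = clean_rows(raw_rows)
```

Only the first row survives here: the second is a duplicate, the third has an out-of-range latitude, and the fourth has no latitude.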
from spatial_analysis import spatial_clustering
# DBSCAN clustering (configurable)
clusters = spatial_clustering.dbscan_cluster(
    gdf,
    eps_km=3,      # Cluster radius (km)
    min_points=2   # Min POIs per cluster
)
# Accessibility metrics
accessibility = spatial_clustering.calculate_accessibility(
    gdf,
    service_radius_km=2  # What's "accessible"?
)
Outputs:
- Cluster assignments
- Distance to nearest POI
- Coverage gaps (3km+ underserved areas)
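For readers who want to see what `eps_km` and `min_points` control, the sketch below implements DBSCAN over great-circle (haversine) distances in plain Python. The engine itself relies on scikit-learn's DBSCAN according to the library list, so this is an illustration of the technique and not the project's implementation. The names `haversine_km`, `dbscan`, and `nearest_distance_km` are hypothetical, and the sample coordinates are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def dbscan(points, eps_km=3.0, min_points=2):
    """Label each point with a cluster id; -1 marks noise."""
    n = len(points)
    # Neighborhoods include the point itself, as in the usual DBSCAN definition.
    neighbors = [
        [j for j in range(n) if haversine_km(points[i], points[j]) <= eps_km]
        for i in range(n)
    ]
    labels = [None] * n
    cluster_id = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_points:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            if len(neighbors[j]) >= min_points:
                queue.extend(neighbors[j])  # j is a core point, keep expanding
    return labels


def nearest_distance_km(points):
    """Distance from each point to its closest other point (inf if alone)."""
    return [
        min((haversine_km(p, q) for j, q in enumerate(points) if j != i),
            default=math.inf)
        for i, p in enumerate(points)
    ]


# Three nearby facilities and one isolated facility (illustrative coordinates).
points = [(5.234, 100.523), (5.240, 100.515), (5.236, 100.520), (6.5, 101.5)]
labels = dbscan(points, eps_km=3, min_points=2)
nearest = nearest_distance_km(points)
```

The pairwise neighbor search is quadratic in the number of points, which is fine for a few hundred POIs; larger datasets are better served by a spatial index.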
from spatial_analysis import accessibility_analysis
# Identify underserved locations
gaps = accessibility_analysis.find_coverage_gaps(
    gdf,
    radius_km=3
)
# Grid-based heatmap
heatmap_data = accessibility_analysis.create_coverage_grid(
    gdf,
    grid_size_meters=500
)
from decision_support import market_saturation
# By category
saturation_by_type = market_saturation.analyze_by_category(gdf)
# Output: competition_level, saturation_%, recommendation
# By geography
saturation_by_zone = market_saturation.analyze_by_zone(gdf, clusters)
# Output: zones with low/medium/high competition
Metrics:
- Market saturation % per category
- Competition intensity by location
- Category performance (rating, engagement)
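One plausible reading of "market saturation % per category" is each category's share of all POIs. The sketch below computes that share and maps it to a competition level. It is an assumption-heavy illustration: `saturation_by_category` is a hypothetical name, and treating the `LOW_SATURATION: 5` and `MEDIUM_SATURATION: 10` config values as percentage cutoffs is a guess, not something this README states.

```python
from collections import Counter


def saturation_by_category(categories, low_pct=5.0, medium_pct=10.0):
    """Share of POIs per category, with an assumed Low/Medium/High banding."""
    counts = Counter(categories)
    total = sum(counts.values())
    result = {}
    for category, count in counts.most_common():
        share = 100.0 * count / total
        # Assumed thresholds: below low_pct is Low, below medium_pct is Medium.
        if share < low_pct:
            level = "Low"
        elif share < medium_pct:
            level = "Medium"
        else:
            level = "High"
        result[category] = {
            "count": count,
            "market_saturation": round(share, 1),
            "competition_level": level,
        }
    return result


# Illustrative category column with 100 POIs.
categories = ["General"] * 88 + ["Dental"] * 8 + ["Optical"] * 4
report = saturation_by_category(categories)
```

With these inputs, "General" lands in the High band, "Dental" in Medium, and "Optical" in Low.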
from decision_support import placement_optimizer
# Top 5 recommendations
recommendations = placement_optimizer.find_best_locations(
    gdf,
    clusters,
    count=5,
    criteria=['low_competition', 'high_demand']
)
# Output: lat/lon, nearby_competition, demand_score
from visualization import create_heatmaps
# Main density heatmap
create_heatmaps.density_map(gdf, output="maps/density.html")
# Underserved areas heatmap
create_heatmaps.underserved_map(gdf, clusters, output="maps/gaps.html")
# Category-specific heatmaps
for category in gdf['category'].unique():
    create_heatmaps.category_map(gdf, category, output=f"maps/{category}.html")
from visualization import generate_charts
# Market saturation bar chart
generate_charts.saturation_chart(saturation, output="charts/01_saturation.png")
# Competition heatmap
generate_charts.competition_heatmap(saturation_by_zone, output="charts/02_competition.png")
# Opportunity scatter plot
generate_charts.opportunity_scatter(recommendations, output="charts/03_opportunities.png")
See examples/clinic_analysis/ for a complete workflow:
- 01_load_clean_data.ipynb - Import clinic CSV, validate coordinates
- 02_spatial_clustering.ipynb - Find clinic concentration zones
- 03_market_saturation.ipynb - Analyze competition by clinic type
- 04_placement_recommendations.ipynb - Where to open new clinics
- 05_visualization.ipynb - Generate maps + charts
Output: Strategic recommendations for healthcare expansion.
Customize behavior via config.yaml:
# Spatial parameters
CLUSTER_RADIUS_KM: 3
UNDERSERVED_THRESHOLD_KM: 3
SERVICE_RADIUS_KM: 2
# Saturation thresholds
LOW_SATURATION: 5
MEDIUM_SATURATION: 10
# Visualization
COLOR_SCHEME: 'viridis'
MAP_CENTER: [5.2, 100.5] # Malaysia center
When you run the engine, you get:
outputs/
├── maps/
│ ├── poi_density.html # Interactive heatmap
│ ├── underserved_areas.html # Coverage gaps
│ └── category_heatmaps/ # Per-category maps (13+ files)
├── data/
│ ├── poi_clusters.csv # Cluster assignments
│ ├── market_saturation.csv # Competition by category
│ ├── placement_recommendations.csv # Top 5 locations
│ └── accessibility_analysis.csv # Coverage metrics
├── charts/
│ ├── 01_market_saturation.png # Bar chart
│ ├── 02_competition_intensity.png # Heatmap
│ ├── 03_opportunities.png # Scatter plot
│ └── 04_category_performance.png # Performance metrics
└── reports/
└── EXECUTIVE_SUMMARY.txt # Strategic insights
Core Libraries:
- GeoPandas - Spatial data manipulation
- DBSCAN (scikit-learn) - Clustering algorithm
- Folium - Interactive maps
- Matplotlib/Seaborn - Static charts
- Pandas - Data analysis
Cloud & Notebooks:
- Jupyter - Interactive analysis
- Binder - Cloud notebook environment
- GitHub - Version control
Find underserved areas to open new clinics/hospitals
Optimize store placement to minimize competition
Identify gaps in charging network coverage
Analyze educational facility distribution
Find high-demand areas with low saturation
Pull requests welcome! This engine is designed to be extended:
- Add new clustering algorithms (K-means, hierarchical)
- Extend market analysis (demographic data, traffic patterns)
- New visualization types (3D maps, temporal analysis)
- Domain-specific modules (healthcare demand forecasting, etc.)
MIT License - Fork, adapt, and build on this framework.
Built on the excellent geospatial ecosystem:
- opengeos/python-geospatial - Foundation
- GeoPandas - Spatial data
- Folium - Interactive maps
- scikit-learn - Clustering
Ready to analyze your POI data? Check examples/ to get started. 🚀