
Infrastructure & Monitoring

GeoLens exposes Prometheus-compatible metrics at /metrics and a connectivity health endpoint at /health. The admin web UI surfaces both at /admin/overview along with catalog statistics. This page covers the operator-facing monitoring surface; service-level diagnostics (Docker logs, database size queries) are at the bottom.

Replace https://geolens.example.com with your GeoLens instance’s URL in every example below.

The /metrics endpoint serves Prometheus-format metrics, gzipped and response-buffered, with the scrape path itself excluded from the request histograms so that monitoring traffic does not skew latency data. The endpoint is unauthenticated by default; restrict it with a reverse-proxy IP allowlist or basic auth in production.
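With nginx as the reverse proxy, such an allowlist can be as small as the following sketch (the CIDR and upstream name are placeholders, not GeoLens defaults):

```nginx
location /metrics {
    # Allow only the monitoring network (placeholder CIDR); deny everyone else.
    allow 10.0.0.0/8;
    deny all;
    proxy_pass http://geolens_api;
}
```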

Metric                          Type       Labels                    Description
http_requests_total             counter    method, status, handler   Total HTTP requests served
http_request_duration_seconds   histogram  method, status, handler   Request latency distribution
http_requests_inprogress        gauge      method, handler           In-flight requests
geolens_jobs_queue_depth        gauge      queue                     Pending jobs (Procrastinate status=todo)
geolens_jobs_active             gauge      queue                     Running jobs (Procrastinate status=doing)
geolens_jobs_completed_total    counter    queue                     Completed jobs (since process start)
geolens_jobs_failed_total       counter    queue                     Failed jobs (since process start)
geolens_db_pool_checkedout      gauge      (none)                    Connections currently checked out
geolens_db_pool_checkedin       gauge      (none)                    Connections currently available in pool
geolens_db_pool_overflow        gauge      (none)                    Overflow connections currently open
geolens_db_pool_size            gauge      (none)                    Configured pool size
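These series compose into the usual PromQL dashboard queries. For example (a sketch; the `_bucket` suffix follows the standard Prometheus histogram convention):

```promql
# p95 request latency per handler, over a 5-minute window
histogram_quantile(0.95,
  sum by (le, handler) (rate(http_request_duration_seconds_bucket[5m])))

# Per-queue job failure rate
sum by (queue) (rate(geolens_jobs_failed_total[5m]))
```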

Sample Prometheus scrape configuration:

scrape_configs:
  - job_name: geolens
    metrics_path: /metrics
    static_configs:
      - targets: ['geolens.example.com']

For Grafana dashboards, the geolens_jobs_queue_depth and geolens_db_pool_checkedout series are the most actionable — sustained queue depth above 50 indicates worker undersizing; sustained pool checkout near pool_size indicates DB-connection contention.
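Those thresholds translate directly into Prometheus alerting rules; a sketch follows (the alert names and `for` durations are choices, not GeoLens conventions):

```yaml
groups:
  - name: geolens
    rules:
      - alert: GeoLensQueueBacklog
        expr: geolens_jobs_queue_depth > 50
        for: 10m
        annotations:
          summary: "Sustained job queue depth above 50; workers may be undersized"
      - alert: GeoLensDbPoolExhausted
        expr: geolens_db_pool_checkedout >= geolens_db_pool_size
        for: 5m
        annotations:
          summary: "DB pool fully checked out; connection contention likely"
```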

GET /health returns 200 (healthy) or 503 (degraded), with a JSON body covering each provider:

{
  "status": "healthy",
  "providers": {
    "database": { "status": "ok", "latency_ms": 12.3 },
    "storage": { "status": "ok", "latency_ms": 45.2 },
    "cache": { "status": "ok", "latency_ms": 1.1 }
  }
}
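In a monitoring script, jq can pick this payload apart. The sketch below runs a filter against the sample body; in practice the heredoc would be replaced by a live `curl -s https://geolens.example.com/health`:

```shell
# List providers whose status is not "ok"; prints nothing when all are healthy.
jq -r '.providers | to_entries[] | select(.value.status != "ok") | .key' <<'EOF'
{
  "status": "healthy",
  "providers": {
    "database": { "status": "ok", "latency_ms": 12.3 },
    "storage": { "status": "ok", "latency_ms": 45.2 },
    "cache": { "status": "ok", "latency_ms": 1.1 }
  }
}
EOF
```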

The probes:

  • database — exercises a live SELECT to_regclass('catalog.datasets') (catches hung DB, broken search_path).
  • storage — calls the configured storage provider’s health_check() (S3 HeadBucket or local writability test).
  • cache — calls Valkey/Redis PING.

Use this endpoint as the upstream health check for load balancers and Kubernetes liveness/readiness probes. The 503 response is intentional — it signals “do not route traffic here” without 5xx-class application errors that would page on-call.
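In Kubernetes, that maps onto probes such as the following sketch (port 8001 assumes the probes target the internal API port directly; adjust to your Service wiring):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8001
  periodSeconds: 30
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health
    port: 8001
  periodSeconds: 10
```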

Terminal window
# Basic check
curl -fsS https://geolens.example.com/health || echo "unhealthy"
# Detailed JSON with latency breakdown
curl -s https://geolens.example.com/health | jq

For internal/private endpoints, the FastAPI process exposes the same health check directly at /health on the API port (default :8001 inside the Docker network). Use this when nginx/the frontend container is itself the failure point.

OIDC connectivity validation runs separately from the standard /health probe — IdP discovery URLs are checked on demand rather than on every health poll, since cold-cache IdP fetches add 200–500 ms latency.

Trigger validation via the admin UI:

  1. Navigate to Admin → Settings → Authentication.
  2. Click Validate Connectivity in the toolbar.
  3. The panel reports per-provider latency, status, and any error details (e.g., DNS failure, expired discovery cache, certificate mismatch).

Or via the API:

Terminal window
curl -X POST https://geolens.example.com/api/admin/validate-connectivity \
  -H "Authorization: Bearer $TOKEN"

The response includes one entry per enabled OIDC provider:

{
  "providers": [
    { "slug": "google", "status": "ok", "latency_ms": 142.7 },
    { "slug": "keycloak", "status": "error", "error": "Connection refused" }
  ]
}

Run validation after any of: (1) adding a new OAuth provider, (2) rotating client secrets, (3) network changes affecting outbound HTTPS to IdP endpoints, (4) certificate renewals on self-hosted IdPs.

/admin/overview shows real-time health badges for database, storage, cache, and each enabled OIDC provider, alongside catalog statistics:

  • Total datasets, total bytes, total feature count
  • Recent additions (last 30 days)
  • By-geometry-type breakdown (Point, LineString, Polygon, Raster, etc.)
  • By-visibility breakdown (public, private, restricted)
  • Most-active users (top 10 by upload count, last 30 days)

The health badges poll /health every 30 seconds; the catalog statistics are computed from a materialized view refreshed hourly. For real-time queue/worker metrics, use Prometheus + Grafana (the badges are intentionally coarse-grained).

For programmatic access to the same statistics:

Terminal window
curl https://geolens.example.com/api/admin/stats \
  -H "Authorization: Bearer $TOKEN"

Returns total datasets, recent additions (30 days), total storage bytes, datasets by geometry type, and datasets by visibility.

For database-level size queries:

Terminal window
docker compose exec db psql -U geolens -d geolens -c "
SELECT pg_size_pretty(pg_database_size('geolens')) AS db_size;
"

Per-table sizes (largest datasets first):

Terminal window
docker compose exec db psql -U geolens -d geolens -c "
  SELECT table_name,
         pg_size_pretty(pg_total_relation_size('data.' || table_name)) AS size
  FROM catalog.datasets
  ORDER BY pg_total_relation_size('data.' || table_name) DESC;
"

Every admin action is recorded in the audit log table. Inspect via the UI at Admin → Audit Log (filterable by action, user, resource, date range; supports CSV/JSON export) or via the API:

Terminal window
# All audit logs
curl https://geolens.example.com/api/admin/audit-logs \
  -H "Authorization: Bearer $TOKEN"
# Filter by action
curl "https://geolens.example.com/api/admin/audit-logs?action=dataset.export" \
  -H "Authorization: Bearer $TOKEN"
# Filter by user and date range
curl "https://geolens.example.com/api/admin/audit-logs?user_id={user_id}&date_from=2024-01-01" \
  -H "Authorization: Bearer $TOKEN"

Available audit actions cover:

  • Datasets: dataset.view, dataset.export, metadata.edit
  • Collections: collection.create, collection.update, collection.delete
  • Maps: map.create, map.share, map.revoke_share
  • Features: feature.insert, feature.update, feature.delete
  • Embed tokens: embed_token.create, embed_token.revoke
  • OAuth providers: oauth_provider.create, oauth_provider.update
  • Config operations: config_import, update, reset, probe_service

For long-term retention, archive audit logs to S3 nightly — there is no built-in retention/archival policy.
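A minimal sketch of such an archival job as a cron entry (the token file, bucket name, GNU `date`, and an installed AWS CLI are all assumptions; note that `%` must be escaped in crontab):

```cron
# /etc/cron.d/geolens-audit-archive: nightly at 02:00, export audit logs
# since the previous day via the API and stream them to S3.
0 2 * * * root curl -s "https://geolens.example.com/api/admin/audit-logs?date_from=$(date -d yesterday +\%Y-\%m-\%d)" -H "Authorization: Bearer $(cat /etc/geolens/archive-token)" | aws s3 cp - "s3://example-audit-archive/geolens/$(date -d yesterday +\%Y-\%m-\%d).json"
```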

For service-level debugging beyond the metrics endpoint, use Docker Compose log streaming:

Terminal window
# Follow all logs
docker compose logs -f
# Follow specific service logs
docker compose logs -f api
docker compose logs -f db
docker compose logs -f worker
# Last 100 lines
docker compose logs --tail=100 api

For service health (Docker-level, not application-level):

Terminal window
# View all service statuses
docker compose ps
# Check specific service
docker compose ps db
docker compose ps api

Service health here reflects container restart status and entrypoint health checks — it does not exercise the application’s own provider probes. Use /health for application-level connectivity checks; use docker compose ps for “is the container running.”