Backups & Restore

GeoLens ships an automated backup container that runs on a configurable cron schedule, retains daily and weekly snapshots locally, and optionally replicates to S3-compatible storage. There is no /admin/backups UI page — backups are operated entirely from the Docker Compose CLI and via environment variables.

Replace https://geolens.example.com with your GeoLens instance’s URL in every example below.

The backup service runs as a Docker Compose profile. Enable it once on the host:

Terminal window
docker compose --profile backup up -d

The container runs pg_dump on the schedule defined by BACKUP_SCHEDULE (default: daily at 02:00 UTC), keeps 7 daily snapshots and 4 weekly (Sunday) snapshots, and optionally uploads to S3. The service uses the same database credentials as the API, so no additional configuration is required for the local-only path.

To verify the backup container is running:

Terminal window
docker compose ps backup

The backup container restarts automatically; if it exits non-zero (e.g., disk full, connectivity issue), Docker restarts it and the next scheduled run resumes normally.

| Variable | Default | Purpose |
|---|---|---|
| BACKUP_SCHEDULE | 0 2 * * * | Cron expression for backup execution |
| BACKUP_RETENTION_DAILY | 7 | Number of daily backups to keep locally |
| BACKUP_RETENTION_WEEKLY | 4 | Number of weekly (Sunday) backups to keep locally |
| BACKUP_S3_ENABLED | false | Upload backups to S3 in addition to local storage |
| S3_ENDPOINT, S3_BUCKET, S3_ACCESS_KEY_ID, S3_SECRET_ACCESS_KEY, S3_REGION | | S3 credentials (shared with the storage provider) |

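Set these in .env before bringing up the backup profile. Changes take effect only when the container is recreated; docker compose restart reuses the existing container and its old environment, so re-run up instead:

Terminal window
docker compose --profile backup up -d backup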

The backup service reads the same S3_* credentials as the storage provider — there are no dedicated BACKUP_S3_* bucket/credential overrides. BACKUP_S3_ENABLED is the only backup-specific S3 knob; off-site upload reuses S3_ENDPOINT, S3_BUCKET, S3_ACCESS_KEY_ID, S3_SECRET_ACCESS_KEY, and S3_REGION. To back up to a separate bucket, point those shared S3_* values at it — see Configuration Reference.
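
As an illustration, a .env fragment that enables off-site upload might look like the following. The variable names come from the table above; every S3_* value shown is a placeholder rather than a GeoLens default, and the quoting of BACKUP_SCHEDULE is a stylistic choice so the file can also be sourced by a shell.

```shell
# .env (sketch): backup settings. Replace the S3 placeholders with your own values.

# Documented default: daily at 02:00 UTC.
BACKUP_SCHEDULE="0 2 * * *"
BACKUP_RETENTION_DAILY=7
BACKUP_RETENTION_WEEKLY=4
BACKUP_S3_ENABLED=true

# Shared with the storage provider; the backup service has no separate overrides.
S3_ENDPOINT="https://s3.example.com"
S3_BUCKET="geolens-backups-example"
S3_ACCESS_KEY_ID="REPLACE_ME"
S3_SECRET_ACCESS_KEY="REPLACE_ME"
S3_REGION="us-east-1"
```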

Backups are written to:

  • /backups/daily/<db>_<timestamp>.dump — every scheduled run
  • /backups/weekly/<db>_<timestamp>.dump — Sundays only
  • /backups/{daily,weekly}/staging-<timestamp>.tar.gz — the local upload_staging volume (rasters/COGs/source uploads), tarred alongside each dump with the same timestamp so a restore can recover a working instance, not just an empty-object catalog
  • s3://<bucket>/backups/{daily,weekly}/... — both the dump and the staging archive, when BACKUP_S3_ENABLED=true

The <db> segment is the database name (default geolens). The <timestamp> segment is YYYYMMDD_HHMMSS in UTC. Dumps are PostgreSQL custom-format (pg_dump -Fc --no-owner --no-acl) — restore with pg_restore rather than psql.
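
The naming scheme can be reproduced or parsed in a few lines of bash, which may be handy when scripting around backups that have been copied off the host. This is only a sketch: dump_name and dump_timestamp are hypothetical helper names, not part of GeoLens.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the <db>_<timestamp>.dump naming scheme.

# Build a dump filename for the given database name using the current UTC time.
dump_name() {
  printf '%s_%s.dump\n' "$1" "$(date -u +%Y%m%d_%H%M%S)"
}

# Print the YYYYMMDD_HHMMSS portion of a dump filename, or fail if it is absent.
# Works even when the database name itself contains underscores.
dump_timestamp() {
  local name="${1##*/}"    # drop any directory prefix
  name="${name%.dump}"     # drop the extension
  local ts="${name: -15}"  # the timestamp is always the last 15 characters
  [[ "$ts" =~ ^[0-9]{8}_[0-9]{6}$ ]] || return 1
  printf '%s\n' "$ts"
}
```

For example, `dump_timestamp /backups/daily/geolens_20260101_020000.dump` prints `20260101_020000`, while a name that does not follow the scheme makes the function return a non-zero status.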

Local backups live inside the backup_data Docker volume. Inspect them by execing into the backup container:

Terminal window
docker compose exec backup ls -lh /backups/daily/
docker compose exec backup ls -lh /backups/weekly/

To copy a specific backup off the host:

Terminal window
docker compose cp backup:/backups/daily/geolens_20260101_020000.dump ./
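
Beyond listing files, a freshness check can flag a schedule that has silently stalled. The sketch below assumes the backups have been copied or mounted to a local directory; has_recent_dump and the 26-hour default threshold are illustrative choices, not GeoLens features.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed if DIR holds at least one .dump file modified
# within the last MAX_AGE_MIN minutes (default 1560 = 26 hours, i.e. a daily
# schedule plus some slack).
has_recent_dump() {
  local dir="$1" max_age_min="${2:-1560}"
  [ -n "$(find "$dir" -maxdepth 1 -type f -name '*.dump' \
            -mmin "-$max_age_min" | head -n 1)" ]
}
```

A cron job or monitoring probe could then run something like `has_recent_dump ./backups/daily || echo "no recent daily backup" >&2`.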

The backup container enforces retention on every run. After producing a new backup:

  1. Only the newest BACKUP_RETENTION_DAILY daily backups are kept; older ones are deleted from /backups/daily/.
  2. Only the newest BACKUP_RETENTION_WEEKLY weekly backups are kept; older ones are deleted from /backups/weekly/.
  3. S3 retention is not enforced by GeoLens; configure S3 lifecycle rules on the bucket itself for off-site retention. A 30-day Glacier transition plus a 365-day expiry is a common starting point for compliance-driven deployments.
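
On AWS S3, the lifecycle policy mentioned in the last item could be expressed roughly as follows. This is a sketch under assumptions: the rule ID and both function names are made up for illustration, the backups/ prefix mirrors the upload path listed earlier, and other S3-compatible stores may not support the GLACIER storage class or lifecycle transitions at all.

```shell
#!/usr/bin/env bash
# Emit a lifecycle policy: move objects under backups/ to Glacier after
# 30 days and delete them after 365 days. The rule ID is an arbitrary label.
backup_lifecycle_json() {
  cat <<'JSON'
{
  "Rules": [
    {
      "ID": "geolens-backups-example",
      "Status": "Enabled",
      "Filter": { "Prefix": "backups/" },
      "Transitions": [ { "Days": 30, "StorageClass": "GLACIER" } ],
      "Expiration": { "Days": 365 }
    }
  ]
}
JSON
}

# Apply the policy to the bucket named in $1 using the AWS CLI. Requires
# credentials that are allowed to set the bucket's lifecycle configuration.
apply_backup_lifecycle() {
  aws s3api put-bucket-lifecycle-configuration \
    --bucket "$1" \
    --lifecycle-configuration "$(backup_lifecycle_json)"
}
```

Calling `apply_backup_lifecycle <bucket>` replaces any lifecycle configuration already present on that bucket, so review existing rules first.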

Manual deletion is safe — the container scans the directory on the next run and updates retention based on what’s present. Out-of-band file deletion does not corrupt the backup state.
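
Count-based retention of this kind is easy to reason about. The following bash sketch is not the container's actual implementation; it only illustrates the idea of keeping the newest N dumps by rescanning the directory on every run, relying on the embedded UTC timestamp to make filenames sort chronologically. prune_keep_newest is a hypothetical name.

```shell
#!/usr/bin/env bash
# Illustrative only: delete all but the newest KEEP *.dump files in DIR.
# Assumes every file shares the same <db> prefix, so that a reverse lexical
# sort of the names is newest-first.
prune_keep_newest() {
  local dir="$1" keep="$2"
  find "$dir" -maxdepth 1 -type f -name '*.dump' \
    | sort -r \
    | tail -n "+$((keep + 1))" \
    | while IFS= read -r f; do rm -f -- "$f"; done
}
```

Because the function looks only at what is currently on disk, files removed by hand between runs simply do not appear in the listing, which is why out-of-band deletion is harmless in a scheme like this.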

For one-off backups outside the scheduled cron, run pg_dump directly against the database container:

Terminal window
# Create a custom-format dump (preferred for pg_restore) — mirror the
# automated backup's flags so it restores cleanly via scripts/restore.sh
docker compose exec db pg_dump -U geolens -d geolens -Fc --no-owner --no-acl -f /tmp/geolens_backup.dump
docker compose cp db:/tmp/geolens_backup.dump ./geolens_backup.dump

For a plain SQL backup (useful for grep/inspection):

Terminal window
# -T disables TTY allocation so the redirected output is written unmodified
docker compose exec -T db pg_dump -U geolens -d geolens > geolens_backup.sql

For volume-level snapshots (database files plus WAL):

Terminal window
# Stop services first to ensure a consistent snapshot
docker compose down
# Backup the pgdata volume
docker run --rm -v geolens_pgdata:/data -v "$(pwd)":/backup alpine \
  tar czf /backup/pgdata_backup.tar.gz -C /data .
# Restart
docker compose up -d

Volume snapshots are faster to restore than pg_restore for full-instance recovery but cannot be restored to a different PostgreSQL major version, because the on-disk data format changes between major releases. Use the pg_dump format for long-term archival and cross-version portability.
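
Given that version constraint, it can help to check which PostgreSQL major version a snapshot came from before restoring it. A PostgreSQL data directory records this in a PG_VERSION file at its root. The sketch below assumes the archive was created as shown above (tar czf ... -C /data .) and that the volume root is the data directory itself; snapshot_pg_version is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Print the PostgreSQL major version recorded in a volume snapshot archive.
# Streams only ./PG_VERSION to stdout; nothing is unpacked to disk.
snapshot_pg_version() {
  tar -xzOf "$1" ./PG_VERSION
}
```

Compare the result of `snapshot_pg_version pgdata_backup.tar.gz` with the version of the target database image, for example via `docker compose exec db postgres --version`, before overwriting the volume.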

The canonical restore path is scripts/restore.sh. It validates the dump with pg_restore --list before touching the database, pre-creates the required extensions/schemas/reader role, stops api and worker to prevent write conflicts, runs the restore, then restarts the services via an EXIT trap (even on failure):

Terminal window
# Restores into the running db container from a custom-format dump
./scripts/restore.sh /path/to/geolens_20260101_020000.dump

Internally it runs:

Terminal window
pg_restore -U "$POSTGRES_USER" -d "$POSTGRES_DB" --clean --if-exists --no-owner < "$BACKUP_FILE"

The --clean --if-exists flags drop existing objects first and tolerate objects that are absent on a fresh database (those produce warnings, not errors — the script treats a warning-only nonzero exit as success). --no-owner skips ownership commands so the dump restores cleanly under the geolens role regardless of who created it.
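
As a cheap pre-check before involving the database at all, a custom-format archive can be recognized by its leading signature bytes, PGDMP. This is a sketch and much weaker than the pg_restore --list validation that restore.sh performs; is_custom_dump is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Succeed if FILE starts with the 5-byte "PGDMP" signature that pg_dump
# writes at the beginning of custom-format (-Fc) archives. A plain SQL dump
# or a missing file fails the check.
is_custom_dump() {
  [ "$(head -c 5 "$1" 2>/dev/null)" = "PGDMP" ]
}
```

This catches the common mistake of handing a plain SQL dump to pg_restore, for example `is_custom_dump ./backup.dump || echo "not a custom-format dump; use psql instead" >&2`.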

To restore the DB by hand (e.g. without the wrapper), match those flags:

Terminal window
# Stop services to prevent writes
docker compose stop api worker
# Copy the backup into the database container
docker compose cp ./backup.dump db:/tmp/backup.dump
# Restore (mirror restore.sh's flags)
docker compose exec db pg_restore -U geolens -d geolens \
--clean --if-exists --no-owner /tmp/backup.dump
# Restart
docker compose start api worker

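For partial restores (single table), add --table <name>. pg_restore matches --table against the bare table name and does not accept a schema-qualified name, so select the schema separately with --schema. Objects the table depends on are not restored automatically:

Terminal window
docker compose exec db pg_restore -U geolens -d geolens \
  --clean --if-exists --no-owner --schema catalog --table datasets /tmp/backup.dump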

To restore a volume-level snapshot:

Terminal window
docker compose down
docker run --rm -v geolens_pgdata:/data -v "$(pwd)":/backup alpine \
  sh -c "rm -rf /data/* && tar xzf /backup/pgdata_backup.tar.gz -C /data"
docker compose up -d

Restore validation is currently operator-driven — there is no automated post-restore checksum or integrity verification. Recommended practice:

  1. Restore the most recent daily backup into a staging instance monthly.

  2. Run curl https://staging.geolens.example.com/health and confirm database.status: ok (see Infrastructure & Monitoring for the health probe).

  3. Spot-check 2–3 dataset rows from the catalog API:

    Terminal window
    curl -fsS https://staging.geolens.example.com/api/datasets/ \
      -H "Authorization: Bearer $TOKEN" | jq '.datasets[:3]'
  4. Open a representative dataset in the staging UI and confirm map preview, exports, and metadata render correctly.
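
Steps 2 and 3 lend themselves to a small script. The sketch below assumes the health endpoint returns JSON with a database.status field and that the catalog response contains a datasets array, as the examples in this list suggest; the function names and the STAGING_URL and TOKEN variables are illustrative. It requires curl and jq.

```shell
#!/usr/bin/env bash
# Succeed if the health JSON on stdin reports database.status == "ok".
health_db_ok() {
  jq -e '.database.status == "ok"' >/dev/null
}

# Succeed if the catalog JSON on stdin holds at least N datasets (default 1).
has_datasets() {
  jq -e --argjson n "${1:-1}" '(.datasets | length) >= $n' >/dev/null
}

# Run both checks against a staging instance. STAGING_URL and TOKEN must be
# set by the caller; neither is defined by GeoLens.
verify_staging_restore() {
  curl -fsS "$STAGING_URL/health" | health_db_ok || return 1
  curl -fsS "$STAGING_URL/api/datasets/" \
    -H "Authorization: Bearer $TOKEN" | has_datasets 1
}
```

With `STAGING_URL=https://staging.geolens.example.com` exported, `verify_staging_restore && echo "restore looks healthy"` gives a quick pass/fail signal; the visual check in step 4 still has to be done by a person.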

If a staging restore reveals corruption, the most recent daily backup may be unrecoverable. Use the most recent weekly backup as the candidate for any production restore instead, and investigate the failed daily backup as a separate incident.