Upgrading and backup
Backup Postgres, Neo4j, and the MIM cache. Upgrade Fabrik safely with rollback. Rotate secrets without losing credentials.
Operating Fabrik long-term means two recurring jobs: taking backups you can actually restore, and upgrading without losing data. This page is the operational runbook for both.
What needs backing up
| Data | Lives in | Changes | Recovery cost if lost |
|---|---|---|---|
| PostgreSQL database | fabrik_postgres_data volume | Constantly — every user action | Catastrophic — users, queries, schedules, audit trail gone |
| Neo4j graph | fabrik_neo4j_data volume | Rarely — only on MIM imports | Rebuildable from the MIM registry |
| MIM cache files | fabrik_mim_cache volume | On MIM import | Rebuildable from the MIM registry |
| .env file | Repository working tree | Only when you edit it | Critical — ENCRYPTION_KEY is irreplaceable |
| nginx certs | nginx/ssl/ | On cert rotation | Reissueable |
| RabbitMQ volume | fabrik_rabbitmq_data | Transient — queues drain | None — queues rebuild from AWX webhooks |
| Redis volume | fabrik_redis_data | Transient — cache only | None — cache rebuilds on demand |
In practice, backups focus on Postgres and .env. Everything else is either rebuildable or ephemeral.
Losing ENCRYPTION_KEY is unrecoverable. Every stored APIC password, AWX token, and TOTP secret is Fernet-encrypted with it. Restoring a Postgres dump against a different ENCRYPTION_KEY leaves you with ciphertext no process can decrypt. Back up .env with the same care as Postgres.
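One cheap safeguard, sketched below: store a fingerprint of the key alongside each backup so you can later confirm a dump was taken under the key you still hold. The helper name and paths are illustrative, not part of Fabrik.

```shell
# Hypothetical helper: fingerprint the ENCRYPTION_KEY value so each
# backup can be matched to the key that encrypted its contents.
key_fingerprint() {
  # Hash only the key's value, not the whole .env file
  grep '^ENCRYPTION_KEY=' "$1" | cut -d= -f2- | sha256sum | cut -d' ' -f1
}

# Usage (on the Fabrik host, alongside the daily dump):
#   key_fingerprint /opt/fabrik/.env > /backup/fabrik-key.fingerprint
```

If a restore ever fails to decrypt, comparing fingerprints tells you immediately whether you restored against the wrong key.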
PostgreSQL backup
Daily dump
Run from the host, scheduled in cron:
```shell
docker compose exec -T postgres pg_dump \
  -U fabrik \
  -d fabrik \
  --clean --if-exists \
  | gzip > /backup/fabrik-$(date +%F).sql.gz
```

`-T` disables TTY allocation so cron doesn't hang. `--clean --if-exists` makes the dump idempotent — restoring it drops and recreates objects cleanly.
Retain 14 daily, 8 weekly, 12 monthly copies off-host. Exact retention is a compliance question; the schedule is a defense-in-depth one.
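A pruning sketch for the daily tier, assuming the `fabrik-YYYY-MM-DD.sql.gz` naming from the cron job above. This is age-based deletion only; true weekly/monthly tiering (keeping one dump per week or month) needs more logic than shown here.

```shell
# Delete daily dumps in directory $1 older than $2 days.
# Filenames follow the fabrik-YYYY-MM-DD.sql.gz pattern.
prune_daily() {
  find "$1" -name 'fabrik-*.sql.gz' -mtime +"$2" -delete
}

# Usage (same cron as the dump):
#   prune_daily /backup 14
```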
Restore
Stop the stack first so nothing writes during restore:
```shell
docker compose stop backend celery-worker celery-beat \
  event-consumer-job event-consumer-workflow event-consumer-output

gunzip -c /backup/fabrik-2026-04-22.sql.gz \
  | docker compose exec -T postgres psql -U fabrik -d fabrik

docker compose start backend celery-worker celery-beat \
  event-consumer-job event-consumer-workflow event-consumer-output
```

After restore, verify: log in, open Scheduled Tasks, confirm recent queries are present.
Neo4j backup
Neo4j isn't critical — the MIM graph reimports from the registry on next start if empty. But a backup saves import time and lets you pin a known-good MIM version:
```shell
# Online backup via cypher-shell APOC export (requires APOC plugin)
docker compose exec neo4j \
  cypher-shell -u neo4j -p "$NEO4J_PASSWORD" \
  "CALL apoc.export.cypher.all('/data/neo4j-backup.cypher', {})"
docker cp fabrik-neo4j:/data/neo4j-backup.cypher /backup/
```

Or simpler: stop the container, tar the volume, start it again. Slower but bulletproof.
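That cold path might look like the sketch below. The `/var/lib/docker/volumes/.../_data` location assumes Docker's default local volume driver, and the helper name is illustrative.

```shell
# Archive a directory to a dated tarball in $2.
cold_backup() {
  tar czf "$2/$(basename "$1")-$(date +%F).tar.gz" -C "$1" .
}

# Usage (on the Fabrik host, container stopped around the copy):
#   docker compose stop neo4j
#   cold_backup /var/lib/docker/volumes/fabrik_neo4j_data/_data /backup
#   docker compose start neo4j
```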
.env backup
```shell
# Copy to a secrets vault, not just another directory on the same host
cp /opt/fabrik/.env /secure-backup/fabrik.env.$(date +%F)
```

Store this somewhere you'd store AWS credentials — 1Password, Vault, a sealed secret in your management plane. Anyone with .env can decrypt every credential in the Postgres dump.
Upgrade procedure
Fabrik upgrades are pull-and-restart. The entrypoint runs pending migrations automatically. The procedure:
1. Take a Postgres backup. See above. Don't skip — upgrades run migrations that can be hard to reverse.

2. Back up .env. You'll need it if you have to roll back, and you shouldn't be editing it in place anyway.

3. Pull the new code.

   ```shell
   cd /opt/fabrik
   git fetch --tags
   git checkout v1.2.0   # or whatever the target version is
   ```

4. Diff .env.example against your .env.

   ```shell
   diff <(grep -oE '^[A-Z_]+' .env.example | sort -u) \
        <(grep -oE '^[A-Z_]+' .env | sort -u)
   ```

   Add any new keys the release introduced. Release notes call these out.

5. Rebuild and restart.

   ```shell
   docker compose -f docker-compose.yml -f docker-compose.prod.yml \
     up -d --build
   ```

   Compose rebuilds only what changed. Downtime is usually under a minute — backend restarts, migrations apply, Celery reconnects.

6. Verify. Check `docker compose ps` — every service should be healthy. Hit /api/health/ and log in. Run a saved query. Check the scheduled tasks list.
Rollback
If the upgrade breaks something you can't live with:
```shell
# Stop the new stack
docker compose down

# Restore the old code
git checkout <previous-tag>

# Bring up only Postgres, then restore the pre-upgrade backup into it
docker compose up -d postgres
gunzip -c /backup/fabrik-pre-upgrade.sql.gz \
  | docker compose exec -T postgres psql -U fabrik -d fabrik

# Start the old stack
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```

If the new version applied migrations that the old code doesn't understand, restoring Postgres to the pre-upgrade state is the only safe path — don't try to downgrade Django against a migrated database.
Rotating secrets
DJANGO_SECRET_KEY
Safe to rotate. Invalidates all active sessions and JWTs — users need to log in again. No data loss.
```shell
# Generate new
python -c "from django.core.management.utils import get_random_secret_key; print(get_random_secret_key())"

# Update .env, restart backend + workers + beat + consumers
docker compose restart backend celery-worker celery-beat \
  event-consumer-job event-consumer-workflow event-consumer-output
```

ENCRYPTION_KEY
Not safe to rotate in place. Every encrypted credential in Postgres was encrypted with the old key. To rotate:
1. Go to Settings → Integrations and note every APIC connection, AWX connection, and AI provider key.
2. Delete them from Fabrik (the credentials, not the users/groups).
3. Take a fresh Postgres backup (sanity).
4. Update `ENCRYPTION_KEY` in `.env`.
5. Restart backend, workers, beat, and consumers.
6. Re-enter every credential you noted in step 1.
There is no tooling for in-place key rotation. Rotate only when you have reason to (suspected leak), not on a schedule.
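When you do rotate, the replacement key presumably needs to be Fernet-compatible, since this page notes stored credentials are Fernet-encrypted. A Fernet key is 32 random bytes, urlsafe-base64 encoded, which the Python stdlib can produce; confirm the expected format against your release's documentation before relying on this sketch.

```shell
# Generate a candidate ENCRYPTION_KEY. Fernet keys are 32 random bytes,
# urlsafe-base64 encoded (44 characters) -- assumed format, verify
# against your Fabrik release's documentation before using.
new_key=$(python3 -c "import base64, os; print(base64.urlsafe_b64encode(os.urandom(32)).decode())")
echo "$new_key"
```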
Database passwords
Postgres, Neo4j, and RabbitMQ passwords can be rotated with a slightly longer dance: stop dependent services, change the password inside the database container, update .env, restart everything. Test against a staging environment first — Neo4j in particular can be stubborn if the password doesn't match its stored state.
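For Postgres specifically, the dance might look like the sketch below. The helper only builds a quoted `ALTER ROLE` statement (standard Postgres SQL); the service names in the usage comment follow this page's compose file, and the whole sequence is an assumption to validate in staging.

```shell
# Build an ALTER ROLE statement, doubling single quotes in the password
# so it survives SQL string quoting.
alter_pw_sql() {
  printf "ALTER ROLE %s WITH PASSWORD '%s';" \
    "$1" "$(printf '%s' "$2" | sed "s/'/''/g")"
}

# Usage (sketch, on the Fabrik host):
#   docker compose stop backend celery-worker celery-beat
#   docker compose exec postgres psql -U fabrik -d fabrik \
#     -c "$(alter_pw_sql fabrik "$NEW_PASSWORD")"
#   # update POSTGRES_PASSWORD in .env, then:
#   docker compose up -d
```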
Volume migration
Named volumes (e.g. fabrik_postgres_data) are host-local. Moving Fabrik to a new host means moving the volumes with it:
```shell
# On the old host
docker run --rm -v fabrik_postgres_data:/data -v $(pwd):/backup alpine \
  tar czf /backup/postgres_data.tar.gz -C /data .

# On the new host, after docker compose up has created the volume
docker run --rm -v fabrik_postgres_data:/data -v /transfer:/backup alpine \
  tar xzf /backup/postgres_data.tar.gz -C /data
```

Stop the stack on both ends during the copy. Repeat for fabrik_neo4j_data and fabrik_rabbitmq_data. Don't bother with Redis — it rebuilds itself.
What to monitor
Minimum monitoring for a production install:
- Disk usage on the host — Postgres and audit logs grow.
- `/api/health/` returning 200 at one-minute intervals.
- Container restart count. If something is flapping, the restart count climbs. `docker events` streams them.
- Celery queue depth. Watch the RabbitMQ management UI (for AWX events) and Redis `LLEN celery` (for tasks). Persistent backlog means add workers.
- Scheduled task success rate. Settings → Scheduled Tasks shows the failure streak per task — surface it in your own dashboards if you care.
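The queue-depth check can be wired into cron with a few lines. The threshold and the alert wording below are placeholders; `LLEN celery` matches Celery's default Redis queue name, which is an assumption about this install.

```shell
# Compare current backlog depth against a threshold and report.
backlog_alert() {
  # $1 = current depth, $2 = threshold
  if [ "$1" -gt "$2" ]; then
    echo "ALERT: celery backlog $1 exceeds $2"
  else
    echo "ok: celery backlog $1"
  fi
}

# Usage (from cron on the Fabrik host):
#   depth=$(docker compose exec -T redis redis-cli LLEN celery)
#   backlog_alert "$depth" 100
```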
Fabrik doesn't ship a Prometheus exporter. If you want one, that's a reasonable contribution — Django has django-prometheus and Celery has celery-exporter.