Installation Guide

This guide covers deploying Safe Exam Support (SES) on a single production server, suitable for small to medium implementations.

High Availability Architecture

For larger implementations requiring zero-downtime failover and automatic database recovery, SES also offers a multi-VM HA architecture with Patroni auto-failover, Docker Swarm, and pgBouncer connection pooling. This setup survives individual VM failures and handles exam-start connection bursts gracefully. Contact the project team for more information on HA deployment.


Prerequisites

Server Requirements

Resource   Minimum            Recommended
CPU        4 vCPU             8 vCPU
RAM        8 GB               16 GB
Storage    50 GB SSD          150 GB SSD
OS         Ubuntu 22.04 LTS   Ubuntu 24.04 LTS

Network Requirements

  • Public IP address with ports 80 and 443 open
  • Domain name pointing to the server IP
  • Outbound HTTPS (443) to reach the Let's Encrypt API (HTTP-01 validation itself arrives inbound on port 80)

Software Requirements

  • Docker 24.0+
  • Docker Compose 2.20+
  • Make (optional, for convenience commands)
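
The version minimums above can be checked before you start. The following is a minimal sketch, not part of SES: the helper names `version_ge` and `report` are illustrative, and it assumes GNU `sort -V` (standard on Ubuntu) for version comparison.

```shell
#!/bin/sh
# Preflight sketch: compare installed Docker versions with the minimums above.

# version_ge A B: succeeds when version A >= version B (uses GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# report NAME INSTALLED MINIMUM: print one OK/FAIL line.
report() {
    if version_ge "$2" "$3"; then
        echo "OK   $1 $2 (>= $3)"
    else
        echo "FAIL $1 $2 (need >= $3)"
    fi
}

if command -v docker >/dev/null 2>&1; then
    engine=$(docker version --format '{{.Server.Version}}' 2>/dev/null)
    compose=$(docker compose version --short 2>/dev/null)
    compose=${compose#v}   # strip a leading "v" if present
    report "Docker Engine" "${engine:-0}" 24.0
    report "Docker Compose" "${compose:-0}" 2.20
else
    echo "docker not found; install it in Step 1.2"
fi
```

A plain string comparison would rank 2.9 above 2.20, which is why the sketch relies on version-aware sorting.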

Architecture Overview

Figure 1: Single-server deployment architecture

Container Stack

Service           Image                Purpose
load-balancer     nginx:1.27-alpine    TLS termination, routing
ses-proxy (3x)    Custom               Smart proxy with Lua validation
ses-server (3x)   Custom               Django backend
postgres          postgres:16-alpine   Primary database
pgbouncer         Custom               Connection pooler
redis             redis:7-alpine       Session cache, rate limiting

High Availability Architecture

For larger deployments, SES offers a multi-VM HA architecture with automatic failover:

Figure 2: Multi-VM HA architecture with Patroni auto-failover

Feature              Single Server       HA
VMs                  1                   3
Containers           10                  20
Database             Single PostgreSQL   Patroni cluster (1 leader + 2 replicas)
Failover             Manual restart      Automatic (~10-30s)
Connection pooling   pgBouncer           pgBouncer
Load balancer        Single nginx        3 replicas (Swarm VIP)
Survives             Process crash       1 VM failure

Contact the Project Team

HA deployment requires additional infrastructure planning (Docker Swarm setup, network configuration, backup strategy). Contact the project team for guidance on HA deployment.


Step 1: Prepare the Server

1.1 Update System

apt update && apt upgrade -y

1.2 Install Docker

curl -fsSL https://get.docker.com | sh
# Group membership applies at the next login; not needed when working as root.
usermod -aG docker $USER

1.3 Install Make (Optional)

apt install -y make

1.4 Configure OS Limits

Create a systemd override that raises the open-file limit for the Docker daemon:

mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/ulimit.conf << 'EOF'
[Service]
LimitNOFILE=65535:65535
EOF
systemctl daemon-reload
systemctl restart docker

1.5 Tune Kernel Parameters

cat >> /etc/sysctl.conf << 'EOF'
# SES high-concurrency tuning
net.core.somaxconn=65535
net.ipv4.tcp_max_syn_backlog=65535
net.ipv4.ip_local_port_range=1024 65535
net.ipv4.tcp_max_tw_buckets=200000
net.ipv4.tcp_fin_timeout=15
EOF
sysctl -p
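
To confirm that the kernel accepted the values, the settings file can be compared with the live values. This is a sketch under assumptions: the function names are illustrative, and it simply compares every key=value line in the given file with the output of `sysctl -n`.

```shell
#!/bin/sh
# expected_settings FILE: print "key value" for each key=value line,
# skipping blank lines and comments.
expected_settings() {
    sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$1" |
    while IFS='=' read -r key value; do
        # Trim whitespace around the key and the value.
        key=$(echo "$key" | tr -d '[:space:]')
        value=$(echo "$value" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
        echo "$key $value"
    done
}

# verify_settings FILE: compare each expected value with the live kernel value.
verify_settings() {
    expected_settings "$1" | while read -r key want; do
        # sysctl separates multi-value settings with tabs; normalize to spaces.
        have=$(sysctl -n "$key" 2>/dev/null | tr -s '[:space:]' ' ' | sed 's/ $//')
        if [ "$have" = "$want" ]; then
            echo "OK   $key = $have"
        else
            echo "DIFF $key: want '$want', have '$have'"
        fi
    done
}
```

After sourcing the functions, run `verify_settings /etc/sysctl.conf`; any DIFF line points at a setting that did not take effect.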

Step 2: Clone Repository

cd /srv
git clone https://github.com/xxx/ses.git
cd ses

Step 3: Configure Environment

3.1 Create Environment File

cp .env.example .env.prd

3.2 Edit Configuration

nano .env.prd

Required Variables

Variable                   Description                            Example
SECRET_KEY                 Django secret key (50+ random chars)   openssl rand -hex 32
DB_PASSWORD                PostgreSQL password                    openssl rand -hex 32
REDIS_PASSWORD             Redis password                         openssl rand -hex 32
SES_SESSION_TOKEN_SECRET   HMAC signing key                       openssl rand -hex 32
DJANGO_ALLOWED_HOSTS       Allowed hostnames                      .example.com
CSRF_TRUSTED_ORIGINS       Trusted origins                        https://*.example.com
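
Before deploying, it can help to confirm that none of the required variables was left empty. The check below is a sketch, not an SES tool: `missing_vars` is an illustrative name, and it assumes plain NAME=value lines without `export` prefixes.

```shell
#!/bin/sh
# Required variable names, taken from the table above.
REQUIRED_VARS="SECRET_KEY DB_PASSWORD REDIS_PASSWORD SES_SESSION_TOKEN_SECRET DJANGO_ALLOWED_HOSTS CSRF_TRUSTED_ORIGINS"

# missing_vars FILE: print each required variable that is absent or empty in FILE.
missing_vars() {
    for name in $REQUIRED_VARS; do
        # Take the last assignment and drop quotes so NAME="" counts as empty.
        value=$(sed -n "s/^${name}=//p" "$1" | tail -n 1 | tr -d "\"'")
        [ -n "$value" ] || echo "$name"
    done
}
```

Running `missing_vars .env.prd` should print nothing once the file is complete. Each secret can be generated with `openssl rand -hex 32`, which yields 64 hexadecimal characters.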

Optional Variables

Variable               Default   Description
GUNICORN_WORKERS       16        Worker processes
GUNICORN_THREADS       64        Threads per worker
RATE_LIMIT_GLOBAL      500000    Global rate limit
RATE_LIMIT_INSTITUTE   500000    Per-institute rate limit

Step 4: SSL Certificates

Choose one of the two options below.

Option A: Institute Certificate

If your institution provides a TLS certificate (e.g., from DigiCert, QuoVadis, or an internal CA):

4.A.1 Copy Certificate Files

cp your-institute-fullchain.pem services/load-balancer/certs/cert.pem
cp your-institute-privkey.pem services/load-balancer/certs/key.pem

4.A.2 Verify

openssl x509 -in services/load-balancer/certs/cert.pem -noout -text | grep -i subject

The certificate must cover all subdomains listed in Step 7 (Configure DNS).
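
The subject line alone does not show which hostnames a certificate covers, or whether the key belongs to it. The sketch below checks both. The function names are illustrative, and the `-ext` option assumes OpenSSL 1.1.1 or newer (Ubuntu 22.04 ships 3.0).

```shell
#!/bin/sh
# cert_names FILE: print the DNS names in the certificate's subjectAltName,
# one per line. The -ext option needs OpenSSL 1.1.1 or newer.
cert_names() {
    openssl x509 -in "$1" -noout -ext subjectAltName |
        tr ',' '\n' | sed -n 's/^[[:space:]]*DNS://p'
}

# name_covered HOST: read certificate names on stdin; succeed if HOST equals
# one of them or matches a wildcard, which stands for exactly one label.
name_covered() {
    while read -r pattern; do
        case $pattern in
            "*."*) [ "$1" != "${1#*.}" ] && [ "${1#*.}" = "${pattern#??}" ] && return 0 ;;
            *)     [ "$1" = "$pattern" ] && return 0 ;;
        esac
    done
    return 1
}

# uncovered_hosts FILE HOST...: print every HOST the certificate does not cover.
uncovered_hosts() {
    cert=$1; shift
    for h in "$@"; do
        cert_names "$cert" | name_covered "$h" || echo "$h"
    done
}

# key_matches CERT KEY: succeed when the private key belongs to the certificate.
key_matches() {
    [ "$(openssl x509 -in "$1" -noout -pubkey | openssl sha256)" = \
      "$(openssl pkey -in "$2" -pubout | openssl sha256)" ]
}
```

For example, `uncovered_hosts services/load-balancer/certs/cert.pem example.com www.example.com ems1.example.com ems2.example.com admin.example.com manuals.example.com` should print nothing. Note that a wildcard such as `*.example.com` does not cover the bare `example.com`.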

When the certificate is renewed by your institute, repeat 4.A.1 and redeploy:

make prod-deploy

Option B: Let's Encrypt (Free, Auto-Renewing)

4.B.1 Install Certbot

apt install -y certbot

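4.B.2 Obtain Certificate

The --standalone authenticator binds port 80, and Let's Encrypt must resolve every listed name to this server. Create the DNS records from Step 7 first, and make sure nothing else is listening on port 80.
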
certbot certonly --standalone -d example.com \
    -d www.example.com \
    -d ems1.example.com \
    -d ems2.example.com \
    -d admin.example.com \
    -d manuals.example.com

4.B.3 Copy Certificates

cp /etc/letsencrypt/live/example.com/fullchain.pem \
    services/load-balancer/certs/cert.pem
cp /etc/letsencrypt/live/example.com/privkey.pem \
    services/load-balancer/certs/key.pem

4.B.4 Auto-Renewal

crontab -e

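Add the following line. The --deploy-hook command runs only after a successful renewal, so the services are not redeployed when nothing changed:

0 3 * * * certbot renew --quiet --deploy-hook "cp /etc/letsencrypt/live/example.com/fullchain.pem /srv/ses/services/load-balancer/certs/cert.pem && cp /etc/letsencrypt/live/example.com/privkey.pem /srv/ses/services/load-balancer/certs/key.pem && cd /srv/ses && make prod-deploy"

Note: a certificate issued with --standalone is renewed the same way, so certbot needs port 80 during renewal. If the load balancer is listening on port 80, renewal fails unless you free the port for the duration (for example with --pre-hook "docker stop prod-load-balancer" and --post-hook "docker start prod-load-balancer") or switch to the webroot authenticator.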
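
The hook logic in the cron line can also live in a script, which is easier to read and test. This is a sketch under assumptions: the script path `/srv/ses/scripts/cert-deploy-hook.sh` is hypothetical, and the script relies on certbot exporting `RENEWED_LINEAGE` (the `live/<name>` directory) to deploy hooks.

```shell
#!/bin/sh
# Hypothetical hook script, e.g. saved as /srv/ses/scripts/cert-deploy-hook.sh.
# certbot exports RENEWED_LINEAGE (the live/<name> directory) to deploy hooks.

# install_cert LINEAGE_DIR DEST_DIR: copy the renewed chain and key into place.
install_cert() {
    install -m 644 "$1/fullchain.pem" "$2/cert.pem" &&
    install -m 600 "$1/privkey.pem" "$2/key.pem"
}

# Only act when certbot invoked the script; plain execution does nothing.
if [ -n "${RENEWED_LINEAGE:-}" ]; then
    install_cert "$RENEWED_LINEAGE" /srv/ses/services/load-balancer/certs &&
        cd /srv/ses && make prod-deploy
fi
```

With the script made executable, the crontab entry shortens to `0 3 * * * certbot renew --quiet --deploy-hook /srv/ses/scripts/cert-deploy-hook.sh`.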

Step 5: Deploy

5.1 Start Services

make prod-deploy

Or manually:

set -a; source .env.prd; set +a
docker compose --env-file .env.prd -f docker-compose.prod.yml up -d --build

5.2 Verify Deployment

make prod-ps

All 10 containers should show (healthy) status:

NAME                 STATUS
prod-load-balancer   Up 2 minutes (healthy)
prod-pgbouncer       Up 2 minutes (healthy)
prod-postgres        Up 2 minutes (healthy)
prod-redis           Up 2 minutes (healthy)
prod-ses-proxy-1     Up 2 minutes (healthy)
prod-ses-proxy-2     Up 2 minutes (healthy)
prod-ses-proxy-3     Up 2 minutes (healthy)
prod-ses-server-1    Up 2 minutes (healthy)
prod-ses-server-2    Up 2 minutes (healthy)
prod-ses-server-3    Up 2 minutes (healthy)
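
With ten containers, a filter that prints only the ones that are not yet healthy is easier to scan than the full listing. This is a sketch: `not_healthy` is an illustrative helper, and the usage line assumes the `prod-` name prefix shown above.

```shell
#!/bin/sh
# not_healthy: read "name|status" lines on stdin and print the names whose
# status does not contain "(healthy)". "(unhealthy)" and "(health: starting)"
# are both reported.
not_healthy() {
    awk -F '|' '$2 !~ /\(healthy\)/ { print $1 }'
}

# Usage:
#   docker ps -a --filter name=prod- --format '{{.Names}}|{{.Status}}' | not_healthy
```

Empty output means every matching container reports (healthy).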

5.3 Test Endpoints

curl https://ems1.example.com/health
curl https://admin.example.com/health

Expected response: {"status":"healthy"}
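
Right after a deployment the endpoints may need some time before they answer. The sketch below polls until the expected response appears. The helper names, the 5-second interval, and the default of 12 attempts are assumptions, not SES settings.

```shell
#!/bin/sh
# is_healthy BODY: succeed when the JSON body reports "status": "healthy",
# tolerating optional whitespace around the colon.
is_healthy() {
    printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"healthy"'
}

# wait_healthy URL [TRIES]: poll URL every 5 seconds until it reports healthy
# or the attempts run out.
wait_healthy() {
    tries=${2:-12}
    while [ "$tries" -gt 0 ]; do
        if is_healthy "$(curl -fsS --max-time 5 "$1" 2>/dev/null)"; then
            return 0
        fi
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 5
    done
    return 1
}
```

For example: `wait_healthy https://ems1.example.com/health && echo "ems1 is up"`.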


Step 6: Initialize Database

6.1 Run Migrations

docker exec prod-ses-server-1 python manage.py migrate

6.2 Create Superuser

docker exec -it prod-ses-server-1 python manage.py createsuperuser

Follow the prompts to create an admin account.


Step 7: Configure DNS

Add DNS records for each subdomain:

Subdomain             Type    Value
example.com           A       Server IP
www.example.com       CNAME   example.com
ems1.example.com      CNAME   example.com
ems2.example.com      CNAME   example.com
admin.example.com     CNAME   example.com
manuals.example.com   CNAME   example.com

Step 8: Deploy Manuals

The manuals site runs as a standalone stack, separate from the main SES services.

8.1 DNS Configuration

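If you already created the manuals.example.com CNAME record in Step 7, no further record is needed. Otherwise, add an A record (a name cannot carry both a CNAME and an A record):

Subdomain             Type   Value
manuals.example.com   A      Server IP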

8.2 SSL Certificate

Choose one of the two options below (must match your choice in Step 4).

Option A: Institute Certificate

If you chose Option A in Step 4, the manuals site is served through the main load balancer, which already has your institute certificate. No additional certificate is needed — skip to 8.3.

Option B: Let's Encrypt

If you chose Option B in Step 4 and want the manuals site on its own domain with its own certificate:

# Create certbot directories
mkdir -p services/manuals/certbot/conf services/manuals/certbot/www

# Request certificate (dry-run first)
docker run --rm -v "/srv/ses/services/manuals/certbot/conf:/etc/letsencrypt" \
    -v "/srv/ses/services/manuals/certbot/www:/var/www/certbot" \
    certbot/certbot certonly --webroot \
    -w /var/www/certbot \
    -d manuals.example.com \
    --email admin@example.com \
    --agree-tos \
    --no-eff-email \
    --dry-run

# If dry-run succeeds, run for real
docker run --rm -v "/srv/ses/services/manuals/certbot/conf:/etc/letsencrypt" \
    -v "/srv/ses/services/manuals/certbot/www:/var/www/certbot" \
    certbot/certbot certonly --webroot \
    -w /var/www/certbot \
    -d manuals.example.com \
    --email admin@example.com \
    --agree-tos \
    --no-eff-email

8.3 Deploy Manuals Stack

cd /srv/ses/services/manuals
make prod-up

Or manually, from the repository root:

cd /srv/ses
docker compose -f services/manuals/docker-compose.manuals.prod.yml up -d --build

8.4 Verify

curl https://manuals.example.com/

Expected: HTML page with "Safe Exam Support Manual" title.

8.5 Auto-Renewal (Let's Encrypt Only)

If you chose Option B, add to crontab:

0 3 * * * cd /srv/ses && docker compose -f services/manuals/docker-compose.manuals.prod.yml run --rm certbot renew && docker compose -f services/manuals/docker-compose.manuals.prod.yml restart nginx

If you chose Option A, no cron job is needed for the manuals site: it uses the institute certificate on the main load balancer, so repeat Step 4.A.1 and redeploy whenever that certificate is renewed.

8.6 Makefile Commands Reference

Command          Description
make up          Build + start manuals (local)
make down        Stop manuals
make rebuild     Force rebuild + recreate
make ps          Show container status
make logs        Tail logs
make prod-up     Start production stack
make prod-down   Stop production stack
make prod-ps     Show production status
make prod-logs   Tail production logs

Updating

Pull Latest Code

cd /srv/ses
git pull origin main

Redeploy

make prod-deploy

This rebuilds and restarts all containers with zero downtime (rolling update).


Backup

Database Backup

docker exec prod-postgres pg_dump -U ses_user ses_db > backup_$(date +%Y%m%d).sql

Automated Backups

Add to crontab (the /srv/backups directory must already exist):

0 2 * * * cd /srv/ses && docker exec prod-postgres pg_dump -U ses_user ses_db | gzip > /srv/backups/ses_$(date +\%Y\%m\%d).sql.gz
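
Nightly dumps accumulate, and a truncated dump is easy to miss. The sketch below adds an integrity check and a simple retention rule. Both helper names and the retention period are assumptions; choose a period that matches your own policy.

```shell
#!/bin/sh
# verify_backup FILE: succeed when the gzip stream is intact and not empty.
verify_backup() {
    gzip -t "$1" 2>/dev/null && [ "$(gzip -dc "$1" | head -c 1 | wc -c)" -gt 0 ]
}

# prune_backups DIR DAYS: delete ses_*.sql.gz files in DIR that were last
# modified more than DAYS days ago. Other files are left alone.
prune_backups() {
    find "$1" -maxdepth 1 -type f -name 'ses_*.sql.gz' -mtime +"$2" -delete
}
```

For example, `verify_backup "/srv/backups/ses_$(date +%Y%m%d).sql.gz" && prune_backups /srv/backups 14` removes old dumps only after the newest one has passed the check.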

Monitoring

View Logs

make prod-logs-server   # Django logs
make prod-logs-proxy    # Proxy access logs
make prod-logs-lb       # Load balancer logs

Health Check

curl https://ems1.example.com/health

Troubleshooting

Container Not Starting

docker logs prod-ses-server-1 --tail 50

Database Connection Errors

docker exec prod-pgbouncer cat /var/log/pgbouncer/pgbouncer.log

SSL Certificate Issues (Let's Encrypt)

certbot certificates
certbot renew --dry-run

Makefile Commands Reference

Command                 Description
make prod-deploy        Full deployment (pull + build + restart)
make prod-ps            List containers with status
make prod-logs-server   Django server logs
make prod-logs-proxy    Proxy access logs
make prod-logs-lb       Load balancer logs
make prod-test          Run tests inside container
make prod-shell         Django shell
make load-test          Run load test

Security Checklist

  • [ ] Changed all default passwords in .env.prd
  • [ ] Generated new SECRET_KEY
  • [ ] Generated new SES_SESSION_TOKEN_SECRET
  • [ ] SSL certificates installed and auto-renewing
  • [ ] Firewall configured (only 80 and 443 open to the public; SSH restricted to trusted addresses)
  • [ ] Database backups configured
  • [ ] Audit log retention set appropriately
  • [ ] Rate limits configured for your traffic