feat(docker): bundle Ollama and wire AI helper in compose

- Add ollama and ollama-init services with ollama_data volume; app waits for the model pull and receives AI_* defaults (AI_BASE_URL=http://ollama:11434). - Document bundled stack, env vars, and Ollama vs host base URL in README and DOCKER_COMPOSE_SETUP. - Align env.example AI defaults with the compose stack. - fix(ai): include api_key_set on AIProviderConfig so from_settings(**get_ai_config()) matches the settings dict.
2026-05-23 06:40:53 -05:00 · 2026-04-27 21:54:03 +02:00
parent 5d4e693a2b
commit bf4c34ff83
5 changed files with 90 additions and 10 deletions
@@ -709,7 +709,7 @@ docker-compose -f docker/docker-compose.remote.yml up -d
 TimeTracker includes an optional **server-side AI helper** for the web app and API clients.

 - **Enable**: set `AI_ENABLED=true`
- **Ollama (default)**: set `AI_PROVIDER=ollama`, `AI_BASE_URL=http://127.0.0.1:11434`, `AI_MODEL=...`
+- **Ollama (default)**: set `AI_PROVIDER=ollama`, `AI_MODEL=...`, and `AI_BASE_URL` to `http://ollama:11434` when using the bundled stack in `docker-compose.yml`, or `http://127.0.0.1:11434` when Ollama runs on the host outside Docker.
 - **Hosted OpenAI-compatible**: set `AI_PROVIDER=openai_compatible` and `AI_API_KEY=...`

 The AI helper is exposed as:
@@ -717,6 +717,17 @@ The AI helper is exposed as:
 - Session web UI JSON: `POST /api/ai/chat` (same-origin, login required)
 - REST API v1: `POST /api/v1/ai/chat` (API token required, scopes `read:ai`/`write:ai`)

+### Bundled Ollama service (Docker Compose)
+
+`docker-compose.yml` ships a CPU-only `ollama` service plus a one-shot `ollama-init` sidecar that pulls the model defined by `AI_MODEL` (default `llama3.1`, ~4.7 GB) on first boot. The `app` service waits for the pull to finish before starting.
+
+- The app reaches it at `http://ollama:11434` over the Docker network — no host ports need to be opened.
+- Change the model by setting `AI_MODEL` in `.env` (e.g. `AI_MODEL=qwen2.5:3b` for lighter hardware) and re-running `docker compose up -d`; the init sidecar will pull the new model.
+- Pulled models are cached in the `ollama_data` named volume, so subsequent boots are instant.
+- Verify in the UI: **Admin → System Settings → AI helper**, then click *Test connection*.
+- To pull additional models manually: `docker compose exec ollama ollama pull <model>`.
+- To use a hosted provider instead, set `AI_PROVIDER=openai_compatible`, `AI_BASE_URL=https://api.your-provider.example/`, `AI_API_KEY=…` in `.env`; the in-cluster Ollama can stay running or be removed.
+
 ### Encrypting stored secrets (recommended)

 To store sensitive settings (OAuth secrets, mail password, AI API key, Peppol token, 2FA secret) encrypted at rest, set:
@@ -42,6 +42,7 @@ class AIProviderConfig:
    base_url: str
    model: str
    api_key: str
+    api_key_set: bool
    timeout_seconds: int
    context_limit: int
    system_prompt: str
@@ -58,7 +59,7 @@ class AIProviderConfig:
            "provider": self.provider,
            "base_url": self.base_url,
            "model": self.model,
-            "api_key_set": bool(self.api_key),
+            "api_key_set": self.api_key_set,
            "timeout_seconds": self.timeout_seconds,
            "context_limit": self.context_limit,
        }
@@ -89,6 +89,14 @@ services:
      - OTEL_METRICS_EXPORT_INTERVAL_MS=${OTEL_METRICS_EXPORT_INTERVAL_MS:-60000}
      - ENABLE_TELEMETRY=${ENABLE_TELEMETRY:-false}
      - TELE_SALT=${TELE_SALT:-8f4a7b2e9c1d6f3a5e8b4c7d2a9f6e3b1c8d5a7f2e9b4c6d3a8f5e1b7c4d9a2f}
+      # AI helper (Ollama in-cluster by default; override AI_PROVIDER for hosted providers)
+      - AI_ENABLED=${AI_ENABLED:-true}
+      - AI_PROVIDER=${AI_PROVIDER:-ollama}
+      - AI_BASE_URL=${AI_BASE_URL:-http://ollama:11434}
+      - AI_MODEL=${AI_MODEL:-llama3.1}
+      - AI_API_KEY=${AI_API_KEY:-}
+      - AI_TIMEOUT_SECONDS=${AI_TIMEOUT_SECONDS:-60}
+      - AI_CONTEXT_LIMIT=${AI_CONTEXT_LIMIT:-40}

    # Expose only internally; nginx publishes ports
    ports: []
@@ -99,6 +107,8 @@ services:
    depends_on:
      db:
        condition: service_healthy
+      ollama-init:
+        condition: service_completed_successfully
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "-s", "-o", "/dev/null", "http://localhost:8080/_health"]
@@ -107,6 +117,47 @@ services:
      retries: 3
      start_period: 40s

+  # Ollama - local LLM runtime for the TimeTracker AI helper
+  # The app talks to it via the OpenAI-compatible endpoint at /v1/chat/completions.
+  # First boot pulls the model defined by AI_MODEL (default llama3.1, ~4.7 GB).
+  ollama:
+    image: ollama/ollama:latest
+    container_name: timetracker-ollama
+    environment:
+      - OLLAMA_HOST=0.0.0.0:11434
+      - OLLAMA_KEEP_ALIVE=${OLLAMA_KEEP_ALIVE:-5m}
+    volumes:
+      - ollama_data:/root/.ollama
+    # Internal-only by default; uncomment to expose for host tools.
+    # ports:
+    #   - "11434:11434"
+    healthcheck:
+      test: ["CMD-SHELL", "ollama list >/dev/null 2>&1 || exit 1"]
+      interval: 15s
+      timeout: 5s
+      retries: 10
+      start_period: 30s
+    restart: unless-stopped
+
+  # One-shot model puller; runs to completion on each `up` (no-op if model already cached).
+  ollama-init:
+    image: ollama/ollama:latest
+    container_name: timetracker-ollama-init
+    depends_on:
+      ollama:
+        condition: service_healthy
+    environment:
+      - OLLAMA_HOST=http://ollama:11434
+    entrypoint: ["/bin/sh","-c"]
+    command:
+      - |
+        set -e
+        MODEL="${AI_MODEL:-llama3.1}"
+        echo "Pulling Ollama model: $$MODEL"
+        ollama pull "$$MODEL"
+        echo "Model ready: $$MODEL"
+    restart: "no"
+
  db:
    image: postgres:16-alpine
    container_name: timetracker-db
@@ -222,4 +273,6 @@ volumes:
  loki_data:
    driver: local
  redis_data:
+    driver: local
+  ollama_data:
    driver: local
@@ -52,7 +52,8 @@ docker-compose logs app --tail=100
 ### 4) Optional services
 - Reverse proxy (HTTPS): See `docker-compose.yml` (services `certgen` and `nginx`).
  - **Note**: The `certgen` service is now self-contained and works with Portainer and other container orchestration tools without requiring host filesystem mounts.
- Monitoring stack: Prometheus, Grafana, Loki, Promtail are available in `docker-compose.yml`.
+- Monitoring stack: Prometheus, Grafana, Loki, Promtail are available in `docker-compose.yml` (commented out by default; uncomment services to enable).
+- **Ollama (bundled LLM)**: The root `docker-compose.yml` includes `ollama` and a one-shot `ollama-init` container that pulls `AI_MODEL` into the `ollama_data` volume. The `app` service defaults to `AI_BASE_URL=http://ollama:11434` and waits for `ollama-init` to succeed before starting. Set `AI_ENABLED=false` in `.env` to turn off the in-app AI helper without removing the containers. Details: [README.md](../../../README.md) (sections *AI Helper* and *Bundled Ollama service*).

 ---

@@ -148,6 +149,18 @@ All environment variables can be provided via `.env` and are consumed by the `ap
 - LOG_LEVEL: Default: `INFO`.
 - LOG_FILE: Default: `/data/logs/timetracker.log` or `/app/logs/timetracker.log` based on compose.

+### AI helper (optional)
+Used by the server-side AI helper (`/api/ai/*`, Admin → System Settings). In the root `docker-compose.yml`, defaults target the bundled `ollama` service.
+
+- AI_ENABLED: Enable the AI helper. Default in root compose: `true` (override with `false` if you do not want LLM calls).
+- AI_PROVIDER: `ollama` or `openai_compatible`. Default: `ollama`.
+- AI_BASE_URL: Provider base URL without a trailing path. Default in root compose: `http://ollama:11434` (Docker service name). For Ollama on the host: `http://127.0.0.1:11434`.
+- AI_MODEL: Model tag (e.g. `llama3.1`, `qwen2.5:3b`). Pulled automatically on startup by `ollama-init` when using the bundled stack.
+- AI_API_KEY: Required when `AI_PROVIDER=openai_compatible`. Empty for Ollama.
+- AI_TIMEOUT_SECONDS: HTTP timeout for provider requests. Default in root compose: `60`.
+- AI_CONTEXT_LIMIT: Max recent time entries included in context. Default: `40`.
+- OLLAMA_KEEP_ALIVE: Passed to the `ollama` service (how long models stay loaded). Default: `5m`.
+
 ### Analytics & Telemetry (optional)
 - SENTRY_DSN: Sentry DSN (empty by default).
 - SENTRY_TRACES_RATE: 0.0–1.0 sampling rate. Default: `0.0`.
@@ -40,13 +40,15 @@ IDLE_TIMEOUT_MINUTES=30
 # SMART_NOTIFY_SCHEDULER_SLOT_MINUTES=30

 # AI helper (server-side; API keys are never sent to clients)
-# AI_ENABLED=false
-# AI_PROVIDER=ollama
-# AI_BASE_URL=http://127.0.0.1:11434
-# AI_MODEL=llama3.1
-# AI_API_KEY=
-# AI_TIMEOUT_SECONDS=30
-# AI_CONTEXT_LIMIT=40
+# Defaults below match the bundled Ollama service in docker-compose.yml.
+AI_ENABLED=true
+AI_PROVIDER=ollama
+AI_BASE_URL=http://ollama:11434
+AI_MODEL=llama3.1
+# AI_API_KEY=          # only needed when AI_PROVIDER=openai_compatible
+AI_TIMEOUT_SECONDS=60
+AI_CONTEXT_LIMIT=40
+# OLLAMA_KEEP_ALIVE=5m  # how long Ollama keeps the model resident in memory

 # API token rate limits (per token; Redis recommended for multi-worker)
 # API_TOKEN_RATE_LIMIT_PER_MINUTE=100