feat(ui): runtime settings (#7320)

* feat(ui): add watchdog settings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Do not re-read env

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Some refactor, move other settings to runtime (p2p)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add API Keys handling

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Allow to disable runtime settings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Documentation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* show MCP toggle in index

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop context default

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Committed by Ettore Di Giacinto on 2025-11-20 22:37:20 +01:00 (merged via GitHub).
Parent 53d51671d7, commit 2dd42292dc.
30 changed files with 2245 additions and 438 deletions.


@@ -48,12 +48,15 @@ curl http://localhost:8080/v1/chat/completions -d '{"model": "model-b", ...}'
For more flexible memory management, LocalAI provides watchdog mechanisms that automatically unload models based on their activity state. This allows multiple models to be loaded simultaneously, but automatically frees memory when models become inactive or stuck.
> **Note:** Watchdog settings can be configured via the [Runtime Settings]({{%relref "features/runtime-settings#watchdog-settings" %}}) web interface, which allows you to adjust settings without restarting the application.
### Idle Watchdog
The idle watchdog monitors models that haven't been used for a specified period and automatically unloads them to free VRAM.
#### Configuration
Via environment variables or CLI:
```bash
LOCALAI_WATCHDOG_IDLE=true ./local-ai
@@ -62,12 +65,15 @@ LOCALAI_WATCHDOG_IDLE=true LOCALAI_WATCHDOG_IDLE_TIMEOUT=10m ./local-ai
./local-ai --enable-watchdog-idle --watchdog-idle-timeout=10m
```
Via web UI: Navigate to Settings → Watchdog Settings and enable "Watchdog Idle Enabled" with your desired timeout.
### Busy Watchdog
The busy watchdog monitors models that have been processing requests for an unusually long time and terminates them if they exceed a threshold. This is useful for detecting and recovering from stuck or hung backends.
#### Configuration
Via environment variables or CLI:
```bash
LOCALAI_WATCHDOG_BUSY=true ./local-ai
@@ -76,6 +82,8 @@ LOCALAI_WATCHDOG_BUSY=true LOCALAI_WATCHDOG_BUSY_TIMEOUT=10m ./local-ai
./local-ai --enable-watchdog-busy --watchdog-busy-timeout=10m
```
Via web UI: Navigate to Settings → Watchdog Settings and enable "Watchdog Busy Enabled" with your desired timeout.
### Combined Configuration
You can enable both watchdogs simultaneously for comprehensive memory management:
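For instance, a combined setup might look like the following (the timeout values are illustrative; tune them to your workload):

```bash
LOCALAI_WATCHDOG_IDLE=true LOCALAI_WATCHDOG_IDLE_TIMEOUT=15m \
LOCALAI_WATCHDOG_BUSY=true LOCALAI_WATCHDOG_BUSY_TIMEOUT=5m \
./local-ai

# Equivalent CLI flags:
./local-ai --enable-watchdog-idle --watchdog-idle-timeout=15m \
           --enable-watchdog-busy --watchdog-busy-timeout=5m
```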


@@ -32,6 +32,7 @@ LocalAI provides a comprehensive set of features for running AI models locally.
- **[Stores](stores/)** - Vector similarity search for embeddings
- **[Model Gallery](model-gallery/)** - Browse and install pre-configured models
- **[Backends](backends/)** - Learn about available backends and how to manage them
- **[Runtime Settings](runtime-settings/)** - Configure application settings via web UI without restarting
## Getting Started


@@ -33,12 +33,18 @@ Navigate the WebUI interface in the "Models" section from the navbar at the top.
## Add other galleries
You can add other galleries by:
1. **Using the Web UI**: Navigate to the [Runtime Settings]({{%relref "features/runtime-settings#gallery-settings" %}}) page and configure galleries through the interface.
2. **Using Environment Variables**: Set the `GALLERIES` environment variable. The `GALLERIES` environment variable is a list of JSON objects, where each object has a `name` and a `url` field. The `name` field is the name of the gallery, and the `url` field is the URL of the gallery's index file, for example:
```json
GALLERIES=[{"name":"<GALLERY_NAME>", "url":"<GALLERY_URL>"}]
```
3. **Using Configuration Files**: Add galleries to `runtime_settings.json` in the `LOCALAI_CONFIG_DIR` directory.
The models in the gallery will be automatically indexed and available for installation.
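As a sketch of the configuration-file option (3), the same gallery list can live in `runtime_settings.json`; the entry shown here is the default LocalAI gallery index:

```json
{
  "galleries": [
    {
      "url": "github:mudler/LocalAI/gallery/index.yaml@master",
      "name": "localai"
    }
  ]
}
```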
## API Reference


@@ -0,0 +1,180 @@
+++
disableToc = false
title = "⚙️ Runtime Settings"
weight = 25
url = '/features/runtime-settings'
+++
LocalAI provides a web-based interface for managing application settings at runtime. These settings can be configured through the web UI and are automatically persisted to a configuration file, allowing changes to take effect immediately without requiring a restart.
## Accessing Runtime Settings
Navigate to the **Settings** page from the management interface at `http://localhost:8080/manage`. The settings page provides a comprehensive interface for configuring various aspects of LocalAI.
## Available Settings
### Watchdog Settings
The watchdog monitors backend activity and can automatically stop idle or overly busy models to free up resources.
- **Watchdog Enabled**: Master switch to enable/disable the watchdog
- **Watchdog Idle Enabled**: Enable stopping backends that are idle longer than the idle timeout
- **Watchdog Busy Enabled**: Enable stopping backends that are busy longer than the busy timeout
- **Watchdog Idle Timeout**: Duration threshold for idle backends (default: `15m`)
- **Watchdog Busy Timeout**: Duration threshold for busy backends (default: `5m`)
Changes to watchdog settings are applied immediately by restarting the watchdog service.
### Backend Configuration
- **Single Backend**: Allow only one backend to run at a time
- **Parallel Backend Requests**: Enable backends to handle multiple requests in parallel if supported
### Performance Settings
- **Threads**: Number of threads used for parallel computation (recommended: number of physical cores)
- **Context Size**: Default context size for models (default: `512`)
- **F16**: Enable GPU acceleration using 16-bit floating point
### Debug and Logging
- **Debug Mode**: Enable debug logging (deprecated, use log-level instead)
### API Security
- **CORS**: Enable Cross-Origin Resource Sharing
- **CORS Allow Origins**: Comma-separated list of allowed CORS origins
- **CSRF**: Enable CSRF protection middleware
- **API Keys**: Manage API keys for authentication (one per line or comma-separated)
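Once API keys are configured, clients authenticate with the OpenAI-style `Authorization: Bearer` header. A usage sketch (endpoint and model name are illustrative):

```bash
# $API_KEY must be one of the configured keys.
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "model-name", "messages": [{"role": "user", "content": "Hello"}]}'
```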
### P2P Settings
Configure peer-to-peer networking for distributed inference:
- **P2P Token**: Authentication token for P2P network
- **P2P Network ID**: Network identifier for P2P connections
- **Federated Mode**: Enable federated mode for P2P network
Changes to P2P settings automatically restart the P2P stack with the new configuration.
### Gallery Settings
Manage model and backend galleries:
- **Model Galleries**: JSON array of gallery objects with `url` and `name` fields
- **Backend Galleries**: JSON array of backend gallery objects
- **Autoload Galleries**: Automatically load model galleries on startup
- **Autoload Backend Galleries**: Automatically load backend galleries on startup
## Configuration Persistence
All settings are automatically saved to `runtime_settings.json` in the `LOCALAI_CONFIG_DIR` directory (default: `BASEPATH/configuration`). This file is watched for changes, so modifications made directly to the file will also be applied at runtime.
## Environment Variable Precedence
Environment variables take precedence over settings configured via the web UI or configuration files. If a setting is controlled by an environment variable, it cannot be modified through the web interface. The settings page will indicate when a setting is controlled by an environment variable.
The precedence order is:
1. **Environment variables** (highest priority)
2. **Configuration files** (`runtime_settings.json`, `api_keys.json`)
3. **Default values** (lowest priority)
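For example, starting LocalAI with a watchdog timeout pinned by environment variable (value illustrative) makes that field read-only in the settings page, regardless of what `runtime_settings.json` contains:

```bash
LOCALAI_WATCHDOG_IDLE=true LOCALAI_WATCHDOG_IDLE_TIMEOUT=30m ./local-ai
```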
## Example Configuration
The `runtime_settings.json` file follows this structure:
```json
{
"watchdog_enabled": true,
"watchdog_idle_enabled": true,
"watchdog_busy_enabled": false,
"watchdog_idle_timeout": "15m",
"watchdog_busy_timeout": "5m",
"single_backend": false,
"parallel_backend_requests": true,
"threads": 8,
"context_size": 2048,
"f16": false,
"debug": false,
"cors": true,
"csrf": false,
"cors_allow_origins": "*",
"p2p_token": "",
"p2p_network_id": "",
"federated": false,
"galleries": [
{
"url": "github:mudler/LocalAI/gallery/index.yaml@master",
"name": "localai"
}
],
"backend_galleries": [
{
"url": "github:mudler/LocalAI/backend/index.yaml@master",
"name": "localai"
}
],
"autoload_galleries": true,
"autoload_backend_galleries": true,
"api_keys": []
}
```
## API Keys Management
API keys can be managed through the runtime settings interface. Keys can be entered one per line or comma-separated.
**Important Notes:**
- API keys from environment variables are always included and cannot be removed via the UI
- Runtime API keys are stored in `runtime_settings.json`
- For backward compatibility, API keys can also be managed via `api_keys.json`
- Empty arrays will clear all runtime API keys (but preserve environment variable keys)
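For reference, a minimal sketch of the legacy `api_keys.json` file, assuming it is a plain JSON array of key strings (the keys shown are placeholders):

```json
[
  "sk-example-key-1",
  "sk-example-key-2"
]
```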
## Dynamic Configuration
The runtime settings system supports dynamic configuration file watching. When `LOCALAI_CONFIG_DIR` is set, LocalAI monitors the following files for changes:
- `runtime_settings.json` - Unified runtime settings
- `api_keys.json` - API keys (for backward compatibility)
- `external_backends.json` - External backend configurations
Changes to these files are automatically detected and applied without requiring a restart.
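Because the directory is watched, a change can be applied by editing the file directly. A minimal sketch, assuming `LOCALAI_CONFIG_DIR` points at `./configuration` (a running instance would pick the change up without a restart):

```bash
# Assumed directory; match it to your --localai-config-dir setting.
export LOCALAI_CONFIG_DIR=./configuration
mkdir -p "$LOCALAI_CONFIG_DIR"

# Rewrite the watched settings file; only the debug flag is shown
# here for brevity (see the full example structure above).
cat > "$LOCALAI_CONFIG_DIR/runtime_settings.json" <<'EOF'
{
  "debug": true
}
EOF
```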
## Best Practices
1. **Use Environment Variables for Production**: For production deployments, use environment variables for critical settings to ensure they cannot be accidentally changed via the web UI.
2. **Backup Configuration Files**: Before making significant changes, consider backing up your `runtime_settings.json` file.
3. **Monitor Resource Usage**: When enabling watchdog features, monitor your system to ensure the timeout values are appropriate for your workload.
4. **Secure API Keys**: API keys are sensitive information. Ensure proper file permissions on configuration files (they should be readable only by the LocalAI process).
5. **Test Changes**: Some settings (like watchdog timeouts) may require testing to find optimal values for your specific use case.
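Point 4 can be sketched as follows, run inside your `LOCALAI_CONFIG_DIR` (filenames from this page):

```bash
# Restrict the settings and key files to the user running LocalAI.
touch runtime_settings.json api_keys.json
chmod 600 runtime_settings.json api_keys.json

# Verify: mode 600 means owner read/write only (GNU stat syntax).
stat -c '%a %n' runtime_settings.json api_keys.json
```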
## Troubleshooting
### Settings Not Applying
If settings are not being applied:
1. Check if the setting is controlled by an environment variable
2. Verify the `LOCALAI_CONFIG_DIR` is set correctly
3. Check file permissions on `runtime_settings.json`
4. Review application logs for configuration errors
### Watchdog Not Working
If the watchdog is not functioning:
1. Ensure "Watchdog Enabled" is turned on
2. Verify at least one of the idle or busy watchdogs is enabled
3. Check that timeout values are reasonable for your workload
4. Review logs for watchdog-related messages
### P2P Not Starting
If P2P is not starting:
1. Verify the P2P token is set (non-empty)
2. Check network connectivity
3. Ensure the P2P network ID matches across nodes (if using federated mode)
4. Review logs for P2P-related errors


@@ -24,7 +24,7 @@ Complete reference for all LocalAI command-line interface (CLI) parameters and e
| `--models-path` | `BASEPATH/models` | Path containing models used for inferencing | `$LOCALAI_MODELS_PATH`, `$MODELS_PATH` |
| `--generated-content-path` | `/tmp/generated/content` | Location for assets generated by backends (e.g. stablediffusion, images, audio, videos) | `$LOCALAI_GENERATED_CONTENT_PATH`, `$GENERATED_CONTENT_PATH` |
| `--upload-path` | `/tmp/localai/upload` | Path to store uploads from files API | `$LOCALAI_UPLOAD_PATH`, `$UPLOAD_PATH` |
| `--localai-config-dir` | `BASEPATH/configuration` | Directory for dynamic loading of certain configuration files (currently runtime_settings.json, api_keys.json, and external_backends.json). See [Runtime Settings]({{%relref "features/runtime-settings" %}}) for web-based configuration. | `$LOCALAI_CONFIG_DIR` |
| `--localai-config-dir-poll-interval` | | Time duration to poll the LocalAI Config Dir if your system has broken fsnotify events (example: `1m`) | `$LOCALAI_CONFIG_DIR_POLL_INTERVAL` |
| `--models-config-file` | | YAML file containing a list of model backend configs (alias: `--config-file`) | `$LOCALAI_MODELS_CONFIG_FILE`, `$CONFIG_FILE` |
@@ -80,6 +80,7 @@ For more information on VRAM management, see [VRAM and Memory Management]({{%rel
| `--upload-limit` | `15` | Default upload-limit in MB | `$LOCALAI_UPLOAD_LIMIT`, `$UPLOAD_LIMIT` |
| `--api-keys` | | List of API Keys to enable API authentication. When this is set, all requests must be authenticated with one of these API keys | `$LOCALAI_API_KEY`, `$API_KEY` |
| `--disable-webui` | `false` | Disables the web user interface. When set to true, the server will only expose API endpoints without serving the web interface | `$LOCALAI_DISABLE_WEBUI`, `$DISABLE_WEBUI` |
| `--disable-runtime-settings` | `false` | Disables the runtime settings feature. When set to true, the server will not load runtime settings from the `runtime_settings.json` file and the settings web interface will be disabled | `$LOCALAI_DISABLE_RUNTIME_SETTINGS`, `$DISABLE_RUNTIME_SETTINGS` |
| `--disable-gallery-endpoint` | `false` | Disable the gallery endpoints | `$LOCALAI_DISABLE_GALLERY_ENDPOINT`, `$DISABLE_GALLERY_ENDPOINT` |
| `--disable-metrics-endpoint` | `false` | Disable the `/metrics` endpoint | `$LOCALAI_DISABLE_METRICS_ENDPOINT`, `$DISABLE_METRICS_ENDPOINT` |
| `--machine-tag` | | If not empty, add that string to Machine-Tag header in each response. Useful to track response from different machines using multiple P2P federated nodes | `$LOCALAI_MACHINE_TAG`, `$MACHINE_TAG` |