feat: Add Agentic MCP support with a new chat/completion endpoint (#6381)
* WIP - add endpoint
* Rename
* Wire the Completion API
* Try to make it functional
* Almost functional
* Bump golang versions used in tests
* Add description of the tool
* Make it working
* Small optimizations
* Cleanup/refactor
* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
committed by GitHub
parent 6b2c8277c2
commit 60b6472fa0
docs/content/docs/features/mcp.md (new file, 200 lines)
@@ -0,0 +1,200 @@
+++
title = "Model Context Protocol (MCP)"
weight = 20
toc = true
description = "Agentic capabilities with Model Context Protocol integration"
tags = ["MCP", "Agents", "Tools", "Advanced"]
categories = ["Features"]
icon = "plug"
+++

# Model Context Protocol (MCP) Support

LocalAI now supports the **Model Context Protocol (MCP)**, enabling powerful agentic capabilities by connecting AI models to external tools and services. This feature allows your LocalAI models to interact with various MCP servers, providing access to real-time data, APIs, and specialized tools.

## What is MCP?

The Model Context Protocol is a standard for connecting AI models to external tools and data sources. It enables AI agents to:

- Access real-time information from external APIs
- Execute commands and interact with external systems
- Use specialized tools for specific tasks
- Maintain context across multiple tool interactions

## Key Features

- **🔄 Real-time Tool Access**: Connect to external MCP servers for live data
- **🛠️ Multiple Server Support**: Configure both remote HTTP and local stdio servers
- **⚡ Cached Connections**: Efficient tool caching for better performance
- **🔒 Secure Authentication**: Support for bearer token authentication
- **🎯 OpenAI Compatible**: Exposes an OpenAI-style API at the `/mcp/v1/chat/completions` endpoint

## Configuration

MCP support is configured in your model's YAML configuration file using the `mcp` section:

```yaml
name: my-agentic-model
backend: llama-cpp
parameters:
  model: qwen3-4b.gguf

# MCP Configuration
mcp:
  remote: |
    {
      "mcpServers": {
        "weather-api": {
          "url": "https://api.weather.com/v1",
          "token": "your-api-token"
        },
        "search-engine": {
          "url": "https://search.example.com/mcp",
          "token": "your-search-token"
        }
      }
    }

  stdio: |
    {
      "mcpServers": {
        "file-manager": {
          "command": "python",
          "args": ["-m", "mcp_file_manager"],
          "env": {
            "API_KEY": "your-key"
          }
        },
        "database-tools": {
          "command": "node",
          "args": ["database-mcp-server.js"],
          "env": {
            "DB_URL": "postgresql://localhost/mydb"
          }
        }
      }
    }
```

### Configuration Options

#### Remote Servers (`remote`)
Configure HTTP-based MCP servers:

- **`url`**: The MCP server endpoint URL
- **`token`**: Bearer token for authentication (optional)

#### STDIO Servers (`stdio`)
Configure local command-based MCP servers:

- **`command`**: The executable command to run
- **`args`**: Array of command-line arguments
- **`env`**: Environment variables (optional)

## Usage

### API Endpoint

Use the MCP-enabled completion endpoint:

```bash
curl http://localhost:8080/mcp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-agentic-model",
    "messages": [
      {"role": "user", "content": "What is the current weather in New York?"}
    ],
    "temperature": 0.7
  }'
```

### Example Response

```json
{
  "id": "chatcmpl-123",
  "created": 1699123456,
  "model": "my-agentic-model",
  "choices": [
    {
      "text": "The current weather in New York is 72°F (22°C) with partly cloudy skies. The humidity is 65% and there's a light breeze from the west at 8 mph."
    }
  ],
  "object": "text_completion"
}
```

## Example Configurations

### Docker-based Tools

```yaml
name: docker-agent
backend: llama-cpp
parameters:
  model: qwen3-4b.gguf

mcp:
  stdio: |
    {
      "mcpServers": {
        "searxng": {
          "command": "docker",
          "args": [
            "run", "-i", "--rm",
            "quay.io/mudler/tests:duckduckgo-localai"
          ]
        }
      }
    }
```

## How It Works

1. **Tool Discovery**: LocalAI connects to configured MCP servers and discovers available tools
2. **Tool Caching**: Tools are cached per model for efficient reuse
3. **Agent Execution**: The AI model uses the [Cogito](https://github.com/mudler/cogito) framework to execute tools
4. **Response Generation**: The model generates responses incorporating tool results (see the request sketch below)
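
Because this loop runs entirely server-side, a client sends a plain chat completion request; unlike standard function calling, no `tools` array is passed in the body. A minimal sketch, with an illustrative model name and prompt:

```python
import requests

# The agent loop (tool discovery, execution, result injection) happens inside
# LocalAI, so this request body is an ordinary chat completion payload.
resp = requests.post(
    "http://localhost:8080/mcp/v1/chat/completions",
    json={
        "model": "my-agentic-model",
        "messages": [
            {"role": "user", "content": "What is the current weather in New York?"}
        ],
    },
    timeout=120,  # agentic runs may chain several tool calls, so allow headroom
)
resp.raise_for_status()
print(resp.json())  # the reply already incorporates any tool results
```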

## Supported MCP Servers

LocalAI is compatible with any MCP-compliant server.

## Best Practices

### Security
- Use environment variables for sensitive tokens (see the sketch after this list)
- Validate MCP server endpoints before deployment
- Implement proper authentication for remote servers
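
For example, keep credentials out of committed files by reading them from the environment. A minimal client-side sketch, where `LOCALAI_API_KEY` is an illustrative variable name rather than one LocalAI defines:

```python
import os

import openai

# Read the secret from the environment instead of hardcoding it in source.
client = openai.OpenAI(
    base_url="http://localhost:8080/mcp/v1",
    api_key=os.environ["LOCALAI_API_KEY"],
)
```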

### Performance
- Cache frequently used tools
- Use appropriate timeout values for external APIs (see the sketch after this list)
- Monitor resource usage for stdio servers
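
Since an agentic request may chain several tool calls, it can take much longer than a single completion. With the OpenAI Python client you can raise the timeout per request; the 300-second value below is illustrative:

```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/mcp/v1",
    api_key="your-api-key",
)

# with_options() applies settings to this request only; other calls keep the
# client defaults.
response = client.with_options(timeout=300.0).chat.completions.create(
    model="my-agentic-model",
    messages=[{"role": "user", "content": "Summarize today's top stories"}],
)
```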

### Error Handling
- Implement fallback mechanisms for tool failures (see the sketch after this list)
- Log tool execution for debugging
- Handle network timeouts gracefully
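
A minimal client-side sketch of these points, using the OpenAI Python client's exception types (the fallback behavior is illustrative):

```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/mcp/v1",
    api_key="your-api-key",
    timeout=120.0,
)

try:
    response = client.chat.completions.create(
        model="my-agentic-model",
        messages=[{"role": "user", "content": "What is the weather in New York?"}],
    )
except openai.APITimeoutError:
    # A tool call or the model run took too long; degrade gracefully.
    print("MCP request timed out; falling back to a non-agentic answer")
except openai.APIError as err:
    # Log enough context to identify which tool or server misbehaved.
    print(f"MCP request failed: {err}")
```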

### With External Applications

Use MCP-enabled models in your applications:

```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/mcp/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="my-agentic-model",
    messages=[
        {"role": "user", "content": "Analyze the latest research papers on AI"}
    ]
)
```
@@ -30,6 +30,7 @@ LocalAI is more than just a single tool - it's a complete ecosystem:

1. **[LocalAI Core](https://github.com/mudler/LocalAI)**
   - OpenAI-compatible API
   - Multiple model support (LLMs, image, audio)
   - Model Context Protocol (MCP) for agentic capabilities
   - No GPU required
   - Fast inference with native bindings
   - [Github repository](https://github.com/mudler/LocalAI)
@@ -78,6 +79,7 @@ For more detailed installation options and configurations, see our [Getting Star

- **Vision API**: Image understanding and analysis
- **Embeddings**: Vector database support
- **Functions**: OpenAI-compatible function calling
- **MCP Support**: Model Context Protocol for agentic capabilities
- **P2P**: Distributed inference capabilities

## Community and Support