API Reference
Readur provides a comprehensive REST API for integrating with external systems and building custom workflows.
Base URL
http://localhost:8000/api
For production deployments, replace this with your configured domain and serve the API over HTTPS.
Authentication
Readur uses JWT (JSON Web Token) authentication. Include the token in the Authorization header:
Authorization: Bearer <jwt_token>
Obtaining a Token
POST /api/auth/login
Content-Type: application/json
{
"username": "admin",
"password": "your_password"
}
Response:
{
"token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"user": {
"id": 1,
"username": "admin",
"email": "admin@example.com",
"role": "admin"
}
}
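As a minimal sketch, a client could turn the login response above into request headers as follows. The helper name `auth_headers` and the sample token are hypothetical; only the `token` field and the `Authorization: Bearer` scheme come from this reference.

```python
import json


def auth_headers(login_response_body: str) -> dict:
    """Build request headers from the JSON body returned by POST /api/auth/login."""
    data = json.loads(login_response_body)
    token = data["token"]  # field name taken from the response shown above
    return {"Authorization": f"Bearer {token}"}


# Made-up, shortened token used only for illustration.
sample_body = '{"token": "abc.def.ghi", "user": {"id": 1, "username": "admin"}}'
headers = auth_headers(sample_body)
```

The resulting dictionary can be passed unchanged to whichever HTTP client is in use.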
Error Handling
All API errors follow a consistent format:
{
"error": {
"code": "VALIDATION_ERROR",
"message": "Invalid request parameters",
"details": {
"field": "email",
"reason": "Invalid email format"
}
}
}
Common HTTP status codes:
- 200: Success
- 201: Created
- 400: Bad Request
- 401: Unauthorized
- 403: Forbidden
- 404: Not Found
- 422: Validation Error
- 500: Internal Server Error
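One way a client might map the error envelope above onto an exception is sketched below. The `ApiError` class, the `parse_error` helper, and the `UNKNOWN` fallback code are illustrative names, not part of the Readur API; the fallback covers bodies that do not match the documented format.

```python
import json


class ApiError(Exception):
    """Client-side error built from the documented error envelope."""

    def __init__(self, status, code, message, details=None):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message
        self.details = details or {}


def parse_error(status: int, body: str) -> ApiError:
    """Convert an error response body into an ApiError.

    Falls back to a generic error when the body is not the documented envelope.
    """
    try:
        err = json.loads(body)["error"]
        return ApiError(status, err["code"], err["message"], err.get("details"))
    except (ValueError, KeyError, TypeError):
        # "UNKNOWN" is a client-side placeholder, not a code the API defines.
        return ApiError(status, "UNKNOWN", body[:200])
```

Callers can then branch on `error.code` or `error.status` instead of re-parsing the response body.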
Rate Limiting
API requests are rate-limited to prevent abuse:
- Authenticated users: 1000 requests per hour
- Unauthenticated users: 100 requests per hour
Rate limit headers:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1640995200
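The headers above can drive a simple client-side backoff. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but this reference does not state; `seconds_until_reset` is a hypothetical helper.

```python
import time
from typing import Optional


def seconds_until_reset(headers: dict, now: Optional[float] = None) -> float:
    """Return how long to wait before the next request, or 0 if quota remains.

    Assumes X-RateLimit-Reset is a Unix timestamp in seconds.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers["X-RateLimit-Reset"])
    current = time.time() if now is None else now
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - current)
```

A caller would typically `time.sleep()` for the returned duration before retrying.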
Endpoints
Authentication Endpoints
Register New User
POST /api/auth/register
Content-Type: application/json
{
"username": "john_doe",
"email": "john@example.com",
"password": "secure_password"
}
Login
POST /api/auth/login
Content-Type: application/json
{
"username": "john_doe",
"password": "secure_password"
}
Get Current User
GET /api/auth/me
Authorization: Bearer <jwt_token>
OIDC Login (Redirect)
GET /api/auth/oidc/login
Redirects to the configured OIDC provider for authentication.
OIDC Callback
GET /api/auth/oidc/callback?code=<auth_code>&state=<state>
Handles the callback from the OIDC provider and issues a JWT token.
Logout
POST /api/auth/logout
Authorization: Bearer <jwt_token>
Document Endpoints
Upload Document
POST /api/documents
Authorization: Bearer <jwt_token>
Content-Type: multipart/form-data
file: <binary_file_data>
tags: ["invoice", "2024"] # Optional
Response:
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"filename": "invoice_2024.pdf",
"mime_type": "application/pdf",
"size": 1048576,
"uploaded_at": "2024-01-01T00:00:00Z",
"ocr_status": "pending"
}
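For clients that cannot rely on a third-party HTTP library, the multipart body for the upload request can be assembled by hand. This is a sketch using only the standard library; `encode_multipart` is a hypothetical helper, and it sends only the `file` field because this reference does not specify how `tags` is encoded within the form.

```python
import uuid


def encode_multipart(field_name: str, filename: str, content: bytes, content_type: str):
    """Encode a single file as multipart/form-data.

    Returns (body, content_type_header). The filename is assumed to contain
    no quotes or line breaks.
    """
    boundary = uuid.uuid4().hex
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",                          # blank line separates part headers from content
        content,
        f"--{boundary}--".encode(),   # closing boundary
        b"",
    ])
    return body, f"multipart/form-data; boundary={boundary}"
```

The returned header value goes into `Content-Type`, alongside the usual `Authorization` header, when posting the body to `/api/documents`.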
List Documents
GET /api/documents?limit=50&offset=0&sort=-uploaded_at
Authorization: Bearer <jwt_token>
Query parameters:
- limit: Number of results (default: 50, max: 100)
- offset: Pagination offset
- sort: Sort field (prefix with - for descending)
- mime_type: Filter by MIME type
- ocr_status: Filter by OCR status
- tag: Filter by tag
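The query parameters above can be assembled programmatically. The following sketch builds list URLs and page offsets; `documents_url` and `page_offsets` are hypothetical helpers, and `page_offsets` assumes the caller already knows how many documents exist, since this reference does not show the list response.

```python
from urllib.parse import urlencode

MAX_LIMIT = 100  # maximum page size stated in this reference


def documents_url(base_url, limit=50, offset=0, sort="-uploaded_at", **filters):
    """Build a list-documents URL; filters may be mime_type, ocr_status, or tag."""
    params = {"limit": min(limit, MAX_LIMIT), "offset": offset, "sort": sort}
    params.update({k: v for k, v in filters.items() if v is not None})
    return f"{base_url}/documents?{urlencode(params)}"


def page_offsets(total, limit=50):
    """Return the offset of each page needed to cover `total` documents."""
    return range(0, total, limit)


# Offsets for walking 120 documents in pages of 50.
urls = [documents_url("http://localhost:8000/api", offset=o) for o in page_offsets(120)]
```

Clamping `limit` on the client avoids sending a page size the server is documented to reject or truncate.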
Get Document Details
GET /api/documents/{id}
Authorization: Bearer <jwt_token>
Download Document
GET /api/documents/{id}/download
Authorization: Bearer <jwt_token>
Delete Document
DELETE /api/documents/{id}
Authorization: Bearer <jwt_token>
Update Document
PATCH /api/documents/{id}
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"tags": ["invoice", "paid", "2024"]
}
Get Document Debug Information
GET /api/documents/{id}/debug
Authorization: Bearer <jwt_token>
Response:
{
"document_id": "550e8400-e29b-41d4-a716-446655440000",
"processing_pipeline": {
"upload": "completed",
"ocr_queue": "completed",
"ocr_processing": "completed",
"validation": "completed"
},
"ocr_details": {
"confidence": 89.5,
"word_count": 342,
"processing_time": 4.2
},
"file_info": {
"mime_type": "application/pdf",
"size": 1048576,
"pages": 3
}
}
Get Document Thumbnail
GET /api/documents/{id}/thumbnail
Authorization: Bearer <jwt_token>
Get Document OCR Text
GET /api/documents/{id}/ocr
Authorization: Bearer <jwt_token>
Get Document Processed Image
GET /api/documents/{id}/processed-image
Authorization: Bearer <jwt_token>
View Document in Browser
GET /api/documents/{id}/view
Authorization: Bearer <jwt_token>
Get Failed Documents
GET /api/documents/failed?limit=50&offset=0
Authorization: Bearer <jwt_token>
Query parameters:
- limit: Number of results (default: 50)
- offset: Pagination offset
- stage: Filter by failure stage
- reason: Filter by failure reason
View Failed Document
GET /api/documents/failed/{id}/view
Authorization: Bearer <jwt_token>
Get Duplicate Documents
GET /api/documents/duplicates?limit=50&offset=0
Authorization: Bearer <jwt_token>
Delete Low Confidence Documents
POST /api/documents/delete-low-confidence
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"confidence_threshold": 70.0,
"preview_only": false
}
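Because this endpoint deletes documents, a cautious client might default to a preview. The sketch below builds the request without sending it. It assumes `preview_only: true` performs a dry run, which the field name suggests but this reference does not spell out; `low_confidence_request` is a hypothetical helper.

```python
import json
from urllib.request import Request


def low_confidence_request(base_url, token, threshold, preview_only=True):
    """Build (but do not send) a delete-low-confidence request.

    preview_only defaults to True here as a client-side safety choice.
    """
    body = json.dumps({
        "confidence_threshold": float(threshold),
        "preview_only": preview_only,
    }).encode("utf-8")
    return Request(
        f"{base_url}/documents/delete-low-confidence",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request can be sent with `urllib.request.urlopen(req)`; passing `preview_only=False` explicitly makes the destructive call a deliberate step.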
Delete Failed OCR Documents
POST /api/documents/delete-failed-ocr
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"preview_only": false
}
Bulk Delete Documents
DELETE /api/documents
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"document_ids": ["550e8400-e29b-41d4-a716-446655440000", "..."]
}
Search Endpoints
Search Documents
GET /api/search?query=invoice&limit=20
Authorization: Bearer <jwt_token>
Query parameters:
- query: Search query (required)
- limit: Number of results
- offset: Pagination offset
- mime_types: Comma-separated MIME types
- tags: Comma-separated tags
- date_from: Start date (ISO 8601)
- date_to: End date (ISO 8601)
Response:
{
"results": [
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"filename": "invoice_2024.pdf",
"snippet": "...invoice for services rendered in Q1 2024...",
"score": 0.95,
"highlights": ["invoice", "2024"]
}
],
"total": 42,
"limit": 20,
"offset": 0
}
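A sketch of building a search URL with the comma-separated filters and deciding whether another page is needed. `search_url` and `has_more` are hypothetical helpers; the parameter names and the `total`, `offset`, and `results` fields come from the request and response shown above.

```python
from urllib.parse import urlencode


def search_url(base_url, query, limit=20, offset=0, mime_types=(), tags=(),
               date_from=None, date_to=None):
    """Build a search URL, joining list filters with commas."""
    params = {"query": query, "limit": limit, "offset": offset}
    if mime_types:
        params["mime_types"] = ",".join(mime_types)
    if tags:
        params["tags"] = ",".join(tags)
    if date_from:
        params["date_from"] = date_from
    if date_to:
        params["date_to"] = date_to
    return f"{base_url}/search?{urlencode(params)}"


def has_more(page: dict) -> bool:
    """True if results remain beyond this page, based on total and offset."""
    return page["offset"] + len(page["results"]) < page["total"]
```

`urlencode` percent-encodes the commas and spaces, so values such as `invoice 2024` are safe to pass as-is.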
Advanced Search
POST /api/search/advanced
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"query": "invoice",
"filters": {
"mime_types": ["application/pdf"],
"tags": ["unpaid"],
"date_range": {
"from": "2024-01-01",
"to": "2024-12-31"
},
"file_size": {
"min": 1024,
"max": 10485760
}
},
"options": {
"fuzzy": true,
"snippet_length": 200
}
}
OCR Queue Endpoints
Get Queue Status
GET /api/queue/status
Authorization: Bearer <jwt_token>
Response:
{
"pending": 15,
"processing": 3,
"completed_today": 127,
"failed_today": 2,
"average_processing_time": 4.5
}
Retry OCR Processing
POST /api/documents/{id}/retry-ocr
Authorization: Bearer <jwt_token>
Get Failed OCR Jobs
GET /api/queue/failed
Authorization: Bearer <jwt_token>
Get Queue Statistics
GET /api/queue/stats
Authorization: Bearer <jwt_token>
Response:
{
"pending_count": 15,
"processing_count": 3,
"failed_count": 2,
"completed_today": 127,
"average_processing_time_seconds": 4.5,
"queue_health": "healthy"
}
Requeue Failed Items
POST /api/queue/requeue-failed
Authorization: Bearer <jwt_token>
Enqueue Pending Documents
POST /api/queue/enqueue-pending
Authorization: Bearer <jwt_token>
Pause OCR Processing
POST /api/queue/pause
Authorization: Bearer <jwt_token>
Resume OCR Processing
POST /api/queue/resume
Authorization: Bearer <jwt_token>
Settings Endpoints
Get User Settings
GET /api/settings
Authorization: Bearer <jwt_token>
Update User Settings
PUT /api/settings
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"ocr_language": "eng",
"search_results_per_page": 50,
"enable_notifications": true
}
Sources Endpoints
List Sources
GET /api/sources
Authorization: Bearer <jwt_token>
Create Source
POST /api/sources
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"name": "Network Drive",
"type": "local_folder",
"config": {
"path": "/mnt/network/documents",
"scan_interval": 3600
},
"enabled": true
}
Update Source
PUT /api/sources/{id}
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"enabled": false
}
Delete Source
DELETE /api/sources/{id}
Authorization: Bearer <jwt_token>
Sync Source
POST /api/sources/{id}/sync
Authorization: Bearer <jwt_token>
Stop Source Sync
POST /api/sources/{id}/sync/stop
Authorization: Bearer <jwt_token>
Test Source Connection
POST /api/sources/{id}/test
Authorization: Bearer <jwt_token>
Estimate Source Crawl
POST /api/sources/{id}/estimate
Authorization: Bearer <jwt_token>
Estimate Crawl with Configuration
POST /api/sources/estimate
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"source_type": "webdav",
"config": {
"url": "https://example.com/webdav",
"username": "user",
"password": "pass"
}
}
Test Connection with Configuration
POST /api/sources/test-connection
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"source_type": "webdav",
"config": {
"url": "https://example.com/webdav",
"username": "user",
"password": "pass"
}
}
WebDAV Endpoints
Test WebDAV Connection
POST /api/webdav/test-connection
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"url": "https://example.com/webdav",
"username": "user",
"password": "pass"
}
Estimate WebDAV Crawl
POST /api/webdav/estimate-crawl
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"url": "https://example.com/webdav",
"username": "user",
"password": "pass"
}
Get WebDAV Sync Status
GET /api/webdav/sync-status
Authorization: Bearer <jwt_token>
Start WebDAV Sync
POST /api/webdav/start-sync
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"url": "https://example.com/webdav",
"username": "user",
"password": "pass"
}
Cancel WebDAV Sync
POST /api/webdav/cancel-sync
Authorization: Bearer <jwt_token>
Labels Endpoints
List Labels
GET /api/labels
Authorization: Bearer <jwt_token>
Create Label
POST /api/labels
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"name": "Important",
"color": "#FF0000"
}
Update Label
PUT /api/labels/{id}
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"name": "Very Important",
"color": "#FF00FF"
}
Delete Label
DELETE /api/labels/{id}
Authorization: Bearer <jwt_token>
User Endpoints
List Users (Admin Only)
GET /api/users
Authorization: Bearer <jwt_token>
Get User
GET /api/users/{id}
Authorization: Bearer <jwt_token>
Update User
PUT /api/users/{id}
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"email": "newemail@example.com",
"role": "user"
}
Delete User (Admin Only)
DELETE /api/users/{id}
Authorization: Bearer <jwt_token>
Notifications Endpoints
List Notifications
GET /api/notifications?limit=50&offset=0
Authorization: Bearer <jwt_token>
Get Notification Summary
GET /api/notifications/summary
Authorization: Bearer <jwt_token>
Response:
{
"unread_count": 5,
"total_count": 23,
"latest_notification": {
"id": 1,
"type": "ocr_completed",
"message": "OCR processing completed for document.pdf",
"created_at": "2024-01-01T12:00:00Z"
}
}
Mark Notification as Read
POST /api/notifications/{id}/read
Authorization: Bearer <jwt_token>
Mark All Notifications as Read
POST /api/notifications/read-all
Authorization: Bearer <jwt_token>
Delete Notification
DELETE /api/notifications/{id}
Authorization: Bearer <jwt_token>
Ignored Files Endpoints
List Ignored Files
GET /api/ignored-files?limit=50&offset=0
Authorization: Bearer <jwt_token>
Query parameters:
- limit: Number of results (default: 50)
- offset: Pagination offset
- filename: Filter by filename
- source_type: Filter by source type
Get Ignored Files Statistics
GET /api/ignored-files/stats
Authorization: Bearer <jwt_token>
Response:
{
"total_ignored_files": 42,
"total_size_bytes": 104857600,
"most_recent_ignored_at": "2024-01-01T12:00:00Z"
}
Get Ignored File Details
GET /api/ignored-files/{id}
Authorization: Bearer <jwt_token>
Remove File from Ignored List
DELETE /api/ignored-files/{id}
Authorization: Bearer <jwt_token>
Bulk Remove Files from Ignored List
DELETE /api/ignored-files/bulk-delete
Authorization: Bearer <jwt_token>
Content-Type: application/json
{
"ignored_file_ids": [1, 2, 3, 4]
}
Metrics Endpoints
Get System Metrics
GET /api/metrics
Authorization: Bearer <jwt_token>
Get Prometheus Metrics
GET /metrics
Returns Prometheus-formatted metrics (no authentication required).
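The Prometheus text format can be read without a Prometheus server. The following is a minimal parser sketch that keeps only sample lines and skips comments. The metric names in the sample are made up for illustration; this reference does not list which metrics Readur exports.

```python
import re

# name, optional {labels}, value, optional integer timestamp
_SAMPLE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+-?\d+)?$")


def parse_metrics(text: str) -> dict:
    """Map 'name{labels}' to a float value for each sample line."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE comments
            continue
        match = _SAMPLE.match(line)
        if match:
            name, labels, value = match.groups()
            samples[name + (labels or "")] = float(value)
    return samples


# Hypothetical payload; real metric names may differ.
sample_text = """# HELP readur_documents_total Hypothetical metric.
# TYPE readur_documents_total gauge
readur_documents_total 42
readur_ocr_jobs{status="failed"} 2 1700000000000
"""
metrics = parse_metrics(sample_text)
```

For production monitoring, pointing a Prometheus scraper at `/metrics` is the more usual route; a parser like this is mainly useful for ad hoc checks.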
Health Check
Health Check
GET /api/health
Response:
{
"status": "healthy",
"timestamp": "2024-01-01T12:00:00Z",
"version": "1.0.0"
}
Examples
Python Example
import requests

# Configuration
BASE_URL = "http://localhost:8000/api"
USERNAME = "admin"
PASSWORD = "your_password"

# Login and build the Authorization header from the returned token
response = requests.post(f"{BASE_URL}/auth/login", json={
    "username": USERNAME,
    "password": PASSWORD
})
response.raise_for_status()  # stop early on a failed login
token = response.json()["token"]
headers = {"Authorization": f"Bearer {token}"}

# Upload document
with open("document.pdf", "rb") as f:
    files = {"file": ("document.pdf", f, "application/pdf")}
    response = requests.post(
        f"{BASE_URL}/documents",
        headers=headers,
        files=files
    )
response.raise_for_status()
document_id = response.json()["id"]

# Search documents
response = requests.get(
    f"{BASE_URL}/search",
    headers=headers,
    params={"query": "invoice 2024"}
)
response.raise_for_status()
results = response.json()["results"]
JavaScript Example
// Configuration
const BASE_URL = 'http://localhost:8000/api';

// Login
async function login(username, password) {
  const response = await fetch(`${BASE_URL}/auth/login`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ username, password })
  });
  const data = await response.json();
  return data.token;
}

// Upload document
async function uploadDocument(token, file) {
  const formData = new FormData();
  formData.append('file', file);
  const response = await fetch(`${BASE_URL}/documents`, {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${token}` },
    body: formData
  });
  return response.json();
}

// Search documents
async function searchDocuments(token, query) {
  const response = await fetch(
    `${BASE_URL}/search?query=${encodeURIComponent(query)}`,
    {
      headers: { 'Authorization': `Bearer ${token}` }
    }
  );
  return response.json();
}
cURL Examples
# Login
TOKEN=$(curl -s -X POST http://localhost:8000/api/auth/login \
-H "Content-Type: application/json" \
-d '{"username":"admin","password":"your_password"}' \
| jq -r .token)
# Upload document
curl -X POST http://localhost:8000/api/documents \
-H "Authorization: Bearer $TOKEN" \
-F "file=@document.pdf"
# Search documents
curl -X GET "http://localhost:8000/api/search?query=invoice" \
-H "Authorization: Bearer $TOKEN"
# Get document
curl -X GET http://localhost:8000/api/documents/550e8400-e29b-41d4-a716-446655440000 \
-H "Authorization: Bearer $TOKEN"
OpenAPI Specification
The complete OpenAPI specification is available at:
GET /api-docs/openapi.json
Interactive Swagger UI documentation is available at:
GET /swagger-ui
You can load the specification into tools such as Swagger UI, or use it to generate client libraries.
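Once the specification has been downloaded and parsed as JSON, its `paths` object can be walked to list the available operations. This is a sketch; `list_operations` is a hypothetical helper and the sample spec is a hand-written fragment, not the real Readur document.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(spec: dict) -> list:
    """Return sorted 'METHOD /path' strings for every operation in an OpenAPI spec."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append(f"{key.upper()} {path}")
    return sorted(operations)


# Hand-written fragment for illustration only.
sample_spec = {
    "openapi": "3.0.0",
    "paths": {
        "/api/health": {"get": {}},
        "/api/documents": {"get": {}, "post": {}, "parameters": []},
    },
}
operations = list_operations(sample_spec)
```

Comparing this listing against the endpoints in this reference is a quick way to spot routes that have been added or renamed in a given deployment.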
SDK Support
Official SDKs are planned for:
- Python
- JavaScript/TypeScript
- Go
- Ruby
Check the GitHub repository for the latest SDK availability.