Test Infrastructure Documentation
This document provides a comprehensive guide to the test infrastructure in Readur, including test patterns, utilities, common issues, and best practices.
📋 Table of Contents
- Test Architecture Overview
- TestContext Pattern
- Test Utilities
- Test Isolation and Environment Variables
- Common Patterns
- Troubleshooting
- Best Practices
Test Architecture Overview
Readur uses a three-tier testing approach:
- Unit Tests (src/tests/) - Fast, isolated component tests
- Integration Tests (tests/) - Full system tests with database
- Frontend Tests (frontend/src/__tests__/) - React component and API tests
Test Execution Flow
┌─────────────────┐
│   Unit Tests    │ ← No external dependencies
│  (cargo test)   │ ← Milliseconds execution
└────────┬────────┘
         │
┌────────▼────────┐
│Integration Tests│ ← Real database (PostgreSQL)
│  (TestContext)  │ ← In-memory app instance
└────────┬────────┘
         │
┌────────▼────────┐
│ Frontend Tests  │ ← Mocked API responses
│    (Vitest)     │ ← Component isolation
└─────────────────┘
TestContext Pattern
The TestContext is the cornerstone of integration testing in Readur. It provides an isolated test environment with a real database.
Basic Usage
use readur::test_utils::{TestContext, TestAuthHelper};
#[tokio::test]
async fn test_document_workflow() {
// Create a new test context with default configuration
let ctx = TestContext::new().await;
// Access the app router for making requests
let app = ctx.app();
// Access the application state
let state = ctx.state();
// Test runs with isolated database
}
How TestContext Works
- Database Setup: Spins up a PostgreSQL container using testcontainers
- Migrations: Runs all SQLx migrations automatically
- App Instance: Creates an in-memory Axum router with full API routes
- Isolation: Each test gets its own database container
Custom Configuration
use readur::test_utils::{TestContext, TestConfigBuilder};
#[tokio::test]
async fn test_with_custom_config() {
let config = TestConfigBuilder::default()
.with_concurrent_ocr_jobs(4)
.with_upload_path("./test-uploads")
.with_oidc_enabled(false);
let ctx = TestContext::with_config(config).await;
}
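The chained `with_*` calls above follow the consuming-builder pattern. The sketch below shows one way such a builder could be structured. `SketchConfigBuilder`, its fields, and its default values are assumptions made for illustration; the real `TestConfigBuilder` may store different fields or require a final build step.

```rust
/// Hypothetical builder mirroring the `with_*` calls shown above.
/// Field names and defaults are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
struct SketchConfigBuilder {
    concurrent_ocr_jobs: usize,
    upload_path: String,
    oidc_enabled: bool,
}

impl Default for SketchConfigBuilder {
    fn default() -> Self {
        Self {
            concurrent_ocr_jobs: 1,
            upload_path: "./uploads".to_string(),
            oidc_enabled: false,
        }
    }
}

impl SketchConfigBuilder {
    /// Each setter takes `self` by value and returns it, so calls can be chained.
    fn with_concurrent_ocr_jobs(mut self, jobs: usize) -> Self {
        self.concurrent_ocr_jobs = jobs;
        self
    }

    fn with_upload_path(mut self, path: &str) -> Self {
        self.upload_path = path.to_string();
        self
    }

    fn with_oidc_enabled(mut self, enabled: bool) -> Self {
        self.oidc_enabled = enabled;
        self
    }
}
```

Because every setter consumes and returns the builder, a test overrides only the settings it cares about and inherits the defaults for the rest.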
Making Requests
use axum::http::{Request, StatusCode};
use axum::body::Body;
use tower::ServiceExt;
// Direct request to the test app
let request = Request::builder()
.method("GET")
.uri("/api/health")
.body(Body::empty())
.unwrap();
let response = ctx.app().clone().oneshot(request).await.unwrap();
assert_eq!(response.status(), StatusCode::OK);
Test Utilities
TestAuthHelper
Handles user creation and authentication in tests:
let auth_helper = TestAuthHelper::new(ctx.app().clone());
// Create a regular user
let mut test_user = auth_helper.create_test_user().await;
// Generates unique username: testuser_<pid>_<thread>_<nanos>
// Create an admin user
let admin_user = auth_helper.create_admin_user().await;
// Login and get token
let token = test_user.login(&auth_helper).await.unwrap();
// Make authenticated request
let response = auth_helper.make_authenticated_request(
"GET",
"/api/documents",
None,
&token
).await;
Document Helpers
Test data builders for consistent document creation:
use readur::test_utils::document_helpers::*;
// Basic test document
let doc = create_test_document(user_id);
// Document with specific hash
let doc = create_test_document_with_hash(
user_id,
"test.pdf",
"abc123".to_string()
);
// Low confidence OCR document
let doc = create_low_confidence_document(user_id, 45.0);
// Document with OCR error
let doc = create_document_with_ocr_error(user_id);
Test User Pattern
Each test creates unique users to avoid conflicts:
// Unique username pattern: testuser_<process_id>_<thread_id>_<timestamp_nanos>
// Example: testuser_12345_2_1752870966778668050
// This prevents "Username already exists" errors in parallel tests
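A username in this shape can be assembled from the standard library alone. The following is a minimal sketch, not the actual `TestAuthHelper` code: `unique_test_username` is a hypothetical name, and the thread number is taken from the `Debug` form of `ThreadId` because the standard library has no stable numeric accessor for it.

```rust
use std::thread;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: builds `testuser_<pid>_<thread>_<nanos>`.
fn unique_test_username() -> String {
    let pid = std::process::id();

    // Keep only the digits of the Debug form, e.g. "ThreadId(2)" -> "2".
    let thread_id: String = format!("{:?}", thread::current().id())
        .chars()
        .filter(|c| c.is_ascii_digit())
        .collect();

    // Nanoseconds since the Unix epoch separate repeated calls on one thread.
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before the Unix epoch")
        .as_nanos();

    format!("testuser_{}_{}_{}", pid, thread_id, nanos)
}
```

The process ID separates concurrent test binaries, the thread ID separates tests running in parallel inside one binary, and the timestamp separates successive users created by the same test.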
Test Isolation and Environment Variables
The TESSDATA_PREFIX Problem
One of the hardest problems in the test suite involved OCR language validation and the TESSDATA_PREFIX environment variable.
The Issue
- Tests set the TESSDATA_PREFIX environment variable to point to temporary directories
- Environment variables are global and shared across all threads
- When tests run in parallel, they overwrite each other's TESSDATA_PREFIX
- This caused 400 errors when validating OCR languages
The Solution
The OCR retry endpoint now builds its health checker from an explicit tessdata path whenever TESSDATA_PREFIX is set:
// In src/routes/documents/ocr.rs
let health_checker = if let Ok(tessdata_path) = std::env::var("TESSDATA_PREFIX") {
crate::ocr::health::OcrHealthChecker::new_with_path(tessdata_path)
} else {
crate::ocr::health::OcrHealthChecker::new()
};
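To show why a path-taking constructor helps, here is a minimal sketch of a checker that receives its tessdata directory as an argument. `LanguageChecker` and `is_language_available` are hypothetical stand-ins, not the real `OcrHealthChecker` API; the sketch only assumes that a language counts as available when a `<code>.traineddata` file exists in the directory.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a health checker configured through its
/// constructor rather than through a process-wide environment variable.
struct LanguageChecker {
    tessdata_dir: PathBuf,
}

impl LanguageChecker {
    fn new_with_path(dir: impl AsRef<Path>) -> Self {
        Self {
            tessdata_dir: dir.as_ref().to_path_buf(),
        }
    }

    /// Assumption: a language is available when `<code>.traineddata` exists.
    fn is_language_available(&self, code: &str) -> bool {
        self.tessdata_dir
            .join(format!("{}.traineddata", code))
            .is_file()
    }
}
```

Each checker instance holds its own directory, so two tests that construct checkers with different paths cannot affect each other's results.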
Test Setup Example
use std::fs;
use tempfile::TempDir;
use readur::test_utils::TestContext;
#[tokio::test]
async fn test_retry_ocr_with_language() {
// Create temporary directory for tessdata
let temp_dir = TempDir::new().unwrap();
let tessdata_path = temp_dir.path();
// Create mock language files
fs::write(tessdata_path.join("eng.traineddata"), "mock").unwrap();
fs::write(tessdata_path.join("spa.traineddata"), "mock").unwrap();
// Set environment variable (careful with parallel tests!)
let tessdata_str = tessdata_path.to_string_lossy().to_string();
std::env::set_var("TESSDATA_PREFIX", &tessdata_str);
let ctx = TestContext::new().await;
// ... rest of test
}
Best Practices for Environment Variables
- Avoid Global State: Prefer passing configuration through constructors
- Use TestContext: It provides isolation for most test scenarios
- Serial Execution: For tests that must modify environment variables:
use serial_test::serial;

#[tokio::test]
#[serial] // Using serial_test crate
async fn test_that_modifies_env() {
    // This test runs in isolation
}
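Where adding a crate is not an option, a similar effect can be approximated with a process-wide lock. The sketch below is an assumption-laden illustration, not how `serial_test` is implemented: `ENV_LOCK` and `ScopedEnvVar` are hypothetical names, and the lock only protects against other code that also goes through `ScopedEnvVar`.

```rust
use std::env;
use std::sync::{Mutex, MutexGuard};

/// Hypothetical global lock shared by every test that touches the environment.
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Sets a variable while holding the lock and restores it on drop.
struct ScopedEnvVar {
    key: String,
    previous: Option<String>,
    _guard: MutexGuard<'static, ()>,
}

impl ScopedEnvVar {
    fn set(key: &str, value: &str) -> Self {
        // A panicking test poisons the mutex; the payload is `()`, so recover.
        let guard = ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner());
        let previous = env::var(key).ok();
        // `set_var` is an unsafe fn as of the 2024 edition; on older
        // editions the `unsafe` block is merely redundant.
        unsafe { env::set_var(key, value) };
        Self {
            key: key.to_string(),
            previous,
            _guard: guard,
        }
    }
}

impl Drop for ScopedEnvVar {
    fn drop(&mut self) {
        // Restore the old value first; the lock is released afterwards,
        // when the `_guard` field itself is dropped.
        match &self.previous {
            Some(v) => unsafe { env::set_var(&self.key, v) },
            None => unsafe { env::remove_var(&self.key) },
        }
    }
}
```

A test would hold the returned value for its whole body, for example `let _env = ScopedEnvVar::set("TESSDATA_PREFIX", &tessdata_str);`, so the variable is restored even if an assertion fails.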
Common Patterns
Authentication Test Pattern
#[tokio::test]
async fn test_authenticated_endpoint() {
let ctx = TestContext::new().await;
let auth_helper = TestAuthHelper::new(ctx.app().clone());
// Create and login user
let mut user = auth_helper.create_test_user().await;
let token = user.login(&auth_helper).await.unwrap();
// Make authenticated request
let request = Request::builder()
.method("GET")
.uri("/api/protected")
.header("Authorization", format!("Bearer {}", token))
.body(Body::empty())
.unwrap();
let response = ctx.app().clone().oneshot(request).await.unwrap();
assert_eq!(response.status(), StatusCode::OK);
}
Document Upload Pattern
#[tokio::test]
async fn test_document_upload() {
let ctx = TestContext::new().await;
let auth_helper = TestAuthHelper::new(ctx.app().clone());
let mut user = auth_helper.create_test_user().await;
let token = user.login(&auth_helper).await.unwrap();
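// Create multipart form (assumes `use reqwest::multipart;`)
let form = multipart::Form::new()
.text("tags", "test,document")
.part("file", multipart::Part::bytes(b"test content".to_vec())
.file_name("test.txt")
.mime_str("text/plain").unwrap());
// Upload document
// Note: this sends a real HTTP request, so a server must be listening on
// localhost:8000; the in-memory router from TestContext is not reachable this way
let response = reqwest::Client::new()
.post("http://localhost:8000/api/documents")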
.header("Authorization", format!("Bearer {}", token))
.multipart(form)
.send()
.await
.unwrap();
assert_eq!(response.status(), 201);
}
Database Direct Access Pattern
#[tokio::test]
async fn test_database_operations() {
let ctx = TestContext::new().await;
let user_id = Uuid::new_v4();
// Direct database access
sqlx::query!(
"INSERT INTO users (id, username, email, password_hash, role)
VALUES ($1, $2, $3, $4, $5)",
user_id,
"testuser",
"test@example.com",
"hash",
"user"
)
.execute(&ctx.state().db.pool)
.await
.unwrap();
// Verify through API
// ...
}
Troubleshooting
Common Test Failures
1. "Username already exists" Error
Cause: Parallel tests creating users with the same username
Solution: TestAuthHelper now generates unique usernames with timestamps
// Automatic unique username generation
let username = format!("testuser_{}_{}_{}",
std::process::id(),
thread_id,
timestamp_nanos
);
2. "Server is not running" (Integration Tests)
Cause: Tests expecting an external server on localhost:8000
Solution: Use TestContext instead of external HTTP requests
// ❌ Wrong - expects external server
let response = reqwest::get("http://localhost:8000/api/health").await;
// ✅ Correct - uses TestContext
let response = ctx.app().clone()
.oneshot(Request::builder()
.uri("/api/health")
.body(Body::empty())
.unwrap())
.await
.unwrap();
3. OCR Language Validation Failures (400 errors)
Cause: TESSDATA_PREFIX environment variable conflicts
Solution: Use OcrHealthChecker::new_with_path() for custom tessdata directories
4. Database Connection Errors
Cause: PostgreSQL container not ready or migrations failed
Debug Steps:
# Check if tests can connect to database
RUST_LOG=debug cargo test
# Run single test with output
cargo test test_name -- --nocapture
# Check Docker containers
docker ps
Debugging Techniques
Enable Detailed Logging
# Full debug output
RUST_LOG=debug cargo test -- --nocapture
# Specific module logging
RUST_LOG=readur::routes=debug cargo test
# With backtrace
RUST_BACKTRACE=1 cargo test
Run Tests Serially
# Avoid parallel execution issues
cargo test -- --test-threads=1
Inspect Test Database
// Add debug queries in test
let count: i64 = sqlx::query_scalar("SELECT COUNT(*) FROM users")
.fetch_one(&ctx.state().db.pool)
.await
.unwrap();
println!("User count: {}", count);
Best Practices
1. Use Unique Identifiers
Always use timestamps or UUIDs for test data:
let unique_id = Uuid::new_v4();
let unique_email = format!("test_{}@example.com", unique_id);
2. Clean Test State
TestContext automatically provides isolated databases, but clean up external resources:
// TempDir automatically cleans up
let temp_dir = TempDir::new().unwrap();
// Directory deleted when temp_dir drops
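The cleanup-on-drop behavior can be illustrated with a small standard-library sketch. `ScratchDir` is a hypothetical, simplified stand-in for `TempDir`, shown only to make the mechanism visible; real tests should keep using the `tempfile` crate.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical minimal stand-in for a self-cleaning temporary directory.
struct ScratchDir {
    path: PathBuf,
}

impl ScratchDir {
    fn new(label: &str) -> std::io::Result<Self> {
        // The process ID plus a timestamp keeps directory names distinct.
        let nanos = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_nanos())
            .unwrap_or(0);
        let path = std::env::temp_dir()
            .join(format!("{}_{}_{}", label, std::process::id(), nanos));
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best effort: errors during cleanup are ignored.
        let _ = fs::remove_dir_all(&self.path);
    }
}
```

Since removal happens in `Drop`, the directory disappears when the value goes out of scope, including when a test panics and unwinds.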
3. Test Both Success and Failure Cases
#[tokio::test]
async fn test_endpoint_success() {
// Happy path test
}
#[tokio::test]
async fn test_endpoint_unauthorized() {
// No auth token - expect 401
}
#[tokio::test]
async fn test_endpoint_not_found() {
// Invalid ID - expect 404
}
4. Use Type-Safe Assertions
// Parse response to proper types
let body_bytes = axum::body::to_bytes(response.into_body(), usize::MAX)
.await
.unwrap();
let document: DocumentResponse = serde_json::from_slice(&body_bytes).unwrap();
// Now assertions are type-safe
assert_eq!(document.filename, "test.pdf");
5. Document Test Purpose
#[tokio::test]
async fn test_ocr_retry_with_multiple_languages() {
// Tests that OCR retry endpoint accepts multiple language codes
// and validates them against available tessdata files.
// This ensures multi-language OCR support works correctly.
}
6. Avoid External Dependencies
- Use TestContext instead of external servers
- Mock external services when possible
- Use in-memory databases for unit tests
- Create test fixtures instead of relying on external files
7. Handle Async Properly
// Use tokio::test for async tests
#[tokio::test]
async fn test_async_operation() {
// Can use .await here
}
// For timeout handling
use tokio::time::{timeout, Duration};
let result = timeout(
Duration::from_secs(30),
long_running_operation()
).await;
Test Organization
Directory Structure
readur/
├── src/
│ └── tests/ # Unit tests
│ ├── mod.rs
│ ├── auth_tests.rs
│ ├── db_tests.rs
│ └── ...
├── tests/ # Integration tests
│ ├── integration_ocr_language_endpoints.rs
│ ├── integration_settings_tests.rs
│ └── ...
└── frontend/
└── src/
└── __tests__/ # Frontend tests
├── components/
└── pages/
Naming Conventions
- Unit tests: test_<component>_<behavior>
- Integration tests: test_<workflow>_<scenario>
- Test files: integration_<feature>_tests.rs
Summary
The test infrastructure in Readur provides:
- Isolation: Each test runs in its own environment
- Realism: Integration tests use real databases and full app instances
- Speed: Parallel execution with proper isolation
- Reliability: Unique identifiers prevent conflicts
- Maintainability: Clear patterns and utilities
Key takeaways:
- Always use TestContext for integration tests
- Generate unique test data to avoid conflicts
- Be careful with environment variables in parallel tests
- Use the provided test utilities for common operations
- Test both success and failure scenarios
For more examples, see the existing test files in the tests/ directory.