formbricks-formbricks/.cursor/rules/cache-optimization.mdc

---
description: Caching rules for performance improvements
globs:
alwaysApply: false
---
# Cache Optimization Patterns for Formbricks

## Cache Strategy Overview

Formbricks uses a **hybrid caching approach** optimized for enterprise scale:

- **Redis** for persistent cross-request caching
- **React `cache()`** for request-level deduplication
- **NO Next.js `unstable_cache()`** - avoid for reliability

## Key Files

### Core Cache Infrastructure
- [packages/cache/src/service.ts](mdc:packages/cache/src/service.ts) - Redis cache service
- [packages/cache/src/client.ts](mdc:packages/cache/src/client.ts) - Cache client initialization and singleton management
- [apps/web/lib/cache/index.ts](mdc:apps/web/lib/cache/index.ts) - Cache service proxy for web app
- [packages/cache/src/index.ts](mdc:packages/cache/src/index.ts) - Cache package exports and utilities

### Environment State Caching (Critical Endpoint)
- [apps/web/app/api/v1/client/[environmentId]/environment/route.ts](mdc:apps/web/app/api/v1/client/[environmentId]/environment/route.ts) - Main endpoint serving hundreds of thousands of SDK clients
- [apps/web/app/api/v1/client/[environmentId]/environment/lib/data.ts](mdc:apps/web/app/api/v1/client/[environmentId]/environment/lib/data.ts) - Optimized data layer with caching

## Enterprise-Grade Cache Key Patterns

**Always use** the `createCacheKey` utilities from the cache package:

```typescript
// ✅ Correct patterns
createCacheKey.environment.state(environmentId)     // "fb:env:abc123:state"
createCacheKey.organization.billing(organizationId) // "fb:org:xyz789:billing"
createCacheKey.license.status(organizationId)       // "fb:license:org123:status"
createCacheKey.user.permissions(userId, orgId)      // "fb:user:456:org:123:permissions"

// ❌ Never use flat keys - collision-prone
"environment_abc123"
"user_data_456"
```

## When to Use Each Cache Type

### Use React `cache()` for Request Deduplication
```typescript
// ✅ Prevents multiple calls within same request
export const getEnterpriseLicense = reactCache(async () => {
  // Complex license validation logic
});
```

### Use `cache.withCache()` for Simple Database Queries
```typescript
// ✅ Simple caching with automatic fallback (TTL in milliseconds)
export const getActionClasses = (environmentId: string) => {
  return cache.withCache(() => fetchActionClassesFromDB(environmentId),
    createCacheKey.environment.actionClasses(environmentId),
    60 * 30 * 1000 // 30 minutes in milliseconds
  );
};
```

### Use Explicit Redis Cache for Complex Business Logic
```typescript
// ✅ Full control for high-stakes endpoints
export const getEnvironmentState = async (environmentId: string) => {
  const cached = await environmentStateCache.getEnvironmentState(environmentId);
  if (cached) return cached;

  const fresh = await buildComplexState(environmentId);
  await environmentStateCache.setEnvironmentState(environmentId, fresh);
  return fresh;
};
```

## Caching Decision Framework

### When TO Add Caching

```typescript
// ✅ Expensive operations that benefit from caching
- Database queries (>10ms typical)
- External API calls (>50ms typical)
- Complex computations (>5ms)
- File system operations
- Heavy data transformations

// Example: Database query with complex joins (TTL in milliseconds)
export const getEnvironmentWithDetails = withCache(
  async (environmentId: string) => {
    return prisma.environment.findUnique({
      where: { id: environmentId },
      include: { /* complex joins */ }
    });
  },
  { key: createCacheKey.environment.details(environmentId), ttl: 60 * 30 * 1000 } // 30 minutes
)();
```

### When NOT to Add Caching

```typescript
// ❌ Don't cache these operations - minimal overhead
- Simple property access (<0.1ms)
- Basic transformations (<1ms)
- Functions that just call already-cached functions
- Pure computation without I/O

// ❌ Bad example: Redundant caching
const getCachedLicenseFeatures = withCache(
  async () => {
    const license = await getEnterpriseLicense(); // Already cached!
    return license.active ? license.features : null; // Just property access
  },
  { key: "license-features", ttl: 1800 * 1000 } // 30 minutes in milliseconds
);

// ✅ Good example: Simple and efficient
const getLicenseFeatures = async () => {
  const license = await getEnterpriseLicense(); // Already cached
  return license.active ? license.features : null; // 0.1ms overhead
};
```

### Computational Overhead Analysis

Before adding caching, analyze the overhead:

```typescript
// ✅ High overhead - CACHE IT
- Database queries: ~10-100ms
- External APIs: ~50-500ms
- File I/O: ~5-50ms
- Complex algorithms: >5ms

// ❌ Low overhead - DON'T CACHE
- Property access: ~0.001ms
- Simple lookups: ~0.1ms
- Basic validation: ~1ms
- Type checks: ~0.01ms

// Example decision tree:
const expensiveOperation = async () => {
  return prisma.query(); // 50ms - CACHE IT
};

const cheapOperation = (data: any) => {
  return data.property; // 0.001ms - DON'T CACHE
};
```

### Avoid Cache Wrapper Anti-Pattern

```typescript
// ❌ Don't create wrapper functions just for caching
const getCachedUserPermissions = withCache(
  async (userId: string) => getUserPermissions(userId),
  { key: createCacheKey.user.permissions(userId), ttl: 3600 * 1000 } // 1 hour in milliseconds
);

// ✅ Add caching directly to the original function
export const getUserPermissions = withCache(
  async (userId: string) => {
    return prisma.user.findUnique({
      where: { id: userId },
      include: { permissions: true }
    });
  },
  { key: createCacheKey.user.permissions(userId), ttl: 3600 * 1000 } // 1 hour in milliseconds
);
```

## TTL Coordination Strategy

### Multi-Layer Cache Coordination
For endpoints serving client SDKs, coordinate TTLs across layers:

```typescript
// Client SDK cache (expiresAt) - longest TTL for fewer requests
const CLIENT_TTL = 60 * 60;           // 1 hour (seconds for client)

// Server Redis cache - shorter TTL ensures fresh data for clients
const SERVER_TTL = 60 * 30 * 1000;   // 30 minutes in milliseconds

// HTTP cache headers (seconds)
const BROWSER_TTL = 60 * 60;         // 1 hour (max-age)
const CDN_TTL = 60 * 30;             // 30 minutes (s-maxage)
const CORS_TTL = 60 * 60;            // 1 hour (balanced approach)
```

### Standard TTL Guidelines (in milliseconds for cache-manager + Keyv)
```typescript
// Configuration data - rarely changes
const CONFIG_TTL = 60 * 60 * 24 * 1000;     // 24 hours

// User data - moderate frequency
const USER_TTL = 60 * 60 * 2 * 1000;        // 2 hours

// Survey data - changes moderately
const SURVEY_TTL = 60 * 15 * 1000;          // 15 minutes

// Billing data - expensive to compute
const BILLING_TTL = 60 * 30 * 1000;         // 30 minutes

// Action classes - infrequent changes
const ACTION_CLASS_TTL = 60 * 30 * 1000;    // 30 minutes
```

## High-Frequency Endpoint Optimization

### Performance Patterns for High-Volume Endpoints

```typescript
// ✅ Optimized high-frequency endpoint pattern
export const GET = async (request: NextRequest, props: { params: Promise<{ id: string }> }) => {
  const params = await props.params;

  try {
    // Simple validation (avoid Zod for high-frequency)
    if (!params.id || typeof params.id !== 'string') {
      return responses.badRequestResponse("ID is required", undefined, true);
    }

    // Single optimized query with caching
    const data = await getOptimizedData(params.id);

    return responses.successResponse(
      {
        data,
        expiresAt: new Date(Date.now() + CLIENT_TTL * 1000), // SDK cache duration
      },
      true,
      "public, s-maxage=1800, max-age=3600, stale-while-revalidate=1800, stale-if-error=3600"
    );
  } catch (err) {
    // Simplified error handling for performance
    if (err instanceof ResourceNotFoundError) {
      return responses.notFoundResponse(err.resourceType, err.resourceId);
    }
    logger.error({ error: err, url: request.url }, "Error in high-frequency endpoint");
    return responses.internalServerErrorResponse(err.message, true);
  }
};
```

### Avoid These Performance Anti-Patterns

```typescript
// ❌ Avoid for high-frequency endpoints
const inputValidation = ZodSchema.safeParse(input);     // Too slow
const startTime = Date.now(); logger.debug(...);       // Logging overhead
const { data, revalidateEnvironment } = await get();   // Complex return types
```

### CORS Optimization
```typescript
// ✅ Balanced CORS caching (not too aggressive)
export const OPTIONS = async (): Promise<Response> => {
  return responses.successResponse(
    {},
    true,
    "public, s-maxage=3600, max-age=3600" // 1 hour balanced approach
  );
};
```

## Redis Cache Migration from Next.js

### Avoid Legacy Next.js Patterns
```typescript
// ❌ Old Next.js unstable_cache pattern (avoid)
const getCachedData = unstable_cache(
  async (id) => fetchData(id),
  ['cache-key'],
  { tags: ['environment'], revalidate: 900 }
);

// ❌ Don't use revalidateEnvironment flags with Redis
return { data, revalidateEnvironment: true }; // This gets cached incorrectly!

// ✅ New Redis pattern with withCache (TTL in milliseconds)
export const getCachedData = (id: string) =>
  withCache(
    () => fetchData(id),
    {
      key: createCacheKey.environment.data(id),
      ttl: 60 * 15 * 1000, // 15 minutes in milliseconds
    }
  )();
```

### Remove Revalidation Logic
When migrating from Next.js `unstable_cache`:
- Remove `revalidateEnvironment` or similar flags
- Remove tag-based invalidation logic
- Use TTL-based expiration instead
- Handle one-time updates (like `appSetupCompleted`) directly in cache

## Data Layer Optimization

### Single Query Pattern
```typescript
// ✅ Optimize with single database query
export const getOptimizedEnvironmentData = async (environmentId: string) => {
  return prisma.environment.findUniqueOrThrow({
    where: { id: environmentId },
    include: {
      project: {
        select: { id: true, recontactDays: true, /* ... */ }
      },
      organization: {
        select: { id: true, billing: true }
      },
      surveys: {
        where: { status: "inProgress" },
        select: { id: true, name: true, /* ... */ }
      },
      actionClasses: {
        select: { id: true, name: true, /* ... */ }
      }
    }
  });
};

// ❌ Avoid multiple separate queries
const environment = await getEnvironment(id);
const organization = await getOrganization(environment.organizationId);
const surveys = await getSurveys(id);
const actionClasses = await getActionClasses(id);
```

## Invalidation Best Practices

**Always use explicit key-based invalidation:**

```typescript
// ✅ Clear and debuggable
await invalidateCache(createCacheKey.environment.state(environmentId));
await invalidateCache([
  createCacheKey.environment.surveys(environmentId),
  createCacheKey.environment.actionClasses(environmentId)
]);

// ❌ Avoid complex tag systems
await invalidateByTags(["environment", "survey"]); // Don't do this
```

## Critical Performance Targets

### High-Frequency Endpoint Goals
- **Cache hit ratio**: >85%
- **Response time P95**: <200ms
- **Database load reduction**: >60%
- **HTTP cache duration**: 1hr browser, 30min Cloudflare
- **SDK refresh interval**: 1 hour with 30min server cache

### Performance Monitoring
- Use **existing elastic cache analytics** for metrics
- Log cache errors and warnings (not debug info)
- Track database query reduction
- Monitor response times for cached endpoints
- **Avoid performance logging** in high-frequency endpoints

## Error Handling Pattern

Always provide fallback to fresh data on cache errors:

```typescript
try {
  const cached = await cache.get(key);
  if (cached) return cached;

  const fresh = await fetchFresh();
  await cache.set(key, fresh, ttl); // ttl in milliseconds
  return fresh;
} catch (error) {
  // ✅ Always fallback to fresh data
  logger.warn("Cache error, fetching fresh", { key, error });
  return fetchFresh();
}
```

## Common Pitfalls to Avoid

1. **Never use Next.js `unstable_cache()`** - unreliable in production
2. **Don't use revalidation flags with Redis** - they get cached incorrectly
3. **Avoid Zod validation** for simple parameters in high-frequency endpoints
4. **Don't add performance logging** to high-frequency endpoints
5. **Coordinate TTLs** between client and server caches
6. **Don't over-engineer** with complex tag systems
7. **Avoid caching rapidly changing data** (real-time metrics)
8. **Always validate cache keys** to prevent collisions
9. **Don't add redundant caching layers** - analyze computational overhead first
10. **Avoid cache wrapper functions** - add caching directly to expensive operations
11. **Don't cache property access or simple transformations** - overhead is negligible
12. **Analyze the full call chain** before adding caching to avoid double-caching
13. **Remember TTL is in milliseconds** for cache-manager + Keyv stack (not seconds)

## Monitoring Strategy

- Use **existing elastic cache analytics** for metrics
- Log cache errors and warnings
- Track database query reduction
- Monitor response times for cached endpoints
- **Don't add custom metrics** that duplicate existing monitoring

## Important Notes

### TTL Units
- **cache-manager + Keyv**: TTL in **milliseconds**
- **Direct Redis commands**: TTL in **seconds** (EXPIRE, SETEX) or **milliseconds** (PEXPIRE, PSETEX)
- **HTTP cache headers**: TTL in **seconds** (max-age, s-maxage)
- **Client SDK**: TTL in **seconds** (expiresAt calculation)