chore: add envoy rate-limit hardening tooling

2026-04-20 19:30:41 -05:00 · 2026-04-01 14:17:04 +05:30
6 changed files with 1124 additions and 0 deletions
@@ -0,0 +1,108 @@
+# Envoy Rate-Limit Coverage Matrix
+
+This document is the staging coverage source of truth for `formbricks/internal#1519`.
+
+It answers two separate questions:
+
+- which request prefixes currently traverse Envoy on staging
+- which current application limiter call sites are already covered, coverable later, or intentionally left in
+  the app
+
+## Current Envoy-routed prefixes
+
+The staging ALB forwards only these prefixes to Envoy:
+
+- `/api/auth/callback`
+- `/api/v1/client`
+- `/api/v2/client`
+- `/api/v1/management`
+- `/api/v1/webhooks`
+- `/storage`
+
+Everything else still goes directly to the chart-managed Formbricks Service.
+
+## Current Envoy rate-limited route groups
+
+These route groups already have active Envoy rate-limit policies on staging:
+
+- `POST /api/auth/callback/credentials`
+  - coarse `40 / hour` gateway approximation for the stricter app `10 / 15 minutes` login limit
+- `POST /api/auth/callback/token`
+  - `10 / hour`
+- `GET|POST|PUT /api/v1/client/{environmentId}/(environment|responses|responses/{responseId}|displays|user)`
+  - `100 / minute`
+- `POST /api/v1/client/{environmentId}/storage`
+  - `5 / minute`
+- `POST|PUT /api/v2/client/{environmentId}/responses(?:/{responseId})?`
+  - `100 / minute`
+- `POST /api/v2/client/{environmentId}/displays`
+  - `100 / minute`
+- `POST /api/v2/client/{environmentId}/storage`
+  - `5 / minute`
+- `GET|POST|PUT|DELETE /api/v1/management/*` when `x-api-key` is present
+  - `100 / minute`
+- `POST /api/v1/management/storage` when `x-api-key` is present
+  - `5 / minute`
+- `GET|POST|PUT|DELETE /api/v1/webhooks/*` when `x-api-key` is present
+  - `100 / minute`
+- `DELETE /storage/{environmentId}/{public|private}/{fileName}` when `x-api-key` is present
+  - `5 / minute`
+
+## Call-site coverage matrix
+
+This matrix covers every current `applyIPRateLimit` / `applyRateLimit` call site on `main`.
+
+| Limiter config | Caller / route family | Key type | Stable gateway path | Status | Reason |
+| --- | --- | --- | --- | --- | --- |
+| `rateLimitConfigs.auth.login` | `modules/auth/lib/authOptions.ts` credentials callback | IP | `POST /api/auth/callback/credentials` | `covered_now` | Stable public callback path. Envoy uses a coarse `40 / hour` approximation while the stricter app limit remains active. |
+| `rateLimitConfigs.auth.verifyEmail` | `modules/auth/lib/authOptions.ts` token callback | IP | `POST /api/auth/callback/token` | `covered_now` | Stable public callback path and identical `10 / hour` limit at the gateway. |
+| `rateLimitConfigs.auth.signup` | `modules/auth/signup/actions.ts` | IP | None | `not_coverable_now` | Server action flow, not a stable gateway-managed HTTP route. |
+| `rateLimitConfigs.auth.forgotPassword` | `modules/auth/forgot-password/actions.ts` | IP | None | `not_coverable_now` | Server action flow, not a stable gateway-managed HTTP route. |
+| `rateLimitConfigs.auth.verifyEmail` | `modules/auth/verification-requested/actions.ts` resend verification flow | IP | None | `not_coverable_now` | Same config as the covered token callback, but this caller is a server action instead of a stable public API path. |
+| `rateLimitConfigs.api.client` | `app/lib/api/with-api-logging.ts` public V1 client routes | IP | `/api/v1/client/{environmentId}/(environment|responses|responses/{responseId}|displays|user)` | `covered_now` | Public V1 client paths already match the Envoy policy set. |
+| `rateLimitConfigs.storage.upload` | `app/api/v1/client/[environmentId]/storage/route.ts` via `with-api-logging.ts` | IP | `POST /api/v1/client/{environmentId}/storage` | `covered_now` | Custom storage upload limit is already broken out as its own Envoy rule. |
+| `rateLimitConfigs.api.v1` | `app/lib/api/with-api-logging.ts` API-key-authenticated V1 management routes | API key | `/api/v1/management/*` except storage | `covered_now` | Stable `x-api-key` surface already routed through Envoy. |
+| `rateLimitConfigs.storage.upload` | `app/api/v1/management/storage/route.ts` API-key branch via `with-api-logging.ts` | API key | `POST /api/v1/management/storage` | `covered_now` | Custom storage upload limit already has its own Envoy rule. |
+| `rateLimitConfigs.api.v1` | `app/lib/api/with-api-logging.ts` API-key-authenticated webhooks | API key | `/api/v1/webhooks/*` | `covered_now` | Stable `x-api-key` webhook surface already routed through Envoy. |
+| `rateLimitConfigs.api.v1` | `app/lib/api/with-api-logging.ts` session-authenticated integration routes | Session user ID | None | `not_coverable_now` | Session identity is only resolved inside the app, not at the gateway. |
+| `rateLimitConfigs.api.v1` | `app/lib/api/with-api-logging.ts` session-authenticated V1 management routes | Session user ID | None | `not_coverable_now` | The current Envoy rules only cover the API-key branch of the V1 management surface. |
+| `rateLimitConfigs.api.v1` | `app/api/v1/management/me/route.ts` API-key branch | API key | `GET /api/v1/management/me` | `covered_now` | Direct handler, but the path is already behind Envoy and matched by the V1 management policy. |
+| `rateLimitConfigs.api.v1` | `app/api/v1/management/me/route.ts` session branch | Session user ID | None | `not_coverable_now` | Same path, but this branch keys on session user ID instead of an edge-visible identifier. |
+| `rateLimitConfigs.api.v2` | `modules/api/v2/auth/api-wrapper.ts` authenticated V2 API surface | API key | `/api/v2/*` outside `/api/v2/client` | `coverable_later` | Stable API-key paths exist, but the current Envoy POC only routes the public `/api/v2/client` surface. |
+| `rateLimitConfigs.api.v3` or custom V3 configs | `app/api/v3/lib/api-wrapper.ts` | Session user ID or API key | Route-specific, mainly `/api/v3/*` | `coverable_later` | The wrapper supports mixed auth modes. Production hardening needs a deliberate V3 route inventory before moving any of it to Envoy. |
+| `rateLimitConfigs.storage.delete` | `app/storage/[environmentId]/[accessType]/[fileName]/route.ts` API-key branch | API key | `DELETE /storage/{environmentId}/{public|private}/{fileName}` | `covered_now` | Envoy already enforces the API-key branch on the stable storage delete path. |
+| `rateLimitConfigs.storage.delete` | `app/storage/[environmentId]/[accessType]/[fileName]/route.ts` user branch | User ID | None | `not_coverable_now` | The user-authenticated delete branch depends on app-side identity. |
+| `rateLimitConfigs.actions.sendLinkSurveyEmail` | `modules/survey/link/actions.ts` | IP | None | `not_coverable_now` | Server action flow with no stable public API contract. |
+| `rateLimitConfigs.actions.emailUpdate` | `app/(app)/environments/[environmentId]/settings/(account)/profile/actions.ts` | User ID | None | `not_coverable_now` | User-scoped server action, not an edge-visible identifier. |
+| `rateLimitConfigs.actions.surveyFollowUp` | `modules/survey/follow-ups/lib/follow-ups.ts` | Organization ID | None | `not_coverable_now` | Organization-scoped internal workflow, not a stable public API route. |
+| `rateLimitConfigs.actions.licenseRecheck` | `modules/ee/license-check/actions.ts` | User ID | None | `not_coverable_now` | Internal server action with user-scoped identity resolved inside the app. |
+
+## Envoy-covered paths without a matching current app limiter call site
+
+These routes are already covered by Envoy even though `main` does not currently have a dedicated
+`applyIPRateLimit` / `applyRateLimit` call site for them:
+
+- `POST|PUT /api/v2/client/{environmentId}/responses(?:/{responseId})?`
+- `POST /api/v2/client/{environmentId}/displays`
+- `POST /api/v2/client/{environmentId}/storage`
+
+They stay in scope for the hardening load tests because they are part of the active gateway policy set.
+
+## Explicit exclusions
+
+- `/api/v1/client/og`
+  - routed through the `/api/v1/client` prefix, but intentionally excluded from Envoy rate limiting
+- `/api/v2/health`
+  - not routed through Envoy and explicitly used as the negative control
+- `OPTIONS`
+  - excluded from the current Envoy rate-limit match set
+
+## How to interpret failures
+
+- Gateway `429`
+  - look for `x-envoy-ratelimited` or `x-ratelimit-*`
+  - body does not use the Formbricks `code: "too_many_requests"` JSON shape
+- App `429`
+  - V1 responses use `apps/web/app/lib/api/response.ts`
+  - V2 responses use `apps/web/modules/api/v2/lib/response.ts`
+  - V3 responses use `apps/web/app/api/v3/lib/response.ts`
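Note on the `40 / hour` login approximation in the matrix above: the gateway and app limits describe the same sustained rate but differ in how much they admit inside one short span. The sketch below works through that arithmetic. The helper names are hypothetical, and it assumes fixed windows aligned to the start of the observation span, which the Envoy policy in this change may or may not use.

```javascript
// Illustrative only: the limit values come from the coverage matrix; the
// helpers and the fixed-window assumption are not taken from the Envoy config.
const appLimit = { requests: 10, windowMinutes: 15 };
const gatewayLimit = { requests: 40, windowMinutes: 60 };

// Sustained rate normalised to requests per hour.
function perHour(limit) {
  return (limit.requests * 60) / limit.windowMinutes;
}

// Most requests admitted within `minutes`, assuming the span starts exactly
// at a window boundary and every window is used to its full quota.
function maxAdmitted(limit, minutes) {
  const windows = Math.ceil(minutes / limit.windowMinutes);
  return windows * limit.requests;
}

console.log(perHour(appLimit)); // 40
console.log(perHour(gatewayLimit)); // 40
console.log(maxAdmitted(appLimit, 15)); // 10
console.log(maxAdmitted(gatewayLimit, 15)); // 40
```

Under these assumptions both policies allow 40 attempts per hour, but the gateway rule can admit all 40 within a single 15-minute span, which is why the matrix keeps the stricter app limit active behind it.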
@@ -0,0 +1,132 @@
+# Envoy Rate-Limit Validation
+
+This directory holds the staging validation tooling for the Envoy rate-limit POC:
+
+- [burst-test.sh](./burst-test.sh) and [demo.sh](./demo.sh) are the operator-facing smoke/demo scripts.
+- [run-k6.sh](./run-k6.sh) and [k6/envoy-hardening.js](./k6/envoy-hardening.js) are the repeatable hardening
+  suite for `internal#1519`.
+
+Use the shell scripts for live demos and quick one-off checks. Use the `k6` suite for smoke, burst, and soak
+validation.
+
+## k6 suite
+
+The `k6` suite covers the three first-class hardening scenarios:
+
+- `public`
+  - `GET /api/v1/client/{environmentId}/environment`
+- `management`
+  - `GET /api/v1/management/me` with `x-api-key`
+- `negative`
+  - `GET /api/v2/health`
+
+Profiles:
+
+- `smoke`
+  - 3-5 requests by default to confirm the request path is Envoy-backed (`source=gateway`)
+- `burst`
+  - enough concurrency to force gateway `429`s on the covered routes
+- `soak`
+  - longer sustained load to surface `500/503`, probe flaps, or cache instability
+
+The wrapper prefers a local `k6` binary and falls back to Docker automatically.
+
+### Required environment variables
+
+- `HOST`
+  - optional in practice; defaults to `https://staging.app.formbricks.com`
+- `ENVIRONMENT_ID`
+  - required for `public`
+- `API_KEY`
+  - required for `management`
+
+### Optional environment variables
+
+- `VUS`
+  - override the profile default concurrent virtual users
+- `ITERATIONS`
+  - override the `per-vu-iterations` count used by `smoke` and `burst`
+- `DURATION`
+  - override the soak duration
+- `MAX_DURATION`
+  - override the `smoke`/`burst` max duration
+- `SLEEP_SECONDS`
+  - add a delay between iterations
+- `K6_DOCKER_IMAGE`
+  - override the default Docker fallback image (`grafana/k6:latest`)
+
+### Example
+
+```bash
+HOST=https://staging.app.formbricks.com \
+ENVIRONMENT_ID=<environment_id> \
+API_KEY=<api_key> \
+scripts/rate-limit/run-k6.sh smoke all
+```
+
+```bash
+HOST=https://staging.app.formbricks.com \
+ENVIRONMENT_ID=<environment_id> \
+scripts/rate-limit/run-k6.sh burst public
+```
+
+```bash
+HOST=https://staging.app.formbricks.com \
+API_KEY=<api_key> \
+VUS=20 \
+ITERATIONS=6 \
+scripts/rate-limit/run-k6.sh burst management
+```
+
+### What the `k6` summary reports
+
+Each run ends with a machine-readable summary block:
+
+- total requests
+- `2xx`, `429`, `5xx`, and `other` status counts
+- `gateway_routed_responses`
+- `gateway_429s`, `app_429s`, `unknown_429s`
+- p95/p99 latency
+- `result=PASS|FAIL`
+
+Pass criteria:
+
+- `public` / `management` `smoke`
+  - at least one gateway-tagged response
+  - no `429`s
+  - no `5xx`s
+  - no `status_other` responses
+- `public` / `management` `burst`
+  - at least one gateway `429`
+  - zero app `429`s
+  - zero `5xx`s
+  - zero `status_other` responses
+- `public` / `management` `soak`
+  - gateway path confirmed
+  - at least one gateway `429`
+  - zero app `429`s
+  - zero `5xx`s
+  - zero `status_other` responses
+- `negative`
+  - zero `429`s, zero `5xx`s, and zero `status_other` responses
+
+## Shell scripts
+
+The shell scripts keep their existing role as quick operator tools:
+
+- `burst-test.sh`
+  - request-by-request output for ad hoc checks or live debugging
+- `demo.sh`
+  - guided staging demo flow used in meetings
+
+## How the shell scripts classify responses
+
+`source=gateway` means the response included Envoy-visible headers such as `x-envoy-ratelimited` or
+`x-ratelimit-*`, or the POC returned an empty-body `429`.
+
+`source=app` means the response body matched the Formbricks `too_many_requests` JSON shape.
+
+`source=unknown` means the response was neither of those and should be inspected manually.
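Note on consuming the `k6` summary block described under "What the `k6` summary reports": because each field is a `key=value` line, a CI step could gate on it with a small parser. This is a minimal sketch and not part of the suite. `parseSummary` and `sampleOutput` are hypothetical names, and the sample values are invented for illustration.

```javascript
// Sketch only: field names mirror the summary fields listed above; the
// numbers in `sampleOutput` are made up and do not come from a real run.
function parseSummary(text) {
  const fields = {};
  for (const line of text.split("\n")) {
    // Accept only `key=value` lines; banner lines and blanks are skipped.
    const match = /^([a-z0-9_]+)=(.*)$/.exec(line.trim());
    if (match) {
      fields[match[1]] = match[2];
    }
  }
  return fields;
}

const sampleOutput = [
  "=== Envoy Hardening Summary ===",
  "profile=burst",
  "scenario=public",
  "status_429=38",
  "gateway_429s=38",
  "app_429s=0",
  "result=PASS",
].join("\n");

const summary = parseSummary(sampleOutput);
console.log(summary.result === "PASS" && Number(summary.app_429s) === 0); // true
```

Values stay strings after parsing, so numeric checks need an explicit `Number(...)` conversion as shown.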
@@ -0,0 +1,218 @@
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+SCENARIO="${1:-}"
+HOST="${HOST:-https://staging.app.formbricks.com}"
+ENVIRONMENT_ID="${ENVIRONMENT_ID:-}"
+API_KEY="${API_KEY:-}"
+COUNT="${COUNT:-20}"
+CONCURRENCY="${CONCURRENCY:-1}"
+SLEEP_SECONDS="${SLEEP_SECONDS:-0}"
+RESPONSE_ID="${RESPONSE_ID:-envoy-poc-response}"
+WEBHOOK_ID="${WEBHOOK_ID:-envoy-poc-webhook}"
+FILE_KEY="${FILE_KEY:-envoy-poc-file.txt}"
+
+if [[ -z "$SCENARIO" ]]; then
+  echo "usage: scripts/rate-limit/burst-test.sh <scenario>" >&2
+  exit 1
+fi
+
+require_env_id() {
+  if [[ -z "$ENVIRONMENT_ID" ]]; then
+    echo "ENVIRONMENT_ID is required for scenario '$SCENARIO'" >&2
+    exit 1
+  fi
+}
+
+require_api_key() {
+  if [[ -z "$API_KEY" ]]; then
+    echo "API_KEY is required for scenario '$SCENARIO'" >&2
+    exit 1
+  fi
+}
+
+METHOD="GET"
+URL=""
+BODY=""
+CONTENT_TYPE=""
+EXTRA_HEADERS=()
+
+case "$SCENARIO" in
+  login)
+    METHOD="POST"
+    URL="$HOST/api/auth/callback/credentials"
+    BODY="email=rate-limit%40example.com&password=wrong-password"
+    CONTENT_TYPE="application/x-www-form-urlencoded"
+    ;;
+  verify-token)
+    METHOD="POST"
+    URL="$HOST/api/auth/callback/token"
+    BODY="token=invalid-token"
+    CONTENT_TYPE="application/x-www-form-urlencoded"
+    ;;
+  v1-client-environment)
+    require_env_id
+    URL="$HOST/api/v1/client/$ENVIRONMENT_ID/environment"
+    ;;
+  v1-client-storage)
+    require_env_id
+    METHOD="POST"
+    URL="$HOST/api/v1/client/$ENVIRONMENT_ID/storage"
+    BODY='{}'
+    CONTENT_TYPE="application/json"
+    ;;
+  v2-responses-post)
+    require_env_id
+    METHOD="POST"
+    URL="$HOST/api/v2/client/$ENVIRONMENT_ID/responses"
+    BODY='{}'
+    CONTENT_TYPE="application/json"
+    ;;
+  v2-responses-put)
+    require_env_id
+    METHOD="PUT"
+    URL="$HOST/api/v2/client/$ENVIRONMENT_ID/responses/$RESPONSE_ID"
+    BODY='{}'
+    CONTENT_TYPE="application/json"
+    ;;
+  v2-displays-post)
+    require_env_id
+    METHOD="POST"
+    URL="$HOST/api/v2/client/$ENVIRONMENT_ID/displays"
+    BODY='{}'
+    CONTENT_TYPE="application/json"
+    ;;
+  v2-client-storage)
+    require_env_id
+    METHOD="POST"
+    URL="$HOST/api/v2/client/$ENVIRONMENT_ID/storage"
+    BODY='{}'
+    CONTENT_TYPE="application/json"
+    ;;
+  v2-health)
+    URL="$HOST/api/v2/health"
+    ;;
+  management-api-key)
+    require_api_key
+    URL="$HOST/api/v1/management/me"
+    EXTRA_HEADERS+=("x-api-key: $API_KEY")
+    ;;
+  management-storage-api-key)
+    require_api_key
+    METHOD="POST"
+    URL="$HOST/api/v1/management/storage"
+    BODY='{}'
+    CONTENT_TYPE="application/json"
+    EXTRA_HEADERS+=("x-api-key: $API_KEY")
+    ;;
+  webhooks-api-key)
+    require_api_key
+    URL="$HOST/api/v1/webhooks/$WEBHOOK_ID"
+    EXTRA_HEADERS+=("x-api-key: $API_KEY")
+    ;;
+  storage-delete-api-key)
+    require_env_id
+    require_api_key
+    METHOD="DELETE"
+    URL="$HOST/storage/$ENVIRONMENT_ID/public/$FILE_KEY"
+    EXTRA_HEADERS+=("x-api-key: $API_KEY")
+    ;;
+  *)
+    echo "unknown scenario: $SCENARIO" >&2
+    exit 1
+    ;;
+esac
+
+TMP_DIR="$(mktemp -d)"
+trap 'rm -rf "$TMP_DIR"' EXIT
+
+run_request() {
+  local i="$1"
+  local header_file
+  local body_file
+  local status_code
+  local source
+  local header_summary
+  local has_gateway_headers="false"
+  header_file="$TMP_DIR/$i.headers"
+  body_file="$TMP_DIR/$i.body"
+
+  curl_args=(
+    -sS
+    -D "$header_file"
+    -o "$body_file"
+    -X "$METHOD"
+  )
+
+  if [[ -n "$CONTENT_TYPE" ]]; then
+    curl_args+=(-H "content-type: $CONTENT_TYPE")
+  fi
+
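+  # Bash 3.x + `set -u` treats empty arrays as unset during expansion, so guard the loop.
+  # `${#array[@]}` is safe under `set -u`; a `:-0` default inside a length expansion is a
+  # "bad substitution" on newer Bash releases, so it is not used here.
+  if [[ ${#EXTRA_HEADERS[@]} -gt 0 ]]; then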
+    for header in "${EXTRA_HEADERS[@]}"; do
+      curl_args+=(-H "$header")
+    done
+  fi
+
+  if [[ -n "$BODY" ]]; then
+    curl_args+=(--data "$BODY")
+  fi
+
+  status_code="$(curl "${curl_args[@]}" -w '%{http_code}' "$URL")"
+
+  source="unknown"
+  if rg -q '"code":"too_many_requests"' "$body_file"; then
+    source="app"
+  else
+    if rg -qi '^(x-envoy-ratelimited|x-ratelimit-limit|x-ratelimit-remaining|x-ratelimit-reset):' "$header_file"; then
+      has_gateway_headers="true"
+    fi
+
+    if [[ "$has_gateway_headers" == "true" ]]; then
+      source="gateway"
+    elif [[ "$status_code" == "429" && ! -s "$body_file" ]]; then
+      source="gateway"
+    fi
+  fi
+
+  printf '%03d scenario=%s status=%s source=%s\n' "$i" "$SCENARIO" "$status_code" "$source"
+
+  if [[ "$status_code" == "429" ]]; then
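+    # `paste -d` cycles through single-character delimiters, so `-d '; '` would alternate
+    # between ";" and " ". Join with awk to get a literal "; " between headers.
+    header_summary="$(
+      {
+        tr -d '\r' < "$header_file" |
+          rg -i '^(x-envoy-ratelimited|x-ratelimit-limit|x-ratelimit-remaining|x-ratelimit-reset|content-type|retry-after):' |
+          awk 'NR > 1 { printf "; " } { printf "%s", $0 } END { if (NR > 0) print "" }'
+      } || true
+    )"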
+    printf '  headers: %s\n' "${header_summary:-<none>}"
+  fi
+
+  if [[ "$SLEEP_SECONDS" != "0" ]]; then
+    sleep "$SLEEP_SECONDS"
+  fi
+}
+
+if (( CONCURRENCY <= 1 )); then
+  for i in $(seq 1 "$COUNT"); do
+    run_request "$i"
+  done
+else
+  pids=()
+
+  for i in $(seq 1 "$COUNT"); do
+    run_request "$i" &
+    pids+=("$!")
+
+    if (( ${#pids[@]} >= CONCURRENCY )); then
+      wait "${pids[0]}"
+      pids=("${pids[@]:1}")
+    fi
+  done
+
+  for pid in "${pids[@]}"; do
+    wait "$pid"
+  done
+fi
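Note on the concurrency loop that closes `burst-test.sh`: it keeps a sliding window of background jobs and always waits on the oldest one before starting another. The same pattern can be sketched with promises. `runBounded`, `makeTracker`, and `fakeRequest` are hypothetical stand-ins for the `pids` loop and `run_request`, and timers replace real HTTP calls.

```javascript
// Sketch only: mirrors the shape of the shell loop, not its exact behavior.
async function runBounded(tasks, concurrency) {
  const inFlight = [];
  const results = [];

  for (const task of tasks) {
    inFlight.push(task()); // like `run_request "$i" &`
    if (inFlight.length >= concurrency) {
      results.push(await inFlight.shift()); // like `wait "${pids[0]}"`
    }
  }

  for (const pending of inFlight) {
    results.push(await pending); // drain the remainder, like the final `wait` loop
  }
  return results;
}

// Tracks how many fake requests overlap so the bound can be observed.
function makeTracker() {
  return { active: 0, peak: 0 };
}

// Returns a task that resolves with `id` after `delayMs` milliseconds.
function fakeRequest(tracker, id, delayMs) {
  return () =>
    new Promise((resolve) => {
      tracker.active += 1;
      tracker.peak = Math.max(tracker.peak, tracker.active);
      setTimeout(() => {
        tracker.active -= 1;
        resolve(id);
      }, delayMs);
    });
}

const tracker = makeTracker();
const tasks = [30, 5, 20, 10, 15].map((delayMs, index) =>
  fakeRequest(tracker, index + 1, delayMs)
);

runBounded(tasks, 2).then((ids) => {
  console.log(ids.join(","), "peak=" + tracker.peak);
});
```

Waiting on the oldest job keeps the bookkeeping simple, at the cost that one slow request at the head of the window delays the next launch even if newer requests have already finished.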
@@ -0,0 +1,284 @@
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+MODE="${1:-all}"
+HOST="${HOST:-https://staging.app.formbricks.com}"
+ENVIRONMENT_ID="${ENVIRONMENT_ID:-}"
+API_KEY="${API_KEY:-}"
+PUBLIC_COUNT="${PUBLIC_COUNT:-125}"
+PUBLIC_CONCURRENCY="${PUBLIC_CONCURRENCY:-20}"
+MANAGEMENT_COUNT="${MANAGEMENT_COUNT:-200}"
+MANAGEMENT_CONCURRENCY="${MANAGEMENT_CONCURRENCY:-40}"
+NEGATIVE_COUNT="${NEGATIVE_COUNT:-25}"
+NEGATIVE_CONCURRENCY="${NEGATIVE_CONCURRENCY:-10}"
+LOG_WINDOW="${LOG_WINDOW:-5m}"
+WORKDIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+BURST_SCRIPT="$WORKDIR/burst-test.sh"
+TMP_DIR="$(mktemp -d)"
+trap 'rm -rf "$TMP_DIR"' EXIT
+
+usage() {
+  cat <<'EOF'
+usage: scripts/rate-limit/demo.sh [preflight|public|management|negative|evidence|all]
+
+Required environment variables:
+  ENVIRONMENT_ID   Staging environment ID for public client route checks
+  API_KEY          Single-environment staging API key for management route checks
+
+Optional environment variables:
+  HOST                    Defaults to https://staging.app.formbricks.com
+  PUBLIC_COUNT            Defaults to 125
+  PUBLIC_CONCURRENCY      Defaults to 20
+  MANAGEMENT_COUNT        Defaults to 200
+  MANAGEMENT_CONCURRENCY  Defaults to 40
+  NEGATIVE_COUNT          Defaults to 25
+  NEGATIVE_CONCURRENCY    Defaults to 10
+  LOG_WINDOW              Defaults to 5m
+EOF
+}
+
+require_env_id() {
+  if [[ -z "$ENVIRONMENT_ID" ]]; then
+    echo "ENVIRONMENT_ID is required" >&2
+    exit 1
+  fi
+}
+
+require_api_key() {
+  if [[ -z "$API_KEY" ]]; then
+    echo "API_KEY is required" >&2
+    exit 1
+  fi
+}
+
+section() {
+  printf '\n== %s ==\n' "$1"
+}
+
+run_and_capture() {
+  local output_file="$1"
+  shift
+
+  "$@" | tee "$output_file"
+}
+
+summarize_output() {
+  local output_file="$1"
+
+  awk '
+    /scenario=/ {
+      status = ""
+      source = ""
+      for (i = 1; i <= NF; i++) {
+        if ($i ~ /^status=/) {
+          status = substr($i, 8)
+        }
+        if ($i ~ /^source=/) {
+          source = substr($i, 8)
+        }
+      }
+      if (status != "" && source != "") {
+        counts[status "|" source]++
+      }
+    }
+    END {
+      for (key in counts) {
+        split(key, parts, "|")
+        printf "status=%s source=%s count=%d\n", parts[1], parts[2], counts[key]
+      }
+    }
+  ' "$output_file" | sort
+}
+
+print_summary_insights() {
+  local output_file="$1"
+  local gateway_429_count
+  local app_429_count
+  local unknown_429_count
+  local server_error_count
+
+  gateway_429_count="$(count_matches 'status=429 source=gateway' "$output_file")"
+  app_429_count="$(count_matches 'status=429 source=app' "$output_file")"
+  unknown_429_count="$(count_matches 'status=429 source=unknown' "$output_file")"
+  server_error_count="$(count_matches 'status=5[0-9][0-9] source=' "$output_file")"
+
+  echo "gateway_429s=$gateway_429_count"
+  echo "app_429s=$app_429_count"
+  echo "unknown_429s=$unknown_429_count"
+  echo "server_errors=$server_error_count"
+}
+
+count_matches() {
+  local pattern="$1"
+  local input_file="$2"
+  local count
+
+  count="$(rg -c "$pattern" "$input_file" 2>/dev/null || true)"
+  echo "${count:-0}"
+}
+
+assert_gateway_probe() {
+  local output_file="$1"
+  if ! rg -q 'source=gateway' "$output_file"; then
+    echo "Expected a gateway-tagged response in probe output, but none was found." >&2
+    exit 1
+  fi
+}
+
+assert_gateway_rate_limit() {
+  local output_file="$1"
+  if ! rg -q 'status=429 source=gateway' "$output_file"; then
+    echo "Expected at least one gateway 429 in burst output, but none was found." >&2
+    exit 1
+  fi
+}
+
+assert_no_429() {
+  local output_file="$1"
+  if rg -q 'status=429 source=' "$output_file"; then
+    echo "Expected no 429s in excluded-route output, but at least one was found." >&2
+    exit 1
+  fi
+}
+
+show_envoy_log_evidence() {
+  local pattern="$1"
+
+  section "Recent Envoy Evidence"
+
+  if ! command -v kubectl >/dev/null 2>&1; then
+    echo "kubectl not available; skipping live Envoy log evidence."
+    return
+  fi
+
+  if ! kubectl logs -n formbricks-stage deploy/formbricks-stage-envoy -c envoy --since="$LOG_WINDOW" 2>/dev/null | \
+    rg "$pattern" | \
+    rg 'request_rate_limited|response_flags":"RL"'; then
+    echo "No matching Envoy log lines found in the last $LOG_WINDOW."
+  fi
+}
+
+print_known_caveat() {
+  cat <<'EOF'
+Known staging caveat:
+- intermittent 500/503 responses can still appear under high burst load on the environment route
+- this is a staging stability issue on top of the Envoy POC, not a sign that the gateway path is bypassed
+- the demo still passes if you see gateway-tagged 429 responses
+EOF
+}
+
+run_preflight() {
+  require_env_id
+  require_api_key
+
+  section "Preflight"
+  echo "Host: $HOST"
+  echo "Environment ID: $ENVIRONMENT_ID"
+  echo "API key: provided"
+
+  section "Public Route Probe"
+  public_probe_output="$TMP_DIR/public-probe.txt"
+  run_and_capture \
+    "$public_probe_output" \
+    env HOST="$HOST" ENVIRONMENT_ID="$ENVIRONMENT_ID" COUNT=1 "$BURST_SCRIPT" v1-client-environment
+  assert_gateway_probe "$public_probe_output"
+
+  section "Management Route Probe"
+  management_probe_output="$TMP_DIR/management-probe.txt"
+  run_and_capture \
+    "$management_probe_output" \
+    env HOST="$HOST" API_KEY="$API_KEY" COUNT=1 "$BURST_SCRIPT" management-api-key
+  assert_gateway_probe "$management_probe_output"
+
+  print_known_caveat
+}
+
+run_public_demo() {
+  require_env_id
+
+  section "Public IP Demo"
+  echo "Route: GET /api/v1/client/$ENVIRONMENT_ID/environment"
+  echo "Expected: gateway 429 after threshold"
+  public_output="$TMP_DIR/public-burst.txt"
+  run_and_capture \
+    "$public_output" \
+    env HOST="$HOST" ENVIRONMENT_ID="$ENVIRONMENT_ID" COUNT="$PUBLIC_COUNT" CONCURRENCY="$PUBLIC_CONCURRENCY" \
+      "$BURST_SCRIPT" v1-client-environment
+
+  section "Public IP Summary"
+  summarize_output "$public_output"
+  print_summary_insights "$public_output"
+  assert_gateway_rate_limit "$public_output"
+  show_envoy_log_evidence 'formbricks-stage-v1-client'
+}
+
+run_management_demo() {
+  require_api_key
+
+  section "API Key Demo"
+  echo "Route: GET /api/v1/management/me"
+  echo "Expected: gateway 429 after threshold"
+  management_output="$TMP_DIR/management-burst.txt"
+  run_and_capture \
+    "$management_output" \
+    env HOST="$HOST" API_KEY="$API_KEY" COUNT="$MANAGEMENT_COUNT" CONCURRENCY="$MANAGEMENT_CONCURRENCY" \
+      "$BURST_SCRIPT" management-api-key
+
+  section "API Key Summary"
+  summarize_output "$management_output"
+  print_summary_insights "$management_output"
+  assert_gateway_rate_limit "$management_output"
+  show_envoy_log_evidence 'formbricks-stage-v1-management'
+}
+
+run_negative_demo() {
+  section "Excluded Route Demo"
+  echo "Route: GET /api/v2/health"
+  echo "Expected: no 429 responses because this route is excluded from the gateway policy set"
+  negative_output="$TMP_DIR/negative-burst.txt"
+  run_and_capture \
+    "$negative_output" \
+    env HOST="$HOST" COUNT="$NEGATIVE_COUNT" CONCURRENCY="$NEGATIVE_CONCURRENCY" \
+      "$BURST_SCRIPT" v2-health
+
+  section "Excluded Route Summary"
+  summarize_output "$negative_output"
+  print_summary_insights "$negative_output"
+  assert_no_429 "$negative_output"
+}
+
+run_evidence_only() {
+  show_envoy_log_evidence 'formbricks-stage-v1-client|formbricks-stage-v1-management'
+}
+
+case "$MODE" in
+  preflight)
+    run_preflight
+    ;;
+  public)
+    run_public_demo
+    ;;
+  management)
+    run_management_demo
+    ;;
+  negative)
+    run_negative_demo
+    ;;
+  evidence)
+    run_evidence_only
+    ;;
+  all)
+    run_preflight
+    run_public_demo
+    run_management_demo
+    run_negative_demo
+    ;;
+  -h|--help|help)
+    usage
+    ;;
+  *)
+    usage >&2
+    exit 1
+    ;;
+esac
@@ -0,0 +1,272 @@
+import http from "k6/http";
+import { Counter } from "k6/metrics";
+import { sleep } from "k6";
+
+const PROFILE = (__ENV.PROFILE || "smoke").toLowerCase();
+const SCENARIO = (__ENV.SCENARIO || "public").toLowerCase();
+const HOST = __ENV.HOST || "https://staging.app.formbricks.com";
+const ENVIRONMENT_ID = __ENV.ENVIRONMENT_ID || "";
+const API_KEY = __ENV.API_KEY || "";
+const SLEEP_SECONDS = Number(__ENV.SLEEP_SECONDS || "0");
+
+const totalResponses = new Counter("total_responses");
+const status2xx = new Counter("status_2xx");
+const status429 = new Counter("status_429");
+const status5xx = new Counter("status_5xx");
+const statusOther = new Counter("status_other");
+const gatewayRoutedResponses = new Counter("gateway_routed_responses");
+const gateway429s = new Counter("gateway_429s");
+const app429s = new Counter("app_429s");
+const unknown429s = new Counter("unknown_429s");
+
+const profileDefaults = {
+  smoke: {
+    public: { executor: "per-vu-iterations", vus: 1, iterations: 3 },
+    management: { executor: "per-vu-iterations", vus: 1, iterations: 3 },
+    negative: { executor: "per-vu-iterations", vus: 1, iterations: 5 },
+  },
+  burst: {
+    public: { executor: "per-vu-iterations", vus: 20, iterations: 7 },
+    management: { executor: "per-vu-iterations", vus: 20, iterations: 6 },
+    negative: { executor: "per-vu-iterations", vus: 10, iterations: 3 },
+  },
+  soak: {
+    public: { executor: "constant-vus", vus: 10, duration: "5m" },
+    management: { executor: "constant-vus", vus: 15, duration: "5m" },
+    negative: { executor: "constant-vus", vus: 5, duration: "3m" },
+  },
+};
+
+function requireValue(value, name, scenario) {
+  if (!value) {
+    throw new Error(`${name} is required for scenario "${scenario}"`);
+  }
+}
+
+function getScenarioConfig(profile, scenario) {
+  const profileConfig = profileDefaults[profile];
+  if (!profileConfig) {
+    throw new Error(`Unsupported PROFILE "${profile}". Use smoke, burst, or soak.`);
+  }
+
+  const scenarioConfig = profileConfig[scenario];
+  if (!scenarioConfig) {
+    throw new Error(`Unsupported SCENARIO "${scenario}". Use public, management, or negative.`);
+  }
+
+  return scenarioConfig;
+}
+
+function buildOptions() {
+  const base = getScenarioConfig(PROFILE, SCENARIO);
+  const scenarioConfig = {
+    executor: base.executor,
+    exec: "runScenario",
+    gracefulStop: "0s",
+    tags: {
+      profile: PROFILE,
+      scenario: SCENARIO,
+    },
+  };
+
+  if (base.executor === "constant-vus") {
+    scenarioConfig.vus = Number(__ENV.VUS || String(base.vus));
+    scenarioConfig.duration = __ENV.DURATION || base.duration;
+  } else {
+    scenarioConfig.vus = Number(__ENV.VUS || String(base.vus));
+    scenarioConfig.iterations = Number(__ENV.ITERATIONS || String(base.iterations));
+    scenarioConfig.maxDuration = __ENV.MAX_DURATION || "10m";
+  }
+
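+  return {
+    // k6's default summary trend stats stop at p(95); request p(99) explicitly so
+    // handleSummary can report http_req_duration_p99_ms instead of falling back to 0.
+    summaryTrendStats: ["avg", "min", "med", "max", "p(90)", "p(95)", "p(99)"],
+    scenarios: {
+      envoy_hardening: scenarioConfig,
+    },
+  };
+}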
+
+function getHeader(response, name) {
+  const target = name.toLowerCase();
+  for (const [key, value] of Object.entries(response.headers || {})) {
+    if (key.toLowerCase() === target) {
+      return Array.isArray(value) ? value[0] : value;
+    }
+  }
+
+  return undefined;
+}
+
+function hasGatewayHeaders(response) {
+  return [
+    "x-envoy-ratelimited",
+    "x-ratelimit-limit",
+    "x-ratelimit-remaining",
+    "x-ratelimit-reset",
+  ].some((header) => Boolean(getHeader(response, header)));
+}
+
+function classifyResponse(response) {
+  const body = typeof response.body === "string" ? response.body : "";
+  if (body.includes('"code":"too_many_requests"')) {
+    return "app";
+  }
+
+  if (hasGatewayHeaders(response)) {
+    return "gateway";
+  }
+
+  if (response.status === 429 && body.trim().length === 0) {
+    return "gateway";
+  }
+
+  return "unknown";
+}
+
+function buildRequest() {
+  switch (SCENARIO) {
+    case "public":
+      requireValue(ENVIRONMENT_ID, "ENVIRONMENT_ID", SCENARIO);
+      return {
+        label: "GET /api/v1/client/{environmentId}/environment",
+        method: "GET",
+        url: `${HOST}/api/v1/client/${ENVIRONMENT_ID}/environment`,
+        body: null,
+        params: { timeout: "30s" },
+      };
+    case "management":
+      requireValue(API_KEY, "API_KEY", SCENARIO);
+      return {
+        label: "GET /api/v1/management/me",
+        method: "GET",
+        url: `${HOST}/api/v1/management/me`,
+        body: null,
+        params: {
+          timeout: "30s",
+          headers: {
+            "x-api-key": API_KEY,
+          },
+        },
+      };
+    case "negative":
+      return {
+        label: "GET /api/v2/health",
+        method: "GET",
+        url: `${HOST}/api/v2/health`,
+        body: null,
+        params: { timeout: "30s" },
+      };
+    default:
+      throw new Error(`Unsupported SCENARIO "${SCENARIO}"`);
+  }
+}
+
+function recordResponse(response) {
+  totalResponses.add(1);
+
+  if (response.status >= 200 && response.status < 300) {
+    status2xx.add(1);
+  } else if (response.status === 429) {
+    status429.add(1);
+  } else if (response.status >= 500) {
+    status5xx.add(1);
+  } else {
+    statusOther.add(1);
+  }
+
+  const source = classifyResponse(response);
+  if (source === "gateway") {
+    gatewayRoutedResponses.add(1);
+  }
+
+  if (response.status === 429) {
+    if (source === "gateway") {
+      gateway429s.add(1);
+    } else if (source === "app") {
+      app429s.add(1);
+    } else {
+      unknown429s.add(1);
+    }
+  }
+}
+
+function metricCount(data, name) {
+  return data.metrics[name]?.values?.count ?? 0;
+}
+
+function trendValue(data, name, key) {
+  return data.metrics[name]?.values?.[key] ?? 0;
+}
+
+function evaluateRun(data) {
+  const total429s = metricCount(data, "status_429");
+  const gatewayTagged = metricCount(data, "gateway_routed_responses");
+  const gatewayLimited = metricCount(data, "gateway_429s");
+  const appLimited = metricCount(data, "app_429s");
+  const errors5xx = metricCount(data, "status_5xx");
+  const otherStatuses = metricCount(data, "status_other");
+
+  if (SCENARIO === "negative") {
+    return total429s === 0 && errors5xx === 0 && otherStatuses === 0;
+  }
+
+  if (PROFILE === "smoke") {
+    return gatewayTagged > 0 && total429s === 0 && errors5xx === 0 && otherStatuses === 0;
+  }
+
+  if (PROFILE === "burst") {
+    return gatewayLimited > 0 && appLimited === 0 && errors5xx === 0 && otherStatuses === 0;
+  }
+
+  return (
+    gatewayTagged > 0 &&
+    gatewayLimited > 0 &&
+    appLimited === 0 &&
+    errors5xx === 0 &&
+    otherStatuses === 0
+  );
+}
+
+function formatNumber(value) {
+  return Number(value || 0).toFixed(2);
+}
+
+export const options = buildOptions();
+
+export function runScenario() {
+  const request = buildRequest();
+  const response = http.request(request.method, request.url, request.body, request.params);
+  recordResponse(response);
+
+  if (SLEEP_SECONDS > 0) {
+    sleep(SLEEP_SECONDS);
+  }
+}
+
+export function handleSummary(data) {
+  const result = evaluateRun(data) ? "PASS" : "FAIL";
+  const totalRequests = metricCount(data, "http_reqs");
+  const summary = [
+    "=== Envoy Hardening Summary ===",
+    `profile=${PROFILE}`,
+    `scenario=${SCENARIO}`,
+    `host=${HOST}`,
+    `route=${buildRequest().label}`,
+    `total_requests=${totalRequests}`,
+    `status_2xx=${metricCount(data, "status_2xx")}`,
+    `status_429=${metricCount(data, "status_429")}`,
+    `status_5xx=${metricCount(data, "status_5xx")}`,
+    `status_other=${metricCount(data, "status_other")}`,
+    `gateway_routed_responses=${metricCount(data, "gateway_routed_responses")}`,
+    `gateway_429s=${metricCount(data, "gateway_429s")}`,
+    `app_429s=${metricCount(data, "app_429s")}`,
+    `unknown_429s=${metricCount(data, "unknown_429s")}`,
+    `http_req_duration_p95_ms=${formatNumber(trendValue(data, "http_req_duration", "p(95)"))}`,
+    `http_req_duration_p99_ms=${formatNumber(trendValue(data, "http_req_duration", "p(99)"))}`,
+    `iteration_duration_p95_ms=${formatNumber(trendValue(data, "iteration_duration", "p(95)"))}`,
+    `result=${result}`,
+  ];
+
+  return {
+    stdout: `${summary.join("\n")}\n`,
+  };
+}
@@ -0,0 +1,110 @@
+#!/usr/bin/env bash
+
+set -euo pipefail
+
+PROFILE="${1:-}"
+SCENARIO="${2:-all}"
+K6_DOCKER_IMAGE="${K6_DOCKER_IMAGE:-grafana/k6:latest}"
+
+if [[ -z "$PROFILE" ]]; then
+  echo "usage: scripts/rate-limit/run-k6.sh <smoke|burst|soak> [public|management|negative|all]" >&2
+  exit 1
+fi
+
+case "$PROFILE" in
+  smoke|burst|soak) ;;
+  *)
+    echo "invalid profile: $PROFILE" >&2
+    exit 1
+    ;;
+esac
+
+case "$SCENARIO" in
+  public|management|negative|all) ;;
+  *)
+    echo "invalid scenario: $SCENARIO" >&2
+    exit 1
+    ;;
+esac
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
+K6_SCRIPT="/workspace/scripts/rate-limit/k6/envoy-hardening.js"
+
+build_env_args() {
+  local key
+  env_args=()
+  for key in HOST ENVIRONMENT_ID API_KEY VUS ITERATIONS DURATION MAX_DURATION SLEEP_SECONDS; do
+    if [[ -n "${!key:-}" ]]; then
+      env_args+=(-e "$key=${!key}")
+    fi
+  done
+}
+
+run_single() {
+  local scenario="$1"
+  local tmp_output
+  local status=0
+  local command_status=0
+  tmp_output="$(mktemp)"
+
+  echo "== k6 profile=$PROFILE scenario=$scenario =="
+
+  build_env_args
+
+  if command -v k6 >/dev/null 2>&1; then
+    set +e
+    (
+      cd "$REPO_ROOT"
+      k6 run "${env_args[@]}" -e "PROFILE=$PROFILE" -e "SCENARIO=$scenario" \
+        "scripts/rate-limit/k6/envoy-hardening.js"
+    ) | tee "$tmp_output"
+    command_status="${PIPESTATUS[0]}"
+    set -e
+  else
+    if ! docker info >/dev/null 2>&1; then
+      echo "docker is required for the k6 fallback, but the Docker daemon is not reachable" >&2
+      rm -f "$tmp_output"
+      return 1
+    fi
+
+    set +e
+    docker run --rm -i \
+      -v "$REPO_ROOT:/workspace" \
+      -w /workspace \
+      "${env_args[@]}" \
+      -e "PROFILE=$PROFILE" \
+      -e "SCENARIO=$scenario" \
+      "$K6_DOCKER_IMAGE" run "$K6_SCRIPT" | tee "$tmp_output"
+    command_status="${PIPESTATUS[0]}"
+    set -e
+  fi
+
+  if [[ "$command_status" -ne 0 ]]; then
+    rm -f "$tmp_output"
+    return "$command_status"
+  fi
+
+  if grep -q '^result=FAIL$' "$tmp_output"; then
+    status=1
+  fi
+
+  rm -f "$tmp_output"
+  return "$status"
+}
+
+overall_status=0
+
+if [[ "$SCENARIO" == "all" ]]; then
+  for scenario in public management negative; do
+    if ! run_single "$scenario"; then
+      overall_status=1
+    fi
+  done
+else
+  if ! run_single "$SCENARIO"; then
+    overall_status=1
+  fi
+fi
+
+exit "$overall_status"