Author SHA1 Message Date
Bhagya Amarasinghe a67aa1b3ce chore: add envoy rate-limit hardening tooling 2026-04-01 14:17:04 +05:30
6 changed files with 1124 additions and 0 deletions
@@ -0,0 +1,108 @@
# Envoy Rate-Limit Coverage Matrix
This document is the staging coverage source of truth for `formbricks/internal#1519`.
It answers two separate questions:
- which request prefixes currently traverse Envoy on staging
- which current application limiter call sites are already covered, coverable later, or intentionally left in
the app
## Current Envoy-routed prefixes
The staging ALB forwards only these prefixes to Envoy:
- `/api/auth/callback`
- `/api/v1/client`
- `/api/v2/client`
- `/api/v1/management`
- `/api/v1/webhooks`
- `/storage`
Everything else still goes directly to the chart-managed Formbricks Service.
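The prefix list above amounts to a routing predicate. The following is a minimal sketch, assuming plain prefix matching on a path-segment boundary; the actual ALB rule syntax is not shown in this document, and `isEnvoyRouted` is a hypothetical helper used only for illustration.

```javascript
// Prefixes copied from the list above; everything else bypasses Envoy.
const ENVOY_PREFIXES = [
  "/api/auth/callback",
  "/api/v1/client",
  "/api/v2/client",
  "/api/v1/management",
  "/api/v1/webhooks",
  "/storage",
];

// Hypothetical helper: true when the path equals a prefix or continues it
// at a "/" boundary. Real ALB path patterns may match differently.
function isEnvoyRouted(path) {
  return ENVOY_PREFIXES.some(
    (prefix) => path === prefix || path.startsWith(`${prefix}/`)
  );
}

console.log(isEnvoyRouted("/api/v1/client/env123/environment")); // true
console.log(isEnvoyRouted("/api/v2/health")); // false
```

A check like this is a quick way to decide whether a new route can be expected to show gateway headers at all before looking for a rate-limit policy.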
## Current Envoy rate-limited route groups
These route groups already have active Envoy rate-limit policies on staging:
- `POST /api/auth/callback/credentials`
  - coarse `40 / hour` gateway approximation for the stricter app `10 / 15 minutes` login limit
- `POST /api/auth/callback/token`
  - `10 / hour`
- `GET|POST|PUT /api/v1/client/{environmentId}/(environment|responses|responses/{responseId}|displays|user)`
  - `100 / minute`
- `POST /api/v1/client/{environmentId}/storage`
  - `5 / minute`
- `POST|PUT /api/v2/client/{environmentId}/responses(?:/{responseId})?`
  - `100 / minute`
- `POST /api/v2/client/{environmentId}/displays`
  - `100 / minute`
- `POST /api/v2/client/{environmentId}/storage`
  - `5 / minute`
- `GET|POST|PUT|DELETE /api/v1/management/*` when `x-api-key` is present
  - `100 / minute`
- `POST /api/v1/management/storage` when `x-api-key` is present
  - `5 / minute`
- `GET|POST|PUT|DELETE /api/v1/webhooks/*` when `x-api-key` is present
  - `100 / minute`
- `DELETE /storage/{environmentId}/{public|private}/{fileName}` when `x-api-key` is present
  - `5 / minute`
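To make the method-plus-path matching concrete, here is a rough sketch that expresses three of the route groups above as regex rules. It is not the Envoy configuration: `RULES` and `findLimit` are hypothetical names, and the sketch assumes that path parameters never contain `/` and that the first matching rule wins.

```javascript
// A subset of the route groups above, expressed as regex rules.
// Assumption: path parameters never contain "/", so "[^\/]+" stands in for them.
const RULES = [
  {
    methods: ["POST"],
    pattern: /^\/api\/v1\/client\/[^\/]+\/storage$/,
    limit: "5 / minute",
  },
  {
    methods: ["GET", "POST", "PUT"],
    pattern:
      /^\/api\/v1\/client\/[^\/]+\/(environment|responses|responses\/[^\/]+|displays|user)$/,
    limit: "100 / minute",
  },
  {
    methods: ["POST", "PUT"],
    pattern: /^\/api\/v2\/client\/[^\/]+\/responses(?:\/[^\/]+)?$/,
    limit: "100 / minute",
  },
];

// Hypothetical lookup: the limit of the first matching rule, or null when
// the request falls outside this subset of the policy set.
function findLimit(method, path) {
  const rule = RULES.find(
    (r) => r.methods.includes(method) && r.pattern.test(path)
  );
  return rule ? rule.limit : null;
}

console.log(findLimit("GET", "/api/v1/client/env123/environment")); // 100 / minute
console.log(findLimit("POST", "/api/v1/client/env123/storage")); // 5 / minute
console.log(findLimit("GET", "/api/v1/client/og")); // null
```

Note that `/api/v1/client/og` and `OPTIONS` requests fall through to `null` here, which matches their status as routed but not rate-limited.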
## Call-site coverage matrix
This matrix covers every current `applyIPRateLimit` / `applyRateLimit` call site on `main`.
| Limiter config | Caller / route family | Key type | Stable gateway path | Status | Reason |
| --- | --- | --- | --- | --- | --- |
| `rateLimitConfigs.auth.login` | `modules/auth/lib/authOptions.ts` credentials callback | IP | `POST /api/auth/callback/credentials` | `covered_now` | Stable public callback path. Envoy uses a coarse `40 / hour` approximation while the stricter app limit remains active. |
| `rateLimitConfigs.auth.verifyEmail` | `modules/auth/lib/authOptions.ts` token callback | IP | `POST /api/auth/callback/token` | `covered_now` | Stable public callback path and identical `10 / hour` limit at the gateway. |
| `rateLimitConfigs.auth.signup` | `modules/auth/signup/actions.ts` | IP | None | `not_coverable_now` | Server action flow, not a stable gateway-managed HTTP route. |
| `rateLimitConfigs.auth.forgotPassword` | `modules/auth/forgot-password/actions.ts` | IP | None | `not_coverable_now` | Server action flow, not a stable gateway-managed HTTP route. |
| `rateLimitConfigs.auth.verifyEmail` | `modules/auth/verification-requested/actions.ts` resend verification flow | IP | None | `not_coverable_now` | Same config as the covered token callback, but this caller is a server action instead of a stable public API path. |
| `rateLimitConfigs.api.client` | `app/lib/api/with-api-logging.ts` public V1 client routes | IP | `/api/v1/client/{environmentId}/(environment\|responses\|responses/{responseId}\|displays\|user)` | `covered_now` | Public V1 client paths already match the Envoy policy set. |
| `rateLimitConfigs.storage.upload` | `app/api/v1/client/[environmentId]/storage/route.ts` via `with-api-logging.ts` | IP | `POST /api/v1/client/{environmentId}/storage` | `covered_now` | Custom storage upload limit is already broken out as its own Envoy rule. |
| `rateLimitConfigs.api.v1` | `app/lib/api/with-api-logging.ts` API-key-authenticated V1 management routes | API key | `/api/v1/management/*` except storage | `covered_now` | Stable `x-api-key` surface already routed through Envoy. |
| `rateLimitConfigs.storage.upload` | `app/api/v1/management/storage/route.ts` API-key branch via `with-api-logging.ts` | API key | `POST /api/v1/management/storage` | `covered_now` | Custom storage upload limit already has its own Envoy rule. |
| `rateLimitConfigs.api.v1` | `app/lib/api/with-api-logging.ts` API-key-authenticated webhooks | API key | `/api/v1/webhooks/*` | `covered_now` | Stable `x-api-key` webhook surface already routed through Envoy. |
| `rateLimitConfigs.api.v1` | `app/lib/api/with-api-logging.ts` session-authenticated integration routes | Session user ID | None | `not_coverable_now` | Session identity is only resolved inside the app, not at the gateway. |
| `rateLimitConfigs.api.v1` | `app/lib/api/with-api-logging.ts` session-authenticated V1 management routes | Session user ID | None | `not_coverable_now` | The current Envoy rules only cover the API-key branch of the V1 management surface. |
| `rateLimitConfigs.api.v1` | `app/api/v1/management/me/route.ts` API-key branch | API key | `GET /api/v1/management/me` | `covered_now` | Direct handler, but the path is already behind Envoy and matched by the V1 management policy. |
| `rateLimitConfigs.api.v1` | `app/api/v1/management/me/route.ts` session branch | Session user ID | None | `not_coverable_now` | Same path, but this branch keys on session user ID instead of an edge-visible identifier. |
| `rateLimitConfigs.api.v2` | `modules/api/v2/auth/api-wrapper.ts` authenticated V2 API surface | API key | `/api/v2/*` outside `/api/v2/client` | `coverable_later` | Stable API-key paths exist, but the current Envoy POC only routes the public `/api/v2/client` surface. |
| `rateLimitConfigs.api.v3` or custom V3 configs | `app/api/v3/lib/api-wrapper.ts` | Session user ID or API key | Route-specific, mainly `/api/v3/*` | `coverable_later` | The wrapper supports mixed auth modes. Production hardening needs a deliberate V3 route inventory before moving any of it to Envoy. |
| `rateLimitConfigs.storage.delete` | `app/storage/[environmentId]/[accessType]/[fileName]/route.ts` API-key branch | API key | `DELETE /storage/{environmentId}/{public\|private}/{fileName}` | `covered_now` | Envoy already enforces the API-key branch on the stable storage delete path. |
| `rateLimitConfigs.storage.delete` | `app/storage/[environmentId]/[accessType]/[fileName]/route.ts` user branch | User ID | None | `not_coverable_now` | The user-authenticated delete branch depends on app-side identity. |
| `rateLimitConfigs.actions.sendLinkSurveyEmail` | `modules/survey/link/actions.ts` | IP | None | `not_coverable_now` | Server action flow with no stable public API contract. |
| `rateLimitConfigs.actions.emailUpdate` | `app/(app)/environments/[environmentId]/settings/(account)/profile/actions.ts` | User ID | None | `not_coverable_now` | User-scoped server action, not an edge-visible identifier. |
| `rateLimitConfigs.actions.surveyFollowUp` | `modules/survey/follow-ups/lib/follow-ups.ts` | Organization ID | None | `not_coverable_now` | Organization-scoped internal workflow, not a stable public API route. |
| `rateLimitConfigs.actions.licenseRecheck` | `modules/ee/license-check/actions.ts` | User ID | None | `not_coverable_now` | Internal server action with user-scoped identity resolved inside the app. |
## Envoy-covered paths without a matching current app limiter call site
These routes are already covered by Envoy even though `main` does not currently have a dedicated
`applyIPRateLimit` / `applyRateLimit` call site for them:
- `POST|PUT /api/v2/client/{environmentId}/responses(?:/{responseId})?`
- `POST /api/v2/client/{environmentId}/displays`
- `POST /api/v2/client/{environmentId}/storage`
They stay in scope for the hardening load tests because they are part of the active gateway policy set.
## Explicit exclusions
- `/api/v1/client/og`
  - routed through the `/api/v1/client` prefix, but intentionally excluded from Envoy rate limiting
- `/api/v2/health`
  - not routed through Envoy and explicitly used as the negative control
- `OPTIONS`
  - excluded from the current Envoy rate-limit match set
## How to interpret failures
- Gateway `429`
  - look for `x-envoy-ratelimited` or `x-ratelimit-*`
  - body does not use the Formbricks `code: "too_many_requests"` JSON shape
- App `429`
  - V1 responses use `apps/web/app/lib/api/response.ts`
  - V2 responses use `apps/web/modules/api/v2/lib/response.ts`
  - V3 responses use `apps/web/app/api/v3/lib/response.ts`
@@ -0,0 +1,132 @@
# Envoy Rate-Limit Validation
This directory holds the staging validation tooling for the Envoy rate-limit POC:
- [burst-test.sh](./burst-test.sh) and [demo.sh](./demo.sh) are the operator-facing smoke/demo scripts.
- [run-k6.sh](./run-k6.sh) and [k6/envoy-hardening.js](./k6/envoy-hardening.js) are the repeatable hardening
suite for `internal#1519`.
Use the shell scripts for live demos and quick one-off checks. Use the `k6` suite for smoke, burst, and soak
validation.
## k6 suite
The `k6` suite covers the three first-class hardening scenarios:
- `public`
  - `GET /api/v1/client/{environmentId}/environment`
- `management`
  - `GET /api/v1/management/me` with `x-api-key`
- `negative`
  - `GET /api/v2/health`
Profiles:
- `smoke`
  - 1-5 requests to confirm the request path is Envoy-backed (`source=gateway`)
- `burst`
  - enough concurrency to force gateway `429`s on the covered routes
- `soak`
  - longer sustained load to surface `500/503`, probe flaps, or cache instability
The wrapper prefers a local `k6` binary and falls back to Docker automatically.
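As a worked check on the `burst` profile, the arithmetic below shows why the default settings in `k6/envoy-hardening.js` should exceed the `100 / minute` gateway limit. It is a sketch under two assumptions: all requests land inside a single rate-limit window, and they share one limiter key (the same client IP or API key). `expectedOverflow` is a hypothetical helper.

```javascript
// Default burst settings from k6/envoy-hardening.js and the gateway limit
// that applies to both covered routes.
const GATEWAY_LIMIT_PER_MINUTE = 100;
const burstDefaults = {
  public: { vus: 20, iterations: 7 },
  management: { vus: 20, iterations: 6 },
};

// With the per-vu-iterations executor, every VU runs `iterations` requests,
// so the total is a simple product.
function expectedOverflow({ vus, iterations }, limit) {
  const total = vus * iterations;
  return { total, overLimit: Math.max(0, total - limit) };
}

for (const [scenario, cfg] of Object.entries(burstDefaults)) {
  const { total, overLimit } = expectedOverflow(cfg, GATEWAY_LIMIT_PER_MINUTE);
  console.log(`${scenario}: total=${total} over_limit=${overLimit}`);
}
// public: total=140 over_limit=40
// management: total=120 over_limit=20
```

If a run spans more than one window, fewer requests are rejected than `over_limit` suggests, which is why the pass criteria only require at least one gateway `429`.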
### Required environment variables
- `ENVIRONMENT_ID`
  - required for `public`
- `API_KEY`
  - required for `management`
### Optional environment variables
- `HOST`
  - defaults to `https://staging.app.formbricks.com`
- `VUS`
  - override the profile default concurrent virtual users
- `ITERATIONS`
  - override the `per-vu-iterations` count used by `smoke` and `burst`
- `DURATION`
  - override the soak duration
- `MAX_DURATION`
  - override the `smoke`/`burst` max duration
- `SLEEP_SECONDS`
  - add a delay between iterations
- `K6_DOCKER_IMAGE`
  - override the default Docker fallback image (`grafana/k6:latest`)
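The override behaviour can be pictured as "a non-empty environment value wins, otherwise the profile default applies". The sketch below illustrates that rule with a hypothetical `resolveSetting` helper and the `burst` / `management` defaults; it is not the suite's own code, and it leaves values as strings rather than converting them.

```javascript
// Hypothetical resolver: a non-empty environment value wins,
// otherwise the profile default applies.
function resolveSetting(env, key, fallback) {
  const raw = env[key];
  return raw === undefined || raw === "" ? fallback : raw;
}

// Defaults for the burst/management combination, kept as strings here.
const burstManagementDefaults = {
  VUS: "20",
  ITERATIONS: "6",
  MAX_DURATION: "10m",
};

// Simulates: VUS=40 scripts/rate-limit/run-k6.sh burst management
const env = { VUS: "40" };

const resolved = {};
for (const [key, fallback] of Object.entries(burstManagementDefaults)) {
  resolved[key] = resolveSetting(env, key, fallback);
}
console.log(resolved);
```

Only `VUS` changes in this example; `ITERATIONS` and `MAX_DURATION` keep their defaults because they are absent from the simulated environment.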
### Example
```bash
HOST=https://staging.app.formbricks.com \
ENVIRONMENT_ID=<environment_id> \
API_KEY=<api_key> \
scripts/rate-limit/run-k6.sh smoke all
```
```bash
HOST=https://staging.app.formbricks.com \
ENVIRONMENT_ID=<environment_id> \
scripts/rate-limit/run-k6.sh burst public
```
```bash
HOST=https://staging.app.formbricks.com \
API_KEY=<api_key> \
VUS=20 \
ITERATIONS=6 \
scripts/rate-limit/run-k6.sh burst management
```
### What the `k6` summary reports
Each run ends with a machine-readable summary block:
- total requests
- `2xx`, `429`, `5xx`, and `other` counts
- `gateway_routed_responses`
- `gateway_429s`, `app_429s`, `unknown_429s`
- p95/p99 latency
- `result=PASS|FAIL`
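Because the summary is a list of `key=value` lines, it can be consumed with a few lines of parsing. This is a minimal sketch: `parseSummary` is a hypothetical helper, the field names follow the `handleSummary` output in `k6/envoy-hardening.js`, and the numbers in `sample` are made up for illustration.

```javascript
// Sample summary text; the counts are illustrative, not real results.
const sample = [
  "=== Envoy Hardening Summary ===",
  "profile=burst",
  "scenario=public",
  "total_requests=140",
  "status_2xx=100",
  "status_429=40",
  "gateway_429s=40",
  "app_429s=0",
  "result=PASS",
].join("\n");

// Hypothetical parser: splits each line on the first "=" and skips lines
// without a key, such as the "=== ... ===" banner (its "=" is at index 0).
function parseSummary(text) {
  const fields = {};
  for (const line of text.split("\n")) {
    const idx = line.indexOf("=");
    if (idx <= 0) continue;
    fields[line.slice(0, idx)] = line.slice(idx + 1);
  }
  return fields;
}

const summary = parseSummary(sample);
console.log(summary.result); // PASS
console.log(Number(summary.gateway_429s)); // 40
```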
Pass criteria:
- `public` / `management` `smoke`
  - at least one gateway-tagged response
  - no `429`s
  - no `5xx`s
  - no `status_other` responses
- `public` / `management` `burst`
  - at least one gateway `429`
  - zero app `429`s
  - zero `5xx`s
  - zero `status_other` responses
- `public` / `management` `soak`
  - gateway path confirmed
  - at least one gateway `429`
  - zero app `429`s
  - zero `5xx`s
  - zero `status_other` responses
- `negative`
  - zero `429`s, zero `5xx`s, and zero `status_other` responses
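The pass criteria above can be read as a single predicate over the run's counters. The sketch below restates them in plain JavaScript; `passes` and the counter field names are hypothetical, and the sample counters are invented values rather than measured results.

```javascript
// Hypothetical predicate mirroring the pass criteria listed above.
function passes(profile, scenario, c) {
  const clean = c.status5xx === 0 && c.statusOther === 0;
  if (scenario === "negative") {
    return c.status429 === 0 && clean;
  }
  if (profile === "smoke") {
    return c.gatewayTagged > 0 && c.status429 === 0 && clean;
  }
  // burst and soak both need gateway 429s and no app 429s;
  // soak additionally needs the gateway path confirmed.
  const limited = c.gateway429s > 0 && c.app429s === 0 && clean;
  return profile === "soak" ? c.gatewayTagged > 0 && limited : limited;
}

// Invented counters for a burst run that was limited only at the gateway.
const burstRun = {
  status429: 40,
  status5xx: 0,
  statusOther: 0,
  gatewayTagged: 140,
  gateway429s: 40,
  app429s: 0,
};

console.log(passes("burst", "public", burstRun)); // true
console.log(passes("burst", "public", { ...burstRun, app429s: 1 })); // false
console.log(passes("smoke", "public", burstRun)); // false
```

The last call fails because a smoke run must not see any `429`, which shows that the same counters can pass one profile and fail another.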
## Shell scripts
The shell scripts keep their existing role as quick operator tools:
- `burst-test.sh`
  - request-by-request output for ad hoc checks or live debugging
- `demo.sh`
  - guided staging demo flow used in meetings
## How the shell scripts classify responses
`source=gateway` means the response included Envoy-visible headers such as `x-envoy-ratelimited` or
`x-ratelimit-*`, or the POC returned an empty-body `429`.
`source=app` means the response body matched the Formbricks `too_many_requests` JSON shape.
`source=unknown` means the response was neither of those and should be inspected manually.
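The three classifications can be summarized as an ordered set of checks: app body first, then gateway headers, then the empty-body `429` fallback. The sketch below is a plain JavaScript restatement with a hypothetical `classify` function and a simplified response shape (`status`, `headers`, `body`); the sample responses are invented.

```javascript
// Headers treated as evidence that Envoy handled the response.
const GATEWAY_HEADERS = [
  "x-envoy-ratelimited",
  "x-ratelimit-limit",
  "x-ratelimit-remaining",
  "x-ratelimit-reset",
];

// Hypothetical classifier following the order described above.
function classify({ status, headers = {}, body = "" }) {
  // 1. The Formbricks JSON shape means the app produced the response.
  if (body.includes('"code":"too_many_requests"')) {
    return "app";
  }
  // 2. Any Envoy rate-limit header means the gateway was involved.
  const names = Object.keys(headers).map((name) => name.toLowerCase());
  if (GATEWAY_HEADERS.some((header) => names.includes(header))) {
    return "gateway";
  }
  // 3. An empty-body 429 is attributed to the gateway POC.
  if (status === 429 && body.trim() === "") {
    return "gateway";
  }
  return "unknown";
}

console.log(classify({ status: 429, headers: { "X-Envoy-RateLimited": "true" } })); // gateway
console.log(classify({ status: 429, body: '{"code":"too_many_requests"}' })); // app
console.log(classify({ status: 429, body: "<html>blocked</html>" })); // unknown
```

Checking the app body first matters: a response that carries both the Formbricks JSON shape and gateway headers is counted as `app`, so an app-side `429` is never hidden by headers the gateway adds on the way out.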
@@ -0,0 +1,218 @@
#!/usr/bin/env bash
set -euo pipefail
SCENARIO="${1:-}"
HOST="${HOST:-https://staging.app.formbricks.com}"
ENVIRONMENT_ID="${ENVIRONMENT_ID:-}"
API_KEY="${API_KEY:-}"
COUNT="${COUNT:-20}"
CONCURRENCY="${CONCURRENCY:-1}"
SLEEP_SECONDS="${SLEEP_SECONDS:-0}"
RESPONSE_ID="${RESPONSE_ID:-envoy-poc-response}"
WEBHOOK_ID="${WEBHOOK_ID:-envoy-poc-webhook}"
FILE_KEY="${FILE_KEY:-envoy-poc-file.txt}"
if [[ -z "$SCENARIO" ]]; then
echo "usage: scripts/rate-limit/burst-test.sh <scenario>" >&2
exit 1
fi
require_env_id() {
if [[ -z "$ENVIRONMENT_ID" ]]; then
echo "ENVIRONMENT_ID is required for scenario '$SCENARIO'" >&2
exit 1
fi
}
require_api_key() {
if [[ -z "$API_KEY" ]]; then
echo "API_KEY is required for scenario '$SCENARIO'" >&2
exit 1
fi
}
METHOD="GET"
URL=""
BODY=""
CONTENT_TYPE=""
EXTRA_HEADERS=()
case "$SCENARIO" in
login)
METHOD="POST"
URL="$HOST/api/auth/callback/credentials"
BODY="email=rate-limit%40example.com&password=wrong-password"
CONTENT_TYPE="application/x-www-form-urlencoded"
;;
verify-token)
METHOD="POST"
URL="$HOST/api/auth/callback/token"
BODY="token=invalid-token"
CONTENT_TYPE="application/x-www-form-urlencoded"
;;
v1-client-environment)
require_env_id
URL="$HOST/api/v1/client/$ENVIRONMENT_ID/environment"
;;
v1-client-storage)
require_env_id
METHOD="POST"
URL="$HOST/api/v1/client/$ENVIRONMENT_ID/storage"
BODY='{}'
CONTENT_TYPE="application/json"
;;
v2-responses-post)
require_env_id
METHOD="POST"
URL="$HOST/api/v2/client/$ENVIRONMENT_ID/responses"
BODY='{}'
CONTENT_TYPE="application/json"
;;
v2-responses-put)
require_env_id
METHOD="PUT"
URL="$HOST/api/v2/client/$ENVIRONMENT_ID/responses/$RESPONSE_ID"
BODY='{}'
CONTENT_TYPE="application/json"
;;
v2-displays-post)
require_env_id
METHOD="POST"
URL="$HOST/api/v2/client/$ENVIRONMENT_ID/displays"
BODY='{}'
CONTENT_TYPE="application/json"
;;
v2-client-storage)
require_env_id
METHOD="POST"
URL="$HOST/api/v2/client/$ENVIRONMENT_ID/storage"
BODY='{}'
CONTENT_TYPE="application/json"
;;
v2-health)
URL="$HOST/api/v2/health"
;;
management-api-key)
require_api_key
URL="$HOST/api/v1/management/me"
EXTRA_HEADERS+=("x-api-key: $API_KEY")
;;
management-storage-api-key)
require_api_key
METHOD="POST"
URL="$HOST/api/v1/management/storage"
BODY='{}'
CONTENT_TYPE="application/json"
EXTRA_HEADERS+=("x-api-key: $API_KEY")
;;
webhooks-api-key)
require_api_key
URL="$HOST/api/v1/webhooks/$WEBHOOK_ID"
EXTRA_HEADERS+=("x-api-key: $API_KEY")
;;
storage-delete-api-key)
require_env_id
require_api_key
METHOD="DELETE"
URL="$HOST/storage/$ENVIRONMENT_ID/public/$FILE_KEY"
EXTRA_HEADERS+=("x-api-key: $API_KEY")
;;
*)
echo "unknown scenario: $SCENARIO" >&2
exit 1
;;
esac
TMP_DIR="$(mktemp -d)"
trap 'rm -rf "$TMP_DIR"' EXIT
run_request() {
local i="$1"
local header_file
local body_file
local status_code
local source
local header_summary
local has_gateway_headers="false"
header_file="$TMP_DIR/$i.headers"
body_file="$TMP_DIR/$i.body"
curl_args=(
-sS
-D "$header_file"
-o "$body_file"
-X "$METHOD"
)
if [[ -n "$CONTENT_TYPE" ]]; then
curl_args+=(-H "content-type: $CONTENT_TYPE")
fi
# Bash 3.x + `set -u` treats empty arrays as unset during expansion, so guard the loop.
# The length expansion is safe for a declared array and does not take a `:-` default.
if [[ ${#EXTRA_HEADERS[@]} -gt 0 ]]; then
for header in "${EXTRA_HEADERS[@]}"; do
curl_args+=(-H "$header")
done
fi
if [[ -n "$BODY" ]]; then
curl_args+=(--data "$BODY")
fi
status_code="$(curl "${curl_args[@]}" -w '%{http_code}' "$URL")"
source="unknown"
if rg -q '"code":"too_many_requests"' "$body_file"; then
source="app"
else
if rg -qi '^(x-envoy-ratelimited|x-ratelimit-limit|x-ratelimit-remaining|x-ratelimit-reset):' "$header_file"; then
has_gateway_headers="true"
fi
if [[ "$has_gateway_headers" == "true" ]]; then
source="gateway"
elif [[ "$status_code" == "429" && ! -s "$body_file" ]]; then
source="gateway"
fi
fi
printf '%03d scenario=%s status=%s source=%s\n' "$i" "$SCENARIO" "$status_code" "$source"
if [[ "$status_code" == "429" ]]; then
header_summary="$(
{
tr -d '\r' < "$header_file" |
rg -i '^(x-envoy-ratelimited|x-ratelimit-limit|x-ratelimit-remaining|x-ratelimit-reset|content-type|retry-after):' |
paste -sd '; ' -
} || true
)"
printf ' headers: %s\n' "${header_summary:-<none>}"
fi
if [[ "$SLEEP_SECONDS" != "0" ]]; then
sleep "$SLEEP_SECONDS"
fi
}
if (( CONCURRENCY <= 1 )); then
for i in $(seq 1 "$COUNT"); do
run_request "$i"
done
else
pids=()
for i in $(seq 1 "$COUNT"); do
run_request "$i" &
pids+=("$!")
if (( ${#pids[@]} >= CONCURRENCY )); then
wait "${pids[0]}"
pids=("${pids[@]:1}")
fi
done
for pid in "${pids[@]}"; do
wait "$pid"
done
fi
@@ -0,0 +1,284 @@
#!/usr/bin/env bash
set -euo pipefail
MODE="${1:-all}"
HOST="${HOST:-https://staging.app.formbricks.com}"
ENVIRONMENT_ID="${ENVIRONMENT_ID:-}"
API_KEY="${API_KEY:-}"
PUBLIC_COUNT="${PUBLIC_COUNT:-125}"
PUBLIC_CONCURRENCY="${PUBLIC_CONCURRENCY:-20}"
MANAGEMENT_COUNT="${MANAGEMENT_COUNT:-200}"
MANAGEMENT_CONCURRENCY="${MANAGEMENT_CONCURRENCY:-40}"
NEGATIVE_COUNT="${NEGATIVE_COUNT:-25}"
NEGATIVE_CONCURRENCY="${NEGATIVE_CONCURRENCY:-10}"
LOG_WINDOW="${LOG_WINDOW:-5m}"
WORKDIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
BURST_SCRIPT="$WORKDIR/burst-test.sh"
TMP_DIR="$(mktemp -d)"
trap 'rm -rf "$TMP_DIR"' EXIT
usage() {
cat <<'EOF'
usage: scripts/rate-limit/demo.sh [preflight|public|management|negative|evidence|all]
Required environment variables:
ENVIRONMENT_ID Staging environment ID for public client route checks
API_KEY Single-environment staging API key for management route checks
Optional environment variables:
HOST Defaults to https://staging.app.formbricks.com
PUBLIC_COUNT Defaults to 125
PUBLIC_CONCURRENCY Defaults to 20
MANAGEMENT_COUNT Defaults to 200
MANAGEMENT_CONCURRENCY Defaults to 40
NEGATIVE_COUNT Defaults to 25
NEGATIVE_CONCURRENCY Defaults to 10
LOG_WINDOW Defaults to 5m
EOF
}
require_env_id() {
if [[ -z "$ENVIRONMENT_ID" ]]; then
echo "ENVIRONMENT_ID is required" >&2
exit 1
fi
}
require_api_key() {
if [[ -z "$API_KEY" ]]; then
echo "API_KEY is required" >&2
exit 1
fi
}
section() {
printf '\n== %s ==\n' "$1"
}
run_and_capture() {
local output_file="$1"
shift
"$@" | tee "$output_file"
}
summarize_output() {
local output_file="$1"
awk '
/scenario=/ {
status = ""
source = ""
for (i = 1; i <= NF; i++) {
if ($i ~ /^status=/) {
status = substr($i, 8)
}
if ($i ~ /^source=/) {
source = substr($i, 8)
}
}
if (status != "" && source != "") {
counts[status "|" source]++
}
}
END {
for (key in counts) {
split(key, parts, "|")
printf "status=%s source=%s count=%d\n", parts[1], parts[2], counts[key]
}
}
' "$output_file" | sort
}
print_summary_insights() {
local output_file="$1"
local gateway_429_count
local app_429_count
local unknown_429_count
local server_error_count
gateway_429_count="$(count_matches 'status=429 source=gateway' "$output_file")"
app_429_count="$(count_matches 'status=429 source=app' "$output_file")"
unknown_429_count="$(count_matches 'status=429 source=unknown' "$output_file")"
server_error_count="$(count_matches 'status=5[0-9][0-9] source=' "$output_file")"
echo "gateway_429s=$gateway_429_count"
echo "app_429s=$app_429_count"
echo "unknown_429s=$unknown_429_count"
echo "server_errors=$server_error_count"
}
count_matches() {
local pattern="$1"
local input_file="$2"
local count
count="$(rg -c "$pattern" "$input_file" 2>/dev/null || true)"
echo "${count:-0}"
}
assert_gateway_probe() {
local output_file="$1"
if ! rg -q 'source=gateway' "$output_file"; then
echo "Expected a gateway-tagged response in probe output, but none was found." >&2
exit 1
fi
}
assert_gateway_rate_limit() {
local output_file="$1"
if ! rg -q 'status=429 source=gateway' "$output_file"; then
echo "Expected at least one gateway 429 in burst output, but none was found." >&2
exit 1
fi
}
assert_no_429() {
local output_file="$1"
if rg -q 'status=429 source=' "$output_file"; then
echo "Expected no 429s in excluded-route output, but at least one was found." >&2
exit 1
fi
}
show_envoy_log_evidence() {
local pattern="$1"
section "Recent Envoy Evidence"
if ! command -v kubectl >/dev/null 2>&1; then
echo "kubectl not available; skipping live Envoy log evidence."
return
fi
if ! kubectl logs -n formbricks-stage deploy/formbricks-stage-envoy -c envoy --since="$LOG_WINDOW" 2>/dev/null | \
rg "$pattern" | \
rg 'request_rate_limited|response_flags":"RL"'; then
echo "No matching Envoy log lines found in the last $LOG_WINDOW."
fi
}
print_known_caveat() {
cat <<'EOF'
Known staging caveat:
- intermittent 500/503 responses can still appear under high burst load on the environment route
- this is a staging stability issue on top of the Envoy POC, not a sign that the gateway path is bypassed
- the demo still passes if you see gateway-tagged 429 responses
EOF
}
run_preflight() {
require_env_id
require_api_key
section "Preflight"
echo "Host: $HOST"
echo "Environment ID: $ENVIRONMENT_ID"
echo "API key: provided"
section "Public Route Probe"
public_probe_output="$TMP_DIR/public-probe.txt"
run_and_capture \
"$public_probe_output" \
env HOST="$HOST" ENVIRONMENT_ID="$ENVIRONMENT_ID" COUNT=1 "$BURST_SCRIPT" v1-client-environment
assert_gateway_probe "$public_probe_output"
section "Management Route Probe"
management_probe_output="$TMP_DIR/management-probe.txt"
run_and_capture \
"$management_probe_output" \
env HOST="$HOST" API_KEY="$API_KEY" COUNT=1 "$BURST_SCRIPT" management-api-key
assert_gateway_probe "$management_probe_output"
print_known_caveat
}
run_public_demo() {
require_env_id
section "Public IP Demo"
echo "Route: GET /api/v1/client/$ENVIRONMENT_ID/environment"
echo "Expected: gateway 429 after threshold"
public_output="$TMP_DIR/public-burst.txt"
run_and_capture \
"$public_output" \
env HOST="$HOST" ENVIRONMENT_ID="$ENVIRONMENT_ID" COUNT="$PUBLIC_COUNT" CONCURRENCY="$PUBLIC_CONCURRENCY" \
"$BURST_SCRIPT" v1-client-environment
section "Public IP Summary"
summarize_output "$public_output"
print_summary_insights "$public_output"
assert_gateway_rate_limit "$public_output"
show_envoy_log_evidence 'formbricks-stage-v1-client'
}
run_management_demo() {
require_api_key
section "API Key Demo"
echo "Route: GET /api/v1/management/me"
echo "Expected: gateway 429 after threshold"
management_output="$TMP_DIR/management-burst.txt"
run_and_capture \
"$management_output" \
env HOST="$HOST" API_KEY="$API_KEY" COUNT="$MANAGEMENT_COUNT" CONCURRENCY="$MANAGEMENT_CONCURRENCY" \
"$BURST_SCRIPT" management-api-key
section "API Key Summary"
summarize_output "$management_output"
print_summary_insights "$management_output"
assert_gateway_rate_limit "$management_output"
show_envoy_log_evidence 'formbricks-stage-v1-management'
}
run_negative_demo() {
section "Excluded Route Demo"
echo "Route: GET /api/v2/health"
echo "Expected: no 429 responses because this route is excluded from the gateway policy set"
negative_output="$TMP_DIR/negative-burst.txt"
run_and_capture \
"$negative_output" \
env HOST="$HOST" COUNT="$NEGATIVE_COUNT" CONCURRENCY="$NEGATIVE_CONCURRENCY" \
"$BURST_SCRIPT" v2-health
section "Excluded Route Summary"
summarize_output "$negative_output"
print_summary_insights "$negative_output"
assert_no_429 "$negative_output"
}
run_evidence_only() {
show_envoy_log_evidence 'formbricks-stage-v1-client|formbricks-stage-v1-management'
}
case "$MODE" in
preflight)
run_preflight
;;
public)
run_public_demo
;;
management)
run_management_demo
;;
negative)
run_negative_demo
;;
evidence)
run_evidence_only
;;
all)
run_preflight
run_public_demo
run_management_demo
run_negative_demo
;;
-h|--help|help)
usage
;;
*)
usage >&2
exit 1
;;
esac
@@ -0,0 +1,272 @@
import http from "k6/http";
import { Counter } from "k6/metrics";
import { sleep } from "k6";
const PROFILE = (__ENV.PROFILE || "smoke").toLowerCase();
const SCENARIO = (__ENV.SCENARIO || "public").toLowerCase();
const HOST = __ENV.HOST || "https://staging.app.formbricks.com";
const ENVIRONMENT_ID = __ENV.ENVIRONMENT_ID || "";
const API_KEY = __ENV.API_KEY || "";
const SLEEP_SECONDS = Number(__ENV.SLEEP_SECONDS || "0");
const totalResponses = new Counter("total_responses");
const status2xx = new Counter("status_2xx");
const status429 = new Counter("status_429");
const status5xx = new Counter("status_5xx");
const statusOther = new Counter("status_other");
const gatewayRoutedResponses = new Counter("gateway_routed_responses");
const gateway429s = new Counter("gateway_429s");
const app429s = new Counter("app_429s");
const unknown429s = new Counter("unknown_429s");
const profileDefaults = {
smoke: {
public: { executor: "per-vu-iterations", vus: 1, iterations: 3 },
management: { executor: "per-vu-iterations", vus: 1, iterations: 3 },
negative: { executor: "per-vu-iterations", vus: 1, iterations: 5 },
},
burst: {
public: { executor: "per-vu-iterations", vus: 20, iterations: 7 },
management: { executor: "per-vu-iterations", vus: 20, iterations: 6 },
negative: { executor: "per-vu-iterations", vus: 10, iterations: 3 },
},
soak: {
public: { executor: "constant-vus", vus: 10, duration: "5m" },
management: { executor: "constant-vus", vus: 15, duration: "5m" },
negative: { executor: "constant-vus", vus: 5, duration: "3m" },
},
};
function requireValue(value, name, scenario) {
if (!value) {
throw new Error(`${name} is required for scenario "${scenario}"`);
}
}
function getScenarioConfig(profile, scenario) {
const profileConfig = profileDefaults[profile];
if (!profileConfig) {
throw new Error(`Unsupported PROFILE "${profile}". Use smoke, burst, or soak.`);
}
const scenarioConfig = profileConfig[scenario];
if (!scenarioConfig) {
throw new Error(`Unsupported SCENARIO "${scenario}". Use public, management, or negative.`);
}
return scenarioConfig;
}
function buildOptions() {
  const base = getScenarioConfig(PROFILE, SCENARIO);
  const scenarioConfig = {
    executor: base.executor,
    exec: "runScenario",
    gracefulStop: "0s",
    // VUS overrides the profile default for both executors.
    vus: Number(__ENV.VUS || String(base.vus)),
    tags: {
      profile: PROFILE,
      scenario: SCENARIO,
    },
  };
  if (base.executor === "constant-vus") {
    scenarioConfig.duration = __ENV.DURATION || base.duration;
  } else {
    scenarioConfig.iterations = Number(__ENV.ITERATIONS || String(base.iterations));
    scenarioConfig.maxDuration = __ENV.MAX_DURATION || "10m";
  }
  return {
    // k6 leaves p(99) out of trend stats by default; request it so handleSummary can report it.
    summaryTrendStats: ["avg", "min", "med", "max", "p(90)", "p(95)", "p(99)"],
    scenarios: {
      envoy_hardening: scenarioConfig,
    },
  };
}
function getHeader(response, name) {
const target = name.toLowerCase();
for (const [key, value] of Object.entries(response.headers || {})) {
if (key.toLowerCase() === target) {
return Array.isArray(value) ? value[0] : value;
}
}
return undefined;
}
function hasGatewayHeaders(response) {
return [
"x-envoy-ratelimited",
"x-ratelimit-limit",
"x-ratelimit-remaining",
"x-ratelimit-reset",
].some((header) => Boolean(getHeader(response, header)));
}
function classifyResponse(response) {
const body = typeof response.body === "string" ? response.body : "";
if (body.includes('"code":"too_many_requests"')) {
return "app";
}
if (hasGatewayHeaders(response)) {
return "gateway";
}
if (response.status === 429 && body.trim().length === 0) {
return "gateway";
}
return "unknown";
}
function buildRequest() {
switch (SCENARIO) {
case "public":
requireValue(ENVIRONMENT_ID, "ENVIRONMENT_ID", SCENARIO);
return {
label: "GET /api/v1/client/{environmentId}/environment",
method: "GET",
url: `${HOST}/api/v1/client/${ENVIRONMENT_ID}/environment`,
body: null,
params: { timeout: "30s" },
};
case "management":
requireValue(API_KEY, "API_KEY", SCENARIO);
return {
label: "GET /api/v1/management/me",
method: "GET",
url: `${HOST}/api/v1/management/me`,
body: null,
params: {
timeout: "30s",
headers: {
"x-api-key": API_KEY,
},
},
};
case "negative":
return {
label: "GET /api/v2/health",
method: "GET",
url: `${HOST}/api/v2/health`,
body: null,
params: { timeout: "30s" },
};
default:
throw new Error(`Unsupported SCENARIO "${SCENARIO}"`);
}
}
function recordResponse(response) {
totalResponses.add(1);
if (response.status >= 200 && response.status < 300) {
status2xx.add(1);
} else if (response.status === 429) {
status429.add(1);
} else if (response.status >= 500) {
status5xx.add(1);
} else {
statusOther.add(1);
}
const source = classifyResponse(response);
if (source === "gateway") {
gatewayRoutedResponses.add(1);
}
if (response.status === 429) {
if (source === "gateway") {
gateway429s.add(1);
} else if (source === "app") {
app429s.add(1);
} else {
unknown429s.add(1);
}
}
}
function metricCount(data, name) {
return data.metrics[name]?.values?.count ?? 0;
}
function trendValue(data, name, key) {
return data.metrics[name]?.values?.[key] ?? 0;
}
function evaluateRun(data) {
const total429s = metricCount(data, "status_429");
const gatewayTagged = metricCount(data, "gateway_routed_responses");
const gatewayLimited = metricCount(data, "gateway_429s");
const appLimited = metricCount(data, "app_429s");
const errors5xx = metricCount(data, "status_5xx");
const otherStatuses = metricCount(data, "status_other");
if (SCENARIO === "negative") {
return total429s === 0 && errors5xx === 0 && otherStatuses === 0;
}
if (PROFILE === "smoke") {
return gatewayTagged > 0 && total429s === 0 && errors5xx === 0 && otherStatuses === 0;
}
if (PROFILE === "burst") {
return gatewayLimited > 0 && appLimited === 0 && errors5xx === 0 && otherStatuses === 0;
}
return (
gatewayTagged > 0 &&
gatewayLimited > 0 &&
appLimited === 0 &&
errors5xx === 0 &&
otherStatuses === 0
);
}
function formatNumber(value) {
return Number(value || 0).toFixed(2);
}
export const options = buildOptions();
export function runScenario() {
const request = buildRequest();
const response = http.request(request.method, request.url, request.body, request.params);
recordResponse(response);
if (SLEEP_SECONDS > 0) {
sleep(SLEEP_SECONDS);
}
}
export function handleSummary(data) {
const result = evaluateRun(data) ? "PASS" : "FAIL";
const totalRequests = data.metrics.http_reqs?.values?.count ?? 0;
const summary = [
"=== Envoy Hardening Summary ===",
`profile=${PROFILE}`,
`scenario=${SCENARIO}`,
`host=${HOST}`,
`route=${buildRequest().label}`,
`total_requests=${totalRequests}`,
`status_2xx=${metricCount(data, "status_2xx")}`,
`status_429=${metricCount(data, "status_429")}`,
`status_5xx=${metricCount(data, "status_5xx")}`,
`status_other=${metricCount(data, "status_other")}`,
`gateway_routed_responses=${metricCount(data, "gateway_routed_responses")}`,
`gateway_429s=${metricCount(data, "gateway_429s")}`,
`app_429s=${metricCount(data, "app_429s")}`,
`unknown_429s=${metricCount(data, "unknown_429s")}`,
`http_req_duration_p95_ms=${formatNumber(trendValue(data, "http_req_duration", "p(95)"))}`,
`http_req_duration_p99_ms=${formatNumber(trendValue(data, "http_req_duration", "p(99)"))}`,
`iteration_duration_p95_ms=${formatNumber(trendValue(data, "iteration_duration", "p(95)"))}`,
`result=${result}`,
];
return {
stdout: `${summary.join("\n")}\n`,
};
}
@@ -0,0 +1,110 @@
#!/usr/bin/env bash
set -euo pipefail
PROFILE="${1:-}"
SCENARIO="${2:-all}"
K6_DOCKER_IMAGE="${K6_DOCKER_IMAGE:-grafana/k6:latest}"
if [[ -z "$PROFILE" ]]; then
echo "usage: scripts/rate-limit/run-k6.sh <smoke|burst|soak> [public|management|negative|all]" >&2
exit 1
fi
case "$PROFILE" in
smoke|burst|soak) ;;
*)
echo "invalid profile: $PROFILE" >&2
exit 1
;;
esac
case "$SCENARIO" in
public|management|negative|all) ;;
*)
echo "invalid scenario: $SCENARIO" >&2
exit 1
;;
esac
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
K6_SCRIPT="/workspace/scripts/rate-limit/k6/envoy-hardening.js"
build_env_args() {
local key
env_args=()
for key in HOST ENVIRONMENT_ID API_KEY VUS ITERATIONS DURATION MAX_DURATION SLEEP_SECONDS; do
if [[ -n "${!key:-}" ]]; then
env_args+=(-e "$key=${!key}")
fi
done
}
run_single() {
local scenario="$1"
local tmp_output
local status=0
local command_status=0
tmp_output="$(mktemp)"
echo "== k6 profile=$PROFILE scenario=$scenario =="
build_env_args
if command -v k6 >/dev/null 2>&1; then
set +e
(
cd "$REPO_ROOT"
k6 run ${env_args[@]+"${env_args[@]}"} -e "PROFILE=$PROFILE" -e "SCENARIO=$scenario" \
"scripts/rate-limit/k6/envoy-hardening.js"
) | tee "$tmp_output"
command_status="${PIPESTATUS[0]}"
set -e
else
if ! docker info >/dev/null 2>&1; then
echo "docker is required for the k6 fallback, but the Docker daemon is not reachable" >&2
rm -f "$tmp_output"
return 1
fi
set +e
docker run --rm -i \
-v "$REPO_ROOT:/workspace" \
-w /workspace \
${env_args[@]+"${env_args[@]}"} \
-e "PROFILE=$PROFILE" \
-e "SCENARIO=$scenario" \
"$K6_DOCKER_IMAGE" run "$K6_SCRIPT" | tee "$tmp_output"
command_status="${PIPESTATUS[0]}"
set -e
fi
if [[ "$command_status" -ne 0 ]]; then
rm -f "$tmp_output"
return "$command_status"
fi
if rg -q '^result=FAIL$' "$tmp_output"; then
status=1
fi
rm -f "$tmp_output"
return "$status"
}
overall_status=0
if [[ "$SCENARIO" == "all" ]]; then
for scenario in public management negative; do
if ! run_single "$scenario"; then
overall_status=1
fi
done
else
if ! run_single "$SCENARIO"; then
overall_status=1
fi
fi
exit "$overall_status"