Rate Limiting
Request rate limits, error handling, and best practices for the Pendium MCP server.
Limits
The MCP endpoint enforces per-caller rate limits within a 60-second sliding window:
| Caller type | Limit | Key |
|---|---|---|
| Authenticated (API key) | 60 requests / minute | User ID |
| Unauthenticated | 15 requests / minute | IP address |
These limits apply at the transport level — every MCP request (including initialize, tool calls, and listing) counts toward the limit.
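A client can stay under these limits by tracking its own request timestamps. The following is a minimal sketch of a client-side sliding-window guard, not the server's implementation: the class name `SlidingWindowLimiter` and its methods are hypothetical, and the server's exact window accounting may differ from this approximation.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side guard that mirrors a sliding-window limit."""

    def __init__(self, limit=60, window=60.0, clock=time.monotonic):
        self.limit = limit      # 60 for authenticated callers, 15 otherwise
        self.window = window    # window length in seconds
        self.clock = clock      # injectable so the guard can be tested
        self._sent = deque()    # timestamps of requests still in the window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()

    def wait_time(self):
        """Return the seconds to wait before another request fits."""
        now = self.clock()
        self._evict(now)
        if len(self._sent) < self.limit:
            return 0.0
        # The oldest request leaves the window at _sent[0] + window.
        return self._sent[0] + self.window - now

    def record(self):
        """Record a request sent now. Count every MCP request, including initialize."""
        self._sent.append(self.clock())
```

Before each request, sleep for `wait_time()` seconds if it is non-zero, then call `record()` once the request is sent.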
Rate limit response
When the limit is exceeded, the server responds with HTTP 429 and a JSON-RPC error body.
The response includes these headers:
| Header | Description |
|---|---|
| `Retry-After` | Seconds until the rate limit resets |
| `X-RateLimit-Limit` | Maximum requests allowed in the window |
| `X-RateLimit-Remaining` | Remaining requests (always 0 when limited) |
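A client can use the `Retry-After` header to decide how long to back off. The sketch below is one way to do that; the function name `retry_delay` and the 5-second fallback are assumptions, and it expects the headers as a plain dict with the exact casing shown above (many HTTP libraries provide a case-insensitive mapping instead).

```python
def retry_delay(status, headers, default=5.0):
    """Return the seconds to wait before retrying, or None if not rate limited.

    `default` is an assumed fallback for a missing or unparseable header.
    """
    if status != 429:
        return None
    value = headers.get("Retry-After")
    try:
        # Retry-After is described as a number of seconds.
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default
```

The caller would sleep for the returned number of seconds and then resend the same request.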
Best practices
- Poll scans at 30–60 second intervals. After triggering `scan_visibility`, call `get_scan_status` every 30–60 seconds rather than in a tight loop.
- Cache account data. The `get_account` response changes infrequently, so avoid calling it on every interaction.
- Batch your reads. If you need multiple pieces of data (report, voice, factsheet), call all tools in sequence within a single conversation turn rather than spreading them across many turns with overhead between each.
- Use authenticated requests. The 60 req/min authenticated limit is 4x the unauthenticated limit.
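
The polling and caching practices above could be sketched as follows. Everything named here is hypothetical: `poll_until`, `CachedCall`, the 300-second cache lifetime, and the caller-supplied `fetch_status` and `is_done` callables, which stand in for however your MCP client invokes `get_scan_status` and interprets its result. The shape of the tool responses is not assumed.

```python
import time


def poll_until(fetch_status, is_done, interval=30.0, max_polls=20, sleep=time.sleep):
    """Call fetch_status() every `interval` seconds until is_done(result) is true."""
    for _ in range(max_polls):
        status = fetch_status()      # e.g. a wrapper around get_scan_status
        if is_done(status):
            return status
        sleep(interval)              # 30-60 s keeps polling well under the limit
    raise TimeoutError("scan did not finish within the polling budget")


class CachedCall:
    """Cache the result of a slow-changing call such as get_account."""

    def __init__(self, fetch, ttl=300.0, clock=time.monotonic):
        self.fetch = fetch           # zero-argument callable that does the request
        self.ttl = ttl               # assumed cache lifetime in seconds
        self.clock = clock
        self._has_value = False
        self._value = None
        self._fetched_at = 0.0

    def get(self):
        now = self.clock()
        if not self._has_value or now - self._fetched_at >= self.ttl:
            # Only spend a request when the cached value is missing or stale.
            self._value = self.fetch()
            self._fetched_at = now
            self._has_value = True
        return self._value
```

Passing `sleep` and `clock` as parameters keeps both helpers testable without real delays.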