0016. Notification Service Rate Limiting Strategy
Status: Accepted
Date: 2026-01-23
Context: Outbound rate limiting for SES, Twilio, and future providers so limits are respected and critical messages are not starved.
Context
The Notification Service sends notifications to external providers (AWS SES, Twilio, etc.), each with different API rate limits and constraints:
- AWS SES: 1 email/sec (sandbox), 14 emails/sec (production), burst limits
- Twilio SMS: Account-specific limits (varies by account tier)
- Future Providers: Each will have their own constraints
The existing libs/security rate limiting framework (RateLimitingFilter) is designed for
incoming HTTP request rate limiting (per-user, per-service, per-IP).
This is fundamentally different from what the notification service needs:
outgoing notification rate limiting to external providers with provider-specific constraints.
Requirements:
- Respect provider-specific API limits (SES, Twilio, etc.)
- Ensure critical notifications (password resets) are not blocked
- Support priority-based throttling
- Handle provider rate limit errors gracefully
Decision
The Notification Service will implement per-provider rate limiting with priority-aware throttling.
Rate Limiting Strategy:
- Per-Provider Rate Limiting
  - Each provider (AWS SES, Twilio, etc.) has its own rate limiter
  - Provider-specific limits configured per provider
  - Limits respect external service constraints
- Priority-Aware Throttling
  - CRITICAL notifications: Bypass or have very high limits (e.g., 1000/min)
  - HIGH notifications: High limits (e.g., 500/min)
  - NORMAL notifications: Standard limits (e.g., 100/min)
  - LOW notifications: Lower limits (e.g., 50/min), may be throttled further under load
- Capacity Management
  - When system load > 80%: Throttle LOW priority notifications
  - When system load > 95%: Pause LOW priority processing
  - CRITICAL and HIGH always processed regardless of load
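The capacity management thresholds can be sketched as a small decision function. This is a minimal sketch, assuming system load is available as a fraction between 0.0 and 1.0. The names `LoadShedder`, `Action`, and `decide` are hypothetical, and `NotificationPriority` is redefined locally only to keep the example self-contained. NORMAL is treated like HIGH here, which is an assumption: the thresholds only mention LOW, CRITICAL, and HIGH.

```java
enum NotificationPriority { CRITICAL, HIGH, NORMAL, LOW }

// Hypothetical outcome of a capacity check.
enum Action { PROCESS, THROTTLE, PAUSE }

final class LoadShedder {
    static final double THROTTLE_THRESHOLD = 0.80; // load > 80%: throttle LOW
    static final double PAUSE_THRESHOLD = 0.95;    // load > 95%: pause LOW

    // systemLoad is assumed to be a fraction in [0.0, 1.0].
    static Action decide(NotificationPriority priority, double systemLoad) {
        // Only LOW is shed under load; NORMAL passing through is an assumption.
        if (priority != NotificationPriority.LOW) {
            return Action.PROCESS;
        }
        if (systemLoad > PAUSE_THRESHOLD) {
            return Action.PAUSE;
        }
        if (systemLoad > THROTTLE_THRESHOLD) {
            return Action.THROTTLE;
        }
        return Action.PROCESS;
    }
}
```

For example, `LoadShedder.decide(NotificationPriority.LOW, 0.90)` yields `THROTTLE`, while any CRITICAL notification yields `PROCESS` at any load.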
Implementation:
- Custom rate limiter in infrastructure/provider/ or infrastructure/retry/
- Provider-specific limits configured per provider
- Priority-aware: CRITICAL/HIGH may bypass or have higher limits, NORMAL/LOW respect provider limits
- Rate limit errors trigger retry logic with exponential backoff
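The exponential backoff mentioned in the last bullet could look like the sketch below. The class name `Backoff`, the base delay, and the cap are assumptions for illustration; this ADR does not specify concrete retry timings.

```java
import java.time.Duration;

// Hypothetical helper: delay doubles on every attempt, capped at a maximum.
final class Backoff {
    private final long baseMillis;
    private final long maxMillis;

    Backoff(Duration base, Duration max) {
        if (base.isNegative() || base.isZero() || max.compareTo(base) < 0) {
            throw new IllegalArgumentException("require 0 < base <= max");
        }
        this.baseMillis = base.toMillis();
        this.maxMillis = max.toMillis();
    }

    // Delay before retry number `attempt` (1-based): base * 2^(attempt - 1), capped at max.
    Duration delayFor(int attempt) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        int shift = Math.min(attempt - 1, 62);
        // Compare against max before shifting so the multiplication cannot overflow.
        long delay = (baseMillis > (maxMillis >> shift)) ? maxMillis : (baseMillis << shift);
        return Duration.ofMillis(delay);
    }
}
```

With a 500 ms base and a 30 s cap, the delays are 500 ms, 1 s, 2 s, and so on until the cap is reached. Production retry loops often add random jitter on top of this so that retries from many notifications do not align.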
Rationale
Why Per-Provider (Not Per-Channel or Per-Priority Only):
- Provider Constraints - Each provider has different limits (SES ≠ Twilio)
- Granularity - More granular than per-channel (multiple providers per channel)
- Accuracy - Respects actual external service constraints
- Flexibility - Can add new providers without affecting existing ones
Why Not Use Existing libs/security Rate Limiting:
- Different Purpose - Existing framework is for incoming HTTP requests, not outgoing provider calls
- Provider-Specific - Need provider-specific limits, not user/service-based
- Priority-Aware - Need priority-based throttling, not just flat limits
- Custom Logic - Requires custom logic for provider failover and retry
Why Priority-Aware:
- Critical Notifications - Password resets, account lockouts must not be blocked
- Graceful Degradation - Low-priority notifications can be delayed during high load
- Resource Optimization - Process important notifications first
- User Experience - Critical notifications always get through
Consequences
Positive:
- Provider Compliance - Respects external service rate limits
- Critical Notifications Protected - CRITICAL/HIGH notifications always processed
- Graceful Degradation - System degrades gracefully under load
- Flexibility - Easy to add new providers with their own limits
- Cost Optimization - Reduces rate limit errors and associated costs
Negative / Tradeoffs:
- Custom Implementation - Requires custom rate limiter (not using existing framework)
- Configuration Overhead - Must configure limits per provider
- Complexity - Priority-aware logic adds complexity
- Monitoring - Need metrics for rate limiting per provider/priority
Mitigations:
- Reuse Patterns - Can reuse rate limiting patterns from libs/security (Bucket4j, etc.)
- Configuration - Provider limits in configuration files
- Metrics - Rate limiting metrics integrated with existing metrics infrastructure
- Documentation - Clear documentation of provider limits and configuration
Implementation
Rate Limiter Structure:
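@ApplicationScoped
public class ProviderRateLimiter {
    // One limiter per provider key (e.g., "ses", "twilio")
    private final Map<String, RateLimiter> providerLimiters;

    public boolean tryConsume(String provider, NotificationPriority priority) {
        RateLimiter limiter = providerLimiters.get(provider);
        if (limiter == null) {
            // Fail fast instead of hitting a NullPointerException on an unconfigured provider
            throw new IllegalArgumentException("No rate limiter configured for provider: " + provider);
        }
        // Priority-aware: CRITICAL bypasses, LOW respects strict limits
        return limiter.tryConsume(priority);
    }
}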
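The `RateLimiter` type used in the structure above is not defined in this ADR. One possible shape, shown only as a sketch, is a token bucket per priority. The names `TokenBucket` and `PriorityRateLimiter`, the injected nanosecond clock, and the choice to deny requests for priorities without a configured bucket are all assumptions, and `NotificationPriority` is redefined locally to keep the example self-contained.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.LongSupplier;

enum NotificationPriority { CRITICAL, HIGH, NORMAL, LOW }

// Token bucket refilled continuously at `perMinute` tokens per minute.
final class TokenBucket {
    private final double capacity;
    private final double tokensPerNano;
    private final LongSupplier nanoClock;
    private double tokens;
    private long lastRefill;

    TokenBucket(long perMinute, LongSupplier nanoClock) {
        this.capacity = perMinute;
        this.tokensPerNano = perMinute / 60e9;
        this.nanoClock = nanoClock;
        this.tokens = perMinute; // start with a full bucket
        this.lastRefill = nanoClock.getAsLong();
    }

    synchronized boolean tryConsume() {
        long now = nanoClock.getAsLong();
        // Add tokens earned since the last call, never exceeding capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * tokensPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}

// Hypothetical per-provider limiter holding one bucket per priority.
final class PriorityRateLimiter {
    private final Map<NotificationPriority, TokenBucket> buckets =
            new EnumMap<>(NotificationPriority.class);

    PriorityRateLimiter(Map<NotificationPriority, Long> perMinuteLimits, LongSupplier nanoClock) {
        perMinuteLimits.forEach((p, limit) -> buckets.put(p, new TokenBucket(limit, nanoClock)));
    }

    boolean tryConsume(NotificationPriority priority) {
        TokenBucket bucket = buckets.get(priority);
        // Assumption: a priority with no configured limit is denied rather than allowed.
        return bucket != null && bucket.tryConsume();
    }
}
```

In production the clock would be `System::nanoTime`; injecting it keeps the sketch testable. Note that independent per-priority buckets do not by themselves keep the combined rate under the provider's own limit, so a shared provider-level bucket may also be needed.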
Configuration:
# AWS SES Rate Limits
notification.rate-limit.ses.critical.per-minute=1000
notification.rate-limit.ses.high.per-minute=500
notification.rate-limit.ses.normal.per-minute=100
notification.rate-limit.ses.low.per-minute=50
# Twilio Rate Limits (when implemented)
notification.rate-limit.twilio.critical.per-minute=1000
notification.rate-limit.twilio.high.per-minute=500
notification.rate-limit.twilio.normal.per-minute=100
notification.rate-limit.twilio.low.per-minute=50
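To illustrate how keys of this shape could be turned into per-provider limits, the sketch below parses them from a plain `java.util.Properties` object. The class name `RateLimitConfig` is hypothetical, and a real service would more likely read these values through its framework's configuration mechanism. `NotificationPriority` is redefined locally to keep the example self-contained.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

enum NotificationPriority { CRITICAL, HIGH, NORMAL, LOW }

// Hypothetical parser for keys like notification.rate-limit.<provider>.<priority>.per-minute
final class RateLimitConfig {
    private static final String PREFIX = "notification.rate-limit.";
    private static final String SUFFIX = ".per-minute";

    static Map<String, Map<NotificationPriority, Long>> parse(Properties props) {
        Map<String, Map<NotificationPriority, Long>> limits = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            boolean matches = key.length() > PREFIX.length() + SUFFIX.length()
                    && key.startsWith(PREFIX) && key.endsWith(SUFFIX);
            if (!matches) {
                continue; // not a rate limit key
            }
            String middle = key.substring(PREFIX.length(), key.length() - SUFFIX.length());
            String[] parts = middle.split("\\.");
            if (parts.length != 2) {
                continue; // expected exactly <provider>.<priority>
            }
            // valueOf throws IllegalArgumentException on an unknown priority name (fail fast).
            NotificationPriority priority =
                    NotificationPriority.valueOf(parts[1].toUpperCase(Locale.ROOT));
            long perMinute = Long.parseLong(props.getProperty(key).trim());
            limits.computeIfAbsent(parts[0], p -> new EnumMap<>(NotificationPriority.class))
                  .put(priority, perMinute);
        }
        return limits;
    }
}
```

The result maps a provider key such as `ses` to its per-priority limits, so adding a provider only requires new configuration keys.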
Integration:
- Rate limiter called before provider.send()
- If rate limit exceeded, notification queued for retry
- Retry respects priority (CRITICAL retried immediately, LOW delayed)
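The integration steps above can be sketched as a small dispatcher. Everything here is a hypothetical illustration: the `Dispatcher`, `Notification`, `ScheduledRetry`, and `Provider` types, the in-memory retry list, and the concrete delays for HIGH, NORMAL, and LOW are assumptions. The ADR only states that CRITICAL is retried immediately and LOW is delayed.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

enum NotificationPriority { CRITICAL, HIGH, NORMAL, LOW }

final class Notification {
    final String id;
    final String provider;
    final NotificationPriority priority;

    Notification(String id, String provider, NotificationPriority priority) {
        this.id = id;
        this.provider = provider;
        this.priority = priority;
    }
}

final class ScheduledRetry {
    final Notification notification;
    final Duration delay;

    ScheduledRetry(Notification notification, Duration delay) {
        this.notification = notification;
        this.delay = delay;
    }
}

interface Provider {
    void send(Notification notification);
}

final class Dispatcher {
    private final BiPredicate<String, NotificationPriority> rateLimiter; // e.g. ProviderRateLimiter::tryConsume
    private final Provider provider;
    private final List<ScheduledRetry> retryQueue = new ArrayList<>();

    Dispatcher(BiPredicate<String, NotificationPriority> rateLimiter, Provider provider) {
        this.rateLimiter = rateLimiter;
        this.provider = provider;
    }

    // Returns true if sent now, false if queued for retry.
    boolean dispatch(Notification n) {
        if (rateLimiter.test(n.provider, n.priority)) {
            provider.send(n); // rate limiter is consulted before provider.send()
            return true;
        }
        retryQueue.add(new ScheduledRetry(n, retryDelay(n.priority)));
        return false;
    }

    // Assumed delays: only "CRITICAL immediately, LOW delayed" comes from the ADR.
    static Duration retryDelay(NotificationPriority priority) {
        switch (priority) {
            case CRITICAL: return Duration.ZERO;
            case HIGH:     return Duration.ofSeconds(1);
            case NORMAL:   return Duration.ofSeconds(5);
            default:       return Duration.ofSeconds(30); // LOW
        }
    }

    List<ScheduledRetry> pendingRetries() {
        return new ArrayList<>(retryQueue);
    }
}
```

A real implementation would hand the retry to the service's queueing mechanism instead of an in-memory list.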
Related ADRs
- ADR-0015: Fire-and-Forget / Asynchronous Messaging Pattern (priority-based processing)
- ADR-0011: Stateless JWT Authentication (stateless service design)
Future Considerations
Distributed Rate Limiting:
- With multiple service instances, a distributed rate limiter (e.g., Redis-backed) may be needed so that the combined send rate stays within provider limits
- Current: Per-instance rate limiting (acceptable for stateless design)
Dynamic Rate Limit Detection:
- Future: Detect provider rate limits dynamically (from error responses)
- Current: Static configuration
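One way dynamic detection could work, shown purely as a sketch, is to adjust the configured limit from observed responses using additive increase and multiplicative decrease. This technique is not part of the current decision. The class name `AdaptiveLimit`, the halving factor, and the increment of one are assumptions.

```java
// Hypothetical adaptive limit: halve on a provider throttling error, creep back up on success.
final class AdaptiveLimit {
    private final long floor;
    private final long ceiling;
    private long current;

    AdaptiveLimit(long initial, long floor, long ceiling) {
        if (floor < 1 || floor > initial || initial > ceiling) {
            throw new IllegalArgumentException("require 1 <= floor <= initial <= ceiling");
        }
        this.floor = floor;
        this.ceiling = ceiling;
        this.current = initial;
    }

    // Call when the provider reports a rate limit error.
    synchronized void onThrottled() {
        current = Math.max(floor, current / 2);
    }

    // Call after a successful send.
    synchronized void onSuccess() {
        current = Math.min(ceiling, current + 1);
    }

    synchronized long current() {
        return current;
    }
}
```

The ceiling would typically be the statically configured limit, so the adaptive value never exceeds what was configured for the provider.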