
0016. Notification Service Rate Limiting Strategy

Status: Accepted
Date: 2026-01-23
Context: Outbound rate limiting for SES, Twilio, and future providers so limits are respected and critical messages are not starved.

Context

The Notification Service sends notifications to external providers (AWS SES, Twilio, etc.), each with different API rate limits and constraints:

AWS SES: 1 email/sec (sandbox), 14 emails/sec (production), burst limits
Twilio SMS: Account-specific limits (varies by account tier)
Future Providers: Each will have their own constraints

The existing libs/security rate limiting framework (RateLimitingFilter) limits incoming HTTP requests (per user, per service, per IP). The notification service needs something fundamentally different: limiting outgoing calls to external providers, each with its own constraints.

Requirements:

Respect provider-specific API limits (SES, Twilio, etc.)
Ensure critical notifications (password resets) are not blocked
Support priority-based throttling
Handle provider rate limit errors gracefully

Decision

The Notification Service will implement per-provider rate limiting with priority-aware throttling.

Rate Limiting Strategy:

Per-Provider Rate Limiting
- Each provider (AWS SES, Twilio, etc.) has its own rate limiter
- Provider-specific limits configured per provider
- Limits respect external service constraints
Priority-Aware Throttling
- CRITICAL notifications: Bypass or have very high limits (e.g., 1000/min)
- HIGH notifications: High limits (e.g., 500/min)
- NORMAL notifications: Standard limits (e.g., 100/min)
- LOW notifications: Lower limits (e.g., 50/min), may be throttled further under load
Capacity Management
- When system load > 80%: Throttle LOW priority notifications
- When system load > 95%: Pause LOW priority processing
- CRITICAL and HIGH are always processed, regardless of load
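
The load thresholds above can be read as a small decision function. The following is a minimal Java sketch, not the service's actual code: CapacityPolicy, LoadAction, and the systemLoad fraction are hypothetical names, and how load is measured is left open. It also assumes NORMAL is unaffected by load, which the list above does not state.

```java
enum NotificationPriority { CRITICAL, HIGH, NORMAL, LOW }

/** Hypothetical outcome of the capacity check for one notification. */
enum LoadAction { PROCESS, THROTTLE, PAUSE }

final class CapacityPolicy {
    static final double THROTTLE_THRESHOLD = 0.80; // load > 80%: throttle LOW
    static final double PAUSE_THRESHOLD = 0.95;    // load > 95%: pause LOW

    private CapacityPolicy() {}

    /**
     * @param systemLoad fraction in [0.0, 1.0]; the measurement source is an assumption
     */
    static LoadAction decide(NotificationPriority priority, double systemLoad) {
        // Only LOW is load-sensitive in this sketch; CRITICAL and HIGH are
        // always processed per the decision, NORMAL by assumption.
        if (priority != NotificationPriority.LOW) {
            return LoadAction.PROCESS;
        }
        if (systemLoad > PAUSE_THRESHOLD) {
            return LoadAction.PAUSE;
        }
        if (systemLoad > THROTTLE_THRESHOLD) {
            return LoadAction.THROTTLE;
        }
        return LoadAction.PROCESS;
    }
}
```

Both comparisons are strict, so a load of exactly 80% still processes LOW notifications normally.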

Implementation:

Custom rate limiter in infrastructure/provider/ or infrastructure/retry/
Provider-specific limits configured per provider
Priority-aware: CRITICAL/HIGH may bypass or have higher limits, NORMAL/LOW respect provider limits
Rate limit errors trigger retry logic with exponential backoff
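
To make the exponential backoff in the last item concrete, here is a minimal Java sketch of the delay calculation. The Backoff class, its parameters, and the cap are assumptions for illustration; the ADR does not specify a base delay, a maximum, or whether jitter is applied.

```java
import java.time.Duration;

final class Backoff {
    private Backoff() {}

    /**
     * Delay before retry number {@code attempt} (1-based):
     * base * 2^(attempt - 1), capped at {@code max}.
     */
    static Duration delayFor(int attempt, Duration base, Duration max) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        if (base.isNegative() || max.isNegative()) {
            throw new IllegalArgumentException("durations must be non-negative");
        }
        long baseMs = base.toMillis();
        long maxMs = max.toMillis();
        // Bound the exponent so the shift itself cannot overflow.
        long factor = 1L << Math.min(attempt - 1, 30);
        // Compare before multiplying so baseMs * factor cannot overflow.
        long delayMs = (baseMs > maxMs / factor) ? maxMs : baseMs * factor;
        return Duration.ofMillis(delayMs);
    }
}
```

With a 500 ms base and a 30 s cap, this gives 500 ms, 1 s, 2 s, and so on, until the cap is reached. Production retry code often adds random jitter on top so that retries from many notifications do not align.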

Rationale

Why Per-Provider (Not Per-Channel or Per-Priority Only):

Provider Constraints - Each provider has different limits (SES ≠ Twilio)
Granularity - More granular than per-channel (multiple providers per channel)
Accuracy - Respects actual external service constraints
Flexibility - Can add new providers without affecting existing ones

Why Not Use Existing libs/security Rate Limiting:

Different Purpose - Existing framework is for incoming HTTP requests, not outgoing provider calls
Provider-Specific - Need provider-specific limits, not user/service-based
Priority-Aware - Need priority-based throttling, not just flat limits
Custom Logic - Requires custom logic for provider failover and retry

Why Priority-Aware:

Critical Notifications - Password resets, account lockouts must not be blocked
Graceful Degradation - Low-priority notifications can be delayed during high load
Resource Optimization - Process important notifications first
User Experience - Critical notifications always get through

Consequences

Positive:

Provider Compliance - Respects external service rate limits
Critical Notifications Protected - CRITICAL/HIGH notifications always processed
Graceful Degradation - System degrades gracefully under load
Flexibility - Easy to add new providers with their own limits
Cost Optimization - Reduces rate limit errors and associated costs

Negative / Tradeoffs:

Custom Implementation - Requires custom rate limiter (not using existing framework)
Configuration Overhead - Must configure limits per provider
Complexity - Priority-aware logic adds complexity
Monitoring - Need metrics for rate limiting per provider/priority

Mitigations:

Reuse Patterns - Can reuse rate limiting patterns from libs/security (Bucket4j, etc.)
Configuration - Provider limits in configuration files
Metrics - Rate limiting metrics integrated with existing metrics infrastructure
Documentation - Clear documentation of provider limits and configuration

Implementation

Rate Limiter Structure:

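import java.util.Map;
import jakarta.enterprise.context.ApplicationScoped; // javax.* on older CDI versions

@ApplicationScoped
public class ProviderRateLimiter {
    private final Map<String, RateLimiter> providerLimiters;

    public ProviderRateLimiter(Map<String, RateLimiter> providerLimiters) {
        this.providerLimiters = Map.copyOf(providerLimiters);
    }

    public boolean tryConsume(String provider, NotificationPriority priority) {
        RateLimiter limiter = providerLimiters.get(provider);
        if (limiter == null) { // unknown provider: fail fast instead of an NPE
            throw new IllegalArgumentException("No rate limiter for provider: " + provider);
        }
        // Priority-aware: CRITICAL bypasses, LOW respects strict limits
        return limiter.tryConsume(priority);
    }
}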
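
The RateLimiter type referenced above is not shown in this ADR. As an illustration only, a per-provider limiter with one budget per priority could look like the following Java sketch. PriorityWindowLimiter is a hypothetical name, and it uses a simple fixed one-minute window rather than a token bucket such as Bucket4j provides. It also treats CRITICAL as a high limit rather than a bypass, matching the per-minute values in the configuration.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.LongSupplier;

enum NotificationPriority { CRITICAL, HIGH, NORMAL, LOW }

/** Hypothetical limiter for a single provider: one fixed one-minute window per priority. */
final class PriorityWindowLimiter {
    private static final long WINDOW_MILLIS = 60_000L;

    private final Map<NotificationPriority, Integer> perMinuteLimits =
            new EnumMap<>(NotificationPriority.class);
    private final Map<NotificationPriority, Integer> used =
            new EnumMap<>(NotificationPriority.class);
    private final LongSupplier clockMillis; // injectable so window rollover is testable
    private long windowStart;

    PriorityWindowLimiter(Map<NotificationPriority, Integer> limits, LongSupplier clockMillis) {
        this.perMinuteLimits.putAll(limits);
        this.clockMillis = clockMillis;
        this.windowStart = clockMillis.getAsLong();
    }

    /** Returns true and records one send if the priority still has budget in this window. */
    synchronized boolean tryConsume(NotificationPriority priority) {
        long now = clockMillis.getAsLong();
        if (now - windowStart >= WINDOW_MILLIS) {
            used.clear();       // start a fresh window
            windowStart = now;
        }
        int limit = perMinuteLimits.getOrDefault(priority, 0); // unconfigured priority: deny
        int count = used.getOrDefault(priority, 0);
        if (count >= limit) {
            return false;
        }
        used.put(priority, count + 1);
        return true;
    }
}
```

A fixed window allows short bursts around window boundaries, so a token bucket is usually a better fit where the provider enforces per-second limits. The sketch also does not cap the combined rate across priorities, which a real implementation would need in order to stay under the provider's own limit.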

Configuration:

# AWS SES Rate Limits
notification.rate-limit.ses.critical.per-minute=1000
notification.rate-limit.ses.high.per-minute=500
notification.rate-limit.ses.normal.per-minute=100
notification.rate-limit.ses.low.per-minute=50

# Twilio Rate Limits (when implemented)
notification.rate-limit.twilio.critical.per-minute=1000
notification.rate-limit.twilio.high.per-minute=500
notification.rate-limit.twilio.normal.per-minute=100
notification.rate-limit.twilio.low.per-minute=50
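
The keys above follow the pattern notification.rate-limit.<provider>.<priority>.per-minute. As a sketch of how they could be read into per-priority limits, the Java example below uses plain java.util.Properties as a stand-in. The service's actual configuration mechanism is not stated in this ADR, and RateLimitConfig is a hypothetical name.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

enum NotificationPriority { CRITICAL, HIGH, NORMAL, LOW }

final class RateLimitConfig {
    private static final String PREFIX = "notification.rate-limit.";
    private static final String SUFFIX = ".per-minute";

    private RateLimitConfig() {}

    /** Reads the four per-minute limits for one provider, e.g. "ses" or "twilio". */
    static Map<NotificationPriority, Integer> limitsFor(Properties props, String provider) {
        Map<NotificationPriority, Integer> limits = new EnumMap<>(NotificationPriority.class);
        for (NotificationPriority priority : NotificationPriority.values()) {
            String key = PREFIX + provider + "."
                    + priority.name().toLowerCase(Locale.ROOT) + SUFFIX;
            String value = props.getProperty(key);
            if (value == null) {
                // Fail at startup rather than silently running without a limit.
                throw new IllegalStateException("Missing rate limit property: " + key);
            }
            limits.put(priority, Integer.parseInt(value.trim()));
        }
        return limits;
    }
}
```

Failing on a missing key is a design choice in this sketch; falling back to a conservative default would be an equally reasonable alternative.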

Integration:

Rate limiter called before provider.send()
If rate limit exceeded, notification queued for retry
Retry respects priority (CRITICAL retried immediately, LOW delayed)
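
The three integration steps can be sketched as a single dispatch method. This Java example is illustrative only: Notification, Limiter, RetryQueue, and NotificationDispatcher are hypothetical stand-ins for the service's real types, and the retry delays for HIGH, NORMAL, and LOW are assumed values. The text only fixes that CRITICAL is retried immediately and LOW is delayed.

```java
import java.time.Duration;
import java.util.function.Consumer;

enum NotificationPriority { CRITICAL, HIGH, NORMAL, LOW }

/** Minimal stand-in for the service's notification type (hypothetical). */
final class Notification {
    final String id;
    final NotificationPriority priority;

    Notification(String id, NotificationPriority priority) {
        this.id = id;
        this.priority = priority;
    }
}

/** Hypothetical seams for the rate limiter and the retry queue. */
interface Limiter {
    boolean tryConsume(String provider, NotificationPriority priority);
}

interface RetryQueue {
    void enqueue(Notification notification, Duration delay);
}

final class NotificationDispatcher {
    private final String providerName;
    private final Limiter limiter;
    private final Consumer<Notification> sender; // stands in for provider.send()
    private final RetryQueue retryQueue;

    NotificationDispatcher(String providerName, Limiter limiter,
                           Consumer<Notification> sender, RetryQueue retryQueue) {
        this.providerName = providerName;
        this.limiter = limiter;
        this.sender = sender;
        this.retryQueue = retryQueue;
    }

    /** Sends if the limiter grants a permit; otherwise queues for retry. Returns true if sent. */
    boolean dispatch(Notification n) {
        if (limiter.tryConsume(providerName, n.priority)) {
            sender.accept(n);
            return true;
        }
        retryQueue.enqueue(n, retryDelay(n.priority));
        return false;
    }

    /** CRITICAL immediate and LOW delayed come from the text; the concrete values are assumed. */
    static Duration retryDelay(NotificationPriority priority) {
        switch (priority) {
            case CRITICAL: return Duration.ZERO;
            case HIGH:     return Duration.ofSeconds(1);
            case NORMAL:   return Duration.ofSeconds(5);
            default:       return Duration.ofSeconds(30); // LOW
        }
    }
}
```

An immediate retry for CRITICAL only helps if the limiter is likely to have budget again; against an exhausted provider limit, a small non-zero delay may be needed to avoid a tight retry loop.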

Related ADRs

ADR-0015: Fire-and-Forget / Asynchronous Messaging Pattern (priority-based processing)
ADR-0011: Stateless JWT Authentication (stateless service design)

Future Considerations

Distributed Rate Limiting:

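With multiple service instances, a distributed rate limiter (e.g., Redis-backed) may be needed, because each instance enforces its limits independently and the aggregate send rate can reach the configured limit multiplied by the instance count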
Current: Per-instance rate limiting (acceptable for stateless design)

Dynamic Rate Limit Detection:

Future: Detect provider rate limits dynamically (from error responses)
Current: Static configuration