Auth Load Capacity & Cognito Hardening
Executive Summary
Cognito has a default rate limit of 120 RPS shared across ALL user pools in the AWS account. For UPW March 12 (virtual, ~20,000 participants, ~1,500 buyers spread across 4 sales moments over 4 days), peak auth load during the biggest pitch window is likely 1-3 RPS at checkout and 10-50 RPS including page loads. Even the upper estimate is less than half of the 120 RPS default limit.
For March 12, we deploy three layers: token caching, guest checkout fallback, and CloudWatch monitoring. No Cognito quota increase is needed for a virtual event at this load.
Key insight: Stripe natively supports guest customers. rri-order-ingestion already handles anonymous charges by email. The guest checkout fallback (Layer 2) requires zero downstream pipeline changes. For a virtual UPW where buyers click a link from Zoom, guest checkout is the simplest, most reliable path.
What Needs to Happen
Three-Layer Defense for UPW
| Layer | What | Timeline | Cost |
|---|---|---|---|
| 1 | Token caching via ElastiCache Redis | 3-4 days | $50-80/month |
| 2 | Guest checkout fallback (3-sec timeout → Stripe guest) | 2-3 days | $0 |
| 3 | CloudWatch monitoring + Chatot alerts | 1-2 days | $0 |
User pool isolation (checkout vs portal) is a Q2 follow-up at minimal cost.
- Deploy ElastiCache Redis token cache — Cache Cognito tokens to reduce direct API calls. 3-4 days.
- Build guest checkout fallback — 3-second timeout on Cognito auth → automatic fallback to Stripe guest mode. 2-3 days.
- Configure CloudWatch monitoring + Chatot alerts — Real-time auth failure rate monitoring with automated alerting. 1-2 days.
- Configure Chatot pre-warm cron — 100-200 bot sessions 45 minutes before each sales pitch window. Replaces Johnny’s manual warm-up process.
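The token caching idea in the first action item can be sketched like this. It is an assumption-laden illustration: an in-memory dictionary stands in for ElastiCache Redis, and `TokenCache`, `get_token`, and `fetch_from_cognito` are hypothetical names. In Redis the same pattern would be a GET followed by a SET with an expiry on a miss.

```python
import time
from typing import Callable, Dict, Tuple


class TokenCache:
    """TTL cache keyed by user id; a dict stands in for ElastiCache Redis."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}  # user_id -> (token, expiry)
        self.cognito_calls = 0  # how many lookups actually reached Cognito

    def get_token(self, user_id: str,
                  fetch_from_cognito: Callable[[str], str]) -> str:
        """Return a cached token, calling the fetcher only on a miss or expiry."""
        now = self._clock()
        entry = self._store.get(user_id)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: no Cognito call
        token = fetch_from_cognito(user_id)
        self.cognito_calls += 1
        self._store[user_id] = (token, now + self._ttl)
        return token
```

The TTL should stay shorter than the token lifetime configured on the user pool, so the cache never hands out an expired token. The `cognito_calls` counter is only there to make the reduction in direct API calls easy to observe.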
Revenue at risk per 30-minute auth failure window: about $371K (750 failed logins × $495 average). Layer 2 (guest checkout fallback) is the safety net: it bypasses Cognito entirely if auth is slow.
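The revenue-at-risk figure is a single multiplication, reproduced here so the inputs can be adjusted. Both inputs are the plan's own numbers; the function name is illustrative.

```python
FAILED_LOGINS = 750        # failed logins per 30-minute window, from the plan
AVERAGE_ORDER_USD = 495    # average order value, from the plan


def revenue_at_risk(failed_logins: int, average_order_usd: int) -> int:
    """Revenue lost if every failed login is a lost sale."""
    return failed_logins * average_order_usd


print(f"${revenue_at_risk(FAILED_LOGINS, AVERAGE_ORDER_USD):,.0f}")  # prints $371,250
```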
Claude Code acceleration: Redis cache configuration, fallback code patterns, and CloudWatch setup are all highly automatable. Estimated savings: 2-3 days from the original 7-day timeline.
Completion Criteria
- ElastiCache Redis token cache deployed and reducing Cognito direct API calls
- Guest checkout fallback tested: 3-second Cognito timeout → Stripe guest mode
- CloudWatch monitoring active with Chatot alerting on auth failure rate spikes
- Chatot pre-warm cron configured: 100-200 bot sessions 45min before each pitch window
- March 11 Go/No-Go check passed: token caching active, guest checkout fallback tested, monitoring confirmed
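The "auth failure rate spike" criterion above needs a concrete definition before it can be alerted on. One possible definition is a sliding-window failure rate with a minimum sample count, sketched below. The 60-second window, 5% threshold, and 20-sample minimum are assumptions for illustration, not values from the plan, and `AuthFailureMonitor` is a hypothetical name. A real deployment would express the same rule as a CloudWatch alarm.

```python
from collections import deque
from typing import Deque, Tuple


class AuthFailureMonitor:
    """Sliding-window auth failure rate (all thresholds are assumed values)."""

    def __init__(self, window_seconds: float = 60.0,
                 threshold: float = 0.05, min_samples: int = 20):
        self._window = window_seconds
        self._threshold = threshold
        self._min_samples = min_samples
        self._events: Deque[Tuple[float, bool]] = deque()  # (timestamp, success)

    def record(self, timestamp: float, success: bool) -> None:
        """Record one auth attempt; timestamps must be non-decreasing."""
        self._events.append((timestamp, success))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop attempts that have aged out of the window.
        while self._events and self._events[0][0] <= now - self._window:
            self._events.popleft()

    def failure_rate(self, now: float) -> float:
        self._evict(now)
        if not self._events:
            return 0.0
        failures = sum(1 for _, ok in self._events if not ok)
        return failures / len(self._events)

    def should_alert(self, now: float) -> bool:
        """Alert only with enough samples, to avoid paging on a single failure."""
        rate = self.failure_rate(now)
        return len(self._events) >= self._min_samples and rate > self._threshold
```

The minimum sample count matters at this event's traffic level: at 1-3 RPS, a single failed login in a quiet minute would otherwise read as a large percentage spike.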
Initiative Attributes
Tools Required
| Tool | Purpose | Cost |
|---|---|---|
| ElastiCache Redis | Token caching — reduces direct Cognito API calls during peak auth | $50-80/month |
| CloudWatch | Auth failure rate monitoring + automated alerting | Included in AWS |
| Chatot | Pre-warm cron — 100-200 bot sessions before pitch windows | Existing infrastructure |
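The pre-warm schedule follows directly from the pitch window start times. The sketch below derives it, assuming the pitch times are available as a list. The 45-minute lead is from the plan; `prewarm_times` is a hypothetical helper, and how Chatot consumes the resulting schedule is not specified here.

```python
from datetime import datetime, timedelta
from typing import List

PREWARM_LEAD = timedelta(minutes=45)  # from the plan: 45 minutes before each pitch


def prewarm_times(pitch_windows: List[datetime],
                  lead: timedelta = PREWARM_LEAD) -> List[datetime]:
    """Return the pre-warm start time for each pitch window, in chronological order."""
    return [start - lead for start in sorted(pitch_windows)]
```

For example, a pitch window starting at 14:00 yields a pre-warm start of 13:15 on the same day.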
Related Risks
| ID | Risk | Severity | Probability | Mitigation |
|---|---|---|---|---|
| RF7 | Spork overload in Wave 0 (4 initiatives in 9 days) | MEDIUM | HIGH | Kill 6+ daily meetings before March 12. Route status through Kingler. Erik must cancel cross-department meetings. |