D4. Auth Load Capacity & Cognito Hardening

Executive Summary

Cognito has a default rate limit of 120 RPS shared across ALL user pools in the AWS account. For UPW March 12 (virtual, ~20,000 participants, ~1,500 buyers spread across 4 sales moments over 4 days), peak auth load during the biggest pitch window is likely 1-3 RPS at checkout and 10-50 RPS including page loads. Both estimates are well within the 120 RPS default limit.

For March 12, we deploy three layers: token caching, guest checkout fallback, and CloudWatch monitoring. The default 120 RPS limit is more than sufficient for a virtual event.
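As a quick check on the headroom claim, the estimated peaks can be compared against the default limit. This is a back-of-envelope sketch using only the figures quoted above; the function name is illustrative.

```python
COGNITO_DEFAULT_LIMIT_RPS = 120  # default account-wide limit quoted in this document


def utilization(peak_rps, limit_rps=COGNITO_DEFAULT_LIMIT_RPS):
    """Fraction of the rate limit consumed at a given peak load."""
    return peak_rps / limit_rps


# Upper ends of the two estimates from the summary above.
for label, peak in [("checkout only", 3), ("including page loads", 50)]:
    print(f"{label}: {utilization(peak):.1%} of limit")
# checkout only: 2.5% of limit
# including page loads: 41.7% of limit
```

Even the upper estimate leaves more than half of the limit unused, assuming no other user pool in the account is under load at the same time.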

Key insight: Stripe natively supports guest customers. rri-order-ingestion already handles anonymous charges by email. The guest checkout fallback (Layer 2) requires zero downstream pipeline changes. For a virtual UPW where buyers click a link from Zoom, guest checkout is the simplest, most reliable path.
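The fallback described here can be sketched as a timeout wrapper around the auth call. This is a minimal illustration, not the production implementation: `authenticate` is a hypothetical coroutine standing in for whatever wraps the Cognito call, and the returned dictionary shape is an assumption made for the example.

```python
import asyncio

COGNITO_TIMEOUT_SECONDS = 3.0  # the 3-second timeout from this plan


async def checkout_identity(email, authenticate, timeout=COGNITO_TIMEOUT_SECONDS):
    """Return an identity for checkout, falling back to guest mode.

    `authenticate` is a hypothetical coroutine function that wraps the
    Cognito call and returns a user id. It is not a real SDK function.
    """
    try:
        user_id = await asyncio.wait_for(authenticate(email), timeout=timeout)
        return {"mode": "authenticated", "user_id": user_id, "email": email}
    except (asyncio.TimeoutError, ConnectionError):
        # Auth was slow or unreachable: continue as a guest keyed by email.
        return {"mode": "guest", "user_id": None, "email": email}
```

Because the guest path carries only the buyer's email, it matches the anonymous-charge-by-email handling that the downstream pipeline is said to support already.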

What Needs to Happen

Three-Layer Defense for UPW

Layer	What	Timeline	Cost
1	Token caching via ElastiCache Redis	3-4 days	$50-80/month
2	Guest checkout fallback (3-second timeout → Stripe guest)	2-3 days	$0
3	CloudWatch monitoring + Chatot alerts	1-2 days	$0

User pool isolation (checkout vs portal) is a Q2 follow-up at minimal cost.

Deploy ElastiCache Redis token cache — Cache Cognito tokens to reduce direct API calls. 3-4 days.
Build guest checkout fallback — 3-second timeout on Cognito auth → automatic fallback to Stripe guest mode. 2-3 days.
Configure CloudWatch monitoring + Chatot alerts — Real-time auth failure rate monitoring with automated alerting. 1-2 days.
Configure Chatot pre-warm cron — 100-200 bot sessions 45 minutes before each sales pitch window. Replaces Johnny’s manual warm-up process.
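The token caching step follows a cache-aside pattern: look in the cache first and call the identity provider only on a miss. The sketch below uses an in-memory stand-in so it runs without infrastructure; it exposes only `get` and `setex`, the same method names a Redis client offers. The key format, the 300-second TTL, and `fetch_token` are assumptions for illustration, and any real TTL would need to stay below the token's own lifetime.

```python
import time


class InMemoryCache:
    """Stand-in for a Redis client, exposing only get() and setex()."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expiry time)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries behave like misses
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, self._clock() + ttl_seconds)


def get_token(cache, session_id, fetch_token, ttl_seconds=300):
    """Cache-aside lookup: call the identity provider only on a cache miss."""
    key = f"auth:token:{session_id}"  # hypothetical key format
    token = cache.get(key)
    if token is None:
        token = fetch_token(session_id)  # hypothetical wrapper around the Cognito call
        cache.setex(key, ttl_seconds, token)
    return token
```

Repeated lookups for the same session within the TTL never reach Cognito, which is how the cache reduces direct API calls during a pitch window.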

Revenue at risk per 30-minute auth failure window: about $371K (750 failed logins × $495 average). Layer 2 (guest checkout fallback) is the safety net: it bypasses Cognito entirely if auth is slow.
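The revenue-at-risk figure is a straight multiplication of the two inputs quoted above, reproduced here so the rounding is visible:

```python
FAILED_LOGINS_PER_WINDOW = 750  # failed logins per 30-minute window, from this plan
AVERAGE_ORDER_VALUE_USD = 495   # average purchase value, from this plan

revenue_at_risk = FAILED_LOGINS_PER_WINDOW * AVERAGE_ORDER_VALUE_USD
print(f"${revenue_at_risk:,}")  # prints $371,250
```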

Claude Code acceleration: Redis cache configuration, fallback code patterns, and CloudWatch setup are all highly automatable. Estimated savings: 2-3 days from the original 7-day timeline.

Completion Criteria

ElastiCache Redis token cache deployed and reducing Cognito direct API calls
Guest checkout fallback tested: 3-second Cognito timeout → Stripe guest mode
CloudWatch monitoring active with Chatot alerting on auth failure rate spikes
Chatot pre-warm cron configured: 100-200 bot sessions 45 minutes before each pitch window
March 11 Go/No-Go check passed: token caching active, guest checkout fallback tested, monitoring confirmed
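The pre-warm timing can be derived from the pitch schedule instead of being maintained by hand. The sketch below only computes trigger times; how Chatot's cron consumes them is not specified here, and the dates and times in the usage example are placeholders, not the real schedule.

```python
from datetime import datetime, timedelta

PREWARM_LEAD = timedelta(minutes=45)  # lead time from this plan


def prewarm_times(pitch_windows, lead=PREWARM_LEAD):
    """Return the pre-warm trigger time for each pitch window start, earliest first."""
    return [start - lead for start in sorted(pitch_windows)]


# Placeholder pitch starts, for illustration only.
example_pitches = [datetime(2000, 1, 2, 14, 0), datetime(2000, 1, 1, 16, 30)]
for trigger in prewarm_times(example_pitches):
    print(trigger.isoformat())
```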

Initiative Attributes

D4 — Auth Load Capacity & Cognito Hardening

Cost

$50-80/month ongoing (token caching) + ~$8-12K one-time labor

Timeline (Original)

7 working days — MUST complete before March 12

Timeline (With Claude Code)

4-5 days

⚡ Redis cache config + fallback code + CloudWatch setup

Owner

Johnny Yarlott + Zach Hardesty (CloudWatch/monitoring) + Spork (Chatot pre-warm)

Dependencies

None (starts immediately). Soft dependency: D3 (credential rotation closes one attack vector)

Unblocks

D6 (load testing validates D4 layers), U2 (checkout depends on Cognito being reliable), U3 (SSO needs Cognito as foundation)

Revenue at Risk

$371K per 30-min window — guest checkout fallback is the safety net

Success Metrics

March 11: token caching active, guest checkout fallback tested, CloudWatch alerting confirmed

Tools Required

Tool	Purpose	Cost
ElastiCache Redis	Token caching — reduces direct Cognito API calls during peak auth	$50-80/month
CloudWatch	Auth failure rate monitoring + automated alerting	Included in AWS
Chatot	Pre-warm cron — 100-200 bot sessions before pitch windows	Existing infrastructure
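The alerting rule behind the monitoring row can be sketched as a sliding-window failure rate. This illustrates the decision logic only, not CloudWatch configuration; the 60-second window, 5% threshold, and 20-sample minimum are assumed values chosen for the example.

```python
from collections import deque


class FailureRateMonitor:
    """Tracks auth outcomes over a sliding time window and flags rate spikes."""

    def __init__(self, window_seconds=60.0, threshold=0.05, min_samples=20):
        self.window_seconds = window_seconds
        self.threshold = threshold      # alert when the failure rate exceeds this
        self.min_samples = min_samples  # avoid alerting on a handful of requests
        self._events = deque()          # (timestamp, succeeded) pairs, oldest first

    def record(self, timestamp, succeeded):
        self._events.append((timestamp, succeeded))
        cutoff = timestamp - self.window_seconds
        # Drop events that have aged out of the window.
        while self._events and self._events[0][0] <= cutoff:
            self._events.popleft()

    def failure_rate(self):
        if not self._events:
            return 0.0
        failures = sum(1 for _, ok in self._events if not ok)
        return failures / len(self._events)

    def should_alert(self):
        enough_data = len(self._events) >= self.min_samples
        return enough_data and self.failure_rate() > self.threshold
```

The minimum sample count keeps quiet periods between pitch windows from triggering alerts on one or two failed logins.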

Related Risks

ID	Risk	Severity	Probability	Mitigation
RF7	Spork overload in Wave 0 (4 initiatives in 9 days)	MEDIUM	HIGH	Kill 6+ daily meetings before March 12. Route status through Kingler. Erik must cancel cross-department meetings.

Confidential Document
