Confidential Document

This document is restricted to RRI leadership.

UNCLOG — Remove What Slows You Down
U9

Obv.io Sync Acceleration

IN PROGRESS Wave 0-1 · 3 weeks

Executive Summary

The rri-event-api worker runs in a 30-second polling loop with a 1-second API rate limit between calls, using 1-2 workers. At peak event sales (hundreds of purchases per minute), this creates a growing queue where customers wait 45+ minutes for their magic link to access the virtual event.

Late buyers during Day 4’s final pitch don’t get magic links before the event ends. This directly causes refund requests and customer frustration at the highest-emotion moment of the UPW experience.

The fix is two-phased: Phase 1 (pre-UPW) reduces the polling interval and adds parallelism for immediate relief. Phase 2 (post-UPW) replaces polling entirely with event-driven Postgres LISTEN/NOTIFY for sub-second dispatch.

What Needs to Happen

Phase 1 — Pre-UPW (March 3-12)

  1. Verify SELECT FOR UPDATE SKIP LOCKED — Must confirm before scaling dynos. Without it, multiple workers create duplicate Obv.io attendees. Day 1 critical check.
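  2. Reduce polling interval from 30s to 5s (environment variable change). Up to 6x more polling cycles per minute, which raises throughput only while the polling interval, not the 1-second rate limit, is the bottleneck. Day 2.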
  3. Add Promise.all parallelism (10-20 concurrent) — Process multiple attendee syncs simultaneously within each polling cycle. Day 2-3.
  4. Scale to 3-4 Heroku dynos — Only after SKIP LOCKED is verified. Multiplies throughput by worker count. Day 3-4.
  5. Call Obv.io support — Ask about batch attendee endpoint and per-event rate limits. If batch endpoint exists, Phase 1 becomes trivial. Day 1.
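The parallelism in step 3 can be sketched as a bounded worker pool built on Promise.all. This is a minimal illustration, not the rri-event-api code: the `PendingAttendee` shape, the `syncBatch` name, and the `syncOne` callback (standing in for the Obv.io attendee call) are all hypothetical. It caps in-flight calls at `limit` and records failures per row so one bad attendee does not reject the whole cycle.

```typescript
// Hypothetical shape of a row waiting to be synced; field names are assumptions.
interface PendingAttendee {
  id: number;
  email: string;
}

type SyncResult =
  | { id: number; ok: true }
  | { id: number; ok: false; error: string };

// Run `syncOne` over `items` with at most `limit` calls in flight at once.
async function syncBatch(
  items: PendingAttendee[],
  limit: number,
  syncOne: (item: PendingAttendee) => Promise<void>,
): Promise<SyncResult[]> {
  const queue = items.map((item, index) => ({ item, index }));
  const results: SyncResult[] = [];

  // Each lane keeps pulling the next queued row until none are left.
  async function lane(): Promise<void> {
    for (;;) {
      const job = queue.shift();
      if (job === undefined) return;
      try {
        await syncOne(job.item);
        results[job.index] = { id: job.item.id, ok: true };
      } catch (err) {
        // Capture the failure instead of rejecting, so other rows still sync.
        const error = err instanceof Error ? err.message : String(err);
        results[job.index] = { id: job.item.id, ok: false, error };
      }
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

A pool is used here rather than a bare Promise.all over the whole batch so the 10-20 ceiling holds regardless of how many rows a polling cycle picks up. Whether that ceiling is safe still depends on the Obv.io rate limit answer from step 5.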

Phase 2 — Post-UPW

  1. Replace polling with Postgres LISTEN/NOTIFY — Using pg-listen library. Sub-second dispatch from Event Credit creation. Eliminates polling entirely.
  2. BullMQ queue refactor — Proper job queue with retry logic, dead letter queue, and monitoring.
  3. HireFire autoscaling — $9-19/month. Automatically scales dynos based on queue depth during events.
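The event-driven dispatch in Phase 2 step 1 might look roughly like the sketch below. It assumes a database trigger (not shown) that calls pg_notify when an Event Credit row is created, with a JSON payload carrying the new row's id. The channel name `event_credit_created`, the payload field `eventCreditId`, and the `startDispatcher` helper are hypothetical. The `NotificationSubscriber` interface is a structural subset of the object pg-listen's `createSubscriber()` returns, declared locally so the sketch carries no dependency.

```typescript
// Structural subset of a pg-listen subscriber (assumption: only these members are used).
interface NotificationSubscriber {
  notifications: { on(channel: string, listener: (payload: unknown) => void): unknown };
  events: { on(event: "error", listener: (error: Error) => void): unknown };
  connect(): Promise<void>;
  listenTo(channel: string): Promise<unknown>;
}

// Hypothetical channel name; it must match whatever the trigger passes to pg_notify.
const CHANNEL = "event_credit_created";

// Extract the row id from a payload shaped like { eventCreditId: 123 }, else null.
function parseEventCreditId(payload: unknown): number | null {
  if (typeof payload !== "object" || payload === null) return null;
  const id = (payload as { eventCreditId?: unknown }).eventCreditId;
  return typeof id === "number" && Number.isInteger(id) ? id : null;
}

// Register handlers first, then connect and subscribe, so no early notification is missed.
async function startDispatcher(
  subscriber: NotificationSubscriber,
  dispatch: (eventCreditId: number) => Promise<void>,
  onError: (error: Error) => void,
): Promise<void> {
  subscriber.notifications.on(CHANNEL, (payload) => {
    const id = parseEventCreditId(payload);
    if (id === null) {
      onError(new Error("Unexpected notification payload"));
      return;
    }
    // Dispatch failures are reported rather than thrown inside the listener.
    dispatch(id).catch((err) => onError(err instanceof Error ? err : new Error(String(err))));
  });
  subscriber.events.on("error", onError);
  await subscriber.connect();
  await subscriber.listenTo(CHANNEL);
}
```

Postgres does not queue notifications for a listener that is disconnected, so a design along these lines would likely still want a low-frequency sweep of pending rows as a safety net after reconnects.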

Critical check: Without SELECT FOR UPDATE SKIP LOCKED, two workers can pick up the same pending row and each create an Obv.io attendee for it. That duplicate attendee bug is worse than the delay it is meant to fix, so the locking must be verified (and implemented if missing) before any dyno scaling.
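For reference during the Day 1 check, a claim query that is safe under multiple workers typically has the shape below. This is a sketch under assumptions, not the current worker code: the table `event_credits`, the columns `sync_status` and `created_at`, and the `claimPendingSyncs` helper are hypothetical, and `Queryable` is a structural stand-in for a node-postgres pool or client.

```typescript
// Minimal stand-in for a Postgres client exposing query(text, values).
interface Queryable {
  query(text: string, values: unknown[]): Promise<{ rows: Array<{ id: number }> }>;
}

// Claim up to $1 pending rows in one statement. Rows already locked by another
// worker are skipped instead of waited on, so two workers never claim the same row.
// Table and column names are hypothetical.
const CLAIM_SQL = `
  UPDATE event_credits
     SET sync_status = 'processing'
   WHERE id IN (
         SELECT id
           FROM event_credits
          WHERE sync_status = 'pending'
          ORDER BY created_at
          LIMIT $1
            FOR UPDATE SKIP LOCKED
         )
  RETURNING id`;

// Returns the ids this worker now owns; an empty array means nothing was claimable.
async function claimPendingSyncs(db: Queryable, batchSize: number): Promise<number[]> {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const result = await db.query(CLAIM_SQL, [batchSize]);
  return result.rows.map((row) => row.id);
}
```

If the existing worker selects pending rows with a plain SELECT and marks them afterwards, that gap between read and update is where duplicates would come from once a second dyno is added.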

Claude Code acceleration: Parallelism implementation (Promise.all batching), LISTEN/NOTIFY integration with pg-listen, and the BullMQ queue refactor are all highly automatable. Claude Code is estimated to save 1-1.5 weeks, bringing 3 weeks down to 1.5-2 weeks.

Throughput Model

Scenario | Throughput | Time for 1,500 Buyers
Current (baseline) | 1/sec serial | ~25 minutes
Phase 1 | 60 parallel | Under 2 minutes
Phase 2 | 120 parallel, event-driven | Under 60 seconds
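The arithmetic behind these rows can be reproduced with a small model. It assumes each call occupies a slot for about 1 second and that the Obv.io rate limit applies per slot rather than across the whole account. That second assumption is unconfirmed and is exactly what the support call in Phase 1 should settle. The `drainSeconds` name is hypothetical, and the model ignores polling latency and retries, which is why the table quotes looser upper bounds.

```typescript
// Seconds needed to work through `buyers` syncs when `parallelCalls` run at once
// and each call holds its slot for `secondsPerCall` (assumed 1s from the rate limit).
function drainSeconds(buyers: number, parallelCalls: number, secondsPerCall = 1): number {
  if (parallelCalls < 1) {
    throw new Error("parallelCalls must be at least 1");
  }
  // Calls run in waves of `parallelCalls`; a partial final wave still costs a full call.
  return Math.ceil(buyers / parallelCalls) * secondsPerCall;
}

const baseline = drainSeconds(1500, 1); // 1500 s, i.e. 25 minutes
const phase1 = drainSeconds(1500, 60); // 25 s
const phase2 = drainSeconds(1500, 120); // 13 s
```

If the rate limit turns out to be global (one call per second for the whole account), added parallelism would not change the baseline figure, and a batch attendee endpoint would become the main lever.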

Completion Criteria

  • SELECT FOR UPDATE SKIP LOCKED verified and implemented
  • Polling interval reduced from 30s to 5s
  • Promise.all parallelism (10-20 concurrent) deployed
  • Event-api scaled to 3-4 dynos for UPW
  • Magic link generation within 5 seconds of purchase at peak event volumes (Phase 2)
  • Postgres LISTEN/NOTIFY replacing polling (Phase 2)
  • HireFire autoscaling configured (Phase 2)

Initiative Attributes

U9 — Obv.io Sync Acceleration
Cost
$10K-$18K one-time + $85-120/month ongoing infrastructure
Timeline (Original)
3 weeks — Phase 1: 1 week pre-UPW. Phase 2: 2-3 weeks post-UPW. (Wave 0-1)
Timeline (With Claude Code)
1.5-2 weeks
Parallelism + LISTEN/NOTIFY
Owner
Spork + Zach Hardesty (HireFire autoscaling)
Dependencies
Soft: D6 (load testing validates throughput improvements), U6 (Blackthorn fix removes transaction linking delay — making Obv.io sync the last major bottleneck)
Unblocks
S5 (Portal Unification replaces Obv.io magic link model — U9 optimizes the interim state)
Revenue at Risk
Refund requests from customers who couldn’t access event after purchase
Success Metrics
Magic link generation within 5 seconds of purchase at peak event volumes

Related Risks

ID | Risk | Severity | Probability | Mitigation
RF8 | Obv.io SKIP LOCKED not implemented: duplicate attendees under parallelism | HIGH | MEDIUM | Verify SELECT FOR UPDATE SKIP LOCKED before scaling dynos in U9 Phase 1. If not implemented, dyno scaling creates duplicate attendee bug. Must verify before March 12.
RF7 | Spork overload in Wave 0 (4 initiatives in 9 days) | MEDIUM | HIGH | Kill 6+ daily meetings before March 12. Route status through Kingler. Erik must cancel cross-department meetings.