Obv.io Sync Acceleration
Executive Summary
The rri-event-api worker runs in a 30-second polling loop with a 1-second API rate limit between calls, using 1-2 workers. At peak event sales (hundreds of purchases per minute), this creates a growing queue where customers wait 45+ minutes for their magic link to access the virtual event.
Late buyers during Day 4’s final pitch don’t get magic links before the event ends. This directly causes refund requests and customer frustration at the highest-emotion moment of the UPW experience.
The fix is two-phased: Phase 1 (pre-UPW) reduces the polling interval and adds parallelism for immediate relief. Phase 2 (post-UPW) replaces polling entirely with event-driven Postgres LISTEN/NOTIFY for sub-second dispatch.
What Needs to Happen
Phase 1 — Pre-UPW (March 3-12)
- Verify `SELECT FOR UPDATE SKIP LOCKED` — Must confirm before scaling dynos. Without it, multiple workers create duplicate Obv.io attendees. Day 1 critical check.
- Reduce polling interval from 30s to 5s — Environment variable change. Immediate 6x throughput improvement. Day 2.
- Add `Promise.all` parallelism (10-20 concurrent) — Process multiple attendee syncs simultaneously within each polling cycle. Day 2-3.
- Scale to 3-4 Heroku dynos — Only after SKIP LOCKED is verified. Multiplies throughput by worker count. Day 3-4.
- Call Obv.io support — Ask about batch attendee endpoint and per-event rate limits. If batch endpoint exists, Phase 1 becomes trivial. Day 1.
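The `Promise.all` step above can be sketched as a bounded-concurrency batch loop. This is an illustration only: the `SyncJob` shape and `syncAttendee` stub are assumptions, not the actual rri-event-api code, and real code would also need per-call error handling so one failed sync doesn't reject the whole batch.

```typescript
// Sketch: process pending sync jobs in bounded-concurrency batches.
interface SyncJob {
  id: number;
  email: string;
}

async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // Promise.all runs the whole batch concurrently, so each polling
    // cycle syncs up to batchSize attendees instead of one.
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}

// Usage with a stubbed sync call (the real worker would POST to Obv.io here):
async function syncAttendee(job: SyncJob): Promise<string> {
  return `synced:${job.email}`;
}

const jobs: SyncJob[] = [
  { id: 1, email: "a@example.com" },
  { id: 2, email: "b@example.com" },
  { id: 3, email: "c@example.com" },
];
processInBatches(jobs, 2, syncAttendee).then((r) => console.log(r.length));
```

A batch size of 10-20 keeps the worker inside whatever per-event rate limit Obv.io support confirms; results come back in input order because `Promise.all` preserves ordering.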
Phase 2 — Post-UPW
- Replace polling with Postgres `LISTEN/NOTIFY` — Using the `pg-listen` library. Sub-second dispatch from Event Credit creation. Eliminates polling entirely.
- BullMQ queue refactor — Proper job queue with retry logic, dead letter queue, and monitoring.
- HireFire autoscaling — $9-19/month. Automatically scales dynos based on queue depth during events.
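The `LISTEN/NOTIFY` replacement needs a trigger on the Postgres side for `pg-listen` to subscribe to. A minimal sketch, assuming a channel named `event_credit_created` and a hypothetical `event_credits` table; both names would need to match the real schema:

```sql
-- Assumed table and channel names; adjust to the real schema.
CREATE OR REPLACE FUNCTION notify_event_credit() RETURNS trigger AS $$
BEGIN
  -- NOTIFY payloads are capped at 8000 bytes; send only the row id
  -- and have the worker look up the rest.
  PERFORM pg_notify('event_credit_created', NEW.id::text);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER event_credit_notify
AFTER INSERT ON event_credits
FOR EACH ROW EXECUTE FUNCTION notify_event_credit();
```

On the Node side, a `pg-listen` subscriber would `listenTo('event_credit_created')` and hand each payload to the sync worker (or enqueue it in BullMQ), replacing the polling loop with sub-second dispatch.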
Critical check: `SELECT FOR UPDATE SKIP LOCKED` must be verified before scaling dynos. Without it, multiple workers claim the same rows and create duplicate Obv.io attendees, a bug worse than the sync delay itself.
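What the verification is checking for: each worker's claim query must lock the rows it picks up so concurrent workers skip them rather than double-claiming. A sketch of the intended pattern, with a hypothetical `sync_jobs` table standing in for the real one:

```sql
-- Hypothetical table and column names; the real query lives in rri-event-api.
BEGIN;
SELECT id, email
FROM sync_jobs
WHERE status = 'pending'
ORDER BY created_at
LIMIT 10
FOR UPDATE SKIP LOCKED;  -- other workers skip these locked rows instead of
                         -- blocking on them or claiming them twice
-- ...sync the claimed rows to Obv.io, mark them done...
COMMIT;
```

If the current query lacks the `FOR UPDATE SKIP LOCKED` clause, every added dyno multiplies duplicate-attendee risk instead of throughput.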
Claude Code acceleration: the parallelism implementation (`Promise.all` batching), the `LISTEN/NOTIFY` integration with `pg-listen`, and the BullMQ queue refactor are all highly automatable. Claude Code saves roughly 1-1.5 weeks, bringing an estimated 3 weeks down to 1.5-2 weeks.
Throughput Model
| Scenario | Throughput | Time for 1,500 Buyers |
|---|---|---|
| Current (baseline) | 1/sec serial | ~25 minutes |
| Phase 1 | 60 parallel | Under 2 minutes |
| Phase 2 | 120 parallel event-driven | Under 60 seconds |
Completion Criteria
- `SELECT FOR UPDATE SKIP LOCKED` verified and implemented
- Polling interval reduced from 30s to 5s
- `Promise.all` parallelism (10-20 concurrent) deployed
- Event-api scaled to 3-4 dynos for UPW
- Magic link generation within 5 seconds of purchase at peak event volumes (Phase 2)
- Postgres `LISTEN/NOTIFY` replacing polling (Phase 2)
- HireFire autoscaling configured (Phase 2)
Initiative Attributes
Related Risks
| ID | Risk | Severity | Probability | Mitigation |
|---|---|---|---|---|
| RF8 | Obv.io SKIP LOCKED not implemented — duplicate attendees under parallelism | HIGH | MEDIUM | Verify SELECT FOR UPDATE SKIP LOCKED before scaling dynos in U9 Phase 1. If not implemented, dyno scaling creates duplicate attendee bug. Must verify before March 12. |
| RF7 | Spork overload in Wave 0 (4 initiatives in 9 days) | MEDIUM | HIGH | Kill 6+ daily meetings before March 12. Route status through Kingler. Erik must cancel cross-department meetings. |