Confidential Document

This document is restricted to RRI leadership.

Organizational Restructuring

Team Structure & Process Design

Current state, proposed restructuring, new roles, sprint ceremonies, and product ownership — the organizational foundation for executing the technical roadmap.

Current Team Structure (March 2026)

RRI’s engineering organization is structurally broken: everyone does everything, which means nothing gets done reliably. No sprint has closed in 3 weeks. The team burned out after 3-4 weeks of 10-hour days. Spork attends 6+ standing meetings daily, functioning as a human router instead of an engineering director.

Current State: No Separation

BF1 marks a bus factor of 1: a single person holds the knowledge for that system.

Engineering Leadership
  • Spork (Michael Evans): Dir. of Engineering
  • Justin Kahn: Head of Digital Innovation

Run Team (Spork): Operations & Infrastructure
  • Johnny Yarlott: Core Backend / Auth / Payments (BF1)
  • Zach Hardesty: Infra / K8s / GitOps / Data Lake (BF1)
  • Sean: Network / Systems
  • Dean Schwartz: Help Desk
  • Josh Fuller: Backend Support
  • Tim Hooker: Salesforce / CRM (leads 4 JIRA projects)

Build Team (Justin): Products & Digital
  • Nick Jensen: Principal Architect, TonyRobbins.com (BF1)
  • Ken: UI/UX
  • Esmee: HubSpot / Front-end
  • Caitlin Noble: Data
  • Alex Hoisington: Product Owner, RPM Web
  • Pam Hendrickson: Content
  • Jay Lane: AI (half-time contractor)

Contractors: Critical Systems
  • Federico Del Rio (Nearsure): sole maintainer, Members Portal (BF1)
  • Jonathan Perez (DualBoot): sole maintainer, RPM Planner (BF1)
  • Freddy Garcia (Nearsure): AI Tools
  • awilmort (Nearsure): Salesforce
  • OnBuild / Nortal: TonyRobbins.com contributors

Structural problems:

• No separation between operations (reactive) and product development (planned)
• Spork is a human router — 6+ standing meetings/day, no time for engineering
• No sprint has closed in 3 weeks
• No PM layer — unplanned work hits engineers directly
• No QA function — engineers test their own code
• No product ownership model — 6+ people own fragments of the customer experience
• 5 engineers are bus factor 1 on revenue-critical systems
• Contractor notice periods unknown

Proposed Team Structure

The fix is a well-established pattern: separate Run (Kanban, reactive) from Build (Scrum, planned) with a hard organizational wall between them. Framework: Team Topologies (2025 update), with Run as the Platform Team and Build as the Stream-Aligned Team.

Before (Current)

  • Everyone does everything
  • No sprint velocity tracking
  • Spork routes all requests manually
  • Engineers handle ops + features
  • No PM or PO layer
  • No QA function
  • No on-call rotation
  • 10-hour days, burnout

After (Restructured)

  • Run Team (Kanban) + Build Team (Scrum)
  • Target: 70%+ of sprint commitments completed
  • Run Team Lead triages all operational requests
  • Build engineers protected from ops interrupts
  • Dedicated PM + PO layer
  • AI-powered QA agents in CI/CD
  • OpsGenie on-call rotation
  • 40-hour weeks, sustainable pace

Future State: Build vs. Run Separation

CTO / Technology Leadership
  • Lior Weinstein: Fractional CTO (Strategy, Architecture, Hiring)
  • Erik Logan: CEO (authority to enforce structure)

Run Team (Kanban): Spork, Director of Engineering
  • Run Team Lead (NEW, $140-160K): triages ALL ops requests
  • Johnny Yarlott: Core Backend / Auth / Payments
  • Zach Hardesty: Infrastructure / K8s / GitOps
  • DevOps Engineer (NEW, $130-150K): Zach’s backup, K8s migration
  • Integration Engineer (NEW, $120-140K): SF/Stripe/3rd-party integrations
  • Sean: Network / Systems / Security
  • Dean Schwartz: IT Service Desk Lead
  • Josh Fuller: Backend Support (Federico backup)
  • Tim Hooker: Salesforce / CRM
  • Event Ops Contractor #1 (NEW, $50-80K): event-gated operations
  • Event Ops Contractor #2 (NEW, $50-80K): event-gated operations

Build Team (Scrum): Justin Kahn, Unified Product Owner
  • Nick Jensen: Principal Architect (Experience API, Portal)
  • Senior Backend Developer (NEW, $140-160K): Johnny’s backup, payments/auth
  • Full-Stack Developer (NEW, $120-140K): Portal + TonyRobbins.com
  • Ken: UI/UX Design
  • Esmee: HubSpot / Front-end
  • Alex Hoisington: Product Owner, RPM + Digital Products
  • Data Engineer (NEW, $130-150K): analytics, data lake, reporting
  • Caitlin Noble: Data Analyst
  • Pam Hendrickson: Content

AI Team: reports to Justin (Build), with governance across both teams
  • Jay Lane (FT): Head of AI, full-time conversion from half-time
  • Freddy Garcia (Nearsure): AI Tools Development
  • Daniel: Agentic AI (Tony’s hire), under governance

AI Agent Layer: autonomous agents across both teams
  • Kingler (AGENT): codebase knowledge, all repos
  • Chatot (AGENT): IT triage, Zendesk/Slack routing
  • Inigo (AGENT): strategy partner, Justin’s AI advisor
  • Frontend QA Agent (AGENT): visual regression, E2E, a11y
  • Backend QA Agent (AGENT): API contracts, load testing, data integrity
  • BA Agent (AGENT): meeting transcripts → Jira tickets with acceptance criteria
  • TonyRobbins.com Agent (PLANNED): site knowledge, Nick’s backup
  • Portal Agent (PLANNED): Members Portal knowledge, Federico’s backup

Contractors: With Named Backups
  • Federico Del Rio: Members Portal (backup: Josh Fuller)
  • Jonathan Perez: RPM Planner (backup: Alex + Justin)
  • OnBuild / Nortal: TonyRobbins.com (backup: Nick)

IT Service Desk & MSP Layer

Internal support requests currently go directly to engineers via Slack DMs. This creates constant interrupts and makes it impossible to measure support volume. The fix: a formal three-tier support model with Dean Schwartz as IT Service Desk Lead and AI triage as the first line of defense.

Three-Tier Support Model

Tier | Team | Handles | SLA | Escalation
L0: AI Triage | Chatot Agent | Zendesk/Slack intake, auto-categorization, known-issue resolution, password resets, FAQ responses | < 2 min response | Auto-route to L1 if unresolved
L1: IT Service Desk | Dean Schwartz + MSP | Account provisioning, hardware/software requests, VPN/network issues, vendor coordination, basic troubleshooting | 4 hour response | Run Team Lead (L2)
L2: Engineering Ops | Run Team | Infrastructure issues, deployment failures, database problems, integration bugs, performance degradation | P2: 1 hour / P3: next day | Build Team PO (L3)
L3: Engineering Dev | Build Team | Code-level bugs requiring feature changes, architectural issues, new integration development | Next sprint planning | CTO / Product Council

Ticket Routing Flow

Support Request Flow

Intake (all requests enter here)
  • Chatot AI Triage (AGENT): Zendesk + Slack → auto-categorize & route
L1: IT Service Desk
  • Dean Schwartz: IT Service Desk Lead, handles or escalates
  • MSP Partner (TBD): after-hours + overflow coverage
L2: Engineering Ops
  • Run Team Lead (NEW): triages engineering-level issues
L3: Engineering Dev
  • Build Team PO (Justin/Alex): adds to sprint backlog if code change required

MSP Evaluation

Decision pending: Evaluate whether RRI needs a Managed Service Provider (MSP) for after-hours IT coverage, hardware lifecycle management, and L1 overflow. Dean currently handles this alone — single point of failure for IT support during events and after hours.

Evaluation criteria: After-hours coverage model, per-seat pricing vs. fixed fee, Zendesk integration capability, onsite support during events, SLA guarantees. Target decision: end of Phase 2 (Week 8).

What does NOT go through IT Service Desk:

Production incidents (P1/P2) — go directly to OpsGenie → Run Team on-call
Feature requests — go through Product Council → Build Team backlog
Infrastructure changes — go through Change Advisory Board (Run Team Lead + Zach)
Security incidents — go directly to Sean + CTO escalation
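
The routing rules above can be sketched in a few lines of Python. This is an illustrative sketch only: the category labels (`production_incident`, `vpn_issue`, and so on) and the `route` function are hypothetical names, not Chatot's actual taxonomy or API. The tier order and bypass destinations are taken from the support model and the exclusion list.

```python
# Hypothetical category labels; destinations come from the exclusion list above.
BYPASS_ROUTES = {
    "production_incident": "OpsGenie -> Run Team on-call",
    "feature_request": "Product Council -> Build Team backlog",
    "infrastructure_change": "Change Advisory Board (Run Team Lead + Zach)",
    "security_incident": "Sean + CTO escalation",
}

# Escalation chain from the three-tier support table, ending at the final escalation point.
TIERS = [
    "L0 AI Triage",
    "L1 IT Service Desk",
    "L2 Engineering Ops",
    "L3 Engineering Dev",
    "CTO / Product Council",
]


def route(category: str, unresolved_at: int = -1) -> str:
    """Return the destination for a request.

    unresolved_at is the index in TIERS of the highest tier that already
    failed to resolve the request (-1 means the request is new).
    """
    if category in BYPASS_ROUTES:
        # These never enter the IT Service Desk queue.
        return BYPASS_ROUTES[category]
    next_tier = min(unresolved_at + 1, len(TIERS) - 1)
    return TIERS[next_tier]
```

A new password-reset request would land at `L0 AI Triage`, while a security incident skips the tiers entirely.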

New Roles & Hiring Timeline

Role | Team | Salary | Priority | Post Date | Start Date | First Productive
Jay Lane (FT conversion) | AI | $175K ($87.5K incr.) | #0 | N/A | April 1 | Immediate
Run Team Lead | Run (Spork) | $140-160K | #1 BLOCKING | March 17 | May 15 | June 15
DevOps Engineer | Run (Spork) | $130-150K | #2 | March 17 | May 19 | June 19
Integration Engineer | Run (Spork) | $120-140K | #3 | March 17 | May 21 | June 21
Data Engineer | Build (Justin) | $130-150K | #4 | April 15 | June 3 | July 3
Senior Backend Developer | Build (Justin) | $140-160K | #5 | April 15 | June 10 | July 10
Full-Stack Developer | Build (Justin) | $120-140K | #6 | April 15 | June 17 | July 17
PM / Scrum Master | Build + Cross-team | $120-140K | #5 HIGH | March 17 | May 26 | June 26
Tony AI Product Owner | Build (Justin) | $130-160K | #6 | April 15 | June 17 | July 17
TR Experience Product Owner | Build (Justin) | $130-160K | #7 | April 15 | June 24 | July 24
Event Ops Contractor #1 | Run (event-gated) | $50-80K | Immediate | March 17 | April 14 | April 28
Event Ops Contractor #2 | Run (event-gated) | $50-80K | Immediate | March 17 | April 21 | May 5

Role Descriptions

Run Team Lead ($140-160K) — This is the #1 blocking hire. Owns all operational triage. Routes P1/P2/P3 incidents. Shields Build team from interrupts. Without this role, the Build vs. Run separation is organizational theater — Spork continues as human router. 45-55 days post-to-offer means we must post within 5 days of announcement.

DevOps Engineer ($130-150K) — Zach’s designated backup. Critical for S2 (Heroku → K8s migration) — Zach cannot architect AND execute a 16-week infrastructure migration alone. Also reduces bus factor 1 risk on all infrastructure.

Integration Engineer ($120-140K) — Owns Salesforce-Stripe-HubSpot-Obv.io integration layer. Relieves Tim Hooker (currently doing Salesforce + 4 JIRA projects) and provides backstop for Federico (Members Portal maintenance).

Data Engineer ($130-150K) — Builds the data pipelines that make S8 (Event Intelligence Dashboard) and the broader TROS vision possible. Partners with Caitlin Noble (Data Analyst). Enables Yogesh to get the ROI data he needs to approve AI investments.

Senior Backend Developer ($140-160K) — Johnny Yarlott’s designated backup on payments, authentication, and core backend services. Reduces the single highest-risk bus factor in the organization. Also accelerates Build Team velocity by adding backend capacity — currently the Build Team has zero dedicated backend engineers.

Full-Stack Developer ($120-140K) — Splits time between Members Portal (Federico backup) and TonyRobbins.com (Nick backup). Directly addresses the two highest contractor-dependency risks. Must be comfortable with Next.js, Node.js, and the Sanity CMS stack.

PM / Scrum Master ($120-140K) — The person who sits in requirements meetings so Justin and Spork don’t have to. Attends all stakeholder meetings, translates requirements into stories (with BA Agent support), runs all sprint ceremonies, provides weekly status updates to leadership, and manages cross-team dependencies. This isn’t a coordinator — it’s a real PM who understands the technical stack and can push back on scope creep.

Tony AI Product Owner ($130-160K) — Dedicated owner for Tony AI — the $23M ARR product with 49K subscribers. Owns the product roadmap, growth strategy, retention metrics, feature prioritization, and the path from $39/mo to a full coaching companion. Reports to Justin. Must have SaaS product management experience, ideally in AI/ML consumer products.

TR Experience Product Owner ($130-160K) — Dedicated owner for the Tony Robbins Experience platform — the portal unification (S5), Mastery Path (S3), and Event Passport (S4). This is the product that turns RRI from an events company into a technology company. Owns the unified customer journey from event purchase through lifetime engagement. Reports to Justin. Must understand subscription models and multi-product platforms.

Event Ops Contractors ($50-80K each) — Dedicated to event operations (kiosk setup, day-of support, attendee troubleshooting). Frees senior engineers from event duty. Event-gated — only active during event windows.

Developer Derisking & AI-Augmented Resilience

Five engineers are bus factor 1 on revenue-critical systems. The traditional fix (hire backups) takes 3-6 months per person and doubles headcount cost. Our approach: a three-layer resilience model combining human backups with AI agents that serve as always-available knowledge repositories.

Three-Layer Resilience Model

Layer 1: Primary Owner

  • Deep system expertise
  • Makes architectural decisions
  • Reviews all PRs for their system
  • Writes documentation continuously
  • Trains both human backup and AI agent

Layer 2: Human Backup

  • Can handle P1 incidents solo
  • Reviews 30%+ of PRs
  • Shadows primary on deployments
  • Rotates in during PTO/events
  • Documented runbooks for key scenarios

Layer 3: AI Agent

  • Instant codebase knowledge recall
  • Answers “how does X work?” in seconds
  • Guides human backup through unfamiliar code
  • Generates context for incident response
  • Never forgets, never goes on PTO

Critical System Resilience Map

Payments / Auth / Core Backend (BF1)
  • Primary Owner: Johnny Yarlott (Stripe, Auth0, order-ingestion)
  • Human Backup: Senior Backend Dev (new hire #5, designated backup)
  • AI Agent: Kingler (full codebase knowledge, payments flow docs)

Infrastructure / K8s / GitOps (BF1)
  • Primary Owner: Zach Hardesty (K8s clusters, ArgoCD, data lake)
  • Human Backup: DevOps Engineer (new hire #2, Zach’s designated backup)
  • AI Agent: Kingler (infra configs, runbooks, deployment procedures)

TonyRobbins.com (BF1)
  • Primary Owner: Nick Jensen (Next.js, Sanity CMS, Experience API)
  • Human Backup: Full-Stack Dev (new hire #6, site + portal coverage)
  • AI Agent: TR.com Agent (planned: site architecture, Sanity schemas)

Members Portal (BF1)
  • Primary Owner: Federico Del Rio (Nearsure contractor, sole maintainer)
  • Human Backup: Josh Fuller (backend support, shadowing Federico)
  • AI Agent: Portal Agent (planned: portal codebase, API contracts)

RPM Planner (BF1)
  • Primary Owner: Jonathan Perez (DualBoot contractor, sole maintainer)
  • Human Backup: Alex + Justin (product knowledge + emergency dev capacity)
  • AI Agent: Kingler (RPM codebase indexed, architecture docs)
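
One way to keep the resilience map auditable is to hold it as data and query it for gaps. The sketch below is illustrative: the `Coverage` structure and `systems_missing_agent` function are hypothetical names, and only the owners, backups, and agent statuses are taken from the map above.

```python
from typing import NamedTuple


class Coverage(NamedTuple):
    primary: str
    human_backup: str
    agent: str
    agent_active: bool  # False where the map marks the agent as planned


# Values copied from the resilience map; the structure itself is an assumption.
RESILIENCE_MAP = {
    "Payments / Auth / Core Backend": Coverage(
        "Johnny Yarlott", "Senior Backend Dev (new hire)", "Kingler", True),
    "Infrastructure / K8s / GitOps": Coverage(
        "Zach Hardesty", "DevOps Engineer (new hire)", "Kingler", True),
    "TonyRobbins.com": Coverage(
        "Nick Jensen", "Full-Stack Dev (new hire)", "TR.com Agent", False),
    "Members Portal": Coverage(
        "Federico Del Rio", "Josh Fuller", "Portal Agent", False),
    "RPM Planner": Coverage(
        "Jonathan Perez", "Alex + Justin", "Kingler", True),
}


def systems_missing_agent(resilience_map: dict) -> list:
    """Return the systems whose Layer 3 agent is not yet active, sorted by name."""
    return sorted(name for name, c in resilience_map.items() if not c.agent_active)


print(systems_missing_agent(RESILIENCE_MAP))
```

With the statuses listed in this document, the two systems still waiting on a Layer 3 agent are Members Portal and TonyRobbins.com.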

AI Agents as Knowledge Repositories

Agent | Status | Knowledge Domain | Primary Use Case
Kingler | ACTIVE | All RRI repositories, architecture docs, deployment configs | Codebase Q&A, onboarding acceleration, incident context
Chatot | ACTIVE | IT support knowledge base, Zendesk history, common issues | L0 triage, auto-resolution of known issues, ticket routing
Inigo | ACTIVE | Product strategy, roadmap context, competitive intelligence | Strategy analysis, pre-read generation, decision support for Justin
TonyRobbins.com Agent | PLANNED | TR.com codebase, Sanity CMS schemas, Next.js architecture | Nick’s knowledge backup, onboarding new devs to the site
Portal Agent | PLANNED | Members Portal codebase, API contracts, user flows | Federico dependency reduction, Josh Fuller training acceleration

The resilience math: With all three layers active, losing any single person degrades capability but doesn’t create a crisis. The human backup can handle incidents with AI agent guidance. The AI agent provides instant context that would otherwise take weeks to rebuild. Combined effect: bus factor moves from 1 → 2.5 effective (human backup + AI-assisted recovery).

QA Strategy: AI-Powered Testing Agents

RRI has no QA function. Engineers test their own code, which means bugs ship to production regularly. Hiring a QA engineer ($90-120K) adds headcount; instead, we deploy AI-powered QA agents that run continuously in CI/CD for a fraction of the cost.

Before (No QA)

  • Engineers test their own code
  • No visual regression testing
  • No accessibility auditing
  • No load testing before events
  • No API contract validation
  • Bugs found in production by users
  • No test coverage metrics

After (Agent-Powered QA)

  • Automated QA in every PR and deploy
  • Visual regression catches UI breaks
  • WCAG 2.1 AA compliance enforced
  • Pre-event load testing automated
  • API contracts validated on every change
  • Bugs caught before merge
  • Coverage dashboards in Swarmia

Frontend QA Agent

Capabilities:

Visual Regression Testing — Playwright screenshots compared against baselines on every PR. Catches unintended UI changes across all breakpoints.
End-to-End Testing — Critical user flows (signup, purchase, login, RPM access) tested on every deploy. Playwright + custom assertions.
Accessibility Auditing — axe-core integrated into CI. Every page scanned for WCAG 2.1 AA violations. PR blocked if new violations introduced.
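Performance Monitoring - Lighthouse CI runs on every PR. Core Web Vitals tracked. Regression alerts if LCP, CLS, or INP degrade beyond threshold. (INP replaced FID as a Core Web Vital in March 2024; Lighthouse lab runs report Total Blocking Time as the responsiveness proxy.)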
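
The "degrade beyond threshold" rule for performance metrics can be expressed as a small comparison against a stored baseline. This is a minimal sketch, not Lighthouse CI's own assertion mechanism: the `regressions` function, the metric key names, and the 10% default tolerance are all assumptions made for illustration.

```python
def regressions(baseline: dict, current: dict, tolerance: float = 0.10) -> list:
    """Return metric names whose value worsened by more than `tolerance`.

    Lower is better for metrics such as LCP and CLS, so "worse" means
    "higher". The 10% default tolerance is an assumed placeholder.
    """
    flagged = []
    for name, base in baseline.items():
        value = current.get(name)
        if value is None or base <= 0:
            continue  # skip metrics missing from this run or with no usable baseline
        if (value - base) / base > tolerance:
            flagged.append(name)
    return flagged


# Hypothetical numbers: LCP slowed from 2000 ms to 2500 ms, CLS unchanged.
baseline = {"lcp_ms": 2000, "cls": 0.05}
current = {"lcp_ms": 2500, "cls": 0.05}
print(regressions(baseline, current))
```

A relative tolerance avoids alerting on run-to-run noise; the right value would need to be tuned against real CI variance.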

Backend QA Agent

Capabilities:

API Contract Testing — OpenAPI spec validation on every backend PR. Ensures frontend/backend contracts stay in sync. Breaking changes flagged automatically.
Pre-Event Load Testing — k6/Artillery load tests run automatically 48 hours before every event. Simulates expected concurrent users. Alerts if response times breach thresholds.
Data Integrity Checks — Validates Stripe ↔ Salesforce ↔ Portal data consistency. Runs nightly + pre-event. Catches sync failures before they impact customers.
Integration Health — Monitors all third-party API endpoints (Stripe, Salesforce, HubSpot, Obv.io). Proactive alerts before failures cascade.
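
To make "breaking changes flagged automatically" concrete, the sketch below compares two OpenAPI documents (already parsed into dictionaries) and lists operations that disappeared. It covers only removed paths and methods, which is one narrow kind of breaking change; the `breaking_changes` function and the example paths are hypothetical, and a real validator would also check parameters and response schemas.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def breaking_changes(old_spec: dict, new_spec: dict) -> list:
    """List operations present in old_spec but missing from new_spec."""
    removed = []
    new_paths = new_spec.get("paths", {})
    for path, item in old_spec.get("paths", {}).items():
        for method in item:
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            if method not in new_paths.get(path, {}):
                removed.append(f"{method.upper()} {path}")
    return sorted(removed)


# Hypothetical specs: the new version drops POST /orders and the whole /users path.
old = {"paths": {"/orders": {"get": {}, "post": {}}, "/users": {"get": {}}}}
new = {"paths": {"/orders": {"get": {}}}}
print(breaking_changes(old, new))
```

A CI step could fail the PR whenever this list is non-empty and no version bump accompanies the change.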

Implementation Phases

Phase | Timeline | Deliverables | Tools
Phase 1: Foundation | Weeks 5-8 | E2E tests for 5 critical user flows, API contract testing in CI, basic Lighthouse CI integration | Playwright, OpenAPI validator, Lighthouse CI
Phase 2: Visual + Accessibility | Weeks 9-12 | Visual regression baselines for all customer-facing pages, axe-core a11y scanning in CI, coverage dashboards | Playwright visual compare, axe-core, Swarmia
Phase 3: Load + Data | Weeks 13-16 | Pre-event load testing automation, Stripe/SF data integrity nightly checks, integration health monitoring | k6/Artillery, custom data validators, Datadog

Cost comparison: a QA engineer costs $90-120K/year plus benefits. AI QA agent infrastructure costs roughly $200-500/month (CI compute + tool licenses), or about $2.4-6K/year. That is roughly 93-98% cheaper on salary alone, with 24/7 coverage that never calls in sick, has no context-switching overhead, and scales linearly with the number of repos.
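
The savings figure can be checked with a few lines of arithmetic. The sketch uses only the salary and infrastructure ranges quoted above; it assumes benefits are excluded and that the time spent building and maintaining the agents is not counted, so it is an upper bound on the real saving. The `savings_pct` helper is a hypothetical name.

```python
def savings_pct(agent_monthly: float, engineer_annual: float) -> float:
    """Percent saved by agent infrastructure relative to a salary, excluding benefits."""
    agent_annual = agent_monthly * 12
    return round(100 * (1 - agent_annual / engineer_annual), 1)


# Least favorable pairing: highest agent cost against the lowest salary.
low = savings_pct(500, 90_000)
# Most favorable pairing: lowest agent cost against the highest salary.
high = savings_pct(200, 120_000)
print(low, high)
```

The two pairings bracket the saving at roughly 93% to 98% of salary.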

Product Ownership & PM Layer

Product ownership across RRI is currently fragmented across 6+ people with no one owning the full customer experience. The proposed model unifies ownership under Justin Kahn as VP/Head of Product with a formal governance structure and dedicated PM roles for each team.

Why Dedicated POs Matter

Tony AI alone has 49K paying subscribers and $23M ARR — that’s a standalone product that needs a dedicated owner who wakes up thinking about retention, engagement, and growth. The Tony Robbins Experience (portal unification, Mastery Path, Event Passport) is the platform play that drives the $1B valuation story. These can’t be side projects for people who also handle RPM, integrations, and CRM.

The Core Problem: Justin & Spork Are Stuck in Meetings

Why nothing ships: Justin and Spork spend their days in requirements meetings, stakeholder updates, and cross-department coordination instead of leading their teams. Spork has 6+ standing meetings daily. Justin is pulled into every product conversation because there’s no one else. Engineers get interrupted directly via Slack. Nobody is protecting development time or running the process. The fix isn’t better time management — it’s dedicated people whose job is the meetings, the process, and the stakeholder communication.

Project Manager / Scrum Master (New Hire)

This is a dedicated PM hire — not Justin wearing another hat. This person sits in requirements meetings so Justin doesn’t have to. They update stakeholders on project status so Spork doesn’t have to. They run sprint ceremonies, protect the team from scope creep, and are the single point of contact for “when will X be done?”

PM / Scrum Master Responsibilities

  • Attends all requirements meetings — so Justin and Spork don’t
  • Updates stakeholders on project status — weekly reports, ad-hoc questions
  • Facilitates all sprint ceremonies (planning, standup, review, retro)
  • Translates business requirements into technical stories (with BA Agent support)
  • Protects sprint from scope creep and unplanned work
  • Tracks velocity, burndown, and DORA metrics in Swarmia
  • Manages cross-team dependencies and blockers
  • Coaches team on Scrum practices

Run Team: Technical PM (Run Team Lead)

  • Manages Kanban board WIP limits
  • Tracks SLA compliance
  • Coordinates incident response
  • Reports ops metrics to leadership
  • Manages vendor/MSP relationships
  • Built into the Run Team Lead role

The unlock: With a dedicated PM in requirements meetings and handling stakeholder updates, Justin focuses on product vision and architecture decisions. Spork focuses on engineering leadership and system reliability. Neither is a human router anymore. The PM becomes the “shield” that lets technical leaders do technical work.

Dedicated Product Owners by Product

Alex Hoisington currently covers too much ground. The three biggest products each need a dedicated owner who lives and breathes that product every day.

Product Ownership: Dedicated POs

VP / Head of Product
  • Justin Kahn: VP/Head of Product (vision, strategy, North Star: Mastery Path)

Dedicated Product Owners
  • Tony AI PO (NEW): owns Tony AI product (49K subscribers, $23M ARR, growth strategy)
  • TR Experience PO (NEW): owns Tony Robbins Experience (portal unification, Mastery Path, Event Passport)
  • Alex Hoisington: Backend Products PO (RPM, integrations, order pipeline)
  • Tim Hooker: CRM PO (Salesforce, data flows)

Project Management
  • PM / Scrum Master (NEW): requirements meetings, stakeholder updates, sprint ceremonies, process
  • BA Agent (AGENT): creates Jira tickets from requirements meetings

AI Business Analyst Agent

Requirements meetings generate ideas, decisions, and action items — but translating those into well-structured Jira tickets with acceptance criteria is tedious, error-prone, and often doesn’t happen. The BA Agent sits in every requirements meeting (via transcript) and automatically generates tickets.

BA Agent Capabilities:

Meeting → Tickets: Ingests meeting transcripts (Zoom/Teams recording → Whisper transcription). Identifies action items, decisions, and feature requests. Generates draft Jira stories with title, description, acceptance criteria, and suggested priority.
Requirements Structuring: Takes loose stakeholder language (“we need the checkout to be faster”) and structures it into testable acceptance criteria (“checkout page loads in <2s on 3G, Stripe Payment Element renders within 1s”).
Impact Assessment: Cross-references new requirements against existing backlog and active sprint to flag conflicts, duplicates, and dependencies before tickets are committed.
PM Review Queue: All BA Agent-generated tickets go into a PM review queue — the PM / Scrum Master approves, edits, or rejects before they hit the backlog. No auto-create to backlog.

Input | BA Agent Action | Output
Meeting transcript | Extract action items, feature requests, bug reports | Draft Jira stories in PM review queue
Slack thread with stakeholder request | Structure into story with acceptance criteria | Draft ticket + link to original thread
Email from marketing (“new SKU needed”) | Generate PCR draft + engineering impact estimate | PCR in Product Council approval queue
Incident post-mortem | Extract follow-up action items | Bug/improvement tickets with post-mortem link
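
The "no auto-create to backlog" rule is essentially a small state machine: agent output enters a review queue, and only a PM decision moves it on. The sketch below illustrates that gate under stated assumptions; `DraftTicket`, `ReviewQueue`, and the field names are hypothetical and do not describe the BA Agent's real data model or the Jira API.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class DraftTicket:
    title: str
    acceptance_criteria: list
    source_link: str  # transcript, Slack thread, or post-mortem the ticket came from
    status: Status = Status.DRAFT


@dataclass
class ReviewQueue:
    pending: list = field(default_factory=list)
    backlog: list = field(default_factory=list)

    def submit(self, ticket: DraftTicket) -> None:
        """Agent output always lands in the review queue, never the backlog."""
        self.pending.append(ticket)

    def review(self, ticket: DraftTicket, approve: bool) -> None:
        """Only a PM decision moves a ticket out of the queue."""
        self.pending.remove(ticket)
        ticket.status = Status.APPROVED if approve else Status.REJECTED
        if approve:
            self.backlog.append(ticket)
```

Because `submit` has no path to `backlog`, the agent cannot bypass the PM even if its output volume grows.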

Current Fragmentation

Product Area | Current Owner | Problem
Tony AI & RPM | Justin Kahn | No dedicated product manager
Coaching Programs | Chris Schenke | No tech integration
Platinum Partnership | Scotty | Siloed from digital products
Inner Circle / Biz Accelerator | Bree (under Diane) | Separate tech stack
Summit & Marketing | Jesse | Controls HubSpot, changes pages 5 min before go-live
Traditional Events | No single owner | Requirements come from everywhere

Proposed: SVPG Product Council

Based on Silicon Valley Product Group methodology. Justin Kahn becomes unified product owner. All product decisions evaluated against a single North Star: the Mastery Path progression (UPW → Tony AI → RPM → Coaching → Inner Circle → Platinum).

  • Quarterly Strategy Review (3 hours) — 7 members max. Sets product direction for the quarter. Inigo (Justin’s AI strategy agent) drafts pre-reads.
  • Monthly Operating Review (90 min) — Progress against quarterly goals, resource reallocation, cross-product dependencies.
  • Product Change Request (PCR) Process — 30-day lead time for new SKUs. Engineering impact assessment required. Council approval gate before any SKU touches Stripe/Salesforce/Sanity/order-ingestion.
  • Freeze Mode — PAD (Product Admin Dashboard, U4) enforces freeze windows. Product changes can be created but cannot publish without CTO approval. Transforms policy into system-enforced guardrail.
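
The PCR lead time and Freeze Mode are both rules a system can enforce rather than policy people must remember. The sketch below shows one possible shape for those checks. It is not the PAD implementation: the function names, the representation of freeze windows as date pairs, and the example dates are all assumptions.

```python
from datetime import date, timedelta

PCR_LEAD_TIME = timedelta(days=30)  # 30-day lead time for new SKUs


def earliest_launch(pcr_submitted: date) -> date:
    """Earliest date a new SKU may go live after its PCR is submitted."""
    return pcr_submitted + PCR_LEAD_TIME


def can_publish(today: date, freeze_windows: list, cto_approved: bool) -> bool:
    """Changes can always be created; publishing inside a freeze window needs CTO approval."""
    in_freeze = any(start <= today <= end for start, end in freeze_windows)
    return cto_approved or not in_freeze


# Hypothetical freeze window around an event.
windows = [(date(2026, 6, 1), date(2026, 6, 7))]
print(can_publish(date(2026, 6, 3), windows, cto_approved=False))
print(earliest_launch(date(2026, 4, 1)))
```

Keeping the approval flag as an explicit input means every bypass is a recorded decision rather than a quiet exception.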

Critical: Authority without enforcement is theater. Erik must enforce the structure when the first bypass attempt happens. One enforcement moment sets the precedent. A code freeze was attempted a year ago — it lasted 2 weeks because nobody enforced it.

Sprint Ceremonies & Process Cadence

Build Team (Scrum)

2-week sprints with 20% interrupt buffer built in. First sprint deliberately at 50% velocity — build trust before optimizing throughput. Product Owner (Alex Hoisington) gates ALL unplanned work. No direct Slack pings to Build engineers about ops issues.
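
The buffer and ramp-up rules translate into simple capacity arithmetic. The sketch below assumes the 20% interrupt buffer is applied on top of the 50% first-sprint ramp, which this document does not state explicitly; the function names and the 40-point baseline are hypothetical.

```python
def committed_points(baseline_velocity: float,
                     interrupt_buffer: float = 0.20,
                     ramp: float = 1.0) -> float:
    """Story points to commit after reserving the interrupt buffer.

    ramp=0.5 models the deliberately reduced first sprint.
    """
    return baseline_velocity * ramp * (1 - interrupt_buffer)


def commitment_rate(completed: float, committed: float) -> float:
    """Share of committed points actually completed (the 70%+ target)."""
    return completed / committed if committed else 0.0


# Hypothetical team with a 40-point baseline velocity.
print(committed_points(40))            # steady-state sprint
print(committed_points(40, ramp=0.5))  # first sprint
print(commitment_rate(12, 16))
```

Tracking the completion rate against commitments, rather than raw points, keeps the 70% target meaningful while the baseline is still being established.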

Sprint Planning
Every 2 weeks · 2 hours
PO presents prioritized backlog. Team commits to sprint goal. 20% buffer reserved for interrupt work. Velocity tracked via Swarmia.
Build Team
Daily Standup
Daily · 15 min max
What I did, what I’m doing, what’s blocking me. No status updates that belong in Jira. Spork does NOT attend Build standups.
Build Team
Sprint Review / Demo
Every 2 weeks · 1 hour
Working software demonstrated to stakeholders. Not a slide deck — live product. Feedback captured for next sprint.
Build Team + Stakeholders
Sprint Retrospective
Every 2 weeks · 45 min
What went well, what didn’t, what to change. One action item minimum. Track retro actions in Jira.
Build Team
Backlog Refinement
Weekly · 1 hour
PO + tech lead groom upcoming stories. Acceptance criteria defined. Story points estimated. 2 sprints ahead minimum.
Build Team (PO + Leads)

Run Team (Kanban)

Kanban with WIP limits (team_size + 1). P1/P2/P3 incident severity tiers. OpsGenie on-call rotation ($9/user/month). No sprints — continuous flow with SLA targets.
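
The WIP rule is small enough to state as code. This sketch takes the `team_size + 1` formula from the paragraph above; the function names are hypothetical, and a real board would enforce the limit per column or per board inside Jira.

```python
def wip_limit(team_size: int) -> int:
    """WIP limit as defined for the Run Team: team size plus one."""
    return team_size + 1


def can_pull(in_progress: int, team_size: int) -> bool:
    """True if a new item may be started without breaching the WIP limit."""
    return in_progress < wip_limit(team_size)


# Hypothetical six-person team.
print(wip_limit(6))
print(can_pull(in_progress=7, team_size=6))
```

The extra slot lets one person park a blocked item without the whole team stalling, while still capping multitasking.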

Daily Ops Sync
Daily · 15 min
Review Kanban board. What’s blocked. What’s breaching SLA. Run Team Lead facilitates — Spork observes, doesn’t direct.
Run Team
Weekly Ops Review
Weekly · 30 min
Incident trends. SLA compliance. Capacity planning. Escalation patterns. Run Team Lead reports to Spork.
Run Team + Spork
Incident Post-Mortem
Within 48 hours of P1
Blameless. Root cause analysis. Action items with owners and deadlines. Published to engineering Confluence space.
Run Team + Affected

Incident Severity Tiers

Tier | Definition | Response SLA | Resolution SLA | Escalation
P1: Critical | Revenue impact, system down, data loss | 15 minutes | 30 minutes | All hands + CTO + Erik
P2: High | Degraded service, workaround exists | 1 hour | 4 hours | Run Team Lead + Spork
P3: Normal | Non-urgent bugs, enhancement requests | Next business day | 5 business days | Run Team Lead routes
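
The SLA columns can be turned into concrete deadlines for a given incident. The sketch below is a minimal illustration: it treats business days as Monday to Friday and ignores holidays and business hours, which are assumptions, and the function names are hypothetical rather than anything OpsGenie provides.

```python
from datetime import datetime, timedelta


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by whole business days, skipping Saturdays and Sundays (holidays ignored)."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days -= 1
    return current


def sla_deadlines(tier: str, opened: datetime) -> tuple:
    """Return (response_due, resolution_due) for a severity tier."""
    if tier == "P1":
        return opened + timedelta(minutes=15), opened + timedelta(minutes=30)
    if tier == "P2":
        return opened + timedelta(hours=1), opened + timedelta(hours=4)
    if tier == "P3":
        return add_business_days(opened, 1), add_business_days(opened, 5)
    raise ValueError(f"unknown tier: {tier}")


# A P3 opened on Friday, March 13, 2026 gets a Monday response deadline.
opened = datetime(2026, 3, 13, 16, 0)
print(sla_deadlines("P3", opened))
```

Computing both deadlines at ticket creation makes "what's breaching SLA" a query instead of a judgment call in the daily ops sync.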

Cross-Team Ceremonies

Quarterly Roadmap Planning
Quarterly · Half day
Review DERISK/UNCLOG/SCALE lens. Re-prioritize initiatives. Set quarterly milestones. Both teams + CTO + stakeholders.
All Engineering + Leadership
Product Council
Monthly · 90 min
SVPG-style operating review. Product direction, PCR approvals, cross-product dependencies. Justin facilitates.
Product Owners + CTO + Stakeholders
AI Governance Committee
Monthly · 60 min
Agent fleet review. ROI tracking. New agent approvals. Risk tier assignments. Jay + Justin + Spork + Lior.
AI Team + Leadership
DORA Metrics Review
Monthly · 30 min
Deployment frequency, lead time, change failure rate, MTTR. Tracked via Swarmia. Both teams benchmark against industry.
Engineering Leads + CTO
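
For readers unfamiliar with the four DORA metrics, the sketch below computes them from a list of deployment records. It is a simplified illustration, not how Swarmia calculates them: the `Deployment` record, the use of a mean for lead time and recovery time, and the sample data are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # first commit in the change
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set once a failed deploy is recovered


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_metrics(deploys: list, period_days: int) -> dict:
    """Compute the four DORA metrics over a reporting period."""
    if not deploys or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    failures = [d for d in deploys if d.failed]
    restored = [d for d in failures if d.restored_at is not None]
    return {
        "deploys_per_day": len(deploys) / period_days,
        "lead_time_hours": mean(hours(d.deployed_at - d.committed_at) for d in deploys),
        "change_failure_rate": len(failures) / len(deploys),
        "mttr_hours": mean(hours(d.restored_at - d.deployed_at) for d in restored) if restored else None,
    }
```

Measuring recovery from the failed deploy rather than from detection is a simplification; an incident tool would supply the detection timestamp.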

Process Tooling

Tool | Purpose | Team | Cost
Jira | Sprint boards (Build) + Kanban boards (Run) | Both | Existing
OpsGenie | On-call rotation, incident alerting, escalation | Run | $9/user/month
Swarmia | DORA metrics, sprint velocity, engineering analytics | Both | ~$30/user/month
Confluence | Documentation, post-mortems, architecture decisions | Both | Existing
GitHub | Code, PRs, CI/CD, CODEOWNERS | Both | Existing
Playwright | E2E testing, visual regression, cross-browser QA | QA Agents | Open source
axe-core | Accessibility auditing (WCAG 2.1 AA) in CI | QA Agents | Open source
k6 / Artillery | Load testing, pre-event capacity validation | QA Agents | Open source / ~$100/mo
Zendesk | IT service desk ticketing, Chatot AI triage integration | Run (IT) | ~$55/agent/month
Reclaim.ai | AI calendar tool for engineering focus time | Build | $10/user/month (optional)
Retool | Product Admin Dashboard Phase 1 UI | Build | $50/user/month

Restructuring Timeline

Phase | Timeline | Key Actions
Phase 1: Stabilize & Separate | Weeks 1-2 | Announce restructuring. Create two Jira boards. Establish P1/P2/P3 tiers. Set WIP limits. Spork stops attending Build standups. Implement 20% interrupt buffer. Kill Spork’s 6+ daily meetings.
Phase 2: Hire & Stabilize | Weeks 3-8 | Post Run Team Lead + DevOps + Integration Engineer + Data Engineer + Senior Backend Dev + Full-Stack Dev. Install OpsGenie. Documentation sprints for Zach and Johnny (D1). Convert Jay Lane full-time. Begin QA agent Phase 1. Evaluate MSP options.
Phase 3: Operational Cadence | Weeks 9-12 | Run Team Lead onboarded and independent. First quarterly roadmap planning session. Swarmia DORA metrics baseline established. Build team completing 70%+ of sprint commitments. QA agent Phase 2 (visual + a11y). IT Service Desk model operational.
Phase 4: Scale & Harden | Weeks 13-16 | New developers onboarded and productive. QA agents fully operational (Phase 3: load + data). Three-layer resilience model validated for all BF1 systems. MSP evaluation complete. Pre-event load testing automated. AI agent knowledge repositories indexed for all critical systems.

Success looks like: Build team completing 70%+ of sprint commitments. P1 incidents resolved in 30 minutes. No engineer working 10-hour days for 3+ consecutive days. First event with a clean code freeze that actually holds. Product Council meeting monthly with documented decisions. All BF1 systems at bus factor 2+ with AI agent knowledge backup. QA agents catching bugs before they reach production.