Team Structure & Process — RRI Technical Roadmap

Current Team Structure (March 2026)

RRI’s engineering organization is structurally broken: everyone does everything, which means nothing gets done reliably. No sprint has closed in 3 weeks. The team burned out after 3-4 weeks of 10-hour days. Spork attends 6+ standing meetings daily, functioning as a human router instead of an engineering director.

Current State — No Separation

Engineering Leadership

Spork (Michael Evans): Dir. of Engineering

Justin Kahn: Head of Digital Innovation

Run Team (Spork) — Operations & Infrastructure

Johnny Yarlott: Core Backend / Auth / Payments [BF1]

Zach Hardesty: Infra / K8s / GitOps / Data Lake [BF1]

Sean: Network / Systems

Dean Schwartz: Help Desk

Josh Fuller: Backend Support

Tim Hooker: Salesforce / CRM (leads 4 JIRA projects)

Build Team (Justin) — Products & Digital

Nick Jensen: Principal Architect, TonyRobbins.com [BF1]

Ken: UI/UX

Esmee: HubSpot / Front-end

Caitlin Noble: Data

Alex Hoisington: Product Owner, RPM Web

Pam Hendrickson: Content

Jay Lane: AI (half-time contractor)

Contractors — Critical Systems

Federico Del Rio (Nearsure): sole maintainer, Members Portal [BF1]

Jonathan Perez (DualBoot): sole maintainer, RPM Planner [BF1]

Freddy Garcia (Nearsure): AI Tools

awilmort (Nearsure): Salesforce

OnBuild / Nortal: TonyRobbins.com contributors

Structural problems:

• No separation between operations (reactive) and product development (planned)
• Spork is a human router — 6+ standing meetings/day, no time for engineering
• No sprint has closed in 3 weeks
• No PM layer — unplanned work hits engineers directly
• No QA function — engineers test their own code
• No product ownership model — 6+ people own fragments of the customer experience
• 5 engineers are bus factor 1 on revenue-critical systems
• Contractor notice periods unknown

Proposed Team Structure

The fix is proven: separate Run (Kanban, reactive) from Build (Scrum, planned) with a hard organizational wall between them. Framework: Team Topologies (2025 update) — Run = Platform Team, Build = Stream-Aligned Team.

Before (Current)

Everyone does everything
No sprint velocity tracking
Spork routes all requests manually
Engineers handle ops + features
No PM or PO layer
No QA function
No on-call rotation
10-hour days, burnout

After (Restructured)

Run Team (Kanban) + Build Team (Scrum)
70%+ sprint commitment completion target
Run Team Lead triages all operational requests
Build engineers protected from ops interrupts
Dedicated PM + PO layer
AI-powered QA agents in CI/CD
OpsGenie on-call rotation
40-hour weeks, sustainable pace

Future State — Build vs. Run Separation

CTO / Technology Leadership

Lior Weinstein: Fractional CTO (strategy, architecture, hiring)

Erik Logan: CEO (authority to enforce structure)

Run Team (Kanban) — Spork, Director of Engineering

Run Team Lead ($140-160K): triages ALL ops requests [NEW]

Johnny Yarlott: Core Backend / Auth / Payments

Zach Hardesty: Infrastructure / K8s / GitOps

DevOps Engineer ($130-150K): Zach’s backup, K8s migration [NEW]

Integration Engineer ($120-140K): SF/Stripe/3rd-party integrations [NEW]

Sean: Network / Systems / Security

Dean Schwartz: IT Service Desk Lead

Josh Fuller: Backend Support (Federico backup)

Tim Hooker: Salesforce / CRM

Event Ops Contractor #1 ($50-80K): event-gated operations [NEW]

Event Ops Contractor #2 ($50-80K): event-gated operations [NEW]

Build Team (Scrum) — Justin Kahn, Unified Product Owner

Nick Jensen: Principal Architect (Experience API, Portal)

Senior Backend Developer ($140-160K): Johnny’s backup, payments/auth [NEW]

Full-Stack Developer ($120-140K): Portal + TonyRobbins.com [NEW]

Ken: UI/UX Design

Esmee: HubSpot / Front-end

Alex Hoisington: Product Owner (RPM + Digital Products)

Data Engineer ($130-150K): analytics, data lake, reporting [NEW]

Caitlin Noble: Data Analyst

Pam Hendrickson: Content

AI Team — Reports to Justin (Build) with governance across both teams

Jay Lane: Head of AI (full-time conversion from half-time) [FT]

Freddy Garcia (Nearsure): AI Tools Development

Daniel: Agentic AI (Tony’s hire), under governance

AI Agent Layer — Autonomous agents across both teams

Kingler: codebase knowledge, all repos [AGENT]

Chatot: IT triage, Zendesk/Slack routing [AGENT]

Inigo: strategy partner, Justin’s AI advisor [AGENT]

Frontend QA Agent: visual regression, E2E, a11y [AGENT]

Backend QA Agent: API contracts, load testing, data integrity [AGENT]

BA Agent: meeting transcripts → Jira tickets with acceptance criteria [AGENT]

TonyRobbins.com Agent: site knowledge, Nick’s backup [PLANNED]

Portal Agent: Members Portal knowledge, Federico’s backup [PLANNED]

Contractors — With Named Backups

Federico Del Rio: Members Portal (backup: Josh Fuller)

Jonathan Perez: RPM Planner (backup: Alex + Justin)

OnBuild / Nortal: TonyRobbins.com (backup: Nick)

IT Service Desk & MSP Layer

Internal support requests currently go directly to engineers via Slack DMs. This creates constant interrupts and makes it impossible to measure support volume. The fix: a formal three-tier support model (L1 to L3) with Dean Schwartz as IT Service Desk Lead, fronted by AI triage (L0) as the first line of defense.

Three-Tier Support Model

Tier	Team	Handles	SLA	Escalation
L0 — AI Triage	Chatot Agent	Zendesk/Slack intake, auto-categorization, known-issue resolution, password resets, FAQ responses	< 2 min response	Auto-route to L1 if unresolved
L1 — IT Service Desk	Dean Schwartz + MSP	Account provisioning, hardware/software requests, VPN/network issues, vendor coordination, basic troubleshooting	4 hour response	Run Team Lead (L2)
L2 — Engineering Ops	Run Team	Infrastructure issues, deployment failures, database problems, integration bugs, performance degradation	P2: 1 hour / P3: next day	Build Team PO (L3)
L3 — Engineering Dev	Build Team	Code-level bugs requiring feature changes, architectural issues, new integration development	Next sprint planning	CTO / Product Council

Ticket Routing Flow

Support Request Flow

Intake — All requests enter here

Chatot AI Triage: Zendesk + Slack → auto-categorize & route [AGENT]

L1 — IT Service Desk

Dean Schwartz: IT Service Desk Lead, handles or escalates

MSP Partner (TBD): after-hours + overflow coverage

L2 — Engineering Ops

Run Team Lead: triages engineering-level issues [NEW]

L3 — Engineering Dev

Build Team PO (Justin/Alex): adds to sprint backlog if code change required

MSP Evaluation

Decision pending: Evaluate whether RRI needs a Managed Service Provider (MSP) for after-hours IT coverage, hardware lifecycle management, and L1 overflow. Dean currently handles this alone — single point of failure for IT support during events and after hours.

Evaluation criteria: After-hours coverage model, per-seat pricing vs. fixed fee, Zendesk integration capability, onsite support during events, SLA guarantees. Target decision: end of Phase 2 (Week 8).

What does NOT go through IT Service Desk:

• Production incidents (P1/P2) — go directly to OpsGenie → Run Team on-call
• Feature requests — go through Product Council → Build Team backlog
• Infrastructure changes — go through Change Advisory Board (Run Team Lead + Zach)
• Security incidents — go directly to Sean + CTO escalation
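The intake rules above can be read as a small routing function. The following is a minimal sketch only: the `kind` and `severity` labels are hypothetical, since the roadmap does not define a ticket schema, and the real routing would live in Chatot and Zendesk rather than in standalone code.

```python
from enum import Enum
from typing import Optional


class Route(Enum):
    OPSGENIE_ON_CALL = "OpsGenie -> Run Team on-call"
    PRODUCT_COUNCIL = "Product Council -> Build Team backlog"
    CHANGE_ADVISORY_BOARD = "Change Advisory Board (Run Team Lead + Zach)"
    SECURITY_ESCALATION = "Sean + CTO escalation"
    SERVICE_DESK = "Chatot L0 triage -> IT Service Desk"


def route_request(kind: str, severity: Optional[str] = None) -> Route:
    """Pick the intake path for a request.

    `kind` and `severity` are hypothetical labels used for illustration.
    """
    if kind == "security_incident":
        return Route.SECURITY_ESCALATION
    if kind == "production_incident" and severity in {"P1", "P2"}:
        return Route.OPSGENIE_ON_CALL
    if kind == "feature_request":
        return Route.PRODUCT_COUNCIL
    if kind == "infrastructure_change":
        return Route.CHANGE_ADVISORY_BOARD
    # Everything else enters through AI triage and the service desk.
    return Route.SERVICE_DESK


print(route_request("production_incident", "P1").value)
print(route_request("password_reset").value)
```

Keeping the four bypass paths as explicit early returns makes the default visible: anything not named in the list falls through to the service desk.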

New Roles & Hiring Timeline

Role	Team	Salary	Priority	Post Date	Start Date	First Productive
Jay Lane (FT conversion)	AI	$175K ($87.5K incr.)	#0	N/A	April 1	Immediate
Run Team Lead	Run (Spork)	$140-160K	#1 BLOCKING	March 17	May 15	June 15
DevOps Engineer	Run (Spork)	$130-150K	#2	March 17	May 19	June 19
Integration Engineer	Run (Spork)	$120-140K	#3	March 17	May 21	June 21
Data Engineer	Build (Justin)	$130-150K	#4	April 15	June 3	July 3
Senior Backend Developer	Build (Justin)	$140-160K	#5	April 15	June 10	July 10
Full-Stack Developer	Build (Justin)	$120-140K	#6	April 15	June 17	July 17
PM / Scrum Master	Build + Cross-team	$120-140K	#5 HIGH	March 17	May 26	June 26
Tony AI Product Owner	Build (Justin)	$130-160K	#6	April 15	June 17	July 17
TR Experience Product Owner	Build (Justin)	$130-160K	#7	April 15	June 24	July 24
Event Ops Contractor #1	Run (event-gated)	$50-80K	Immediate	March 17	April 14	April 28
Event Ops Contractor #2	Run (event-gated)	$50-80K	Immediate	March 17	April 21	May 5

Role Descriptions

Run Team Lead ($140-160K) — This is the #1 blocking hire. Owns all operational triage. Routes P1/P2/P3 incidents. Shields Build team from interrupts. Without this role, the Build vs. Run separation is organizational theater — Spork continues as human router. 45-55 days post-to-offer means we must post within 5 days of announcement.

DevOps Engineer ($130-150K) — Zach’s designated backup. Critical for S2 (Heroku → K8s migration) — Zach cannot architect AND execute a 16-week infrastructure migration alone. Also reduces bus factor 1 risk on all infrastructure.

Integration Engineer ($120-140K) — Owns Salesforce-Stripe-HubSpot-Obv.io integration layer. Relieves Tim Hooker (currently doing Salesforce + 4 JIRA projects) and provides backstop for Federico (Members Portal maintenance).

Data Engineer ($130-150K) — Builds the data pipelines that make S8 (Event Intelligence Dashboard) and the broader TROS vision possible. Partners with Caitlin Noble (Data Analyst). Enables Yogesh to get the ROI data he needs to approve AI investments.

Senior Backend Developer ($140-160K) — Johnny Yarlott’s designated backup on payments, authentication, and core backend services. Reduces the single highest-risk bus factor in the organization. Also accelerates Build Team velocity by adding backend capacity — currently the Build Team has zero dedicated backend engineers.

Full-Stack Developer ($120-140K) — Splits time between Members Portal (Federico backup) and TonyRobbins.com (Nick backup). Directly addresses the two highest contractor-dependency risks. Must be comfortable with Next.js, Node.js, and the Sanity CMS stack.

PM / Scrum Master ($120-140K) — The person who sits in requirements meetings so Justin and Spork don’t have to. Attends all stakeholder meetings, translates requirements into stories (with BA Agent support), runs all sprint ceremonies, provides weekly status updates to leadership, and manages cross-team dependencies. This isn’t a coordinator — it’s a real PM who understands the technical stack and can push back on scope creep.

Tony AI Product Owner ($130-160K) — Dedicated owner for Tony AI — the $23M ARR product with 49K subscribers. Owns the product roadmap, growth strategy, retention metrics, feature prioritization, and the path from $39/mo to a full coaching companion. Reports to Justin. Must have SaaS product management experience, ideally in AI/ML consumer products.

TR Experience Product Owner ($130-160K) — Dedicated owner for the Tony Robbins Experience platform — the portal unification (S5), Mastery Path (S3), and Event Passport (S4). This is the product that turns RRI from an events company into a technology company. Owns the unified customer journey from event purchase through lifetime engagement. Reports to Justin. Must understand subscription models and multi-product platforms.

Event Ops Contractors ($50-80K each) — Dedicated to event operations (kiosk setup, day-of support, attendee troubleshooting). Frees senior engineers from event duty. Event-gated — only active during event windows.

Developer Derisking & AI-Augmented Resilience

Five engineers are bus factor 1 on revenue-critical systems. The traditional fix (hire backups) takes 3-6 months per person and doubles headcount cost. Our approach: a three-layer resilience model combining human backups with AI agents that serve as always-available knowledge repositories.

Three-Layer Resilience Model

Layer 1: Primary Owner

Deep system expertise
Makes architectural decisions
Reviews all PRs for their system
Writes documentation continuously
Trains both human backup and AI agent

Layer 2: Human Backup

Can handle P1 incidents solo
Reviews 30%+ of PRs
Shadows primary on deployments
Rotates in during PTO/events
Documented runbooks for key scenarios

Layer 3: AI Agent

Instant codebase knowledge recall
Answers “how does X work?” in seconds
Guides human backup through unfamiliar code
Generates context for incident response
Never forgets, never goes on PTO

Critical System Resilience Map

System	Primary Owner	Human Backup	AI Agent
Payments / Auth / Core Backend (BF1)	Johnny Yarlott (Stripe, Auth0, order-ingestion)	Senior Backend Dev (new hire #5, designated backup)	Kingler (full codebase knowledge, payments flow docs)
Infrastructure / K8s / GitOps (BF1)	Zach Hardesty (K8s clusters, ArgoCD, data lake)	DevOps Engineer (new hire #2, Zach’s designated backup)	Kingler (infra configs, runbooks, deployment procedures)
TonyRobbins.com (BF1)	Nick Jensen (Next.js, Sanity CMS, Experience API)	Full-Stack Dev (new hire #6, site + portal coverage)	TR.com Agent (planned: site architecture, Sanity schemas)
Members Portal (BF1)	Federico Del Rio (Nearsure contractor, sole maintainer)	Josh Fuller (backend support, shadowing Federico)	Portal Agent (planned: portal codebase, API contracts)
RPM Planner (BF1)	Jonathan Perez (DualBoot contractor, sole maintainer)	Alex + Justin (product knowledge + emergency dev capacity)	Kingler (RPM codebase indexed, architecture docs)

AI Agents as Knowledge Repositories

Agent	Status	Knowledge Domain	Primary Use Case
Kingler	ACTIVE	All RRI repositories, architecture docs, deployment configs	Codebase Q&A, onboarding acceleration, incident context
Chatot	ACTIVE	IT support knowledge base, Zendesk history, common issues	L0 triage, auto-resolution of known issues, ticket routing
Inigo	ACTIVE	Product strategy, roadmap context, competitive intelligence	Strategy analysis, pre-read generation, decision support for Justin
TonyRobbins.com Agent	PLANNED	TR.com codebase, Sanity CMS schemas, Next.js architecture	Nick’s knowledge backup, onboarding new devs to the site
Portal Agent	PLANNED	Members Portal codebase, API contracts, user flows	Federico dependency reduction, Josh Fuller training acceleration

The resilience math: With all three layers active, losing any single person degrades capability but doesn’t create a crisis. The human backup can handle incidents with AI agent guidance. The AI agent provides instant context that would otherwise take weeks to rebuild. Combined effect: bus factor moves from 1 → 2.5 effective (human backup + AI-assisted recovery).

QA Strategy: AI-Powered Testing Agents

RRI has no QA function. Engineers test their own code, which means bugs ship to production regularly. Hiring a QA engineer ($90-120K) adds headcount; instead, we deploy AI-powered QA agents that run continuously in CI/CD for a fraction of the cost.

Before (No QA)

Engineers test their own code
No visual regression testing
No accessibility auditing
No load testing before events
No API contract validation
Bugs found in production by users
No test coverage metrics

After (Agent-Powered QA)

Automated QA in every PR and deploy
Visual regression catches UI breaks
WCAG 2.1 AA compliance enforced
Pre-event load testing automated
API contracts validated on every change
Bugs caught before merge
Coverage dashboards in Swarmia

Frontend QA Agent

Capabilities:

• Visual Regression Testing — Playwright screenshots compared against baselines on every PR. Catches unintended UI changes across all breakpoints.
• End-to-End Testing — Critical user flows (signup, purchase, login, RPM access) tested on every deploy. Playwright + custom assertions.
• Accessibility Auditing — axe-core integrated into CI. Every page scanned for WCAG 2.1 AA violations. PR blocked if new violations introduced.
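• Performance Monitoring: Lighthouse CI runs on every PR. Core Web Vitals tracked. Regression alerts if LCP, CLS, or INP degrade beyond threshold. (INP replaced FID as a Core Web Vital in March 2024; Lighthouse lab runs report Total Blocking Time as the interactivity proxy.)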
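The "degrade beyond threshold" rule for performance metrics could be checked with a comparison like the one below. This is a sketch under stated assumptions: the metric names, the threshold values, and the use of Total Blocking Time as the lab interactivity metric are all illustrative, since the roadmap does not specify budgets.

```python
# Hypothetical regression budgets: allowed relative increase per metric.
# The roadmap says "beyond threshold" without giving numbers.
THRESHOLDS = {"lcp_ms": 0.10, "cls": 0.10, "tbt_ms": 0.20}


def find_regressions(baseline: dict, current: dict, thresholds: dict = THRESHOLDS) -> list:
    """Return names of metrics that got worse by more than the allowed fraction.

    Lower is better for every metric here, so a regression is an increase.
    """
    regressions = []
    for metric, allowed in thresholds.items():
        base, now = baseline[metric], current[metric]
        if base == 0:
            # No meaningful ratio; treat any increase from zero as a regression.
            if now > 0:
                regressions.append(metric)
            continue
        if (now - base) / base > allowed:
            regressions.append(metric)
    return regressions


baseline = {"lcp_ms": 2000, "cls": 0.05, "tbt_ms": 150}
current = {"lcp_ms": 2500, "cls": 0.05, "tbt_ms": 160}
print(find_regressions(baseline, current))  # ['lcp_ms']
```

A relative budget per metric keeps the check stable across pages with very different baselines; an absolute budget would be an equally reasonable choice.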

Backend QA Agent

Capabilities:

• API Contract Testing — OpenAPI spec validation on every backend PR. Ensures frontend/backend contracts stay in sync. Breaking changes flagged automatically.
• Pre-Event Load Testing — k6/Artillery load tests run automatically 48 hours before every event. Simulates expected concurrent users. Alerts if response times breach thresholds.
• Data Integrity Checks — Validates Stripe ↔ Salesforce ↔ Portal data consistency. Runs nightly + pre-event. Catches sync failures before they impact customers.
• Integration Health — Monitors all third-party API endpoints (Stripe, Salesforce, HubSpot, Obv.io). Proactive alerts before failures cascade.
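As an illustration of the data integrity check, the sketch below compares a per-customer status across the three systems and reports any disagreement. The record shape (customer ID mapped to a status string) is an assumption; a real check would pull these records from the Stripe, Salesforce, and Portal APIs.

```python
def find_sync_failures(stripe: dict, salesforce: dict, portal: dict) -> dict:
    """Compare per-customer status across three systems.

    Each argument maps a customer ID to a subscription status string.
    The record shape is hypothetical and used for illustration only.
    """
    failures = {}
    all_ids = set(stripe) | set(salesforce) | set(portal)
    for customer_id in sorted(all_ids):
        seen = {
            "stripe": stripe.get(customer_id),
            "salesforce": salesforce.get(customer_id),
            "portal": portal.get(customer_id),
        }
        # A customer is consistent only if all three systems agree;
        # a missing record shows up as None and counts as a mismatch.
        if len(set(seen.values())) > 1:
            failures[customer_id] = seen
    return failures


stripe = {"c1": "active", "c2": "canceled"}
salesforce = {"c1": "active", "c2": "active"}
portal = {"c1": "active"}
failures = find_sync_failures(stripe, salesforce, portal)
print(failures)
```

Here `c2` is flagged because Stripe shows a cancellation that Salesforce has not picked up and the Portal has no record at all.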

Implementation Phases

Phase	Timeline	Deliverables	Tools
Phase 1: Foundation	Weeks 5-8	E2E tests for 5 critical user flows, API contract testing in CI, basic Lighthouse CI integration	Playwright, OpenAPI validator, Lighthouse CI
Phase 2: Visual + Accessibility	Weeks 9-12	Visual regression baselines for all customer-facing pages, axe-core a11y scanning in CI, coverage dashboards	Playwright visual compare, axe-core, Swarmia
Phase 3: Load + Data	Weeks 13-16	Pre-event load testing automation, Stripe/SF data integrity nightly checks, integration health monitoring	k6/Artillery, custom data validators, Datadog

Cost comparison: QA Engineer salary: $90-120K/year + benefits. AI QA agent infrastructure: ~$200-500/month (CI compute + tool licenses), or $2.4-6K/year. On salary alone that is roughly 93-98% cheaper, with 24/7 coverage that has no context-switching overhead and scales linearly with the number of repos.
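The savings range follows directly from the figures quoted above. A quick check of the arithmetic, comparing annualized infrastructure cost against salary only (benefits excluded, which would widen the gap):

```python
def savings_fraction(agent_monthly_cost: float, engineer_salary: float) -> float:
    """Fraction saved by paying agent infrastructure instead of a salary."""
    return 1 - (agent_monthly_cost * 12) / engineer_salary


worst = savings_fraction(500, 90_000)   # highest infra cost vs. lowest salary
best = savings_fraction(200, 120_000)   # lowest infra cost vs. highest salary
print(f"{worst:.1%} to {best:.1%}")  # 93.3% to 98.0%
```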

Product Ownership & PM Layer

Product ownership across RRI is currently fragmented across 6+ people with no one owning the full customer experience. The proposed model unifies ownership under Justin Kahn as VP/Head of Product with a formal governance structure and dedicated PM roles for each team.

Why Dedicated POs Matter

Tony AI alone has 49K paying subscribers and $23M ARR — that’s a standalone product that needs a dedicated owner who wakes up thinking about retention, engagement, and growth. The Tony Robbins Experience (portal unification, Mastery Path, Event Passport) is the platform play that drives the $1B valuation story. These can’t be side projects for people who also handle RPM, integrations, and CRM.

The Core Problem: Justin & Spork Are Stuck in Meetings

Why nothing ships: Justin and Spork spend their days in requirements meetings, stakeholder updates, and cross-department coordination instead of leading their teams. Spork has 6+ standing meetings daily. Justin is pulled into every product conversation because there’s no one else. Engineers get interrupted directly via Slack. Nobody is protecting development time or running the process. The fix isn’t better time management — it’s dedicated people whose job is the meetings, the process, and the stakeholder communication.

Project Manager / Scrum Master (New Hire)

This is a dedicated PM hire — not Justin wearing another hat. This person sits in requirements meetings so Justin doesn’t have to. They update stakeholders on project status so Spork doesn’t have to. They run sprint ceremonies, protect the team from scope creep, and are the single point of contact for “when will X be done?”

PM / Scrum Master Responsibilities

Attends all requirements meetings — so Justin and Spork don’t
Updates stakeholders on project status — weekly reports, ad-hoc questions
Facilitates all sprint ceremonies (planning, standup, review, retro)
Translates business requirements into technical stories (with BA Agent support)
Protects sprint from scope creep and unplanned work
Tracks velocity, burndown, and DORA metrics in Swarmia
Manages cross-team dependencies and blockers
Coaches team on Scrum practices

Run Team: Technical PM (Run Team Lead)

Manages Kanban board WIP limits
Tracks SLA compliance
Coordinates incident response
Reports ops metrics to leadership
Manages vendor/MSP relationships
Built into the Run Team Lead role

The unlock: With a dedicated PM in requirements meetings and handling stakeholder updates, Justin focuses on product vision and architecture decisions. Spork focuses on engineering leadership and system reliability. Neither is a human router anymore. The PM becomes the “shield” that lets technical leaders do technical work.

Dedicated Product Owners by Product

Alex Hoisington currently covers too much ground. The three biggest products each need a dedicated owner who lives and breathes that product every day.

Product Ownership — Dedicated POs

VP / Head of Product

Justin Kahn: VP/Head of Product (vision, strategy, North Star: Mastery Path)

Dedicated Product Owners

Tony AI PO: owns Tony AI product (49K subscribers, $23M ARR, growth strategy) [NEW]

TR Experience PO: owns Tony Robbins Experience (portal unification, Mastery Path, Event Passport) [NEW]

Alex Hoisington: Backend Products PO (RPM, integrations, order pipeline)

Tim Hooker: CRM PO (Salesforce, data flows)

Project Management

PM / Scrum Master: requirements meetings, stakeholder updates, sprint ceremonies, process [NEW]

BA Agent: creates Jira tickets from requirements meetings [AGENT]

AI Business Analyst Agent

Requirements meetings generate ideas, decisions, and action items — but translating those into well-structured Jira tickets with acceptance criteria is tedious, error-prone, and often doesn’t happen. The BA Agent sits in every requirements meeting (via transcript) and automatically generates tickets.

BA Agent Capabilities:

• Meeting → Tickets: Ingests meeting transcripts (Zoom/Teams recording → Whisper transcription). Identifies action items, decisions, and feature requests. Generates draft Jira stories with title, description, acceptance criteria, and suggested priority.
• Requirements Structuring: Takes loose stakeholder language (“we need the checkout to be faster”) and structures it into testable acceptance criteria (“checkout page loads in <2s on 3G, Stripe Payment Element renders within 1s”).
• Impact Assessment: Cross-references new requirements against existing backlog and active sprint to flag conflicts, duplicates, and dependencies before tickets are committed.
• PM Review Queue: All BA Agent-generated tickets go into a PM review queue — the PM / Scrum Master approves, edits, or rejects before they hit the backlog. No auto-create to backlog.

Input	BA Agent Action	Output
Meeting transcript	Extract action items, feature requests, bug reports	Draft Jira stories in PM review queue
Slack thread with stakeholder request	Structure into story with acceptance criteria	Draft ticket + link to original thread
Email from marketing (“new SKU needed”)	Generate PCR draft + engineering impact estimate	PCR in Product Council approval queue
Incident post-mortem	Extract follow-up action items	Bug/improvement tickets with post-mortem link
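The "no auto-create to backlog" rule is the key control in this workflow. One way to picture it is a queue where only a PM decision can move a draft into the backlog. The class and field names below are hypothetical and do not reflect Jira's data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING_REVIEW = "pending_review"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class DraftTicket:
    title: str
    acceptance_criteria: list
    source_link: str  # transcript, Slack thread, or post-mortem the draft came from
    status: Status = Status.PENDING_REVIEW


@dataclass
class ReviewQueue:
    drafts: list = field(default_factory=list)
    backlog: list = field(default_factory=list)

    def submit(self, ticket: DraftTicket) -> None:
        """Agent-generated drafts always land here, never in the backlog."""
        ticket.status = Status.PENDING_REVIEW
        self.drafts.append(ticket)

    def review(self, ticket: DraftTicket, approve: bool) -> None:
        """Only a PM decision moves a draft out of the queue."""
        self.drafts.remove(ticket)
        if approve:
            ticket.status = Status.APPROVED
            self.backlog.append(ticket)
        else:
            ticket.status = Status.REJECTED


queue = ReviewQueue()
ticket = DraftTicket(
    title="Faster checkout",
    acceptance_criteria=["checkout page loads in <2s on 3G"],
    source_link="(hypothetical link to the original thread)",
)
queue.submit(ticket)
queue.review(ticket, approve=True)
print(ticket.status, len(queue.backlog))
```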

Current Fragmentation

Product Area	Current Owner	Problem
Tony AI & RPM	Justin Kahn	No dedicated product manager
Coaching Programs	Chris Schenke	No tech integration
Platinum Partnership	Scotty	Siloed from digital products
Inner Circle / Biz Accelerator	Bree (under Diane)	Separate tech stack
Summit & Marketing	Jesse	Controls HubSpot, changes pages 5 min before go-live
Traditional Events	No single owner	Requirements come from everywhere

Proposed: SVPG Product Council

Based on Silicon Valley Product Group methodology. Justin Kahn becomes unified product owner. All product decisions evaluated against a single North Star: the Mastery Path progression (UPW → Tony AI → RPM → Coaching → Inner Circle → Platinum).

Quarterly Strategy Review (3 hours) — 7 members max. Sets product direction for the quarter. Inigo (Justin’s AI strategy agent) drafts pre-reads.
Monthly Operating Review (90 min) — Progress against quarterly goals, resource reallocation, cross-product dependencies.
Product Change Request (PCR) Process — 30-day lead time for new SKUs. Engineering impact assessment required. Council approval gate before any SKU touches Stripe/Salesforce/Sanity/order-ingestion.
Freeze Mode — PAD (Product Admin Dashboard, U4) enforces freeze windows. Product changes can be created but cannot publish without CTO approval. Transforms policy into system-enforced guardrail.
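The Freeze Mode rule (changes can be created during a freeze but cannot publish without CTO approval) reduces to a small gate. This is a sketch of the policy only; representing freeze windows as `(start, end)` datetime pairs is an assumption, not the PAD's actual data model.

```python
from datetime import datetime


def can_publish(when: datetime, freeze_windows: list, cto_approved: bool = False) -> bool:
    """Return True if a product change may publish at `when`.

    `freeze_windows` is a list of (start, end) datetime pairs (hypothetical shape).
    """
    in_freeze = any(start <= when <= end for start, end in freeze_windows)
    # Drafting is never blocked; only publishing is gated during a freeze.
    return cto_approved or not in_freeze


# Example dates are illustrative only.
windows = [(datetime(2026, 5, 1), datetime(2026, 5, 4))]
print(can_publish(datetime(2026, 4, 30), windows))                     # True
print(can_publish(datetime(2026, 5, 2), windows))                      # False
print(can_publish(datetime(2026, 5, 2), windows, cto_approved=True))   # True
```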

Critical: Authority without enforcement is theater. Erik must enforce the structure when the first bypass attempt happens. One enforcement moment sets the precedent. A code freeze was attempted a year ago — it lasted 2 weeks because nobody enforced it.

Sprint Ceremonies & Process Cadence

Build Team (Scrum)

2-week sprints with 20% interrupt buffer built in. First sprint deliberately at 50% velocity — build trust before optimizing throughput. Product Owner (Alex Hoisington) gates ALL unplanned work. No direct Slack pings to Build engineers about ops issues.
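To make the buffer concrete, here is a minimal sketch of the planning arithmetic. It assumes "50% velocity" means committing to half of nominal capacity in the first sprint; the roadmap does not define this precisely, and the capacity figure is made up for the example.

```python
def committable_points(capacity: float, interrupt_buffer: float = 0.20,
                       first_sprint: bool = False) -> float:
    """Story points the Build Team should commit to in sprint planning."""
    if first_sprint:
        # Assumption: the first sprint commits to half of nominal capacity.
        return capacity * 0.5
    # Reserve the interrupt buffer; commit only the remainder.
    return capacity * (1 - interrupt_buffer)


capacity = 40  # hypothetical team capacity in story points
print(committable_points(capacity, first_sprint=True))
print(committable_points(capacity))
```

With a nominal capacity of 40 points, the team would commit to 20 in the first sprint and 32 in later sprints, leaving 8 points for interrupts.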

Sprint Planning

Every 2 weeks · 2 hours

PO presents prioritized backlog. Team commits to sprint goal. 20% buffer reserved for interrupt work. Velocity tracked via Swarmia.

Build Team

Daily Standup

Daily · 15 min max

What I did, what I’m doing, what’s blocking me. No status updates that belong in Jira. Spork does NOT attend Build standups.

Build Team

Sprint Review / Demo

Every 2 weeks · 1 hour

Working software demonstrated to stakeholders. Not a slide deck — live product. Feedback captured for next sprint.

Build Team + Stakeholders

Sprint Retrospective

Every 2 weeks · 45 min

What went well, what didn’t, what to change. One action item minimum. Track retro actions in Jira.

Build Team

Backlog Refinement

Weekly · 1 hour

PO + tech lead groom upcoming stories. Acceptance criteria defined. Story points estimated. 2 sprints ahead minimum.

Build Team (PO + Leads)

Run Team (Kanban)

Kanban with WIP limits (team_size + 1). P1/P2/P3 incident severity tiers. OpsGenie on-call rotation ($9/user/month). No sprints — continuous flow with SLA targets.
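The WIP rule can be stated as a one-line check before anyone pulls a new card. A minimal sketch, with a team size chosen only for illustration:

```python
def wip_limit(team_size: int) -> int:
    """WIP limit as stated above: team size plus one."""
    return team_size + 1


def can_pull(in_progress: int, team_size: int) -> bool:
    """A new card may be started only while the board is under its limit."""
    return in_progress < wip_limit(team_size)


team_size = 9  # illustrative head count, not a figure from the roadmap
print(wip_limit(team_size))
print(can_pull(in_progress=10, team_size=team_size))
```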

Daily Ops Sync

Daily · 15 min

Review Kanban board. What’s blocked. What’s breaching SLA. Run Team Lead facilitates — Spork observes, doesn’t direct.

Run Team

Weekly Ops Review

Weekly · 30 min

Incident trends. SLA compliance. Capacity planning. Escalation patterns. Run Team Lead reports to Spork.

Run Team + Spork

Incident Post-Mortem

Within 48 hours of P1

Blameless. Root cause analysis. Action items with owners and deadlines. Published to engineering Confluence space.

Run Team + Affected

Incident Severity Tiers

Tier	Definition	Response SLA	Resolution SLA	Escalation
P1 — Critical	Revenue impact, system down, data loss	15 minutes	30 minutes	All hands + CTO + Erik
P2 — High	Degraded service, workaround exists	1 hour	4 hours	Run Team Lead + Spork
P3 — Normal	Non-urgent bugs, enhancement requests	Next business day	5 business days	Run Team Lead routes
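The P1 and P2 targets in the table can be encoded as data and checked against incident timestamps. This sketch leaves out P3 because its targets are in business days, which would need a calendar; the function and field names are hypothetical.

```python
from datetime import datetime, timedelta

# Response and resolution targets taken from the severity table.
SLA = {
    "P1": {"response": timedelta(minutes=15), "resolution": timedelta(minutes=30)},
    "P2": {"response": timedelta(hours=1), "resolution": timedelta(hours=4)},
}


def breaches(severity, opened, responded, resolved, now):
    """List which SLA clocks have been breached for a P1/P2 incident.

    Pass None for `responded` or `resolved` while that step is still pending;
    elapsed time is then measured up to `now`.
    """
    targets = SLA[severity]
    result = []
    if (responded or now) - opened > targets["response"]:
        result.append("response")
    if (resolved or now) - opened > targets["resolution"]:
        result.append("resolution")
    return result


opened = datetime(2026, 3, 17, 9, 0)  # illustrative timestamp
print(breaches("P1", opened,
               responded=opened + timedelta(minutes=10),
               resolved=None,
               now=opened + timedelta(minutes=45)))  # ['resolution']
```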

Cross-Team Ceremonies

Quarterly Roadmap Planning

Quarterly · Half day

Review DERISK/UNCLOG/SCALE lens. Re-prioritize initiatives. Set quarterly milestones. Both teams + CTO + stakeholders.

All Engineering + Leadership

Product Council

Monthly · 90 min

SVPG-style operating review. Product direction, PCR approvals, cross-product dependencies. Justin facilitates.

Product Owners + CTO + Stakeholders

AI Governance Committee

Monthly · 60 min

Agent fleet review. ROI tracking. New agent approvals. Risk tier assignments. Jay + Justin + Spork + Lior.

AI Team + Leadership

DORA Metrics Review

Monthly · 30 min

Deployment frequency, lead time, change failure rate, MTTR. Tracked via Swarmia. Both teams benchmark against industry.
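Swarmia computes these metrics from GitHub and incident data; the sketch below only illustrates how the four numbers are defined, using a hypothetical list of deploy records.

```python
from datetime import timedelta


def dora_summary(deploys: list, period_days: int) -> dict:
    """Summarize the four DORA metrics for one period.

    Each deploy is a dict with 'lead_time' (timedelta), 'failed' (bool),
    and 'restore_time' (timedelta or None). The shape is hypothetical.
    """
    total = len(deploys)
    failed = [d for d in deploys if d["failed"]]
    restores = [d["restore_time"] for d in failed if d["restore_time"] is not None]
    return {
        "deploys_per_day": total / period_days,
        "mean_lead_time": sum((d["lead_time"] for d in deploys), timedelta()) / total if total else None,
        "change_failure_rate": len(failed) / total if total else None,
        "mttr": sum(restores, timedelta()) / len(restores) if restores else None,
    }


deploys = [
    {"lead_time": timedelta(hours=1), "failed": False, "restore_time": None},
    {"lead_time": timedelta(hours=2), "failed": False, "restore_time": None},
    {"lead_time": timedelta(hours=3), "failed": True, "restore_time": timedelta(minutes=30)},
    {"lead_time": timedelta(hours=2), "failed": False, "restore_time": None},
]
summary = dora_summary(deploys, period_days=14)
print(summary)
```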

Engineering Leads + CTO

Process Tooling

Tool	Purpose	Team	Cost
Jira	Sprint boards (Build) + Kanban boards (Run)	Both	Existing
OpsGenie	On-call rotation, incident alerting, escalation	Run	$9/user/month
Swarmia	DORA metrics, sprint velocity, engineering analytics	Both	~$30/user/month
Confluence	Documentation, post-mortems, architecture decisions	Both	Existing
GitHub	Code, PRs, CI/CD, CODEOWNERS	Both	Existing
Playwright	E2E testing, visual regression, cross-browser QA	QA Agents	Open source
axe-core	Accessibility auditing (WCAG 2.1 AA) in CI	QA Agents	Open source
k6 / Artillery	Load testing, pre-event capacity validation	QA Agents	Open source / ~$100/mo
Zendesk	IT service desk ticketing, Chatot AI triage integration	Run (IT)	~$55/agent/month
Reclaim.ai	AI calendar tool for engineering focus time	Build	$10/user/month (optional)
Retool	Product Admin Dashboard Phase 1 UI	Build	$50/user/month

Restructuring Timeline

Phase	Timeline	Key Actions
Phase 1: Stabilize & Separate	Weeks 1-2	Announce restructuring. Create two Jira boards. Establish P1/P2/P3 tiers. Set WIP limits. Spork stops attending Build standups. Implement 20% interrupt buffer. Kill Spork’s 6+ daily meetings.
Phase 2: Hire & Stabilize	Weeks 3-8	Post Run Team Lead + DevOps + Integration Engineer + Data Engineer + Senior Backend Dev + Full-Stack Dev. Install OpsGenie. Documentation sprints for Zach and Johnny (D1). Convert Jay Lane full-time. Begin QA agent Phase 1. Evaluate MSP options.
Phase 3: Operational Cadence	Weeks 9-12	Run Team Lead onboarded and independent. First quarterly roadmap planning session. Swarmia DORA metrics baseline established. Build team completing 70%+ of sprint commitments. QA agent Phase 2 (visual + a11y). IT Service Desk model operational.
Phase 4: Scale & Harden	Weeks 13-16	New developers onboarded and productive. QA agents fully operational (Phase 3: load + data). Three-layer resilience model validated for all BF1 systems. MSP evaluation complete. Pre-event load testing automated. AI agent knowledge repositories indexed for all critical systems.

Success looks like: Build team completing 70%+ of sprint commitments. P1 incidents resolved in 30 minutes. No engineer working 10-hour days for 3+ consecutive days. First event with a clean code freeze that actually holds. Product Council meeting monthly with documented decisions. All BF1 systems at bus factor 2+ with AI agent knowledge backup. QA agents catching bugs before they reach production.

Confidential Document
