Bus Factor Elimination Program
Executive Summary
Three engineers each represent a single point of failure on systems that generate $180M/year. Zach Hardesty is the only person who can operate the K8s/GitOps/observability/data lake infrastructure. Johnny Yarlott authored 97% of the commits on the core payments API. Nick Jensen is the sole owner of TonyRobbins.com and is INACTIVE in Atlassian.
If any one of them is hit by a bus — or just quits — RRI’s revenue infrastructure stops. This isn’t a hypothetical risk. It’s the single most dangerous structural vulnerability in the engineering organization.
The program pairs each critical engineer with a designated backup over 6 weeks using a driver-navigator model — the backup learns by doing, not watching. Combined with CODEOWNERS enforcement and commit distribution tracking, this transforms tribal knowledge into organizational capability.
What Needs to Happen
- Migrate npm package to `@rri` org scope: Move `@alphonso77/rri-lifeforce` off the personal npm account. Zero cost, zero risk. Day 3.
- Lior 1:1 with Nick Jensen: Assess disengagement level before any sprint commitment. Nick is INACTIVE in Atlassian. Must understand if he’s checked out or just overwhelmed. Week 1.
- Install CODEOWNERS + branch protection on 5 critical repos — Distribute code review and block single-author merges. Week 1.
- Zach pairing sprint with designated backup — 2 weeks covering K8s, GitOps, observability stack, data lake operations. Driver-navigator model. Weeks 2-3.
- Johnny pairing sprint with designated backup — 2 weeks covering auth, payments API, order-ingestion, Stripe integration. Weeks 3-4.
- Nick pairing sprint (if re-engaged) — 2 weeks covering TonyRobbins.com, Sanity CMS, Experience API architecture. Weeks 4-5.
- Live simulation exercises — "What happens if X is unavailable" scenarios. Each backup operates their system independently under observation. Week 6.
- Measurement: RepoSense or ContributorIQ — Track commit distribution shifting from >90% single-author to <60%. Ongoing.
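The CODEOWNERS step in the list above can be sketched as a small generator. This is a minimal illustration, not the actual configuration: the repo paths and GitHub handles are hypothetical placeholders, and only the `pattern owner owner` line format is taken from GitHub's CODEOWNERS syntax.

```python
# Sketch: render a CODEOWNERS file that names a primary owner and a
# designated backup for each path.
# ASSUMPTION: every path and handle below is a hypothetical placeholder.

def render_codeowners(rules):
    """Return CODEOWNERS text built from (pattern, [owners]) pairs."""
    lines = ["# Each path lists the primary engineer and the designated backup."]
    width = max(len(pattern) for pattern, _ in rules)
    for pattern, owners in rules:
        # Refuse single-owner entries: they recreate the bus factor of 1.
        if len(owners) < 2:
            raise ValueError(f"{pattern} needs a primary owner and a backup")
        lines.append(f"{pattern.ljust(width)}  {' '.join(owners)}")
    return "\n".join(lines) + "\n"


RULES = [
    ("/payments/", ["@johnny-example", "@payments-backup-example"]),
    ("/infra/k8s/", ["@zach-example", "@infra-backup-example"]),
]

if __name__ == "__main__":
    print(render_codeowners(RULES), end="")
```

Listing the backup next to the primary engineer is the point of the exercise. Assuming branch protection requires code-owner review, a pull request opened by the primary cannot be self-approved, so the backup has to review it.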
Claude Code acceleration: The code-heavy parts of this initiative — CODEOWNERS config, branch protection rules, npm scope migration — can be generated in minutes. But the core work (pairing sprints) is people-driven and can’t be compressed. Claude Code saves ~1 week on setup and documentation automation.
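As an illustration of how small the npm scope migration is on the code side, the sketch below rewrites the `name` field of a `package.json`. It assumes the package keeps its bare name `rri-lifeforce` under the `@rri` scope; the final name is not stated in this document, and the sample `version` value is made up.

```python
import json

# ASSUMPTION: the package keeps its bare name under the new scope.
NEW_SCOPE = "@rri"


def rescope(package_json_text, new_scope=NEW_SCOPE):
    """Return package.json text with the package moved to new_scope."""
    pkg = json.loads(package_json_text)
    # "@alphonso77/rri-lifeforce" -> "rri-lifeforce"; unscoped names pass through.
    _, _, bare_name = pkg["name"].rpartition("/")
    pkg["name"] = f"{new_scope}/{bare_name}"
    return json.dumps(pkg, indent=2) + "\n"


if __name__ == "__main__":
    # Hypothetical minimal package.json; only the name comes from this document.
    sample = '{"name": "@alphonso77/rri-lifeforce", "version": "0.0.0"}'
    print(rescope(sample), end="")
```

Renaming is only the first half: every consumer that depends on the old name must also be updated to the new one before the personal-account package can be retired.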
Completion Criteria
- `@alphonso77/rri-lifeforce` migrated to `@rri` npm org scope
- CODEOWNERS files installed on all 5 critical repos with branch protection active
- Zach’s designated backup can independently operate K8s cluster, run GitOps deploys, and query data lake
- Johnny’s designated backup can independently deploy order-ingestion, debug payment failures, and manage Stripe config
- Nick’s designated backup (if applicable) can deploy TonyRobbins.com and manage Sanity content model
- Top author's share of commits on critical repos reduced from >90% to <60%
- Live simulation exercise completed: each backup operated their system independently for 48 hours
- Nick Jensen 1:1 completed with documented outcome and engagement plan
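The commit distribution criterion above can be checked without a dedicated tool. The sketch below computes the top author's share from a list of commit author names, for example the output of `git log --format=%an`; the sample author names and counts are hypothetical, and it does not reflect how RepoSense or ContributorIQ calculate their metrics.

```python
from collections import Counter

TARGET_SHARE = 0.60  # criterion from this plan: top author below 60%


def top_author_share(authors):
    """Return (author, share) for the most frequent commit author."""
    counts = Counter(authors)
    if not counts:
        raise ValueError("no commits to analyse")
    author, commits = counts.most_common(1)[0]
    return author, commits / len(authors)


def meets_target(authors, target=TARGET_SHARE):
    """True when no single author holds `target` or more of the commits."""
    _, share = top_author_share(authors)
    return share < target


if __name__ == "__main__":
    # Hypothetical histories: before and after a pairing sprint.
    before = ["primary"] * 97 + ["backup"] * 3
    after = ["primary"] * 55 + ["backup"] * 45
    print(top_author_share(before), meets_target(before))
    print(top_author_share(after), meets_target(after))
```

Restricting the input to commits made since the program started, rather than full history, makes the metric respond within the 6-week window; otherwise years of single-author history would dominate the ratio.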
Initiative Attributes
Tools Required
| Tool | Purpose | Cost |
|---|---|---|
| RepoSense | Commit distribution visualization — tracks who is contributing to which repos | Free (OSS) |
| Swimm | Documentation with CI stale-doc checks — ensures pairing knowledge stays current | $8/user/month |
| CODEOWNERS | Automated code review assignment — forces multi-reviewer PRs | Free (GitHub built-in) |
| ContributorIQ | SaaS alternative to RepoSense for contribution analytics | SaaS pricing |
| Mermaid | Architecture diagrams embedded in repos | Free (OSS) |
Related Risks
| ID | Risk | Severity | Probability | Mitigation |
|---|---|---|---|---|
| RF1 | Nick Jensen disengagement / departure | CRITICAL | MEDIUM | Lior 1:1 Week 1 post-UPW. Frame S3 + S5 ownership as growth opportunity. Load relief from H4. If Nick leaves, U2, U4, S3, S5 all at risk — he’s bus factor 1 on TonyRobbins.com. |