Deploy Broke Prod Again — So I Built PerfSage SignalPilot

Field Notes #3 · TL;DR: You deployed. Errors spiked. Someone opens kubectl, someone opens Grafana, someone blames the last commit. PerfSage SignalPilot runs an observe → correlate → explain → recommend → verify loop across the K8s API, metrics-server, logs, cAdvisor, Prometheus, and (optionally) git, then ranks the findings and attaches copy-paste kubectl fixes. Open source, MIT licensed.

Field Notes #4 — For the why behind SignalPilot (war rooms, MTTR, expensive-tool gap), read I Got Tired of 3-Hour Post-Deploy War Rooms.

The question every deploy review should answer

“Why are errors and performance degradation happening after my last deployment?”

That question is simple. Getting a defensible answer in under five minutes is not.

kubectl describe shows one pod. Grafana shows a metric spike. Git shows a commit. None of them cite each other.

What SignalPilot does differently

SignalPilot fuses cross-source evidence into deterministic RCA rules:

| Rule | What it correlates | Typical fix |
| --- | --- | --- |
| `oom_killed` | OOMKilled + memory near limit | Raise memory limit |
| `cpu_throttled` | CFS throttle + latency regression | Raise CPU request/limit |
| `crash_loop` | CrashLoopBackOff + logs + config diff | Rollback or fix env |
| `code_regression` | New log fingerprints + git suspect commit | Investigate commit |

Each finding cites multiple signal types — not a single chart anomaly.
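To make "deterministic rule over multiple signal types" concrete, here is a minimal Python sketch of how such a rule could be evaluated. It is not SignalPilot's actual code: the `Signal` and `Finding` classes, the `RULES` table, the `evaluate` function, and the signal names are all hypothetical, modeled only on the first two rows of the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    source: str  # hypothetical source label, e.g. "k8s_api", "metrics"
    kind: str    # hypothetical signal name, e.g. "OOMKilled"


@dataclass(frozen=True)
class Finding:
    rule: str
    evidence: tuple  # the signals that satisfied the rule
    fix: str


# Hypothetical rule table: rule name -> (required signal kinds, typical fix).
RULES = {
    "oom_killed": ({"OOMKilled", "memory_near_limit"}, "Raise memory limit"),
    "cpu_throttled": ({"cfs_throttle", "latency_regression"}, "Raise CPU request/limit"),
}


def evaluate(signals):
    """Fire a rule only when every signal kind it requires was observed."""
    by_kind = {s.kind: s for s in signals}
    findings = []
    for rule, (required, fix) in RULES.items():
        if required.issubset(by_kind):
            # Sort so the same inputs always yield the same evidence order.
            evidence = tuple(by_kind[k] for k in sorted(required))
            findings.append(Finding(rule, evidence, fix))
    return findings


observed = [
    Signal("k8s_api", "OOMKilled"),
    Signal("metrics", "memory_near_limit"),
    Signal("metrics", "cfs_throttle"),  # alone, this is not enough to fire
]
for finding in evaluate(observed):
    print(finding.rule, "->", finding.fix)
```

In this sketch a lone throttle metric produces no finding, which is the point of the rule table: a single chart anomaly never becomes a finding without corroboration from a second signal.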

Quick start

git clone https://github.com/perfsage/signalpilot
cd signalpilot && pip install -e .

kubectl apply -f deploy/signalpilot-rbac.yaml

signalpilot analyze my-namespace --deployment my-app --output report.html

CI gate (exit 1 on HIGH+ findings):

signalpilot gate my-namespace --deployment my-app --junit-xml results.xml
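The gate's exit behavior can be pictured as a severity threshold check. The Python sketch below illustrates the idea and is not the tool's implementation: the `Severity` levels other than HIGH and the `gate_exit_code` helper are assumptions made for the example.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Assumed ordering; the post only states that HIGH and above fail the gate.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def gate_exit_code(severities, threshold=Severity.HIGH):
    """Return 1 if any finding is at or above the threshold, else 0."""
    return 1 if any(s >= threshold for s in severities) else 0


print(gate_exit_code([Severity.LOW, Severity.MEDIUM]))  # below threshold
print(gate_exit_code([Severity.LOW, Severity.HIGH]))    # at threshold
```

A CI job would pass the result to the process exit status, so any non-zero value fails the pipeline step.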

Full docs on the SignalPilot landing page and GitHub README.

The PerfSage ladder: test → gate → RCA

Reveal — JMeter JTL analysis in the lab (/reveal/)
SLO Reporter — CI gates on load tests (/slo-plugin/)
SignalPilot — post-deploy RCA in production (/signalpilot/)

Field Notes #3 · By Aashish Bajpai
