claude-vitals
Your AI coding assistant's quality is invisible. You feel it getting worse but can't prove it. Now you can.
Why This Exists
This project grew out of an analysis of 234,760 Claude Code tool calls. The findings were stark:
Thinking depth dropped 67% before anyone noticed — because redaction hid it. By the time users felt the change, it had already been degrading for weeks.
claude-vitals makes that analysis continuous, automatic, and actionable.
Quick Start
Full Setup (5 min)
git clone https://github.com/Okohedeki/claude-vitals.git
cd claude-vitals
npm install
npm run build
bash scripts/install.sh
claude-vitals scan
Installs the CLI, skill commands, and runs your first scan.
Already Installed?
claude-vitals scan && claude-vitals health
Ingest new logs and get a one-line status.
claude-vitals prescribe
Get specific fixes for any degraded metrics.
The 20 Metrics
Every metric has a good benchmark from the pre-regression period and a degraded benchmark from the post-regression period, derived from the original analysis of 234,760 tool calls.
| # | Metric | Good | Degraded | What It Measures |
|---|---|---|---|---|
| 1 | Thinking Depth | ≥ 2,200 | ≤ 600 | Median estimated thinking chars — the upstream indicator of all quality |
| 2 | Read:Edit Ratio | ≥ 6.6 | ≤ 2.0 | Files read per file edited — researching before acting |
| 3 | Research:Mutation | ≥ 8.7 | ≤ 2.8 | All research actions vs all mutation actions |
| 4 | Blind Edit Rate | ≤ 6.2% | ≥ 33.7% | % of edits on files not read in last 10 tool calls |
| 5 | Write vs Edit | ≤ 4.9% | ≥ 11.1% | Full-file writes as % of all mutations |
| 6 | First Tool After Prompt | Read | Edit | Whether the model researches or immediately edits |
| 7 | Reasoning Loops/1K | ≤ 8.2 | ≥ 26.6 | Visible self-corrections ("actually," "oh wait") per 1K calls |
| 8 | Laziness/day | 0 | ≥ 10 | Stop-hook violations: permission seeking, premature stopping, dodging |
| 9 | Self-Admitted Failures/1K | ≤ 0.1 | ≥ 0.5 | Model recognizing its own bad work after correction |
| 10 | User Interrupts/1K | ≤ 0.9 | ≥ 11.4 | How often the user hits Escape to stop the model |
| 11 | Sentiment Ratio | ≥ 4.4 | ≤ 3.0 | Positive to negative word ratio in user prompts |
| 12 | Frustration Rate | ≤ 5.8% | ≥ 9.8% | % of prompts with frustration indicators |
| 13 | Session Autonomy | High | Low | Time gaps between user prompts — longer = model works independently |
| 14 | Edit Churn | Low | High | Same file edited 3+ times in 10 calls — thrashing vs planning |
| 15 | Bash Success Rate | High | Low | Build/test/bash command success rates |
| 16 | Subagent % | — | — | % of tool calls delegated to sub-agents |
| 17 | Context Pressure | < 50% | > 75% | Quality metrics segmented by context window utilization |
| 18 | Model Segmentation | — | — | All metrics broken down by model (Sonnet vs Opus etc.) |
| 19 | Cost Efficiency | $12/day | $1,504/day | Estimated cost per session, per successful edit, per commit |
| 20 | Prompts/Session | ≥ 35.9 | ≤ 27.9 | Users give up faster when quality is poor |
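Several of these metrics reduce to simple counting over the tool-call stream. As an illustration, the Read:Edit ratio (metric 2) can be sketched with a toy log. This is not the tool's actual implementation, and the log values are invented; it only shows the arithmetic:

```shell
# Toy tool-call log: one tool name per line. The real tool derives these
# counts from Claude Code session logs; this input is made up for illustration.
printf '%s\n' Read Read Read Edit Read Read Read Read Read Read Edit Read Read Edit |
  awk '/^Read$/ {r++} /^Edit$/ {e++} END {printf "Read:Edit = %.1f\n", r/e}'
```

Here 11 reads over 3 edits gives a ratio of 3.7, which would land between the good (≥ 6.6) and degraded (≤ 2.0) benchmarks.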
CLI Reference
scan
Ingest Claude Code session logs into the database. Detects config changes and computes all metrics.
claude-vitals scan
claude-vitals scan --force
claude-vitals scan --verbose
--force re-scans all sessions. --verbose shows detailed progress. --db <path> uses a custom database.
health
One-line green/yellow/red health status with alerts.
claude-vitals health
Returns an exit code matching severity. Pipe into CI or cron jobs.
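Because the exit code tracks severity, a scheduled job can surface regressions with no extra tooling. A hypothetical crontab entry (the schedule, MAILTO, and PATH setup are assumptions to adjust for your system):

```
# Hypothetical crontab entry: nightly ingest plus health check at 8 am.
# cron mails you the one-line status; ensure claude-vitals is on cron's PATH.
0 8 * * * claude-vitals scan && claude-vitals health
```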
report
Generate a full quality report with trends, sparklines, and benchmarks.
claude-vitals report
claude-vitals report --format md
claude-vitals report --days 14
claude-vitals report --model opus
--format md outputs GitHub-postable Markdown. --days sets the window. --model / --project filter data.
prescribe
Analyze degraded metrics and prescribe specific fixes: env vars, settings.json, CLAUDE.md rules.
claude-vitals prescribe
claude-vitals prescribe --apply
claude-vitals prescribe --target project
claude-vitals prescribe --format json
--apply writes fixes. Default is dry-run. --target chooses global vs project scope.
dashboard
Launch a local Grafana-style web dashboard with all charts and a change timeline.
claude-vitals dashboard
Opens in your default browser. Dark theme, responsive, real-time data.
compare
Side-by-side period comparison for all metrics.
claude-vitals compare
Compares two time periods to show improvement or regression across every metric.
annotate
Log a manual change event for impact tracking.
claude-vitals annotate "Switched to Opus"
claude-vitals annotate "Added error-handling skill"
claude-vitals annotate "Set /effort max"
impact
Show before/after metrics for a specific change event.
claude-vitals impact 3
Computes all 20 metrics for 7 days before and after the change, with a verdict.
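The delta reported for each metric is plain percent change relative to the before value. For example, a Read:Edit ratio moving from 3.2 to 6.8 (illustrative numbers):

```shell
# Illustrative only: percent change vs. the "before" value.
# awk's %d truncates toward zero, so 3.2 -> 6.8 prints +112%.
awk 'BEGIN { before = 3.2; after = 6.8; printf "+%d%%\n", (after - before) / before * 100 }'
```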
changes
List all detected and annotated changes.
claude-vitals changes
Shows config file changes, manual annotations, and their timestamps.
Prescriptions
| Degraded Metric | Type | Prescription |
|---|---|---|
| Thinking Depth | ENV | CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1 |
| Thinking Depth | ENV | MAX_THINKING_TOKENS=31999 |
| Thinking Depth | settings.json | effortLevel: "high" |
| Read:Edit Ratio | CLAUDE.md | Before ANY edit, read the target file AND at least 2 related files |
| Blind Edit Rate | CLAUDE.md | Zero tolerance: verify target file was Read in last 10 tool calls before every edit |
| Write vs Edit | CLAUDE.md | Never use Write to modify existing files — use Edit for surgical changes only |
| Laziness Signals | CLAUDE.md | Ban: "should I continue?", "want me to keep going?", "good stopping point", "out of scope" |
| Reasoning Loops | CLAUDE.md | Resolve contradictions internally before producing output |
| Self-Admitted Failures | CLAUDE.md | Read more before editing. Test after changing. Verify before moving on. |
| Research:Mutation | CLAUDE.md | Before any code change: Grep for the symbol, Glob for related files, Read each file |
Dry Run (default)
claude-vitals prescribe
Shows what would be changed without writing anything.
Auto-Apply Fixes
claude-vitals prescribe --apply
Writes environment variables, updates settings.json, and appends rules to CLAUDE.md.
Skills
Skills are slash commands that Claude Code can invoke during a session for self-diagnosis.
/vitals
Full diagnostic. Scans logs, checks health, generates a report, and applies behavioral corrections for any degraded metrics.
/vitals-quick
Fast health check. Silent if green — only reports when something is wrong.
/vitals-report
Generates a GitHub-postable Markdown report with all metrics, trends, and benchmarks.
/vitals-dashboard
Launches the web dashboard in your browser for visual exploration of all metrics.
Install Skills
bash scripts/install.sh
Installs all four commands to ~/.claude/commands/. Uninstall with bash scripts/uninstall.sh.
Change Tracking
Everything above tells you how quality is trending. Change tracking tells you why.
Automatic Detection
On every scan, claude-vitals hashes and diffs these files:
- ~/.claude/CLAUDE.md and per-project CLAUDE.md files
- settings.json (model config, effort level, permissions)
- Skill files (.md files in skill directories)
Manual Annotations
claude-vitals annotate "Switched to Opus"
claude-vitals annotate "Rewrote CLAUDE.md testing section"
claude-vitals annotate "Started 5 concurrent sessions"
Impact Analysis
For every change, claude-vitals computes all 20 metrics for the 7 days before and the 7 days after:
claude-vitals impact 3

Change #3: "Switched to Opus" (2025-06-15)

Metric            Before   After    Delta
───────────────────────────────────────────
Read:Edit Ratio   3.2      6.8      +112%
Blind Edit Rate   28.1%    7.3%     -74%
Thinking Depth    840      2,340    +178%
Laziness/day      8        1        -88%

Verdict: IMPROVED quality across 6/8 key metrics
Configuration Guide
Recommended Environment Variables
export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1
export MAX_THINKING_TOKENS=31999
Disables adaptive thinking throttling and sets maximum thinking budget. The single most impactful change you can make.
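To make these settings persist across sessions, append them to your shell profile. A minimal sketch, assuming a bash setup with ~/.bashrc (adjust the file for zsh, fish, etc.):

```shell
# Append the recommended variables to ~/.bashrc (path is an assumption;
# use your shell's own profile file if it differs).
cat >> ~/.bashrc <<'EOF'
export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1
export MAX_THINKING_TOKENS=31999
EOF
```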
Recommended settings.json
{
"effortLevel": "high",
"showThinkingSummaries": true
}
Forces high-effort responses and makes thinking visible for debugging.
CLAUDE.md Best Practices
A well-crafted CLAUDE.md prevents most quality issues before they start:
# Rules
## Read Before Edit
- Before ANY edit, read the target file AND at least 2 related files
- Never edit a file you haven't read in the last 10 tool calls
- Before any code change: Grep for the symbol, Glob for related files
## Surgical Edits Only
- Never use Write/CreateFile to modify existing files
- Use Edit for targeted changes — preserve surrounding context
- Full-file rewrites require explicit user approval
## No Laziness
BANNED PHRASES (never output these):
- "should I continue?"
- "want me to keep going?"
- "shall I proceed?"
- "good stopping point"
- "let's pause here"
- "out of scope for now"
- "continue in a new session"
Do the work. Don't ask permission to do the work.
## Resolve Internally
- Never output "oh wait" / "actually" / "let me reconsider"
- Resolve contradictions in thinking before producing output
- If unsure, research more — don't guess and self-correct
Dashboard
A single-file, Grafana-inspired web dashboard for visual exploration of all quality metrics.
claude-vitals dashboard
Launches a local HTTP server and opens the dashboard in your default browser.
Features
- Dark theme matching this documentation site
- Change timeline across the top — click any change to see its impact overlay
- 6 chart sections: Thinking Depth, Behavioral Ratios, Tool Patterns, User Signals, Session Health, Cost
- Reference benchmark lines showing good/degraded thresholds
- React + Recharts rendered from CDN — no build step
- Responsive layout for desktop and tablet
The entire dashboard lives in src/dashboard/dashboard.html. Edit it directly — no build step required.