⚠ PAGE STATUS: BUILD SPECIFICATION
This page is the build specification for the Failure & Recovery dashboard; it defines exactly what must be implemented. To build the dashboard, give Claude Code the instruction: “Read the build specification on factory-failures.html and implement it.”

Build Specification: Failure & Recovery Dashboard

Specification Source: Hilbert Factory Sections 4 (Chief Engineer), 5 (Packet-not-ready), 9 (Stop Conditions), 17 (Chief Engineer Ops) + Dashboard Spec View 4

Panel 4.1 — Active Failures

Chart Type: Data Table with severity highlighting

Data Source: GET /api/orchestrator/queue?status=failed,repairing,escalated

Refresh Rate: FREQUENT (every 30 seconds)

Display: ESCALATED packets are pinned to the top with a red highlight because they need human action. Each row shows: packet_id, failure type, failure step, retry count, time since failure, current routing.
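
The ordering rule for Panel 4.1 could be sketched as below. This is only an illustration: the row shape (FailedPacket and its field names), the oldest-first tie-break within each group, and the "2h 5m" time format are assumptions, since the specification does not define the response schema of the queue endpoint.

```typescript
// Hypothetical row shape for GET /api/orchestrator/queue; field names are assumptions.
type FailureStatus = "failed" | "repairing" | "escalated";

interface FailedPacket {
  packet_id: string;
  status: FailureStatus;
  failure_type: string;
  failure_step: string;
  retry_count: number;
  failed_at: string; // ISO 8601 timestamp (assumed)
  current_routing: string;
}

// Escalated rows first; within each group, oldest failure first (assumed tie-break).
function sortForDisplay(rows: FailedPacket[]): FailedPacket[] {
  const rank = (s: FailureStatus): number => (s === "escalated" ? 0 : 1);
  return rows.slice().sort(
    (a, b) =>
      rank(a.status) - rank(b.status) ||
      Date.parse(a.failed_at) - Date.parse(b.failed_at)
  );
}

// "Time since failure" as hours and minutes, clamped at zero.
function timeSince(failedAt: string, now: Date): string {
  const elapsedMs = now.getTime() - Date.parse(failedAt);
  const minutes = Math.max(0, Math.floor(elapsedMs / 60000));
  const h = Math.floor(minutes / 60);
  const m = minutes % 60;
  return h > 0 ? `${h}h ${m}m` : `${m}m`;
}

// Only escalated rows receive the red highlight.
function needsRedHighlight(row: FailedPacket): boolean {
  return row.status === "escalated";
}
```

Keeping the sort in a pure function means the 30-second refresh can simply re-run it on each new response without tracking prior state.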

Panel 4.2 — Escalation Queue (Human Action Required)

Chart Type: Action Card list

Data Source: GET /api/escalations/pending

Refresh Rate: FREQUENT (every 30 seconds)

Display: Each card shows: escalation_id, packet_id, category, time waiting, and the Chief Engineer’s recommended actions. Each card has four action buttons: “Approve Repair”, “Modify Architecture”, “Override and Resume”, “Suspend Build”.

Interaction: Each action button sends POST /api/escalations/{id}/resolve, where {id} is the escalation_id, with the chosen action.
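
One way the button-to-request mapping might look is sketched below. The endpoint path and the four button labels come from the specification; the JSON body shape ({ "action": ... }) and the snake_case wire values are assumptions, because the specification does not define the request payload.

```typescript
// Button labels are from the spec; the wire values on the right are assumed.
const ACTIONS = {
  "Approve Repair": "approve_repair",
  "Modify Architecture": "modify_architecture",
  "Override and Resume": "override_and_resume",
  "Suspend Build": "suspend_build",
} as const;

type ActionLabel = keyof typeof ACTIONS;

interface ResolveRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request for POST /api/escalations/{id}/resolve.
// The body shape is a hypothetical payload, not a confirmed contract.
function buildResolveRequest(
  escalationId: string,
  label: ActionLabel
): ResolveRequest {
  return {
    url: `/api/escalations/${encodeURIComponent(escalationId)}/resolve`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ action: ACTIONS[label] }),
  };
}
```

A click handler could pass the result to fetch, for example fetch(req.url, { method: req.method, headers: req.headers, body: req.body }), and then refetch GET /api/escalations/pending so the resolved card disappears.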

Panel 4.3 — Chief Engineer Activity

Chart Type: Data Table + Gauge

Data Source: GET /api/chief-engineer/activity — returns all interventions

Refresh Rate: PERIODIC (every 5 minutes)

Display: Resolution rate gauge (% resolved without human escalation). Table: diagnosis_id, packet_id, classification, root cause, confidence score, outcome (REPAIR/ESCALATE), duration.
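
The gauge value could be derived from the activity table as in the sketch below. It assumes that an outcome of REPAIR means "resolved without human escalation" and that an empty list should render as "no data" rather than 0%; the Intervention type lists only the fields the calculation needs, and its name is hypothetical.

```typescript
// Subset of the table columns needed for the gauge; the type name is hypothetical.
interface Intervention {
  diagnosis_id: string;
  packet_id: string;
  outcome: "REPAIR" | "ESCALATE";
}

// Percentage of interventions that ended in REPAIR, rounded to one decimal.
// Returns null for an empty list so the gauge can show "no data" (assumed behavior).
function resolutionRate(items: Intervention[]): number | null {
  if (items.length === 0) return null;
  const repaired = items.filter((i) => i.outcome === "REPAIR").length;
  return Math.round((repaired / items.length) * 1000) / 10;
}
```

For example, three REPAIR outcomes and one ESCALATE outcome give a gauge reading of 75.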

Panel 4.4 — Failure Pattern Analysis

Chart Type: Pie Chart + Bar Chart + Line Chart (Recharts)

Data Source: GET /api/failures/patterns — returns aggregated failure data

Refresh Rate: SESSION

Display: Most common failure types (pie), failure rate by phase (bar), failure rate trend over 30 days (line). Systemic issue highlight: if the same root cause appears 3 or more times, show a red badge reading “Systemic Issue — architecture review recommended”.
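
The systemic-issue rule could be checked as in the sketch below. It assumes the client has one record per failure with a root_cause field; if /api/failures/patterns already returns per-root-cause counts, only the threshold comparison would be needed. The names FailureRecord, SYSTEMIC_THRESHOLD, and systemicRootCauses are hypothetical.

```typescript
// Hypothetical per-failure record; only root_cause matters for this check.
interface FailureRecord {
  root_cause: string;
}

// "3+ times" from the spec.
const SYSTEMIC_THRESHOLD = 3;

// Returns every root cause that appears at least SYSTEMIC_THRESHOLD times,
// in order of first appearance. Each one would get the red badge.
function systemicRootCauses(records: FailureRecord[]): string[] {
  const counts = new Map<string, number>();
  for (const r of records) {
    counts.set(r.root_cause, (counts.get(r.root_cause) ?? 0) + 1);
  }
  const result: string[] = [];
  counts.forEach((n, cause) => {
    if (n >= SYSTEMIC_THRESHOLD) result.push(cause);
  });
  return result;
}
```

Because the panel refreshes once per session, this check only needs to run when the pattern data is first loaded.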