Relvy, a startup that went through Y Combinator's Fall 2024 batch, has publicly launched an AI-powered platform designed to automate on-call runbooks for software engineering teams. Founded by Bharath and Simranjit, the product deploys an AI agent equipped with telemetry analysis and code inspection tools, with the stated goal of cutting the time engineers spend debugging production incidents from hours to minutes.
Why Autonomous Root Cause Analysis Remains Hard
The company's launch, announced via Hacker News, is frank about the problem it is trying to solve and the difficulty of solving it. According to Relvy, autonomous root cause analysis is one of the harder tasks for current AI systems. The company cites performance on the OpenRCA dataset — a benchmark for root cause analysis in distributed systems — where, according to their post, Claude Opus 4.6 achieves roughly 36 percent accuracy, markedly lower than the accuracy such models reach on standard coding tasks.
That gap points to three structural challenges the team has identified: the sheer volume of telemetry data can overwhelm a model with noise; interpreting observability signals correctly requires enterprise-specific context that a general model lacks; and on-call response is inherently time-constrained, meaning a wrong turn from the AI is costly in a way that a slow code suggestion is not.
Rather than asking an AI agent to explore freely, Relvy anchors its agent to predefined runbooks — step-by-step procedures that reflect how experienced engineers already approach a known class of incident. The design choice is deliberate: runbook-anchored investigation produces more deterministic and auditable steps, and reduces the cognitive burden on the engineer who ultimately has to review what the AI did. The agent is built with specialized tools for anomaly detection in time-series data, log pattern search, and span-tree reasoning, each intended to avoid flooding the model's context window with raw, unfiltered data.
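The runbook-anchored design described above can be illustrated with a minimal sketch. All names and structures here are illustrative assumptions, not Relvy's actual API: the point is only that the agent walks an ordered, predefined list of steps and leaves an auditable trail, rather than exploring freely.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class RunbookStep:
    """One predefined investigation step, backed by a specialized tool."""
    name: str
    check: Callable[[dict], str]  # takes telemetry context, returns a finding

def investigate(steps: list[RunbookStep], context: dict) -> list[tuple[str, str]]:
    """Run each step in order; return an auditable (step name, finding) trail."""
    trail = []
    for step in steps:
        trail.append((step.name, step.check(context)))
    return trail

# Stub tools standing in for anomaly detection and log pattern search.
steps = [
    RunbookStep("anomaly_scan", lambda ctx: f"{len(ctx['metrics'])} series scanned"),
    RunbookStep("log_pattern_search", lambda ctx: f"{ctx['logs'].count('ERROR')} error lines"),
]
trail = investigate(steps, {"metrics": [1, 2, 3], "logs": "ERROR ok ERROR"})
```

Because the step list is fixed ahead of time, every run of the same runbook produces the same sequence of checks, which is what makes the output deterministic and easy for a reviewing engineer to audit.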
How the Platform Works and How It Is Deployed
According to the company's official product description, Relvy can be installed on a local machine via Docker Compose, deployed on Kubernetes using Helm charts, or accessed through a hosted cloud option. Once connected to an observability stack and a code repository, teams create runbooks and point Relvy at a recent alert to begin an investigation. Each investigation is surfaced as a notebook in a web interface, with data visualizations intended to let engineers verify the AI's findings before acting on them. The platform can also be configured to respond automatically to alerts surfaced through Slack, and supports AWS CLI commands for mitigation actions — subject to human approval.
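The human-approval gate on mitigation actions can be sketched as follows. This is an assumption about the general pattern, not Relvy's implementation; the function names and the stub approver are hypothetical.

```python
from typing import Callable

def propose_mitigation(command: str, approve: Callable[[str], bool]) -> str:
    """Surface a proposed mitigation command; run it only if a human approves.

    A real system would execute the AWS CLI command on approval; this sketch
    just records the decision.
    """
    if approve(command):
        return f"executed: {command}"
    return f"skipped (not approved): {command}"

# A stub approver standing in for an engineer responding in Slack;
# here it only approves dry runs.
result = propose_mitigation(
    "aws autoscaling set-desired-capacity --desired-capacity 10",
    approve=lambda cmd: "--dry-run" in cmd,
)
```

The design choice mirrors the product description: the agent can draft an action automatically in response to an alert, but execution waits on an explicit human decision.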
Concrete examples of runbook steps the system can execute, as described by the company, include checking whether errors are isolated to a specific database shard, identifying whether a traffic surge originates from a small set of IP addresses, and reviewing recent commits to an affected endpoint. These are the kinds of structured, repeatable checks that experienced on-call engineers run from memory; the premise is that automating them saves both time and the mental overhead of context-switching at 2 a.m.
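One of those checks, whether a traffic surge is concentrated in a small set of IPs, reduces to a simple frequency test. This is a minimal sketch under assumed thresholds (top three IPs covering 80 percent of requests), not Relvy's actual logic or defaults.

```python
from collections import Counter

def surge_is_concentrated(ips: list[str], top_n: int = 3, threshold: float = 0.8) -> bool:
    """True if the top_n most frequent source IPs account for at least
    `threshold` of all requests in the window."""
    counts = Counter(ips)
    top = sum(n for _, n in counts.most_common(top_n))
    return top / len(ips) >= threshold

# 100 requests: one IP dominates, so the surge looks like a hotspot, not organic load.
requests = ["10.0.0.1"] * 90 + ["10.0.0.2"] * 5 + [f"10.0.1.{i}" for i in range(5)]
concentrated = surge_is_concentrated(requests)
```

A check like this is cheap, deterministic, and easy to verify by eye in an investigation notebook, which is exactly the profile of step the runbook-anchored approach favors.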
Background and Current Status
Relvy's founders say the company began with a different approach — continuous log monitoring using smaller language models — before pivoting toward the root cause analysis focus that defines the current product. According to their account, that earlier direction proved too slow to be practical. The current platform represents roughly a year of development shaped by work with early customers, though the company has not disclosed the names of those customers or the scale of current deployments.
The public launch positions Relvy in a crowded but still-evolving segment of developer tooling, where a number of vendors are competing to reduce on-call burden through AI-assisted observability. What distinguishes the approach, at least according to the company, is the explicit trade-off of open-ended agentic exploration for runbook-constrained determinism. Whether that trade-off holds at enterprise scale remains to be demonstrated, but the YC backing and the precision of the technical framing suggest a team that has thought carefully about the failure modes of the problem it is attacking.