Agent Anomaly Detection & Workload Assurance

Execution Risk from Silicon to Semantic

Before an autonomous agent executes a critical task, someone needs to answer: "Is it safe to act right now?" FFWD is the only platform that answers this quantitatively — with a real-time Execution Risk Score that evaluates the agent, the full stack beneath it, and the task at hand.

Powered by Nio — our open-source agent guard for Claude Code, Codex CLI, and any tool-calling AI agent.

The 2-way problem

The stack affects the agent. The agent affects the stack.

Agent monitoring alone is insufficient without the full picture.

Stack » Agent

Infrastructure silently degrades agent judgement

The agent’s reasoning is only as good as the stack that supports it. When endpoints drift, data goes stale, or infrastructure shifts beneath it, the agent’s ground truth quietly distorts. The work completes — against a reality that has mutated.

Agent » Stack

Agent actions impacting downstream systems

Every agent action lands somewhere — a service, a database, a dependency, an identity. A single mistake can ripple far beyond what the agent itself can see, while the agent’s own logs stay clean. The damage is happening elsewhere.

The clues and symptoms of agent anomalies are scattered across the agent’s surrounding environment, not just inside the agent itself.

NIO — OPEN-SOURCE AGENT GUARD

Execution assurance and observability for autonomous AI agents — the open-source agent-side enforcement that produces FFWD’s Execution Risk Score.

Nio installs at the edge with your tool-calling agent — Claude Code, Codex CLI, OpenClaw, Hermes — and evaluates every tool call through a multi-phase pipeline before it runs. Allow, deny, or request confirmation. Every action is captured as OpenTelemetry signals plus a local audit trail.
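
To make the idea of a multi-phase gate with weighted scoring concrete, here is a minimal Python sketch. It is an illustration only: the phase names, weights, thresholds, the `ToolCall` shape, and the `gate` function are all hypothetical and are not Nio's actual API or configuration.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical tool-call shape; not Nio's real schema.
@dataclass
class ToolCall:
    tool: str
    command: str

# Each phase returns a risk in [0.0, 1.0]. These checks are toy stand-ins.
def static_phase(call: ToolCall) -> float:
    risky = ("rm -rf", "drop table", "terraform destroy")
    return 1.0 if any(p in call.command.lower() for p in risky) else 0.0

def runtime_phase(call: ToolCall) -> float:
    return 0.8 if "prod" in call.command.lower() else 0.1

# (phase, weight) pairs; the weights are illustrative assumptions.
PHASES = [
    (static_phase, 0.6),
    (runtime_phase, 0.4),
]

def gate(call: ToolCall, confirm_at: float = 0.3, deny_at: float = 0.7) -> Tuple[str, float]:
    """Combine phase risks into a weighted average and map it to a verdict."""
    total_weight = sum(weight for _, weight in PHASES)
    score = sum(phase(call) * weight for phase, weight in PHASES) / total_weight
    if score >= deny_at:
        return "deny", score
    if score >= confirm_at:
        return "confirm", score
    return "allow", score
```

Under these assumed weights, `gate(ToolCall("bash", "ls -la"))` yields an allow verdict, a restart command that mentions `prod` asks for confirmation, and an `rm -rf` against a `prod` path is denied.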

Not a chatbot guardrail. Not a security-only filter. Nio gates execution for the agents most safety solutions don’t cover — infrastructure, data pipeline, deployment, identity. The agents with elevated privileges, largely irreversible actions, and large blast radii. Their failure mode isn’t a harmful response; it’s an operational cascade that shows up hours later in an outage report attributed to “infrastructure issues.”

  • Real-time pre-execution gating across multiple phases, with weighted scoring
  • Static, runtime, behavioural, LLM-based and external scoring engines
  • OpenTelemetry metrics, traces, and logs out of the box
  • Local JSONL audit log; optional external OTEL export
  • Apache-2.0: runs on your machine, and no data leaves it unless you enable external OTEL export
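
A local JSONL audit log is simply one JSON object per line, appended as verdicts are issued. The sketch below shows the general pattern in Python; the field names (`ts`, `tool`, `verdict`, `score`) and both helper functions are assumptions for illustration, not Nio's actual record schema.

```python
import json
import time
from pathlib import Path

def append_audit(path, tool, verdict, score):
    """Append one verdict as a single JSON line (hypothetical fields)."""
    record = {"ts": time.time(), "tool": tool, "verdict": verdict, "score": score}
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record

def read_audit(path):
    """Read the log back, one parsed record per non-empty line."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Because each line is an independent record, the file can be tailed, grepped, or shipped to another system without parsing the whole log.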

Pre-execution verdict

Execution Risk Score

A composite Go/No-Go verdict before every critical action.

Agent State + Stack State + Task Criticality = Risk Score (Go / No-Go)

Agent State

Model interaction health, behavioural drift, and anomaly patterns in the agent’s own telemetry.

Stack State

Cross-domain evaluation by FFWD’s anomaly engine across the full stack the agent runs on and acts upon.

Task Criticality

The impact level of the agent’s action. A log query and a production network re-route carry very different risk thresholds.
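
One way to picture how the three inputs interact is a threshold that tightens as task criticality rises. The Python sketch below assumes the agent and stack risks are each normalised to [0, 1] and combined by taking the worse of the two. The page does not specify the real combination, so that rule, the threshold values, and the function names are all hypothetical.

```python
# Hypothetical No-Go thresholds: more critical tasks tolerate less risk.
NO_GO_THRESHOLD = {"low": 0.8, "medium": 0.5, "high": 0.2}

def execution_risk(agent_risk: float, stack_risk: float) -> float:
    """Toy combination (an assumption): the worse of the two inputs."""
    return max(agent_risk, stack_risk)

def verdict(agent_risk: float, stack_risk: float, criticality: str):
    """Return ("Go" | "No-Go", score) for the given task criticality."""
    score = execution_risk(agent_risk, stack_risk)
    decision = "No-Go" if score >= NO_GO_THRESHOLD[criticality] else "Go"
    return decision, score
```

With the same inputs, `verdict(0.1, 0.3, "low")` returns a Go while `verdict(0.1, 0.3, "high")` returns a No-Go, which mirrors the log-query versus network re-route contrast.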

How the score is computed

The score combines quantitative marker signals from FFWD’s extensive AI/ML toolbox with qualitative reasoning from LLMs.

Agent integration

Closed-loop. Both directions. Zero instrumentation.

Telemetry comes in. Risk Scores go back out. All without touching agent code.

Collect — Agent Telemetry Incoming

Two non-intrusive capture methods

  • Nio Plugin — direct integration with Claude Code, Codex CLI, and other tool-calling agent frameworks. Captures and enforces at the tool execution boundary.
  • eBPF Rust Collector — kernel-level OTEL traces via eBPF, captured by a high-performance Rust collector on the host. Picks up main agent and all sub-agents with no code changes.

Captures across 4 dimensions

  • Identity
  • Content
  • Usage & Cost
  • Behaviour

Deliver — Risk Scores Outgoing

Verdicts arrive where agents already run

  • Nio hooks & plugin — hard stop at the tool execution boundary, before the action runs.
  • MCP server — AI apps query Risk Scores conversationally during their normal tool-calling workflow.
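
A hard stop at the tool execution boundary can be pictured as a small hook that receives the pending tool call and signals "block" before anything runs. The Python sketch below assumes a contract in which the call arrives as JSON and a non-zero exit code blocks it. The event fields (`tool_name`, `tool_input`), the exit code, and the `lookup_verdict` stand-in are all assumptions, so check your agent framework's hook documentation for the real contract.

```python
import json

BLOCK_EXIT_CODE = 2  # assumed "block this call" code; framework-specific

def lookup_verdict(tool_name, tool_input):
    """Stand-in for a real Risk Score lookup: blocks commands touching 'prod'."""
    command = str(tool_input.get("command", ""))
    return "No-Go" if "prod" in command else "Go"

def run_hook(raw_event: str) -> int:
    """Parse the pending tool call and return the exit code the hook would use."""
    event = json.loads(raw_event)
    decision = lookup_verdict(event.get("tool_name", ""), event.get("tool_input", {}))
    return BLOCK_EXIT_CODE if decision == "No-Go" else 0
```

In a real hook script the return value would be passed to `sys.exit`, with the event read from standard input.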

Access control

Enterprise-grade ReBAC permissions — multi-tenant, tiered. Agents only see what they’re authorised to access.
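
Relationship-based access control (ReBAC) decides access by following relationships between subjects and objects rather than by checking static roles. The Python sketch below shows the general idea for a multi-tenant case. The tuples, identifiers, and the `can_view` rule are hypothetical and do not describe FFWD's actual permission model.

```python
# Relationship tuples: (subject, relation, object). All names are hypothetical.
RELATIONS = {
    ("agent:deploy-bot", "member", "tenant:acme"),
    ("tenant:acme", "owner", "score:acme/payments"),
    ("agent:other-bot", "member", "tenant:globex"),
}

def can_view(agent: str, resource: str) -> bool:
    """An agent may view a resource if it is a member of a tenant that owns it."""
    tenants = {obj for subj, rel, obj in RELATIONS if subj == agent and rel == "member"}
    return any((tenant, "owner", resource) in RELATIONS for tenant in tenants)
```

Here `agent:deploy-bot` can view `score:acme/payments` through its tenant membership, while `agent:other-bot`, which belongs to a different tenant, cannot.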