Agent-First Data: Building APIs that Agents Can Read

by CMN Contributors

A field-naming convention and output layer that lets agents recognize units, timestamps, and secrets without extra documentation.

You just shipped a CLI tool. It works great. Then an AI agent calls it, reads "timeout": 5000, and confidently converts it to… 5000 seconds. Your 5-second timeout is now 83 minutes. The agent didn’t hallucinate — it just didn’t know.

This keeps happening. An agent reads "size": 5242880 and reports “5.2 million units of size.” Another one logs "api_key": "sk-1234..." straight into a Slack thread. Not because agents are dumb, but because your data doesn’t tell them what it means.

The field name is the schema

Here’s the insight behind Agent-First Data: if you rename timeout to timeout_ms, every agent immediately knows it’s milliseconds. No docs to find, no schema to parse, no guessing.

{
  "timeout_ms": 5000,
  "created_at_epoch_ms": 1738886400000,
  "file_size_bytes": 5242880,
  "api_key_secret": "sk-1234567890abcdef"
}

An agent reading this knows _ms means milliseconds, _epoch_ms is a Unix timestamp, _bytes is a byte count, and _secret means “do not log this.” The field name is the documentation.

This works in JSON, YAML, TOML, env vars, database columns — anywhere you have key-value data.
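To make that concrete, here is a minimal sketch (standard-library Python only) of how a consumer could map suffixed field names back to their meaning. The suffix table is an illustrative subset covering only the suffixes used in this article, not the full spec:

```python
# Illustrative sketch: interpret AFDATA-style suffixes on field names.
# The suffix table is an assumption covering only this article's examples.
SUFFIXES = {
    "_ms": ("duration", "milliseconds"),
    "_s": ("duration", "seconds"),
    "_epoch_ms": ("timestamp", "milliseconds since the Unix epoch"),
    "_bytes": ("size", "bytes"),
    "_secret": ("secret", "must not be logged"),
}

def interpret(field_name):
    """Return (base_name, kind, unit) for a suffixed field, or None."""
    # Check longer suffixes first so "_epoch_ms" wins over "_ms".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if field_name.endswith(suffix):
            kind, unit = SUFFIXES[suffix]
            return field_name[: -len(suffix)], kind, unit
    return None

print(interpret("timeout_ms"))      # ('timeout', 'duration', 'milliseconds')
print(interpret("api_key_secret"))  # ('api_key', 'secret', 'must not be logged')
```

An agent (or any program) needs nothing beyond this lookup to recover the semantics; that is the whole point of putting the schema in the name.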

Output processing: format for humans, protect secrets

Naming alone gets you most of the way. But if you’re building a CLI tool or writing logs, you probably want two more things: human-friendly formatting and automatic secret redaction.

That’s where the output functions come in. Given the JSON above, here’s what output_yaml produces:

---
api_key: "***"
created_at: "2025-02-07T00:00:00.000Z"
file_size: "5.0MB"
timeout: "5.0s"

Three things happened. Suffixes got stripped from the keys (because the formatted value already carries the unit). Values got converted to human-readable form. And api_key_secret was automatically redacted.

You get three output formats depending on the context:

output_json — machine-readable, original keys, secrets redacted:

{"api_key_secret":"***","file_size_bytes":5242880,"timeout_ms":5000}

output_yaml — human-readable, keys stripped, values formatted:

---
api_key: "***"
file_size: "5.0MB"
timeout: "5.0s"

output_plain — compact logfmt, keys stripped, values formatted:

api_key=*** file_size=5.0MB timeout=5.0s
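The output layer itself is small enough to sketch by hand. Here is a standalone approximation of an output_plain-style formatter; the formatting rules (binary megabytes, one decimal place, keys sorted) are inferred from the examples above and may not match the library exactly:

```python
# Sketch of an output_plain-style formatter: redact secrets, strip
# suffixes, humanize values. Rules are inferred from the article's
# examples, not taken from the library's source.
def format_plain(data):
    parts = []
    for key in sorted(data):
        value = data[key]
        if key.endswith("_secret"):
            key, value = key[:-7], "***"          # redact, never print
        elif key.endswith("_ms"):
            key, value = key[:-3], f"{value / 1000:.1f}s"
        elif key.endswith("_bytes"):
            key, value = key[:-6], f"{value / 1048576:.1f}MB"
        parts.append(f"{key}={value}")
    return " ".join(parts)

print(format_plain({
    "timeout_ms": 5000,
    "file_size_bytes": 5242880,
    "api_key_secret": "sk-1234567890abcdef",
}))
# api_key=*** file_size=5.0MB timeout=5.0s
```

Note the ordering: the suffix check decides both the display key and the display value, so redaction and unit formatting can never get out of sync with the name.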

In code, it’s one function call:

use agent_first_data::*;

let status = json!({
    "uptime_s": 86400,
    "memory_bytes": 1048576,
    "db_password_secret": "super-secret"
});

println!("{}", output_yaml(&status));
// ---
// db_password: "***"
// memory: "1.0MB"
// uptime: "86400s"

Same API in Python (output_yaml), TypeScript (outputYaml), and Go (OutputYaml).

The protocol template

For tools that want consistent structure, AFDATA includes an optional protocol template with a required code field and optional trace:

let response = build_json_ok(
    json!({"hash": "abc123", "size_bytes": 456789}),
    Some(json!({"duration_ms": 1280, "source": "db"}))
);
// {"code":"ok","result":{"hash":"abc123","size_bytes":456789},"trace":{"duration_ms":1280,"source":"db"}}

Use build_json_ok, build_json_error, and the generic build_json for custom events. Startup diagnostics, for example, are emitted as a log event:

let startup = build_json(
    "log",
    json!({
        "event": "startup",
        "config": {"timeout_s": 30}
    }),
    None
);
// {"code":"log","event":"startup","config":{"timeout_s":30}}

In CLI tools, keep startup diagnostics off by default and enable them explicitly with --log startup (or --verbose).
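The ok template is simple enough that its shape can be sketched directly. This Python version mirrors the JSON shown above; it is an illustration of the envelope, not the library's actual implementation:

```python
# Hypothetical sketch of the ok-response envelope: a required "code"
# field, the payload under "result", and an optional "trace".
import json

def build_json_ok(result, trace=None):
    payload = {"code": "ok", "result": result}
    if trace is not None:
        payload["trace"] = trace
    return json.dumps(payload, separators=(",", ":"))

print(build_json_ok(
    {"hash": "abc123", "size_bytes": 456789},
    {"duration_ms": 1280, "source": "db"},
))
# {"code":"ok","result":{"hash":"abc123","size_bytes":456789},"trace":{"duration_ms":1280,"source":"db"}}
```

Because the payload fields keep their AFDATA suffixes, the same output functions can format or redact a protocol response without any extra schema.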

Structured logging

The same naming and formatting logic extends to log output. Rather than adding a separate logging convention, AFDATA logging uses the library’s own output_json/output_yaml/output_plain functions — same suffix stripping, value formatting, and secret redaction.

The key problem it solves: when multiple requests run concurrently, log lines from different requests interleave. Span fields (like request_id) need to appear on every line from that request, without manual threading through every function call. Each language integrates with its native logging ecosystem to handle this automatically.

use agent_first_data::afdata_tracing;
use tracing::{info, info_span};
use tracing_subscriber::EnvFilter;

afdata_tracing::init_json(EnvFilter::new("info"));

let span = info_span!("request", request_id = %uuid);
let _guard = span.enter();
info!(duration_ms = 1280, "processed");
// {"timestamp_epoch_ms":...,"code":"info","message":"processed","request_id":"abc-123","duration_ms":1280}

The same in Go, Python, and TypeScript:

afdata.InitJson()
ctx := afdata.WithSpan(ctx, map[string]any{"request_id": uuid})
afdata.LoggerFromContext(ctx).Info("processed", "duration_ms", 1280)

from agent_first_data import init_logging_json, span
init_logging_json("INFO")
with span(request_id=uuid):
    logger.info("processed", extra={"duration_ms": 1280})

import { log, span, initJson } from "agent-first-data";
initJson();
await span({ request_id: uuid }, async () => {
  log.info("processed", { duration_ms: 1280 });
});

In each case: one init call, one span, and every log line from that scope carries the span fields automatically. duration_ms renders as duration: 1.28s in YAML and plain output.

Before and after

Here’s a real scenario. You run my-tool status and get:

$ my-tool status
{"timeout": 5000, "api_key": "sk-abc123"}

An agent sees this and has to guess everything. Your secret is now in the terminal scrollback, the log aggregator, and probably a Slack thread.

With AFDATA naming (timeout_ms, api_key_secret) and output:

$ my-tool status --output yaml
---
api_key: "***"
timeout: "5.0s"

The agent knows the timeout is 5 seconds. The secret never appears in output. No docs needed.

Try it

cargo add agent-first-data    # Rust
pip install agent-first-data  # Python
npm install agent-first-data  # TypeScript
go get github.com/cmnspore/agent-first-data/go  # Go

The naming convention costs nothing — just rename your fields. The output functions are a single import. The protocol template is there when you want it.

Full spec and docs: github.com/cmnspore/agent-first-data