Agent-First Data: Building APIs that Agents Can Read
A field-naming convention and output layer that lets agents infer units and timestamps, and recognize secrets, without extra documentation.
You just shipped a CLI tool. It works great. Then an AI agent calls it, reads "timeout": 5000, and confidently converts it to… 5000 seconds. Your 5-second timeout is now 83 minutes. The agent didn’t hallucinate — it just didn’t know.
This keeps happening. An agent reads "size": 5242880 and reports “5.2 million units of size.” Another one logs "api_key": "sk-1234..." straight into a Slack thread. Not because agents are dumb, but because your data doesn’t tell them what it means.
The field name is the schema
Here’s the insight behind Agent-First Data: if you rename timeout to timeout_ms, every agent immediately knows it’s milliseconds. No docs to find, no schema to parse, no guessing.
{
"timeout_ms": 5000,
"created_at_epoch_ms": 1738886400000,
"file_size_bytes": 5242880,
"api_key_secret": "sk-1234567890abcdef"
}
An agent reading this knows _ms means milliseconds, _epoch_ms is a Unix timestamp, _bytes is a byte count, and _secret means “do not log this.” The field name is the documentation.
This works in JSON, YAML, TOML, env vars, database columns — anywhere you have key-value data.
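To make the convention concrete, here is a minimal sketch of how a consumer could infer semantics from a field name alone. This is my own illustration, not library code — `parse_field` and the suffix table are hypothetical:

```python
# Hypothetical helper (not part of agent-first-data) showing how a consumer
# can infer semantics purely from AFDATA-style field names.

# Suffix -> (semantic kind, canonical unit) for a few common suffixes.
SUFFIXES = {
    "_ms": ("duration", "milliseconds"),
    "_s": ("duration", "seconds"),
    "_epoch_ms": ("timestamp", "milliseconds since Unix epoch"),
    "_bytes": ("size", "bytes"),
    "_secret": ("secret", None),
}

def parse_field(name: str):
    """Return (base_name, kind, unit) for an AFDATA-style field name."""
    # Check longer suffixes first so "_epoch_ms" wins over "_ms".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if name.endswith(suffix):
            kind, unit = SUFFIXES[suffix]
            return name[: -len(suffix)], kind, unit
    return name, "unknown", None

print(parse_field("timeout_ms"))      # ('timeout', 'duration', 'milliseconds')
print(parse_field("api_key_secret"))  # ('api_key', 'secret', None)
```

A table of five string suffixes is the entire "schema" — which is exactly why the convention travels so well across formats.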
Output processing: format for humans, protect secrets
Naming alone gets you most of the way. But if you’re building a CLI tool or writing logs, you probably want two more things: human-friendly formatting and automatic secret redaction.
That’s where the output functions come in. Given the JSON above, here’s what output_yaml produces:
---
api_key: "***"
created_at: "2025-02-07T00:00:00.000Z"
file_size: "5.0MB"
timeout: "5.0s"
Three things happened. Suffixes got stripped from the keys (because the formatted value already carries the unit). Values got converted to human-readable form. And api_key_secret was automatically redacted.
You get three output formats depending on the context:
output_json — machine-readable, original keys, secrets redacted:
{"api_key_secret":"***","file_size_bytes":5242880,"timeout_ms":5000}
output_yaml — human-readable, keys stripped, values formatted:
---
api_key: "***"
file_size: "5.0MB"
timeout: "5.0s"
output_plain — compact logfmt, keys stripped, values formatted:
api_key=*** file_size=5.0MB timeout=5.0s
In code, it’s one function call:
use agent_first_data::*;
let status = json!({
"uptime_s": 86400,
"memory_bytes": 1048576,
"db_password_secret": "super-secret"
});
println!("{}", output_yaml(&status));
// ---
// db_password: "***"
// memory: "1.0MB"
// uptime: "86400s"
Same API in Python (output_yaml), TypeScript (outputYaml), and Go (OutputYaml).
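If you want a feel for the mechanics, the whole transformation fits in a few lines. This is a simplified sketch of my own (handling only a couple of suffixes), not the library's actual implementation:

```python
# Simplified sketch of AFDATA-style output processing (not the library's
# actual code): strip unit suffixes, format values, redact secrets.

def format_value(key: str, value):
    if key.endswith("_secret"):
        return key[: -len("_secret")], "***"        # redact, never print
    if key.endswith("_ms"):
        return key[: -len("_ms")], f"{value / 1000:.1f}s"
    if key.endswith("_bytes"):
        return key[: -len("_bytes")], f"{value / (1024 * 1024):.1f}MB"
    return key, str(value)

def output_plain(data: dict) -> str:
    # logfmt-style: key=value pairs, sorted for stable output.
    pairs = [format_value(k, v) for k, v in sorted(data.items())]
    return " ".join(f"{k}={v}" for k, v in pairs)

status = {"timeout_ms": 5000, "file_size_bytes": 5242880, "api_key_secret": "sk-1234"}
print(output_plain(status))  # api_key=*** file_size=5.0MB timeout=5.0s
```

Note that redaction happens inside the formatter, so there is no code path where the raw secret reaches the output string.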
The protocol template
For tools that want consistent structure, AFDATA includes an optional protocol template with a required code field and optional trace:
let response = build_json_ok(
json!({"hash": "abc123", "size_bytes": 456789}),
Some(json!({"duration_ms": 1280, "source": "db"}))
);
// {"code":"ok","result":{"hash":"abc123","size_bytes":456789},"trace":{"duration_ms":1280,"source":"db"}}
Use build_json_ok, build_json_error, and the generic build_json for custom events. Startup diagnostics, for example, are modeled as a log event:
let startup = build_json(
"log",
json!({
"event": "startup",
"config": {"timeout_s": 30}
}),
None
);
// {"code":"log","event":"startup","config":{"timeout_s":30}}
In CLI tools, keep startup diagnostics off by default and enable them explicitly with --log startup (or --verbose).
Structured logging
The same naming and formatting logic extends to log output. Rather than adding a separate logging convention, AFDATA logging uses the library’s own output_json/output_yaml/output_plain functions — same suffix stripping, value formatting, and secret redaction.
The key problem it solves: when multiple requests run concurrently, log lines from different requests interleave. Span fields (like request_id) need to appear on every line from that request, without manual threading through every function call. Each language integrates with its native logging ecosystem to handle this automatically.
use agent_first_data::afdata_tracing;
use tracing::{info, info_span};
use tracing_subscriber::EnvFilter;

afdata_tracing::init_json(EnvFilter::new("info"));
let span = info_span!("request", request_id = %uuid);
let _guard = span.enter();
info!(duration_ms = 1280, "processed");
// {"timestamp_epoch_ms":...,"code":"info","message":"processed","request_id":"abc-123","duration_ms":1280}
The same in Go, Python, and TypeScript:
Go:

afdata.InitJson()
ctx := afdata.WithSpan(ctx, map[string]any{"request_id": uuid})
afdata.LoggerFromContext(ctx).Info("processed", "duration_ms", 1280)

Python:

import logging
from agent_first_data import init_logging_json, span

logger = logging.getLogger(__name__)
init_logging_json("INFO")
with span(request_id=uuid):
    logger.info("processed", extra={"duration_ms": 1280})

TypeScript:

import { log, span, initJson } from "agent-first-data";

initJson();
await span({ request_id: uuid }, async () => {
  log.info("processed", { duration_ms: 1280 });
});
In each case: one init call, one span, and every log line from that scope carries the span fields automatically. duration_ms renders as duration: 1.28s in YAML and plain output.
Before and after
Here’s a real scenario. You run my-tool status and get:
$ my-tool status
{"timeout": 5000, "api_key": "sk-abc123"}
An agent sees this and has to guess everything. Your secret is now in the terminal scrollback, the log aggregator, and probably a Slack thread.
With AFDATA naming (timeout_ms, api_key_secret) and output:
$ my-tool status --output yaml
---
api_key: "***"
timeout: "5.0s"
The agent knows the timeout is 5 seconds. The secret never appears in output. No docs needed.
Try it
cargo add agent-first-data # Rust
pip install agent-first-data # Python
npm install agent-first-data # TypeScript
go get github.com/cmnspore/agent-first-data/go # Go
The naming convention costs nothing — just rename your fields. The output functions are a single import. The protocol template is there when you want it.
Full spec and docs: github.com/cmnspore/agent-first-data