cmn-substrate: The Protocol in a Crate

by CMN Contributors

A zero-I/O Rust library that implements the CMN protocol core — Ed25519 signatures, BLAKE3 tree hashing, content-addressed URIs, and JSON schema validation. Now on crates.io.

The CMN spec defines the protocol. cmn-substrate is the Rust implementation — the types, parsers, hash functions, and signature verification that turn JSON documents into verified, content-addressed objects.

It’s the shared foundation underneath Hypha (the CLI), Synapse (the indexer), and Tendril (the desktop client). Everything those tools know about spores, mycelium, taste, and URIs comes from this crate.

Why a separate crate

CMN tools have wildly different runtime requirements. Hypha runs on your laptop. Synapse runs on a server. Tendril runs in a browser via WASM. What they all share: they need to parse the same manifests, verify the same signatures, and compute the same hashes.

Substrate is designed for that constraint. Zero filesystem I/O in the core. No std::fs, no std::path::Path in the default build. Tree hashing operates on in-memory TreeEntry values, not directory walks. This makes it compile to WASM without feature flags or conditional compilation.

Anything that touches the network or disk is behind optional features:

Feature            What it adds
client             Async HTTP client (reqwest) for fetching manifests
client-safe-dns    Client plus DNS filtering to block private IPs (adds tokio)
archive-ruzstd     Tar+zstd extraction via ruzstd (pure Rust, WASM-compatible)
archive-zstd       Tar+zstd extraction via zstd (native, faster)

A WASM build uses archive-ruzstd. A CLI build uses archive-zstd and client-safe-dns. The protocol logic — types, crypto, schemas — is always the same.

What’s inside

Content addressing

Every CMN artifact is content-addressed. Substrate implements the full pipeline:

  1. Tree hashing — Git-like Merkle tree (blob_tree_blake3_nfc). Files become blobs (BLAKE3("blob {size}\0" + content)), directories become trees with sorted entries. Names are Unicode NFC-normalized so the same source tree produces the same hash on every platform.

  2. Core hashing — The core object (immutable metadata) is JCS-canonicalized and hashed with its signature: BLAKE3(JCS(core) + core_signature). For spores, the tree hash is prepended: BLAKE3(tree_hash + JCS(core) + core_signature).

  3. URI construction — The hash becomes the URI: cmn://domain/b3.{base58}. Parse it back with CmnUri::parse().
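The first step can be sketched without a hashing library. What follows is a minimal illustration under stated assumptions, not the crate's code: `Entry`, `blob_preimage`, and `sorted_names` are hypothetical names, NFC normalization is omitted, byte-wise name ordering is assumed, and the tree encoding itself is left out because the description above does not specify it. The sketch builds the bytes that would be handed to BLAKE3 for a blob and shows that sorting makes entry order independent of insertion order.

```rust
/// Hypothetical in-memory entry; the crate's real `TreeEntry` may look different.
#[derive(Debug, Clone, PartialEq)]
pub enum Entry {
    File { name: String, content: Vec<u8> },
    Dir { name: String, children: Vec<Entry> },
}

impl Entry {
    /// Name of the entry, regardless of its kind.
    pub fn name(&self) -> &str {
        match self {
            Entry::File { name, .. } | Entry::Dir { name, .. } => name.as_str(),
        }
    }
}

/// Builds the blob preimage described above: "blob {size}\0" followed by the content.
/// The returned bytes are what a BLAKE3 hasher would consume; hashing is not done here.
pub fn blob_preimage(content: &[u8]) -> Vec<u8> {
    let mut out = format!("blob {}\0", content.len()).into_bytes();
    out.extend_from_slice(content);
    out
}

/// Returns entry names in sorted order so that the same set of entries
/// always produces the same sequence. Byte-wise ordering is an assumption.
pub fn sorted_names(entries: &[Entry]) -> Vec<&str> {
    let mut names: Vec<&str> = entries.iter().map(|e| e.name()).collect();
    names.sort_unstable();
    names
}
```

Because everything operates on values already in memory, nothing in this sketch needs `std::fs`, which is the property that keeps the core portable to WASM.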

Two-layer signatures

Every manifest has a core_signature (author signs the immutable metadata) and a capsule_signature (host signs everything including distribution URLs). Substrate provides both signing and verification, all over JCS-canonicalized JSON.

// Verify a spore manifest
let spore: Spore = serde_json::from_value(manifest)?;
spore.verify_core_signature(&author_key)?;
spore.verify_capsule_signature(&host_key)?;

Mirrors can re-sign the capsule with their own key. The core stays intact. Authorship is preserved across replication.

Data models

Every protocol artifact has a strongly typed Rust struct. All of them derive Serialize/Deserialize with proper serde attributes, and fallible operations return anyhow::Result: no panics, no unwraps (enforced by clippy lints).

Schema validation

Five JSON schemas are embedded at compile time. Validate any CMN document offline:

let doc: serde_json::Value = serde_json::from_str(&json)?;
let schema_type = substrate::validate_schema(&doc)?;
// Returns SchemaType::Spore, SchemaType::Mycelium, etc.

HTTP client

Behind the client feature, substrate provides async functions for the full fetch-and-verify cycle:

let client = substrate::client::http_client(30)?;
let entry = substrate::client::fetch_cmn_entry(&client, "example.com", opts).await?;
let capsule = entry.primary_capsule()?;
let manifest = substrate::client::fetch_mycelium(&client, capsule, opts).await?;

URL validation rejects SSRF vectors — private IPs, localhost, link-local addresses. The client-safe-dns feature adds DNS-level filtering via tokio.
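To make the address classes concrete, here is a minimal sketch of that kind of check using only the standard library. The function names (`is_blocked_host`, `is_blocked_ip`) are hypothetical and the rule set is an assumption based on the categories listed above; the crate's actual validation may cover more or fewer ranges.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Returns true for hosts that should not be fetched: "localhost" or a
/// literal IP in a loopback, private, link-local, or unspecified range.
pub fn is_blocked_host(host: &str) -> bool {
    if host.eq_ignore_ascii_case("localhost") {
        return true;
    }
    match host.parse::<IpAddr>() {
        Ok(ip) => is_blocked_ip(ip),
        // A DNS name cannot be judged here; it may still resolve to a
        // private address, which is why resolution-time filtering exists.
        Err(_) => false,
    }
}

pub fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => is_blocked_v6(v6),
    }
}

fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    ip.is_loopback()           // 127.0.0.0/8
        || ip.is_private()     // 10/8, 172.16/12, 192.168/16
        || ip.is_link_local()  // 169.254.0.0/16
        || ip.is_unspecified() // 0.0.0.0
}

fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    // IPv4-mapped addresses (::ffff:a.b.c.d) are judged by their IPv4 part.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_blocked_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()                  // ::1
        || ip.is_unspecified()        // ::
        || (first & 0xfe00) == 0xfc00 // fc00::/7 unique local
        || (first & 0xffc0) == 0xfe80 // fe80::/10 link-local
}
```

A check on the URL alone only sees literal addresses. A public-looking hostname can still resolve to 10.0.0.1, so the same predicate has to be applied again to the resolved addresses, which is the gap DNS-level filtering closes.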

Archive extraction

Tar+zstd extraction with security hardening: rejects symlinks, hardlinks, path traversal, absolute paths, and enforces byte/file count limits.

let entries = substrate::archive::extract_tar_zstd(&compressed_bytes, &limits)?;
// Returns Vec<ArchiveEntry> — flat list of files and directories

The ExtractError::Malicious and ExtractError::Failed variants distinguish active threats from normal failures.
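The split between hostile input and ordinary failure can be illustrated with a small path check. This is a sketch under assumptions, not the crate's implementation: `CheckError`, `check_entry_path`, and `triage` are hypothetical stand-ins, the real `ExtractError` variants may carry different payloads, and the rules here are not exhaustive (symlink, hardlink, and size limits are not modeled).

```rust
/// Hypothetical stand-in for the two error classes named above.
#[derive(Debug, PartialEq)]
pub enum CheckError {
    /// The entry looks like an attack (traversal, absolute path).
    Malicious(&'static str),
    /// The entry is merely unusable (for example, an empty name).
    Failed(&'static str),
}

/// Validates an archive entry name before anything is written or kept.
/// Works on the raw string so the result does not depend on the host OS.
pub fn check_entry_path(raw: &str) -> Result<(), CheckError> {
    if raw.is_empty() {
        return Err(CheckError::Failed("empty entry name"));
    }
    if raw.starts_with('/') || raw.starts_with('\\') {
        return Err(CheckError::Malicious("absolute path"));
    }
    // Any ".." component could escape the extraction root.
    if raw
        .split(|c: char| c == '/' || c == '\\')
        .any(|part| part == "..")
    {
        return Err(CheckError::Malicious("path traversal"));
    }
    Ok(())
}

/// One possible caller policy: treat the two classes differently.
pub fn triage(result: &Result<(), CheckError>) -> &'static str {
    match result {
        Ok(()) => "accept",
        Err(CheckError::Malicious(_)) => "reject-and-flag",
        Err(CheckError::Failed(_)) => "reject",
    }
}
```

Keeping the classes separate lets a caller log or penalize a source that ships a traversal attempt while treating a truncated download as a plain error.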

Design decisions

No async in the core. Signature verification, hashing, schema validation — none of this needs to be async. The client module is async (reqwest), but it’s optional. This keeps the core dependency tree minimal and compile times fast.

JCS everywhere. JSON Canonicalization Scheme (RFC 8785) ensures deterministic serialization across platforms. Every signature, every hash, every content-addressed URI depends on JCS producing the exact same bytes. Substrate uses serde_jcs for this.

Base58 for human-readable identifiers. Hashes, signatures, and keys are all formatted as {algorithm}.{base58}. Compact, unambiguous, no padding characters, no /+ to escape in URLs.
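For readers unfamiliar with the encoding, the following is a minimal, dependency-free sketch of Base58 and the `{algorithm}.{base58}` layout. It assumes the Bitcoin alphabet, which the text above does not confirm, and `base58_encode` and `format_id` are hypothetical names rather than the crate's API.

```rust
/// Bitcoin-style Base58 alphabet (assumed): no 0, O, I, or l.
const ALPHABET: &[u8] = b"123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

/// Encodes bytes as Base58 by treating them as one big-endian integer.
pub fn base58_encode(input: &[u8]) -> String {
    // Each leading zero byte is written as a leading '1'.
    let zeros = input.iter().take_while(|&&b| b == 0).count();

    // `digits` holds the number in base 58, least significant digit first.
    let mut digits: Vec<u8> = Vec::new();
    for &byte in &input[zeros..] {
        // Multiply the current value by 256 and add the new byte.
        let mut carry = byte as u32;
        for d in digits.iter_mut() {
            carry += (*d as u32) << 8;
            *d = (carry % 58) as u8;
            carry /= 58;
        }
        while carry > 0 {
            digits.push((carry % 58) as u8);
            carry /= 58;
        }
    }

    let mut out = String::with_capacity(zeros + digits.len());
    out.extend(std::iter::repeat('1').take(zeros));
    out.extend(digits.iter().rev().map(|&d| ALPHABET[d as usize] as char));
    out
}

/// Formats an identifier as `{algorithm}.{base58}`.
pub fn format_id(algorithm: &str, bytes: &[u8]) -> String {
    format!("{}.{}", algorithm, base58_encode(bytes))
}
```

The output contains only letters and digits, so an identifier such as `b3.…` can sit in a URI path without percent-encoding.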

Strict clippy. unwrap_used = "deny", expect_used = "deny", panic = "deny". Explicit unwrap, expect, and panic! calls fail the lint check, so every error path has to return a Result.
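As an illustration of what code written under those lints tends to look like, here is a small sketch that splits an `{algorithm}.{base58}` identifier without any unwrap. The names `IdError` and `split_identifier` are hypothetical, and a std-only error type is used so the example stands alone; the crate itself is described as using anyhow::Result.

```rust
use std::fmt;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum IdError {
    MissingSeparator,
    EmptyPart,
}

impl fmt::Display for IdError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            IdError::MissingSeparator => write!(f, "identifier has no '.' separator"),
            IdError::EmptyPart => write!(f, "algorithm or encoded part is empty"),
        }
    }
}

impl std::error::Error for IdError {}

/// Splits "b3.<encoded>" into ("b3", "<encoded>").
/// The lint attributes mirror the ones named above; under them, reaching for
/// `.unwrap()` on the `split_once` result would be rejected by clippy.
#[deny(clippy::unwrap_used, clippy::expect_used, clippy::panic)]
pub fn split_identifier(id: &str) -> Result<(&str, &str), IdError> {
    let (algorithm, encoded) = id.split_once('.').ok_or(IdError::MissingSeparator)?;
    if algorithm.is_empty() || encoded.is_empty() {
        return Err(IdError::EmptyPart);
    }
    Ok((algorithm, encoded))
}
```

Malformed input becomes a value the caller must handle rather than a crash, which matters when the input comes from an untrusted host.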

Getting started

[dependencies]
cmn-substrate = { version = "0.1", features = ["client-safe-dns", "archive-zstd"] }

The spec defines the protocol. The crate implements it. The conformance vectors verify it.

Source: github.com/cmnspore/cmn-substrate
Crate: crates.io/crates/cmn-substrate
License: MIT