For two independent implementations to agree on a kbHash, they must produce the exact same bytes from the same logical input. That requires a serialization format with no ambiguity: no optional whitespace, no variation in key order, no difference in how numbers or strings are encoded. ALX Protocol uses a deterministic JSON serializer — modeled on RFC 8785 (JSON Canonicalization Scheme, or JCS) — as the foundation of KB identity.

What canonicalization does

The canonicalize() function takes an in-memory object and returns a UTF-8 string with the following properties:
  • Keys are sorted alphabetically at every level of nesting (Unicode ascending order).
  • No insignificant whitespace — no spaces after : or ,, no indentation, no trailing newlines.
  • Arrays preserve their element order (only sources is sorted separately, as a normalization step before canonicalization).
  • Strings use JSON string encoding — the same escaping rules as JSON.stringify.
  • Integers serialize as plain base-10 digits (e.g. 42, not 42.0).
  • Non-integer numbers serialize as JSON numbers (e.g. 0.91).
The result is a single compact UTF-8 string with no implementation-defined variation.
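The properties listed above can be illustrated with a short sketch. This is a hypothetical, simplified serializer written for explanation only; it is not the source of canonicalize() in @alx/protocol, and it assumes that the default JavaScript string sort (UTF-16 code unit order) is an acceptable stand-in for the key ordering described above, which holds for ASCII keys.

```typescript
// Hypothetical sketch of a deterministic serializer; not the @alx/protocol source.
function canonicalize(value: unknown): string {
  if (typeof value === "string" || typeof value === "boolean") {
    return JSON.stringify(value); // standard JSON string escaping
  }
  if (typeof value === "number") {
    if (!Number.isFinite(value)) {
      throw new TypeError("non-finite number in canonical input");
    }
    return JSON.stringify(value); // 42 stays "42", 0.91 stays "0.91"
  }
  if (Array.isArray(value)) {
    // Arrays keep their element order.
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  if (typeof value === "object" && value !== null) {
    const record = value as Record<string, unknown>;
    const members = Object.keys(record)
      .filter((key) => record[key] !== undefined) // undefined is treated as absent
      .sort() // assumption: code unit order; same as alphabetical for ASCII keys
      .map((key) => JSON.stringify(key) + ":" + canonicalize(record[key]));
    return "{" + members.join(",") + "}";
  }
  // null, BigInt, functions, symbols, and bare undefined all end up here.
  throw new TypeError("unsupported value in canonical input: " + String(value));
}

// Two objects with the same content but different key order.
const canonicalA = canonicalize({
  tier: "open",
  payload: { type: "practice", contexts: [] },
  domain: "software.security",
});
const canonicalB = canonicalize({
  domain: "software.security",
  payload: { contexts: [], type: "practice" },
  tier: "open",
});
```

Under these assumptions canonicalA and canonicalB are the same string, which is the property that makes the serialization usable as a hash preimage.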

Example: input to canonical output

Given this object:
{
  "domain": "software.security",
  "tier": "open",
  "type": "practice",
  "payload": {
    "type": "practice",
    "failureModes": [],
    "contexts": [],
    "rationale": "Rotate signing keys every 90 days."
  },
  "sources": [],
  "artifactHash": "0xf3c9a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0"
}
After normalizeForHash (which projects to hash-scoped keys only) and canonicalize, the output is a single line with keys sorted at every level:
{"artifactHash":"0xf3c9a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2e3f4a5b6c7d8e9f0","domain":"software.security","payload":{"contexts":[],"failureModes":[],"rationale":"Rotate signing keys every 90 days.","type":"practice"},"sources":[],"tier":"open","type":"practice"}
Notice that artifactHash sorts before domain, domain before payload, and inside payload the keys are also alphabetically ordered. This output is byte-identical regardless of the order fields appear in your source object.

The KB_V1 domain tag

Before hashing, the protocol prepends the ASCII string "KB_V1" to the canonical JSON. This domain tag serves two purposes:
  1. Cross-domain collision prevention — different protocol objects (tasks, transitions, constraints) each use a distinct domain tag. A KB envelope and a task envelope with identical content will produce different hashes because they are hashed under different tags.
  2. Versioning — if the KB identity rules change in a future protocol version, a new tag like "KB_V2" will be used. The tag is the version boundary.
The full hashing formula is:
canonical_string := canonicalize(normalizeForHash(envelope))
kbHash           := keccak256(UTF8("KB_V1") + UTF8(canonical_string))
UTF8("KB_V1") and UTF8(canonical_string) are concatenated as raw bytes before hashing. Because the UTF-8 encoding of well-formed strings is preserved under concatenation, this is equivalent to hashing the UTF-8 encoding of the single string "KB_V1" + canonical_string.
ALX Protocol uses Keccak-256 — the Ethereum-compatible sponge construction — not NIST SHA3-256. These are different algorithms and produce different output for the same input.
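As a rough illustration of the formula above, the preimage can be assembled with the standard TextEncoder API. The helper names here (buildKbPreimage, kbHashFromCanonical, HashFn) are hypothetical and not part of @alx/protocol. The Keccak-256 function is passed in as a parameter because this sketch does not assume any particular hashing library; supply one that implements Ethereum-style Keccak-256 rather than NIST SHA3-256.

```typescript
// Hypothetical helper names; the hash function is injected by the caller.
type HashFn = (bytes: Uint8Array) => Uint8Array;

const KB_DOMAIN_TAG = "KB_V1";

// Concatenate UTF8("KB_V1") and UTF8(canonicalString) as raw bytes.
function buildKbPreimage(canonicalString: string): Uint8Array {
  const encoder = new TextEncoder();
  const tagBytes = encoder.encode(KB_DOMAIN_TAG);
  const bodyBytes = encoder.encode(canonicalString);
  const preimage = new Uint8Array(tagBytes.length + bodyBytes.length);
  preimage.set(tagBytes, 0);
  preimage.set(bodyBytes, tagBytes.length);
  return preimage;
}

// Hex-encode with a 0x prefix, two lowercase hex digits per byte.
function toHex(bytes: Uint8Array): string {
  return "0x" + Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// keccak256 must be a real Keccak-256 implementation supplied by the caller.
function kbHashFromCanonical(canonicalString: string, keccak256: HashFn): string {
  return toHex(keccak256(buildKbPreimage(canonicalString)));
}
```

Injecting the hash function keeps the byte-level concatenation easy to test on its own, separate from the choice of Keccak-256 library.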

Step-by-step walkthrough

Here is a complete walkthrough of how canonicalization and hashing work together to produce kbHash.
Step 1: Start with your envelope object.
const envelope = {
  type: "practice",
  domain: "software.security",
  sources: [],
  artifactHash: "0xf3c9...7d11",
  tier: "open",
  payload: {
    type: "practice",
    rationale: "Rotate signing keys every 90 days.",
    contexts: [],
    failureModes: [],
  },
};
Step 2: Project to hash-scoped fields only. normalizeForHash keeps only type, domain, sources, artifactHash, tier, payload, and derivation (when present). It also sorts sources lexicographically and removes duplicates. Fields like curator, createdAt, and signature are dropped.
Step 3: Serialize with canonicalize(). canonicalize recurses through the projected object, sorting object keys at every level, and produces a single UTF-8 string with no insignificant whitespace.
Step 4: Prepend the domain tag and hash. The string "KB_V1" is prepended (as UTF-8 bytes) to the canonical string bytes, and the combined byte sequence is passed to Keccak-256. The result is hex-encoded with a 0x prefix.
In code, steps 2–4 are handled by a single call:
import { kbHashFromEnvelope } from "@alx/protocol";

const kbHash = kbHashFromEnvelope(envelope);
// → "0x..." — 0x-prefixed, 64-char lowercase hex

The hash scope

Only these fields participate in the hash preimage. All other fields are excluded by normalizeForHash:
| Field | Included |
| --- | --- |
| type | Yes |
| domain | Yes |
| sources | Yes (sorted, deduplicated) |
| artifactHash | Yes, when present |
| tier | Yes, when present |
| payload | Yes |
| derivation | Yes, when present |
| kbHash (top-level) | No |
| curator, createdAt | No |
| signature | No |
| Transport / indexing metadata | No |
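The projection described by the field list above might look roughly like the following. This is a hypothetical sketch, not the normalizeForHash shipped in @alx/protocol: the function name normalizeForHashSketch is invented for illustration, and it assumes that sources is an array of strings, which the field list does not state explicitly.

```typescript
// Hypothetical sketch of the projection step; key list taken from the hash scope above.
const HASH_SCOPED_KEYS = [
  "type",
  "domain",
  "sources",
  "artifactHash",
  "tier",
  "payload",
  "derivation",
] as const;

function normalizeForHashSketch(
  envelope: Record<string, unknown>
): Record<string, unknown> {
  const projected: Record<string, unknown> = {};
  for (const key of HASH_SCOPED_KEYS) {
    // Copy only hash-scoped fields that are actually present.
    if (envelope[key] !== undefined) {
      projected[key] = envelope[key];
    }
  }
  const rawSources = projected["sources"];
  if (Array.isArray(rawSources)) {
    // Assumption: sources are strings. Deduplicate, then sort lexicographically.
    projected["sources"] = Array.from(new Set(rawSources as string[])).sort();
  }
  return projected;
}

// Placeholder source identifiers and metadata values, for illustration only.
const projectedExample = normalizeForHashSketch({
  type: "practice",
  domain: "software.security",
  sources: ["source-b", "source-a", "source-b"],
  tier: "open",
  payload: { type: "practice", rationale: "Rotate signing keys every 90 days." },
  curator: "example-curator",
  createdAt: "2024-01-01T00:00:00Z",
  signature: "0xabc",
});
```

In this example projectedExample keeps type, domain, sources, tier, and payload, while curator, createdAt, and signature are dropped and sources becomes a sorted list without duplicates.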

Values that are not allowed in canonical input

The canonicalize() function rejects or drops the following values:
  • null: throws. Not allowed at object value positions in KB artifact profiles; omit the field instead.
  • BigInt: throws. Not a valid JSON type; use a string representation for large integers where the schema permits.
  • undefined: does not throw. Treated as absent, so fields with undefined values are omitted from the canonical output.
  • Non-finite numbers: NaN, Infinity, and -Infinity cause canonicalization to fail.
Passing null or BigInt values into kbHashFromEnvelope will throw an error. Check your payload schema for nullable fields and replace null with omission before calling the hash function.
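One way to replace null with omission before hashing is a small recursive pre-processing pass. The helper below, omitNullFields, is a hypothetical name and not part of @alx/protocol, and the reviewer field in the example is invented for illustration. The sketch drops null and undefined values at object value positions, leaves null array elements untouched (so canonicalization would still reject them), and fails early on BigInt and non-finite numbers.

```typescript
// Hypothetical pre-processing helper; not part of @alx/protocol.
function omitNullFields(value: unknown): unknown {
  if (Array.isArray(value)) {
    // Array elements are kept in place; a null element is NOT removed here.
    return value.map((item) => omitNullFields(item));
  }
  if (typeof value === "object" && value !== null) {
    const cleaned: Record<string, unknown> = {};
    for (const [key, member] of Object.entries(value as Record<string, unknown>)) {
      if (member === null || member === undefined) continue; // omit the field
      cleaned[key] = omitNullFields(member);
    }
    return cleaned;
  }
  if (typeof value === "bigint") {
    throw new TypeError("BigInt is not allowed; use a string where the schema permits");
  }
  if (typeof value === "number" && !Number.isFinite(value)) {
    throw new TypeError("non-finite number is not allowed");
  }
  return value;
}

// `reviewer` is an invented nullable field used only for this example.
const cleanedPayload = omitNullFields({
  type: "practice",
  rationale: "Rotate signing keys every 90 days.",
  reviewer: null,
  contexts: [],
});
```

Here cleanedPayload no longer has a reviewer key, so it can be passed on without tripping the null check. Whether silently dropping a field is acceptable depends on your payload schema, so treat this as a starting point rather than a required step.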

Why this matters for your integration

If your system constructs KB envelopes and computes kbHash values, the canonical serialization must match the reference implementation byte-for-byte. Using a generic RFC 8785 library is a good starting point, but the protocol’s conformance gate is agreement with the published test vectors — not RFC 8785 compliance alone. When in doubt, use kbHashFromEnvelope from @alx/protocol directly, or validate your output against the test vectors. For the full identity derivation context, see Identity. For the envelope structure and which fields exist, see Knowledge Blocks.