For two independent implementations to agree on a kbHash, they must produce the exact same bytes from the same logical input. That requires a serialization format with no ambiguity: no optional whitespace, no variation in key order, no difference in how numbers or strings are encoded. ALX Protocol uses a deterministic JSON serializer, modeled on RFC 8785 (JSON Canonicalization Scheme, or JCS), as the foundation of KB identity.
What canonicalization does
The canonicalize() function takes an in-memory object and returns a UTF-8 string with the following properties:
- Keys are sorted alphabetically at every level of nesting (Unicode ascending order).
- No insignificant whitespace: no spaces after ":" or ",", no indentation, no trailing newlines.
- Arrays preserve their element order (only sources is sorted separately, as a normalization step before canonicalization).
- Strings use JSON string encoding, with the same escaping rules as JSON.stringify.
- Integers serialize as plain base-10 digits (e.g. 42, not 42.0).
- Non-integer numbers serialize as JSON numbers (e.g. 0.91).
Example: input to canonical output
Given an envelope object passed through normalizeForHash (which projects to hash-scoped keys only) and then canonicalize, the output is a single line with keys sorted at every level.
artifactHash sorts before domain, domain before payload, and inside payload the keys are also alphabetically ordered. This output is byte-identical regardless of the order fields appear in your source object.
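The properties listed for canonicalize() and the sort order described above can be illustrated with a minimal TypeScript sketch. It is not the reference implementation: canonicalizeSketch is a hypothetical name, the envelope values are invented for illustration, and real conformance would still need to be checked against the published test vectors.

```typescript
// Illustrative sketch only; the reference canonicalize() may differ in details.
function canonicalizeSketch(value: unknown): string {
  if (value === null) throw new TypeError("null is not allowed in canonical input");
  switch (typeof value) {
    case "string":
      return JSON.stringify(value); // standard JSON string escaping
    case "boolean":
      return value ? "true" : "false";
    case "number":
      if (!Number.isFinite(value)) throw new TypeError("non-finite number");
      return JSON.stringify(value); // 42.0 becomes "42", 0.91 stays "0.91"
    case "object": {
      if (Array.isArray(value)) {
        // Arrays keep their element order.
        return "[" + value.map((item) => canonicalizeSketch(item)).join(",") + "]";
      }
      const obj = value as Record<string, unknown>;
      // Drop undefined fields, then sort keys (default sort compares UTF-16 code units).
      const keys = Object.keys(obj).filter((k) => obj[k] !== undefined).sort();
      return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalizeSketch(obj[k])).join(",") + "}";
    }
    default:
      // bigint, undefined inside arrays, functions, symbols
      throw new TypeError("unsupported value type: " + typeof value);
  }
}

// Hypothetical envelopes: same logical content, different key order.
const envelopeA = {
  type: "fact",
  payload: { summary: "example", confidence: 0.91 },
  domain: "demo",
  artifactHash: "0xabc",
};
const envelopeB = {
  artifactHash: "0xabc",
  domain: "demo",
  payload: { confidence: 0.91, summary: "example" },
  type: "fact",
};

const canonical = canonicalizeSketch(envelopeA);
// {"artifactHash":"0xabc","domain":"demo","payload":{"confidence":0.91,"summary":"example"},"type":"fact"}
```

Both envelopes serialize to the same string, which is the property that lets independent implementations arrive at the same hash input.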
The KB_V1 domain tag
Before hashing, the protocol prepends the ASCII string "KB_V1" to the canonical JSON. This domain tag serves two purposes:
- Cross-domain collision prevention: different protocol objects (tasks, transitions, constraints) each use a distinct domain tag. A KB envelope and a task envelope with identical content will produce different hashes because they are hashed under different tags.
- Versioning: if the KB identity rules change in a future protocol version, a new tag like "KB_V2" will be used. The tag is the version boundary.
UTF8("KB_V1") and UTF8(canonical_string) are concatenated as raw bytes before hashing. For ASCII-safe canonical strings (which all well-formed KB envelopes produce), this is equivalent to hashing the string "KB_V1" + canonical_string.
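The byte-level concatenation described above might look like the following sketch. The function name buildPreimage and the sample canonical string are assumptions made for illustration; only the "KB_V1" tag and the concatenation rule come from the text.

```typescript
const DOMAIN_TAG = "KB_V1";

// Concatenate UTF8(tag) and UTF8(canonical) as raw bytes (hypothetical helper).
function buildPreimage(canonical: string): Uint8Array {
  const encoder = new TextEncoder();
  const tagBytes = encoder.encode(DOMAIN_TAG);
  const bodyBytes = encoder.encode(canonical);
  const preimage = new Uint8Array(tagBytes.length + bodyBytes.length);
  preimage.set(tagBytes, 0);
  preimage.set(bodyBytes, tagBytes.length);
  return preimage;
}

const samplePreimage = buildPreimage('{"domain":"demo"}');
// The first five bytes are 0x4b 0x42 0x5f 0x56 0x31, the ASCII codes of "KB_V1".
```

Working on bytes rather than strings keeps the preimage unambiguous if the tag or payload encoding is ever handled by a language whose native strings are not UTF-8.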
ALX Protocol uses Keccak-256 (the Ethereum-compatible sponge construction), not NIST SHA3-256. The two differ in their padding rule and produce different output for the same input.
Step-by-step walkthrough
Here is a complete walkthrough of how canonicalization and hashing work together to produce kbHash.
Step 1: Start with your envelope object.
Step 2: Normalize with normalizeForHash().
normalizeForHash keeps only type, domain, sources, artifactHash, tier, payload, and derivation (when present). It also sorts sources lexicographically. Fields like curator, createdAt, and signature are dropped.
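As a rough illustration of this projection, the sketch below keeps only the hash-scoped fields and sorts sources. The name normalizeForHashSketch and the sample envelope values are hypothetical, and the deduplication of sources follows the hash-scope table rather than anything known about the reference code.

```typescript
// Fields that participate in the hash preimage, per the documentation.
const HASH_SCOPE = ["type", "domain", "sources", "artifactHash", "tier", "payload", "derivation"] as const;

// Illustrative projection; not the reference normalizeForHash.
function normalizeForHashSketch(envelope: Record<string, unknown>): Record<string, unknown> {
  const projected: Record<string, unknown> = {};
  for (const key of HASH_SCOPE) {
    const value = envelope[key];
    if (value === undefined) continue; // optional fields are simply absent
    projected[key] =
      key === "sources" && Array.isArray(value)
        ? Array.from(new Set(value as string[])).sort() // dedupe, then sort a copy
        : value;
  }
  return projected;
}

// Hypothetical envelope with fields outside the hash scope.
const projectedEnvelope = normalizeForHashSketch({
  type: "fact",
  domain: "demo",
  sources: ["0xb", "0xa", "0xb"],
  payload: { summary: "example" },
  curator: "0x123",
  createdAt: "2024-01-01T00:00:00Z",
  signature: "0xsig",
});
// projectedEnvelope keeps type, domain, sources (["0xa", "0xb"]), and payload only.
```

Because curator, createdAt, and signature never reach the serializer, changing them leaves the resulting hash untouched.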
Step 3: Serialize with canonicalize().
canonicalize recurses through the projected object, sorting object keys at every level, and produces a single UTF-8 string with no whitespace.
Step 4: Prepend the domain tag and hash.
The string "KB_V1" is prepended (as UTF-8 bytes) to the canonical string bytes, and the combined byte sequence is passed to Keccak-256. The result is hex-encoded with a 0x prefix.
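A minimal sketch of this final step follows. Keccak-256 is not part of the JavaScript standard library, so the hash function is passed in as a parameter; kbHashFromCanonicalSketch and HashFn are hypothetical names. One option, which is an assumption and not something the protocol documentation names, would be to supply keccak_256 from the @noble/hashes/sha3 package. Node's built-in "sha3-256" is the NIST variant and would give the wrong result.

```typescript
// Any function mapping bytes to a 32-byte Keccak-256 digest (supplied by the caller).
type HashFn = (bytes: Uint8Array) => Uint8Array;

// Illustrative helper: prepend the domain tag, hash, and hex-encode with a 0x prefix.
function kbHashFromCanonicalSketch(canonical: string, keccak256: HashFn): string {
  // Encoding the concatenated string yields the same bytes as
  // concatenating UTF8("KB_V1") and UTF8(canonical).
  const preimage = new TextEncoder().encode("KB_V1" + canonical);
  const digest = keccak256(preimage);
  const hex = Array.from(digest, (byte) => byte.toString(16).padStart(2, "0")).join("");
  return "0x" + hex;
}

// Usage (assuming a Keccak-256 implementation is available as keccak_256):
//   const kbHash = kbHashFromCanonicalSketch(canonicalString, keccak_256);
```

Injecting the hash function keeps the sketch independent of any particular cryptography library and makes it easy to swap in whichever audited implementation a project already uses.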
In code, steps 2–4 are handled by a single call to kbHashFromEnvelope from @alx/protocol.
The hash scope
Only these fields participate in the hash preimage. All other fields are excluded by normalizeForHash:
| Field | Included |
|---|---|
| type | Yes |
| domain | Yes |
| sources | Yes (sorted, deduplicated) |
| artifactHash | Yes, when present |
| tier | Yes, when present |
| payload | Yes |
| derivation | Yes, when present |
| kbHash (top-level) | No |
| curator, createdAt | No |
| signature | No |
| Transport / indexing metadata | No |
Values that are not allowed in canonical input
The canonicalize() function will throw an error for the following values, with the exception of undefined, which is dropped rather than rejected:
- null: not allowed at object value positions in KB artifact profiles. Use omission instead.
- BigInt: not a valid JSON type; use string representation for large integers where the schema permits.
- undefined: treated as absent; fields with undefined values are omitted from the canonical output.
- Non-finite numbers: NaN, Infinity, and -Infinity will cause canonicalization to fail.
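Before handing an object to the serializer, it can help to locate every offending value at once instead of stopping at the first thrown error. The following pre-flight check is a hypothetical helper (findDisallowedValues is not part of @alx/protocol as far as this page states) that reports the path of each value from the list above.

```typescript
// Walk a value and collect a description of every disallowed entry (illustrative only).
function findDisallowedValues(value: unknown, path: string = "$"): string[] {
  if (value === null) return [path + ": null"];
  if (typeof value === "bigint") return [path + ": BigInt"];
  if (typeof value === "number" && !Number.isFinite(value)) {
    return [path + ": non-finite number"];
  }
  if (Array.isArray(value)) {
    return value.flatMap((item, index) => findDisallowedValues(item, `${path}[${index}]`));
  }
  if (typeof value === "object") {
    return Object.entries(value as Record<string, unknown>).flatMap(([key, child]) =>
      findDisallowedValues(child, `${path}.${key}`),
    );
  }
  // Strings, booleans, and finite numbers pass; undefined is treated as absent.
  return [];
}

const problems = findDisallowedValues({
  payload: { score: NaN, note: null },
  tier: undefined,
  sources: ["a"],
});
// ["$.payload.score: non-finite number", "$.payload.note: null"]
```

An empty result does not prove conformance; it only rules out the value types listed above.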
Why this matters for your integration
If your system constructs KB envelopes and computes kbHash values, the canonical serialization must match the reference implementation byte-for-byte. Using a generic RFC 8785 library is a good starting point, but the protocol’s conformance gate is agreement with the published test vectors, not RFC 8785 compliance alone. When in doubt, use kbHashFromEnvelope from @alx/protocol directly, or validate your output against the test vectors.
For the full identity derivation context, see Identity. For the envelope structure and which fields exist, see Knowledge Blocks.