Implement proper DAG-CBOR (DRISL-CBOR) encoding/decoding layer #20

Closed
opened 2026-04-12 17:28:44 +00:00 by Grandiras · 1 comment

Summary

ATProto.NET has a CAR file reader, but lacks proper DAG-CBOR (DRISL-CBOR) encoding/decoding for the AT Protocol data model. The spec requires deterministic CBOR encoding for signing/verification and proper round-tripping between JSON and CBOR representations.

Spec Reference

From the Data Model specification (https://atproto.com/specs/data-model):

Required capabilities

  1. DRISL-CBOR Encoding - Deterministic encoding following DRISL rules:
    • Map keys must be sorted by the bytes of their CBOR encoding (for string keys: shorter keys first, then bytewise)
    • No floating point numbers allowed
    • CIDs encoded with CBOR tag 42
    • Specific integer encoding rules (canonical lengths)
  2. DRISL-CBOR Decoding - Parse and validate CBOR data against AT Protocol rules
  3. JSON ↔ CBOR Round-tripping - Lossless conversion between representations:
    • $link objects ↔ CID tag 42
    • $bytes objects ↔ CBOR byte strings
    • $type discriminator preservation
  4. CID Computation - Generate CIDs from CBOR-encoded data:
    • CID v1 with SHA-256 hash
    • DRISL codec (0x71) for data nodes
    • Raw codec (0x55) for blobs
  5. Data Model Validation - Validate data against abstract AT Protocol model:
    • No floats
    • Integers within the signed 64-bit range
    • Valid $bytes base64 encoding
    • Valid $link CID format
    • Reserved $ prefix field handling
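
To make the encoding rules in item 1 concrete, here is a minimal sketch of a deterministic encoder. It is written in Python for brevity and is not the ATProto.NET API: the names `encode`, `_head`, and `Cid` are hypothetical, and the `Cid` wrapper is assumed to hold an already-computed binary CID. It covers shortest-form integer and length heads, rejection of floats, key sorting by encoded bytes, and tag 42 for CIDs.

```python
import struct


class Cid:
    """Hypothetical wrapper marking raw binary CID bytes (illustration only)."""

    def __init__(self, raw: bytes):
        self.raw = raw


def _head(major: int, n: int) -> bytes:
    """Encode a CBOR head (major type + argument) in its shortest form."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 1 << 8:
        return bytes([(major << 5) | 24, n])
    if n < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", n)


def encode(value) -> bytes:
    """Encode a data-model value deterministically; raise on disallowed input."""
    if value is None:
        return b"\xf6"
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return b"\xf5" if value else b"\xf4"
    if isinstance(value, int):
        if not -(1 << 63) <= value < (1 << 63):
            raise ValueError("integer outside the signed 64-bit range")
        return _head(0, value) if value >= 0 else _head(1, -1 - value)
    if isinstance(value, float):
        raise TypeError("floats are not allowed")
    if isinstance(value, bytes):
        return _head(2, len(value)) + value
    if isinstance(value, str):
        data = value.encode("utf-8")
        return _head(3, len(data)) + data
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        items = []
        for key, item in value.items():
            if not isinstance(key, str):
                raise TypeError("map keys must be strings")
            items.append((encode(key), encode(item)))
        # Sorting on the encoded key puts shorter keys first, then bytewise.
        items.sort(key=lambda pair: pair[0])
        return _head(5, len(items)) + b"".join(k + v for k, v in items)
    if isinstance(value, Cid):
        # Tag 42 wraps a byte string: 0x00 prefix followed by the binary CID.
        payload = b"\x00" + value.raw
        return _head(6, 42) + _head(2, len(payload)) + payload
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

Because the sort runs on encoded keys, `{"aa": 1, "b": 2}` is written with `b` first regardless of insertion order, which is what makes the output stable enough to hash and sign.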

Current state

The project uses System.Formats.Cbor for basic CAR reading, but doesn't have a proper AT Protocol data model layer that handles deterministic encoding, CID computation, or JSON↔CBOR conversion.
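
As a rough illustration of the missing JSON↔CBOR conversion step, the sketch below maps parsed JSON onto an in-memory model and back. It is Python rather than C#, and `Link`, `from_json`, and `to_json` are hypothetical names, not part of ATProto.NET. It assumes that `$bytes` uses standard base64 with optional padding on input and emits unpadded output, and it does not validate CID syntax.

```python
import base64
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical marker for a CID reference, kept in its string form."""

    cid: str


def from_json(value):
    """Convert parsed JSON into the in-memory model ($link -> Link, $bytes -> bytes)."""
    if isinstance(value, float):
        raise ValueError("floats are not allowed in the data model")
    if isinstance(value, list):
        return [from_json(v) for v in value]
    if isinstance(value, dict):
        keys = set(value)
        if keys == {"$link"}:
            if not isinstance(value["$link"], str):
                raise ValueError("$link must be a string")
            return Link(value["$link"])  # CID syntax is NOT checked in this sketch
        if keys == {"$bytes"}:
            text = value["$bytes"]
            if not isinstance(text, str):
                raise ValueError("$bytes must be a string")
            # Assumption: padding is optional on input, so restore it before decoding.
            padded = text + "=" * (-len(text) % 4)
            return base64.b64decode(padded, validate=True)
        # Ordinary object: keep every field, including the $type discriminator.
        return {k: from_json(v) for k, v in value.items()}
    return value


def to_json(value):
    """Inverse of from_json: Link -> $link object, bytes -> $bytes object."""
    if isinstance(value, Link):
        return {"$link": value.cid}
    if isinstance(value, bytes):
        text = base64.b64encode(value).decode("ascii").rstrip("=")
        return {"$bytes": text}
    if isinstance(value, list):
        return [to_json(v) for v in value]
    if isinstance(value, dict):
        return {k: to_json(v) for k, v in value.items()}
    return value
```

With these two functions, `to_json(from_json(doc))` returns the original document when `$bytes` values are unpadded, and the intermediate model is what a CBOR encoder would consume (a `Link` becoming tag 42, `bytes` becoming a CBOR byte string).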

Why this matters

  • Required for repository commit signing and verification
  • Required for MST implementation
  • Required for firehose data processing/verification
  • Essential for PDS implementation

Implemented in commit bc2c651. Added DagCborEncoder, DagCborDecoder, and CidComputation with full DRISL-CBOR spec compliance: deterministic encoding, CID tag 42, sorted keys, no floats, Base32Lower CID strings.
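
For readers unfamiliar with the CID layout mentioned here, the following is a small Python sketch of CID v1 computation and its Base32Lower string form. It is not the code from the commit: `cid_v1` and `cid_to_string` are hypothetical names, and the sketch relies on every prefix value being below 0x80 so that each varint fits in a single byte.

```python
import base64
import hashlib

DAG_CBOR = 0x71  # codec for data nodes
RAW = 0x55       # codec for blobs
SHA2_256 = 0x12  # multihash code for SHA-256


def cid_v1(data: bytes, codec: int = DAG_CBOR) -> bytes:
    """Build a binary CID v1: version, codec, hash code, digest length, digest."""
    digest = hashlib.sha256(data).digest()
    # All four prefix values are < 0x80, so each varint is exactly one byte.
    return bytes([0x01, codec, SHA2_256, len(digest)]) + digest


def cid_to_string(cid: bytes) -> str:
    """Render a binary CID as multibase base32: 'b' + lowercase, unpadded RFC 4648."""
    body = base64.b32encode(cid).decode("ascii").lower().rstrip("=")
    return "b" + body
```

The fixed four-byte prefix is why string CIDs for data nodes begin with `bafyrei` and those for raw blobs begin with `bafkrei`.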