Research Showcase / Level 1 Unit

VRAXION Byte Embedder

A 16-dimensional byte embedding unit that round-trips every input byte exactly, trains through a nonlinear C19 encoder, and deploys as a baked int8 lookup table with zero runtime compute.

100% lossless roundtrip
288B int4 weights
4.1KB baked LUT
24 C19 neurons
Mirrored tied-weight decode stays exact in both modes: neural float latent and baked LUT latent.
Byte 0x41 / “A” Δ max --
Signed bits
Neural float latent
Baked LUT latent
Current deploy path: float model for training and inspection, baked LUT for zero-compute inference.
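Zero-compute here means the deployed latent is a plain table lookup. A minimal sketch, assuming the baked artifact is a 256 × 16 int8 array (`lut` and `embed` are illustrative names, not the project's API; random values stand in for the baked latents):

```python
import numpy as np

# Assumed shape: 256 rows x 16 dims of int8 is 256 * 16 = 4096 bytes,
# matching the ~4.1KB baked-LUT figure.
rng = np.random.default_rng(0)
lut = rng.integers(-128, 128, size=(256, 16), dtype=np.int8)

def embed(byte: int) -> np.ndarray:
    """Zero-compute inference: one indexed read, no matmul, no activation."""
    return lut[byte]

vec = embed(0x41)  # latent row for "A"
```

The float model and the LUT agree on the latent; the LUT path just trades 4.1KB of storage for skipping the encoder entirely.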
02 / 06 · Architecture

Encoder nonlinear, decoder mirrored, latent exact

The encoder path is nonlinear C19; the decoder is mirrored, tied, and linear. This asymmetry is the winning shape: expressive enough to separate bytes, simple enough to decode them back exactly.

A
8 signed input bits

The byte enters as an 8-dimensional {-1,+1} signal.
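A minimal sketch of that mapping (MSB-first bit order is an assumption, not confirmed by the source):

```python
def byte_to_signed_bits(b: int) -> list[int]:
    # Map each of the 8 bits to -1 or +1, most significant bit first
    # (the unit's actual bit order is an assumption here).
    return [1 if (b >> i) & 1 else -1 for i in range(7, -1, -1)]

byte_to_signed_bits(0x41)  # "A" = 0b01000001 -> [-1, 1, -1, -1, -1, -1, -1, 1]
```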

B
24 C19 hidden units

Int4 weights plus learnable C19 parameters create the nonlinear encoder.

C
16D latent, tied mirror back

The same learned weight geometry is reused in reverse to reconstruct the original byte.

Key idea: keep the intelligence in the encoder, keep the decoder honest.
Input 8 signed bits {-1,+1}
Encoder 24 C19 neurons nonlinear
Latent 16D vector float + LUT deploy
Recovered byte 8 signed bits roundtrip exact
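The pipeline above can be sketched end to end. The C19 activation is the project's own nonlinearity and is not specified here, so `np.sin` is only a stand-in, and random float weights replace the trained int4 ones; the shapes and the tied-mirror structure are the point, not the values:

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.standard_normal((8, 24))   # encoder weights (int4 in the real unit)
W2 = rng.standard_normal((24, 16))  # projection to the 16D latent

def c19(x):
    # Stand-in only: the real C19 activation has learnable parameters.
    return np.sin(x)

def encode(bits):
    # Nonlinear path: 8 signed bits -> 24 C19 units -> 16D latent.
    return c19(bits @ W1) @ W2

def decode(latent):
    # Tied mirror: the same weights reused transposed, no nonlinearity,
    # then threshold back to signed bits. (The trained unit is exact;
    # with random weights this sketch only shows the structure.)
    return np.sign((latent @ W2.T) @ W1.T)
```

Keeping the decoder linear and tied is what makes exactness checkable: there is no second set of weights that could drift away from the encoder.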
640B stored model
4.1KB baked LUT
256/256 exact roundtrip
03 / 06 · Breakthrough

L-BFGS unlocks C19, staged freeze keeps int4 exact

The optimizer story and the quantization story are separate, but they stack: L-BFGS finds the exact float solution, then a 96-step staged int4 freeze preserves it.

Baseline optimizer

Adam

67%

Gets trapped on C19’s oscillating surface and fails to reach the exact roundtrip frontier.

Winning optimizer

L-BFGS

100%

Uses curvature information to converge to the exact byte-embedder solution.
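The contrast can be illustrated on a toy oscillating objective. This is not the real training loss, and SciPy's L-BFGS-B stands in for the actual full-batch L-BFGS run:

```python
import numpy as np
from scipy.optimize import minimize

# Toy rippled surface: first-order steps tend to rattle in the oscillations,
# while L-BFGS's approximate curvature model takes better-scaled steps.
def loss(w):
    return np.sum((np.sin(3.0 * w) + w**2 - 0.25) ** 2)

w0 = np.full(4, 0.9)
res = minimize(loss, w0, method="L-BFGS-B")
```

`res.x` lands at a nearby minimum of the rippled surface with far fewer function evaluations than a fixed-step first-order loop would need.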

1

Train float baseline

Reach 100% roundtrip with float weights and the full nonlinear encoder intact.

2

Freeze next int4 stage

Lock the next staged chunk into int4, following the 96-step freeze schedule throughout.

3

Re-optimize remaining float slack

Fine-tune after each freeze step so the exact roundtrip never drops.
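The three steps above can be skeletonized as follows. All names and the per-chunk scale are assumptions; the `finetune` hook is where step 3's re-optimization would run, and the roundtrip check that gates each freeze is omitted:

```python
import numpy as np

def quantize_int4(w, scale=0.1):
    # Snap to the 16 signed int4 levels [-8, 7], then dequantize.
    # The fixed scale is a placeholder for the artifact's real scheme.
    return np.clip(np.round(w / scale), -8, 7) * scale

def staged_freeze(weights, steps=96, finetune=lambda w, frozen: w):
    # Split the float weights into `steps` chunks; lock one chunk per step
    # into int4, then let `finetune` re-optimize whatever is still float.
    flat = weights.ravel().astype(float).copy()
    frozen = np.zeros(flat.size, dtype=bool)
    for idx in np.array_split(np.arange(flat.size), steps):
        flat[idx] = quantize_int4(flat[idx])  # step 2: freeze this stage
        frozen[idx] = True
        flat = finetune(flat, frozen)         # step 3: restore exact roundtrip
    return flat.reshape(weights.shape), frozen

Wq, frozen = staged_freeze(np.random.default_rng(2).standard_normal((8, 24)))
```

Freezing a small chunk at a time keeps the perturbation per step tiny, so the float slack that remains can always absorb it before the next freeze.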

04 / 06 · Results

What the current artifact delivers

The current byte embedder is exact as a unit, deployable as a LUT, and still meaningfully useful downstream. The chart below mixes current artifact points with smaller historical deploy variants for context.

256/256 lossless roundtrip
40.72% next-byte accuracy
+5.2pp above linear baseline
664 total parameters
640B stored model
96 freeze steps
Precision / size tradeoff · Current artifact + historical context
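The 256/256 figure is an exhaustive check, not a sample: every possible byte must survive encode → decode bit-exactly. A sketch of such a harness, with identity stand-ins replacing the real neural and LUT paths:

```python
def verify_roundtrip(encode, decode) -> int:
    # Count how many of the 256 possible bytes survive the full trip exactly.
    return sum(int(decode(encode(b)) == b) for b in range(256))

# Identity stand-ins trivially score 256/256; the real harness would run both
# the neural float path and the baked LUT path through the tied decoder.
score = verify_roundtrip(lambda b: b, lambda z: z)
```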
05 / 06 · Live demo

Switch between neural float and baked LUT views

This is a genuine dual-path viewer: one mode shows the neural encoder’s float latent, the other the baked LUT latent used for zero-compute deployment.

A 0x41
byte → bits → neural latent → decoded bits
Deployment truth Δ max --
06 / 06 · Inspect

Inspect the learned unit from three angles

One slide, three lenses: raw neuron shapes, signed weight structure, and the resulting geometry in embedding space.

W1 — Encoder (8 × 24)

W2 — Output (24 × 16)

Weight distribution