Performance Benchmarks
Memory & energy efficiency at near-parity accuracy
PNN replaces dense weight matrices with a patented connectivity configuration, cutting parameters 1.7–4.5× while holding accuracy within ~0.4 pp of dense on vision and ~1.1 pp on tabular. Because the pattern is index-free, no per-weight indices are stored, so INT8 models are up to 18× smaller than dense FP32 and stay effectively lossless on vision (at most 0.03 pp below FP32 PNN).
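The 18× figure is consistent with simple byte accounting: a 4.5× parameter cut multiplied by the 4× saving from storing 1-byte INT8 values instead of 4-byte FP32 values. A minimal sketch of that arithmetic follows. It assumes no per-weight index storage and ignores quantization scales and biases; the function name is illustrative and not part of any PNN API.

```python
def int8_size_ratio(param_reduction: float,
                    dense_bytes_per_weight: int = 4,
                    quant_bytes_per_weight: int = 1) -> float:
    """Size ratio of dense FP32 storage to index-free quantized storage.

    Idealization (an assumption, not a measured footprint): every kept
    weight costs exactly quant_bytes_per_weight, and no indices, scales,
    or biases are stored.
    """
    return param_reduction * dense_bytes_per_weight / quant_bytes_per_weight


# Parameter reductions quoted in this section.
for task, reduction in [("vision", 4.5), ("tabular", 2.4), ("NLP-GPT", 1.7)]:
    print(f"{task}: {int8_size_ratio(reduction):.1f}x smaller than dense FP32")
```

For NLP-GPT the rounded 1.7× input gives 6.8×, slightly under the 6.9× reported in the results table; the difference is presumably rounding of the underlying parameter ratio.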
Results
Summary — 5 seeds, iso-architecture
| Task | Metric | Dense | PNN | PNN INT8 | Savings (fewer params · smaller INT8 vs dense FP32) |
|---|---|---|---|---|---|
| MNIST | acc % ↑ | 97.45 | 97.02 | 97.00 | 4.5× · 18× |
| Fashion | acc % ↑ | 88.57 | 88.17 | 88.14 | 4.5× · 18× |
| Tabular* | acc % ↑ | 88.71 | 87.65 | 87.38 | 2.4× · 9.6× |
| NLP-GPT | val loss ↓ | 2.268 | 2.387 | — | 1.7× · 6.9× |
| microGPT | op-bound speed | 1.0× | 1.59× faster | — | scalar / no-BLAS |
↑ higher = better, ↓ lower = better. NLP metric is validation cross-entropy loss. * Tabular = recommended fc2-only config. INT8 lossless on vision.
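The "near-parity" and "INT8 lossless" claims can be checked directly against the accuracy rows of the table. The short sketch below recomputes the gaps in percentage points from the reported means; the dictionary layout and function name are illustrative choices, and only the numbers come from the table.

```python
# Mean accuracies (%) copied from the summary table: (dense, PNN, PNN INT8).
results = {
    "MNIST": (97.45, 97.02, 97.00),
    "Fashion": (88.57, 88.17, 88.14),
    "Tabular": (88.71, 87.65, 87.38),
}


def gaps_pp(dense: float, pnn: float, pnn_int8: float) -> tuple:
    """Return (dense - PNN, PNN - PNN INT8) in percentage points."""
    return round(dense - pnn, 2), round(pnn - pnn_int8, 2)


for task, row in results.items():
    sparsity_gap, quant_gap = gaps_pp(*row)
    print(f"{task}: PNN vs dense {sparsity_gap:.2f} pp, "
          f"INT8 vs PNN {quant_gap:.2f} pp")
```

On these figures the vision gaps to dense are 0.43 and 0.40 pp, the tabular gap is 1.06 pp, and INT8 costs 0.02 to 0.03 pp on vision but 0.27 pp on tabular.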
Parameters
Fewer parameters per model
Accuracy
Near-parity, shown truthfully
Test accuracy %, dense vs PNN side by side, with the y-axis starting at 0 so the small gap is shown honestly. Mean of 5 seeds. NLP-GPT is omitted here because it uses a different metric, validation loss (2.268 vs 2.387).
Memory
Model footprint — dense FP32 vs PNN INT8
Megabytes, log scale. Index-free connectivity means columns are computed, not stored.
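To see why "computed, not stored" matters for footprint, the sketch below compares three storage schemes for one layer: dense FP32, a conventional sparse layout that stores a 32-bit column index per kept weight, and an index-free INT8 layout. Everything here is an assumption made for illustration: the 784-input, 256-output layer size, the 4-byte index width, and in particular `computed_columns`, which is a hypothetical stand-in rule and not the patented PNN pattern.

```python
def dense_fp32_bytes(n_rows: int, n_cols: int) -> int:
    """Dense matrix stored as 4-byte floats."""
    return n_rows * n_cols * 4


def indexed_int8_bytes(nnz: int, index_bytes: int = 4) -> int:
    """One INT8 value plus one stored column index per kept weight
    (row pointers and quantization scales ignored)."""
    return nnz * (1 + index_bytes)


def index_free_int8_bytes(nnz: int) -> int:
    """One INT8 value per kept weight; positions are recomputed on the fly."""
    return nnz


def computed_columns(row: int, k: int, n_cols: int, stride: int = 11) -> list:
    """Hypothetical index-free rule (NOT the PNN pattern): the k columns
    used by a row are derived from the row number, so none are stored.
    Columns are distinct when stride is coprime to n_cols and k <= n_cols."""
    return [(row + j * stride) % n_cols for j in range(k)]


# Hypothetical layer: 256 outputs x 784 inputs, 4.5x fewer parameters.
n_rows, n_cols = 256, 784
nnz = round(n_rows * n_cols / 4.5)
dense = dense_fp32_bytes(n_rows, n_cols)

print(f"index-free INT8: {dense / index_free_int8_bytes(nnz):.1f}x smaller")
print(f"indexed INT8:    {dense / indexed_int8_bytes(nnz):.1f}x smaller")
print("columns for row 780:", computed_columns(780, 3, n_cols))
```

Under these assumptions the index-free layout comes out roughly 18× smaller than dense FP32, while storing a 32-bit index per weight would cut the saving to roughly 3.6×.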
Speed & energy
Where the speed actually shows up
Op-bound speedup
Scalar microGPT (no BLAS): fwd 1.70×, bwd 1.35×. The edge / MCU regime.
CPU inference (measured C)
INT8: dense 22 µs vs PNN 25 µs, so PNN is marginally slower here. Element-level sparsity gives no win vs tuned dense GEMM.
PNN chip (est)
~1.3 µs / inference, ~10–25× better energy than a matched NVIDIA part on a deployed fixed model.
Honest regime map
Where PNN wins — and where it doesn't
Wins
- Memory: always smaller, with up to 4.5× fewer params and up to 18× smaller INT8.
- Accuracy: near-parity on vision (~0.4 pp), graceful on tabular (≤1.1 pp).
- Op-bound speed: 1.6× on scalar / no-BLAS edge.
- Custom silicon: ~1.3 µs, ~12 nJ; ~10–25× better energy than NVIDIA (est).
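The custom-silicon estimates can be cross-checked with basic unit arithmetic, since average power is energy divided by time. The sketch below assumes that the ~12 nJ figure is energy per inference and that it pairs with the ~1.3 µs latency; both inputs are the engineering estimates quoted above, not measurements, and the function name is illustrative.

```python
def average_power_watts(energy_joules: float, latency_seconds: float) -> float:
    """Average power drawn over one inference: P = E / t."""
    return energy_joules / latency_seconds


energy_j = 12e-9     # ~12 nJ, assumed to be per inference (estimate)
latency_s = 1.3e-6   # ~1.3 us per inference (estimate)

power_mw = average_power_watts(energy_j, latency_s) * 1e3

# Energy per inference implied for the comparison part by the 10-25x claim.
baseline_nj = [factor * energy_j * 1e9 for factor in (10, 25)]

print(f"implied average power: {power_mw:.1f} mW")
print(f"implied baseline energy: {baseline_nj[0]:.0f}-{baseline_nj[1]:.0f} nJ per inference")
```

Taken at face value, the two estimates imply an average draw of about 9 mW during inference and a comparison part spending roughly 120 to 300 nJ per inference.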
Limitations
- No CPU/GPU speed win vs tuned dense GEMM, because the prime gather is SIMD-hostile.
- INT8 not universally free: lossless on vision, harmful if input layers are over-sparsified.
- NLP gain modest (1.7×, FFN-only); attention/embeddings stay dense.
- Hardware numbers are engineering estimates; measured CPU inference is roughly at parity (dense 22 µs vs PNN 25 µs).