About

About the WNN Language Model research project.
Author

Luiz Alberto Crispiniano Garcia

Published

November 11, 2025

What is this project?

This is an open research project investigating Weightless Neural Networks (WNNs) as an alternative to weighted neural networks for language modeling. The core question: can RAM-based lookup neurons — with O(1) inference and no floating-point weights — learn to predict language?

Origin

The project started as an LLM prompt optimizer (hence the repository name), using meta-heuristic algorithms such as genetic algorithms (GA), tabu search (TS), and simulated annealing (SA) with a WNN surrogate model. During early architecture work, the WNN language model itself became the primary research focus.

Approach

Rather than training weights via gradient descent, the model learns through four mechanisms:

  1. Binary encoding: Tokens are encoded as binary vectors. Each neuron observes a random subset of input bits (partial connectivity).
  2. Memory-based learning: Neurons store input-output mappings in ternary memory (TRUE/FALSE/EMPTY). New patterns are written directly — no gradients needed.
  3. Connectivity optimization: GA and TS search over which input bits each neuron observes. This is the generalization mechanism — neurons that see the right features generalize to unseen inputs.
  4. Architecture search: Per-cluster neurons and bits are optimized, creating asymmetric architectures where frequent tokens get more resources.
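Steps 1 and 2 can be sketched as a single RAM neuron with random partial connectivity and a ternary memory. This is a minimal illustration, not the repository's implementation: the class name `RAMNeuron` and the dictionary-backed memory are assumptions.

```python
import random

EMPTY, FALSE, TRUE = 0, 1, 2  # ternary memory cell states

class RAMNeuron:
    """Illustrative RAM-based lookup neuron (not the project's actual code)."""

    def __init__(self, input_bits, observed_bits, rng):
        # Partial connectivity: each neuron observes a random subset of input bits.
        self.connections = rng.sample(range(input_bits), observed_bits)
        self.memory = {}  # address -> TRUE/FALSE; absent addresses are EMPTY

    def address(self, x):
        # Pack the observed bits into an integer memory address.
        addr = 0
        for bit in self.connections:
            addr = (addr << 1) | x[bit]
        return addr

    def train(self, x, label):
        # Memory-based learning: write the pattern directly, no gradients.
        self.memory[self.address(x)] = TRUE if label else FALSE

    def predict(self, x):
        # O(1) inference: a single dictionary lookup.
        return self.memory.get(self.address(x), EMPTY)

rng = random.Random(0)
neuron = RAMNeuron(input_bits=16, observed_bits=4, rng=rng)
ctx = [1, 0] * 8
neuron.train(ctx, label=True)
```

Connectivity optimization (step 3) then amounts to searching over the `connections` lists across all neurons, which is where GA and TS come in.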

Key Architectures

Tiered RAMLM (50K+ clusters, one per vocabulary token):

  • Tokens grouped into frequency tiers with different neuron/bit allocations
  • Asymmetric configs (e.g., 100 tokens @ 20 bits, 400 @ 12, rest @ 8) outperform a uniform allocation by 35%
  • Limited by sparse training data for rare tokens
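The asymmetric allocation above can be expressed as a simple rank-to-bits mapping. The tier cutoffs here mirror the example config (100 @ 20, 400 @ 12, rest @ 8); the function name `tier_bits` is illustrative, not from the repository. Note that each neuron's memory grows as 2^bits, which is why only the most frequent tokens get wide addresses.

```python
def tier_bits(rank, tiers=((100, 20), (400, 12)), default_bits=8):
    """Map a token's frequency rank to its per-neuron address width (bits).

    Tier sizes and bit counts follow the example config above; this is
    a sketch of the allocation scheme, not the project's actual code.
    """
    cutoff = 0
    for size, bits in tiers:
        cutoff += size
        if rank < cutoff:
            return bits
    return default_bits  # long tail: smallest memories
```

Architecture search (step 4 above) then optimizes these tier boundaries and bit counts per cluster.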

BitwiseRAMLM (16 clusters, one per output bit):

  • Predicts P(bit_i=1 | context) for each of 16 bits
  • Token probabilities reconstructed via log-product over bits
  • Every neuron sees ALL training examples (vs ~20 per rare token in tiered)
  • Currently the best-performing architecture
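The log-product reconstruction can be sketched as follows, assuming each of the 16 bit clusters has already produced a probability P(bit_i=1 | context). The function name `token_log_prob` and the little-endian bit order are assumptions for illustration.

```python
import math

def token_log_prob(token_id, bit_probs, n_bits=16):
    """Reconstruct log P(token | context) from per-bit predictions.

    Sums the log-probability of each bit matching the token's binary
    code: P(bit_i = b_i) is bit_probs[i] when b_i = 1, else 1 - bit_probs[i].
    Illustrative sketch, not the project's actual implementation.
    """
    logp = 0.0
    for i in range(n_bits):
        bit = (token_id >> i) & 1
        p = bit_probs[i] if bit else 1.0 - bit_probs[i]
        logp += math.log(max(p, 1e-12))  # clamp to avoid log(0)
    return logp
```

Summing in log space avoids underflow from multiplying 16 small probabilities; an argmax or softmax over all token IDs then yields the prediction.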

Hardware

Mac Studio M4 Max (2025): 16 CPU cores, 40 Metal GPU cores, 64 GB unified memory.

Custom Rust+Metal accelerator provides ~800x speedup over pure Python, enabling population-based optimization with 50+ candidate architectures evaluated in parallel.

Tools

  • Dashboard: SvelteKit web app for experiment management, real-time monitoring via WebSocket
  • iOS app: SwiftUI companion for monitoring experiments on iPad/iPhone
  • Rust accelerator: PyO3-based Python extension with Metal GPU compute shaders

Status

Phase: Architecture search and optimization
Repository: github.com/lacg/wnn
License: CC BY-NC-SA 4.0 (prose), MIT (code)
Contact: lacg@me.com

How to cite

@software{wnn_lm,
  author  = {Crispiniano Garcia, Luiz Alberto},
  title   = {Weightless Neural Networks for Language Modeling},
  year    = {2025},
  url     = {https://github.com/lacg/wnn},
  license = {CC BY-NC-SA 4.0}
}