About
What is this project?
This is an open research project investigating Weightless Neural Networks (WNNs) as an alternative to weighted neural networks for language modeling. The core question: can RAM-based lookup neurons — with O(1) inference and no floating-point weights — learn to predict language?
Origin
The project started as an LLM prompt optimizer (hence the repository name), using genetic algorithms (GA), tabu search (TS), and simulated annealing (SA) as meta-heuristics, with a WNN surrogate model. During early architecture work, the WNN language model itself became the primary research focus.
Approach
Rather than training weights via gradient descent, the approach relies on four mechanisms:
- Binary encoding: Tokens are encoded as binary vectors. Each neuron observes a random subset of input bits (partial connectivity).
- Memory-based learning: Neurons store input-output mappings in ternary memory (TRUE/FALSE/EMPTY). New patterns are written directly — no gradients needed.
- Connectivity optimization: GA and TS search over which input bits each neuron observes. This is the generalization mechanism — neurons that see the right features generalize to unseen inputs.
- Architecture search: Per-cluster neurons and bits are optimized, creating asymmetric architectures where frequent tokens get more resources.
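The first two mechanisms can be sketched as a single RAM-based neuron. This is a minimal illustration, not the project's actual implementation: the class and constant names (`RAMNeuron`, `EMPTY`/`FALSE`/`TRUE`) are hypothetical, but the behavior follows the description above, with a random subset of observed input bits and direct writes into ternary memory, no gradients involved.

```python
import random

# Ternary memory cell states (hypothetical names).
EMPTY, FALSE, TRUE = 0, 1, 2

class RAMNeuron:
    """One RAM-based lookup neuron with partial connectivity."""

    def __init__(self, input_bits: int, observed: int, seed: int = 0):
        rng = random.Random(seed)
        # Partial connectivity: observe a random subset of input bits.
        self.connections = rng.sample(range(input_bits), observed)
        # Ternary memory: one cell per addressable pattern, initially EMPTY.
        self.memory = [EMPTY] * (1 << observed)

    def address(self, bits: list[int]) -> int:
        # Concatenate the observed bits into a memory address.
        addr = 0
        for b in self.connections:
            addr = (addr << 1) | bits[b]
        return addr

    def train(self, bits: list[int], label: bool) -> None:
        # Memory-based learning: write the mapping directly.
        self.memory[self.address(bits)] = TRUE if label else FALSE

    def predict(self, bits: list[int]) -> int:
        # O(1) inference: a single table lookup.
        return self.memory[self.address(bits)]
```

Connectivity optimization then amounts to searching over `self.connections` for each neuron, which is what the GA/TS layer does.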
Key Architectures
Tiered RAMLM (50K+ clusters, one per vocabulary token):
- Tokens grouped into frequency tiers with different neuron/bit allocations
- Asymmetric configs (e.g., 100 tokens @ 20 bits, 400 @ 12, rest @ 8) outperform uniform allocation by 35%
- Limited by sparse training data for rare tokens
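The asymmetric allocation above can be expressed as a simple per-token config. A sketch under stated assumptions: the function name and return shape are hypothetical, and only the tier numbers (100 @ 20 bits, 400 @ 12, rest @ 8) come from the text; it assumes tokens are ordered by descending frequency.

```python
def tier_config(vocab_size: int) -> dict[int, int]:
    """Map token id -> address bits per neuron, assuming ids are
    sorted by descending frequency (hypothetical helper)."""
    tiers = [(100, 20), (400, 12), (vocab_size - 500, 8)]
    config = {}
    token = 0
    for count, bits in tiers:
        for _ in range(count):
            config[token] = bits
            token += 1
    return config
```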
BitwiseRAMLM (16 clusters, one per output bit):
- Predicts P(bit_i=1 | context) for each of 16 bits
- Token probabilities reconstructed via log-product over bits
- Every neuron sees ALL training examples (vs ~20 per rare token in tiered)
- Currently the best-performing architecture
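The log-product reconstruction can be sketched as follows. This is an illustrative reading of the description above, not the project's code: given the 16 per-bit probabilities P(bit_i = 1 | context), the log-probability of a token is the sum of the log-probabilities of its individual bits (the function name is hypothetical, and the small clamp against log(0) is an added assumption).

```python
import math

def token_log_prob(token_id: int, bit_probs: list[float]) -> float:
    """Sum log P(bit_i = token's i-th bit) over all predicted bits."""
    logp = 0.0
    for i, p1 in enumerate(bit_probs):
        bit = (token_id >> i) & 1
        p = p1 if bit else 1.0 - p1
        # Clamp to avoid log(0) when a bit predictor is saturated.
        logp += math.log(max(p, 1e-12))
    return logp
```

Ranking tokens by this score recovers a full distribution over the vocabulary from just 16 bit-level clusters, which is why every neuron sees every training example.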
Hardware
Mac Studio M4 Max (2025): 16 CPU cores, 40 Metal GPU cores, 64 GB unified memory.
Custom Rust+Metal accelerator provides ~800x speedup over pure Python, enabling population-based optimization with 50+ candidate architectures evaluated in parallel.
Tools
- Dashboard: SvelteKit web app for experiment management, real-time monitoring via WebSocket
- iOS app: SwiftUI companion for monitoring experiments on iPad/iPhone
- Rust accelerator: PyO3-based Python extension with Metal GPU compute shaders
Status
| Phase | Architecture search and optimization |
| Repository | github.com/lacg/wnn |
| License | CC BY-NC-SA 4.0 (prose), MIT (code) |
| Contact | lacg@me.com |
How to cite
@software{wnn_lm,
  author  = {Crispiniano Garcia, Luiz Alberto},
  title   = {Weightless Neural Networks for Language Modeling},
  year    = {2025},
  url     = {https://github.com/lacg/wnn},
  license = {CC BY-NC-SA 4.0}
}