Weightless Neural Networks for Language Modeling
This research explores whether Weightless Neural Networks (WNNs), specifically RAM-based neurons, can serve as a foundation for language modeling, a task currently dominated by weighted transformer architectures.
Current Focus
Building a complete RAM-based language model pipeline:
- BitwiseRAMLM: 16-cluster per-bit output architecture with log-product token reconstruction
- Connectivity optimization: Genetic Algorithm (Holland 1975; Goldberg 1989) + Tabu Search (Glover 1989, 1990) over neuron connectivity patterns
- Rust+Metal acceleration: Full pipeline on Apple Silicon (16 CPU + 40 GPU cores)
- Gating mechanisms: Content-based filtering inspired by DeepSeek’s Engram (DeepSeek-AI and Wenfeng 2026) architecture
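The per-bit output scheme above can be sketched as follows. This is an illustrative assumption of how "log-product token reconstruction" might work, not the repository's actual API: each of the 16 bit-clusters emits a probability that one bit of the token id is 1, and a token's score is the sum of its per-bit log-probabilities (equivalently, the log of the product of the bit probabilities).

```rust
// Hypothetical sketch: 16 bit-clusters each output p_b = P(bit b = 1)
// for one bit of the 16-bit token id; a token's score is the sum of
// per-bit log-probabilities. Function names are illustrative.
fn token_log_score(bit_probs: &[f64; 16], token_id: u32) -> f64 {
    let eps = 1e-12; // guard against log(0)
    (0..16)
        .map(|b| {
            let p = bit_probs[b].clamp(eps, 1.0 - eps);
            // Use p if this bit of the token id is 1, else 1 - p.
            if (token_id >> b) & 1 == 1 { p.ln() } else { (1.0 - p).ln() }
        })
        .sum()
}

// Reconstruct a token by picking the id with the highest log-product score.
fn argmax_token(bit_probs: &[f64; 16], vocab_size: u32) -> u32 {
    (0..vocab_size)
        .max_by(|&a, &b| {
            token_log_score(bit_probs, a)
                .partial_cmp(&token_log_score(bit_probs, b))
                .unwrap()
        })
        .unwrap()
}

fn main() {
    // If every bit-cluster strongly favors 1, the best id is all ones.
    let probs = [0.9_f64; 16];
    let best = argmax_token(&probs, 1 << 16);
    println!("{}", best); // 65535
}
```

With a 50,257-token vocabulary, 16 bits are sufficient (2^16 = 65,536), which is why 16 clusters cover the full GPT-2 token space.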
Key Results
All models are evaluated on WikiText-2 with the GPT-2 tokenizer (50,257-token vocabulary). CE = cross-entropy in nats, PPL = perplexity (exp of CE), Acc = top-1 next-token accuracy.
| Architecture | CE | PPL | Acc | Details |
|---|---|---|---|---|
| Random baseline | 10.82 | 50,257 | 0.002% | Uniform prediction (methodology) |
| Tiered RAMLM (50K clusters) | ~10.20 | ~27,000 | ~4.9% | 5-tier, EMPTY=0.0 (results) |
| BitwiseRAMLM (16 clusters) | ~9.11 | ~9,000 | ~6.4% | Per-bit prediction (results) |
| *Target: GPT-2 (Radford et al. 2019)* | | | | |
| GPT-2 Small (124M) | 3.38 | 29.41 | – | Zero-shot |
| GPT-2 Medium (355M) | 3.12 | 22.76 | – | Zero-shot |
| GPT-2 Large (774M) | 2.99 | 19.93 | – | Zero-shot |
| GPT-2 XL (1.5B) | 2.91 | 18.34 | – | Zero-shot |
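The CE and PPL columns are related by PPL = exp(CE) when cross-entropy is measured in nats, which the random-baseline row confirms: ln(50,257) ≈ 10.82. A minimal sanity check of that relation against the table's figures:

```rust
// Verify PPL = exp(CE) against the table: the uniform baseline's CE is
// ln(vocab_size), and its PPL is the vocabulary size itself.
fn main() {
    let vocab = 50257.0_f64;
    let uniform_ce = vocab.ln();
    println!("{:.2}", uniform_ce);       // 10.82 (random-baseline CE)
    println!("{:.0}", uniform_ce.exp()); // 50257 (random-baseline PPL)
    // GPT-2 Small: exp(3.38) ≈ 29.4, consistent with its reported PPL.
    println!("{:.1}", 3.38_f64.exp());
}
```

The small gap between exp(3.38) and the reported 29.41 is rounding in the two-decimal CE figure.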
Weekly Updates
Follow the research progress in the Blog.
Repository
All code is at github.com/lacg/wnn.
References
DeepSeek-AI, and Liang Wenfeng. 2026. “Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models.” https://arxiv.org/abs/2601.07372.
Glover, F. 1989. “Tabu Search – Part I.” ORSA Journal on Computing 1 (3): 190–206. https://doi.org/10.1287/ijoc.1.3.190.
———. 1990. “Tabu Search – Part II.” ORSA Journal on Computing 2 (1): 4–32. https://doi.org/10.1287/ijoc.2.1.4.
Goldberg, David E. 1989. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley. https://archive.org/details/geneticalgorithm0000gold.
Holland, John H. 1975. Adaptation in Natural and Artificial Systems. University of Michigan Press. https://archive.org/details/adaptationinnatu0000holl.
Radford, Alec, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. “Language Models Are Unsupervised Multitask Learners.” OpenAI. https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf.