Author: Luiz Alberto Crispiniano Garcia

Published: November 11, 2025

Weightless Neural Networks for Language Modeling

This research explores whether Weightless Neural Networks (WNNs), specifically RAM-based neurons, can serve as a foundation for language modeling, a task traditionally dominated by weighted transformer architectures.
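To make the core idea concrete, here is a minimal sketch of a RAM-based neuron: a lookup table addressed by a tuple of input bits, with no weights and no gradient training. The class name, method names, and the `empty` default are illustrative assumptions, not the project's actual implementation.

```python
class RAMNeuron:
    """Illustrative RAM node: a lookup table addressed by input bits."""

    def __init__(self, n_inputs: int):
        self.n_inputs = n_inputs
        self.memory = {}  # address (tuple of bits) -> stored value

    def train(self, bits, value=1):
        # "Training" is just writing a value at the addressed cell.
        assert len(bits) == self.n_inputs
        self.memory[tuple(bits)] = value

    def recall(self, bits, empty=0):
        # Unseen addresses fall back to a default (an "EMPTY" value).
        return self.memory.get(tuple(bits), empty)


neuron = RAMNeuron(4)
neuron.train([1, 0, 1, 1])
print(neuron.recall([1, 0, 1, 1]))  # 1 (seen during training)
print(neuron.recall([0, 0, 0, 0]))  # 0 (never seen; EMPTY default)
```

Learning is a memory write and inference is a memory read, which is what makes these networks "weightless".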

Current Focus

Building a complete RAM-based language model pipeline.

Key Results

All models are evaluated on WikiText-2 with the GPT-2 tokenizer (50,257-token vocabulary). CE = cross-entropy (nats), PPL = perplexity, Acc = top-1 accuracy.

| Architecture | CE | PPL | Acc | Details |
|---|---|---|---|---|
| Random baseline | 10.82 | 50,257 | 0.002% | Uniform prediction (methodology) |
| Tiered RAMLM (50K clusters) | ~10.20 | ~27,000 | ~4.9% | 5-tier, EMPTY=0.0 (results) |
| BitwiseRAMLM (16 clusters) | ~9.11 | ~9,000 | ~6.4% | Per-bit prediction (results) |
| Target: GPT-2 (Radford et al. 2019) | | | | |
| GPT-2 Small (124M) | 3.38 | 29.41 | — | Zero-shot |
| GPT-2 Medium (355M) | 3.12 | 22.76 | — | Zero-shot |
| GPT-2 Large (774M) | 2.99 | 19.93 | — | Zero-shot |
| GPT-2 XL (1.5B) | 2.91 | 18.34 | — | Zero-shot |
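The baseline row can be derived directly: a uniform prediction over a vocabulary of size V has cross-entropy ln(V) nats, perplexity exp(CE) = V, and top-1 accuracy 1/V. A quick check that these relationships reproduce the table's baseline numbers:

```python
import math

V = 50_257  # GPT-2 vocabulary size

ce = math.log(V)     # cross-entropy of a uniform prediction, in nats
ppl = math.exp(ce)   # perplexity = exp(cross-entropy)
acc = 1 / V          # chance of guessing the right token uniformly

print(f"CE  = {ce:.2f}")    # 10.82
print(f"PPL = {ppl:,.0f}")  # 50,257
print(f"Acc = {acc:.3%}")   # 0.002%
```

The same exp(CE) relationship links the CE and PPL columns for every model row.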

Weekly Updates

Follow the research progress in the Blog.

Repository

All code is at github.com/lacg/wnn.

References

DeepSeek-AI, and Liang Wenfeng. 2026. “Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models.” https://arxiv.org/abs/2601.07372.
Glover, F. 1989. “Tabu Search – Part I.” ORSA Journal on Computing 1 (3): 190–206. https://doi.org/10.1287/ijoc.1.3.190.
———. 1990. “Tabu Search – Part II.” ORSA Journal on Computing 2 (1): 4–32. https://doi.org/10.1287/ijoc.2.1.4.
Goldberg, David E. 1989. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley. https://archive.org/details/geneticalgorithm0000gold.
Holland, John H. 1975. Adaptation in Natural and Artificial Systems. University of Michigan Press. https://archive.org/details/adaptationinnatu0000holl.
Radford, Alec, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. “Language Models Are Unsupervised Multitask Learners.” OpenAI. https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf.
