Rahul Chalamala – Research

LoLCATs: On Low-Rank Linearizing of Large Language Models

Michael Zhang, Simran Arora, Rahul Chalamala, Benjamin Spector, Alan Wu, Krithik Ramesh, Aaryan Singhal, Christopher Ré

ICLR, 2025

LoLCATs (Low-rank Linear Conversion via Attention Transfer) is a method for creating efficient subquadratic LLMs from existing Transformers. It replaces softmax attention with linear attention and fine-tunes the model with low-rank adaptation, enabling linear-time, constant-memory generation in open-source LLMs.

arXiv | together blog | hazy blog 1 | hazy blog 2 | code
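The linear-time, constant-memory generation mentioned above can be illustrated with a minimal NumPy sketch of causal linear attention. This is not the LoLCATs implementation: the `elu(x) + 1` feature map and all function names here are assumptions chosen for illustration. The sketch only shows why replacing softmax with a feature-map product lets each new token be processed with a fixed-size running state.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1, a simple positive feature map.
    # Assumption for illustration only; not the feature map used by LoLCATs.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention_recurrent(q, k, v):
    """Causal linear attention computed token by token with a fixed-size state."""
    n, d = q.shape
    d_v = v.shape[1]
    state = np.zeros((d, d_v))  # running sum of outer(phi(k_t), v_t)
    norm = np.zeros(d)          # running sum of phi(k_t)
    out = np.zeros((n, d_v))
    for t in range(n):
        phi_q, phi_k = feature_map(q[t]), feature_map(k[t])
        state += np.outer(phi_k, v[t])
        norm += phi_k
        out[t] = (phi_q @ state) / (phi_q @ norm)
    return out

def linear_attention_quadratic(q, k, v):
    """Same computation written as a masked n-by-n attention matrix."""
    scores = np.tril(feature_map(q) @ feature_map(k).T)  # causal mask
    return (scores / scores.sum(axis=1, keepdims=True)) @ v

rng = np.random.default_rng(0)
q, k, v = rng.standard_normal((3, 6, 4))
same = np.allclose(linear_attention_recurrent(q, k, v),
                   linear_attention_quadratic(q, k, v))
```

The recurrent form keeps only `state` and `norm`, whose sizes do not depend on sequence length, while the quadratic form materializes a full attention matrix; the two give the same outputs.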

RedPajama: an Open Dataset for Training Large Language Models

Maurice Weber, Daniel Y. Fu, Quentin Anthony, Yonatan Oren, Shane Adams, Anton Alexandrov, Xiaozhong Lyu, Huu Nguyen, Xiaozhe Yao, Virginia Adams, Ben Athiwaratkun, Rahul Chalamala, Kezhen Chen, Max Ryabinin, Tri Dao, Percy Liang, Christopher Ré, Irina Rish, Ce Zhang

NeurIPS, 2024
- Spotlight Presentation

RedPajama-V2 is an open dataset for training large language models. It includes over 100B text documents drawn from 84 CommonCrawl snapshots and processed with the CCNet pipeline. Of these, 30B documents additionally come with quality signals, and 20B are deduplicated.

arXiv | together blog | code

LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Kaiyu Yang, Aidan Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar

NeurIPS, 2023
- Oral Presentation

LeanDojo is an open-source Lean playground that provides tools, data, models, and benchmarks for LLM-based theorem proving. It features ReProver, a retrieval-augmented prover, and a comprehensive benchmark of 98,734 theorems to facilitate research and development.

arXiv | project | code

Spectrum Safety: Compatibility of NTS-3 Signals with GNSS Signals

Rahul Chalamala, Joanna Hinks

ION Joint Navigation Conference, 2022
- Oral Presentation

The interference of NTS-3 signals with GNSS signals was assessed using a Python-based framework developed to evaluate Spectral Separation Coefficients under ITU-R guidelines. Preliminary findings show minimal interference, and several strategies are proposed to mitigate potential issues.

abstract
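For readers unfamiliar with the metric, a Spectral Separation Coefficient is commonly defined as the integral, over the receiver bandwidth, of the product of two normalized power spectral densities. The sketch below is a generic illustration under stated assumptions, not the framework from the paper: the BPSK-R signal shapes, the 24 MHz bandwidth, and the function names are all hypothetical choices, and normalization conventions vary between references.

```python
import numpy as np

def bpsk_psd(f_hz, chip_rate_hz):
    # PSD shape of a rectangular-chip BPSK signal: sinc^2(f / fc) / fc.
    # np.sinc(x) is sin(pi x) / (pi x).
    return np.sinc(f_hz / chip_rate_hz) ** 2 / chip_rate_hz

def ssc_db(psd_a, psd_b, f_hz):
    """Spectral separation coefficient in dB/Hz on a uniform frequency grid.

    Both PSDs are normalized to unit power over the grid (one common
    convention; assumed here), then their product is integrated.
    """
    df = f_hz[1] - f_hz[0]
    a = psd_a / (psd_a.sum() * df)
    b = psd_b / (psd_b.sum() * df)
    return 10.0 * np.log10((a * b).sum() * df)

# Hypothetical setup: 24 MHz bandwidth, chip rates of 1.023 and 10.23 MHz.
f = np.linspace(-12e6, 12e6, 240_001)
slow = bpsk_psd(f, 1.023e6)
fast = bpsk_psd(f, 10.23e6)

self_ssc = ssc_db(slow, slow, f)   # a signal against itself
cross_ssc = ssc_db(slow, fast, f)  # two signals with different spectra
```

A lower (more negative) coefficient indicates less spectral overlap, so `cross_ssc` comes out below `self_ssc` in this setup.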