Orion Weller
I’m a fourth-year PhD student at the Center for Language and Speech Processing at Johns Hopkins University, advised by Benjamin Van Durme and Dawn Lawrie. I am broadly interested in natural language processing (NLP), information retrieval (IR), and machine learning (ML). My research is graciously supported by an NSF Graduate Research Fellowship.
My current research interests sit at the intersection of NLP and IR, where I work to improve how models find, understand, and generate information. These days my research falls into three main categories, although I can get distracted by other LLM-based topics:
- Retrieval models: figuring out how to evaluate them comprehensively and giving them new capabilities, such as creating instructable/prompted retrievers
- Retrieval-Augmented Generation (RAG): working towards better RAG evaluations and improving RAG performance (often through better retrieval)
- Language model pre-training data: understanding why LMs act the way they do, curating corpora for pre-training, and using pre-training information to help LM generation
Previously, I graduated from Brigham Young University with a Bachelor’s degree in computer science and statistics, where I was advised by Kevin Seppi and Quinn Snell.
In the past, I’ve spent time interning with many excellent mentors: at Samaya AI in 2024 with Jack Hessel, Ashwin Paranjape, and Yuhao Zhang, at Semantic Scholar/AI2 working with Luca Soldaini, Kyle Lo, and Arman Cohan in 2023, at Apple AI/ML with Matthias Sperber in 2020 and 2021, and at AllenNLP/AI2 with Matt Gardner and Matthew Peters in 2020.
If you’d like to get in touch, please email me at {last_name}{first_name}@gmail.com.
news
| Date | News |
| --- | --- |
| Sep 2024 | Dated Data: Tracing Knowledge Cutoffs in Large Language Models was accepted as a CoLM oral (top 10%), and a new preprint introduces the first promptable dense retrieval models! |
| Apr 2024 | Two new preprints and one paper accepted at SIGIR 2024: On the Evaluation of Machine-Generated Reports at SIGIR, plus new preprints on evaluating and using instructions in IR and on exploring LLM training data knowledge cutoffs and duplicates. |