SciPy 2024

Madhav Kashyap

Madhav Kashyap is a graduate student at the University of Washington majoring in Computational Linguistics and Natural Language Processing. As a Graduate Research Assistant at the UW Scientific Software Engineering Center (eScience Institute), he has recently been developing open-source software that oceanographers use to measure seafloor tectonic shifts to the centimeter level. His thesis focuses on system optimizations for faster information retrieval in large language model workflows. As a Backend Software Engineer at Akamai, he has industry experience building robust Python and Go systems that power enterprise cybersecurity.

Sessions

07-09
13:30
240min
Generative AI Copilot for Scientific Software – a RAG-Based Approach using OLMo
Don Setiawan, Anshul Tambay, Cordero Core, Niki Burggraf, Anant Mittal, Vani Mandava, Ishika Khandelwal, Anuj Sinha, Madhav Kashyap

Generative AI systems built upon large language models (LLMs) have shown great promise as tools that enable people to access information through natural conversation. Scientists can build on these breakthroughs to create advanced tools that help accelerate their research. This tutorial will cover: (1) the basics of language models, (2) setting up an environment for using open-source LLMs without the expensive compute resources needed for training or fine-tuning, (3) using a technique such as Retrieval-Augmented Generation (RAG) to improve the output of an LLM, and (4) building a “production-ready” app to demonstrate how researchers could turn disparate knowledge bases into special-purpose AI-powered tools. The tutorial is aimed at scientists and research engineers who want to use LLMs in their work.

Tutorials
Ballroom D