SciPy 2024

LlamaBot: a Pythonic interface to Large Language Models
07-12, 14:35–15:05 (US/Pacific), Ballroom

In this talk, I will present LlamaBot, a Pythonic and modular set of components for building command-line and backend tools that leverage large language models (LLMs). During the talk, I will showcase the core design philosophy, internal architecture, and dependencies, and give a live demo of command-line applications built using LlamaBot that use both open source and API-access-only LLMs. Finally, I will conclude with a roadmap for LlamaBot development and an invitation to contribute and help shape its direction during the Sprints.


LlamaBot is a Pythonic suite of components for building backend and command-line applications. It is built on top of open source tools such as LiteLLM (https://github.com/BerriAI/litellm), Ollama (https://github.com/ollama/ollama), and ChromaDB (https://github.com/chroma-core/chroma).

LlamaBot's core design philosophy follows PyTorch's patterns: objects as parameterizable (and reusable) functions. It is centered on SimpleBot, which we conceive of as a function programmable by natural language that accepts a string and returns a string. Using LlamaBot, one can quickly build natural language-steerable bots that perform routine text-based tasks. During the talk, we will explain how the underlying architecture of LlamaBot helps enable the rapid development of such bots.
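
To make the pattern concrete, here is a minimal sketch of a SimpleBot used as a natural-language-programmed function (the exact constructor arguments may differ from the released API; the system-prompt-plus-call pattern follows the description above):

    from llamabot import SimpleBot

    # The system prompt "parameterizes" the bot, much as weights parameterize
    # a PyTorch module; each call then maps an input string to an output string.
    summarizer = SimpleBot(
        "You are a concise technical writer. "
        "Summarize any text you are given in two sentences."
    )

    # String in, string out.
    summary = summarizer("LlamaBot is a Pythonic suite of components ...")
    print(summary)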

The LlamaBot repository includes a suite of example command-line tools and bots. Of these, we will showcase three:

  1. llamabot zotero chat, which allows us to chat with our Zotero library at the command line,
  2. llamabot git commit, which automatically writes commit messages based on git commit diffs, and
  3. llamabot git compose-release-notes, which automatically writes release notes based on the collection of commit messages between two git tags.

Thanks to Ollama and LiteLLM, bots developed using LlamaBot can access both LLMs hosted on one's local servers (e.g. Mistral) and API-gated LLMs (e.g. GPT-4). Additionally, thanks to ChromaDB, it is easy to bootstrap simple retrieval augmented generation (RAG) applications. During the talk, we will briefly explain the design decisions behind choosing these open source tools under the hood.
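
As a rough sketch of what this flexibility looks like in user code (the model_name keyword and the model identifiers below are illustrative assumptions, following LiteLLM's "ollama/<model>" naming convention), the same bot definition can target either a locally hosted or an API-gated model:

    from llamabot import SimpleBot

    system_prompt = "You write one-line summaries of git diffs."

    # Locally hosted model served by Ollama (assumes an Ollama server is running).
    local_bot = SimpleBot(system_prompt, model_name="ollama/mistral")

    # API-gated model (assumes the provider's API key is set in the environment).
    hosted_bot = SimpleBot(system_prompt, model_name="gpt-4")

    diff = "diff --git a/app.py b/app.py ..."
    print(local_bot(diff))
    print(hosted_bot(diff))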

LlamaBot has been in active development since mid-2023. As an open source project, we welcome contributions of all kinds, following the all-contributors specification. LlamaBot will be a sprint project during the SciPy Sprints, and we welcome new contributors.

As Principal Data Scientist at Moderna, Eric leads the Data Science and Artificial Intelligence (Research) team to accelerate science to the speed of thought. Prior to Moderna, he was at the Novartis Institutes for Biomedical Research, conducting biomedical data science research with a focus on using Bayesian statistical methods in the service of discovering medicines for patients. Before Novartis, he was an Insight Health Data Fellow in the summer of 2017 and defended his doctoral thesis in the Department of Biological Engineering at MIT in the spring of 2017.

Eric is also an open-source software developer and has led the development of pyjanitor, a clean API for cleaning data in Python, and nxviz, a visualization package for NetworkX. He serves on the core developer teams of NetworkX and PyMC. In addition, he gives back to the community through code contributions, blogging, teaching, and writing.

His personal life motto is found in the Gospel of Luke 12:48.
