07-14, 13:15–13:45 (America/Chicago), Grand Salon C
Allegro and FLARE are two very different packages for constructing machine learning potentials that are fast, accurate, and suitable for extreme-scale molecular dynamics simulations. Allegro uses PyTorch for efficient equivariant potentials with state-of-the-art accuracy, while FLARE is a sparse Gaussian process potential with an optimized C++ training backend leveraging Kokkos, OpenMP, and MPI for state-of-the-art performance, and a user-friendly Python frontend. We will compare and contrast the two methods, discuss lessons learned, and show spectacular scientific applications.
Molecular dynamics is a common method for studying molecules and materials at the atomistic level, in which the dynamics of atoms are simulated directly using Newton’s equations of motion. This requires a model for the forces between the atoms, often referred to as a potential. Traditionally, there have been two approaches to computing the interatomic forces. First, there are empirical potentials, which are based on simple, physically motivated functional forms with a few parameters that are fit to match experimental measurements of material properties. These models are fast, but they have limited accuracy and are hard to transfer between applications. The alternative is quantum mechanical methods, which are highly accurate; in exchange, they are computationally expensive and have limited scalability.
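For concreteness, the forces are the negative gradient of the potential energy, and the atomic positions are propagated with Newton's second law; schematically (notation chosen here for illustration, not taken from the talk materials):

    F_i = -\nabla_{r_i} E(r_1, \dots, r_N), \qquad m_i \ddot{r}_i = F_i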
In recent years, machine learning potentials (MLPs) have emerged as a compromise in terms of accuracy and computational efficiency. The idea is to generate a small amount of training data with a quantum mechanical method. The MLP learns to reproduce its forces and energies and can be used for large and long-timescale molecular dynamics simulations with an accuracy approaching that of the quantum mechanical method.
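As a rough sketch of what this training looks like (the exact objective is model-specific; the form below is only an assumed, typical choice), the MLP parameters are fit by minimizing a weighted sum of energy and force errors over the quantum mechanical training set:

    L = \sum_n ( E_n^{MLP} - E_n^{QM} )^2 + \lambda \sum_n \sum_i \| F_{n,i}^{MLP} - F_{n,i}^{QM} \|^2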
Allegro and FLARE are two drastically different MLPs. FLARE approximates the energy of an atom with a sparse Gaussian process (SGP) as a function of the atom’s local environment. The environment is encoded in a rotationally invariant vector with high descriptive power. By using an invariant descriptor, FLARE correctly respects the symmetry of the problem. Allegro, on the other hand, exploits the symmetry of the problem by using an equivariant neural network, i.e., a neural network in which tensor product layers make the internal features transform systematically with rotations of the input. While more computationally demanding, the added symmetry information allows Allegro and other equivariant models to be significantly more accurate and data-efficient than traditional models.
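Both models share the common structure of decomposing the total energy into per-atom contributions; a minimal sketch (with notation assumed here rather than taken from the papers) is

    E = \sum_i \varepsilon(B_i),

where B_i describes atom i's local environment. In FLARE, \varepsilon is the posterior mean of a sparse Gaussian process, \varepsilon(B_i) = \sum_s \alpha_s k(B_i, B_s), summed over a small set of inducing environments B_s with kernel k and learned coefficients \alpha_s; in Allegro, \varepsilon is parametrized by the equivariant neural network.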
For extreme-scale simulations, scalability and performance are of utmost importance. Because its model design avoids message passing, Allegro is the only scalable equivariant neural network potential, with excellent performance demonstrated up to 100 million atoms. FLARE, being a simpler model, takes this to the extreme and has achieved record scalability and performance, simulating 0.5 trillion atoms on 27,336 NVIDIA V100 GPUs.
On the implementation side, Allegro and FLARE are also very different. Allegro is implemented in Python with PyTorch, which allows for a high-level implementation with excellent GPU performance through the JIT compiler. FLARE has a low-level training backend written in C++ with OpenMP, MPI, and Kokkos. The C++ code is conveniently wrapped for Python use with pybind11.
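As a minimal illustration of the PyTorch/TorchScript deployment pattern (a toy pair potential, not the Allegro architecture; all names and shapes below are assumptions for this sketch):

    import torch

    class ToyPairPotential(torch.nn.Module):
        # Hypothetical toy model -- NOT Allegro; illustrates JIT compilation only.
        def __init__(self, cutoff: float = 5.0):
            super().__init__()
            self.cutoff = cutoff
            self.mlp = torch.nn.Sequential(
                torch.nn.Linear(1, 16), torch.nn.SiLU(), torch.nn.Linear(16, 1)
            )

        def forward(self, rij: torch.Tensor) -> torch.Tensor:
            # rij: pair distances, shape (n_pairs,); returns the total energy
            x = (rij / self.cutoff).unsqueeze(-1)
            return self.mlp(x).sum()

    model = torch.jit.script(ToyPairPotential())  # JIT-compile to TorchScript
    model.save("toy_potential.pt")                # loadable from C++ via libtorch

The saved TorchScript module can be loaded from C++ without a Python interpreter, which is how PyTorch-based potentials are typically coupled to MD engines.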
In this talk, we will compare and contrast these two methods, discuss lessons learned, and show spectacular scientific applications.
Links:
Allegro repository: https://github.com/mir-group/allegro
Allegro paper: https://www.nature.com/articles/s41467-023-36329-y
FLARE repository: https://github.com/mir-group/flare
FLARE LAMMPS active learning tutorial: https://bit.ly/flarelmpotf
Preprint on FLARE scalability: https://arxiv.org/abs/2204.12573
I am a PhD student in Applied Physics in the group of Boris Kozinsky at Harvard SEAS. My focus is on machine learning interatomic potentials for molecular dynamics simulations, in particular on how to make them fast on modern hardware architectures and large supercomputers.
GitHub: @anjohan