Jacob Schreiber SciPy 2023

Jacob Schreiber

Jacob Schreiber is a postdoctoral researcher at Stanford University, where he studies human genomics using modern machine-learning tools. In his "free time," he contributes to the Python data science ecosystem through pomegranate, a package for probabilistic modeling, and apricot, a package that uses submodular optimization to summarize large data sets. In the past, he was a core developer for scikit-learn.

Session

07-12

11:25

30min

tfmodisco-lite: an attribution-based motif discovery algorithm

Jacob Schreiber

An important problem in genomics is identifying the proteins that bind to DNA. Although many methods attempt to learn the DNA motifs underlying protein binding as position-weight matrices (PWMs), PWMs cannot faithfully represent real biology. For instance, a static PWM cannot describe a zinc-finger protein whose fingers can optionally include one-nucleotide spacing. TF-MoDISco is a framework for extracting motifs using attribution scores from a machine-learning model. The learned motifs and syntax overcome many of the limitations of PWMs. I will describe the TF-MoDISco algorithm and showcase its efficient re-implementation, tfmodisco-lite.
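To make the spacing limitation concrete, here is a minimal sketch of how a fixed-width PWM scores a sequence. It is not taken from TF-MoDISco or tfmodisco-lite: the 4-position matrix, the uniform background, and the two example sequences are all hypothetical values chosen for illustration. It assumes a simple log-odds scan over ungapped windows.

```python
import math

# Hypothetical 4-position PWM for the motif "ACGT" (illustrative values only).
PWM = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
]
BACKGROUND = 0.25  # assumed uniform background frequency for every base


def window_score(window):
    """Log-odds score of one window whose length equals the PWM width."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(PWM, window))


def best_score(sequence):
    """Best log-odds score over all ungapped windows of the sequence."""
    width = len(PWM)
    return max(
        window_score(sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    )


contiguous = "TTACGTTT"   # motif halves AC and GT are adjacent
spaced = "TTACAGTTT"      # same halves separated by a one-nucleotide spacer

print("contiguous:", round(best_score(contiguous), 2))
print("spaced:    ", round(best_score(spaced), 2))
```

Under these assumptions the spaced sequence scores far below the contiguous one, even though both contain the same two half-sites. Because each column of the matrix is tied to a fixed offset, no single window can line up both halves once a spacer is present.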

Bioinformatics, Computational Biology & Neuroscience

Grand Salon C
