SciPy 2023

Matt Harrison

Matt is a corporate trainer, author, and consultant on Python and data science. He has a CS degree from Stanford University. He is a best-selling author on Python and data topics: his books Effective Pandas, Illustrated Guide to Learning Python 3, Intermediate Python, Learning the Pandas Library, and Effective PyCharm have all been best sellers on Amazon. He just published Machine Learning Pocket Reference and Pandas Cookbook (Second Edition). He has taught courses at large companies (Netflix, NASA, Verizon, Adobe, HP, Exxon, and more), universities (Stanford, University of Utah, BYU), and small companies. He has been using Python since 2000 and has taught thousands through live training, both online and in person.


Sessions

07-11
08:00
240min
Idiomatic Pandas
Matt Harrison

Pandas can be tricky, and there is a lot of bad advice floating around. This tutorial will cut through some of the biggest issues I've seen with Pandas code after working with the library for a while and writing three books on it.

We will discuss:

  • Proper types
  • Chaining
  • Aggregation
  • Debugging
Tutorials
Classroom 106
07-13
13:15
55min
[BoF Room 103] PyArrow in pandas and Dask
Patrick Hoefler, James Bourbeau, Matt Harrison

DataFrame libraries in general, and pandas and Dask in particular, are moving toward better integration with PyArrow. This has many benefits, such as improved performance and a reduced memory footprint. We want to connect with users to discuss how PyArrow can improve DataFrame libraries and what they expect from PyArrow support. This can include improved performance, more consistent behavior, or better interoperability with other libraries.

Birds of a Feather (BoF)
Classroom 103