Sanket Verma
Sanket is a data scientist based out of New Delhi, India. He likes to build data science tools and products and has worked with startups, government and organisations. He loves building community and bringing everyone together and is Chair of PyData Delhi and PyData Global. Currently, he's taking care of the community and OSS at Zarr as their Community Manager.
When he’s not working, he likes to play the violin and computer games and sometimes thinks of saving the world!
Sessions
A key feature of the Python data ecosystem is the reliance on simple but efficient primitives that follow well-defined interfaces to make tools work seamlessly together (Cf. http://data-apis.org/). NumPy provides an in-memory representation for tensors. Dask provides parallelisation of tensor access. Xarray provides metadata linking tensor dimensions. Zarr provides a missing feature, namely the scalable, persistent storage for annotated hierarchies of tensors. Defined through a community process, the Zarr specification enables the storage of large out-of-memory datasets locally and in the cloud. Implementations exist in C++, C, Java, Javascript, Julia, and Python, enabling.