anndata - Annotated data
anndata is a Python package for handling annotated data matrices in memory and on disk, positioned between pandas and xarray. anndata offers a broad range of computationally efficient features including, among others, sparse data support, lazy operations, and a PyTorch interface.
Discuss development on GitHub.
Ask questions on the scverse Discourse.
Install via pip install anndata or conda install anndata -c conda-forge.
Consider citing the anndata paper.
See Scanpy’s documentation for usage related to single cell data. anndata was initially built for Scanpy.
News
Muon paper published 2022-02-02
Muon has been published in Genome Biology [Bredikhin22].
Muon is a framework for multimodal data built on top of AnnData.
COVID-19 datasets distributed as h5ad 2020-04-01
In a joint initiative, the Wellcome Sanger Institute, the Human Cell Atlas, and the CZI distribute datasets related to COVID-19 via anndata’s h5ad files: covid19cellatlas.org.
Latest additions
Version 0.8
0.8.0 14th March, 2022
IO Specification
Warning
The on disk format of AnnData objects has been updated with this release.
Previous releases of anndata will not be able to read all files written by this version.
For discussion of possible future solutions to this issue, see issue 698
Internal handling of IO has been overhauled.
This should make it much easier to support new datatypes, use partial access, and use AnnData internally in other formats.
Each element should be tagged with an encoding_type and encoding_version. See updated docs on the file format
Support for nullable integer and boolean data arrays. More data types to come!
Experimental support for low level access to the IO API via read_elem() and write_elem()
Features
Added PyTorch dataloader AnnLoader and lazy concatenation object AnnCollection. See the tutorials PR 416 S Rybakov
Compatibility with h5ad files written from Julia PR 569 I Kats
Many logging messages that should have been warnings are now warnings PR 650 I Virshup
Significantly more efficient anndata.read_umi_tools() PR 661 I Virshup
Fixed deepcopy of a copy of a view retaining sparse matrix view mixin type PR 670 M Klein
In many cases X can now be None PR 463 R Cannoodt PR 677 I Virshup. Remaining work is documented in issue 467.
Removed hard xlrd dependency I Virshup
obs and var dataframes are no longer copied by default on AnnData instantiation issue 371 I Virshup
Bug fixes
Fixed issue where .copy was creating sparse matrix views when copying PR 670 michalk8
Fixed issue where .X matrix read in from zarr would always have float32 values PR 701 I Virshup
Raw.to_adata now includes obsp in the output PR 404 G Eraslan
Dependencies
xlrd dropped as a hard dependency
Now requires h5py v3.0.0 or newer
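One way to check the h5py requirement before upgrading is sketched below using only the standard library. The helper names parse_release, meets_minimum, and h5py_is_new_enough are hypothetical, and the parser is deliberately simple: it compares only the leading numeric components, so a pre-release such as 3.0.0rc1 is treated like 3.0.0.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(v):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="3.0.0"):
    """True if `installed` is at least `minimum` (numeric components only)."""
    a, b = parse_release(installed), parse_release(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that "3.1" and "3.1.0" compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def h5py_is_new_enough():
    """True/False for an installed h5py, or None if h5py is not installed."""
    try:
        return meets_minimum(version("h5py"))
    except PackageNotFoundError:
        return None


print(h5py_is_new_enough())
```

In a project that already depends on the packaging library, its Version class is a more robust choice than this hand-rolled comparison.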