Release notes

Version 0.8

0.8.0 14th March, 2022

IO Specification

Warning

The on-disk format of AnnData objects has been updated with this release. Previous releases of anndata will not be able to read all files written by this version.

For discussion of possible future solutions to this issue, see issue 698.

Internal handling of IO has been overhauled. This should make it much easier to support new datatypes, use partial access, and use AnnData internally in other formats.

Each element should be tagged with an encoding_type and encoding_version. See the updated docs on the file format.
Support for nullable integer and boolean data arrays. More data types to come!
Experimental support for low level access to the IO API via read_elem() and write_elem()

Features

Added PyTorch dataloader AnnLoader and lazy concatenation object AnnCollection. See the tutorials PR 416 S Rybakov
Compatibility with h5ad files written from Julia PR 569 I Kats
Many logging messages that should have been warnings are now warnings PR 650 I Virshup
Significantly more efficient anndata.read_umi_tools() PR 661 I Virshup
Fixed deepcopy of a copy of a view retaining sparse matrix view mixin type PR 670 M Klein
In many cases X can now be None PR 463 R Cannoodt PR 677 I Virshup. Remaining work is documented in issue 467.
Removed hard xlrd dependency I Virshup
obs and var dataframes are no longer copied by default on AnnData instantiation issue 371 I Virshup

Bug fixes

Fixed issue where .copy was creating sparse matrix views when copying PR 670 michalk8
Fixed issue where .X matrix read in from zarr would always have float32 values PR 701 I Virshup
Raw.to_adata now includes obsp in the output PR 404 G Eraslan

Dependencies

xlrd dropped as a hard dependency
Now requires h5py v3.0.0 or newer

Version 0.7

0.7.8 9 November, 2021

Bug fixes

Re-include test helpers PR 641 I Virshup

0.7.7 9 November, 2021

Bug fixes

Fixed propagation of import error when importing write_zarr but not all dependencies are installed PR 579 R Hillje
Fixed issue with .uns sub-dictionaries being referenced by copies PR 576 I Virshup
Fixed out-of-bounds integer indices not raising IndexError PR 630 M Klein
Fixed backed SparseDataset indexing with scipy 1.7.2 PR 638 I Virshup

Development processes

Use PEPs 621 (standardized project metadata), 631 (standardized dependencies), and 660 (standardized editable installs) PR 639 I Virshup

0.7.6 11 April, 2021

New features

Added anndata.AnnData.to_memory() for returning an in memory object from a backed one PR 470 PR 542 V Bergen I Virshup
anndata.AnnData.write_loom() now writes obs_names and var_names using the Index’s .name attribute, if set PR 538 I Virshup

Bug fixes

Fixed bug where np.str_ column names errored at write time PR 457 I Virshup
Fixed “value.index does not match parent’s axis 0/1 names” error triggered when a data frame is stored in obsm/varm after obs_names/var_names is updated PR 461 G Eraslan
Fixed adata.write_csvs when adata is a view PR 462 I Virshup
Fixed null values being converted to strings when strings are converted to categorical PR 529 I Virshup
Fixed handling of compression key word arguments PR 536 I Virshup
Fixed copying a backed AnnData from changing which file the original object points at PR 533 ilia-kats
Fixed a bug where calling AnnData.concatenate on an AnnData with no variables would error PR 537 I Virshup

Deprecations

Passing positional arguments to anndata.read_loom() besides the path is now deprecated PR 538 I Virshup
anndata.read_loom() arguments obsm_names and varm_names are now deprecated in favour of obsm_mapping and varm_mapping PR 538 I Virshup

0.7.5 12 November, 2020

Functionality

Added ipython tab completion and a useful return from .keys to adata.uns PR 415 I Virshup

Bug fixes

Compatibility with h5py>=3 strings PR 444 I Virshup
Allow adata.raw = None, as is documented PR 447 I Virshup
Fix warnings from pandas 1.1 PR 425 I Virshup

0.7.4 10 July, 2020

Concatenation overhaul PR 378 I Virshup

New function anndata.concat() for concatenating AnnData objects along either observations or variables
New documentation section: Concatenation

Functionality

AnnData object created from dataframes with sparse values will have sparse .X PR 395 I Virshup

Bug fixes

Fixed error from AnnData.concatenate by bumping minimum versions of numpy and pandas issue 385
Fixed colors being incorrectly changed when AnnData object was subset PR 388

0.7.3 20 May, 2020

Bug fixes

Fixed bug where graphs used too much memory when copying PR 381 I Virshup

0.7.2 15 May, 2020

Concatenation overhaul I Virshup

Elements of uns can now be merged, see PR 350
Outer joins now work for layers and obsm, see PR 352
Fill value for outer joins can now be specified
Expect improvements in performance, see issue 303

Functionality

obsp and varp can now be transposed PR 370 A Wolf
obs_names_make_unique() is now better at making values unique, and will warn if ambiguities arise PR 345 M Weiden
obsp is now preferred for storing pairwise relationships between observations. In practice, this means there will be deprecation warnings and reformatting applied to objects which stored connectivities under uns["neighbors"]. Square matrices in uns will no longer be sliced (use .{obs,var}p instead). PR 337 I Virshup
ImplicitModificationWarning is now exported PR 315 P Angerer
Better support for ndarray subclasses stored in AnnData objects PR 335 michalk8

Bug fixes

Fixed inplace modification of Index objects by the make unique function PR 348 I Virshup
Passing ambiguous keys to obs_vector() and var_vector() now throws errors PR 340 I Virshup
Fix instantiating AnnData objects from DataFrame PR 316 P Angerer
Fixed indexing into AnnData objects with arrays like adata[adata[:, gene].X > 0] PR 332 I Virshup
Fixed type of version PR 315 P Angerer
Fixed deprecated import from pandas PR 319 P Angerer

0.7.0 22 January, 2020

Warning

Breaking changes introduced between 0.6.22.post1 and 0.7:

Elements of AnnDatas don’t have their dimensionality reduced when the main object is subset. This is to maintain consistency when subsetting. See discussion in issue 145.
Internal modules like anndata.core are private and their contents are not stable: See issue 174.
The old deprecated attributes .smp*, .add and .data have been removed.

View overhaul PR 164

Indexing into a view no longer keeps a reference to intermediate view, see issue 62.
Views are now lazy. Elements of view of AnnData are not indexed until they’re accessed.
Indexing with scalars no longer reduces dimensionality of contained arrays, see issue 145.
All elements of AnnData should now follow the same rules about how they’re subset, see issue 145.
Can now index by observations and variables at the same time.

IO overhaul PR 167

Reading and writing has been overhauled for simplification and speed.
Time and memory usage can be half of what they were previously in typical use cases.
Zarr backend now supports sparse arrays, and generally is closer to having the same features as HDF5.
Backed mode should see significant speed and memory improvements for access along compressed dimensions and IO. PR 241.
Categoricals can now be ordered (PR 230) and written to disk with a large number of categories (PR 217).

Mapping attributes overhaul (obsm, varm, layers, …)

New attributes obsp and varp have been added for two dimensional arrays where each axis corresponds to a single axis of the AnnData object. PR 207.
These are intended to store values like cell-by-cell graphs, which are currently stored in uns.
Sparse arrays are now allowed as values in all mapping attributes.
DataFrames are now allowed as values in obsm and varm.
All mapping attributes now share an implementation and will have the same behaviour. PR 164.

Miscellaneous improvements

Mapping attributes now have ipython tab completion (e.g. typing adata.obsm[" and pressing tab can provide suggestions) PR 183.
AnnData attributes are now delete-able (e.g. del adata.raw) PR 242.
Many many bug fixes

Version 0.6

0.6.* 2019-*-*

better support for aligned mappings (obsm, varm, layers) 0.6.22 PR 155 I Virshup
convenience accessors obs_vector(), var_vector() for 1d arrays. 0.6.21 PR 144 I Virshup
compatibility with Scipy >=1.3 by removing IndexMixin dependency. 0.6.20 PR 151 P Angerer
bug fix for second-indexing into views. 0.6.19 P Angerer
bug fix for reading excel files. 0.6.19 A Wolf
changed default compression to None in write_h5ad() to speed up read and write; disk space use is usually less critical. 0.6.16 A Wolf
maintain dtype upon copy. 0.6.13 A Wolf
layers, inspired by .loom files, allow lossless reading of their information via read_loom(). 0.6.7–0.6.9 PR 46 & PR 48 S Rybakov
support for reading zarr files: read_zarr() 0.6.7 PR 38 T White
initialization from pandas DataFrames 0.6. A Wolf
iteration over chunks chunked_X() and chunk_X() 0.6.1 PR 20 S Rybakov

0.6.0 1 May, 2018

compatibility with Seurat converter
tremendous speedup for concatenate()
bug fix for deep copy of unstructured annotation after slicing
bug fix for reading HDF5 stored single-category annotations
'outer join' concatenation: adds zeros for concatenation of sparse data and nans for dense data
better memory efficiency in loom exports

Version 0.5

0.5.0 9 February, 2018

inform about duplicates in var_names and resolve them using var_names_make_unique()
automatically remove unused categories after slicing
read/write .loom files using loompy 2
fixed read/write for a few text file formats
read UMI tools files: read_umi_tools()

Version 0.4

0.4.0 23 December, 2017

read/write .loom files
scalability beyond dataset sizes that fit into memory: see this blog post
AnnData has a raw attribute, which simplifies storing the data matrix when you consider it raw: see the clustering tutorial
