Release notes#
Version 0.10#
0.10.7 2024-04-09#
Bugfix
Handle upstream numcodecs bug where read-only string arrays cannot be encoded @ivirshup #1421
Use in-memory sparse matrix directly to fix compatibility with scipy 1.13 @ilan-gold #1435
Documentation
Performance
Remove vindex for subsetting dask.array.Array because of its slowness and memory consumption @ilan-gold #1432
0.10.6 2024-03-11#
Bugfix
Defer import of zarr in test helpers, as scanpy CI job relies on them #1343 @ilan-gold
Writing a dataframe with non-unique column names now throws an error, instead of silently overwriting #1335 @ivirshup
Bring optimization from #1233 to indexing on the whole AnnData object, not just the sparse dataset itself #1365 @ilan-gold
Fix mean slice length checking to use improved performance when indexing backed sparse matrices with boolean masks along their major axis #1366 @ilan-gold
Fixed overflow occurring when writing dask arrays with sparse chunks by always writing dask arrays with 64 bit indptr and indices, and adding an overflow check to .append method of sparse on disk structures #1348 @ivirshup
Modified ValueError message for invalid .X during construction to show more helpful list instead of ambiguous __name__ #1395 @eroell
Pin array-api-compat!=1.5 to avoid incorrect implementation of asarray #1411 @ivirshup
Documentation
Development
0.10.5 2024-01-25#
Bugfix
Fix outer concatenation along variables when only a subset of objects had an entry in layers #1291 @ivirshup
Fix comparison of >2d arrays in uns during concatenation #1300 @ivirshup
Fix bug (introduced in 0.10.4) where indexing an AnnData with list[bool] would return the wrong result #1332 @ivirshup
Documentation
Re-add search-as-you-type, this time via readthedocs-sphinx-search #1311 @flying-sheep
Performance
BaseCompressedSparseDataset’s indptr is cached #1266 @ilan-gold
Improved performance when indexing backed sparse matrices with boolean masks along their major axis #1233 @ilan-gold
0.10.4 2024-01-04#
Bugfix
Only try to use Categorical.map(na_action=…) in actually supported Pandas ≥2.1 #1226 @flying-sheep
AnnData.__sizeof__() support for backed datasets #1230 @Neah-Ko
adata[:, []] now returns an AnnData object empty on the appropriate dimensions instead of erroring #1243 @ilan-gold
adata.X[mask] works in newer numpy versions when X is backed #1255 @ilan-gold
adata.X[...] fixed for X as a BaseCompressedSparseDataset with zarr backend #1265 @ilan-gold
Improve read/write error reporting #1273 @flying-sheep
Documentation
Improve aligned mapping error messages #1252 @flying-sheep
0.10.3 2023-10-31#
Bugfix
Prevent pandas from causing infinite recursion when setting a slice of a categorical column #1211 @flying-sheep
Documentation
Stop showing “Support for Awkward Arrays is currently experimental” warnings when reading, concatenating, slicing, or transposing AnnData objects #1182 @flying-sheep
Other updates
Fail canary CI job when tests raise unexpected warnings. #1182 @flying-sheep
0.10.2 2023-10-11#
Bugfix
Added compatibility layer for packages relying on anndata._core.sparse_dataset.SparseDataset. Note that this API is deprecated and new code should use CSRDataset, CSCDataset, and sparse_dataset() instead. #1185 @ivirshup
Handle deprecation warning from pd.Categorical.map thrown during anndata.concat #1189 @flying-sheep @ivirshup
Fixed extra steps being included in IO tracebacks #1193 @flying-sheep
as_dense argument of write_h5ad no longer writes an array without encoding metadata #1193 @flying-sheep
Performance
Improved performance of concat_on_disk with dense arrays in some cases #1169 @selmanozleyen
0.10.1 2023-10-08#
Bugfix
0.10.0 2023-10-06#
Features
GPU Support
Dense and sparse CuPy arrays are now supported #1066 @ivirshup
Once you have CuPy arrays in your anndata, use it with: rapids-singlecell from v0.9+
anndata now has GPU enabled CI. Made possible by a grant from CZI’s EOSS program and managed via Cirun #1066 #1084 @Zethson @ivirshup
Out of core
Concatenate on-disk anndata objects with anndata.experimental.concat_on_disk() #955 @selmanozleyen
AnnData can now hold dask arrays with scipy.sparse.spmatrix chunks #1114 @ivirshup
Public API for interacting with on disk sparse arrays: sparse_dataset(), CSRDataset, and CSCDataset #765 @ilan-gold @ivirshup
Improved performance for simple slices of OOC sparse arrays #1131 @ivirshup
Improved errors and warnings
Improved error messages when combining dataframes with duplicated column names #1029 @ivirshup
Improved warnings when modifying views of AlignedMappings #1016 @flying-sheep @ivirshup
AnnDataReadErrors have been removed. The original error is now thrown with additional information in a note #1055 @ivirshup
Documentation
Added zarr examples to file format docs #1162 @ivirshup
Breaking changes
anndata.AnnData.transpose() no longer copies unnecessarily. If you rely on the copying behavior, call .copy on the resulting object. #1114 @ivirshup
Other updates
Bump minimum python version to 3.9 #1117 @flying-sheep
Deprecations
Deprecate anndata.read, which was just an alias for anndata.read_h5ad() #1108 @ivirshup
dtype argument to AnnData constructor is now deprecated #1153 @ivirshup
Bug fixes
Fix shape inference on initialization when X=None is specified #1121 @flying-sheep
Version 0.9#
0.9.2 2023-07-25#
Bugfix
Views of awkward.Arrays now work with awkward>=2.3 #1040 @ivirshup
Fix ufuncs of views like adata.X[:10].cov(axis=0) returning views #1043 @flying-sheep
Fix instantiating AnnData where .X is a DataFrame with an integer valued index #1002 @flying-sheep
Fix read_zarr() when used on zarr.Group #1057 @ivirshup
0.9.1 2023-04-11#
Bugfix
0.9.0 2023-04-11#
Features
Added experimental support for dask arrays #813 @syelman @rahulbshrestha
obsm, varm and uns can now hold AwkwardArrays #647 @giovp, @grst, @ivirshup
Added experimental functions anndata.experimental.read_dispatched() and anndata.experimental.write_dispatched() which allow customizing IO with a callback #873 @ilan-gold @ivirshup
Better error messages during IO #734 @flying-sheep, @ivirshup
Unordered categorical columns are no longer cast to object during anndata.concat() #763 @ivirshup
Documentation
New tutorials for experimental features
File format description now includes a more formal specification #882 @ivirshup
Interoperability: new page on interoperability with other packages #831 @ivirshup
Expanded docstring documentation for the backed argument of anndata.read_h5ad() #812 @jeskowagner
Documented how to use alternative compression methods for the h5ad file format, see AnnData.write_h5ad() #857 @nigeil
Breaking changes
The AnnData dtype argument no longer defaults to float32 #854 @ivirshup
Previously deprecated force_dense argument of AnnData.write_h5ad() has been removed. #855 @ivirshup
Previously deprecated behaviour around storing adjacency matrices in uns has been removed #866 @ivirshup
Other updates
Deprecations
AnnData.concatenate() is now deprecated in favour of anndata.concat() #845 @ivirshup
Bug fixes
Fixed order dependent outer concatenation bug #904 @ivirshup, reported by @szalata
Fixed bug in renaming categories #790 @ivirshup, reported by @perrin-isir
Fixed IO bug when keys in uns ended in _categories #806 @ivirshup, reported by @Hrovatin
Fixed raw.to_adata not populating obs aligned values when raw was assigned through the setter #939 @ivirshup
Version 0.8#
0.8.1 the future#
Bug fixes
Fix warning from rename_categories #790 I Virshup
Remove backwards compat checks for categories in uns when we can tell the file is new enough #790 I Virshup
Categorical arrays are now created with a python bool instead of a numpy.bool_ #856
Documentation
0.8.0 14th March, 2022#
IO Specification
Warning
The on disk format of AnnData objects has been updated with this release.
Previous releases of anndata will not be able to read all files written by this version.
For discussion of possible future solutions to this issue, see #698
Internal handling of IO has been overhauled.
This should make it much easier to support new datatypes, use partial access, and use AnnData internally in other formats.
Each element should be tagged with an encoding_type and encoding_version. See updated docs on the file format
Support for nullable integer and boolean data arrays. More data types to come!
Experimental support for low level access to the IO API via read_elem() and write_elem()
Features
Added PyTorch dataloader AnnLoader and lazy concatenation object AnnCollection. See the tutorials #416 S Rybakov
Compatibility with h5ad files written from Julia #569 I Kats
Many logging messages that should have been warnings are now warnings #650 I Virshup
Significantly more efficient anndata.read_umi_tools() #661 I Virshup
Fixed deepcopy of a copy of a view retaining sparse matrix view mixin type #670 M Klein
In many cases X can now be None #463 R Cannoodt #677 I Virshup. Remaining work is documented in #467.
Removed hard xlrd dependency I Virshup
obs and var dataframes are no longer copied by default on AnnData instantiation #371 I Virshup
Bug fixes
Fixed issue where .copy was creating sparse matrices views when copying #670 michalk8
Fixed issue where .X matrix read in from zarr would always have float32 values #701 I Virshup
Raw.to_adata now includes obsp in the output #404 G Eraslan
Dependencies
xlrd dropped as a hard dependency
Now requires h5py v3.0.0 or newer
Version 0.7#
0.7.8 9 November, 2021#
Bug fixes
Re-include test helpers #641 I Virshup
0.7.7 9 November, 2021#
Bug fixes
Fixed propagation of import error when importing write_zarr but not all dependencies are installed #579 R Hillje
Fixed issue with .uns sub-dictionaries being referenced by copies #576 I Virshup
Fixed out-of-bounds integer indices not raising IndexError #630 M Klein
Fixed backed SparseDataset indexing with scipy 1.7.2 #638 I Virshup
Development processes
Use PEPs 621 (standardized project metadata), 631 (standardized dependencies), and 660 (standardized editable installs) #639 I Virshup
0.7.6 11 April, 2021#
New features
Added anndata.AnnData.to_memory() for returning an in memory object from a backed one #470 #542 V Bergen I Virshup
anndata.AnnData.write_loom() now writes obs_names and var_names using the Index’s .name attribute, if set #538 I Virshup
Bug fixes
Fixed bug where np.str_ column names errored at write time #457 I Virshup
Fixed “value.index does not match parent’s axis 0/1 names” error triggered when a data frame is stored in obsm/varm after obs_names/var_names is updated #461 G Eraslan
Fixed adata.write_csvs when adata is a view #462 I Virshup
Fixed null values being converted to strings when strings are converted to categorical #529 I Virshup
Fixed handling of compression key word arguments #536 I Virshup
Fixed copying a backed AnnData from changing which file the original object points at #533 ilia-kats
Fixed a bug where calling AnnData.concatenate on an AnnData with no variables would error #537 I Virshup
Deprecations
Passing positional arguments to anndata.read_loom() besides the path is now deprecated #538 I Virshup
anndata.read_loom() arguments obsm_names and varm_names are now deprecated in favour of obsm_mapping and varm_mapping #538 I Virshup
0.7.5 12 November, 2020#
Functionality
Added ipython tab completion and a useful return from .keys to adata.uns #415 I Virshup
Bug fixes
0.7.4 10 July, 2020#
Concatenation overhaul #378 I Virshup
New function anndata.concat() for concatenating AnnData objects along either observations or variables
New documentation section: Concatenation
Functionality
AnnData object created from dataframes with sparse values will have sparse .X #395 I Virshup
Bug fixes
0.7.3 20 May, 2020#
Bug fixes
Fixed bug where graphs used too much memory when copying #381 I Virshup
0.7.2 15 May, 2020#
Concatenation overhaul I Virshup
Elements of uns can now be merged, see #350
Outer joins now work for layers and obsm, see #352
Fill value for outer joins can now be specified
Expect improvements in performance, see #303
Functionality
obs_names_make_unique() is now better at making values unique, and will warn if ambiguities arise #345 M Weiden
obsp is now preferred for storing pairwise relationships between observations. In practice, this means there will be deprecation warnings and reformatting applied to objects which stored connectivities under uns["neighbors"]. Square matrices in uns will no longer be sliced (use .{obs,var}p instead). #337 I Virshup
ImplicitModificationWarning is now exported #315 P Angerer
Better support for ndarray subclasses stored in AnnData objects #335 michalk8
Bug fixes
Fixed inplace modification of Index objects by the make unique function #348 I Virshup
Passing ambiguous keys to obs_vector() and var_vector() now throws errors #340 I Virshup
Fix instantiating AnnData objects from DataFrame #316 P Angerer
Fixed indexing into AnnData objects with arrays like adata[adata[:, gene].X > 0] #332 I Virshup
Fixed type of version #315 P Angerer
0.7.0 22 January, 2020#
Warning
Breaking changes introduced between 0.6.22.post1 and 0.7:
Elements of AnnDatas don’t have their dimensionality reduced when the main object is subset. This is to maintain consistency when subsetting. See discussion in #145.
Internal modules like anndata.core are private and their contents are not stable: See #174.
The old deprecated attributes .smp*, .add and .data have been removed.
View overhaul #164
Indexing into a view no longer keeps a reference to intermediate view, see #62.
Views are now lazy. Elements of view of AnnData are not indexed until they’re accessed.
Indexing with scalars no longer reduces dimensionality of contained arrays, see #145.
All elements of AnnData should now follow the same rules about how they’re subset, see #145.
Can now index by observations and variables at the same time.
IO overhaul #167
Reading and writing has been overhauled for simplification and speed.
Time and memory usage can be half of previous in typical use cases
Zarr backend now supports sparse arrays, and generally is closer to having the same features as HDF5.
Backed mode should see significant speed and memory improvements for access along compressed dimensions and IO. PR #241.
Categoricals can now be ordered (PR #230) and written to disk with a large number of categories (PR #217).
Mapping attributes overhaul (obsm, varm, layers, …)
New attributes obsp and varp have been added for two dimensional arrays where each axis corresponds to a single axis of the AnnData object. PR #207.
These are intended to store values like cell-by-cell graphs, which are currently stored in uns.
Sparse arrays are now allowed as values in all mapping attributes.
All mapping attributes now share an implementation and will have the same behaviour. PR #164.
Miscellaneous improvements
Version 0.6#
0.6.* 2019-*-*#
better support for aligned mappings (obsm, varm, layers) 0.6.22 #155 I Virshup
convenience accessors obs_vector(), var_vector() for 1d arrays. 0.6.21 #144 I Virshup
compatibility with Scipy >=1.3 by removing IndexMixin dependency. 0.6.20 #151 P Angerer
bug fix for second-indexing into views. 0.6.19 P Angerer
bug fix for reading excel files. 0.6.19 A Wolf
changed default compression to None in write_h5ad() to speed up read and write, disk space use is usually less critical. 0.6.16 A Wolf
maintain dtype upon copy. 0.6.13 A Wolf
layers inspired by .loom files allows their information lossless reading via read_loom(). 0.6.7–0.6.9 #46 & #48 S Rybakov
support for reading zarr files: read_zarr() 0.6.7 #38 T White
initialization from pandas DataFrames 0.6. A Wolf
iteration over chunks chunked_X() and chunk_X() 0.6.1 #20 S Rybakov
0.6.0 1 May, 2018#
compatibility with Seurat converter
tremendous speedup for concatenate()
bug fix for deep copy of unstructured annotation after slicing
bug fix for reading HDF5 stored single-category annotations
'outer join' concatenation: adds zeros for concatenation of sparse data and nans for dense data
better memory efficiency in loom exports
Version 0.5#
0.5.0 9 February, 2018#
inform about duplicates in var_names and resolve them using var_names_make_unique()
automatically remove unused categories after slicing
read/write .loom files using loompy 2
fixed read/write for a few text file formats
read UMI tools files: read_umi_tools()
Version 0.4#
0.4.0 23 December, 2017#
read/write .loom files
scalability beyond dataset sizes that fit into memory: see this blog post
AnnData has a raw attribute, which simplifies storing the data matrix when you consider it raw: see the clustering tutorial