anndata.concat#
- anndata.concat(adatas, *, axis=0, join='inner', merge=None, uns_merge=None, label=None, keys=None, index_unique=None, fill_value=None, pairwise=False)[source]#
Concatenates AnnData objects along an axis.
See the concatenation section in the docs for a more in-depth description.
- Parameters:
- adatas
Union
[Collection
[AnnData
],Mapping
[str
,AnnData
]] The objects to be concatenated. If a Mapping is passed, keys are used for the
keys
argument and values are concatenated.- axis
Literal
[0
,1
] (default:0
) Which axis to concatenate along.
- join
Literal
['inner'
,'outer'
] (default:'inner'
) How to align values when concatenating. If “outer”, the union of the other axis is taken. If “inner”, the intersection. See concatenation for more.
- merge
Union
[Literal
['same'
,'unique'
,'first'
,'only'
],Callable
,None
] (default:None
) How elements not aligned to the axis being concatenated along are selected. Currently implemented strategies include:
None
: No elements are kept."same"
: Elements that are the same in each of the objects."unique"
: Elements for which there is only one possible value."first"
: The first element seen at each from each position."only"
: Elements that show up in only one of the objects.
- uns_merge
Union
[Literal
['same'
,'unique'
,'first'
,'only'
],Callable
,None
] (default:None
) How the elements of
.uns
are selected. Uses the same set of strategies as themerge
argument, except applied recursively.- label
str
|None
(default:None
) Column in axis annotation (i.e.
.obs
or.var
) to place batch information in. If it’s None, no column is added.- keys
Collection
|None
(default:None
) Names for each object being added. These values are used for column values for
label
or appended to the index ifindex_unique
is notNone
. Defaults to incrementing integer labels.- index_unique
str
|None
(default:None
) Whether to make the index unique by using the keys. If provided, this is the delimiter between “{orig_idx}{index_unique}{key}”. When
None
, the original indices are kept.- fill_value
Any
|None
(default:None
) When
join="outer"
, this is the value that will be used to fill the introduced indices. By default, sparse arrays are padded with zeros, while dense arrays and DataFrames are padded with missing values.- pairwise
bool
(default:False
) Whether pairwise elements along the concatenated dimension should be included. This is False by default, since the resulting arrays are often not meaningful.
- adatas
- Return type:
Notes
Warning
If you use
join='outer'
this fills 0s for sparse data when variables are absent in a batch. Use this with care. Dense data is filled withNaN
.Examples
Preparing example objects
>>> import anndata as ad, pandas as pd, numpy as np >>> from scipy import sparse >>> a = ad.AnnData( ... X=sparse.csr_matrix(np.array([[0, 1], [2, 3]])), ... obs=pd.DataFrame({"group": ["a", "b"]}, index=["s1", "s2"]), ... var=pd.DataFrame(index=["var1", "var2"]), ... varm={ ... "ones": np.ones((2, 5)), ... "rand": np.random.randn(2, 3), ... "zeros": np.zeros((2, 5)), ... }, ... uns={"a": 1, "b": 2, "c": {"c.a": 3, "c.b": 4}}, ... ) >>> b = ad.AnnData( ... X=sparse.csr_matrix(np.array([[4, 5, 6], [7, 8, 9]])), ... obs=pd.DataFrame( ... {"group": ["b", "c"], "measure": [1.2, 4.3]}, index=["s3", "s4"] ... ), ... var=pd.DataFrame(index=["var1", "var2", "var3"]), ... varm={"ones": np.ones((3, 5)), "rand": np.random.randn(3, 5)}, ... uns={"a": 1, "b": 3, "c": {"c.b": 4}}, ... ) >>> c = ad.AnnData( ... X=sparse.csr_matrix(np.array([[10, 11], [12, 13]])), ... obs=pd.DataFrame({"group": ["a", "b"]}, index=["s1", "s2"]), ... var=pd.DataFrame(index=["var3", "var4"]), ... uns={"a": 1, "b": 4, "c": {"c.a": 3, "c.b": 4, "c.c": 5}}, ... )
Concatenating along different axes
>>> ad.concat([a, b]).to_df() var1 var2 s1 0 1 s2 2 3 s3 4 5 s4 7 8 >>> ad.concat([a, c], axis=1).to_df() var1 var2 var3 var4 s1 0 1 10 11 s2 2 3 12 13
Inner and outer joins
>>> inner = ad.concat([a, b]) # Joining on intersection of variables >>> inner AnnData object with n_obs × n_vars = 4 × 2 obs: 'group' >>> (inner.obs_names, inner.var_names) (Index(['s1', 's2', 's3', 's4'], dtype='object'), Index(['var1', 'var2'], dtype='object')) >>> outer = ad.concat([a, b], join="outer") # Joining on union of variables >>> outer AnnData object with n_obs × n_vars = 4 × 3 obs: 'group', 'measure' >>> outer.var_names Index(['var1', 'var2', 'var3'], dtype='object') >>> outer.to_df() # Sparse arrays are padded with zeroes by default var1 var2 var3 s1 0 1 0 s2 2 3 0 s3 4 5 6 s4 7 8 9
Keeping track of source objects
>>> ad.concat({"a": a, "b": b}, label="batch").obs group batch s1 a a s2 b a s3 b b s4 c b >>> ad.concat([a, b], label="batch", keys=["a", "b"]).obs # Equivalent to previous group batch s1 a a s2 b a s3 b b s4 c b >>> ad.concat({"a": a, "b": b}, index_unique="-").obs group s1-a a s2-a b s3-b b s4-b c
Combining values not aligned to axis of concatenation
>>> ad.concat([a, b], merge="same") AnnData object with n_obs × n_vars = 4 × 2 obs: 'group' varm: 'ones' >>> ad.concat([a, b], merge="unique") AnnData object with n_obs × n_vars = 4 × 2 obs: 'group' varm: 'ones', 'zeros' >>> ad.concat([a, b], merge="first") AnnData object with n_obs × n_vars = 4 × 2 obs: 'group' varm: 'ones', 'rand', 'zeros' >>> ad.concat([a, b], merge="only") AnnData object with n_obs × n_vars = 4 × 2 obs: 'group' varm: 'zeros'
The same merge strategies can be used for elements in
.uns
>>> dict(ad.concat([a, b, c], uns_merge="same").uns) {'a': 1, 'c': {'c.b': 4}} >>> dict(ad.concat([a, b, c], uns_merge="unique").uns) {'a': 1, 'c': {'c.a': 3, 'c.b': 4, 'c.c': 5}} >>> dict(ad.concat([a, b, c], uns_merge="only").uns) {'c': {'c.c': 5}} >>> dict(ad.concat([a, b, c], uns_merge="first").uns) {'a': 1, 'b': 2, 'c': {'c.a': 3, 'c.b': 4, 'c.c': 5}}