Xarray doesn't have an append method because its data structures are built on top of NumPy's non-resizable arrays, so we cannot append new elements without copying the entire array. Instead, you should use xarray.concat.
A common pattern is to accumulate Dataset/DataArray objects in a list and concatenate once at the end:
    import xarray

    datasets = []
    for example in examples:
        ds = create_an_xarray_dataset(example)
        datasets.append(ds)
    # Concatenate once, after the loop
    combined = xarray.concat(datasets, dim='example')
You don't want to concatenate inside the loop -- each call copies everything accumulated so far, so your code would run in quadratic time.
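Here is a self-contained sketch of the list-then-concat pattern. The body of create_an_xarray_dataset and the contents of examples are made up for illustration; substitute whatever builds your real per-example data.

```python
import numpy as np
import xarray as xr


def create_an_xarray_dataset(example):
    # Hypothetical stand-in: a tiny dataset filled with the example's value.
    data = np.full((2, 3), float(example))
    return xr.Dataset({"my_variable": (("x", "y"), data)})


examples = [0, 1, 2, 3]  # assumed labels, one per dataset

# Build all the pieces first (cheap list appends) ...
datasets = [create_an_xarray_dataset(example) for example in examples]

# ... then copy the data exactly once, stacking along a new 'example' dimension.
combined = xr.concat(datasets, dim="example")

# Label the new dimension so that .sel(example=...) works afterwards.
combined = combined.assign_coords(example=examples)
```

Calling xr.concat on a growing result inside the loop would give the same answer, but it would re-copy all previously combined data on every iteration.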
Alternatively, you could allocate a single Dataset/DataArray for the result, and
fill in the values with indexing, e.g.,
    import numpy as np
    import xarray

    dims = ('example', 'x', 'y')
    # Preallocate the full result, labeled along 'example'
    combined = xarray.Dataset(
        data_vars={'my_variable': (dims, np.zeros((len(examples), 100, 200)))},
        coords={'example': examples})
    for example in examples:
        combined.loc[dict(example=example)] = create_an_xarray_dataset(example)
(Note that you always need to use indexing with square brackets like [] or .loc[] -- assigning with sel() and isel() doesn't work.)
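To make the indexing note concrete, here is a small sketch on a DataArray. The labels 'a', 'b', 'c' and the array shape are invented for the example.

```python
import numpy as np
import xarray as xr

examples = ["a", "b", "c"]  # assumed labels
combined = xr.DataArray(
    np.zeros((len(examples), 2)),
    dims=("example", "x"),
    coords={"example": examples},
)

# Label-based assignment through .loc[] modifies `combined` in place.
combined.loc[dict(example="b")] = 1.0

# Position-based assignment through [] with a dict also works in place.
combined[dict(example=2)] = 2.0

# By contrast, `combined.sel(example="a") = 3.0` is not even valid Python
# (you cannot assign to a function call), which is why sel()/isel()
# cannot be used on the left-hand side.
```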
These two approaches are equally efficient -- it's really a matter of taste which one looks better to you or works better for your application.
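If you want to convince yourself that both routes produce the same result, a quick check along these lines should do. make_slice and the example labels are hypothetical, and this only compares the outputs; it says nothing about timing.

```python
import numpy as np
import xarray as xr


def make_slice(example):
    # Hypothetical stand-in for the per-example computation.
    return xr.DataArray(np.full((2, 3), float(example)), dims=("x", "y"))


examples = [10, 20, 30]  # assumed labels

# Approach 1: collect in a list, concatenate once.
via_concat = xr.concat([make_slice(e) for e in examples], dim="example")
via_concat = via_concat.assign_coords(example=examples)

# Approach 2: preallocate, then fill with .loc[] assignment.
via_fill = xr.DataArray(
    np.zeros((len(examples), 2, 3)),
    dims=("example", "x", "y"),
    coords={"example": examples},
)
for e in examples:
    via_fill.loc[dict(example=e)] = make_slice(e)

# True when dimensions, coordinates and values all match.
same = via_concat.equals(via_fill)
```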
For what it's worth, pandas has the same limitation: the append method does indeed copy entire dataframes each time it is used. This is a perpetual surprise and source of performance issues for new users. So I do think that we made the right design decision in not including it in xarray.