I have a series of monthly gridded datasets in CSV form. I want to read them, add a few dimensions, and then write them to netCDF. I've had great experiences using xarray (xray) in the past, so I thought I'd use it for this task.
I can easily get them into a 2D DataArray with something like:
import numpy as np
import xarray as xr

data = np.ones((360, 720))
lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)
coords = {'lat': lats, 'lng': lngs}
da = xr.DataArray(data, dims=('lat', 'lng'), coords=coords)
But when I try to add another dimension, which would convey information about time (all data is from the same year/month), things start to go sour.
I've tried two ways to crack this:
1) expand my input data to m x n x 1, something like:
data = np.ones((360,720))
lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)
coords = {'lat': lats, 'lng':lngs}
data = data[:,:,np.newaxis]
Then I follow the same steps as above, with coords updated to contain a third dimension.
lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)
coords = {'lat': lats, 'lng':lngs}
coords['time'] = pd.datetime(year, month, day)
da = xr.DataArray(data, coords=coords)
da.to_dataset(name='variable_name')
This works for creating a DataArray, but when I try to convert it to a Dataset (so I can write to netCDF), I get this error: 'ValueError: Coordinate objects must be 1-dimensional'.
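For reference, here is a minimal sketch of the structure approach 1 is aiming for. It assumes the error comes from the scalar 'time' coordinate, so it names the dimensions explicitly and passes the time as a length-1 list. The date is a placeholder, and pd.Timestamp stands in for pd.datetime; neither is from my real code.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same 0.5-degree grid as above, with a trailing length-1 axis for time.
data = np.ones((360, 720))[:, :, np.newaxis]
lats = np.arange(-89.75, 90, 0.5) * -1
lngs = np.arange(-179.75, 180, 0.5)

# Placeholder date (assumption). Wrapping the single timestamp in a list
# makes the coordinate 1-dimensional with length 1 instead of a scalar.
times = [pd.Timestamp(2000, 1, 1)]

da = xr.DataArray(
    data,
    dims=("lat", "lng", "time"),
    coords={"lat": lats, "lng": lngs, "time": times},
)
ds = da.to_dataset(name="variable_name")
```

If this behaves as I expect, ds should hold one variable with dimensions (lat, lng, time) and a single time step.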
2) The second approach I've tried is taking my DataArray, converting it to a DataFrame, setting the index to ['lat', 'lng', 'time'], and then going back to a Dataset with xr.Dataset.from_dataframe(). I've tried this, but it takes 20+ minutes before I kill the process.
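In case it helps, approach 2 looks roughly like the sketch below. The 4 x 8 grid and the date are placeholders I chose so it runs quickly; the real grid is 360 x 720.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Small 4 x 8 grid (assumption) so the round trip finishes quickly.
lats = np.arange(-1.5, 2, 1.0) * -1
lngs = np.arange(-3.5, 4, 1.0)
da = xr.DataArray(
    np.ones((lats.size, lngs.size)),
    dims=("lat", "lng"),
    coords={"lat": lats, "lng": lngs},
    name="variable_name",
)

# DataArray -> DataFrame, moving lat/lng from the index into columns.
df = da.to_dataframe().reset_index()
df["time"] = pd.Timestamp(2000, 1, 1)  # placeholder date (assumption)

# Rebuild the index with time included, then convert back to a Dataset.
ds = xr.Dataset.from_dataframe(df.set_index(["lat", "lng", "time"]))
```

On this small grid the conversion returns immediately, so the slowness only shows up at full resolution.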
Does anyone know how I can get a Dataset with a monthly 'time' dimension?