Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
592 views
in Technique[技术] by (71.8m points)

python - Convert raster time series of multiple GeoTIFF images to NetCDF

I have a raster time series stored in multiple GeoTIFF files (*.tif) that I'd like to convert to a single NetCDF file. The data is uint16.

I could probably use gdal_translate to convert each image to netcdf using:

 gdal_translate -of netcdf -co FORMAT=NC4 20150520_0164.tif foo.nc

and then some scripting with NCO to extract dates from filenames and then concatenate, but I was wondering whether I might do this more effectively in Python using xarray and it's new rasterio backend.

I can read a file easily:

import glob
import xarray as xr
f = glob.glob('*.tif')
da = xr.open_rasterio(f[0]) 
da

which returns

<xarray.DataArray (band: 1, y: 5490, x: 5490)>
[30140100 values with dtype=uint16]
Coordinates:
  * band     (band) int64 1
  * y        (y) float64 5e+05 5e+05 5e+05 5e+05 5e+05 4.999e+05 4.999e+05 ...
  * x        (x) float64 8e+05 8e+05 8e+05 8e+05 8.001e+05 8.001e+05 ...
Attributes:
    crs:      +init=epsg:32620

and I can write one of these to NetCDF:

ds.to_netcdf('foo.nc')

but ideally I would be able to use something like xr.open_mfdataset , write the time values (extracted from the filenames) and then write the entire aggregation to netCDF. And have dask handle the out-of-core memory issues. :-)

Can something like this be done with xarray and dask?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Xarray should be able to do the concat step for you. I have adapted your example a bit below. It will be up to you to parse the filenames into something useful.

import glob
import pandas as pd
import xarray as xr

def time_index_from_filenames(filenames):
    '''helper function to create a pandas DatetimeIndex
       Filename example: 20150520_0164.tif'''
    return pd.DatetimeIndex([pd.Timestamp(f[:8]) for f in filenames])

filenames = glob.glob('*.tif')
time = xr.Variable('time', time_index_from_filenames(filenames))
chunks = {'x': 5490, 'y': 5490, 'band': 1}
da = xr.concat([xr.open_rasterio(f, chunks=chunks) for f in filenames], dim=time)

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...