Geotiff-to-xarray

Generate an xarray.Dataset from a list of GeoTIFFs.

from pathlib import Path
import datetime

import xarray as xr
import rioxarray
print(xr.__version__)
print(rioxarray.__version__)
2025.3.1
0.18.2
geotiff_dir = Path("...")
def get_date(filename: Path) -> datetime.datetime:
    """Parse date from filename - adapt to your needs"""
    f = filename.stem
    date_str = f.split("_")[-1]
    return datetime.datetime.strptime(date_str, "%Y-%m-%d")
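
A quick way to check the parser is to call it on a sample name. The filename below is hypothetical and only assumes the pattern the function expects: the date is the last underscore-separated token of the stem, written as YYYY-MM-DD.

```python
import datetime
from pathlib import Path


def get_date(filename: Path) -> datetime.datetime:
    """Parse the date from the last "_"-separated token of the file stem."""
    return datetime.datetime.strptime(filename.stem.split("_")[-1], "%Y-%m-%d")


# Hypothetical filename, used only for illustration
sample = Path("snow_cover_2021-03-15.tif")
parsed = get_date(sample)
print(parsed)  # 2021-03-15 00:00:00
```

A name that does not end in such a date makes strptime raise a ValueError, so a stray file in the directory would surface as an error when the list is sorted.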

Get the list of GeoTIFFs, sorted by date

geotiffs = list(geotiff_dir.glob("*.tif"))
geotiffs.sort(key=get_date)

Perform the actual read

Important: The GeoTIFFs themselves have to be tiled to enable chunked loading and processing (see this GitHub issue). Setting chunks={...} in rioxarray.open_rasterio alone is not sufficient.

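# Lazily open each GeoTIFF as a dask-backed DataArray and drop the
# length-1 band dimension (the files are assumed to be single-band)
da_s = [
    rioxarray.open_rasterio(
        raster,
        default_name="snow_cover",
        chunks={"band": 1, "x": 1024, "y": 1024},
    ).squeeze(dim="band", drop=True)
    for raster in geotiffs
]
# Build the time coordinate from the filenames, in the same (sorted) order
time_var = xr.Variable("time", [get_date(img) for img in geotiffs])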

Concatenate along the time dimension

ds = xr.concat(da_s, dim=time_var).to_dataset()
# write source directory into attributes to preserve the origin
ds.attrs["base_dir"] = geotiff_dir.as_posix()
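
To see what the concatenation step produces without touching any rasters, the same pattern can be run on small synthetic arrays. This is only an illustrative sketch: the dates, array shapes and the ds_demo name are made up, and the arrays stand in for the squeezed per-file DataArrays.

```python
import datetime

import numpy as np
import xarray as xr

# Made-up dates and 2x3 arrays standing in for the per-file rasters
dates = [datetime.datetime(2021, 1, 1), datetime.datetime(2021, 1, 2)]
slices = [
    xr.DataArray(
        np.full((2, 3), i, dtype="float32"),
        dims=("y", "x"),
        name="snow_cover",
    )
    for i in range(len(dates))
]

# Passing a Variable as `dim` names the new dimension and supplies its coordinate values
time_var = xr.Variable("time", dates)
ds_demo = xr.concat(slices, dim=time_var).to_dataset()

print(ds_demo["snow_cover"].dims)  # ('time', 'y', 'x')
```

The new time dimension comes first, and to_dataset() uses the shared DataArray name (here snow_cover) as the variable name, which is why default_name is set when opening the rasters.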

Write to disk as Zarr

ds.to_zarr("data.zarr")