Xarray-Beam

Xarray-Beam is a Python library for building Apache Beam pipelines with Xarray datasets.

The project aims to facilitate data transformations and analysis on large-scale multi-dimensional labeled arrays, such as:

Ad-hoc computation on Xarray data, by dividing an xarray.Dataset into many smaller pieces ("chunks").
Adjusting array chunks, using the Rechunker algorithm.
Ingesting large, multi-dimensional array datasets into an analysis-ready, cloud-optimized format, namely Zarr (see also Pangeo Forge).
Calculating statistics (e.g., "climatology") across distributed datasets with arbitrary groups.
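
To make the chunking idea concrete, here is a minimal plain-Python sketch of splitting a series into offset-keyed chunks and computing a grouped mean (a toy "climatology") from per-chunk partial sums. It does not use the Xarray-Beam API; the names split_into_chunks, partial_sums and grouped_mean are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

# Hypothetical names throughout: this is plain Python, not the Xarray-Beam API.
Chunk = Tuple[int, List[float]]  # (offset along one axis, values in that piece)


def split_into_chunks(values: List[float], chunk_size: int) -> Iterator[Chunk]:
    """Yield (offset, piece) pairs so each piece can be processed independently."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(values), chunk_size):
        yield offset, values[offset:offset + chunk_size]


def partial_sums(chunk: Chunk, period: int) -> Dict[int, Tuple[float, int]]:
    """Return per-group (sum, count) for one chunk; group = global index % period."""
    offset, piece = chunk
    out: Dict[int, Tuple[float, int]] = {}
    for i, value in enumerate(piece):
        group = (offset + i) % period
        total, count = out.get(group, (0.0, 0))
        out[group] = (total + value, count + 1)
    return out


def grouped_mean(values: List[float], chunk_size: int, period: int) -> Dict[int, float]:
    """Combine the per-chunk partial results into one mean per group."""
    totals = defaultdict(lambda: [0.0, 0])
    for chunk in split_into_chunks(values, chunk_size):
        for group, (total, count) in partial_sums(chunk, period).items():
            totals[group][0] += total
            totals[group][1] += count
    return {group: total / count for group, (total, count) in sorted(totals.items())}


series = [float(i) for i in range(12)]
chunks = list(split_into_chunks(series, chunk_size=5))
means = grouped_mean(series, chunk_size=5, period=4)
print(chunks)
print(means)
```

Because each chunk carries its own offset, the group of every value can be recovered without looking at any other chunk, and the (sum, count) pairs can be merged in any order. In this sketch the merge is a simple loop; a distributed runner would perform the same combination across workers.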

For more about our approach and how to get started, read the documentation!

Warning: Xarray-Beam is a sharp tool 🔪

Xarray-Beam is relatively new, and focused on expert users:

We use it extensively at Google for processing large-scale weather datasets, but there is not yet a vibrant external community.
It provides low-level abstractions that facilitate writing very large-scale data pipelines (e.g., 100+ TB), but by design it requires explicitly thinking about how every operation is parallelized.

Installation

Xarray-Beam requires recent versions of immutabledict, Xarray, Dask, Rechunker, Zarr, and Apache Beam. For best performance when writing Zarr files, use Xarray 0.19.0 or later.
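
As a rough way to check the Xarray recommendation above, the following sketch compares the installed Xarray version against 0.19.0 using only the standard library. The helper names are hypothetical, and the version parsing is deliberately simplified: it reads only the leading digits of each dot-separated component, so a pre-release such as "0.19.0rc1" is treated the same as "0.19.0".

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse leading numeric components, e.g. '0.19.0' -> (0, 19, 0)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "0.19.0") -> bool:
    """True if the installed version is at least the minimum (simplified compare)."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that '0.19' compares equal to '0.19.0'
    b += (0,) * (width - len(b))
    return a >= b


def installed_xarray_version() -> Optional[str]:
    """Return the installed xarray version string, or None if it is not installed."""
    try:
        return metadata.version("xarray")
    except metadata.PackageNotFoundError:
        return None


version = installed_xarray_version()
if version is None:
    print("xarray is not installed")
elif meets_minimum(version):
    print(f"xarray {version} meets the 0.19.0 recommendation")
else:
    print(f"xarray {version} is older than 0.19.0")
```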

Disclaimer

Xarray-Beam is an experiment that we are sharing with the outside world in the hope that it will be useful. It is not a supported Google product. We welcome feedback, bug reports and code contributions, but cannot guarantee they will be addressed.

See the "Contribution guidelines" for more.

Credits

Contributors:

Stephan Hoyer
Jason Hickey
Cenk Gazen
Alex Merose
