Dataset
Header file: <libs/zarr/dataset.hpp>
-
template<typename Store>
class Dataset
A class representing a dataset made from a Zarr group (i.e. a collection of Zarr arrays) in a storage system.
This class provides functionality to create a dataset as a group of arrays that obeys version 2 of the Zarr storage specification (https://zarr.readthedocs.io/en/stable/spec/v2.html) and is also compatible with Xarray and NetCDF.
- Template Parameters:
Store – The type of the store object used by the dataset.
Public Functions
-
inline explicit Dataset(Store &store)
Constructs a Dataset with the specified store object.
This constructor initializes a Dataset with the provided store object by initializing a ZarrGroup and writing some additional metadata for Xarray and NetCDF.
- Parameters:
store – The store object associated with the Dataset.
-
inline size_t get_dimension(const std::string &dimname) const
Returns the size of an existing dimension in the dataset.
- Parameters:
dimname – A string for the name of the dimension in the dataset.
- Returns:
The size of (i.e. number of elements along) the dimension.
-
inline void set_dimension(const std::pair<std::string, size_t> &dim)
Sets the size of an existing dimension in the dataset.
- Parameters:
dim – A pair containing the name of the dimension and its new size to be set.
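The dimension bookkeeping described above can be pictured as a map from dimension name to size. The following is a minimal sketch under that assumption, not the library's implementation: `DimensionRegistry` is a hypothetical stand-in class, and the choice to throw on a missing or repeated name is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the dimension bookkeeping of a dataset.
class DimensionRegistry {
 public:
  // Registers a new dimension; throws if the name is already in use (assumed behaviour).
  void add_dimension(const std::pair<std::string, std::size_t> &dim) {
    if (!dims.emplace(dim).second) {
      throw std::invalid_argument("dimension already exists: " + dim.first);
    }
  }

  // Returns the size of an existing dimension; throws std::out_of_range if it is unknown.
  std::size_t get_dimension(const std::string &dimname) const { return dims.at(dimname); }

  // Updates the size of an existing dimension; throws std::out_of_range if it is unknown.
  void set_dimension(const std::pair<std::string, std::size_t> &dim) {
    dims.at(dim.first) = dim.second;
  }

 private:
  std::unordered_map<std::string, std::size_t> dims;  // dimension name -> size
};
```

In this sketch a dimension must be added before its size can be read or changed, which mirrors the "existing dimension" wording of get_dimension and set_dimension.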
-
inline void set_decomposition(CartesianDecomposition decomposition)
Sets the decomposition maps needed to write data out correctly.
- Parameters:
decomposition – A CartesianDecomposition instance describing the domain decomposition.
-
inline void set_max_superdroplets(unsigned int max_superdroplets)
Sets the maximum number of superdroplets used for data allocation. The value comes from the config file.
- Parameters:
max_superdroplets – The maximum number of superdroplets in the model.
-
template<typename T>
inline XarrayZarrArray<Store, T> create_array(const std::string_view name, const std::string_view units, const double scale_factor, const std::vector<size_t> &chunkshape, const std::vector<std::string> &dimnames) const
Creates a new array in the dataset.
- Template Parameters:
T – The data type of the array.
- Parameters:
name – The name of the new array.
units – The units of the array data.
scale_factor – The scale factor of array data.
chunkshape – The shape of the chunks of the array.
dimnames – The names of each dimension of the array.
- Returns:
An instance of XarrayZarrArray representing the newly created array.
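Xarray identifies the dimensions of a Zarr v2 array through the `_ARRAY_DIMENSIONS` attribute stored alongside the array. The sketch below shows how the `units`, `scale_factor` and `dimnames` arguments could be turned into such an attributes document. The helper `make_xarray_attrs` is hypothetical, and the exact set and order of attributes written by the library is an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: builds a JSON attributes string for one array.
// Note: names and units are written verbatim, without JSON escaping.
std::string make_xarray_attrs(std::string_view units, double scale_factor,
                              const std::vector<std::string> &dimnames) {
  std::ostringstream attrs;
  attrs << "{\"_ARRAY_DIMENSIONS\": [";
  for (std::size_t i = 0; i < dimnames.size(); ++i) {
    attrs << (i == 0 ? "" : ", ") << '"' << dimnames[i] << '"';
  }
  attrs << "], \"units\": \"" << units << "\", \"scale_factor\": " << scale_factor << "}";
  return attrs.str();
}
```

Keeping the dimension names in the attributes rather than in the array metadata is what lets a plain Zarr reader ignore them while Xarray can still rebuild labelled dimensions.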
-
template<typename T>
inline XarrayZarrArray<Store, T> create_coordinate_array(const std::string_view name, const std::string_view units, const double scale_factor, const size_t chunksize, const size_t dimsize)
Creates a new 1-D array for a coordinate of the dataset.
- Template Parameters:
T – The data type of the coordinate array.
- Parameters:
name – The name of the new coordinate.
units – The units of the coordinate.
scale_factor – The scale factor of the coordinate data.
chunksize – The size of each 1-D chunk of the coordinate array.
dimsize – The initial size of the coordinate (number of elements along array).
- Returns:
An instance of XarrayZarrArray representing the newly created coordinate array.
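To illustrate how `chunksize` and `dimsize` relate, the sketch below computes how many 1-D chunks are needed to hold a coordinate of a given size. The helper name `num_chunks` is hypothetical; the only assumption is the usual Zarr layout in which the last chunk may be only partly filled.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: number of 1-D chunks needed to hold `dimsize` elements
// when each chunk holds `chunksize` elements (the last chunk may be partly filled).
std::size_t num_chunks(std::size_t dimsize, std::size_t chunksize) {
  if (chunksize == 0) {
    throw std::invalid_argument("chunksize must be greater than zero");
  }
  return (dimsize + chunksize - 1) / chunksize;  // integer ceiling division
}
```

For example, a coordinate of 10 elements with a chunk size of 4 needs 3 chunks, the last of which holds 2 elements.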
-
template<typename T>
inline XarrayZarrArray<Store, T> create_ragged_array(const std::string_view name, const std::string_view units, const double scale_factor, const std::vector<size_t> &chunkshape, const std::vector<std::string> &dimnames, const std::string_view sampledimname) const
Creates a new ragged array in the dataset.
- Template Parameters:
T – The data type of the array.
- Parameters:
name – The name of the new array.
units – The units of the array data.
scale_factor – The scale factor of array data.
chunkshape – The shape of the chunks of the array.
dimnames – The names of each dimension of the array.
sampledimname – The name of the sample dimension of the array.
- Returns:
An instance of XarrayZarrArray representing the newly created ragged array.
-
template<typename T>
inline XarrayZarrArray<Store, T> create_raggedcount_array(const std::string_view name, const std::string_view units, const double scale_factor, const std::vector<size_t> &chunkshape, const std::vector<std::string> &dimnames, const std::string_view sampledimname) const
Creates a new raggedcount array in the dataset.
- Template Parameters:
T – The data type of the array.
- Parameters:
name – The name of the new array.
units – The units of the array data.
scale_factor – The scale factor of array data.
chunkshape – The shape of the chunks of the array.
dimnames – The names of each dimension of the array.
sampledimname – The name of the sample dimension of the array.
- Returns:
An instance of XarrayZarrArray representing the newly created raggedcount array.
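A ragged array and its raggedcount array work as a pair: the ragged array stores all values back to back along the sample dimension, and the count array records how many values belong to each row. The sketch below shows how a reader could recover one row from such a pair. It assumes the contiguous ragged array layout of the CF conventions; `row_offsets` and `get_row` are hypothetical helpers, not part of this class.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: start offset of every row in the flat values array,
// plus one final entry equal to the total number of values.
std::vector<std::size_t> row_offsets(const std::vector<std::size_t> &counts) {
  std::vector<std::size_t> offsets(counts.size() + 1, 0);
  std::partial_sum(counts.begin(), counts.end(), offsets.begin() + 1);
  return offsets;
}

// Hypothetical helper: copies out the values belonging to one row.
template <typename T>
std::vector<T> get_row(const std::vector<T> &values, const std::vector<std::size_t> &offsets,
                       std::size_t row) {
  const auto first = values.begin() + static_cast<std::ptrdiff_t>(offsets.at(row));
  const auto last = values.begin() + static_cast<std::ptrdiff_t>(offsets.at(row + 1));
  return std::vector<T>(first, last);
}
```

With counts {2, 0, 3} and five stored values, row 0 holds the first two values, row 1 is empty and row 2 holds the last three.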
-
template<typename T>
inline void write_arrayshape(XarrayZarrArray<Store, T> &xzarr) const
Calls the array’s shape function to ensure the shape of the array matches the dimensions of the dataset.
- Template Parameters:
T – The data type of the array.
- Parameters:
xzarr – An instance of XarrayZarrArray representing the array.
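Matching an array's shape to the dataset amounts to looking up the current size of each of the array's named dimensions. The following is a minimal sketch of that lookup under the assumption that the dataset keeps a name-to-size map; `shape_from_dims` is a hypothetical helper, and how the array stores the resulting shape is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: derives an array's shape from the sizes of its named
// dimensions in the dataset. Throws std::out_of_range for an unknown dimension.
std::vector<std::size_t> shape_from_dims(
    const std::unordered_map<std::string, std::size_t> &datasetdims,
    const std::vector<std::string> &dimnames) {
  std::vector<std::size_t> shape;
  shape.reserve(dimnames.size());
  for (const auto &name : dimnames) {
    shape.push_back(datasetdims.at(name));
  }
  return shape;
}
```

The order of `dimnames` determines the order of the shape, so an array declared over ("time", "gbxindex") gets its time size first.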
-
template<typename T>
inline void write_ragged_arrayshape(XarrayZarrArray<Store, T> &xzarr) const
Calls the array’s shape function to write the shape of the array for a ragged array.
- Template Parameters:
T – The data type of the array.
- Parameters:
xzarr – An instance of XarrayZarrArray representing the array.
-
template<typename T>
inline void write_to_array(XarrayZarrArray<Store, T> &xzarr, const typename Buffer<T>::viewh_buffer h_data) const
Writes data from a Kokkos view in host memory to a Zarr array in the dataset.
After writing, the function updates the metadata for the shape of the array so that the size of each dimension of the array is consistent with the dimensions of the dataset.
- Template Parameters:
T – The data type of the array.
- Parameters:
xzarr – An instance of XarrayZarrArray representing the array.
h_data – The data to be written to the array.
-
template<typename T>
inline void write_to_ragged_array(XarrayZarrArray<Store, T> &xzarr, const typename Buffer<T>::viewh_buffer h_data) const
Writes data from a Kokkos view in host memory to a Zarr array in the dataset.
After writing, the function updates the metadata for the shape of the array so that the size of each dimension of the array is consistent with the dimensions of the dataset.
- Template Parameters:
T – The data type of the array.
- Parameters:
xzarr – An instance of XarrayZarrArray representing the array.
h_data – The data to be written to the array.
-
template<typename T>
inline void write_arrayshape(const std::shared_ptr<XarrayZarrArray<Store, T>> xzarr_ptr) const
Calls the array’s shape function to ensure the shape of the array matches the dimensions of the dataset.
- Template Parameters:
T – The data type of the array.
- Parameters:
xzarr_ptr – A shared pointer to the instance of XarrayZarrArray representing the array.
-
template<typename T>
inline void write_to_array(const std::shared_ptr<XarrayZarrArray<Store, T>> xzarr_ptr, const typename Buffer<T>::viewh_buffer h_data) const
Writes data from a Kokkos view in host memory to a Zarr array in the dataset.
After writing, the function updates the metadata for the shape of the array so that the size of each dimension of the array is consistent with the dimensions of the dataset.
- Template Parameters:
T – The data type of the array.
- Parameters:
xzarr_ptr – A shared pointer to the instance of XarrayZarrArray representing the array.
h_data – The data to be written to the array.
-
template<typename T>
inline void write_to_array(const std::shared_ptr<XarrayZarrArray<Store, T>> xzarr_ptr, const T data) const
Writes a single data element to a Zarr array in the dataset.
After writing, the function updates the metadata for the shape of the array so that the size of each dimension of the array is consistent with the dimensions of the dataset.
- Template Parameters:
T – The data type of the array.
- Parameters:
xzarr_ptr – A shared pointer to the instance of XarrayZarrArray representing the array.
data – The data element to be written to the array.
Private Functions
-
inline void collect_distributed_dim_size(const std::pair<std::string, size_t> &dim)
Collects the process-local sizes of a distributed dimension.
- Parameters:
dim – A pair containing the name of the dimension and its process-local size.
-
template<typename T>
inline Kokkos::View<T*, HostSpace> collect_global_data(Kokkos::View<T*, HostSpace> data, std::vector<std::string> dimnames) const
Collects the distributed process-local data for a write.
- Parameters:
data – A Kokkos view containing the data to be collected.
dimnames – The names of the dimensions related to the array.
-
template<typename T>
inline void correct_gridbox_data(std::string dimension, T *target, T *source) const
Correctly orders global data following the global gridbox order.
Given a source and a target array, copies data from the source into the target in the global gridbox order. Should be called only on process 0.
- Parameters:
dimension – The name of the dimension, which should be the gridbox dimension.
target – The array to write the data to, according to the global gridbox order.
source – The array to take the data from.
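Reordering gathered data into a global order can be pictured as applying a permutation: each element of the source is copied to the position given by its global index. The sketch below is only an illustration of that idea. The `global_index` map and the helper `reorder_to_global` are hypothetical; in the class itself the ordering would come from the domain decomposition, whose interface is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copies source[i] to target[global_index[i]] for every i.
// `global_index` is assumed to be a permutation of 0..source.size()-1.
template <typename T>
std::vector<T> reorder_to_global(const std::vector<T> &source,
                                 const std::vector<std::size_t> &global_index) {
  if (source.size() != global_index.size()) {
    throw std::invalid_argument("source and global_index must have the same size");
  }
  std::vector<T> target(source.size());
  for (std::size_t i = 0; i < source.size(); ++i) {
    target.at(global_index[i]) = source[i];  // bounds-checked write
  }
  return target;
}
```

Doing this on a single process, after all data has been gathered, avoids any need for processes to agree on the ordering while they are still writing their local parts.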
-
inline void collect_global_array(float *target, float *local_source, int local_size, int *receive_counts, int *receive_displacements) const
Wrapper for the MPI gatherv call for a float array.
-
inline void collect_global_array(unsigned int *target, unsigned int *local_source, int local_size, int *receive_counts, int *receive_displacements) const
Wrapper for the MPI gatherv call for an unsigned int array.
-
inline void collect_global_array(size_t *target, size_t *local_source, int local_size, int *receive_counts, int *receive_displacements) const
Wrapper for the MPI gatherv call for a size_t array.
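A gatherv call needs, for every process, the number of elements it contributes and the offset at which those elements start in the receive buffer. The sketch below shows how the displacements could be derived from the counts when the contributions are packed back to back in rank order. That packing is an assumption of this example, and `gatherv_displacements` is a hypothetical helper that does not call MPI itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: displacement of each process's data in the receive buffer,
// assuming contributions are stored contiguously in rank order.
std::vector<int> gatherv_displacements(const std::vector<int> &receive_counts) {
  std::vector<int> displacements(receive_counts.size(), 0);
  int offset = 0;
  for (std::size_t rank = 0; rank < receive_counts.size(); ++rank) {
    displacements[rank] = offset;     // where this rank's data starts
    offset += receive_counts[rank];   // running total of elements so far
  }
  return displacements;
}
```

The counts and displacements are plain int arrays because that is the type MPI_Gatherv expects for these arguments.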
-
inline void add_dimension(const std::pair<std::string, size_t> &dim)
Adds a dimension to the dataset.
- Parameters:
dim – A pair containing the name and size of the dimension to be added.
Private Members
-
ZarrGroup<Store> group
Reference to the zarr group object.
-
std::unordered_map<std::string, size_t> datasetdims
Map from the name of each dimension in the dataset to its size.
-
CartesianDecomposition decomposition
-
std::shared_ptr<std::vector<unsigned int>> global_superdroplet_ordering
-
std::unordered_map<std::string, std::vector<size_t>> distributed_datasetdims
-
int my_rank
-
int comm_size