tokio.connectors._hdf5 module

Helper classes and functions used by the HDF5 connector

This contains some of the black magic required to make older H5LMT files compatible with the TOKIO HDF5 schemas and API.

class tokio.connectors._hdf5.MappedDataset(map_function=None, map_kwargs=None, transpose=False, force2d=False, *args, **kwargs)[source]

Bases: h5py._hl.dataset.Dataset

h5py.Dataset that applies a function to the results of __getitem__ before returning the data. Intended to dynamically generate certain datasets that are simple derivatives of others.


Apply the map function to the result of the parent class and return that transformed result instead. Transpose is very ugly, but required for h5lmt support.

__init__(map_function=None, map_kwargs=None, transpose=False, force2d=False, *args, **kwargs)[source]

Configure a MappedDataset

Attach a map function to an h5py.Dataset (or derivative) and store the arguments to be fed into that map function whenever this object gets sliced.

  • map_function (function) – function to be called on the value returned when parent class is sliced
  • map_kwargs (dict) – kwargs to be passed into map_function
  • transpose (bool) – when True, transpose the results of map_function before returning them. Required by some H5LMT datasets.
  • force2d (bool) – when True, convert a 1d array into a 2d array with a single column. Required by some H5LMT datasets.
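
The slicing behaviour described above can be sketched without h5py. This is a minimal illustration, not the TOKIO implementation: MappedRows and scale are hypothetical names, the class wraps a plain list of rows instead of subclassing h5py.Dataset, it only handles slice keys, and the order in which transpose and force2d are applied is an assumption.

```python
class MappedRows:
    """Hypothetical stand-in for MappedDataset, backed by a list of rows."""

    def __init__(self, rows, map_function=None, map_kwargs=None,
                 transpose=False, force2d=False):
        self.rows = rows
        self.map_function = map_function
        self.map_kwargs = map_kwargs or {}
        self.transpose = transpose
        self.force2d = force2d

    def __getitem__(self, key):
        # Slice the underlying data first, as the parent class would.
        # Only slice keys (e.g. data[0:2]) are supported in this sketch.
        result = self.rows[key]
        # Apply the map function, if any, to the sliced result.
        if self.map_function is not None:
            result = self.map_function(result, **self.map_kwargs)
        # Transpose the mapped result so that rows become columns.
        if self.transpose:
            result = [list(col) for col in zip(*result)]
        # Promote a flat (1d) result to a 2d result with a single column.
        if self.force2d and result and not isinstance(result[0], list):
            result = [[value] for value in result]
        return result


def scale(rows, factor=1):
    """Example map function: multiply every element by a constant."""
    return [[value * factor for value in row] for row in rows]


data = MappedRows([[1, 2], [3, 4], [5, 6]],
                  map_function=scale, map_kwargs={"factor": 10},
                  transpose=True)
first_two = data[0:2]

flat = MappedRows([1, 2, 3], force2d=True)
as_column = flat[0:3]
```

Here first_two holds the first two rows after scaling and transposing, and as_column holds the flat values reshaped into one-element rows.
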
tokio.connectors._hdf5._apply_timestep(return_value, parent_dataset, func=<function <lambda>>)[source]

Apply a transformation function to a return value

Transforms the data returned when slicing an h5py.Dataset object by applying a function to the dataset’s values. For example, if return_value holds ‘counts per timestep’ and you want ‘counts per second’, you would specify func=lambda x, y: x / y, where y is the timestep length in seconds. The reverse conversion uses func=lambda x, y: x * y.

  • return_value – the value returned when slicing h5py.Dataset
  • parent_dataset – the h5py.Dataset which generated return_value
  • func – a function which takes two arguments: the first is return_value, and the second is the timestep of parent_dataset

Returns: A modified version of return_value (usually a numpy.ndarray)
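
As a rough illustration of the two-argument func convention, the sketch below applies a function to each value together with a timestep. It is not the library's code: apply_timestep is a hypothetical helper that takes the timestep directly rather than deriving it from parent_dataset, and the 5-second timestep is an invented value.

```python
def apply_timestep(values, timestep, func):
    """Apply func(value, timestep) to every element of a list of values."""
    return [func(value, timestep) for value in values]


counts_per_timestep = [50, 100, 0]
timestep = 5  # seconds per timestep (hypothetical value)

# Counts per timestep divided by seconds per timestep gives counts per second.
rates = apply_timestep(counts_per_timestep, timestep, lambda x, y: x / y)

# Multiplying the rates by the timestep recovers the original counts.
counts_again = apply_timestep(rates, timestep, lambda x, y: x * y)
```
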

tokio.connectors._hdf5._one_column(return_value, col_idx, apply_timestep_func=None, parent_dataset=None)[source]

Extract a specific column from a dataset

  • return_value – the value returned by the parent Dataset object that we will modify
  • col_idx – the column index for the column we are demultiplexing
  • apply_timestep_func (function) – if provided, apply this function with return_value as the first argument and the timestep of parent_dataset as the second.
  • parent_dataset (Dataset) – if provided, indicates that return_value should be divided by the timestep of parent_dataset to convert values to rates before returning

Returns: A modified version of return_value (usually a numpy.ndarray)
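
The column extraction can be pictured with plain lists. This is only a sketch under stated assumptions: one_column is a hypothetical helper, it receives the timestep as a number instead of reading it from parent_dataset, and it skips the timestep step unless both the function and the timestep are supplied.

```python
def one_column(rows, col_idx, apply_timestep_func=None, timestep=None):
    """Return column col_idx of a 2d list, optionally transformed per value."""
    column = [row[col_idx] for row in rows]
    # Only transform when both a function and a timestep were supplied.
    if apply_timestep_func is not None and timestep is not None:
        column = [apply_timestep_func(value, timestep) for value in column]
    return column


rows = [[10, 1], [20, 2], [30, 3]]

second_column = one_column(rows, 1)
first_as_rates = one_column(rows, 0,
                            apply_timestep_func=lambda x, y: x / y,
                            timestep=10)
```
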

tokio.connectors._hdf5.convert_counts_rates(hdf5_file, from_key, to_rates, *args, **kwargs)[source]

Convert a dataset between counts/sec and counts/timestep

Retrieve a dataset from an HDF5 file, convert it to a MappedDataset, and attach a multiply/divide function to it so that subsequent slices return a transformed set of data.

  • hdf5_file (h5py.File) – object from which dataset should be loaded
  • from_key (str) – dataset name key to load from hdf5_file
  • to_rates (bool) – convert from per-timestep to per-sec (True) or per-sec to per-timestep (False)

Returns: A MappedDataset configured to convert to/from rates when dereferenced

tokio.connectors._hdf5.demux_column(hdf5_file, from_key, column, apply_timestep_func=None, *args, **kwargs)[source]

Extract a single column from an HDF5 dataset

MappedDataset map function to present a single column from a dataset as an entire dataset. Required to bridge the h5lmt metadata table (which encodes all metadata ops in a single dataset) and the TOKIO HDF5 format (which encodes a single metadata op per dataset)

  • hdf5_file (h5py.File) – the HDF5 file containing the dataset of interest
  • from_key (str) – the dataset name from which a column should be extracted
  • column (str) – the column heading to be returned
  • transpose (bool) – transpose the dataset before returning it

Returns: A MappedDataset configured to extract a single column when dereferenced
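
To make the bridging idea concrete, the sketch below splits one combined table into one list per column heading, which mirrors the move from a single multiplexed metadata dataset to one dataset per operation. Everything here is illustrative: demux_all is a hypothetical helper, the headings are passed in explicitly (how the real connector discovers them is not shown in this text), and the operation names and values are invented.

```python
def demux_all(rows, headings):
    """Split a 2d table into a dict mapping each heading to its column."""
    return {name: [row[idx] for row in rows]
            for idx, name in enumerate(headings)}


# Hypothetical combined table: one row per timestep, one column per operation.
headings = ["open", "close", "unlink"]
table = [
    [3, 2, 0],
    [5, 5, 1],
]

per_op = demux_all(table, headings)
opens = per_op["open"]
```

Looking up a heading that is absent raises KeyError, which is one simple way to surface a misspelled column name early.
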

tokio.connectors._hdf5.get_timestamps(hdf5_file, dataset_name)[source]

Return the timestamps dataset for a given dataset name

tokio.connectors._hdf5.get_timestamps_key(hdf5_file, dataset_name)[source]

Read an HDF5 file and extract the name of the dataset containing the timestamps that correspond to the given dataset_name

tokio.connectors._hdf5.map_dataset(hdf5_file, from_key, *args, **kwargs)[source]

Create a MappedDataset

Creates a MappedDataset from an h5py.File (or derivative). Functionally similar to h5py.File.__getitem__().


Divide a dataset name into its base and modifier

Parameters: dataset_name (str) – Key to reference a dataset that may or may not have a modifier suffix
Returns: First string is the base key, the second string is the modifier.
Return type: tuple of (str, str or None)
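
The return convention can be illustrated as follows. This sketch assumes, without confirmation from the text above, that a modifier is a trailing path component drawn from a known set. split_dataset_name, KNOWN_MODIFIERS, and the dataset names are all hypothetical.

```python
KNOWN_MODIFIERS = ("missing",)  # hypothetical set of modifier suffixes


def split_dataset_name(dataset_name):
    """Return (base, modifier), with modifier None when no suffix matches."""
    base, sep, suffix = dataset_name.rpartition("/")
    # Only treat the last path component as a modifier if it is a known one.
    if sep and suffix in KNOWN_MODIFIERS:
        return base, suffix
    return dataset_name, None


with_modifier = split_dataset_name("datatargets/readbytes/missing")
without_modifier = split_dataset_name("datatargets/readbytes")
```
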