tokio.cli.archive_mmperfmon module

Retrieves mmperfmon counters and stores them in TOKIO TimeSeries format

Command-line tool that loads a tokio.connectors.mmperfmon.Mmperfmon object and encodes it as a TOKIO TimeSeries object. The syntax to create a new HDF5 file is:

$ archive_mmperfmon --timestep=60 --init-start 2019-05-15T00:00:00 \
    --init-end 2019-05-16T00:00:00 mmperfmon.2019-05-15.tgz

where mmperfmon.2019-05-15.tgz stands for one or more files that can be loaded by tokio.connectors.mmperfmon.Mmperfmon.from_file().

When updating an existing HDF5 file, the minimum required syntax is:

$ archive_mmperfmon --timestep=60 mmperfmon.2019-05-15.tgz

The init start/end times are only required when creating an empty HDF5 file.
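The --init-start and --init-end values in the example above look like ISO 8601 timestamps without a time zone, while the Python API documented on this page takes datetime.datetime objects. A minimal sketch of the conversion, assuming the format string below matches what the tool accepts (the source does not state the format explicitly); parse_init_time is a hypothetical helper, not part of tokio:

```python
import datetime

# Assumed format, inferred from the command-line example above
INIT_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"

def parse_init_time(text):
    """Hypothetical helper: convert a CLI-style timestamp to a datetime."""
    return datetime.datetime.strptime(text, INIT_TIME_FORMAT)

init_start = parse_init_time("2019-05-15T00:00:00")
init_end = parse_init_time("2019-05-16T00:00:00")

# The example command therefore covers exactly one day
print(init_end - init_start)
```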

class tokio.cli.archive_mmperfmon.Archiver(init_start, init_end, timestep, num_luns, num_servers, *args, **kwargs)[source]

Bases: dict

A dictionary containing TimeSeries objects

Contains the TimeSeries objects being populated from a remote data source. Implemented as a class so that a single object can store all of the TimeSeries objects that are generated by multiple method calls.

__init__(init_start, init_end, timestep, num_luns, num_servers, *args, **kwargs)[source]

Initializes the archiver and stores its settings

  • init_start (datetime.datetime) – Lower bound of time to be archived, inclusive
  • init_end (datetime.datetime) – Upper bound of time to be archived, exclusive
  • timestep (int) – Number of seconds between successive data points.
  • num_luns (int or None) – Number of LUNs expected to appear in mmperfmon outputs. If None, autodetect.
  • num_servers (int or None) – Number of NSD servers expected to appear in mmperfmon outputs. If None, autodetect.
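Because init_start is inclusive and init_end is exclusive, the pair together with timestep fixes how many rows a dataset spans. The arithmetic can be sketched as follows; count_timesteps and timestep_index are hypothetical helpers, and the sketch assumes that rows are aligned to init_start:

```python
import datetime

def count_timesteps(init_start, init_end, timestep):
    """Hypothetical helper: number of rows covering [init_start, init_end)."""
    span = (init_end - init_start).total_seconds()
    return int(span // timestep)

def timestep_index(timestamp, init_start, timestep):
    """Hypothetical helper: index of the row whose interval holds timestamp."""
    return int((timestamp - init_start).total_seconds() // timestep)

start = datetime.datetime(2019, 5, 15)
end = datetime.datetime(2019, 5, 16)

print(count_timesteps(start, end, 60))  # 1440 rows for one day at 60 s
print(timestep_index(datetime.datetime(2019, 5, 15, 0, 5, 30), start, 60))  # 5
```

With these bounds, a measurement stamped exactly at init_end would fall at index 1440, one past the last valid row, which is what the exclusive upper bound implies.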

Extracts and encodes data from an Mmperfmon object

Uses the mmperfmon connector to populate one or more TimeSeries objects.

Parameters: mmpm (tokio.connectors.mmperfmon.Mmperfmon) – Instance of the mmperfmon connector class containing all of the data to be archived

Convert datasets to deltas where necessary and tack on metadata

Performs a few finishing actions on all datasets contained in self after they have been populated. Such actions are configured entirely in self.config and require no external input.
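To illustrate what converting a dataset to deltas can mean, here is a minimal sketch for one column of cumulative counter samples. It is not the TimeSeries implementation: to_deltas is a hypothetical function, and the handling of the first row and of missing samples (represented by None) is an assumption made for this example.

```python
def to_deltas(samples):
    """Convert cumulative counter samples into per-timestep deltas.

    Assumptions for this sketch: a delta is the difference between a sample
    and the one before it, the first row has no predecessor, and a missing
    sample (None) on either side makes the delta unknown.
    """
    if not samples:
        return []
    deltas = [None]  # nothing to subtract from at the first timestep
    for previous, current in zip(samples, samples[1:]):
        if previous is None or current is None:
            deltas.append(None)
        else:
            deltas.append(current - previous)
    return deltas

print(to_deltas([100, 160, 160, None, 400]))
```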

init_dataset(dataset_name, columns)[source]

Initialize an empty dataset within self

Creates and attaches a TimeSeries object to self

  • dataset_name (str) – name of dataset to be initialized
  • columns (list of str) – columns to initialize
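The pattern described for this class, a dict subclass that carries its own settings and gains one entry per initialized dataset, can be sketched with the standard library alone. DatasetStore is a hypothetical stand-in, and a plain dictionary takes the place of a TimeSeries object; none of this is the actual Archiver code.

```python
class DatasetStore(dict):
    """Hypothetical stand-in for a dictionary of named time series."""

    def __init__(self, timestep, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.timestep = timestep  # settings live alongside the datasets

    def init_dataset(self, dataset_name, columns):
        """Attach an empty placeholder dataset under dataset_name."""
        # A plain dict stands in for a TimeSeries object in this sketch
        self[dataset_name] = {"columns": list(columns), "rows": []}

store = DatasetStore(timestep=60)
store.init_dataset("datatargets/readbytes", ["lun01", "lun02"])
print(sorted(store))
```

The dataset and column names are illustrative only. Keeping everything in one dict-like object means that later steps can iterate over the datasets without tracking them separately.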

Initialize all datasets that can be created from an Mmperfmon instance

This method examines an mmpm and identifies all TimeSeries datasets that can be derived from it, then calculates the dimensions of said datasets based on how many unique columns were found. This is required because the precise number of columns is difficult to generalize a priori on SAN file systems with arbitrarily connected LUNs and servers.

Also caches the mappings between LUN and NSD server names and their functions (data or metadata).

Parameters: mmpm (tokio.connectors.mmperfmon.Mmperfmon) – Object from which possible datasets should be identified and sized.
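Sizing datasets from the data itself amounts to collecting the unique column names seen for each dataset. A minimal sketch, assuming the input can be flattened into (dataset name, column name) pairs; discover_columns, the record layout, and the names used below are hypothetical and do not reflect the connector's actual interface:

```python
def discover_columns(records):
    """Group unique column names by dataset.

    records is a hypothetical iterable of (dataset_name, column_name) pairs
    standing in for whatever the connector exposes.
    """
    columns = {}
    for dataset_name, column_name in records:
        columns.setdefault(dataset_name, set()).add(column_name)
    # Sort so that column order is reproducible between runs
    return {name: sorted(cols) for name, cols in columns.items()}

records = [
    ("datatargets/readbytes", "lun01"),
    ("datatargets/readbytes", "lun02"),
    ("datatargets/readbytes", "lun01"),  # repeated column, counted once
    ("mdtargets/readbytes", "md01"),
]
for name, cols in discover_columns(records).items():
    print(name, len(cols), cols)
```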

Infers the dataset name to which a LUN should belong

Returns the name of the dataset to which a given GPFS LUN belongs. This is required for block-based file systems in which servers serve both data and metadata.

This function relies on tokio.config.CONFIG['mmperfmon_lun_map'].

Parameters: lun_name (str) – The name of a LUN
Returns: The name of a dataset in which lun_name should be filed.
Return type: str
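A name-to-dataset lookup of this kind can be sketched as below. The structure of the configured LUN map is not described in this page, so the sketch assumes a dictionary that maps regular expressions to dataset names; LUN_MAP, its patterns, and lookup_lun_dataset are all hypothetical.

```python
import re

# Hypothetical stand-in for the configured LUN map; the real structure
# and contents may differ.
LUN_MAP = {
    r"^md\d+$": "mdtargets",
    r"^data\d+$": "datatargets",
}

def lookup_lun_dataset(lun_name, lun_map=LUN_MAP):
    """Return the dataset name for lun_name, or None if nothing matches."""
    for pattern, dataset_name in lun_map.items():
        if re.search(pattern, lun_name):
            return dataset_name
    return None

print(lookup_lun_dataset("md03"))
print(lookup_lun_dataset("data17"))
```

Returning None for an unmatched name is a choice made for this sketch; a real implementation might raise an error or fall back to a default dataset instead.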

Infers the type of server (data or metadata) from its name

Returns the type of server that server_name is. This relies on tokio.config.CONFIG['mmperfmon_md_servers'], which encodes a regex that matches metadata server names.

This method only makes sense for GPFS clusters that have distinct metadata servers.

Parameters: server_name (str) – Name of the server
Returns: “mdserver” or “dataserver”
Return type: str
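The classification described above reduces to one regular-expression test. A minimal sketch, in which MD_SERVERS_REGEX is a made-up pattern standing in for the configured value and classify_server is a hypothetical function; whether the real code uses a search or an anchored match is not stated in this page:

```python
import re

# Hypothetical stand-in for the configured metadata-server regex
MD_SERVERS_REGEX = r"^mds\d+"

def classify_server(server_name, md_regex=MD_SERVERS_REGEX):
    """Return "mdserver" if the name matches md_regex, else "dataserver"."""
    if re.search(md_regex, server_name):
        return "mdserver"
    return "dataserver"

print(classify_server("mds01"))
print(classify_server("nsd042"))
```

Every name that does not match is treated as a data server, which is why this approach only makes sense on clusters whose metadata servers are distinct and consistently named.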

Set metadata constants (version, units, etc) on datasets and groups

Parameters: dataset_names (list of str) – datasets whose metadata should be set

tokio.cli.archive_mmperfmon.archive_mmperfmon(init_start, init_end, timestep, num_luns, num_servers, output_file, input_files)[source]

Retrieves remote data and stores it in TOKIO time series format

Given a start and end time, retrieves all of the relevant contents of a remote data source and encodes them in the TOKIO time series HDF5 data format.

  • init_start (datetime.datetime) – The first timestamp to be included in the HDF5 file
  • init_end (datetime.datetime) – The timestamp following the last timestamp to be included in the HDF5 file.
  • timestep (int or None) – Number of seconds between successive entries in the HDF5 file to be created. If None, autodetect.
  • num_luns (int or None) – Number of LUNs expected to appear in mmperfmon outputs. If None, autodetect.
  • num_servers (int or None) – Number of NSD servers expected to appear in mmperfmon outputs. If None, autodetect.
  • output_file (str) – Path to the file to be created.
  • input_files (list of str) – List of paths to input files from which mmperfmon connectors should be instantiated.

tokio.cli.archive_mmperfmon.init_hdf5_file(datasets, init_start, init_end, hdf5_file)[source]

Creates HDF5 datasets within a file based on TimeSeries objects

Idempotently ensures that hdf5_file contains a dataset corresponding to each tokio.timeseries.TimeSeries object contained in the datasets object.

  • datasets (Archiver) – Dictionary keyed by dataset name and whose values are tokio.timeseries.TimeSeries objects. One HDF5 dataset will be created for each TimeSeries object.
  • init_start (datetime.datetime) – If a dataset does not already exist within the HDF5 file, create it using this as a lower bound for the timesteps, inclusive
  • init_end (datetime.datetime) – If a dataset does not already exist within the HDF5 file, create one using this as the upper bound for the timesteps, exclusive
  • hdf5_file (str) – Path to the HDF5 file in which datasets should be initialized

Entry point for the CLI interface