tokio.cli.archive_mmperfmon module
Retrieves mmperfmon counters and stores them in TOKIO TimeSeries format
Command-line tool that loads a tokio.connectors.mmperfmon.Mmperfmon
object and encodes it as a TOKIO TimeSeries object. Syntax to create a new
HDF5 file is:
$ archive_mmperfmon --timestep=60 --init-start 2019-05-15T00:00:00 \
--init-end 2019-05-16T00:00:00 mmperfmon.2019-05-15.tgz
where _mmperfmon.2019-05-15.tgz_ stands for one or more files that can be loaded by
tokio.connectors.mmperfmon.Mmperfmon.from_file().
When updating an existing HDF5 file, the minimum required syntax is:
$ archive_mmperfmon --timestep=60 mmperfmon.2019-05-15.tgz
The init start/end times are only required when creating an empty HDF5 file.
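The two invocations above differ only in whether the initialization bounds are supplied. The following is a minimal sketch of how such a command line could be parsed with argparse. It is not the tool's actual parser: only the flag names `--timestep`, `--init-start`, and `--init-end` come from the examples above, while `build_parser`, `parse_timestamp`, and the `files` argument name are assumptions made for illustration.

```python
import argparse
import datetime

# Assumed from the timestamps used in the examples above
DATE_FMT = "%Y-%m-%dT%H:%M:%S"


def parse_timestamp(text):
    """Convert a string such as 2019-05-15T00:00:00 into a datetime."""
    return datetime.datetime.strptime(text, DATE_FMT)


def build_parser():
    """Build a hypothetical parser mirroring the flags shown above."""
    parser = argparse.ArgumentParser(prog="archive_mmperfmon")
    parser.add_argument("--timestep", type=int, default=None)
    parser.add_argument("--init-start", type=parse_timestamp, default=None)
    parser.add_argument("--init-end", type=parse_timestamp, default=None)
    parser.add_argument("files", nargs="+")
    return parser


# Equivalent of the "create a new HDF5 file" invocation
args = build_parser().parse_args(
    ["--timestep=60", "--init-start", "2019-05-15T00:00:00",
     "--init-end", "2019-05-16T00:00:00", "mmperfmon.2019-05-15.tgz"])
```

In this sketch, omitting `--init-start` and `--init-end` leaves both values as `None`, which corresponds to the update-only invocation.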
class tokio.cli.archive_mmperfmon.Archiver(init_start, init_end, timestep, num_luns, num_servers, *args, **kwargs)
Bases: dict
A dictionary containing TimeSeries objects
Contains the TimeSeries objects being populated from a remote data source. Implemented as a class so that a single object can store all of the TimeSeries objects that are generated by multiple method calls.
__init__(init_start, init_end, timestep, num_luns, num_servers, *args, **kwargs)
Initializes the archiver and stores its settings
Parameters:
- init_start (datetime.datetime) – Lower bound of time to be archived, inclusive
- init_end (datetime.datetime) – Upper bound of time to be archived, exclusive
- timestep (int) – Number of seconds between successive data points.
- num_luns (int or None) – Number of LUNs expected to appear in mmperfmon outputs. If None, autodetect.
- num_servers (int or None) – Number of NSD servers expected to appear in mmperfmon outputs. If None, autodetect.
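To make the inclusive lower bound and exclusive upper bound concrete, here is a minimal sketch that enumerates the timestamps such a range would cover. The helper `timestamps` is hypothetical and is not part of TOKIO; it only illustrates the half-open interval implied by the parameter descriptions above.

```python
import datetime


def timestamps(init_start, init_end, timestep):
    """Yield each timestamp in [init_start, init_end) spaced timestep seconds apart."""
    step = datetime.timedelta(seconds=timestep)
    current = init_start
    while current < init_end:  # init_end itself is never produced
        yield current
        current += step


start = datetime.datetime(2019, 5, 15)
end = datetime.datetime(2019, 5, 16)
rows = list(timestamps(start, end, 60))
```

With a 60-second timestep, one full day yields 1440 rows; the first is 00:00:00 and the last is 23:59:00.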
archive(mmpm)
Extracts and encodes data from an Mmperfmon object
Uses the mmperfmon connector to populate one or more TimeSeries objects.
Parameters: mmpm (tokio.connectors.mmperfmon.Mmperfmon) – Instance of the mmperfmon connector class containing all of the data to be archived
finalize()
Convert datasets to deltas where necessary and tack on metadata
Perform a few finishing actions on all datasets contained in self after they have been populated. Such actions are configured entirely in self.config and require no external input.
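Converting a dataset to deltas means replacing each cumulative value with its difference from the previous time step. The sketch below shows that idea in plain Python. `to_deltas` is a hypothetical helper, and how TOKIO treats the first row or missing values is not described here.

```python
def to_deltas(values):
    """Return the differences between successive values.

    The result has one fewer element than the input because the first
    value has no predecessor to subtract.
    """
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Made-up cumulative counter readings at four successive time steps
cumulative_bytes = [100, 250, 250, 900]
deltas = to_deltas(cumulative_bytes)
```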
init_dataset(dataset_name, columns)
Initialize an empty dataset within self
Creates and attaches a TimeSeries object to self
Parameters:
- dataset_name (str) – name of dataset to be initialized
- columns (list of str) – columns to initialize
init_datasets(mmpm)
Initialize all datasets that can be created from an Mmperfmon instance
This method examines an mmpm and identifies all TimeSeries datasets that can be derived from it, then calculates the dimensions of said datasets based on how many unique columns were found. This is required because the precise number of columns is difficult to generalize a priori on SAN file systems with arbitrarily connected LUNs and servers.
Also caches the mappings between LUN and NSD server names and their functions (data or metadata).
Parameters: mmpm (tokio.connectors.mmperfmon.Mmperfmon) – Object from which possible datasets should be identified and sized.
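The sizing step described above amounts to collecting the distinct column names present in the input before any dataset is allocated. Here is a minimal sketch of that idea. The `samples` structure is a made-up stand-in for parsed mmperfmon output and does not reflect the actual layout of an Mmperfmon object.

```python
def unique_columns(samples):
    """Collect the distinct column names seen across all samples, sorted.

    `samples` is a hypothetical list of dicts, each mapping a column name
    (for example a LUN or server name) to a counter value.
    """
    columns = set()
    for sample in samples:
        columns.update(sample)  # adds the dict's keys
    return sorted(columns)


samples = [
    {"lun01": 10, "lun02": 12},
    {"lun02": 14, "lun03": 9},
]
columns = unique_columns(samples)
num_columns = len(columns)  # width of the dataset to allocate
```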
lun_type(lun_name)
Infers the dataset name to which a LUN should belong
Returns the dataset name in which a given GPFS LUN name belongs. This is required for block-based file systems in which servers serve both data and metadata.
This function relies on tokio.config.CONFIG['mmperfmon_lun_map'].
Parameters: lun_name (str) – The name of a LUN
Returns: The name of a dataset in which lun_name should be filed.
Return type: str
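One way to picture this lookup is a table of name patterns mapped to dataset names. The sketch below assumes the map is a dictionary of regular expressions. That structure, the patterns, the dataset names, and the fallback value are all hypothetical; the real mapping is whatever tokio.config.CONFIG['mmperfmon_lun_map'] contains.

```python
import re

# Hypothetical map of LUN-name patterns to dataset names
LUN_MAP = {
    r"^md\d+$": "mdtargets",
    r"^data\d+$": "datatargets",
}


def lun_type(lun_name, lun_map=LUN_MAP, default="datatargets"):
    """Return the dataset name of the first pattern that matches lun_name."""
    for pattern, dataset_name in lun_map.items():
        if re.search(pattern, lun_name):
            return dataset_name
    return default  # assumed fallback when nothing matches
```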
server_type(server_name)
Infers the type of server (data or metadata) from its name
Returns the type of server that server_name is. This relies on tokio.config.CONFIG['mmperfmon_md_servers'], which encodes a regex that matches metadata server names.
This method only makes sense for GPFS clusters that have distinct metadata servers.
Parameters: server_name (str) – Name of the server
Returns: "mdserver" or "dataserver"
Return type: str
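The classification described above reduces to a single regular-expression test. In this minimal sketch the pattern `MD_SERVER_REGEX` is a made-up example; the real pattern is whatever tokio.config.CONFIG['mmperfmon_md_servers'] holds. Only the two return values are taken from the description above.

```python
import re

# Hypothetical pattern for metadata server names
MD_SERVER_REGEX = r"^mds\d+"


def server_type(server_name, md_regex=MD_SERVER_REGEX):
    """Return "mdserver" if the name matches the metadata regex, else "dataserver"."""
    if re.search(md_regex, server_name):
        return "mdserver"
    return "dataserver"
```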
tokio.cli.archive_mmperfmon.archive_mmperfmon(init_start, init_end, timestep, num_luns, num_servers, output_file, input_files)
Retrieves remote data and stores it in TOKIO time series format
Given a start and end time, retrieves all of the relevant contents of a remote data source and encodes them in the TOKIO time series HDF5 data format.
Parameters:
- init_start (datetime.datetime) – The first timestamp to be included in the HDF5 file
- init_end (datetime.datetime) – The timestamp following the last timestamp to be included in the HDF5 file.
- timestep (int or None) – Number of seconds between successive entries in the HDF5 file to be created. If None, autodetect.
- num_luns (int or None) – Number of LUNs expected to appear in mmperfmon outputs. If None, autodetect.
- num_servers (int or None) – Number of NSD servers expected to appear in mmperfmon outputs. If None, autodetect.
- output_file (str) – Path to the file to be created.
- input_files (list of str) – List of paths to input files from which mmperfmon connectors should be instantiated.
tokio.cli.archive_mmperfmon.init_hdf5_file(datasets, init_start, init_end, hdf5_file)
Creates HDF5 datasets within a file based on TimeSeries objects
Idempotently ensures that hdf5_file contains a dataset corresponding to each tokio.timeseries.TimeSeries object contained in the datasets object.
Parameters:
- datasets (Archiver) – Dictionary keyed by dataset name whose values are tokio.timeseries.TimeSeries objects. One HDF5 dataset will be created for each TimeSeries object.
- init_start (datetime.datetime) – If a dataset does not already exist within the HDF5 file, create it using this as a lower bound for the timesteps, inclusive
- init_end (datetime.datetime) – If a dataset does not already exist within the HDF5 file, create one using this as the upper bound for the timesteps, exclusive
- hdf5_file (str) – Path to the HDF5 file in which datasets should be initialized
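"Idempotently" here means that calling the function again on the same file creates nothing new and leaves existing datasets untouched. The sketch below illustrates that pattern with a plain dict standing in for an open HDF5 file. `ensure_datasets`, the dataset names, and the placeholder rows are hypothetical and do not use the HDF5 API.

```python
def ensure_datasets(store, dataset_names, num_rows):
    """Create any missing dataset in store and leave existing ones alone.

    `store` is a plain dict standing in for an open HDF5 file.
    Returns the list of names that were newly created.
    """
    created = []
    for name in dataset_names:
        if name not in store:
            store[name] = [None] * num_rows  # empty placeholder rows
            created.append(name)
    return created


store = {"existing": [1, 2, 3]}
first = ensure_datasets(store, ["existing", "new"], 3)
second = ensure_datasets(store, ["existing", "new"], 3)
```

The first call creates only the missing dataset, and the second call is a no-op.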