tokio.connectors.mmperfmon module

Connectors for the GPFS mmperfmon query usage and mmperfmon query gpfsNumberOperations commands.

The typical output of mmperfmon query usage may look something like:

Legend:
1: xxxxxxxx.nersc.gov|CPU|cpu_user
2: xxxxxxxx.nersc.gov|CPU|cpu_sys
3: xxxxxxxx.nersc.gov|Memory|mem_total
4: xxxxxxxx.nersc.gov|Memory|mem_free
5: xxxxxxxx.nersc.gov|Network|lo|net_r
6: xxxxxxxx.nersc.gov|Network|lo|net_s

Row           Timestamp cpu_user cpu_sys   mem_total    mem_free     net_r     net_s
  1 2019-01-11-10:00:00      0.2    0.56  31371.0 MB  18786.5 MB    1.7 kB    1.7 kB
  2 2019-01-11-10:01:00     0.22    0.57  31371.0 MB  18785.6 MB    1.7 kB    1.7 kB
  3 2019-01-11-10:02:00     0.14    0.55  31371.0 MB  18785.1 MB    1.7 kB    1.7 kB

By contrast, the typical output of mmperfmon query gpfsnsdds is:

Legend:
1: xxxxxxxx.nersc.gov|GPFSNSDDisk|na07md01|gpfs_nsdds_bytes_read
2: xxxxxxxx.nersc.gov|GPFSNSDDisk|na07md02|gpfs_nsdds_bytes_read
3: xxxxxxxx.nersc.gov|GPFSNSDDisk|na07md03|gpfs_nsdds_bytes_read

Row           Timestamp gpfs_nsdds_bytes_read gpfs_nsdds_bytes_read gpfs_nsdds_bytes_read
  1 2019-03-04-16:01:00             203539391                     0                     0
  2 2019-03-04-16:02:00             175109739                     0                     0
  3 2019-03-04-16:03:00              57053762                     0                     0

In general, each Legend: entry has the format:

col_number: hostname|subsystem[|device_id]|counter_name

where

col_number is an arbitrary number
hostname is the fully qualified NSD server hostname
subsystem is the type of component being measured (CPU, memory, network, disk)
device_id is optional and represents the instance of the subsystem being measured (e.g., CPU core ID, network interface, or disk identifier)
counter_name is the specific metric being measured
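
The entry format above can be illustrated with a minimal parsing sketch. The helper parse_legend_entry and the LegendEntry tuple are hypothetical names introduced here for illustration; they are not part of tokio.connectors.mmperfmon, and the sketch treats the col_number prefix as optional.

```python
from typing import NamedTuple, Optional


class LegendEntry(NamedTuple):
    """Components of one Legend: line (hypothetical container)."""
    col_number: Optional[int]
    hostname: str
    subsystem: str
    device_id: Optional[str]
    counter_name: str


def parse_legend_entry(line):
    """Split 'col_number: hostname|subsystem[|device_id]|counter_name'."""
    text = line.strip()
    col_number = None
    # Assumption: the "col_number:" prefix may be absent
    prefix, sep, rest = text.partition(":")
    if sep and prefix.strip().isdigit():
        col_number = int(prefix)
        text = rest.strip()
    fields = text.split("|")
    if len(fields) == 3:
        # No device_id present (e.g., CPU or Memory counters)
        hostname, subsystem, counter_name = fields
        device_id = None
    elif len(fields) == 4:
        hostname, subsystem, device_id, counter_name = fields
    else:
        raise ValueError("unrecognized legend entry: %r" % line)
    return LegendEntry(col_number, hostname, subsystem, device_id, counter_name)


cpu = parse_legend_entry("1: xxxxxxxx.nersc.gov|CPU|cpu_user")
net = parse_legend_entry("xxxxxxxx.nersc.gov|Network|lo|net_r")
```

Here cpu.device_id is None because the CPU entry has only three fields, while net.device_id is the interface name "lo".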

Note that mmperfmon labels each sample with the end of its interval: a timestamp of 2019-03-04-16:01:00, for example, contains all data from the period between 2019-03-04-16:00:00 and 2019-03-04-16:01:00.
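
As a rough sketch of this convention, the interval covered by a timestamp label can be recovered by subtracting the sampling period from the label. The helper interval_for is hypothetical, and the 60-second default is an assumption based on the one-minute spacing in the sample output above.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the sample mmperfmon output
TIMESTAMP_FORMAT = "%Y-%m-%d-%H:%M:%S"


def interval_for(label, bucket_seconds=60):
    """Return (start, end) of the period a timestamp label covers.

    bucket_seconds is an assumption; the label marks the END of the period.
    """
    end = datetime.strptime(label, TIMESTAMP_FORMAT)
    start = end - timedelta(seconds=bucket_seconds)
    return start, end


start, end = interval_for("2019-03-04-16:01:00")
```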

class tokio.connectors.mmperfmon.Mmperfmon(*args, **kwargs)

Bases: tokio.connectors.common.SubprocessOutputDict

Representation for the mmperfmon query command. Generates a dict of the form:

{
    timestamp0: {
        "something0.nersc.gov": {
            "key0": value0,
            "key1": value1,
            ...
        },
        "something1.nersc.gov": {
            ...
        },
        ...
    },
    timestamp1: {
        ...
    },
    ...
}
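
A structure of this shape can be walked with three nested loops. The sketch below uses a plain dict rather than an Mmperfmon instance; the generator iter_measurements is a hypothetical helper, and the use of timestamp strings as keys is an assumption, since the key type is not stated here.

```python
def iter_measurements(data):
    """Yield (timestamp, hostname, key, value) from the nested dict."""
    for timestamp in sorted(data):
        for hostname, counters in sorted(data[timestamp].items()):
            for key, value in sorted(counters.items()):
                yield timestamp, hostname, key, value


# Hand-built stand-in for the structure described above
sample = {
    "2019-01-11-10:00:00": {
        "something0.nersc.gov": {"cpu_user": 0.2, "cpu_sys": 0.56},
    },
    "2019-01-11-10:01:00": {
        "something0.nersc.gov": {"cpu_user": 0.22, "cpu_sys": 0.57},
    },
}

records = list(iter_measurements(sample))
```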

__repr__()

Returns string representation of self

This does not convert back into a format that attempts to resemble the mmperfmon output because the process of loading mmperfmon output is lossy.

classmethod from_file(cache_file): Instantiate from a cache file

classmethod from_str(input_str): Instantiate from a string

load(cache_file=None)

Load either a tarfile, directory, or single mmperfmon output file

Tries to load self.cache_file; if it is a directory or tarfile, it is handled by self.load_multiple; otherwise falls through to the load_str code path.
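
The dispatch described above might look roughly like the following. This is a sketch under stated assumptions, not the method's actual implementation: classify_input is a hypothetical function, and the returned labels simply name which code path would apply.

```python
import os
import tarfile


def classify_input(path):
    """Decide which loading path applies to a given input path."""
    if os.path.isdir(path):
        return "multiple"  # a directory of mmperfmon outputs
    # Guard with isfile() because is_tarfile() raises on missing paths
    if os.path.isfile(path) and tarfile.is_tarfile(path):
        return "multiple"  # a tarball of mmperfmon outputs
    return "single"        # fall through to plain-text parsing
```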

load_cache(cache_file=None)

Loads from one of two formats of cache files

Because self.save_cache() outputs to a different format from self.load_str(), load_cache() must be able to ingest both formats.

load_multiple(input_file)

Load one or more input files from a directory or tarball

Parameters:	input_file (str) – Path to either a directory or a tarfile containing text files, each of which contains the output of a single mmperfmon invocation.
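
One way to read every text file from either kind of input is sketched below. The generator iter_output_texts is a hypothetical helper, and the UTF-8 decoding of tar members is an assumption; each yielded string would then be parsed separately.

```python
import os
import tarfile


def iter_output_texts(input_file):
    """Yield the text of each regular file in a directory or tarball."""
    if os.path.isdir(input_file):
        for name in sorted(os.listdir(input_file)):
            path = os.path.join(input_file, name)
            if os.path.isfile(path):
                with open(path, "r") as handle:
                    yield handle.read()
    else:
        # Mode "r" lets tarfile detect compression transparently
        with tarfile.open(input_file, "r") as tar:
            for member in tar.getmembers():
                if member.isfile():
                    extracted = tar.extractfile(member)
                    # Assumption: member contents are UTF-8 text
                    yield extracted.read().decode("utf-8")
```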

load_str(input_str)

Parses the output of the subprocess to initialize self

Parameters:	input_str (str) – Text output of the `mmperfmon query` command

to_dataframe(by_host=None, by_metric=None): Convert to a pandas.DataFrame

to_dataframe_by_host(host)

Returns data from a specific host as a DataFrame

Parameters:	host (str) – Hostname from which a DataFrame should be constructed
Returns:	All measurements from the given host. Columns correspond to different metrics; indexed in time.
Return type:	pandas.DataFrame

to_dataframe_by_metric(metric)

Returns data for a specific metric as a DataFrame

Parameters:	metric (str) – Metric from which a DataFrame should be constructed
Returns:	All measurements of the given metric for all hosts. Columns represent hosts; indexed in time.
Return type:	pandas.DataFrame
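
The two views differ only in which level of the nested dict is held fixed. The sketch below shows both pivots on a plain dict without pandas; pivot_by_host and pivot_by_metric are hypothetical helpers, not the methods documented above, and the sample values are illustrative.

```python
def pivot_by_host(data, host):
    """Map timestamp -> {metric: value} for a single host."""
    return {
        timestamp: dict(hosts[host])
        for timestamp, hosts in data.items()
        if host in hosts
    }


def pivot_by_metric(data, metric):
    """Map timestamp -> {hostname: value} for a single metric."""
    table = {}
    for timestamp, hosts in data.items():
        row = {
            hostname: counters[metric]
            for hostname, counters in hosts.items()
            if metric in counters
        }
        if row:
            table[timestamp] = row
    return table


# Illustrative data in the timestamp -> host -> metric layout
sample = {
    "2019-01-11-10:00:00": {
        "something0.nersc.gov": {"cpu_user": 0.2, "cpu_sys": 0.56},
        "something1.nersc.gov": {"cpu_user": 0.3},
    },
    "2019-01-11-10:01:00": {
        "something0.nersc.gov": {"cpu_user": 0.22, "cpu_sys": 0.57},
    },
}

by_host = pivot_by_host(sample, "something0.nersc.gov")
by_metric = pivot_by_metric(sample, "cpu_user")
```

Either result is a dict of rows keyed by timestamp, so passing it to pandas.DataFrame.from_dict with orient="index" should give a time-indexed frame whose columns are metrics or hosts, respectively.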

to_json(**kwargs)

Returns a JSON-encoded string representation of self.

Returns:	JSON representation of self
Return type:	str

tokio.connectors.mmperfmon.get_col_pos(line, align=None)

Return column offsets of a left-aligned text table

For example, given the string:

Row           Timestamp cpu_user cpu_sys   mem_total
123456789x123456789x123456789x123456789x123456789x123456789x

this function would return:

[(0, 4), (15, 24), (25, 33), (34, 41), (44, 53)]

for align=None.

Parameters:	line (str) – String from which offsets should be determined
	align (str or None) – Expected column alignment; one of ‘left’, ‘right’, or None (to return the exact start and stop of just the non-space text)
Returns:	List of tuples of integer offsets denoting the start index (inclusive) and stop index (exclusive) for each column.
Return type:	list
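
To show why alignment matters when slicing data rows, the sketch below derives spans for a right-aligned table, where each column extends from the end of the previous header to the end of its own header. The function right_aligned_spans is a hypothetical helper that uses its own 0-based convention, so its offsets need not match what get_col_pos returns.

```python
import re


def right_aligned_spans(header):
    """Return (start, stop) per column, assuming right-aligned columns."""
    spans = []
    prev_end = 0
    for match in re.finditer(r"\S+", header):
        # Each column runs from the previous header's end to this one's end
        spans.append((prev_end, match.end()))
        prev_end = match.end()
    return spans


header = "Row           Timestamp cpu_user cpu_sys   mem_total"
row = "  1 2019-01-11-10:00:00      0.2    0.56  31371.0 MB"

spans = right_aligned_spans(header)
fields = [row[start:stop].strip() for start, stop in spans]
```

Slicing by span rather than splitting on whitespace keeps a value such as "31371.0 MB" together as a single field.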

tokio.connectors.mmperfmon.value_unit_to_bytes(value_unit)

Converts a value+unit string into bytes

Converts a string containing both a numerical value and a unit of that value into a normalized value. For example, “1 MB” will convert to 1048576.

Parameters:	value_unit (str) – Of the format “float str” where float is the value and str is the unit by which value is expressed.
Returns:	Number of bytes represented by value_unit
Return type:	int
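
A minimal sketch of such a conversion follows. The function to_bytes and the UNIT_FACTORS table are hypothetical; the binary multipliers are chosen to agree with the “1 MB” example above, while the set of accepted unit spellings and the truncation to int are assumptions.

```python
# Binary multipliers, chosen so that "1 MB" maps to 1048576.
# The accepted unit spellings are an assumption.
UNIT_FACTORS = {
    "B": 1,
    "kB": 1024,
    "KB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
}


def to_bytes(value_unit):
    """Convert a 'float unit' string such as '1.7 kB' into bytes."""
    value_str, unit = value_unit.split()
    # Fractional bytes are truncated (assumption)
    return int(float(value_str) * UNIT_FACTORS[unit])


one_mb = to_bytes("1 MB")
mem_total = to_bytes("31371.0 MB")
```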