tokio.connectors.mmperfmon module

Connectors for the output of the GPFS mmperfmon query usage and mmperfmon query gpfsNumberOperations commands.

The typical output of mmperfmon query usage may look something like:

Legend:
 1: xxxxxxxx.nersc.gov|CPU|cpu_user
 2: xxxxxxxx.nersc.gov|CPU|cpu_sys
 3: xxxxxxxx.nersc.gov|Memory|mem_total
 4: xxxxxxxx.nersc.gov|Memory|mem_free
 5: xxxxxxxx.nersc.gov|Network|lo|net_r
 6: xxxxxxxx.nersc.gov|Network|lo|net_s

Row           Timestamp cpu_user cpu_sys   mem_total    mem_free     net_r     net_s
  1 2019-01-11-10:00:00      0.2    0.56  31371.0 MB  18786.5 MB    1.7 kB    1.7 kB
  2 2019-01-11-10:01:00     0.22    0.57  31371.0 MB  18785.6 MB    1.7 kB    1.7 kB
  3 2019-01-11-10:02:00     0.14    0.55  31371.0 MB  18785.1 MB    1.7 kB    1.7 kB

Whereas the typical output of mmperfmon query gpfsnsdds is:

Legend:
 1: xxxxxxxx.nersc.gov|GPFSNSDDisk|na07md01|gpfs_nsdds_bytes_read
 2: xxxxxxxx.nersc.gov|GPFSNSDDisk|na07md02|gpfs_nsdds_bytes_read
 3: xxxxxxxx.nersc.gov|GPFSNSDDisk|na07md03|gpfs_nsdds_bytes_read

Row           Timestamp gpfs_nsdds_bytes_read gpfs_nsdds_bytes_read gpfs_nsdds_bytes_read
  1 2019-03-04-16:01:00             203539391                     0                     0
  2 2019-03-04-16:02:00             175109739                     0                     0
  3 2019-03-04-16:03:00              57053762                     0                     0

In general, each Legend: entry has the format:

col_number: hostname|subsystem[|device_id]|counter_name

where

  • col_number is an arbitrary number
  • hostname is the fully qualified NSD server hostname
  • subsystem is the type of component being measured (CPU, memory, network, disk)
  • device_id is optional and represents the instance of the subsystem being measured (e.g., CPU core ID, network interface, or disk identifier)
  • counter_name is the specific metric being measured
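
The Legend: format above can be split with ordinary string operations. The following is a minimal sketch, not the connector's own parser; LegendEntry and parse_legend_line are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple, Optional


class LegendEntry(NamedTuple):
    """Hypothetical container for one parsed Legend: line."""
    col_number: int
    hostname: str
    subsystem: str
    device_id: Optional[str]  # None when the entry has no device field
    counter_name: str


def parse_legend_line(line: str) -> LegendEntry:
    """Split 'col_number: hostname|subsystem[|device_id]|counter_name'."""
    col_number, _, spec = line.strip().partition(":")
    fields = spec.strip().split("|")
    if len(fields) == 3:
        hostname, subsystem, counter_name = fields
        device_id = None
    elif len(fields) == 4:
        hostname, subsystem, device_id, counter_name = fields
    else:
        raise ValueError("unexpected legend entry: %r" % line)
    return LegendEntry(int(col_number), hostname, subsystem, device_id, counter_name)


# Entries copied from the sample output above
cpu = parse_legend_line(" 1: xxxxxxxx.nersc.gov|CPU|cpu_user")
net = parse_legend_line(" 5: xxxxxxxx.nersc.gov|Network|lo|net_r")
```

Here cpu.device_id is None because the CPU entry has only three fields, while net.device_id is "lo".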

Note that mmperfmon labels each row with the end of its measurement interval: a timestamp of 2019-03-04-16:01:00, for example, contains all data from the period between 2019-03-04-16:00:00 and 2019-03-04-16:01:00.
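
To make the timestamp convention concrete, the sketch below recovers the interval that a given label covers. It assumes a 60-second sampling period, which matches the sample output above but is not guaranteed for every query; covered_interval and TIMESTAMP_FORMAT are hypothetical names, not part of this module.

```python
import datetime

# Format of the Timestamp column as it appears in the sample output
TIMESTAMP_FORMAT = "%Y-%m-%d-%H:%M:%S"


def covered_interval(label, bucket_seconds=60):
    """Return (start, end) of the period that a timestamp label describes.

    bucket_seconds is an assumption; adjust it to the sampling period
    actually used by the query.
    """
    end = datetime.datetime.strptime(label, TIMESTAMP_FORMAT)
    start = end - datetime.timedelta(seconds=bucket_seconds)
    return start, end


start, end = covered_interval("2019-03-04-16:01:00")
```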

class tokio.connectors.mmperfmon.Mmperfmon(*args, **kwargs)

Bases: tokio.connectors.common.SubprocessOutputDict

Representation of the output of the mmperfmon query command. Generates a dict of the form:

{
    timestamp0: {
        "something0.nersc.gov": {
            "key0": value0,
            "key1": value1,
            ...
        },
        "something1.nersc.gov": {
            ...
        },
        ...
    },
    timestamp1: {
        ...
    },
    ...
}
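
A structure of this shape can be traversed with two nested loops. The sketch below works on a plain dict rather than an Mmperfmon instance; pivot_by_metric is a hypothetical helper, the timestamp keys 0 and 60 are placeholders (the key type is not specified here), and the values are borrowed from the sample output purely for illustration.

```python
def pivot_by_metric(data, metric):
    """Return {timestamp: {hostname: value}} for a single metric."""
    pivoted = {}
    for timestamp, hosts in data.items():
        for hostname, counters in hosts.items():
            # Hosts that never reported this metric are simply skipped
            if metric in counters:
                pivoted.setdefault(timestamp, {})[hostname] = counters[metric]
    return pivoted


# Illustrative data only: placeholder timestamps and hostnames
sample = {
    0: {
        "something0.nersc.gov": {"cpu_user": 0.2, "cpu_sys": 0.56},
        "something1.nersc.gov": {"cpu_user": 0.22},
    },
    60: {
        "something0.nersc.gov": {"cpu_user": 0.14, "cpu_sys": 0.55},
    },
}

cpu_sys = pivot_by_metric(sample, "cpu_sys")
```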

__repr__()

Returns string representation of self

This does not convert back into a format that attempts to resemble the mmperfmon output because the process of loading mmperfmon output is lossy.

classmethod from_file(cache_file)

Instantiate from a cache file

classmethod from_str(input_str)

Instantiate from a string

load(cache_file=None)

Load either a tarfile, directory, or single mmperfmon output file

Tries to load self.cache_file; if it is a directory or tarfile, it is handled by self.load_multiple; otherwise falls through to the load_str code path.
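
The dispatch described above can be approximated with standard-library checks. This is a sketch of the decision only, not the method's actual implementation; classify_input is a hypothetical name, and it returns the name of the code path rather than loading anything.

```python
import os
import tarfile


def classify_input(path):
    """Return which loader a path of this kind would be routed to."""
    if os.path.isdir(path):
        return "load_multiple"
    # is_tarfile() raises on a missing file, so check existence first
    if os.path.isfile(path) and tarfile.is_tarfile(path):
        return "load_multiple"
    # Anything else is treated as a single mmperfmon output file
    return "load_str"
```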

load_cache(cache_file=None)

Loads from one of two formats of cache files

Because self.save_cache() outputs to a different format from self.load_str(), load_cache() must be able to ingest both formats.

load_multiple(input_file)

Load one or more input files from a directory or tarball

Parameters:
  • input_file (str) – Path to either a directory or a tarfile containing multiple text files, each of which contains the output of a single mmperfmon invocation.

load_str(input_str)

Parses the output of the subprocess to initialize self

Parameters: input_str (str) – Text output of the mmperfmon query command

to_dataframe(by_host=None, by_metric=None)

Convert to a pandas.DataFrame

to_dataframe_by_host(host)

Returns data from a specific host as a DataFrame

Parameters: host (str) – Hostname from which a DataFrame should be constructed
Returns: All measurements from the given host. Columns correspond to different metrics; indexed in time.
Return type: pandas.DataFrame

to_dataframe_by_metric(metric)

Returns data for a specific metric as a DataFrame

Parameters: metric (str) – Metric from which a DataFrame should be constructed
Returns: All measurements of the given metric for all hosts. Columns represent hosts; indexed in time.
Return type: pandas.DataFrame

to_json(**kwargs)

Returns a json-encoded string representation of self.

Returns: JSON representation of self
Return type: str

tokio.connectors.mmperfmon.get_col_pos(line, align=None)

Return column offsets of a left-aligned text table

For example, the string:

Row           Timestamp cpu_user cpu_sys   mem_total
123456789x123456789x123456789x123456789x123456789x123456789x

would return:

[(0, 4), (15, 24), (25, 33), (34, 41), (44, 53)]

for align=None.

Parameters:
  • line (str) – String from which offsets should be determined
  • align (str or None) – Expected column alignment; one of ‘left’, ‘right’, or None (to return the exact start and stop of just the non-space text)
Returns: List of tuples of integer offsets denoting the start index (inclusive) and stop index (exclusive) for each column.
Return type: list
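
One way to locate the start and stop of each run of non-space text (the align=None case) is a regular-expression scan. The sketch below is not this function's implementation and makes no attempt at the 'left' or 'right' modes; column_spans is a hypothetical name, and its offsets are 0-indexed as reported by Python's re module.

```python
import re


def column_spans(line):
    """Return (start, stop) offsets of each whitespace-delimited token."""
    # \S+ matches one maximal run of non-space characters per column
    return [(match.start(), match.end()) for match in re.finditer(r"\S+", line)]


spans = column_spans("ab  cde f")
# spans == [(0, 2), (4, 7), (8, 9)]
```

Each stop offset is exclusive, so line[start:stop] recovers the column text.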

tokio.connectors.mmperfmon.value_unit_to_bytes(value_unit)

Converts a value+unit string into bytes

Converts a string containing both a numerical value and a unit of that value into a normalized value. For example, “1 MB” will convert to 1048576.

Parameters: value_unit (str) – Of the format “float str” where float is the value and str is the unit by which value is expressed.
Returns: Number of bytes represented by value_unit
Return type: int
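
The conversion can be sketched as a lookup of a multiplier keyed on the unit string. The example below is an illustration rather than this function's source: to_bytes and the _MULTIPLIERS table are hypothetical, and the binary (1024-based) multipliers are inferred from the statement that “1 MB” converts to 1048576.

```python
# Assumed unit table; only kB and MB appear in the sample output above
_MULTIPLIERS = {
    "B": 1,
    "kB": 2**10,
    "KB": 2**10,
    "MB": 2**20,
    "GB": 2**30,
    "TB": 2**40,
}


def to_bytes(value_unit):
    """Convert a 'float str' string such as '1 MB' into a byte count."""
    value, unit = value_unit.split()
    return int(float(value) * _MULTIPLIERS[unit])


one_mb = to_bytes("1 MB")
```

An unrecognized unit raises KeyError in this sketch, which keeps a silent mis-scaling from going unnoticed.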