tokio.connectors.hpss module
Connect to various outputs made available by HPSS
class tokio.connectors.hpss.FtpLog(*args, **kwargs)
Bases: tokio.connectors.common.SubprocessOutputDict
Provides an interface for log files containing HPSS FTP transactions
This connector parses FTP logs generated by HPSS 7.3. Older versions are not supported.
HPSS FTP log files contain transfer records that look something like:
    #0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
    Mon Dec 31 00:06:46 2018 dtn01-int.nersc.gov /home/o/operator/.check_ftp.25651 b POPN_Cmd r r ftp operator fd 0
    Mon Dec 31 00:06:46 2018 0.010 dtn01-int.nersc.gov 33 /home/o/opera... b o PRTR_Cmd r ftp operator fd 0
    Mon Dec 31 00:06:48 2018 0.430 sgn-pub-01.nersc.gov 0 /home/g/glock... b o RETR_Cmd r ftp wwwhpss
    Mon Feb 4 16:45:04 2019 457.800 sgn-pub-01.nersc.gov 7184842752 /home/g/glock... b o RETR_Cmd r ftp wwwhpss
    Fri Jul 12 15:32:43 2019 2.080 gert01-224.nersc.gov 2147483647 /home/n/nickb... b i PSTO_Cmd r ftp nickb fd 0
    Mon Jul 29 15:44:22 2019 0.800 dtn02.nersc.gov 464566784 /home/n/nickb... b o PRTR_Cmd r ftp nickb fd 0
which this class deserializes and represents as a dictionary-like object of the form:
    {
        "ftp": [
            {
                "bytes": 0,
                "bytes_sec": 0.0,
                "duration_sec": 0.43,
                "end_timestamp": 1546243608.0,
                "hpss_path": "/home/g/glock...",
                "hpss_uid": "wwwhpss",
                "opname": "HL",
                "remote_host": "sgn-pub-01.nersc.gov",
                "start_timestamp": 1546243607.57
            },
            ...
        ],
        "pftp": [
            {
                "bytes": 33,
                "bytes_sec": 3300.0,
                "duration_sec": 0.01,
                "end_timestamp": 1546243606.0,
                "hpss_path": "/home/o/opera...",
                "hpss_uid": "operator",
                "opname": "HL",
                "remote_host": "dtn01-int.nersc.gov",
                "start_timestamp": 1546243605.99
            },
            ...
        ]
    }
where the top-level keys are either “ftp” or “pftp”, and their values are lists containing every FTP or parallel FTP transaction, respectively.
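As a rough illustration of how the structure above might be consumed, the following sketch totals the bytes moved by each HPSS user across both the "ftp" and "pftp" lists. It operates on a plain dict shaped like the example output; `bytes_per_user` and `sample` are hypothetical names introduced here, not part of `tokio.connectors.hpss`.

```python
from collections import defaultdict


def bytes_per_user(ftp_log):
    """Sum the "bytes" field per "hpss_uid" over ftp and pftp records.

    ftp_log is assumed to be a dict-like object shaped like the example
    output above (hypothetical helper, not a pytokio function).
    """
    totals = defaultdict(int)
    for protocol in ("ftp", "pftp"):
        for record in ftp_log.get(protocol, []):
            totals[record["hpss_uid"]] += record["bytes"]
    return dict(totals)


# Hand-written stand-in for a parsed log; only the fields used are included
sample = {
    "ftp": [{"bytes": 0, "hpss_uid": "wwwhpss"}],
    "pftp": [{"bytes": 33, "hpss_uid": "operator"}],
}
print(bytes_per_user(sample))
```

Because the connector is described as dictionary-like, the same loop should in principle work on a populated FtpLog object, though that is an assumption rather than something this page states.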
class tokio.connectors.hpss.HpssDailyReport(*args, **kwargs)
Bases: tokio.connectors.common.SubprocessOutputDict
Representation for the daily report that HPSS can generate
class tokio.connectors.hpss.HsiLog(*args, **kwargs)
Bases: tokio.connectors.common.SubprocessOutputDict
Provides an interface for log files containing HSI and HTAR transactions
This connector receives input from an HSI log file, which takes the form:

    Sat Aug 10 00:05:26 2019 dtn01.nersc.gov hsi 57074 31117 LH 0 0.02 543608 12356.7 4 /global/project/projectdir... /home/g/glock/... 57074
    Sat Aug 10 00:05:28 2019 cori02-224.nersc.gov htar 58888 14301 create LH 0 58178668032 397.20 146472.0 /nersc/projects/blah.tar 5 58888
    Sat Aug 10 00:05:29 2019 myuniversity.edu hsi 35136 1391 LH -1 0.03 0 0.0 0 xyz.bin /home/g/glock/xyz.bin 35136

The log itself uses both tabs and spaces to separate fields. This connector then presents this data in a dictionary-like form:
    {
        "hsi": [
            {
                "access_latency_sec": 0.03,
                "account_id": 35136,
                "bytes": 0,
                "bytes_sec": 0.0,
                "client_pid": 1035,
                "cos_id": 0,
                "dest_path": "/home/g/glock/blah.bin",
                "hpss_uid": 35136,
                "opname": "LH",
                "remote_host": "someuniv.edu",
                "return_code": -1,
                "source_path": "blah.bin",
                "end_timestamp": 1565420701
            },
            ...
        ],
        "htar": [
            {
                "account_id": 58888,
                "bytes": 58178668032,
                "bytes_sec": 146472.0,
                "client_pid": 14301,
                "cos_id": 5,
                "duration_sec": 397.2,
                "hpss_path": "/nersc/projects/blah.tar",
                "hpss_uid": 58888,
                "htar_op": "create",
                "opname": "LH",
                "remote_ftp_host": "",
                "remote_host": "cori02-224.nersc.gov",
                "return_code": 0,
                "end_timestamp": 1565420728
            }
        ]
    }
where the top-level keys are either “hsi” or “htar”, and their values are lists containing every HSI or HTAR transaction, respectively.
The keys generally follow the raw nomenclature used in the HSI logs, which is documented on Mike Gleicher’s website. Perhaps most relevant are the opnames, each of which is one of:
- FU - file unlink. Has no destination filename field or account id.
- FR - file rename. Has no account id.
- LH - transfer into HPSS (“Local to HPSS”)
- HL - transfer out of HPSS (“HPSS to Local”)
- HH - internal file copy (“HPSS-to-HPSS”)
For posterity,
- access_latency_sec is the time to open the file. This includes the latency to pull the tape and insert it into the drive.
- bytes and bytes_sec are the size and rate of data transfer
- duration_sec is the time to complete the transfer
- return_code is zero on success, nonzero otherwise
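To make the opname and return_code semantics concrete, here is a small sketch that totals the bytes of successful transfers into and out of HPSS. It works on a plain dict shaped like the example output above; `successful_bytes_by_direction` and `sample` are hypothetical names for illustration and are not provided by pytokio.

```python
def successful_bytes_by_direction(hsi_log):
    """Total bytes for successful LH (into HPSS) and HL (out of HPSS) records.

    Records with a nonzero return_code are skipped, as are opnames that
    do not move data across the HPSS boundary (FU, FR, HH).
    """
    totals = {"LH": 0, "HL": 0}
    for tool in ("hsi", "htar"):
        for record in hsi_log.get(tool, []):
            if record.get("return_code") != 0:
                continue  # nonzero means the operation failed
            opname = record.get("opname")
            if opname in totals:
                totals[opname] += record.get("bytes", 0)
    return totals


# Hand-written stand-in containing only the fields this sketch reads
sample = {
    "hsi": [{"opname": "LH", "return_code": -1, "bytes": 0}],
    "htar": [{"opname": "LH", "return_code": 0, "bytes": 58178668032}],
}
print(successful_bytes_by_direction(sample))
```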
tokio.connectors.hpss._find_columns(line, sep='=', gap=' ', strict=False)
Determine the column start/end positions for a header line separator
Takes a line separator such as the one denoted below:
    Host             Users      IO_GB
    ===============  =====  =========
    heart               53   148740.6
and returns a list of (start index, end index) tuples that can be used to slice table rows into column entries.
Parameters:
- line (str) – Text comprised of separator characters and spaces that define the extents of columns
- sep (str) – The character used to draw the column lines
- gap (str) – The character separating sep characters
- strict (bool) – If true, restrict column extents to only include sep characters and not the spaces that follow them.
Return type: list of tuples
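The idea of deriving slice positions from a separator line can be sketched as follows. This is an independent, simplified reimplementation for illustration, not the library's code: `find_columns_sketch` is a hypothetical name, it omits the gap parameter, and its non-strict behavior (extending each column up to the start of the next one) is an assumption based on the description of strict above.

```python
import re


def find_columns_sketch(line, sep="=", strict=False):
    """Return (start, end) slice positions for each run of sep in line."""
    runs = [(m.start(), m.end())
            for m in re.finditer(re.escape(sep) + "+", line)]
    if strict:
        return runs
    # Non-strict: each column also absorbs the gap that follows it
    columns = []
    for idx, (start, _) in enumerate(runs):
        end = runs[idx + 1][0] if idx + 1 < len(runs) else len(line)
        columns.append((start, end))
    return columns


# Separator and data row built explicitly so the column widths are unambiguous
separator = "=" * 15 + "  " + "=" * 5 + "  " + "=" * 9
row = "heart".ljust(15) + "  " + "53".rjust(5) + "  " + "148740.6".rjust(9)

columns = find_columns_sketch(separator)
print([row[start:end].strip() for start, end in columns])
```

Slicing by position rather than splitting on whitespace matters when a cell is empty or contains spaces, which is presumably why the separator line is used as the source of truth.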
tokio.connectors.hpss._get_ascii_resolution(numeric_str)
Determines the maximum resolution of an ascii-encoded numerical value
Necessary because HPSS logs contain numeric values at different and often-insufficient resolutions. For example, tiny but finite transfers can show up as taking 0.000 seconds, which results in infinitely fast transfers when calculated naively. This function gives us a means to guess at what the real speed might’ve been.
Does not work with scientific notation.
Parameters: numeric_str (str) – An ascii-encoded integer or float
Returns: The smallest number that can be expressed using the resolution provided with numeric_str
Return type: float
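A minimal sketch of the resolution calculation described above, assuming plain decimal notation only. `ascii_resolution_sketch` and `bounded_rate` are hypothetical names; the second function shows one possible way to use the resolution to avoid dividing by a zero duration, and is not necessarily how the connector itself applies it.

```python
def ascii_resolution_sketch(numeric_str):
    """Smallest value expressible at the precision of numeric_str.

    "0.000" has three decimal places, so its resolution is 0.001;
    a string without a decimal point has a resolution of 1.0.
    Scientific notation is not handled.
    """
    text = numeric_str.strip()
    if "." not in text:
        return 1.0
    decimals = len(text.split(".", 1)[1])
    return 1.0 / (10 ** decimals)


def bounded_rate(nbytes, duration_str):
    """Estimate bytes/sec, substituting the resolution for a zero duration."""
    duration = float(duration_str)
    if duration == 0.0:
        # The true duration was below the logged precision, so treat the
        # resolution as a stand-in instead of dividing by zero
        duration = ascii_resolution_sketch(duration_str)
    return nbytes / duration


print(ascii_resolution_sketch("0.000"))
print(bounded_rate(33, "0.000"))
```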
tokio.connectors.hpss._hpss_timedelta_to_secs(timedelta_str)
Convert HPSS-encoded timedelta string into seconds
Parameters: timedelta_str (str) – String in form d-HH:MM:SS where d is the number of days, HH is hours, MM minutes, and SS seconds
Returns: number of seconds represented by timedelta_str
Return type: int
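The conversion from the d-HH:MM:SS form to seconds can be sketched as below. `hpss_timedelta_to_secs_sketch` is a hypothetical stand-in written for illustration; it assumes the day field and hyphen are always present and does no validation.

```python
def hpss_timedelta_to_secs_sketch(timedelta_str):
    """Convert a "d-HH:MM:SS" string into an integer number of seconds."""
    days, hms = timedelta_str.strip().split("-", 1)
    hours, minutes, seconds = hms.split(":")
    total_hours = int(days) * 24 + int(hours)
    return (total_hours * 60 + int(minutes)) * 60 + int(seconds)


# 1 day + 2 hours + 3 minutes + 4 seconds
print(hpss_timedelta_to_secs_sketch("1-02:03:04"))  # 93784
```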
tokio.connectors.hpss._parse_section(lines, start_line=0)
Parse a single table of the HPSS daily report
Converts a table from the HPSS daily report into a dictionary. For example, a table may appear as:
    Archive : IO Totals by HPSS Client Gateway (UI) Host
    Host             Users      IO_GB       Ops
    ===============  =====  =========  ========
    heart               53   148740.6     27991
    dtn11                5    29538.6      1694
    Total               58   178279.2     29685

    HPSS ACCOUNTING:         224962.6
which will return a dict of form:
    {
        "system": "archive",
        "title": "io totals by hpss client gateway (ui) host",
        "records": {
            "heart": {
                "io_gb": "148740.6",
                "ops": "27991",
                "users": "53",
            },
            "dtn11": {
                "io_gb": "29538.6",
                "ops": "1694",
                "users": "5",
            },
            "total": {
                "io_gb": "178279.2",
                "ops": "29685",
                "users": "58",
            }
        }
    }
This function is robust to invalid data, and any lines that do not appear to be a valid table will be treated as the end of the table.
Parameters:
- lines (list of str) – Text of the HPSS report
- start_line (int) – Index of lines defined such that
  - lines[start_line] is the table title
  - lines[start_line + 1] is the table heading row
  - lines[start_line + 2] is the line separating the table heading and the first row of data
  - lines[start_line + 3:] are the rows of the table
Returns: Tuple of (dict, int) where
- dict contains the parsed contents of the table
- int is the index of the last line of the table + 1
Return type: tuple
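Note that every value in the example dict above is a string. A caller who wants to do arithmetic on a parsed table might convert the values first, roughly as in this sketch. `numeric_records` is a hypothetical helper, and it assumes the keyed "records" layout shown in the example and that every field is numeric.

```python
def numeric_records(section):
    """Return a copy of section["records"] with every value cast to float."""
    return {
        name: {field: float(value) for field, value in fields.items()}
        for name, fields in section["records"].items()
    }


# Hand-written stand-in for one parsed table
section = {
    "system": "archive",
    "title": "io totals by hpss client gateway (ui) host",
    "records": {
        "heart": {"io_gb": "148740.6", "ops": "27991", "users": "53"},
        "dtn11": {"io_gb": "29538.6", "ops": "1694", "users": "5"},
    },
}
converted = numeric_records(section)
print(converted["heart"]["io_gb"] + converted["dtn11"]["io_gb"])
```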
tokio.connectors.hpss._rekey_table(table, key)
Converts a list of records into a dict of records
Converts a table of records as returned by _parse_section() of the form:
    {
        "records": [
            {
                "host": "heart",
                "io_gb": "148740.6",
                "ops": "27991",
                "users": "53",
            },
            ...
        ]
    }
into a table of key-value pairs of the form:
    {
        "records": {
            "heart": {
                "io_gb": "148740.6",
                "ops": "27991",
                "users": "53",
            },
            ...
        }
    }
Does not handle degenerate (duplicate) keys when re-keying, so only tables with a uniquely identifying key can be rekeyed.
Returns: Table with records expressed as key-value pairs instead of a list
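The re-keying transformation can be sketched in a few lines. `rekey_table_sketch` is a hypothetical reimplementation for illustration, not the library's code. In this version a duplicate key silently overwrites the earlier record, which is one way the lack of degenerate-key handling could show up.

```python
def rekey_table_sketch(table, key):
    """Turn table["records"] from a list of dicts into a dict keyed on key.

    The key field is removed from each record and used as its new index.
    The input table is left unmodified.
    """
    rekeyed = {}
    for record in table["records"]:
        record = dict(record)  # copy so the caller's record is untouched
        rekeyed[record.pop(key)] = record
    result = dict(table)
    result["records"] = rekeyed
    return result


table = {
    "records": [
        {"host": "heart", "io_gb": "148740.6", "ops": "27991", "users": "53"},
        {"host": "dtn11", "io_gb": "29538.6", "ops": "1694", "users": "5"},
    ]
}
print(rekey_table_sketch(table, "host")["records"]["heart"])
```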