tokio.cli.summarize_job module

Take a Darshan log or a pair of job start/end times and pull scalar data from every available TOKIO connector/tool configured for the system, presenting a single system-wide view of performance for the time during which that job was running.

tokio.cli.summarize_job._identify_fs_from_path(path, mounts)[source]

Scan a list of mount points and try to identify the one that matches the given path
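The matching behavior described above can be sketched as a longest-prefix match over the mount table. The function below and its sample mount points are hypothetical illustrations of that idea, not pytokio's actual implementation (which also maps the winning mount point back to a file system name):

```python
import os

def identify_fs_from_path_sketch(path, mounts):
    """Return the longest mount point that is a path prefix of ``path``.

    Hypothetical sketch: normalizes trailing slashes so that, e.g.,
    "/scratch1" does not spuriously match a path on "/scratch10".
    """
    best = None
    abspath = os.path.abspath(path)
    for mount in mounts:
        norm = mount.rstrip('/') or '/'
        # root matches everything; otherwise require a whole-component prefix
        if norm == '/' or abspath == norm or abspath.startswith(norm + '/'):
            if best is None or len(norm) > len(best):
                best = norm
    return best

# Example with made-up mount points
mounts = ['/', '/scratch1', '/scratch1/sub']
print(identify_fs_from_path_sketch('/scratch1/sub/file.bin', mounts))
# /scratch1/sub
```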


Determine the most-used API and file system based on the Darshan log


Determine the most-used file system based on the Darshan log


Entry point for the CLI interface

tokio.cli.summarize_job.merge_dicts(dict1, dict2, assertion=True, prefix=None)[source]

Take two dictionaries and merge their keys. Optionally raise an exception if a duplicate key is found, and optionally merge the new dict into the old after adding a prefix to every key.
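A minimal sketch of the documented behavior follows. The function name, the choice of `KeyError` for duplicates, and the sample prefix are all hypothetical; only the merge/assert/prefix semantics come from the docstring above:

```python
def merge_dicts_sketch(dict1, dict2, assertion=True, prefix=None):
    """Merge ``dict2`` into ``dict1`` in place and return ``dict1``.

    Hypothetical sketch: optionally raise on duplicate keys, and
    optionally prepend ``prefix`` to every incoming key.
    """
    for key, value in dict2.items():
        new_key = (prefix + key) if prefix else key
        if assertion and new_key in dict1:
            raise KeyError("duplicate key: %s" % new_key)
        dict1[new_key] = value
    return dict1

# Example with a made-up prefix
result = merge_dicts_sketch({'a': 1}, {'b': 2}, prefix='darshan_')
print(result)  # {'a': 1, 'darshan_b': 2}
```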

tokio.cli.summarize_job.retrieve_concurrent_job_data(results, jobhost, concurrentjobs)[source]

Get information about all jobs that were running during a time period

tokio.cli.summarize_job.retrieve_darshan_data(results, darshan_log_file, silent_errors=False)[source]

Extract the performance data from the Darshan log

tokio.cli.summarize_job.retrieve_jobid(results, jobid, file_count)[source]

Get JobId from either Slurm or the CLI argument
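The fallback between the two sources might look like the sketch below. The precedence order shown (CLI argument first, then the Slurm environment) is an assumption for illustration, and the real function also records the job id alongside a file count in the results dictionary:

```python
import os

def retrieve_jobid_sketch(cli_jobid=None):
    """Return a job id string from the CLI argument if given, otherwise
    fall back to the Slurm environment.

    Hypothetical sketch; returns None when neither source is available.
    """
    if cli_jobid is not None:
        return str(cli_jobid)
    # SLURM_JOB_ID is set by Slurm inside an allocation
    return os.environ.get('SLURM_JOB_ID')

print(retrieve_jobid_sketch(123456))  # 123456
```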

tokio.cli.summarize_job.retrieve_lmt_data(results, file_system)[source]

Figure out the H5LMT file corresponding to this run

tokio.cli.summarize_job.retrieve_ost_data(results, ost, ost_fullness=None, ost_map=None)[source]

Get Lustre server status via lfsstatus tool

tokio.cli.summarize_job.retrieve_topology_data(results, jobinfo_cache_file, nodemap_cache_file)[source]

Get the diameter of the job (Cray XC)


Special serializer function that converts a datetime object into a form that can be encoded as JSON
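A serializer like this is typically passed as the `default` hook of `json.dumps`. The sketch below renders datetimes as ISO 8601 strings; the actual representation pytokio uses (e.g. ISO strings vs. epoch seconds) is an assumption here:

```python
import datetime
import json

def serialize_datetime_sketch(obj):
    """Hypothetical ``default`` handler for json.dumps: render datetime
    and date objects as ISO 8601 strings, reject everything else."""
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    raise TypeError("%r is not JSON serializable" % obj)

record = {'start': datetime.datetime(2019, 1, 1, 12, 0, 0)}
print(json.dumps(record, default=serialize_datetime_sketch))
# {"start": "2019-01-01T12:00:00"}
```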

tokio.cli.summarize_job.summarize_byterate_df(dataframe, readwrite, timestep=None)[source]

Calculate some interesting statistics from a dataframe containing byte rate data.
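The kind of scalar statistics meant here can be sketched without pandas by treating one dataframe column as a plain series of byte rates. The function name and the output key names below are hypothetical, not pytokio's actual counters:

```python
def summarize_byterate_sketch(rates, readwrite, timestep=1.0):
    """Compute simple scalar statistics from a series of byte rates.

    Hypothetical sketch: ``rates`` stands in for one dataframe column of
    bytes/sec samples, one per ``timestep`` seconds; ``readwrite`` is
    'read' or 'write' and is folded into the key names.
    """
    total_bytes = sum(r * timestep for r in rates)
    return {
        '%s_ave_bytes_per_sec' % readwrite: sum(rates) / len(rates),
        '%s_max_bytes_per_sec' % readwrite: max(rates),
        '%s_total_bytes' % readwrite: total_bytes,
    }

# Example with made-up samples: three 5-second intervals
stats = summarize_byterate_sketch([100.0, 300.0, 200.0], 'read', timestep=5.0)
print(stats['read_total_bytes'])  # 3000.0
```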

tokio.cli.summarize_job.summarize_cpu_df(dataframe, servertype)[source]

Calculate some interesting statistics from a dataframe containing CPU load data.


Synthesize new Darshan summary metrics from the contents of a partially or fully populated connectors.darshan.Darshan object


Extract key metrics from the POSIX module in a Darshan log

tokio.cli.summarize_job.summarize_mds_ops_df(dataframe, opname, timestep=None)[source]

Summarize various metadata op counts over a time range


Populate the fraction missing counter from a given DataFrame