tokio.connectors.nersc_jobsdb module

Extract job info from the NERSC jobs database. Accessing the MySQL database is not required: if everything you need is stored in a cache, you never have to touch MySQL. However, if you do have to connect to MySQL, you must set the following environment variables:

NERSC_JOBSDB_HOST
NERSC_JOBSDB_USER
NERSC_JOBSDB_PASSWORD
NERSC_JOBSDB_DB

If you do not know what to use as these credentials, you will have to rely on a cache database.
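As a minimal sketch of how a script might decide between MySQL and a cache database, the following checks which of the four variables are unset. The helper name `missing_jobsdb_vars` is hypothetical and is not part of pytokio.

```python
import os

# The four environment variables listed above
REQUIRED_VARS = (
    "NERSC_JOBSDB_HOST",
    "NERSC_JOBSDB_USER",
    "NERSC_JOBSDB_PASSWORD",
    "NERSC_JOBSDB_DB",
)


def missing_jobsdb_vars(environ=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; pytokio does not provide it.
    """
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_jobsdb_vars()
    if missing:
        print("Cannot reach MySQL; unset variables: " + ", ".join(missing))
        print("Fall back to a cache database instead.")
    else:
        print("All MySQL credentials are present.")
```

An empty list means all credentials are available; anything else suggests relying on a cache database.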

class tokio.connectors.nersc_jobsdb.NerscJobsDb(dbhost=None, dbuser=None, dbpassword=None, dbname=None, cache_file=None)[source]

Bases: tokio.connectors.cachingdb.CachingDb

Connect to and interact with the NERSC jobs database. Maintains an in-memory cache of query results. If a query is repeated, its results are returned from this cache rather than touching any database.

If this class is instantiated with a cache_file argument, all queries will go to that SQLite-based cache database if they are not found in the in-memory cache.

If this class is not instantiated with a cache_file argument, all queries that do not exist in the in-memory cache will go out to the MySQL database.

The in-memory query caching is possible because the job data in the NERSC jobs database is immutable and can be cached indefinitely once it appears there. At any time the memory cache can be committed to a cache database to be used or transported later.
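The lookup order described above (in-memory cache first, then either the SQLite cache database or the remote database) can be sketched as follows. This is an illustration only, not the NerscJobsDb implementation: the class `LayeredQueryCache`, the `jobs` table, and its column names are all assumptions made for the example.

```python
import sqlite3


class LayeredQueryCache:
    """Hypothetical two-layer cache; not the actual pytokio class."""

    def __init__(self, cache_conn=None, remote_fetch=None):
        self.memory = {}                  # in-memory query cache
        self.cache_conn = cache_conn      # SQLite cache database, if any
        self.remote_fetch = remote_fetch  # stand-in for the MySQL query path
        self.backend_hits = 0             # how often we left the memory cache

    def query(self, query_str, query_variables=()):
        key = (query_str, tuple(query_variables))
        if key in self.memory:
            # Repeated query: answer from memory, touch no database
            return self.memory[key]
        self.backend_hits += 1
        if self.cache_conn is not None:
            cursor = self.cache_conn.execute(query_str, query_variables)
            rows = cursor.fetchall()
        else:
            rows = self.remote_fetch(query_str, query_variables)
        # Job data is immutable, so the result can be kept indefinitely
        self.memory[key] = rows
        return rows


# Assumed schema, for demonstration only
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (jobid TEXT, start_time INTEGER, end_time INTEGER)")
conn.execute("INSERT INTO jobs VALUES ('1234', 100, 200)")

cache = LayeredQueryCache(cache_conn=conn)
sql = "SELECT start_time, end_time FROM jobs WHERE jobid = ?"
first = cache.query(sql, ("1234",))
second = cache.query(sql, ("1234",))  # served from memory
```

Keying the in-memory cache on both the query string and its variables keeps parameterized queries with different arguments from colliding.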

drop_cache()[source]

Clear the query cache

get_concurrent_jobs(start_timestamp, end_timestamp, nersc_host)[source]

Grab all of the jobs that were running, in part or in full, during the time window bounded by start_timestamp and end_timestamp. Then calculate the fraction of each job that overlaps the window to determine the number of core hours consumed during that window.
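One way to read the overlap calculation is sketched below. It assumes timestamps are integer seconds since the epoch and that each job is a (start, end, cores) tuple; the function names and the job representation are hypothetical, and the actual method may compute or return its results differently.

```python
def overlap_fraction(job_start, job_end, window_start, window_end):
    """Fraction of a job's runtime that falls inside the window."""
    job_length = job_end - job_start
    if job_length <= 0:
        return 0.0
    overlap = min(job_end, window_end) - max(job_start, window_start)
    # A negative overlap means the job ran entirely outside the window
    return max(overlap, 0) / job_length


def core_hours_in_window(jobs, window_start, window_end):
    """Sum core hours attributable to the window.

    jobs is an iterable of (start, end, cores) tuples; this layout is
    an assumption for illustration.
    """
    total = 0.0
    for job_start, job_end, cores in jobs:
        fraction = overlap_fraction(job_start, job_end, window_start, window_end)
        job_hours = (job_end - job_start) / 3600.0
        total += fraction * cores * job_hours
    return total
```

For example, a two-hour, 10-core job whose second hour falls inside the window contributes half of its 20 core hours.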

get_job_startend(jobid, nersc_host)[source]

Return the start and end time for a given job ID.

Retrieves the time a job started and completed.

Parameters:
  • jobid (str) – Job ID of interest
  • nersc_host (str) – NERSC host to which job ID of interest maps
Returns:

Two-item tuple of (start time, end time)

Return type:

tuple of datetime.datetime

query(query_str, query_variables=(), nocache=False)[source]

Pass a query through all layers of cache and return on the first hit.