qcg.pilotjob.resources module

class qcg.pilotjob.resources.CRType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Consumable resource type.

GPU = 1
MEM = 2

class qcg.pilotjob.resources.CR(crtype, total_count=0, used=0)

Bases: object

Consumable resources.

crtype

type of cr

Type

CRType

total_count

number of cr

Type

int

used

currently used crs

Type

int

Initialize consumable resources.

Parameters
  • crtype (CRType) – type of cr

  • total_count (int) – number of crs

  • used (int) – currently used crs

property available

number of available resources.

Type

int

allocate(count)

Allocate consumable resources.

Parameters

count (int) – number of resources to allocate

Returns

number of allocated resources, or 0 if no resources have been allocated due to insufficient resources.

Return type

CRAllocation

to_dict()

Serialize consumable resources to dictionary.

Returns

serialized consumable resources

Return type

dict

static from_dict(data)

Create instance of CR class based on serialized data.

Parameters

data (dict) – serialized data generated by the ‘to_dict’ method

Returns

instance

Return type

CR

release(cralloc)

Release allocated consumable resources.

Parameters

cralloc (CRAllocation) – allocation to release

Raises

InternalError – if the allocation size is greater than the number of used resources; in that case no resources are released.
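
The counting behaviour described for CR can be illustrated with a small stand-in class. This is a minimal sketch and not the library's implementation: SketchCR and SketchAllocation are hypothetical names, computing the available amount as total minus used is an assumption, returning None for an unsatisfiable request is an assumption, and ValueError stands in for InternalError.

```python
from dataclasses import dataclass


@dataclass
class SketchAllocation:
    """Hypothetical record of how many units were handed out."""
    count: int


class SketchCR:
    """Hypothetical stand-in for a counted consumable resource."""

    def __init__(self, total_count=0, used=0):
        self.total_count = total_count
        self.used = used

    @property
    def available(self):
        # Assumption: available units are simply total minus used.
        return self.total_count - self.used

    def allocate(self, count):
        # Assumption: an unsatisfiable request is signalled with None.
        if count > self.available:
            return None
        self.used += count
        return SketchAllocation(count)

    def release(self, alloc):
        # Refuse to release more than is in use; nothing changes in that case.
        if alloc.count > self.used:
            raise ValueError("allocation larger than used resources")
        self.used -= alloc.count


gpus = SketchCR(total_count=4)
first = gpus.allocate(3)
rejected = gpus.allocate(2)   # only 1 unit is left, so this request fails
gpus.release(first)
```

The failed request leaves the counters untouched, so releasing the first allocation returns the resource to its initial state.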

class qcg.pilotjob.resources.CRBind(crtype, ids, free_ids=None)

Bases: object

Consumable resource with bindable instances.

The object tracks allocation of specific instances.

crtype
Type

CRType

total_count
Type

int

ids
Type

list(str)

_free
Type

list(str)

Initialize bindable consumable resources.

Parameters
  • crtype (CRType) –

  • ids (list()) –

  • free_ids (list, optional) –

property available

number of available resources.

Type

int

property used

number of used resources.

Type

int

allocate(count)

Allocate bindable resources.

Parameters

count (int) – number of resources to allocate

Returns

object with a list of allocated bindable instances of the resources, or None if no resources have been allocated due to insufficient resources.

Return type

CRBindAllocation

release(cralloc)

Release allocated bindable consumable resources.

Parameters

cralloc (CRBindAllocation) – allocation to release

Raises

InternalError – if the allocation size is greater than the number of used resources; in that case no resources are released.

to_dict()

Serialize bindable consumable resources to dictionary.

Returns

serialized data

Return type

dict

static from_dict(data)

Create instance of CRBind class based on serialized data.

Parameters

data (dict) – serialized data

Returns

instance

Return type

CRBind
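
Unlike a plain counter, a bindable resource has to remember which instances are free. The following is a minimal sketch of that idea under stated assumptions, not the CRBind implementation: SketchBind is a hypothetical name, the allocation is represented as a plain list of identifiers, all instances are assumed free when free_ids is omitted, and ValueError stands in for InternalError.

```python
class SketchBind:
    """Hypothetical stand-in for a resource with bindable instances."""

    def __init__(self, ids, free_ids=None):
        self.ids = list(ids)
        # Assumption: when free_ids is omitted, every instance starts free.
        self._free = list(free_ids) if free_ids is not None else list(ids)

    @property
    def available(self):
        return len(self._free)

    @property
    def used(self):
        return len(self.ids) - len(self._free)

    def allocate(self, count):
        # Return the concrete instance identifiers, or None if too few are free.
        if count > self.available:
            return None
        taken, self._free = self._free[:count], self._free[count:]
        return taken

    def release(self, taken):
        if len(taken) > self.used:
            raise ValueError("allocation larger than used resources")
        # Assumption: the free list is kept in the original identifier order.
        self._free = [i for i in self.ids if i in self._free or i in taken]


gpus = SketchBind(ids=["0", "1", "2", "3"])
a = gpus.allocate(2)
b = gpus.allocate(3)          # only 2 instances are free, so this fails
used_before = gpus.used
gpus.release(a)
```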

class qcg.pilotjob.resources.Node(name=None, total_cores=0, used=0, core_ids=None, free_cores=None, crs=None)

Bases: object

Node resources. This class stores and allocates specific cores. Each core is identified by a number.

_name

node name

Type

str

_total_cores

total number of cores on a node

Type

int

_core_ids

core identifiers

Type

list(str)

_free_cores

free core identifiers

Type

list(str)

_crs

list of available consumable resources

Type

dict(CRType,CR|CRBind)

resources

instance of the Resources class that the node belongs to; set by the constructor of the Resources class

Type

Resources

Parameters
  • name (str) –

  • total_cores (int) –

  • used (int) –

  • core_ids (list(str)) – optional core identifiers (the list must have at least 'total_cores' elements)

  • free_cores (list(str)) – optional free core identifiers (the list must have total_cores-used elements)

  • crs (dict(CRType,CR|CRBind)) –

set_available_core_ids(core_ids)

Set a new list of available cpu/core identifiers. There are scenarios where information obtained from Slurm is not correct, and the real core/cpu identifiers are gathered by the agent launched from the manager.

Parameters

core_ids (list(int)) –

property name

node name

Type

str

property total

total number of cores

Type

int

property used

number of used cores

Type

int

property free

number of free cores

Type

int

property free_ids

list of free core identifiers

Type

list(str)

property ids

list of all available core identifiers

Type

list(str)

property crs

available consumable resources

Type

dict(CRType,CR|CRBind)

property str_crs

string representation of available consumable resources

Type

str

property available

has_enough_crs(crs)

Check if node has enough CR.

Parameters

crs (dict(CRType,int)) –

Returns

True if the node contains the requested CRs, otherwise False.

allocate_crs(crs)

Allocate requested crs.

Parameters

crs (dict(CRType,int)) –

Returns

dict(CRType,CR|CRBind) with the allocated CRs

Raises

NotSufficientResources – if not all resources could be reserved; in that case no resources are allocated.
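
The all-or-nothing behaviour described for allocate_crs can be sketched with plain dictionaries. This is an illustration under stated assumptions rather than the library code: allocate_all_or_nothing and NotEnough are hypothetical names, resources are keyed by strings instead of CRType members, and the check-then-commit structure is one possible way to guarantee that a failed request changes nothing.

```python
class NotEnough(Exception):
    """Hypothetical stand-in for NotSufficientResources."""


def allocate_all_or_nothing(free, request):
    """Reserve every requested amount from `free`, or change nothing.

    Both arguments are plain dicts mapping a resource name to a count.
    """
    # First pass: verify the whole request before touching any counter.
    for name, count in request.items():
        if free.get(name, 0) < count:
            raise NotEnough(f"not enough {name}")
    # Second pass: the request is known to fit, so commit it.
    for name, count in request.items():
        free[name] = free.get(name, 0) - count
    return dict(request)


free = {"GPU": 2, "MEM": 8}
granted = allocate_all_or_nothing(free, {"GPU": 1, "MEM": 4})

try:
    # GPU would fit, MEM would not, so neither counter may change.
    allocate_all_or_nothing(free, {"GPU": 1, "MEM": 16})
    failed = False
except NotEnough:
    failed = True
```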

allocate_max(max_cores, crs=None)

Allocate maximum number of cores on a node and specific number of consumable resources.

Parameters
  • max_cores (int) –

  • crs (dict(CRType,int), optional) –

Returns

instance with allocated resources, or None if there are no available resources

Return type

NodeAllocation

allocate_exact(cores, crs=None)

Allocate specific number of cores on a node and specific number of consumable resources.

Parameters
  • cores (int) –

  • crs (dict(CRType,int)) –

Returns

instance with allocated resources, or None if there are no available resources

Return type

NodeAllocation
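
The difference between allocate_max and allocate_exact is easiest to see side by side. The sketch below is a simplified stand-in, not the Node implementation: CoreNode is a hypothetical name, consumable resources are left out, the allocation is a plain list of core identifiers, and it is an assumption that allocate_exact returns None whenever fewer cores are free than requested.

```python
class CoreNode:
    """Hypothetical node that hands out specific core identifiers."""

    def __init__(self, core_ids):
        self.free_ids = list(core_ids)

    def allocate_max(self, max_cores):
        # Take as many free cores as possible, up to max_cores.
        count = min(max_cores, len(self.free_ids))
        if count == 0:
            return None
        return self._take(count)

    def allocate_exact(self, cores):
        # Take exactly `cores` cores, or nothing at all.
        if cores > len(self.free_ids):
            return None
        return self._take(cores)

    def _take(self, count):
        taken, self.free_ids = self.free_ids[:count], self.free_ids[count:]
        return taken


node = CoreNode(core_ids=["0", "1", "2", "3"])
exact_ok = node.allocate_exact(3)     # 3 of 4 cores are free
exact_fail = node.allocate_exact(2)   # only 1 core is left
best_effort = node.allocate_max(2)    # takes the single remaining core
nothing_left = node.allocate_max(1)   # no free cores remain
```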

release(allocation)

Release allocation on a node.

Parameters

allocation (NodeAllocation) – allocated resources

to_dict()

Serialize node information to dictionary.

Returns

serialized node information

Return type

dict

to_json()

Serialize node information to JSON format.

Returns

serialized node information

Return type

str

static from_dict(data)

Create instance of Node class based on serialized data.

Parameters

data – node data generated by the ‘to_dict’ method

Returns

instance of Node class

Return type

Node
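
The to_dict, to_json and from_dict methods form a serialization round trip. A minimal sketch of that pattern follows; SerializableNode is a hypothetical class and the dictionary keys it uses are assumptions, since the actual layout produced by Node.to_dict is not shown here.

```python
import json


class SerializableNode:
    """Hypothetical node with a to_dict / from_dict round trip."""

    def __init__(self, name=None, total_cores=0, used=0):
        self.name = name
        self.total_cores = total_cores
        self.used = used

    def to_dict(self):
        # Assumption: the keys mirror the constructor arguments.
        return {"name": self.name,
                "total_cores": self.total_cores,
                "used": self.used}

    def to_json(self):
        return json.dumps(self.to_dict())

    @staticmethod
    def from_dict(data):
        return SerializableNode(**data)


node = SerializableNode(name="n1", total_cores=8, used=2)
restored = SerializableNode.from_dict(json.loads(node.to_json()))
```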

class qcg.pilotjob.resources.ResourcesType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Origin of resources.

LOCAL = 1
SLURM = 2

class qcg.pilotjob.resources.Resources(rtype, nodes=None, binding=False)

Bases: object

Available resources set. The set stores and tracks nodes with possibly different numbers of available cores.

_type

origin of resources

Type

ResourcesType

_binding

whether information about specific cores is available

Type

bool

_nodes

list of available nodes

Type

list(Node)

_total_cores

total number of available cores

Type

int

_used_cores

currently used cores

Type

int

_max_crs

maximum number of consumable resources on single node

Type

dict(CRType, int)

_total_crs

total number of consumable resources

Type

dict(CRType, int)

_system_allocation

resources allocated for system, excluded from those available for jobs

Type

NodeAllocation

Initialize resources.

Parameters
  • rtype (ResourcesType) – origin of resources

  • nodes (list(Node)) – list of available nodes

  • binding (bool) – whether information about specific cores is available

property rtype

type of resources

Type

ResourcesType

property binding

is cpu binding available

Type

bool

property nodes

list of all available nodes

Type

list(Node)

property total_nodes

total number of nodes

Type

int

property total_cores

total number of available cores

Type

int

property used_cores

number of used cores

Type

int

property free_cores

number of currently free cores

Type

int

property max_crs

maximum number of CRs on a single node

Type

dict(CRType,int)

property total_crs

total number of CRs on all nodes

Type

dict(CRType,int)
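
The two aggregates max_crs and total_crs answer different questions: the first gives the largest amount found on any single node, the second the sum over all nodes. The sketch below shows one way such values could be derived; summarize_crs is a hypothetical helper, and string keys are used in place of CRType members.

```python
def summarize_crs(per_node):
    """Return (max on a single node, total over all nodes) per resource.

    `per_node` is a list with one dict per node, mapping a resource
    name to the amount available on that node.
    """
    max_crs, total_crs = {}, {}
    for crs in per_node:
        for name, count in crs.items():
            max_crs[name] = max(max_crs.get(name, 0), count)
            total_crs[name] = total_crs.get(name, 0) + count
    return max_crs, total_crs


max_crs, total_crs = summarize_crs([
    {"GPU": 2, "MEM": 8},
    {"GPU": 4},
    {"MEM": 16},
])
```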

mark_not_available_cores(not_avail_cores)

mark_available_cores(avail_cores)

allocate_for_system()

Allocate single core for QCG-PilotJob, excluding this core from those available for jobs.

node_cores_allocated(cores)

Function called by the node when some cores have been allocated.

This function should track number of used cores in Resources statistics.

Parameters

cores (int) – number of allocated cores

node_cores_released(cores)

Function called by the node when some cores have been released. This function should track number of used cores in Resources statistics.

Parameters

cores (int) – number of released cores
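
The two callbacks keep the aggregate statistics in step with the per-node state. The sketch below illustrates that notification pattern with hypothetical classes (TrackedNode and SketchResources); it is a simplified model under the assumption that each node holds a back-reference set by the aggregate's constructor, and it is not the library's implementation.

```python
class SketchResources:
    """Hypothetical aggregate that tracks used cores across nodes."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.total_cores = sum(n.total for n in nodes)
        self.used_cores = 0
        for node in nodes:
            # The constructor links every node back to this aggregate.
            node.resources = self

    @property
    def free_cores(self):
        return self.total_cores - self.used_cores

    def node_cores_allocated(self, cores):
        self.used_cores += cores

    def node_cores_released(self, cores):
        self.used_cores -= cores


class TrackedNode:
    """Hypothetical node that reports every change to its aggregate."""

    def __init__(self, total):
        self.total = total
        self.used = 0
        self.resources = None

    def allocate(self, cores):
        self.used += cores
        if self.resources is not None:
            self.resources.node_cores_allocated(cores)

    def release(self, cores):
        self.used -= cores
        if self.resources is not None:
            self.resources.node_cores_released(cores)


res = SketchResources([TrackedNode(4), TrackedNode(2)])
res.nodes[0].allocate(3)
res.nodes[1].allocate(1)
res.nodes[0].release(2)
```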

check_min_job_requirements(job_reqs)

Check if given resource requirements can be met with those available.

Parameters

job_reqs (dict) – job’s resource requirements described as a dictionary

Returns

True if the job’s resource requirements can be met with the available resources

Return type

bool

to_dict()

Serialize resources information to dictionary.

Returns

serialized resources information

Return type

dict

to_json()

Serialize resources information to JSON.

Returns

serialized resources information

Return type

str

static from_dict(data)

Create instance of Resources class based on serialized data.

Parameters

data – resources data generated by the ‘to_dict’ method

Returns

instance of Resources class

Return type

Resources