qcg.pilotjob.launcher.launcher module

class qcg.pilotjob.launcher.launcher.Launcher(config, wdir, aux_dir, manager)

Bases: object

The launcher service used to launch applications on remote nodes.

All nodes should share a file system.

work_dir

path to the working directory

Type:str
aux_dir

path to the auxiliary directory

Type:str
zmq_ctx

ZMQ context

Type:zmq.Context
agents

requested agent instances

Type:dict
nodes

registered agent instances

Type:dict
jobs_def_cb

application finish default callback

Type:def
jobs_cb

application finish callbacks

Type:dict
node_local_agent_cmd

list of command arguments to start agent on local node

Type:list
node_ssh_agent_cmd

list of command arguments to start agent on remote node via ssh

Type:str
in_socket
Type:zmq.Socket
local_address
Type:str
local_export_address
Type:str
iface_task
Type:asyncio.Future

Initialize instance.

Parameters:
  • config (dict) – configuration dictionary
  • wdir (str) – path to the working directory (the same on all nodes)
  • aux_dir (str) – path to the auxiliary directory (the same on all nodes)
  • manager (Manager) – manager used to call scheduler loop on certain events
MIN_PORT_RANGE = 10000
MAX_PORT_RANGE = 40000
START_TIMEOUT_SECS = 600
SHUTDOWN_TIMEOUT_SECS = 30
MAXIMUM_CONCURRENT_CONNECTIONS = 1000
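
The two port constants presumably bound the range from which start() picks a random port when local_port is not given. A minimal sketch of such a selection, using only the standard library, might look as follows. The helper pick_free_port is hypothetical and not part of qcg.pilotjob, and treating both bounds as inclusive is an assumption.

```python
import random
import socket

# Values copied from the Launcher class constants.
MIN_PORT_RANGE = 10000
MAX_PORT_RANGE = 40000


def pick_free_port(host="127.0.0.1", max_tries=100):
    """Return a random port from the range that could be bound on host.

    Hypothetical helper for illustration; not part of qcg.pilotjob.
    """
    for _ in range(max_tries):
        port = random.randint(MIN_PORT_RANGE, MAX_PORT_RANGE)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is busy, try another candidate
            return port
    raise RuntimeError("no free port found in the configured range")


port = pick_free_port()
```

The probe socket is closed before the port is returned, so another process could still take the port in the meantime; binding the real listening socket directly avoids that race.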
set_job_finish_callback(jobs_finish_cb, *jobs_finish_cb_args)

Set the default function for notifying about finished jobs.

Parameters:
  • jobs_finish_cb – optional default job finish callback
  • jobs_finish_cb_args – job finish callback parameters
cancel(agent_id, app_id)

Cancel a submitted application launched by the selected agent.

Parameters:
  • agent_id – agent that launched the application
  • app_id – application identifier
start(instances, local_port=None)

Initialize the launcher with the given agent instances. The instances data must contain a list of places along with the data needed to initialize the instance services.

Parameters:
  • instances ({} []) – list of agent instance descriptions, each containing:
      agent_id - the agent identifier
      ssh - if the instance should be run via ssh (account?, host?, remote_python?)
      slurm - if the instance should be run via slurm ()
      local - if the instance should be run on a local machine
  • local_port – optional local port for the incoming connections; if not defined, an available random port will be chosen from the range
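
To make the shape of the instances argument more concrete, the sketch below builds a list with one description per launch method. It is an illustration under assumptions: the nested dictionaries, every value, and the rule that each entry names exactly one launch method are guesses based on the field names listed above, and launch_method is a hypothetical helper, not part of qcg.pilotjob.

```python
# Hypothetical instance descriptions; every value below is a placeholder.
instances = [
    # Agent started on the local machine.
    {"agent_id": "agent-local", "local": {}},
    # Agent started over ssh; keys mirror the optional fields in the docs.
    {
        "agent_id": "agent-ssh",
        "ssh": {
            "account": "user",
            "host": "node01",
            "remote_python": "/usr/bin/python3",
        },
    },
    # Agent started through Slurm; the docs list no extra fields for it.
    {"agent_id": "agent-slurm", "slurm": {}},
]

LAUNCH_METHODS = ("ssh", "slurm", "local")


def launch_method(instance):
    """Return the single launch method key present in an instance description."""
    found = [key for key in LAUNCH_METHODS if key in instance]
    if len(found) != 1:
        raise ValueError(f"expected exactly one of {LAUNCH_METHODS}, got {found}")
    return found[0]


# Map each agent identifier to the way it would be started.
methods = {inst["agent_id"]: launch_method(inst) for inst in instances}
```

Checking the descriptions before passing them on catches a missing or duplicated launch method early, before any agent is started.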
stop()

Stop all agents and release resources.

submit(agent_id, app_id, jname, args, stdin=None, stdout=None, stderr=None, env=None, wdir=None, cores=None, finish_cb=None, finish_cb_args=None)

Submit application to be launched by the selected agent.

Parameters:
  • agent_id – agent that should launch the application
  • app_id – application identifier
  • jname – job name
  • args – application arguments
  • stdin – path to the standard input file
  • stdout – path to the standard output file
  • stderr – path to the standard error file
  • env – environment variables
  • wdir – working directory
  • cores – the list of cores the application should be bound to
  • finish_cb – job finish callback
  • finish_cb_args – job finish callback arguments
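
The interplay between a per-job finish_cb and the default callback set with set_job_finish_callback can be sketched with a small stand-in class. This is not the Launcher implementation: CallbackRegistry, register and job_finished are hypothetical names, and both the fallback order and the convention of passing the stored arguments first and the message last are assumptions inferred from the attribute descriptions (jobs_def_cb, jobs_cb) and the test callback signatures.

```python
class CallbackRegistry:
    """Stand-in for the callback bookkeeping; not the Launcher itself."""

    def __init__(self):
        self.jobs_def_cb = None       # default finish callback
        self.jobs_def_cb_args = ()    # arguments bound to the default callback
        self.jobs_cb = {}             # app_id -> (callback, bound arguments)

    def set_job_finish_callback(self, jobs_finish_cb, *jobs_finish_cb_args):
        self.jobs_def_cb = jobs_finish_cb
        self.jobs_def_cb_args = jobs_finish_cb_args

    def register(self, app_id, finish_cb=None, finish_cb_args=None):
        # Only store a per-job callback when one was supplied.
        if finish_cb is not None:
            self.jobs_cb[app_id] = (finish_cb, tuple(finish_cb_args or ()))

    def job_finished(self, app_id, message):
        # Prefer the per-job callback, otherwise fall back to the default.
        default = (self.jobs_def_cb, self.jobs_def_cb_args)
        callback, args = self.jobs_cb.pop(app_id, default)
        if callback is not None:
            callback(*args, message)


events = []
registry = CallbackRegistry()
registry.set_job_finish_callback(
    lambda text, message: events.append(("default", text, message)), "fallback"
)
registry.register(
    "app-1",
    finish_cb=lambda text, jobid, message: events.append(("job", text, jobid, message)),
    finish_cb_args=("custom", "app-1"),
)
registry.register("app-2")  # no per-job callback, so the default applies

# The message payloads are placeholders; the real message format is not documented here.
registry.job_finished("app-1", {"ec": 0})
registry.job_finished("app-2", {"ec": 1})
```

Removing the per-job entry once it has fired keeps the mapping from growing as jobs complete.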
qcg.pilotjob.launcher.launcher.run_job(launcher, agent_id, appid, args, stdin=None, stdout=None, stderr=None, env=None)

Just for testing.

qcg.pilotjob.launcher.launcher.test()

A simple test.

qcg.pilotjob.launcher.launcher.finish_callback_default(text, message)

Just for testing.

qcg.pilotjob.launcher.launcher.finish_callback(text, jobid, message)

Just for testing.