Details
Type: Story
Status: Won't Fix
Resolution: Done
Fix Version/s: None
Component/s: ctrl_pool
Labels: None
Team: Data Access and Database
Description
The ctrl_pool framework builds a batch submission system around our existing tasks. It provides a series of arguments which can be used to control the resources allocated to each task. For example:
$ constructBias.py --help
[...]
Batch submission options:
  --queue QUEUE  Queue name
  --job JOB      Job name
  --nodes NODES  Number of nodes
  --procs PROCS  Number of processors per node
  --cores CORES  Number of cores (Slurm/SMP only)
  --time TIME    Expected execution time per element (sec)
However, the wrapped task also has some similar-sounding arguments. Following on from the above, we have:
*** Wrapped script:
usage: constructBias.py input [options]
[...]
optional arguments:
  [...]
  -j PROCESSES, --processes PROCESSES
                        Number of processes to use
  -t TIMEOUT, --timeout TIMEOUT
                        Timeout for multiprocessing; maximum wall time (sec)
I can speculate about how these options interact (I guess that -j1 --procs 2 means run two tasks at a time and give each access to one CPU core, while -j2 --procs 1 means run one task at a time but let it use two cores), but I don't know, and I don't think I can unambiguously derive it from the documentation. Similar issues arise for timeouts (--time versus -t/--timeout), etc.
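To make the guess above concrete, here is a minimal Python sketch of that interpretation only. It is not taken from ctrl_pool and does not describe its actual behaviour; the names GuessedLayout, guessed_layout and total_cores are hypothetical, and the mapping of --procs to "tasks at a time" and -j to "cores per task" is exactly the unverified assumption stated in this ticket.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuessedLayout:
    """Per-node layout under the ticket's guess (unverified, not ctrl_pool code)."""
    concurrent_tasks: int  # tasks assumed to run at the same time
    cores_per_task: int    # CPU cores assumed to be available to each task


def guessed_layout(procs: int, processes: int) -> GuessedLayout:
    """Encode the guess: --procs sets how many tasks run at once,
    while -j/--processes sets how many cores each task may use."""
    if procs < 1 or processes < 1:
        raise ValueError("procs and processes must both be positive")
    return GuessedLayout(concurrent_tasks=procs, cores_per_task=processes)


def total_cores(layout: GuessedLayout) -> int:
    """Cores occupied in total if the guessed interpretation holds."""
    return layout.concurrent_tasks * layout.cores_per_task


if __name__ == "__main__":
    # The two combinations discussed in the ticket.
    for procs, processes in [(2, 1), (1, 2)]:
        layout = guessed_layout(procs, processes)
        print(f"--procs {procs} -j{processes}: {layout}, "
              f"total cores = {total_cores(layout)}")
```

Under this guess both combinations occupy the same number of cores and differ only in how those cores are split between tasks, which is why the documentation needs to say which flag controls which axis.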
Please provide some instructions on how this is actually supposed to fit together so we can avoid confusion in the future (see e.g. DM-9415 for what happens when somebody encounters the current system for the first time).
Setting team to DAX and copying Fritz Mueller since this is a task framework issue. However, I don't think we need this addressed urgently — indeed, if this code is all going to be thrown away soon and replaced by something more usable, that would be a good outcome.