Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: obs_lsst, obs_subaru, pipe_tasks
- Labels:
- Story Points: 1
- Team: Architecture
- Urgent?: No
Description
Hello,
Running a workflow that was generated in an environment different from the edge node setup fails with this error (https://ai-idds-01.cern.ch:25443/cache/DOMA_Harvester_1009468.out):
botocore.hooks DEBUG: Event after-call.s3.HeadObject: calling handler <bound method RetryQuotaChecker.release_retry_quota of <botocore.retries.standard.RetryQuotaChecker object at 0x7fce58f962b0>>
transformSourceTable INFO: from /home/spadolski/wrk/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/obs_subaru/21.0.0-27-g1176d449+70a9c181f9/policy/Source.yaml
Traceback (most recent call last):
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/ctrl_mpexec/21.0.0-24-g0c1e3ff+d4d6b51e8d//bin/pipetask", line 29, in <module>
    sys.exit(main())
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/ctrl_mpexec/21.0.0-24-g0c1e3ff+d4d6b51e8d/python/lsst/ctrl/mpexec/cli/pipetask.py", line 43, in main
    return cli()
  File "/opt/lsst/software/stack/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe-0.4.3/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/opt/lsst/software/stack/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe-0.4.3/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/opt/lsst/software/stack/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe-0.4.3/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/lsst/software/stack/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe-0.4.3/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/lsst/software/stack/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe-0.4.3/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/opt/lsst/software/stack/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe-0.4.3/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/ctrl_mpexec/21.0.0-24-g0c1e3ff+d4d6b51e8d/python/lsst/ctrl/mpexec/cli/cmd/commands.py", line 103, in run
    script.run(qgraphObj=qgraph, **kwargs)
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/ctrl_mpexec/21.0.0-24-g0c1e3ff+d4d6b51e8d/python/lsst/ctrl/mpexec/cli/script/run.py", line 167, in run
    f.runPipeline(qgraphObj, taskFactory, args)
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/ctrl_mpexec/21.0.0-24-g0c1e3ff+d4d6b51e8d/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 597, in runPipeline
    preExecInit.initialize(graph,
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/ctrl_mpexec/21.0.0-24-g0c1e3ff+d4d6b51e8d/python/lsst/ctrl/mpexec/preExecInit.py", line 89, in initialize
    self.saveInitOutputs(graph)
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/ctrl_mpexec/21.0.0-24-g0c1e3ff+d4d6b51e8d/python/lsst/ctrl/mpexec/preExecInit.py", line 182, in saveInitOutputs
    task = self.taskFactory.makeTask(taskDef.taskClass, taskDef.config, None, self.butler)
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/ctrl_mpexec/21.0.0-24-g0c1e3ff+d4d6b51e8d/python/lsst/ctrl/mpexec/taskFactory.py", line 93, in makeTask
    task = taskClass(config=config, initInputs=initInputs)
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/pipe_tasks/21.0.0-58-g436064c8+549a115573/python/lsst/pipe/tasks/postprocess.py", line 571, in __init__
    self.funcs = CompositeFunctor.from_file(self.config.functorFile)
  File "/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/pipe_tasks/21.0.0-58-g436064c8+549a115573/python/lsst/pipe/tasks/functors.py", line 510, in from_file
    with open(filename) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/spadolski/wrk/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/obs_subaru/21.0.0-27-g1176d449+70a9c181f9/policy/Source.yaml'
The path is clearly wrong: it is the path on the submitting machine, not the one inside the edge node container.
I assume that the problem arises here: https://github.com/lsst/obs_subaru/blob/master/config/transformSourceTable.py#L5 or, more specifically, here: https://github.com/lsst/utils/blob/master/src/packaging.cc#L39
In the container:
(lsst-scipipe-0.4.3) [lsst@b65d486014ba stack]$ env | grep OBS_SUBARU_DIR
OBS_SUBARU_DIR=/opt/lsst/software/stack/stack/miniconda3-py38_4.9.2-0.4.3/Linux64/obs_subaru/21.0.0-27-g1176d449+70a9c181f9
Can this be fixed?
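The failure mode can be illustrated without the LSST stack. This is only a minimal sketch, assuming the config override looks up the package directory from OBS_SUBARU_DIR when the graph is built and stores the resulting absolute path. The helper build_config and both directory values are hypothetical stand-ins, not the actual obs_subaru code.

```python
import os


def build_config(env):
    """Hypothetical stand-in for a config override evaluated at graph-build time."""
    # The package directory is looked up once and frozen into the config value.
    return {"functorFile": os.path.join(env["OBS_SUBARU_DIR"], "policy", "Source.yaml")}


# Illustrative (made-up) install locations for the two hosts.
submit_env = {"OBS_SUBARU_DIR": "/home/user/stack/obs_subaru"}
edge_env = {"OBS_SUBARU_DIR": "/opt/lsst/stack/obs_subaru"}

# The config is built on the submit host and shipped with the graph.
config = build_config(submit_env)
frozen_path = config["functorFile"]

# What the edge node would need instead.
expected_on_edge = os.path.join(edge_env["OBS_SUBARU_DIR"], "policy", "Source.yaml")

print(frozen_path)
print(frozen_path == expected_on_edge)
```

The stored value still carries the submit host prefix, so opening it on the edge node fails unless both hosts share the same layout.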
Yes, this appears to be a general problem with the way we handle pex configs.
We are essentially requiring that the graph builder and the executor node have the LSST software in exactly the same place. The fix would be to store environment variable references directly in the file names and have whatever reads the path perform the environment variable expansion. I'm not sure how amenable the pipeline developers would be to that kind of change. Even if they were, it would take a while to implement, so for now you would need to change the submission code to use the same location anyway.
cc/ Kian-Tat Lim, Yusra AlSayyad
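The deferred-expansion idea from the comment above could look roughly like the following. This is a sketch under assumptions, not an existing pex_config feature: the function resolve_config_path, the stored template string, and the directory values are all hypothetical. It uses string.Template from the Python standard library so that a missing variable raises an error instead of silently leaving the reference in the path.

```python
import os
from string import Template


def resolve_config_path(template, environ=None):
    """Expand $VAR / ${VAR} references at read time, on the executing host.

    Hypothetical helper: raises KeyError if a referenced variable is unset.
    """
    env = os.environ if environ is None else environ
    return Template(template).substitute(env)


# The config stores the variable reference, not an absolute path.
stored = "${OBS_SUBARU_DIR}/policy/Source.yaml"

# Illustrative (made-up) install locations for the two hosts.
submit_env = {"OBS_SUBARU_DIR": "/home/user/stack/obs_subaru"}
edge_env = {"OBS_SUBARU_DIR": "/opt/lsst/stack/obs_subaru"}

# The same stored string resolves to a host-appropriate path on each side.
print(resolve_config_path(stored, submit_env))
print(resolve_config_path(stored, edge_env))
```

Failing loudly on an unset variable is a design choice here; os.path.expandvars would instead leave the reference unexpanded, which tends to surface later as a less obvious FileNotFoundError.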