Details
Type: Story
Status: Done
Resolution: Done
Fix Version/s: None
Component/s: ts_auxiliary_telescope, ts_main_telescope
Labels:
Story Points: 1
Epic Link:
Sprint: TSSW Sprint - Oct 26 - Nov 9
Team: Telescope and Site
Urgent?: No
Description
If one forgets to pass a configuration to the Watcher on the start command, it correctly rejects the command with the following error:
Traceback (most recent call last):
  File "/opt/lsst/software/stack/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe-cb4e2dc/lib/python3.7/site-packages/lsst/ts/salobj/csc_utils.py", line 134, in set_summary_state
    await cmd.start(timeout=timeout)
  File "/opt/lsst/software/stack/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe-cb4e2dc/lib/python3.7/site-packages/lsst/ts/salobj/topics/remote_command.py", line 446, in start
    return await cmd_info.next_ackcmd(timeout=timeout)
  File "/opt/lsst/software/stack/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe-cb4e2dc/lib/python3.7/site-packages/lsst/ts/salobj/topics/remote_command.py", line 183, in next_ackcmd
    raise base.AckError(msg="Command failed", ackcmd=ackcmd)
lsst.ts.salobj.base.AckError: msg='Command failed', ackcmd=(ackcmd private_seqNum=1729314749, ack=<SalRetCode.CMD_FAILED: -302>, error=1, result="Failed: 'latin-1' codec can't encode character '\u2013' in position 14808: ordinal not in range(256)")
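The codec error quoted in the `result` field can be reproduced in isolation. The snippet below is only an illustration with a made-up string; it does not use any Watcher or salobj code. U+2013 is an en dash, which lies outside the 0-255 range that latin-1 can represent.

```python
# Made-up text containing an en dash (U+2013) at index 7.
text = "range 1\u20132"

try:
    text.encode("latin-1")
    error_message = ""
except UnicodeEncodeError as e:
    # Same family of message as in the AckError result above.
    error_message = str(e)

print(error_message)
```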
When the start command is then sent to the Watcher with a configuration, after the above rejection, the Watcher is no longer commandable:
Traceback (most recent call last):
  File "/opt/lsst/software/stack/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe-cb4e2dc/lib/python3.7/site-packages/lsst/ts/salobj/csc_utils.py", line 134, in set_summary_state
    await cmd.start(timeout=timeout)
  File "/opt/lsst/software/stack/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe-cb4e2dc/lib/python3.7/site-packages/lsst/ts/salobj/topics/remote_command.py", line 446, in start
    return await cmd_info.next_ackcmd(timeout=timeout)
  File "/opt/lsst/software/stack/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe-cb4e2dc/lib/python3.7/site-packages/lsst/ts/salobj/topics/remote_command.py", line 198, in next_ackcmd
    msg="Timed out waiting for command acknowledgement", ackcmd=last_ackcmd
lsst.ts.salobj.base.AckTimeoutError: msg='Timed out waiting for command acknowledgement', ackcmd=(ackcmd private_seqNum=1834082701, ack=<SalRetCode.CMD_ACK: 300>, error=0, result='')
A command rejection should not leave the Watcher unresponsive.
The problem was quite difficult to track down. It turns out that the SalLogHandler could raise an exception when emitting a message, and my code did not anticipate that. The error from the Watcher's schema validator triggered this: logging the validation error raised an exception, which killed the callback loop of the `start` command topic. That also explains the strange message in the rejection; it was the logging system complaining that it could not encode a log message.
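The failure mode described above can be sketched with the standard `logging` module alone. This is a toy model under stated assumptions, not the SalLogHandler code and not necessarily the fix that was applied: `FragileHandler`, `GuardedHandler`, `make_logger`, and `run_callbacks` are all hypothetical names. The fragile handler lets an encoding error escape from `emit`, which aborts the loop that called the logger; the guarded variant routes the error to `Handler.handleError`, so the loop keeps running.

```python
import logging


class FragileHandler(logging.Handler):
    """Hypothetical stand-in for a handler whose emit() can raise."""

    def __init__(self):
        super().__init__()
        self.sent = []

    def emit(self, record):
        # Encoding to latin-1 fails for characters such as U+2013.
        self.sent.append(self.format(record).encode("latin-1"))


class GuardedHandler(FragileHandler):
    """Same handler, but emit() never propagates an exception."""

    def emit(self, record):
        try:
            super().emit(record)
        except Exception:
            # Standard logging hook; by default it reports on stderr.
            self.handleError(record)


def make_logger(name, handler):
    """Return a logger that uses only the given handler."""
    log = logging.getLogger(name)
    log.handlers.clear()
    log.addHandler(handler)
    log.propagate = False
    return log


def run_callbacks(commands, log):
    """Toy callback loop: log each command, then mark it as handled."""
    handled = []
    for command in commands:
        log.warning("Rejected: %s", command)
        handled.append(command)
    return handled


commands = ["bad \u2013 config", "good config"]

try:
    run_callbacks(commands, make_logger("fragile", FragileHandler()))
    fragile_survived = True
except UnicodeEncodeError:
    # The exception from emit() escaped and ended the loop early.
    fragile_survived = False

guarded_result = run_callbacks(commands, make_logger("guarded", GuardedHandler()))
```

With the fragile handler the loop stops at the first command, which mirrors the Watcher no longer answering `start`. With the guarded handler both commands are processed, and only the one log message is lost.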
My changes:
I also made two changes to ts_watcher:
Pull requests: