Fix Version/s: None
measure (MeasureMergedCoaddSourcesTask) jobs can run for a long time without emitting any log messages. For example, see /scratch/brendal4/bps-gen3-dc2/submit/2.2i/runs/test-med-1/w_2021_32/DM-31348/20210809T172956Z/jobs/measure/3828/18/y/13500_measure_3828_18_y.3588136.err, where the 3rd log record arrived more than 1 hr after the 2nd. In other cases the gap can be 2 hr or even longer. This causes problems when using PanDA on IDF, because the lack of log activity is interpreted as a hung job and the PanDA pilot times out.
Even though we could tune PanDA to use a longer timeout, it would be good for this task to emit more log messages, so that one can check the status of the run and so on.
Please add more log messages to this task. Either INFO- or VERBOSE-level logs are fine as the plan is to run these jobs with the VERBOSE-level logging.
That output looks good. Yes, please make pull requests and I can review them.
Thanks for agreeing to review this, Tim. The Jenkins run doesn't include ci_imsim or ci_hsc, since the new log messages would not appear with the default interval of 10 minutes. I could either temporarily change the default to a couple of seconds to test with the CI datasets and then revert it to 10 minutes, or just treat my log outputs above as a sanity check.
Looks good. Thanks. One minor comment about when to add the 600 seconds.
I'll add a quick comment re deblending with scarlet. As far as I know it has never hung on ground-based data, so that is something to keep in mind. If anyone has found otherwise, please let me know. I mention this here because, as I noted on GitHub, the fix in this ticket only helps when a patch takes a long time because it contains multiple blends that collectively take longer than 600 s. But there are some patches where nearly the entire patch is a single blend, meaning the job will still appear to hang. So I opened https://github.com/pmelchior/scarlet/issues/252 in scarlet to implement a similar fix on the scarlet side.
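For anyone following along, here is a minimal sketch of the kind of periodic logging discussed above, using only the Python standard library. It is not the code from the pull requests: the function name `process_with_periodic_log`, its parameters, and the message text are hypothetical, and the only value taken from this ticket is the 600 s default interval. The time check happens between items, which is why a single item (for example, one very large blend) that runs longer than the interval still produces no output.

```python
import logging
import time


def process_with_periodic_log(items, work, log, interval=600.0, clock=time.monotonic):
    """Apply work(item) to each item, logging progress at most once per interval.

    Hypothetical helper for illustration only. `interval` is in seconds;
    `clock` is injectable so the behaviour can be checked without waiting.
    """
    items = list(items)
    # Schedule the first progress message one full interval after the start.
    next_log_time = clock() + interval
    results = []
    for n, item in enumerate(items, start=1):
        results.append(work(item))
        now = clock()
        # The check runs only after an item finishes, so one long-running
        # item cannot trigger a message while it is still being processed.
        if now >= next_log_time:
            log.info("Processed %d of %d items so far", n, len(items))
            # Re-arm relative to the time of this message.
            next_log_time = now + interval
    return results
```

With a short `interval` (a couple of seconds) the same loop can be exercised on small test inputs, which is the trade-off mentioned above for the CI datasets.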
Jenkins run: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/34846/pipeline
Tim Jenness - as with other log-related tickets, could I assign you as the reviewer for this ticket as well?