Details
- Type: Story
- Status: To Do
- Resolution: Unresolved
- Fix Version/s: None
- Component/s: analysis_drp, pipe_tasks
- Labels:
- Team: Data Release Production
- Urgent?: No
Description
While debugging a ci_hsc failure on DM-35060, I found that one of the plotting tasks in analysis_drp was failing for lack of valid data. Something should have failed much earlier, likely in the consolidation tasks: when one of our "implied required" columns (slot.psfShape) is all zeros/NaNs, that should raise an error right away instead of trickling down to unclear plotting failures. This is related to RFC-808, which defined sentinel values to fill empty columns with but did not define what to do when every value in a column is "sentineled out".
Eli Rykoff suggested, as a minimum, having ci_hsc's test_validate_outputs.py read in the parquet source and object tables and check for all-NaN columns. That might help catch changes to the default DRP pipeline that trigger this problem, but it wouldn't catch other kinds of failures in production.
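A rough sketch of what such a check could look like, assuming the table is already in memory as a pandas DataFrame. How the table gets loaded (through the butler, or with `pandas.read_parquet`) is left out. The helper name `find_empty_columns`, the `treat_zero_as_empty` option, and the demo column names are all hypothetical; none of them come from test_validate_outputs.py or from the real table schemas.

```python
import numpy as np
import pandas as pd


def find_empty_columns(table, treat_zero_as_empty=False):
    """Return the names of columns that carry no usable values.

    A column counts as empty when every entry is NaN/null. With
    ``treat_zero_as_empty=True``, numeric (non-boolean) columns whose
    entries are all zero or NaN are reported as well.
    Hypothetical helper, not part of ci_hsc.
    """
    empty = []
    if len(table) == 0:
        # A table with no rows says nothing about its columns.
        return empty
    for name in table.columns:
        column = table[name]
        bad = column.isna()
        is_plain_numeric = (
            pd.api.types.is_numeric_dtype(column)
            and not pd.api.types.is_bool_dtype(column)
        )
        if treat_zero_as_empty and is_plain_numeric:
            # Flag columns (bool) are skipped: all-False is a valid state.
            bad = bad | (column == 0)
        if bad.all():
            empty.append(str(name))
    return empty


# Made-up stand-in for a source/object table; the column names are
# illustrative only.
demo = pd.DataFrame(
    {
        "psfShape_xx": [np.nan, np.nan, np.nan],
        "psfShape_yy": [0.0, np.nan, 0.0],
        "psfFlux": [1.2, np.nan, 3.4],
        "flag": [False, False, False],
    }
)

all_nan = find_empty_columns(demo)
zero_or_nan = find_empty_columns(demo, treat_zero_as_empty=True)
```

In a unit test this could become an assertion that the returned list is empty, possibly after excluding columns that are expected to be blank in the ci_hsc dataset.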
Slack thread for the discussion that prompted this: https://lsstc.slack.com/archives/C2JPMCF5X/p1657329375041579
I've attached the ci_hsc build log from a failed Jenkins run that demonstrates this bug. If there is useful information there earlier than the first plot_e2PSF_scatter_visit ERROR log, I couldn't find it.