Details
Type: Story
Status: Done
Resolution: Done
Fix Version/s: None
Component/s: ci_hsc, ci_hsc_gen3
Labels: None
Story Points: 1
Epic Link:
Team: Data Release Production
Urgent?: No
Description
When a ci_hsc run fails, it can be hard to dig into the point of failure. The readme in ci_hsc_gen3 is very short and says nothing about what to do when a run fails. A "How to debug failures" section in the readme would be very useful. It should describe how to get the exact pipetask commands that ran before the failure (both the qgraph generation and the actual run) and which options to add to rerun from that point (e.g. --extend-run --skip-existing --clobber-outputs --no-versions). The necessary approach has changed a few times as ci_hsc_gen3 has evolved (a single pipeline.sh file is now run, instead of commands embedded in scons), so the readme would be a good place to keep up-to-date notes on how to work with it.
Eli Rykoff: you've passed on tips on debugging ci_hsc to me. Could you please write them up in the ci_hsc_gen3 readme?
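As a possible starting point for such a readme section, here is a minimal sketch of how the rerun options named in the description might be appended to a pipetask command copied from pipeline.sh. The helper name `with_rerun_flags` is hypothetical, and `REPO`, `OUTPUT` and `PIPELINE` are placeholders rather than real ci_hsc_gen3 paths. The helper only prints the command so it can be reviewed before anything is run; whether all four options are appropriate together depends on the stack version and on the command that failed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ci_hsc_gen3): print a pipetask command
# with the rerun options from the ticket description appended.
# Nothing is executed; the command is only echoed for inspection.
with_rerun_flags() {
    echo "$* --extend-run --skip-existing --clobber-outputs --no-versions"
}

# REPO, OUTPUT and PIPELINE are placeholders; substitute the values from the
# command that actually failed in pipeline.sh.
with_rerun_flags pipetask run -b REPO -o OUTPUT -p PIPELINE
```

Printing instead of executing keeps the sketch safe to try: the resulting line can be compared against the original command in pipeline.sh before it is pasted into a shell.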