Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: None
- Labels: None
- Story Points: 7
- Epic Link:
- Team: Data Facility
Description
Attempt runs using the HSC-RC2 repo built in DM-18829.
Activity
Tract-scale runs have had failures caused by the DM-19988 bug.
For the report on DMLT Jun 4:
The MergeCoaddMeasurements error affecting 19/79 patches is tracked in DM-20705.
Also filed DM-20758 for the issues with quantum graph pickles.
The last run with the DM-20705 fix has finished successfully. The outputs are at /scratch/hchiang2/G3M19c_000001_035/ and /project/hchiang2/bps/submit/G3M19c/000001/drp/035/. The old butler repo from June (/project/hchiang2/gen3repos/w_2019_20/repo/) was used with the new Oracle environment at lsst-oradb-test.ncsa.illinois.edu. The software stack was w_2019_28 plus the DM-20705 fix. The run used the latest gen3-hsc-rc2 package (116a2fa), including its config files and the pipeline.sh script that generates a quantum graph pickle file, which ctrl_bps then uses. Partitionable HTCondor slots were used.
Compared to the Gen2 run, the main differences include:
- Two troublesome patches for tract=9615 are ignored (DM-20695): the selector {{-d "tract=9615 and patch!=28 and patch!=72"}} is used for assembleCoadd.
- No Gen3 skyCorrection or jointcal (DM-19470).
P.S. Neither run includes ForcedPhotCcdTask (DM-19942).
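The data selector quoted above for assembleCoadd can be read as a simple predicate over (tract, patch) pairs. The sketch below mirrors it in plain Python to show which patches survive the cut. It does not call the butler or any LSST API, and the patch index range 0..80 is an assumption made only for illustration.

```python
# Plain-Python mirror of the selector expression
#   tract=9615 and patch!=28 and patch!=72
# ALL_PATCHES is a hypothetical index range (assumed 0..80), not read from a skymap.
ALL_PATCHES = range(81)


def selected(tract: int, patch: int) -> bool:
    """Return True when the (tract, patch) pair passes the selector."""
    return tract == 9615 and patch != 28 and patch != 72


kept = [p for p in ALL_PATCHES if selected(9615, p)]
dropped = [p for p in ALL_PATCHES if not selected(9615, p)]

print(f"kept {len(kept)} patches, dropped {dropped}")
```

Under the assumed index range, only patches 28 and 72 are dropped for tract 9615, and every patch of any other tract is excluded by the `tract=9615` clause.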
The execution breakdown from pegasus-statistics is as follows (times in seconds):
Transformation       Count  Succeeded  Failed    Min       Max      Mean       Total
cit                   5050       5050       0  7.401   325.836    96.669  488178.172
ct                    5050       5050       0  3.989    83.851     41.16  207858.989
cwact                  396        396       0  2.912   324.306   198.656   78667.866
dagman::post         22616      22616       0    0.0      16.0     1.664     37640.0
dcsst                  396        396       0  2.468   420.753   221.337   87649.595
dcst                   396        396       0  3.207    71.902    55.128   21830.835
fpct                   396        396       0  3.426  3532.421  2149.721   851289.45
isr                   5050       5050       0  5.637    49.916    28.363  143234.315
mdt                     80         80       0  2.868    50.633    36.501    2920.078
mmcst                  396        396       0  4.032  4077.043  2401.424  950963.914
mmt                     80         80       0  2.456     31.86    22.111    1768.894
mwt                   3266       3266       0  3.025    151.34    87.323  285197.588
pegasus::dirmanager      1          1       0  2.184     2.184     2.184       2.184
pegasus::transfer     2059       2059       0  3.237    13.994     8.891   18307.199
------------------------------------------------------------------------------
Type           Succeeded  Failed  Incomplete  Total  Retries  Total+Retries
Tasks              20556       0           0  20556        0          20556
Jobs               22616       0           0  22616        0          22616
Sub-Workflows          0       0           0      0        0              0
------------------------------------------------------------------------------
Workflow wall time                                       : 8 hrs, 50 mins
Cumulative job wall time                                 : 36 days, 7 hrs
Cumulative job wall time as seen from submit side        : 36 days, 9 hrs
Cumulative job badput wall time                          : 0.0 secs
Cumulative job badput wall time as seen from submit side : 0.0 secs
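The totals in the pegasus-statistics output above can be cross-checked with a few lines of arithmetic. The sketch below copies the Count and Total columns and recomputes the job count, task count, and cumulative job wall time. The grouping rules are an assumption inferred from the numbers matching, not taken from the Pegasus documentation: jobs are taken as every row except dagman::post, and tasks as every row without a dagman:: or pegasus:: prefix.

```python
# (transformation, count, total_seconds) copied from the table above
ROWS = [
    ("cit", 5050, 488178.172),
    ("ct", 5050, 207858.989),
    ("cwact", 396, 78667.866),
    ("dagman::post", 22616, 37640.0),
    ("dcsst", 396, 87649.595),
    ("dcst", 396, 21830.835),
    ("fpct", 396, 851289.45),
    ("isr", 5050, 143234.315),
    ("mdt", 80, 2920.078),
    ("mmcst", 396, 950963.914),
    ("mmt", 80, 1768.894),
    ("mwt", 3266, 285197.588),
    ("pegasus::dirmanager", 1, 2.184),
    ("pegasus::transfer", 2059, 18307.199),
]

# Assumed grouping: jobs exclude the DAGMan post-scripts; tasks also
# exclude the Pegasus auxiliary jobs (dirmanager, transfer).
jobs = sum(c for name, c, _ in ROWS if name != "dagman::post")
tasks = sum(c for name, c, _ in ROWS
            if not name.startswith(("dagman::", "pegasus::")))
wall_seconds = sum(t for name, _, t in ROWS if name != "dagman::post")

days, remainder = divmod(wall_seconds, 86400)
hours = int(remainder // 3600)

# Workflow wall time reported above: 8 hrs, 50 mins
workflow_seconds = 8 * 3600 + 50 * 60
mean_concurrency = wall_seconds / workflow_seconds

print(f"jobs={jobs} tasks={tasks}")
print(f"cumulative job wall time: {int(days)} days, {hours} hrs")
print(f"mean concurrency: {mean_concurrency:.1f} jobs")
```

Under these assumptions the recomputed counts and the 36 days, 7 hrs figure agree with the reported summary, and dividing by the workflow wall time suggests that roughly 99 jobs were running concurrently on average.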
ForcedPhotCcdTask was skipped because of the DM-19942 bug.