After double-checking my runs with the butler, I could not find any src data for four of the CosmosN visits (25814, 25186, 23600, and 25812). The singleFrameDriver run for CosmosN had hit the errors shown below, which I had not encountered before:
  File "/software/lsstsw/stack3_20171023/stack/miniconda3-4.3.21-10a4fa6/Linux64/daf_persistence/14.0-14-g87d16e8+7/python/lsst/daf/persistence/repository.py", line 189, in write
    return butlerLocationStorage.write(butlerLocation, obj)
  File "/software/lsstsw/stack3_20171023/stack/miniconda3-4.3.21-10a4fa6/Linux64/daf_persistence/14.0-14-g87d16e8+7/python/lsst/daf/persistence/posixStorage.py", line 264, in write
    writeFormatter(butlerLocation, obj)
  File "/software/lsstsw/stack3_20171023/stack/miniconda3-4.3.21-10a4fa6/Linux64/daf_persistence/14.0-14-g87d16e8+7/python/lsst/daf/persistence/posixStorage.py", line 718, in writeFitsCatalogStorage
    obj.writeFits(logLoc.locString(), **kwds)
lsst.pex.exceptions.wrappers.FitsError:
  File "src/fits.cc", line 821, in int lsst::afw::fits::Fits::addColumn(const string&, int) [with T = double; std::string = std::basic_string<char>]
    cfitsio error (/datasets/hsc/repo/rerun/private/thrush/RE/01179/NB0921/output/ICSRC-0025814-008.fitsr0gdeldi): error writing to FITS file (106) : Adding column 'ext_convolved_ConvolvedFlux_3_3_3_apCorrSigma' with size 1
cfitsio error stack:
  Error writing data buffer to file:
  /datasets/hsc/repo/rerun/private/thrush/RE/01179/NB0921/output/ICSRC-0025814-008.fitsr0gdeldi
{0}
lsst::afw::fits::FitsError: 'cfitsio error (/datasets/hsc/repo/rerun/private/thrush/RE/01179/NB0921/output/ICSRC-0025814-008.fitsr0gdeldi): error writing to FITS file (106) : Adding column 'ext_convolved_ConvolvedFlux_3_3_3_apCorrSigma' with size 1
cfitsio error stack:
  Error writing data buffer to file:
  /datasets/hsc/repo/rerun/private/thrush/RE/01179/NB0921/output/ICSRC-0025814-008.fitsr0gdeldi
After rerunning singleFrameDriver, mosaic, and coaddDriver for the CosmosN dataset, the issue appears to be resolved: I can now find all of the CosmosN visit data with the butler. I'm still trying to track down why the failure happened in the first place, and I'm currently rerunning multibandDriver for Cosmos to make sure that the fixes to CosmosN are applied there as well.
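A check like the one that surfaced the missing visits can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the script that was actually run: `find_missing_src`, `fully_missing`, and the `exists` callable are hypothetical names. With the Gen2 butler, `exists` would presumably wrap something like `butler.datasetExists("src", visit=v, ccd=c)`, but that wiring is an assumption and is left out so the example runs without the LSST stack.

```python
from typing import Callable, Dict, Iterable, List


def find_missing_src(
    visits: Iterable[int],
    ccds: Iterable[int],
    exists: Callable[[int, int], bool],
) -> Dict[int, List[int]]:
    """Map each visit to the ccds for which `exists` reports no src catalog.

    `exists(visit, ccd)` is a hypothetical callable; visits with nothing
    missing are omitted from the result.
    """
    ccd_list = list(ccds)
    missing: Dict[int, List[int]] = {}
    for visit in visits:
        absent = [ccd for ccd in ccd_list if not exists(visit, ccd)]
        if absent:
            missing[visit] = absent
    return missing


def fully_missing(missing: Dict[int, List[int]], n_ccds: int) -> List[int]:
    """Return the visits for which every one of the n_ccds ccds is missing."""
    return sorted(v for v, absent in missing.items() if len(absent) == n_ccds)
```

Visits returned by `fully_missing` are the ones with no src data at all, which is the situation described above for the four CosmosN visits.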
I would also like to mention something I forgot to include previously: as was seen in DM-12929, singleFrameDriver failed for 32 visit/ccd combinations:
Cosmos: {'visit': 11698, 'ccd': 68}, {'visit': 278, 'ccd': 95}, {'visit': 280, 'ccd': 61}, {'visit': 280, 'ccd': 69}, {'visit': 280, 'ccd': 103}, {'visit': 284, 'ccd': 61}, {'visit': 278, 'ccd': 95}, {'visit': 17934, 'ccd': 1}, {'visit': 28376, 'ccd': 69}, {'visit': 28382, 'ccd': 101}, {'visit': 28396, 'ccd': 102}, {'visit': 28398, 'ccd': 95}, {'visit': 28398, 'ccd': 101}, {'visit': 28400, 'ccd': 53}, {'visit': 28400, 'ccd': 61}, {'visit': 28400, 'ccd': 95}, {'visit': 28400, 'ccd': 101}, {'visit': 28400, 'ccd': 100}, {'visit': 17934, 'ccd': 1}, {'visit': 23596, 'ccd': 6}
Wide: {'visit': 9868, 'ccd': 76}, {'visit': 11582, 'ccd': 76}, {'visit': 7344, 'ccd': 67}, {'visit': 19468, 'ccd': 69}, {'visit': 6478, 'ccd': 99}, {'visit': 6528, 'ccd': 24}, {'visit': 6528, 'ccd': 67}, {'visit': 6528, 'ccd': 59}, {'visit': 9708, 'ccd': 99}, {'visit': 9736, 'ccd': 67}, {'visit': 17738, 'ccd': 69}, {'visit': 17750, 'ccd': 58}
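As a quick sanity check on the two lists above, the entries can be tallied as (visit, ccd) pairs. The sketch below only restates the pasted data; `summarize` is a hypothetical helper name. It shows that the Cosmos list holds 20 entries but only 18 distinct pairs, since (278, 95) and (17934, 1) each appear twice. Whether those are genuine repeat failures or copy-paste duplicates is not stated in the ticket.

```python
from collections import Counter
from typing import List, Tuple

DataId = Tuple[int, int]  # (visit, ccd)

# Data ids transcribed from the lists above, in the order given.
COSMOS: List[DataId] = [
    (11698, 68), (278, 95), (280, 61), (280, 69), (280, 103),
    (284, 61), (278, 95), (17934, 1), (28376, 69), (28382, 101),
    (28396, 102), (28398, 95), (28398, 101), (28400, 53), (28400, 61),
    (28400, 95), (28400, 101), (28400, 100), (17934, 1), (23596, 6),
]
WIDE: List[DataId] = [
    (9868, 76), (11582, 76), (7344, 67), (19468, 69), (6478, 99),
    (6528, 24), (6528, 67), (6528, 59), (9708, 99), (9736, 67),
    (17738, 69), (17750, 58),
]


def summarize(data_ids: List[DataId]) -> Tuple[int, int, List[DataId]]:
    """Return (number of entries, number of distinct pairs, repeated pairs)."""
    counts = Counter(data_ids)
    repeated = sorted(d for d, n in counts.items() if n > 1)
    return len(data_ids), len(counts), repeated


if __name__ == "__main__":
    for name, ids in (("Cosmos", COSMOS), ("Wide", WIDE)):
        print(name, summarize(ids))
```

The entry counts (20 + 12) match the 32 quoted above only when the repeated pairs are counted twice.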
Breakdown by type of fatal error:
- 5 failed with "RuntimeError: No matches to use for photocal"
- 18 failed with "InvalidParameterError:
File "src/PsfexPsf.cc", line 221, in virtual std::shared_ptr<lsst::afw::image::Image<double> > lsst::meas::extensions::psfex::PsfexPsf::_doComputeImage(const Point2D&, const lsst::afw::image::Color&, const Point2D&) const
Only spatial variation (ndim == 2) is supported; saw 0 {0}"
- 4 failed with "RuntimeError: Unable to match sources"
- 4 failed with "RuntimeError: No objects passed our cuts for consideration as psf stars"
- 1 failed with "FitsError:
File "src/fits.cc", line 841, in std::size_t lsst::afw::fits::Fits::addRows(std::size_t)
cfitsio error (/datasets/hsc/repo/rerun/private/thrush/RE/00991/HSC-Y/output/SRC-0006528-059.fitsy1hrihd5): error writing to FITS file (106) : Adding 724 rows to binary table"
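A breakdown like this could be reproduced from the logs by matching on the message fragments quoted above. The following is a minimal sketch under assumptions: the labels and the `classify` and `tally` helpers are hypothetical, and it assumes each failure shows up as a log line containing one of these fragments, which may not hold for every log format.

```python
from collections import Counter
from typing import Iterable, Optional

# Substrings taken from the error messages quoted above; the short labels
# on the left are made up for this sketch.
ERROR_SIGNATURES = {
    "photocal": "No matches to use for photocal",
    "psfex_ndim": "Only spatial variation (ndim == 2) is supported",
    "match_sources": "Unable to match sources",
    "psf_stars": "No objects passed our cuts for consideration as psf stars",
    "fits_write": "error writing to FITS file (106)",
}

# Counts as reported in the breakdown above.
REPORTED = {
    "photocal": 5,
    "psfex_ndim": 18,
    "match_sources": 4,
    "psf_stars": 4,
    "fits_write": 1,
}


def classify(line: str) -> Optional[str]:
    """Return the label of the first signature found in `line`, else None."""
    for label, needle in ERROR_SIGNATURES.items():
        if needle in line:
            return label
    return None


def tally(lines: Iterable[str]) -> Counter:
    """Count how many lines match each known error signature."""
    return Counter(
        label for label in map(classify, lines) if label is not None
    )
```

The reported counts sum to 32, consistent with the number of failures listed earlier.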
Could you please also add the Slurm job IDs, the number of nodes or node-hours, and information about the mosaic jobs to the tables?
In /project/hsc_rc/w_2017_52/DM-12982/logs/, please separate the logs of successful runs, which scientists will want to look at, from the logs of failed attempts, which others don't need to worry about.
A soft link /datasets/hsc/repo/rerun/RC was set up so you don't have to copy data by hand unless you want to.
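One way to approach the log separation requested above is to partition log files by whether they contain a failure marker. This is only a sketch: the marker strings and the `partition_logs` helper are assumptions, and the real logs may signal failure differently, so the markers would need checking against actual output before moving anything.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed failure markers; adjust to whatever the real logs contain.
FAILURE_MARKERS: Sequence[str] = (
    "Traceback (most recent call last)",
    "FATAL",
)


def partition_logs(
    logs: Dict[str, str],
    markers: Sequence[str] = FAILURE_MARKERS,
) -> Tuple[List[str], List[str]]:
    """Split log names into (successful, failed) by searching their text.

    `logs` maps a log file name to its full text. A log counts as failed
    if it contains any of the marker strings.
    """
    ok: List[str] = []
    failed: List[str] = []
    for name in sorted(logs):
        if any(marker in logs[name] for marker in markers):
            failed.append(name)
        else:
            ok.append(name)
    return ok, failed
```

The mapping could be built by reading each file under the logs directory with `pathlib.Path.read_text`, after which the two lists can drive a move into separate subdirectories.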