  1. Data Management
  2. DM-31477

RC2 reprocessing with w_2021_34 (gen2)

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Story Points:
      8
    • Epic Link:
    • Sprint:
      DRP S21b
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      This will largely follow the procedure used in DM-31184, but with the updated job ordering identified there. All three tracts must be run up to the global external calibration stages (fgcm/jointcal), but, depending on resources and time considerations, we may opt to take only tracts 9813 and 9697 beyond that.

        Attachments

          Issue Links

            Activity

            lauren Lauren MacArthur added a comment -

            I am seeing many of the following raises in multiband:

              File "/software/lsstsw/stack_20210813/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/meas_extensions_scarlet/22.0.1-2-g574b836+765d378d91/python/lsst/meas/extensions/scarlet/scarletDeblendTask.py", line 755, in deblend
                blend, skipped, spectrumInit = deblend(mExposure, foot, self.config)
              File "/software/lsstsw/stack_20210813/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/meas_extensions_scarlet/22.0.1-2-g574b836+765d378d91/python/lsst/meas/extensions/scarlet/scarletDeblendTask.py", line 276, in deblend
                sources, skipped = init_all_sources(
              File "/software/lsstsw/stack_20210813/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/scarlet/lsst-dev-g965bb5fbbf+f31336177f/lib/python/scarlet/initialization.py", line 388, in init_all_sources
                set_spectra_to_match(sources, observations)
              File "/software/lsstsw/stack_20210813/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/scarlet/lsst-dev-g965bb5fbbf+f31336177f/lib/python/scarlet/initialization.py", line 553, in set_spectra_to_match
                covar = np.linalg.inv((m * w[None, :]) @ m.T)
              File "<__array_function__ internals>", line 5, in inv
              File "/software/lsstsw/stack_20210813/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/numpy/linalg/linalg.py", line 545, in inv
                ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
              File "/software/lsstsw/stack_20210813/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/numpy/linalg/linalg.py", line 88, in _raise_linalgerror_singular
                raise LinAlgError("Singular matrix")
            numpy.linalg.LinAlgError: Singular matrix
            

            I'm afraid I can't provide any more information, as this is the run where the pipe_drivers logs got swallowed (see DM-31530).

            While we have seen a number of LinAlgError raises of late, I think this is a new one. Paging Fred Moolekamp.

            lauren Lauren MacArthur added a comment - - edited

            And a gazillion of:

            Failed to solve for PSF matching kernel in GAaP for (27952.000000, 11910.000000): Problematic scaling factors = 1.15 Errors: OutOfRangeError('Unable to insert a candidate at (27952.00, 11910.00)')
            

            Specifically, there are 764 instances in tract 9697, 0 in 9615(!) and 2688 1865 (and counting) in 9813.
            I believe this is a legitimate error message, but do you expect to see so many, Arun Kannawadi?
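
            For anyone wanting to reproduce per-tract tallies like the ones above, here is a minimal sketch that counts the GAaP kernel failures in a stream of log lines. The regular expression is modelled only on the single message quoted above; the function name `count_gaap_failures` is hypothetical, and how the log lines are grouped by tract is left to the caller, since the log layout is not described here.

```python
import re
from collections import Counter

# Pattern modelled on the one message quoted in this comment; other
# variants of the message may exist and would need their own handling.
GAAP_FAILURE = re.compile(
    r"Failed to solve for PSF matching kernel in GAaP for "
    r"\((?P<x>[-\d.]+), (?P<y>[-\d.]+)\)"
)


def count_gaap_failures(lines):
    """Return (total, Counter keyed by (x, y)) for GAaP kernel failures.

    `lines` is any iterable of log lines, e.g. an open file object.
    """
    positions = Counter()
    for line in lines:
        match = GAAP_FAILURE.search(line)
        if match:
            positions[(float(match["x"]), float(match["y"]))] += 1
    return sum(positions.values()), positions
```

            Calling this once per tract's set of log files would give the per-tract totals, and the returned counter shows whether the failures cluster at a few positions or are spread across the tract.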

            fred3m Fred Moolekamp added a comment -

            Does anyone know what might have changed? Specifically, this error says that either the weights or the images have zeros in the matrix that were not present before. The changes most likely to cause this are a change in background subtraction and/or newly marked bad pixels. The current version of scarlet includes a fix for this problem, but it also contains a lot of other code that has not been tested in the stack, so adopting it would require running the full RC2 dataset and the approval of Lauren MacArthur et al. to make sure that there are no regressions. Yusra AlSayyad, if you think this is time-sensitive, I'm happy to spend the latter part of this week triggering the jobs and then follow up on them next week, assuming that everything runs smoothly.
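
            A minimal sketch of the failure mode described above, using made-up shapes and values rather than anything from the RC2 run: when zero weights remove every band in which a source has signal, the matrix `(m * w[None, :]) @ m.T` loses rank and `np.linalg.inv` raises. The names `m` and `w` mirror the traceback; `normal_matrix`, `try_invert`, and the pseudo-inverse fallback are illustrative assumptions, not the fix that exists in scarlet.

```python
import numpy as np


def normal_matrix(m, w):
    """Build (m * w[None, :]) @ m.T, the matrix inverted in the traceback."""
    return (m * w[None, :]) @ m.T


def try_invert(a):
    """Return (inverse, True), or (pseudo-inverse, False) if `a` is singular."""
    try:
        return np.linalg.inv(a), True
    except np.linalg.LinAlgError:
        # Illustrative fallback only; not taken from scarlet.
        return np.linalg.pinv(a), False


# Hypothetical example: 2 sources (rows) in 3 bands (columns).
# The second source only has signal in the third band.
m = np.array([[1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])

w_good = np.array([1.0, 1.0, 1.0])
w_bad = np.array([1.0, 1.0, 0.0])  # third band fully masked

rank_good = np.linalg.matrix_rank(normal_matrix(m, w_good))
rank_bad = np.linalg.matrix_rank(normal_matrix(m, w_bad))
```

            Checking the rank (or the zero fraction of the weights) before inverting is one cheap way to tell whether newly masked pixels are responsible for a given blend.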

            lauren Lauren MacArthur added a comment -

            Might it be related to the streak masking (DM-30270)?

            lauren Lauren MacArthur added a comment - - edited

            All logs can be found in the following directory (note that those produced by pipe_drivers tasks were largely swallowed due to DM-31530):

            /datasets/hsc/repo/rerun/RC/w_2021_34/DM-31477/logs
            

            Plots from validate_drp and pipe_analysis are all linked here. Those in the vsGen3 subdirectory are from pipe_analysis run on the associated Gen3 w_2021_34 run (DM-31524), and any comparison[Visit/Coadd]*.png plots are a direct comparison of Gen2 vs. Gen3. The coadd-level plots for Gen3 are on hold pending the resolution of an issue with the step3 submission for the Gen3 run (explicit tract specification was omitted). Those in the vsGen3/noExtCal subdirectory include the compareVisit*.png plots without applying any external calibrations (i.e. fgcm & jointcal) and, thankfully, they are as boring as can be (total flatlines), confirming that we still have Gen2/Gen3 parity through SFM (phew!).

            lauren Lauren MacArthur added a comment - - edited

            Ok, this one is done. I noticed some minor bugs in the logic of the id list creation for the colorAnalysis.py script in pipe_analysis, so there is a PR for that (it only affected tract 9813 as it was related to the number of patches/number of filters, and the others got "lucky" with the old implementation). I reran 9813 on the gen3 repos with this branch and all looks good (plots are linked here).

            lauren Lauren MacArthur added a comment -

            Let me know if this looks good to close out when you get a chance.


              People

              Assignee:
              lauren Lauren MacArthur
              Reporter:
              lauren Lauren MacArthur
              Reviewers:
              Yusra AlSayyad
              Watchers:
              Fred Moolekamp, Lauren MacArthur, Yusra AlSayyad