Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-29907

Fix meas_extensions_scarlet sorting issue

    XMLWordPrintable

Details

    Description

      meas_extensions_scarlet seems to have catalogs that are not sorted properly, at least in gen3. This ticket is to figure out why, and to fix it. It is blocking DM-29888 because I can't get ci_hsc_gen3 to complete due to the parent sorting issue.

      Attachments

        Issue Links

          Activity

            fred3m Fred Moolekamp added a comment - - edited

            I see a few different ways to fix this:

            1. Rewrite the IdFactory.makeSource code to increment from the last source in the current catalog. I don't know enough about the current code to know the design choices that went into using the current method, so it's probably best if someone fixes this if this is the chosen solution.
            2. After MergeDetectionsTask has run, re-number the ids so that they are in numerical order beginning with the first dataid that the id factory would want to use (which might not necessarily be the first id, since the first source might have been culled). This would ensure that the catalog given to the deblender matches the next id expected by the factory, and that the parent ids still match between the mergeDet catalog and the deblendedFlux catalog
            3. Create a new catalog in the deblender (like it was doing before) and add all of the sources to it. This seems like the least appealing, since now there would be a mismatch between the ids in the two different catalogs.
            fred3m Fred Moolekamp added a comment - - edited I see a few different ways to fix this: Rewrite the IdFactory.makeSource code to increment from the last source in the current catalog. I don't know enough about the current code to know the design choices that went into using the current method, so it's probably best if someone fixes this if this is the chosen solution. After MergeDetectionsTask has run, re-number the ids so that they are in numerical order beginning with the first dataid that the id factory would want to use (which might not necessarily be the first id, since the first source might have been culled). This would ensure that the catalog given to the deblender matches the next id expected by the factory, and that the parent ids still match between the mergeDet catalog and the deblendedFlux catalog Create a new catalog in the deblender (like it was doing before) and add all of the sources to it. This seems like the least appealing, since now there would be a mismatch between the ids in the two different catalogs.
            jbosch Jim Bosch added a comment -

            I think I've found the problem, though it's definitely one of those, "how did that ever work" mysteries now. The Gen2 DeblendCoaddSourcesTask has these critical lines:

            https://github.com/lsst/pipe_tasks/blob/387f8f07a2b66205f9fa6bda9a89dcdbbef3f64c/python/lsst/pipe/tasks/multiBand.py#L535-L536

            ...which says, "when you add new records to this catalog, don't use this ID" (for each ID being copied from the mergeDet catalog).

            I don't see any such calls in either of the Gen3 deblending PipelineTasks in pipe_tasks, in meas_extensions_scarlet, or in meas_deblender.

            It looks like the right place to fix this is probably around

            https://github.com/lsst/pipe_tasks/blob/5d831a8ea6570b4acfbeb5ca35468ee0f99eca39/python/lsst/pipe/tasks/deblendCoaddSourcesPipeline.py#L159

            I think DM-17689 is absolved, except maybe being part of the "how did this ever work" mystery.

            jbosch Jim Bosch added a comment - I think I've found the problem, though it's definitely one of those, "how did that ever work" mysteries now. The Gen2 DeblendCoaddSourcesTask has these critical lines: https://github.com/lsst/pipe_tasks/blob/387f8f07a2b66205f9fa6bda9a89dcdbbef3f64c/python/lsst/pipe/tasks/multiBand.py#L535-L536 ...which says, "when you add new records to this catalog, don't use this ID" (for each ID being copied from the mergeDet catalog). I don't see any such calls in either of the Gen3 deblending PipelineTasks in pipe_tasks , in meas_extensions_scarlet , or in meas_deblender . It looks like the right place to fix this is probably around https://github.com/lsst/pipe_tasks/blob/5d831a8ea6570b4acfbeb5ca35468ee0f99eca39/python/lsst/pipe/tasks/deblendCoaddSourcesPipeline.py#L159 I think DM-17689 is absolved, except maybe being part of the "how did this ever work" mystery.

            I implemented your suggested fix and tested it using ci_hsc_gen3 locally and it works. Jenkins is currently building: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/34085/pipeline.

            As for "how did that ever work," I'm wondering if gen3 was run all the way through the measurement algorithms that would be offended by an unsorted catalog before mid January? Once meas_extensions_scarlet became the default, and it re-wrote the catalog anyway, then it probably wasn't an issue. But my guess is that had it been tested using the old deblender, which also just appended the mergeDet catalog, then it this problem would have shown up sooner. If it had been run all the way through before January, then it's still a mystery to me too.

            fred3m Fred Moolekamp added a comment - I implemented your suggested fix and tested it using ci_hsc_gen3 locally and it works. Jenkins is currently building: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/34085/pipeline . As for "how did that ever work," I'm wondering if gen3 was run all the way through the measurement algorithms that would be offended by an unsorted catalog before mid January? Once meas_extensions_scarlet became the default, and it re-wrote the catalog anyway, then it probably wasn't an issue. But my guess is that had it been tested using the old deblender, which also just appended the mergeDet catalog, then it this problem would have shown up sooner. If it had been run all the way through before January, then it's still a mystery to me too.
            jbosch Jim Bosch added a comment -

            Looks good!

            We have definitely been running ci_hsc_gen3 all the way through for a long time, but w14 was probably the first Gen3 RC2 run that got that far. Since this problem didn't cause a hard error all the time, I bet we just got lucky in ci_hsc_gen3 (until more recently something perturbed it, I guess).

            jbosch Jim Bosch added a comment - Looks good! We have definitely been running ci_hsc_gen3 all the way through for a long time, but w14 was probably the first Gen3 RC2 run that got that far. Since this problem didn't cause a hard error all the time, I bet we just got lucky in ci_hsc_gen3 (until more recently something perturbed it, I guess).

            Thanks for the quick review. Sorry I didn't merge last night, but I went to bed before Jenkins finished.

            fred3m Fred Moolekamp added a comment - Thanks for the quick review. Sorry I didn't merge last night, but I went to bed before Jenkins finished.

            People

              fred3m Fred Moolekamp
              fred3m Fred Moolekamp
              Jim Bosch
              Fred Moolekamp, Jim Bosch
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Jenkins

                  No builds found.