  1. Data Management
  2. DM-15224

Investigate low association rate in HiTS2015 CI dataset

    Details

    • Story Points:
      4
    • Sprint:
      AP F18-3, AP F18-4, AP F18-5
    • Team:
      Alert Production

      Description

      While testing ap_verify_ci_hits2015, a set of six overlapping exposures across three epochs, for DM-15142, I got the following source matching statistics:

      • 1600 DiaObjects with 1 DiaSource
      • 82 DiaObjects with 2 DiaSources
      • 0 DiaObjects with 3 DiaSources

      The issue appears whether I run ap_pipe all at once or in order of increasing visit ID, so it does not appear to be a concurrency bug in the database. On the other hand, the dataset footprint is known to include "messy" chips, so it's possible that many of the DiaSources are transient artifacts rather than real sources.

      Pair-investigate the properties of the images and DiaObjects to figure out what is happening.

            Activity

            krzys Krzysztof Findeisen added a comment -

            We found that part of the problem is that CCDs 58 and 62 do not overlap with the other images (because I was comparing an East-is-left focal plane map to an East-is-right image of the HiTS 2015 dataset). Looking at only the two visits with CCDs 5 and 10 in a slightly different reduction, we see:

            • ~913 DiaObjects with 1 DiaSource
            • ~79 DiaObjects with 2 DiaSources

            Meredith Rawls thinks the ~10% repeat DiaSources despite the ~100% overlap is because most of the sources are image differencing artifacts; we will make plots of number of sources on the sky for each individual visit to test this hypothesis.

            krzys Krzysztof Findeisen added a comment -

            With the dataset amended as described in the comments to DM-15142, the statistics are:

            • 1066 DiaObjects with 1 DiaSource
            • 91 DiaObjects with 2 DiaSources
            • 25 DiaObjects with 3 DiaSources
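
The three sets of counts reported in this ticket can be compared with a few lines of Python. This is only a sketch: the function names and dictionary labels are made up for illustration, and "association rate" is taken here to mean the fraction of DiaObjects with more than one DiaSource, which is an assumption rather than a definition used by ap_pipe.

```python
# Histograms copied from this ticket: {number of DiaSources: number of DiaObjects}.
# The labels are illustrative; the two-visit counts were quoted as approximate (~).
REDUCTIONS = {
    "original CI dataset (3 visits)": {1: 1600, 2: 82, 3: 0},
    "CCDs 5 and 10 only (2 visits)": {1: 913, 2: 79},
    "amended CI dataset (3 visits)": {1: 1066, 2: 91, 3: 25},
}


def multi_source_fraction(counts):
    """Return the fraction of DiaObjects with more than one DiaSource."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("histogram contains no DiaObjects")
    multi = sum(n_obj for n_src, n_obj in counts.items() if n_src > 1)
    return multi / total


def total_dia_sources(counts):
    """Return the number of DiaSources implied by the histogram."""
    return sum(n_src * n_obj for n_src, n_obj in counts.items())


if __name__ == "__main__":
    for label, counts in REDUCTIONS.items():
        frac = multi_source_fraction(counts)
        n_src = total_dia_sources(counts)
        print(f"{label}: {frac:.1%} multi-source, {n_src} DiaSources")
```

By this measure the multi-source fraction is roughly 5% for the original dataset, 8% for the two-visit reduction, and 10% for the amended dataset, so most DiaObjects are still single detections even after the overlap was fixed.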
            mrawls Meredith Rawls added a comment -

            It appears that updating the CI dataset to include more fully overlapping chips solved the majority of the problem. The plot below shows the sources detected on the sky with the three visits in this dataset. Small gray points are DiaObjects with a single DiaSource, medium blue points are DiaObjects with two DiaSources, and large orange points are DiaObjects with three DiaSources.

            It is clear from this plot that there are lots of single-source objects being detected near chip edges and corners. This is most likely due to artifacts in the difference imaging template. I suspect if this plot is recreated when DM-14762 is complete, many of the erroneous gray sources will disappear.
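
A plot in the style described above could be produced along these lines. This is a hedged sketch and not the notebook from the PR: the input is assumed to be a list of hypothetical `(ra, dec, n_sources)` tuples in degrees, how those values are read from the association database is not shown, and the marker sizes and the inverted RA axis are illustrative choices.

```python
from collections import defaultdict

# Marker style per number of associated DiaSources, following the description:
# small gray (1), medium blue (2), large orange (3). Sizes are illustrative.
STYLES = {
    1: {"s": 4, "color": "gray", "label": "1 DiaSource"},
    2: {"s": 16, "color": "tab:blue", "label": "2 DiaSources"},
    3: {"s": 40, "color": "tab:orange", "label": "3 DiaSources"},
}


def group_by_count(objects):
    """Group (ra, dec, n_sources) tuples into {n_sources: (ra_list, dec_list)}."""
    groups = defaultdict(lambda: ([], []))
    for ra, dec, n_sources in objects:
        groups[n_sources][0].append(ra)
        groups[n_sources][1].append(dec)
    return dict(groups)


def plot_dia_objects(objects, ax=None):
    """Scatter DiaObject positions, styled by number of DiaSources."""
    import matplotlib.pyplot as plt  # imported lazily so grouping works without it

    if ax is None:
        _, ax = plt.subplots()
    for n_sources, (ra, dec) in sorted(group_by_count(objects).items()):
        # Fall back to a generic style for counts not listed in STYLES.
        fallback = {"s": 40, "color": "black", "label": f"{n_sources} DiaSources"}
        ax.scatter(ra, dec, **STYLES.get(n_sources, fallback))
    ax.invert_xaxis()  # assumption: East (increasing RA) drawn to the left
    ax.set_xlabel("RA (deg)")
    ax.set_ylabel("Dec (deg)")
    ax.legend()
    return ax
```

Drawing the groups in order of increasing count puts the larger multi-source markers on top of the dense single-source points, which makes any concentration of gray points near chip edges and corners easier to see.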

            mrawls Meredith Rawls added a comment -

            Ian Sullivan, would you be willing to review this? The PR is simply to add the notebook used to generate the plot into the ap_pipe-notebooks repo.

            sullivan Ian Sullivan added a comment -

            Looks good. I would recommend you remove the commented-out cells at the beginning of the notebook (or add a comment explaining why you left them), and add a description of the final plot in a conclusion. That could probably just be the description you left as a comment here.


              People

              • Assignee:
                mrawls Meredith Rawls
              • Reporter:
                krzys Krzysztof Findeisen
              • Reviewers:
                Ian Sullivan
              • Watchers:
                Ian Sullivan, Krzysztof Findeisen, Meredith Rawls