Data Management / DM-14762

Rerun complete HiTS 2015 data processing on the VC

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Story Points:
      4
    • Epic Link:
    • Sprint:
      AP F18-2, AP F18-3, AP F18-4, AP F18-5, AP F18-6
    • Team:
      Alert Production

      Description

      Following DM-14259, we can run the AP pipeline on the verification cluster, and following DM-15080, we have new templates. Process HiTS 2015, difference it against those templates, and record the results.

      Aim to come up with the fastest, least manual procedure possible for this, so that we can easily perform regular reprocessing using the same commands.

            Activity

            swinbank John Swinbank created issue -
            swinbank John Swinbank made changes -
            Field Original Value New Value
            Epic Link DM-14464 [ 81358 ]
            swinbank John Swinbank made changes -
            Risk Score 0
            swinbank John Swinbank made changes -
            Link This issue is blocked by DM-14259 [ DM-14259 ]
            swinbank John Swinbank made changes -
            Link This issue is blocked by DM-15080 [ DM-15080 ]
            swinbank John Swinbank made changes -
            Sprint AP F18-1 [ 746 ] AP F18-2 [ 747 ]
            Story Points 6 4
            Description (original value): Following DM-14259, we can run the AP pipeline on the verification cluster.

            Use that capability to perform a complete reprocessing from scratch:

            * Process HiTS 2014, use it to build template coadds;
            * Process HiTS 2015, difference it with those templates, record the results.

            Aim to come up with the fastest/least manual possible procedure for this, so that we can easily perform regular reprocessing using the same commands.
            Description (new value): Following DM-14259, we can run the AP pipeline on the verification cluster and following DM-15080 we have new templates. Use them to process HiTS 2015, difference it with those templates, and record the results.

            Aim to come up with the fastest/least manual possible procedure for this, so that we can easily perform regular reprocessing using the same commands.
            swinbank John Swinbank made changes -
            Summary Rerun complete HiTS 2014 & 2015 data processing on the VC Rerun complete HiTS 2015 data processing on the VC
            swinbank John Swinbank made changes -
            Link This issue blocks DM-15081 [ DM-15081 ]
            swinbank John Swinbank made changes -
            Sprint AP F18-2 [ 747 ] AP F18-2, AP F18-3 [ 747, 748 ]
            mrawls Meredith Rawls made changes -
            Link This issue blocks DM-15081 [ DM-15081 ]
            swinbank John Swinbank made changes -
            Sprint AP F18-2, AP F18-3 [ 747, 748 ] AP F18-2, AP F18-3, AP F18-4 [ 747, 748, 749 ]
            swinbank John Swinbank made changes -
            Sprint AP F18-2, AP F18-3, AP F18-4 [ 747, 748, 749 ] AP F18-2, AP F18-3, AP F18-4, AP F18-5 [ 747, 748, 749, 750 ]
            mrawls Meredith Rawls made changes -
            Status To Do [ 10001 ] In Progress [ 3 ]
            swinbank John Swinbank added a comment -

            At our meeting (Eric Bellm, Meredith Rawls, John Swinbank) of 2018-10-30 we agreed that it would be fun to try excluding DIASources within N pixels of the edge of a chip, based on the DIASource x, y. We reckon that's probably easy and worth spending ~an hour on in the context of this ticket. If it turns out not to be easy, make a new ticket for it.
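
The edge cut proposed above could be sketched roughly as follows, assuming the DIASource x, y pixel positions are available as arrays. The function name `away_from_edge` and the default chip dimensions (2048 x 4096, used here as a placeholder for a DECam-like CCD) are assumptions for illustration, not part of ap_pipe.

```python
import numpy as np

# Assumed chip dimensions in pixels (placeholder for a DECam-like CCD).
CHIP_WIDTH = 2048
CHIP_HEIGHT = 4096


def away_from_edge(x, y, n_pix, width=CHIP_WIDTH, height=CHIP_HEIGHT):
    """Return a boolean mask: True where a source is at least n_pix from every chip edge.

    Hypothetical helper; x and y are 0-based pixel coordinates on a single chip.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    inside_x = (x >= n_pix) & (x <= width - 1 - n_pix)
    inside_y = (y >= n_pix) & (y <= height - 1 - n_pix)
    return inside_x & inside_y


# Made-up positions: one near the left edge, one central,
# one near the right edge, one near the top edge.
x = np.array([5.0, 1000.0, 2040.0, 300.0])
y = np.array([2000.0, 2000.0, 2000.0, 4090.0])
keep = away_from_edge(x, y, n_pix=50)
print(keep)  # only the central source survives the cut
```

The mask can then be applied to whatever per-source table is being plotted, with N left as a tunable parameter.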

            mrawls Meredith Rawls added a comment -

            Gabor Kovacs, would you be willing to review this?

            The ap_pipe rerun is located on lsst-dev in /project/mrawls/hits2015/rerun/newtemplate1. To run ap_pipe, I used slurm scripts created by an updated version of prep_ap_pipe.sh (pushed to ap_pipe for you to review).
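
As a rough illustration of generating one slurm batch script per visit, the sketch below renders script text from a template. It does not reproduce the contents of prep_ap_pipe.sh: the function name, the visit numbers, the time limit, and the placeholder command are all hypothetical, and the actual pipeline command line is deliberately left to the caller.

```python
SBATCH_TEMPLATE = """#!/bin/bash
#SBATCH --job-name={job_name}
#SBATCH --output={log_dir}/{job_name}.%j.log
#SBATCH --nodes=1
#SBATCH --time={time_limit}

{command}
"""


def make_slurm_script(visit, command_template, log_dir="logs", time_limit="04:00:00"):
    """Render the text of one batch script for a single visit.

    command_template is a caller-supplied string containing "{visit}";
    the real pipeline invocation is not assumed here.
    """
    job_name = f"ap_pipe_{visit}"
    command = command_template.format(visit=visit)
    return SBATCH_TEMPLATE.format(
        job_name=job_name,
        log_dir=log_dir,
        time_limit=time_limit,
        command=command,
    )


# Hypothetical visit numbers and a placeholder command, for illustration only.
for visit in (1001, 1002):
    print(make_slurm_script(visit, "echo 'run the pipeline for visit {visit} here'"))
```

Keeping the command as a template means the same generator can be reused for regular reprocessing by changing only the visit list and the command string.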

            The main thing to look at, however, is the new notebook in ap_pipe-notebooks. There are some histograms of the objects and sources detected along with some plots of objects/sources on the sky. Among other things, it shows that, as demonstrated in DM-15080, the new templates are at least not worse than before, and are arguably better.

            mrawls Meredith Rawls made changes -
            Reviewers Gabor Kovacs [ gkovacs ]
            Status In Progress [ 3 ] In Review [ 10004 ]
            mrawls Meredith Rawls added a comment - - edited

            It's worth noting there are PRs both here and here which Jira is being slow to pick up on due to its now-daily-5pm-crash habit.

            As I reported during standup today (11/1), it is non-trivial to exclude sources near the edges of chips from the plots, as John Swinbank requested above, because the x, y pixel information is not yet in the association database. Therefore, I've skipped that for now.

            mrawls Meredith Rawls made changes -
            Link This issue relates to DM-16406 [ DM-16406 ]
            gkovacs Gabor Kovacs made changes -
            Assignee Meredith Rawls [ mrawls ] Gabor Kovacs [ gkovacs ]
            gkovacs Gabor Kovacs added a comment -

            Ok, I'll do the review.

            swinbank John Swinbank made changes -
            Assignee Gabor Kovacs [ gkovacs ] Meredith Rawls [ mrawls ]
            swinbank John Swinbank made changes -
            Sprint AP F18-2, AP F18-3, AP F18-4, AP F18-5 [ 747, 748, 749, 750 ] AP F18-2, AP F18-3, AP F18-4, AP F18-5, AP F18-6 [ 747, 748, 749, 750, 751 ]
            gkovacs Gabor Kovacs added a comment - - edited

            If we think that DM-14762-New-Template-Reprocessing.ipynb will be re-used in the future, I'd suggest adding a short summary of what the pipeline feature/configuration differences actually were between the "old" and "new" template coadds; copy-pasting a few relevant lines from DM-15080 would do. (As I understand it, the image selection and PSF matching options were different.)


            In the figure in the last section (Define a "mini region" with a few thousand objects for future investigations), the black (low nDiaSources) patterns look quite different on the top-right corner chips. Also, the horizontal "black lines" are more pronounced with the new templates, and not just at the edges. Do we understand or care about these?

            gkovacs Gabor Kovacs made changes -
            Status In Review [ 10004 ] Reviewed [ 10101 ]
            mrawls Meredith Rawls added a comment -

            Thank you!

            To briefly address your questions: the main difference between the old and new templates is whatever has changed in the stack since the old ones were made (nearly a year ago now). For example, this includes crosstalk correction being implemented in obs_decam. The image selection is slightly different, but both used a best-seeing subset of all the HiTS 2014 visits, and both are PSF-matched coadds.

            The horizontal black lines are edges and bad columns in the CCDs. These sources are almost certainly in masked pixel regions, but we do not yet have the ability to tell if a source or object falls in a masked region. These plots will look much cleaner when we visualize objects with sources that fall in unmasked/unflagged areas only. It's not entirely clear why the new templates result in more sources per object in these bad regions than the old ones did, but it ultimately shouldn't matter.
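
Once a per-source "good" flag exists, counting sources per object in unmasked areas only could look something like the sketch below. The function name, the `good` array, and the toy IDs are assumptions for illustration; as noted above, this information was not yet available in the association database.

```python
import numpy as np


def sources_per_object(object_ids, good=None):
    """Count sources per object, optionally keeping only sources flagged as good.

    object_ids: the associated object ID for each source.
    good: optional boolean array (hypothetical flag), True for sources
          that fall in unmasked/unflagged pixels.
    """
    object_ids = np.asarray(object_ids)
    if good is not None:
        object_ids = object_ids[np.asarray(good, dtype=bool)]
    ids, counts = np.unique(object_ids, return_counts=True)
    return dict(zip(ids.tolist(), counts.tolist()))


# Toy data: six sources associated with three objects.
object_ids = [10, 10, 10, 11, 12, 12]
good = [True, False, True, True, False, False]

all_counts = sources_per_object(object_ids)
clean_counts = sources_per_object(object_ids, good)
print(all_counts)
print(clean_counts)
```

Objects whose sources all fall in flagged regions drop out of the filtered counts entirely, which is the effect that should clean up the sky plots.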

            mrawls Meredith Rawls made changes -
            Resolution Done [ 10000 ]
            Status Reviewed [ 10101 ] Done [ 10002 ]

              People

              Assignee:
              mrawls Meredith Rawls
              Reporter:
              swinbank John Swinbank
              Reviewers:
              Gabor Kovacs
              Watchers:
              Gabor Kovacs, John Swinbank, Krzysztof Findeisen, Meredith Rawls
              Votes:
              0