Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-13887

Let ap_verify process multiple images per instance

    Details

    • Story Points:
      6
    • Epic Link:
    • Sprint:
      AP S19-1, AP S19-2
    • Team:
      Alert Production

      Description

      Currently, ap_verify can process only a single image at a time. Now that ap_pipe is a command-line task, this limitation is driven only by internal factors:

      • Its error-handling strategy requires direct access to an instance of ApPipeTask, bypassing CmdLineTask's own parallel-processing support. Conversely, neither of the error-handling strategies provided by CmdLineTask may be appropriate for verification work.
      • Its metadata harvesting strategy requires either access to an instance of ApPipeTask, or knowledge of which data IDs have been processed.
      • Its metrics generation strategy assumes knowledge of which data IDs have been processed.

      Story points to be determined, as the above issues may be more efficiently handled as precursor tickets.

        Attachments

          Issue Links

            Activity

            Hide
            krzys Krzysztof Findeisen added a comment -

            Meredith Rawls, could you please review the changes to ap_verify and ap_pipe? Thanks!

            Jonathan Sick, while working on this ticket I found a mistake in some of the documentation for pipe_base. Can you please check that the new text (3 lines) is clear enough? Thanks!

            Show
            krzys Krzysztof Findeisen added a comment - Meredith Rawls , could you please review the changes to ap_verify and ap_pipe ? Thanks! Jonathan Sick , while working on this ticket I found a mistake in some of the documentation for pipe_base . Can you please check that the new text (3 lines) is clear enough? Thanks!
            Hide
            jsick Jonathan Sick added a comment -

            Thanks for fixing that up!

            Show
            jsick Jonathan Sick added a comment - Thanks for fixing that up!
            Hide
            mrawls Meredith Rawls added a comment -

            Thanks for implementing this! At a glance it looks great, but I'm going to look more closely and try running it to make sure it works as I'd expect early next week.

            Show
            mrawls Meredith Rawls added a comment - Thanks for implementing this! At a glance it looks great, but I'm going to look more closely and try running it to make sure it works as I'd expect early next week.
            Hide
            mrawls Meredith Rawls added a comment -

            I ran ap_verify with the 2019_01 weekly on a single visit+ccd, and I also used this ticket branch to run ap_verify on a pair of visits and a pair of ccds (4 total exposures). Both seemed to work. They are in /project/mrawls/ap_verify_201901 and /project/mrawls/ap_verify_DM13887 if you want to take a look, and I manually added log.txt to each of those directories. I have a few comments about the log message and the output files.

            • There are still a few instances of "cfitsio error stack" log messages related to defect ingestion which look scary
            • I'm glad it doesn't re-ingest and just does processCcd onward for each additional dataId, good
            • I was somewhat surprised to see a bunch of .json files appear in the directory from where I called the `ap_verify.py` command, which probably tells you how often I run ap_verify. I naively expected all output to go to the "output" location specified on the command line and not clutter up my hastily-chosen working directory. (I since manually moved them to each directory listed above, but I suspect one or more .json files from my initial ap_verify run was overwritten when I ran ap_verify a second time.)

            Overall this is a great improvement! I left some more comments on GitHub, but I have no problem with this being merged.

             

            Show
            mrawls Meredith Rawls added a comment - I ran ap_verify with the 2019_01 weekly on a single visit+ccd, and I also used this ticket branch to run ap_verify on a pair of visits and a pair of ccds (4 total exposures). Both seemed to work. They are in /project/mrawls/ap_verify_201901 and /project/mrawls/ap_verify_DM13887 if you want to take a look, and I manually added log.txt to each of those directories. I have a few comments about the log message and the output files. There are still a few instances of "cfitsio error stack" log messages related to defect ingestion which look scary I'm glad it doesn't re-ingest and just does processCcd onward for each additional dataId, good I was somewhat surprised to see a bunch of .json files appear in the directory from where I called the `ap_verify.py` command, which probably tells you how often I run ap_verify. I naively expected all output to go to the "output" location specified on the command line and not clutter up my hastily-chosen working directory. (I since manually moved them to each directory listed above, but I suspect one or more .json files from my initial ap_verify run was overwritten when I ran ap_verify a second time.) Overall this is a great improvement! I left some more comments on GitHub, but I have no problem with this being merged.  
            Hide
            krzys Krzysztof Findeisen added a comment - - edited

            Thanks for the review!

            • The fits warnings are a known issue in the ap_verify_ci_hits2015 dataset. I removed a lot of HDUs from the raw images to keep the file size down; obs_decam can still read the files by using the HDU names, but it does complain about the unexpected format.
            • I've deferred changing the default .json location to DM-17248. While this would be a useful feature, it may be tricky to add without breaking previous code, so I'd rather not do it as part of this ticket.
            Show
            krzys Krzysztof Findeisen added a comment - - edited Thanks for the review! The fits warnings are a known issue in the ap_verify_ci_hits2015 dataset. I removed a lot of HDUs from the raw images to keep the file size down; obs_decam can still read the files by using the HDU names, but it does complain about the unexpected format. I've deferred changing the default .json location to DM-17248 . While this would be a useful feature, it may be tricky to add without breaking previous code, so I'd rather not do it as part of this ticket.

              People

              • Assignee:
                krzys Krzysztof Findeisen
                Reporter:
                krzys Krzysztof Findeisen
                Reviewers:
                Meredith Rawls
                Watchers:
                Jonathan Sick, Krzysztof Findeisen, Meredith Rawls
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Summary Panel