Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-29370

Allow butler gen3 to retrieve amplifiers from raw

    XMLWordPrintable

    Details

    • Story Points:
      8
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      Eric Charles is doing EO testing and would like to be able to access the unadulterated raw pixel data for each amplifier. Currently he has to retrieve the raw with bounding boxes and has to revert flipping. This can be inefficient for a single amp.

      The request is to add a raw.amp derived component that can return the amp pixel data and header exactly as it is stored in the FITS file.

      Additional questions:

      • Should there be a raw.amps to return all the amps in a dict/list?
      • How does the amp number get passed to butler.get?
      • Should bounding boxes be allowed to request a subset of an amplifier?

      There might need to be some changes to daf_butler to support parameters for the derived component since currently all parameters are handled by the composite before the result is given to the derived component (so passing in bbox refers to a subset of the entire image before the derived component sees it).

        Attachments

          Issue Links

            Activity

            Hide
            jbosch Jim Bosch added a comment - - edited

            I think we definitely need to attach a Detector of some kind to the Exposure this method returns, or it won't be of much use, and it seems like constructing that out of in-untrimmed-assembled-coordinates Detector objects that come directly from the cameraGeom will be the vast majority of the work on this ticket.

            Circa DM-18610 I had big plans for making cameraGeom more amenable to that sort of thing, which got handed off to Christopher Waters because Gen3 was (still) calling, and then it promptly got buried beneath a pile of CPP and ISR issues for Chris. As it stands, the cameraGeom design is supposed to only ever represent the bounding boxes in a particular, standard state (post-assembly for detector-level boxes, assembled+untrimmed for amplifier boxes), because all other boxes for other states can be derived from that, and it's supposed to be true that if you do butler.get("camera", ...)[N] or butler.get("raw", detector=N, ...).getDetector() or butler.get("calexp", detector=N, ...).getDetector(), you get exactly the same thing each time, because the cameraGeom isn't supposed to know the assembly state and certainly isn't supposed to have bounding boxes that depend on it.

            All that said, I believe there is long-standing code in various places (including the obs_lsst raw formatters, IIRC) that basically ignores that and does treat the Detector and Amplifiers as things whose bounding boxes should be modified as the Exposure they are attached to is, and, well, it mostly works anyway - if you don't read the afw docstrings about what various bboxes are supposed to mean too carefully, and are willing to ignore the various attributes that don't make sense at all for some states, and don't actually expect those statements about butler.get consistency. It drives me absolutely bonkers, but the people who work with it regularly seem to be doing okay, and I just don't have time (and I don't think Chris has time) to really fix it.

            So, I am not optimistic about us solving that well anytime soon - at some level I think it's a symptom of us just declaring every image an Exposure instead of having different classes to represent different kinds of images and the states they can be in - and while whatever we do on this ticket needs to involve providing useful cameraGeom somehow, we need to be prepared to hold our noses about that making sense as part of a bigger picture if we want it to get done.

            Show
            jbosch Jim Bosch added a comment - - edited I think we definitely need to attach a Detector of some kind to the Exposure this method returns, or it won't be of much use, and it seems like constructing that out of in-untrimmed-assembled-coordinates Detector objects that come directly from the cameraGeom will be the vast majority of the work on this ticket. Circa DM-18610 I had big plans for making cameraGeom more amenable to that sort of thing, which got handed off to Christopher Waters because Gen3 was (still) calling, and then it promptly got buried beneath a pile of CPP and ISR issues for Chris. As it stands, the cameraGeom design is supposed to only ever represent the bounding boxes in a particular, standard state (post-assembly for detector-level boxes, assembled+untrimmed for amplifier boxes), because all other boxes for other states can be derived from that, and it's supposed to be true that if you do butler.get("camera", ...) [N] or butler.get("raw", detector=N, ...).getDetector() or butler.get("calexp", detector=N, ...).getDetector() , you get exactly the same thing each time, because the cameraGeom isn't supposed to know the assembly state and certainly isn't supposed to have bounding boxes that depend on it. All that said, I believe there is long-standing code in various places (including the obs_lsst raw formatters, IIRC) that basically ignores that and does treat the Detector and Amplifiers as things whose bounding boxes should be modified as the Exposure they are attached to is, and, well, it mostly works anyway - if you don't read the afw docstrings about what various bboxes are supposed to mean too carefully, and are willing to ignore the various attributes that don't make sense at all for some states, and don't actually expect those statements about butler.get consistency. It drives me absolutely bonkers, but the people who work with it regularly seem to be doing okay, and I just don't have time (and I don't think Chris has time) to really fix it. So, I am not optimistic about us solving that well anytime soon - at some level I think it's a symptom of us just declaring every image an Exposure instead of having different classes to represent different kinds of images and the states they can be in - and while whatever we do on this ticket needs to involve providing useful cameraGeom somehow, we need to be prepared to hold our noses about that making sense as part of a bigger picture if we want it to get done.
            Hide
            jbosch Jim Bosch added a comment -

            I think this ticket is no longer blocked by DM-29505 prototyping (which I've indicated by closing that ticket). That was a useful thought experiment and it may come back someday, but for now we've agreed that the best way to represent the camera geometry of the single-amplifier image returned by the functionality requested by this ticket is just to attach an afw.image.Detector object that contains a single amplifier in the appropriate state.

            For LSST data, that should be straightforward because the Amplifier objects start out in a state that matches the actual raw data (not the "raw" that butler returns), and are transformed in the raw formatter code. So we just need to intercept those Amplifier objects early and return them before that transformation. For other instruments that have preassembled/untrimmed raws, we can also just return the Amplifier objects we start with, because those don't ever change. In both cases we'll be asking the Detector to "rebuild" itself with just one Amplifier.

            I think that's enough information for either Tim Jenness or myself to do the work, though I'm not sure how much Tim knows about those CameraGeom APIs, so it might make more sense for me to do it.

            I think we might want a new StorageClass or two that also maps to afw.image.Exposure, for raw and raw.amp, to enable this, though trying to redefine the raw dataset type in a bunch of live repos sounds unpleasant in a character-building, "I guess this is something we need to be able to do" kind of way. It'd be easier to just add an amp read-component to the Exposure StorageClass and just expect that most formatters won't implement it.

            We will also need a new ticket to enable reading single amplifiers from at least bias images; that will be trickier (we need to essentially write the inverse of the LSST raw formatter assembly code), and Robert Lupton has offered to write the guts of it. I imagine that means biases and perhaps some other master calibs would also use the new StorageClass, if we have one.

            Show
            jbosch Jim Bosch added a comment - I think this ticket is no longer blocked by DM-29505 prototyping (which I've indicated by closing that ticket). That was a useful thought experiment and it may come back someday, but for now we've agreed that the best way to represent the camera geometry of the single-amplifier image returned by the functionality requested by this ticket is just to attach an afw.image.Detector object that contains a single amplifier in the appropriate state. For LSST data, that should be straightforward because the Amplifier objects start out in a state that matches the actual raw data (not the "raw" that butler returns), and are transformed in the raw formatter code. So we just need to intercept those Amplifier objects early and return them before that transformation. For other instruments that have preassembled/untrimmed raws, we can also just return the Amplifier objects we start with, because those don't ever change. In both cases we'll be asking the Detector to "rebuild" itself with just one Amplifier . I think that's enough information for either Tim Jenness or myself to do the work, though I'm not sure how much Tim knows about those CameraGeom APIs, so it might make more sense for me to do it. I think we might want a new StorageClass or two that also maps to afw.image.Exposure , for raw and raw.amp , to enable this, though trying to redefine the raw dataset type in a bunch of live repos sounds unpleasant in a character-building, "I guess this is something we need to be able to do" kind of way. It'd be easier to just add an amp read-component to the Exposure StorageClass and just expect that most formatters won't implement it. We will also need a new ticket to enable reading single amplifiers from at least bias images; that will be trickier (we need to essentially write the inverse of the LSST raw formatter assembly code), and Robert Lupton has offered to write the guts of it. I imagine that means biases and perhaps some other master calibs would also use the new StorageClass, if we have one.
            Hide
            jbosch Jim Bosch added a comment -

            This took much longer than expected because I kept finding new little things that needed fixing along the way, and testing this kind of geometric bookkeeping is always a pain, but I think this is finally ready for review.

            It also includes everything that I'd originally planned to defer to DM-29732, because it turned out that was essentially the same as making it work for non-LSST raws, and I wanted the logic to test the LSST raw case, too.

            Robert Lupton, I'm hoping you can review; this'll be your chance to move out of last place on the Science Pipelines reviewing leaderboard , but it's a shame you won't get credit there for a larger-than-average ticket.

            Changes are split across five packages (as usual, Jira doesn't see the afw one), and I've tried to define the commits to make reviewing commit-by-commit more digestible than looking at the full changes at once.  The PRs are (in the order I'd recommend looking at them):

            • utils: adds some flexibility to NaN comparisons in our workhorse array-comparison test assertion.
            • daf_butler: just some YAML declarations for the new read parameters, but that includes the detailed documentation (as YAML comments, sadly; DM-29985) for how to use the new parameters and what exactly one can expect to get back.
            • afw: most of the new geometric logic and a long-overdue redocumentation of amp attributes to reflect how they are actually used.
            • obs_base: support for per-amp reads in the base formatter class for raws (this implementation is inherited directly by all but obs_lsst) and the concrete formatter for non-raw Exposures.  Also support for Butler.get("raw.detector").
            • obs_lsst: support for per-amp reads in the raw formatter class, because the base class implementation won't work given its unique (among DM-maintained obs packages) on-disk layout.  Also fixes to make Butler.get("raw.detector") return the right thing.

             

            Show
            jbosch Jim Bosch added a comment - This took much longer than expected because I kept finding new little things that needed fixing along the way, and testing this kind of geometric bookkeeping is always a pain, but I think this is finally ready for review. It also includes everything that I'd originally planned to defer to DM-29732 , because it turned out that was essentially the same as making it work for non-LSST raws, and I wanted the logic to test the LSST raw case, too. Robert Lupton , I'm hoping you can review; this'll be your chance to move out of last place on the Science Pipelines reviewing leaderboard , but it's a shame you won't get credit there for a larger-than-average ticket. Changes are split across five packages (as usual, Jira doesn't see the afw one), and I've tried to define the commits to make reviewing commit-by-commit more digestible than looking at the full changes at once.  The PRs are (in the order I'd recommend looking at them): utils : adds some flexibility to NaN comparisons in our workhorse array-comparison test assertion. daf_butler : just some YAML declarations for the new read parameters, but that includes the detailed documentation (as YAML comments, sadly; DM-29985 ) for how to use the new parameters and what exactly one can expect to get back. afw : most of the new geometric logic and a long-overdue redocumentation of amp attributes to reflect how they are actually used. obs_base : support for per-amp reads in the base formatter class for raws (this implementation is inherited directly by all but obs_lsst) and the concrete formatter for non-raw Exposures.  Also support for Butler.get("raw.detector"). obs_lsst : support for per-amp reads in the raw formatter class, because the base class implementation won't work given its unique (among DM-maintained obs packages) on-disk layout.  Also fixes to make Butler.get("raw.detector") return the right thing.  
            Hide
            jbosch Jim Bosch added a comment -

            Christopher Waters, here's that review-transfer from Robert Lupton I mentioned.

            Show
            jbosch Jim Bosch added a comment - Christopher Waters , here's that review-transfer from Robert Lupton I mentioned.
            Hide
            czw Christopher Waters added a comment -

            A few comments on PR, nothing I think is major.

            Show
            czw Christopher Waters added a comment - A few comments on PR, nothing I think is major.

              People

              Assignee:
              jbosch Jim Bosch
              Reporter:
              tjenness Tim Jenness
              Reviewers:
              Christopher Waters
              Watchers:
              Christopher Waters, Eric Charles, James Chiang, Jim Bosch, Robert Lupton, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.