Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-15503

Improve raw data handling in gen2convert

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      DM-15189 improves support for raw data by adding new Formatters that provide the adaptation logic provided by std_ hooks in Gen2.  But those Formatters aren't used in gen2convert yet, because gen2convert doesn't see raw data as special.  As a result, loading raw with a Gen3 Butler pointed at a converted repo just gets you a DecoratedImage, not an Exposure.

      The simplest fix is probably to have gen2convert just run raw ingest instead of transferring raws via the usual mechanism.  The tricky part of that is to provide a way for gen2convert to identify the RawIngestTask specialization for a particular camera.  We need to solve that problem anyway, but I'd hoped to get some CameraGeom and package organization problems in better shape before tackling it.  An alternative may be to just force the user to provide the task name in the gen2convert configuration for now.

      Tentatively assigning this to me since I have a pretty good idea in my head of how to approach this, but Christopher Waters is also a possibility.

       

        Attachments

          Issue Links

            Activity

            jbosch Jim Bosch created issue -
            jbosch Jim Bosch made changes -
            Field Original Value New Value
            Link This issue relates to DM-15189 [ DM-15189 ]
            jbosch Jim Bosch made changes -
            Risk Score 0
            jbosch Jim Bosch made changes -
            Status To Do [ 10001 ] In Progress [ 3 ]
            jbosch Jim Bosch made changes -
            Epic Link DM-16675 [ 235235 ]
            jbosch Jim Bosch made changes -
            Story Points 4 2
            Description DM-15189 improves support for raw data by adding new Formatters that provide the adaptation logic provided by {{std_}} hooks in Gen2.  But those Formatters aren't used in {{gen2convert}} yet, because {{gen2convert}} doesn't see raw data as special.  As a result, loading {{raw}} with a Gen3 Butler pointed at a converted repo just gets you a {{DecoratedImage}}, not an {{Exposure}}.

            The simplest fix is probably to have {{gen2convert}} just run raw ingest instead of transferring raws via the usual mechanism.  The tricky part of that is to provide a way for {{gen2convert}} to identify the {{RawIngestTask}} specialization for a particular camera.  We need to solve that problem anyway, but I'd hoped to get some CameraGeom and package organization problems in better shape before tackling it.  An alternative may be to just force the user to provide the task name in the {{gen2convert}} configuration for now.

            Tentatively assigning this to me since I have a pretty good idea in my head of how to approach this, but [~czw] is also a possibility.

            [~mbutler], this ticket may interest you as it's the reason you couldn't use {{raw}} as the input for your dummy {{PipelineTask}} last week.
            DM-15189 improves support for raw data by adding new Formatters that provide the adaptation logic provided by {{std_}} hooks in Gen2.  But those Formatters aren't used in {{gen2convert}} yet, because {{gen2convert}} doesn't see raw data as special.  As a result, loading {{raw}} with a Gen3 Butler pointed at a converted repo just gets you a {{DecoratedImage}}, not an {{Exposure}}.

            The simplest fix is probably to have {{gen2convert}} just run raw ingest instead of transferring raws via the usual mechanism.  The tricky part of that is to provide a way for {{gen2convert}} to identify the {{RawIngestTask}} specialization for a particular camera.  We need to solve that problem anyway, but I'd hoped to get some CameraGeom and package organization problems in better shape before tackling it.  An alternative may be to just force the user to provide the task name in the {{gen2convert}} configuration for now.

            Tentatively assigning this to me since I have a pretty good idea in my head of how to approach this, but [~czw] is also a possibility.

             
            yusra Yusra AlSayyad made changes -
            Link This issue blocks DM-16467 [ DM-16467 ]
            jbosch Jim Bosch made changes -
            Link This issue relates to DM-16803 [ DM-16803 ]
            Hide
            jbosch Jim Bosch added a comment -

            This is basically done, except:

            • it's branched off DM-15034, so I'd like to merge that before putting this in review, unless it drags out;
            • there are no tests for the actual feature of the ticket (though other changes are covered by existing tests).

            I'd like to defer the tests to DM-16803, because I don't want to hack something new into ci_hsc when we'd get a lot more value (even short-term) from a bit of refactoring to make tests like these easy to write.  But I also don't want to block this ticket on that one, so I'm inclined to just live on the edge for a bit, given that one-off testing works and there'll be a lot more one-off testing of this functionality in the near term on DM-16467.

            Show
            jbosch Jim Bosch added a comment - This is basically done, except: it's branched off DM-15034 , so I'd like to merge that before putting this in review, unless it drags out; there are no tests for the actual feature of the ticket (though other changes are covered by existing tests). I'd like to defer the tests to DM-16803 , because I don't want to hack something new into ci_hsc when we'd get a lot more value (even short-term) from a bit of refactoring to make tests like these easy to write.  But I also don't want to block this ticket on that one, so I'm inclined to just live on the edge for a bit, given that one-off testing works and there'll be a lot more one-off testing of this functionality in the near term on DM-16467 .
            Hide
            jbosch Jim Bosch added a comment - - edited

            Christopher Waters, sorry for two reviews in rapid succession, but I think you're interested in this one: gen2convert now retrieves the correct formatter from the Instrument class, which has been refactored to provide the same to Gen3 ingest.  I've also updated ci_hsc to write a "camera" dataset that you should be able to take as an input in the ISR PipelineTask to work around Gen3 raws not having a camera attached directly.

            In case Jira doesn't spot all of the PRs, changes are in daf_butler, obs_base, obs_subaru, and ci_hsc.

            Show
            jbosch Jim Bosch added a comment - - edited Christopher Waters , sorry for two reviews in rapid succession, but I think you're interested in this one: gen2convert now retrieves the correct formatter from the Instrument class, which has been refactored to provide the same to Gen3 ingest.  I've also updated ci_hsc to write a "camera" dataset that you should be able to take as an input in the ISR PipelineTask to work around Gen3 raws not having a camera attached directly. In case Jira doesn't spot all of the PRs, changes are in daf_butler , obs_base , obs_subaru , and ci_hsc .
            jbosch Jim Bosch made changes -
            Reviewers Christopher Waters [ cwaters ]
            Status In Progress [ 3 ] In Review [ 10004 ]
            Hide
            czw Christopher Waters added a comment -

            I've added comments on the pull requests.  My only major concern is a carryover from working on DM-16467, where calibration data needs to be in a collection, but that collection needs to JOIN correctly with all other collections in the repository.  I had to hack this in my testing database, and although not technically part of this ticket, this ticket does move all calibration data into a `calib` repository.

            Show
            czw Christopher Waters added a comment - I've added comments on the pull requests.  My only major concern is a carryover from working on DM-16467 , where calibration data needs to be in a collection, but that collection needs to JOIN correctly with all other collections in the repository.  I had to hack this in my testing database, and although not technically part of this ticket, this ticket does move all calibration data into a `calib` repository.
            jbosch Jim Bosch made changes -
            Link This issue is triggering DM-16829 [ DM-16829 ]
            czw Christopher Waters made changes -
            Status In Review [ 10004 ] Reviewed [ 10101 ]
            Hide
            jbosch Jim Bosch added a comment -

            Merged to master.

             

            Show
            jbosch Jim Bosch added a comment - Merged to master.  
            jbosch Jim Bosch made changes -
            Resolution Done [ 10000 ]
            Status Reviewed [ 10101 ] Done [ 10002 ]

              People

              • Assignee:
                jbosch Jim Bosch
                Reporter:
                jbosch Jim Bosch
                Reviewers:
                Christopher Waters
                Watchers:
                Christopher Waters, Jim Bosch
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Summary Panel