Data Management / DM-4168

productize "Data repository selection based on version"

    Details

    • Type: Story
    • Status: Won't Fix
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: butler
    • Labels:
      None
    • Story Points:
      6
    • Sprint:
      DB_W16_02
    • Team:
      Data Access and Database

      Description

      finish & productize work from DM-5608

        Attachments

          Issue Links

            Activity

            npease Nate Pease [X] (Inactive) created issue -
            npease Nate Pease [X] (Inactive) added a comment -

            Consult https://jira.lsstcorp.org/browse/RFC-95 (search for “version”) and read down from there to see how they want to configure and specify repository roots. They talk about rerun a lot, and that is captured here. What is not captured:
            There may be multiple versions of a repository (such as data release 1 and data release 2), and users need to be able to select easily between them.
            We also want to be able to select different versions of different reference catalogs through the butler (right now they are selected through EUPS).

            frossie Frossie Economou made changes -
            Field Original Value New Value
            Component/s butler [ 12317 ]
            Team Data Access and Database [ 10204 ]
            npease Nate Pease [X] (Inactive) made changes -
            Remote Link This issue links to "Page (Confluence)" [ 13230 ]
            npease Nate Pease [X] (Inactive) made changes -
            Description
            Original Value:
            Add support in tasks so that a version of a repository (that may have many versions e.g. data release 1, data release 2, ...) may be selected via a command line argument and/or by setting an environment variable.

            This includes support for rerun, preprocessed, and astrometry_net_data, which all support the idea of different versions of data, which must be selectable.

            More details can be found in RFC-95
            New Value:
            Add support in tasks so that a version of a repository (that may have many versions e.g. data release 1, data release 2, ...) may be selected via a command line argument and/or by setting an environment variable.

            This includes support for rerun, preprocessed, and astrometry_net_data, which all support the idea of different versions of data, which must be selectable.
            UW also states: Handling of bitemporal calibration products including camera descriptions. N.b. calibration products can be lots of different things: objects, images, telemetry data, sky model, etc.

            More details can be found in RFC-95
            npease Nate Pease [X] (Inactive) made changes -
            Description
            Original Value:
            Add support in tasks so that a version of a repository (that may have many versions e.g. data release 1, data release 2, ...) may be selected via a command line argument and/or by setting an environment variable.

            This includes support for rerun, preprocessed, and astrometry_net_data, which all support the idea of different versions of data, which must be selectable.
            UW also states: Handling of bitemporal calibration products including camera descriptions. N.b. calibration products can be lots of different things: objects, images, telemetry data, sky model, etc.

            More details can be found in RFC-95
            New Value:
            Add support in tasks so that a version of a repository (that may have many versions e.g. data release 1, data release 2, ...) may be selected via a command line argument and/or by setting an environment variable.

            This includes support for rerun, preprocessed, and astrometry_net_data, which all support the idea of different versions of data, which must be selectable.
            UW repo requirements include: calibration products including camera descriptions. N.b. calibration products can be lots of different things: objects, images, telemetry data, sky model, etc.

            More details can be found in RFC-95
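The selection mechanism this description asks for (a command line argument and/or an environment variable) might look roughly like the sketch below. It is only an illustration: the flag `--repo-version`, the variable `REPO_VERSION`, and the `latest` default are hypothetical names, not taken from the butler code or from RFC-95. The command line takes precedence over the environment here, which is an assumed design choice.

```python
import argparse
import os

# Hypothetical names: neither the flag nor the variable comes from the butler code.
ENV_VAR = "REPO_VERSION"
DEFAULT_VERSION = "latest"


def resolve_repo_version(argv, environ=None):
    """Return the repository version: CLI flag first, then environment, then default."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--repo-version", default=None)
    # parse_known_args leaves any other task arguments untouched.
    args, _unknown = parser.parse_known_args(argv)
    if args.repo_version is not None:
        return args.repo_version
    return environ.get(ENV_VAR, DEFAULT_VERSION)
```

For example, `resolve_repo_version(["--repo-version", "dr2"], {"REPO_VERSION": "dr1"})` picks the command line value over the environment.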
            npease Nate Pease [X] (Inactive) made changes -
            Remote Link This issue links to "Page (Confluence)" [ 13241 ]
            rowen Russell Owen added a comment -

            There are other important aspects to this. For example, color term correction data is keyed by two different items, both of which may change with time: the camera used to collect the data being corrected (possibly even the type of CCD in that camera), and the reference catalog.
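A minimal sketch of the two-part key described in this comment, assuming a simple in-memory table. The names `ColorTermKey`, `color_terms`, and `find_color_terms`, and the placeholder entries, are hypothetical and only illustrate the shape of the lookup; real color terms would be coefficient sets rather than strings.

```python
from typing import Dict, NamedTuple


class ColorTermKey(NamedTuple):
    camera: str       # camera (or CCD type) that collected the data
    ref_catalog: str  # reference catalog, including its version


# Placeholder payloads standing in for real color term data.
color_terms: Dict[ColorTermKey, str] = {
    ColorTermKey("camA-ccd1", "refcat-v1"): "terms for camA-ccd1 vs refcat-v1",
    ColorTermKey("camA-ccd1", "refcat-v2"): "terms for camA-ccd1 vs refcat-v2",
}


def find_color_terms(camera, ref_catalog):
    """Look up color terms for one (camera, reference catalog) pair."""
    return color_terms[ColorTermKey(camera, ref_catalog)]
```

Because either half of the key can change independently, changing the reference catalog version selects a different entry even when the camera stays the same.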

            krughoff Simon Krughoff added a comment -

            I would also really like to be able to do this by time. E.g., "Butler, please give me the color correction terms I should have used if I was reducing this data last week."

            The default would be "now", and I think we will want to tag certain times as special, e.g. Release x.
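The time-based lookup requested here could be sketched as follows, assuming each value is recorded with the time it became current. The class `VersionedLookup` and its methods are hypothetical and are not part of any butler API. Queries default to "now", and a tag such as a release name resolves to a stored time.

```python
import bisect
from datetime import datetime


class VersionedLookup:
    """Values recorded with the time they became current; query as of a time or a tag."""

    def __init__(self):
        self._times = []   # sorted datetimes
        self._values = []  # value that became current at the matching time
        self._tags = {}    # tag name -> datetime

    def record(self, when, value):
        # Insert while keeping both lists ordered by time.
        idx = bisect.bisect_right(self._times, when)
        self._times.insert(idx, when)
        self._values.insert(idx, value)

    def tag(self, name, when):
        self._tags[name] = when

    def get(self, as_of=None):
        if as_of is None:
            as_of = datetime.now()      # default is "now"
        elif isinstance(as_of, str):
            as_of = self._tags[as_of]   # a tag names a special time
        idx = bisect.bisect_right(self._times, as_of)
        if idx == 0:
            raise LookupError("no value recorded at or before %s" % as_of)
        return self._values[idx - 1]
```

A query by tag and a query by explicit time go through the same path, so "what I should have used last week" is just `get(some_datetime)`.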

            ktl Kian-Tat Lim made changes -
            Remote Link This issue links to "Page (Confluence)" [ 13261 ]
            jbecla Jacek Becla made changes -
            Epic Link DM-2404 [ 16718 ]
            jbecla Jacek Becla made changes -
            Story Points 12
            jbecla Jacek Becla made changes -
            Sprint DB_W16_02 [ 179 ]
            jbecla Jacek Becla made changes -
            Assignee Nate Pease [ npease ]
            npease Nate Pease [X] (Inactive) made changes -
            Status To Do [ 10001 ] In Progress [ 3 ]
            jbecla Jacek Becla made changes -
            Sprint DB_W16_02 [ 179 ] DB_W16_02, DB_S16_03 [ 179, 199 ]
            npease Nate Pease [X] (Inactive) made changes -
            Status In Progress [ 3 ] In Review [ 10004 ]
            Reviewers Kian-Tat Lim [ ktl ]
            ktl Kian-Tat Lim added a comment -

            The direction the code is going is a good one; in particular, the repository of repository configs is a good primitive for this and similar use cases. But there are still substantial portions of this code (some unrelated to the ticket itself) that do not feel like they yet form a releasable feature set that we could announce in release notes. I've been thinking a bit about how to deal with complex, multi-part, interdependent developments like this with the goal of making sure that users are not disrupted while new, not-quite-ready features are being built, and I think it comes down to two alternatives:

            • Merging to a long-lived integration branch that is not master
            • Merging to master with a "version switch" triggered by (in this case) a Butler construction argument or perhaps an environment variable that only enables the new interfaces and implementation when explicitly requested.

            If interface changes are extensive and could be frequent due to uncertainty and evolution, then the first is probably preferable to minimize disruption to dependent package users until the interface is firmed up. If interface changes are expected to be minimal and infrequent because the interface is well-defined, then the second could be acceptable and would help with the eventual merge since dependent package users could help maintain compatibility with both versions while making unrelated changes.

            One of the things I worry about here is that the current lack of definition around Access and Storage (and the entire plugin serialization model) means that repository configurations are still unstable. The current code appears to expose this in both construction of Repositories (and hence Butlers) and in the persisted configuration files (which do not appear to have explicit code for dealing with evolution). On the other hand, we may be able to present an external interface that hides all of this complexity by providing a normal use case with pre-existing or internally-generated configurations (unlike the code-based example in LDM-463 for this ticket), in which case a version switch and "don't look behind the curtain" could be acceptable. (Note that modifications to existing configurations via code or manual overrides will become a normal use case in the future, so that interface does need to be fully defined and exposed.)

            So before anything is merged to master, I would like the following to take place:

            • Decide which of the above strategies is to be used and implement it.
            • Work through a complete example of how this primitive can be deployed in a particular use case such as the multi-version, date-range-based master calibration image repository and incorporate that into LDM-463.
            • Deal with any minor code comments that I expect to make in the PR later today.
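The second alternative in this comment, a "version switch" triggered by a Butler construction argument or an environment variable, might be sketched like this. Every name here (`make_butler`, `use_new`, `BUTLER_USE_NEW_REPOS`, and the two stand-in classes) is hypothetical and does not reflect the real daf_persistence interface. The point is only that the new implementation is reached when explicitly requested and never by default.

```python
import os

# Hypothetical environment variable; not defined by the real butler packages.
NEW_BUTLER_ENV = "BUTLER_USE_NEW_REPOS"


class LegacyButler:
    """Stand-in for the existing, released interface."""
    interface = "legacy"


class NewButler:
    """Stand-in for the not-yet-released interface."""
    interface = "new"


def make_butler(use_new=None, environ=None):
    """Return the new implementation only on explicit request.

    An explicit use_new argument wins; otherwise the environment
    variable decides, and the default is the legacy implementation.
    """
    environ = os.environ if environ is None else environ
    if use_new is None:
        use_new = environ.get(NEW_BUTLER_ENV, "0") == "1"
    return NewButler() if use_new else LegacyButler()
```

With this shape, existing callers of `make_butler()` see no change, while dependent packages can opt in to test against both versions.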
            ktl Kian-Tat Lim made changes -
            Status In Review [ 10004 ] Reviewed [ 10101 ]
            npease Nate Pease [X] (Inactive) added a comment -

            I was hoping that we could get away with not having a development branch but at this point I'm inclined to agree that something is necessary.
            What about using the daf_butler package, or a new package? I think the benefits are that it would work in CI without having to set a var, and would allow other ticket branches to be built using cutting edge butler features. I expect we could merge changes back to daf_persistence and daf_butlerUtils if needed/desired?

            npease Nate Pease [X] (Inactive) added a comment -

            This story represents additional work, beyond the recently completed butler infrastructure work, needed to support lookup for repositories, i.e. gathering specific requirements for mappers (and possibly registries) so that a repository can be looked up given science needs.

            Next step is to write an RFC.

            npease Nate Pease [X] (Inactive) made changes -
            Epic Link DM-2404 [ 16718 ] DM-5262 [ 22985 ]
            npease Nate Pease [X] (Inactive) made changes -
            Sprint DB_W16_02, DB_S16_03 [ 179, 199 ] DB_W16_02, DB_X16_03 [ 179, 204 ]
            Story Points 12 20
            npease Nate Pease [X] (Inactive) made changes -
            Status Reviewed [ 10101 ] To Do [ 10001 ]
            fritzm Fritz Mueller made changes -
            Sprint DB_W16_02, DB_X16_03 [ 179, 204 ] DB_W16_02, DB_S16_04 [ 179, 200 ]
            fritzm Fritz Mueller made changes -
            Sprint DB_W16_02, DB_S16_04 [ 179, 200 ] DB_W16_02, DB_X16_03 [ 179, 204 ]
            fritzm Fritz Mueller made changes -
            Rank Ranked higher
            fritzm Fritz Mueller made changes -
            Rank Ranked higher
            npease Nate Pease [X] (Inactive) made changes -
            Story Points 20 12
            fritzm Fritz Mueller made changes -
            Sprint DB_W16_02, DB_X16_03 [ 179, 204 ] DB_W16_02, DB_S16_05 [ 179, 201 ]
            fritzm Fritz Mueller made changes -
            Rank Ranked lower
            npease Nate Pease [X] (Inactive) made changes -
            Link This issue is blocked by DM-5608 [ DM-5608 ]
            npease Nate Pease [X] (Inactive) made changes -
            Description
            Original Value:
            Add support in tasks so that a version of a repository (that may have many versions e.g. data release 1, data release 2, ...) may be selected via a command line argument and/or by setting an environment variable.

            This includes support for rerun, preprocessed, and astrometry_net_data, which all support the idea of different versions of data, which must be selectable.
            UW repo requirements include: calibration products including camera descriptions. N.b. calibration products can be lots of different things: objects, images, telemetry data, sky model, etc.

            More details can be found in RFC-95
            New Value:
            finish & productize work from DM-5608
            npease Nate Pease [X] (Inactive) made changes -
            Story Points 12 6
            npease Nate Pease [X] (Inactive) made changes -
            Summary Data repository selection based on version productize "Data repository selection based on version
            npease Nate Pease [X] (Inactive) made changes -
            Summary productize "Data repository selection based on version productize "Data repository selection based on version"
            npease Nate Pease [X] (Inactive) made changes -
            Epic Link DM-5262 [ 22985 ] DM-6032 [ 24341 ]
            fritzm Fritz Mueller made changes -
            Sprint DB_W16_02, DB_S16_05 [ 179, 201 ] DB_W16_02 [ 179 ]
            fritzm Fritz Mueller made changes -
            Rank Ranked higher
            fritzm Fritz Mueller made changes -
            Rank Ranked lower
            fritzm Fritz Mueller made changes -
            Rank Ranked lower
            fritzm Fritz Mueller made changes -
            Assignee Nate Pease [ npease ]
            fritzm Fritz Mueller made changes -
            Epic Link DM-6032 [ 24341 ] DM-12759 [ 36358 ]
            tjenness Tim Jenness added a comment -

            Gen2 is dead.

            tjenness Tim Jenness made changes -
            Resolution Done [ 10000 ]
            Status To Do [ 10001 ] Won't Fix [ 10405 ]

              People

              Assignee:
              Unassigned
              Reporter:
              npease Nate Pease [X] (Inactive)
              Reviewers:
              Kian-Tat Lim
              Watchers:
              Gregory Dubois-Felsmann, Kian-Tat Lim, Nate Pease [X] (Inactive), Russell Owen, Simon Krughoff, Tim Jenness
              Votes:
              0

                Dates

                Created:
                Updated:
                Resolved:
