Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-31458

Prepare to merge Cassandra branch of dax_apdb

    XMLWordPrintable

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: dax
    • Labels:
    • Story Points:
      10
    • Epic Link:
    • Sprint:
      DB_F21_06
    • Team:
      Data Access and Database
    • Urgent?:
      No

      Description

      The dax_apdb branch that I used for Cassandra development diverged significantly from the master branch. We need to start merging that branch to provide Cassandra support and hopefully allow AP pipeline to use actual Cassandra service (when it goes live at SLAC). These are very big changes and it will not be possible to make it backward compatible so we need to update client side as well. The changes include:

      • how do we configure each implementation, obviously Cassandra configuration is going to be different from sqlalchemy
      • because of how Cassandra partitioning works we need to change API for storing/retrieving all data, but first we need to figure out what is the best API for clients

      This ticket is going to cover all aspects, I'll start filling details of these changes in comments so that we can have a productive discussion. Please add anyone who you think should be involved to watchers.

        Attachments

          Issue Links

            Activity

            Hide
            salnikov Andy Salnikov added a comment -

            And just a heads up for possible future discussion about APDB interface - the countUnassociatedObjects() method may be super inefficient in Casssandra. I'll think about it later when I start working on that implementation, but be prepared for that discussion

            Show
            salnikov Andy Salnikov added a comment - And just a heads up for possible future discussion about APDB interface - the countUnassociatedObjects() method may be super inefficient in Casssandra. I'll think about it later when I start working on that implementation, but be prepared for that discussion
            Hide
            salnikov Andy Salnikov added a comment -

            Krzysztof Findeisen, would you have time to look at the changes? Sorry for large PR, but I hope it simplifies few things on client side and will make it easier for AP pipeline to handle Cassandra transition.

            There are four packages to look at (JIRA ignores ap_pipe for some reason):

            No need to look at l1dbproto, that one is for my self-review.

            Jenkins is happy, and I ran ap_verify with ap_verify_ci_cosmos_pdr2 dataset, that also worked OK.

            Any feedback on new API is welcome, if you have any suggestions I'd be happy to hear that too.

            Show
            salnikov Andy Salnikov added a comment - Krzysztof Findeisen , would you have time to look at the changes? Sorry for large PR, but I hope it simplifies few things on client side and will make it easier for AP pipeline to handle Cassandra transition. There are four packages to look at (JIRA ignores ap_pipe for some reason): dax_apdb: https://github.com/lsst/dax_apdb/pull/21 ap_association: https://github.com/lsst/ap_association/pull/130 ap_pipe: https://github.com/lsst/ap_pipe/pull/91 verify: https://github.com/lsst/verify/pull/89 No need to look at l1dbproto, that one is for my self-review. Jenkins is happy, and I ran ap_verify with ap_verify_ci_cosmos_pdr2 dataset, that also worked OK. Any feedback on new API is welcome, if you have any suggestions I'd be happy to hear that too.
            Hide
            krzys Krzysztof Findeisen added a comment -

            Sorry for the delay. A few of the dax_apdb changes make me a little nervous, but otherwise looks good.

            I have one question about what this means for ap_verify. ap_verify creates and configures an SQLite database on the fly, setting config parameters that in the new system are specific to SqlApdb. Should it also force a redirect to SQlApdb, or have a check that the pipeline fed to ap_verify uses a subclass of SQlApdb?

            Show
            krzys Krzysztof Findeisen added a comment - Sorry for the delay. A few of the dax_apdb changes make me a little nervous, but otherwise looks good. I have one question about what this means for ap_verify . ap_verify creates and configures an SQLite database on the fly, setting config parameters that in the new system are specific to SqlApdb . Should it also force a redirect to SQlApdb , or have a check that the pipeline fed to ap_verify uses a subclass of SQlApdb ?
            Hide
            salnikov Andy Salnikov added a comment -

            To answer the question about `ap_verify` - at this point I don't think it matters, there is no alternative implementation yet so there is no chance it will break. Once we have Cassandra merged we can start looking at how the whole thing needs to be updated to support two implementations. I imagine there is more than one place that will need an update and will likely need a dedicated ticket just for that.

            Show
            salnikov Andy Salnikov added a comment - To answer the question about `ap_verify` - at this point I don't think it matters, there is no alternative implementation yet so there is no chance it will break. Once we have Cassandra merged we can start looking at how the whole thing needs to be updated to support two implementations. I imagine there is more than one place that will need an update and will likely need a dedicated ticket just for that.
            Hide
            salnikov Andy Salnikov added a comment -

            Jenkins is happy with all my recent updates and ap_verify runs OK too. I have merged all packages, hope nothing is broken. Thanks for review and all suggestions, and sorry for such ginormous PR, next ones should be more reasonable size.

            Show
            salnikov Andy Salnikov added a comment - Jenkins is happy with all my recent updates and ap_verify runs OK too. I have merged all packages, hope nothing is broken. Thanks for review and all suggestions, and sorry for such ginormous PR, next ones should be more reasonable size.

              People

              Assignee:
              salnikov Andy Salnikov
              Reporter:
              salnikov Andy Salnikov
              Reviewers:
              Krzysztof Findeisen
              Watchers:
              Andy Salnikov, Chris Morrison [X] (Inactive), Fritz Mueller, Kian-Tat Lim, Krzysztof Findeisen
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.