  Data Management / DM-28626

Allow adding tables to published catalogs in the Qserv Replication/Ingest system



    • Type: Improvement
    • Status: Done
    • Resolution: Done
    • Component: Qserv



      This effort does not aim to support adding more rows to existing tables of published catalogs.
      The current implementation has a rather simplistic model for ingesting catalogs and tables into Qserv. The model defines the following steps (transaction and chunk management are omitted here since they are irrelevant in the current context):

      1. create a catalog
      2. create tables
      3. ingest data into the tables
      4. publish the catalog

      Any modifications to the tables (adding more tables, deleting tables, adding more rows to the tables) are allowed while the catalog remains in the unpublished state. Once it is published, no further modifications of the catalog are formally supported by the Ingest system. This limitation may cause (and has already caused) inconvenience in scenarios where more tables need to be added to an existing catalog after it has been formally published. A workaround exists, but it is impractical because it requires many changes to the persistent states of Qserv and the Replication system, some of which have to be made directly in the corresponding MySQL databases and tables. This makes the workaround a rather fragile process that can only be carried out by someone with deep knowledge of the design and implementation of the Qserv Replication and Ingest system.

      Hence, the goal of this effort is to extend the Replication/Ingest system to allow adding new tables to published catalogs by using the REST API. The extended ingest model will support the following protocol:

      1. temporarily unpublish a catalog
      2. create new tables
      3. ingest data into the tables
      4. publish the catalog

      In the extended implementation, the tables ingested before the catalog was unpublished will be excluded from any operations by the Replication/Ingest (R-I) system's algorithms. To support this, a special is_published property will be added to the table descriptions in the Replication system's database schema. The new tables will be marked as published during the catalog publishing stage.
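
      The per-table flag can be illustrated with a small in-memory model. This is only a sketch of the behavior described above, not the Replication system's code: the Catalog and Table classes and their method names are hypothetical, and only the is_published property comes from the description.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    is_published: bool = False  # property named in the description


@dataclass
class Catalog:  # hypothetical model, not the actual R-I implementation
    name: str
    is_published: bool = False
    tables: dict = field(default_factory=dict)

    def unpublish(self):
        # Only the catalog is reopened; tables keep their own flag,
        # since unpublishing tables is not allowed.
        self.is_published = False

    def create_table(self, name):
        if self.is_published:
            raise RuntimeError("catalog is published; unpublish it first")
        if name in self.tables:
            raise ValueError(f"table {name!r} already exists")
        self.tables[name] = Table(name)

    def ingestible_tables(self):
        # Previously published tables are excluded from ingest operations.
        return [t.name for t in self.tables.values() if not t.is_published]

    def publish(self):
        # New tables are marked as published during catalog publishing.
        for table in self.tables.values():
            table.is_published = True
        self.is_published = True


catalog = Catalog("demo")
catalog.create_table("Object")
catalog.publish()

catalog.unpublish()                         # step 1: temporarily unpublish
catalog.create_table("Source")              # step 2: create new tables
open_tables = catalog.ingestible_tables()   # step 3: only new tables accept data
catalog.publish()                           # step 4: publish the catalog again
```

      In this model the previously ingested "Object" table never becomes writable again, which mirrors the stated intent that adding rows to existing tables is out of scope.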

      A new REST service will be added to allow unpublishing catalogs:

      PUT /replication/config/database/:database

      The service should require a valid value of the admin-auth-key.
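
      A client call to this service might be prepared as sketched below. The ticket specifies only the method, the path, and the need for an admin-auth-key, so the JSON field names (admin_auth_key, publish), the base URL, and the helper name are assumptions made for illustration. The request is built but not sent.

```python
import json
import urllib.parse
import urllib.request


def build_unpublish_request(base_url, database, admin_auth_key):
    """Build (but do not send) a PUT request for unpublishing a catalog.

    The body layout is an assumption; only the method and path are
    taken from the ticket.
    """
    path = "/replication/config/database/" + urllib.parse.quote(database, safe="")
    body = json.dumps({
        "admin_auth_key": admin_auth_key,  # assumed field name
        "publish": 0,                      # assumed flag meaning "unpublish"
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical host, port, catalog name, and key.
request = build_unpublish_request("http://localhost:25080", "my_catalog", "secret")
```

      Passing the resulting object to urllib.request.urlopen would submit it to a running service.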

      Other changes

      The replication system database schema will be extended as follows:

      • The database descriptor will get two new columns holding the timestamps of when the database was created and when it was last published. If the database is temporarily unpublished, the second timestamp will be retained to indicate when the previous publishing operation was applied to the database.
      • Similar timestamps will be added to the table descriptors.
        • Note that "unpublishing" tables will still not be allowed.
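
      The intended timestamp semantics can be sketched as follows. The class and the field names create_time and publish_time are hypothetical, since the ticket does not name the new columns. Timestamps are passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatabaseDescriptor:  # hypothetical stand-in for the schema row
    name: str
    create_time: float                    # when the database was created
    is_published: bool = False
    publish_time: Optional[float] = None  # when it was last published

    def publish(self, now):
        self.is_published = True
        self.publish_time = now

    def unpublish(self):
        # The last publish time is deliberately retained so that it still
        # records when the previous publishing operation was applied.
        self.is_published = False


db = DatabaseDescriptor("demo", create_time=100.0)
db.publish(now=200.0)
db.unpublish()
retained = db.publish_time   # still the time of the previous publishing
db.publish(now=300.0)        # republishing overwrites it
```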




            gapon Igor Gaponenko added a comment - PR: https://github.com/lsst/qserv/pull/716



            npease Nate Pease [X] (Inactive) added a comment - Overall looks good. I left a few minor comments. Removing myself from the list of reviewers.  


            fritzm Fritz Mueller added a comment - LGTM, thanks!


              gapon Igor Gaponenko
              gapon Igor Gaponenko
              Fritz Mueller
              Fabrice Jammes, Fritz Mueller, Igor Gaponenko, Nate Pease [X] (Inactive)