Data Management / DM-5545

Alert production database next steps (May)

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Activity

          Andy Salnikov added a comment -

          Replication issues

          Alert production will need a fully fault-tolerant solution for the L1 database. This implies at least two replicas (with that volume of data it is unlikely we can afford more) and a transparent or fully automated fail-over mechanism. To simplify fail-over it is preferable to have master-master replication, which allows transparent writes to either instance; fail-over then consists of simply switching the client to the other instance. For master-master replication there are several options to choose from:

          • using the native MySQL option for master-master replication; this is mostly a question of configuring the two instances (a one-time task; see the sketch after this list)
          • using a third-party product such as MariaDB Galera Cluster
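
          As an illustration, the native MySQL setup is just two ordinary replication channels, one in each direction, with the auto-increment settings staggered so that the two masters cannot generate colliding keys. A minimal sketch, in which host names, server ids and credentials are placeholders rather than an agreed configuration:

              # /etc/my.cnf on l1-master-1 (l1-master-2 is symmetric, with the ids swapped)
              [mysqld]
              server-id                = 1
              log-bin                  = mysql-bin
              log-slave-updates        = 1   # let a downstream slave see writes from both masters
              auto-increment-increment = 2   # two masters in the ring
              auto-increment-offset    = 1   # use 2 on l1-master-2

              -- executed once on l1-master-1 (the mirror statement runs on l1-master-2):
              CHANGE MASTER TO MASTER_HOST='l1-master-2', MASTER_USER='repl',
                               MASTER_PASSWORD='...', MASTER_LOG_FILE='mysql-bin.000001',
                               MASTER_LOG_POS=4;
              START SLAVE;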

          Transparent client switching when the main server goes away is not completely trivial. The standard MySQL/MariaDB C/C++ connector does not provide a transparent switch-over mechanism (I believe only MariaDB Connector/J supports switching). It can be implemented on the client side by extending the API; that would require an error-catching mechanism to identify disconnects and a smart attempt to reconnect when possible. Support in higher-level code may be needed to handle the case where the failure happens in the middle of a transaction. Alternatively, switching can be implemented by some external mechanism, e.g. a transparent proxy sitting in front of the MySQL instances (such as mysql-proxy or MaxScale). The question then becomes what happens when the proxy itself dies; some client support is still needed in that case. Third-party (server-side) products may provide their own proxy-like component that takes care of the server switching.
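
          To make the client-side option concrete, here is a rough sketch of the kind of error-catching wrapper that would be needed, written in Python with mysql-connector for brevity (the real implementation would live in the C++ API; the host names and retry policy are made-up assumptions):

              import time

              import mysql.connector
              from mysql.connector import errors

              # Candidate masters; with master-master replication either one accepts writes.
              HOSTS = ["l1-master-1", "l1-master-2"]   # hypothetical host names

              def connect_any(hosts, **kwargs):
                  """Return a connection to the first master that responds."""
                  last_exc = None
                  for host in hosts:
                      try:
                          return mysql.connector.connect(host=host, **kwargs)
                      except errors.Error as exc:      # this host is unreachable
                          last_exc = exc
                  raise last_exc

              def execute_with_failover(holder, sql, params=(), retries=3, **conn_kwargs):
                  """Run one statement, switching to another master on disconnect.

                  A failure in the middle of a multi-statement transaction still
                  needs handling at a higher level (re-issue the whole transaction).
                  """
                  for attempt in range(retries):
                      try:
                          cur = holder["conn"].cursor()
                          cur.execute(sql, params)
                          holder["conn"].commit()
                          return cur
                      except (errors.OperationalError, errors.InterfaceError):
                          time.sleep(2 ** attempt)     # back off, then try the other master
                          holder["conn"] = connect_any(HOSTS, **conn_kwargs)
                  raise RuntimeError("no master reachable after %d attempts" % retries)

          A proxy-based solution moves essentially this loop out of every client and into the proxy process.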

          In addition to the NCSA L1 instance(s) there will be a read-only L1 instance at the Base Site, and it needs to be updated from the NCSA instances with a reasonably short delay. For that, master-slave replication seems adequate. Together the three replicas would form a sort of hybrid cluster, which may be easier to manage with MariaDB Galera Cluster.
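
          The Base Site replica could then be configured as an ordinary slave of one of the NCSA masters (log-slave-updates on the masters ensures it also receives writes applied to the other one). Again a sketch with placeholder names:

              -- on the read-only Base Site instance:
              CHANGE MASTER TO MASTER_HOST='l1-master-1', MASTER_USER='repl',
                               MASTER_PASSWORD='...';
              START SLAVE;
              SET GLOBAL read_only = 1;   -- reject ordinary client writes on the replica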

          Andy Salnikov added a comment -

          Spatial indexing and locality

          Following up on yesterday's discussion with Serge regarding all things L1, in particular spatial searches. For indexing we have more than one option: in addition to HTM, another viable choice is Q3C. Q3C may be preferable to HTM because it is based on simpler, non-recursive computations. In any case the index calculation is supposed to be done on the client side; there is no reason to force it onto the server (though that may still be possible). Both HTM and Q3C will be included in sphgeom and sciSQL (Serge is working on it).
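
          To illustrate what client-side index calculation might look like, a minimal sketch against the planned sphgeom Python bindings (the API is still taking shape, so the names are indicative only, and the pixelization level is an arbitrary placeholder):

              from lsst.sphgeom import HtmPixelization, LonLat, UnitVector3d

              # HTM subdivision level is a tuning choice; 20 is used here only as an example.
              pixelization = HtmPixelization(20)

              def htm_index(ra_deg, dec_deg):
                  """Compute the HTM pixel index of an (ra, dec) position on the client."""
                  return pixelization.index(
                      UnitVector3d(LonLat.fromDegrees(ra_deg, dec_deg)))

              # The resulting integer is stored with each DIAObject row, and spatial
              # matches become range predicates on it, e.g.
              #   SELECT ... FROM DiaObject WHERE htmId BETWEEN :lo AND :hi
              print(htm_index(90.0, -30.0))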

          Spatial locality is useful for limiting the number of disk seeks, though this needs to be measured. One minor complication is that InnoDB orders rows by primary key (the table is a clustered index), so if we want to control locality we may need a composite primary key. That would also mean a significant amount of re-indexing when a bunch of new DIAObject records is added on every visit.
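
          For example, a composite key of the following kind (column names are hypothetical) would make InnoDB cluster rows spatially, at the price of new DIAObject records landing in the middle of the key space on every visit:

              CREATE TABLE DiaObject (
                  diaObjectId BIGINT NOT NULL,
                  htmId       BIGINT NOT NULL,      -- spatial index computed on the client
                  ra          DOUBLE NOT NULL,
                  decl        DOUBLE NOT NULL,
                  -- ... other columns ...
                  PRIMARY KEY (htmId, diaObjectId)  -- InnoDB clusters rows by this key
              ) ENGINE=InnoDB;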

          Kian-Tat Lim added a comment -

          It is not clear to me that the L1 database needs to be fully fault-tolerant, and especially that it needs the complexity of a master-master replication solution. The production-side L1 database needs to accept writes, and likely spatial match queries, only during nighttime operations, with some "dumps" for DayMOPS (~50% duty cycle). It is expected to run on a single node (so there is no reduction of MTBF due to multiple nodes). Switching from the original master to a slave replica (promoted to be the new master) after a failure does not need to be instantaneous; dropping several visits is acceptable.

          Andy Salnikov added a comment -

          I don't think that MySQL master-master replication is much more complex than master-slave. It is a matter of configuration only, and in my view fail-over is easier with master-master.

          Andy Salnikov added a comment -

          Attaching an illustration of the proposed replication architecture.

          Andy Salnikov added a comment -

          Fritz, K-T, could you have a quick look at DMTN-018 to see if there are any gaping inconsistencies with reality there?

          Fritz Mueller added a comment -

          Went ahead and pushed a bunch of minor grammar fixes. Looks good – merge at will – thanks, Andy!

          Andy Salnikov added a comment -

          Thanks Fritz, I hope it did not take longer to correct my English than to write the note itself. I pushed everything to master, and it is already published at https://dmtn-018.lsst.io/


            People

            • Assignee: Andy Salnikov
            • Reporter: Fritz Mueller
            • Reviewers: Fritz Mueller, Kian-Tat Lim
            • Watchers: Andy Salnikov, Fritz Mueller, Kian-Tat Lim
