Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-14711

F18 PPDB (L1/AP) Oracle Experiment

    XMLWordPrintable

    Details

    • Type: Epic
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Epic Name:
      F18 PPDB (L1/AP) Oracle Experiment
    • Story Points:
      42
    • WBS:
      02C.06.01.01
    • Team:
      Data Access and Database
    • Cycle:
      Fall 2018

      Description

      Continue to evaluate performance of Oracle as potential infrastructure for the L1/AP database. Port existing prototype to Oracle and run performance tests using hardware provided by NCSA. Document performance relative to previous tests with MySQL and Postgres.

        Attachments

          Activity

          Hide
          salnikov Andy Salnikov added a comment -

          I think this epic is more or less complete. The summary of what has been tested so far with some conclusions:

          • In general Oracle performance seems to be on par with what we saw in Postgres at the level of 1-month of data
          • Scaling beyond 1 month needs different approach, not clear yet what
          • The performance seems to be limited by reading DiaSource and DiaForcedSource tables, possible approaches to scaling those are either parallelism with multiple partitions or in memory cache for those two tables.
          • Oracle shows some peculiarities like quick performance drop when selecting from DiaObjectLast table which recovers with time, we have not been able to fix this by using query hints
          • DiaObjectLast select performance is reasonable only if it is Index-Organized table with small number of columns and PK which corresponds to region select, fortunately we can keep number of columns very small in that table
          • Performance is constrained by server obviously, no significant difference between running 255 processes on single node (with few processes per core) or on many node with MPI. There is modest improvement in MPI case due to keeping database connection open.
          Show
          salnikov Andy Salnikov added a comment - I think this epic is more or less complete. The summary of what has been tested so far with some conclusions: In general Oracle performance seems to be on par with what we saw in Postgres at the level of 1-month of data Scaling beyond 1 month needs different approach, not clear yet what The performance seems to be limited by reading DiaSource and DiaForcedSource tables, possible approaches to scaling those are either parallelism with multiple partitions or in memory cache for those two tables. Oracle shows some peculiarities like quick performance drop when selecting from DiaObjectLast table which recovers with time, we have not been able to fix this by using query hints DiaObjectLast select performance is reasonable only if it is Index-Organized table with small number of columns and PK which corresponds to region select, fortunately we can keep number of columns very small in that table Performance is constrained by server obviously, no significant difference between running 255 processes on single node (with few processes per core) or on many node with MPI. There is modest improvement in MPI case due to keeping database connection open.

            People

            Assignee:
            Unassigned Unassigned
            Reporter:
            vaikunth Vaikunth Thukral
            Reviewers:
            Andy Salnikov
            Watchers:
            Andy Salnikov, Colin Slater, Fritz Mueller, Vaikunth Thukral
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved:

                Summary Panel

                  CI Builds

                  No builds found.