Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-10360

Documentation for generating statistics for L2/catalog data

    XMLWordPrintable

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: Qserv
    • Labels:
      None

      Description

      Statistics were generated at IN2P3 for correcting JOIN queries. The general process needs to be documented so we can optimize it in the future (see DM-9757 for reference).

        Attachments

          Issue Links

            Activity

            Hide
            vaikunth Vaikunth Thukral added a comment -

            Planning on writing this up in confluence.

            Show
            vaikunth Vaikunth Thukral added a comment - Planning on writing this up in confluence.
            Hide
            vaikunth Vaikunth Thukral added a comment -
            Show
            vaikunth Vaikunth Thukral added a comment - The document to review is in confluence: https://confluence.lsstcorp.org/pages/viewpage.action?pageId=58950786
            Hide
            gapon Igor Gaponenko added a comment -

            I have reviewed the ticket (its scope, a proposed solution and the documentation), and this all looks right to me. The only minor improvement which I would recommend for further practical uses of the proposed recipe is related to step #2 ("...Generate list of all chunks that need to have statistics generated with appropriate options..."). I think using the INFORMATION_SCHEMA would be a better option for compiling a list of eligible tables. An advantage of this approach is that it won't require a direct access to the underlying file system. Here is an idea of a query against the INFORMATION_SCHEMA which would produce a similar result (set of the ANALYZE TABLE ... statements).:

            SELECT CONCAT("ANALYSE TABLE ",TABLE_SCHEMA,".",TABLE_NAME)
              FROM information_schema.tables
              WHERE TABLE_SCHEMA="LSST" AND TABLE_NAME LIKE "Source\_%";
            +------------------------------------------------------+
            | CONCAT("ANALYSE TABLE ",TABLE_SCHEMA,".",TABLE_NAME) |
            +------------------------------------------------------+
            | ANALYSE TABLE LSST.Source_1                          |
            | ANALYSE TABLE LSST.Source_2                          |
            +------------------------------------------------------+
            

            Show
            gapon Igor Gaponenko added a comment - I have reviewed the ticket (its scope, a proposed solution and the documentation), and this all looks right to me. The only minor improvement which I would recommend for further practical uses of the proposed recipe is related to step #2 ("...Generate list of all chunks that need to have statistics generated with appropriate options..."). I think using the INFORMATION_SCHEMA would be a better option for compiling a list of eligible tables. An advantage of this approach is that it won't require a direct access to the underlying file system. Here is an idea of a query against the INFORMATION_SCHEMA which would produce a similar result (set of the ANALYZE TABLE ... statements).: SELECT CONCAT( "ANALYSE TABLE " ,TABLE_SCHEMA, "." ,TABLE_NAME) FROM information_schema.tables WHERE TABLE_SCHEMA= "LSST" AND TABLE_NAME LIKE "Source\_%" ; + ------------------------------------------------------+ | CONCAT( "ANALYSE TABLE " ,TABLE_SCHEMA, "." ,TABLE_NAME) | + ------------------------------------------------------+ | ANALYSE TABLE LSST.Source_1 | | ANALYSE TABLE LSST.Source_2 | + ------------------------------------------------------+
            Hide
            vaikunth Vaikunth Thukral added a comment -

            Thanks for the review Igor Gaponenko, I've added your suggestion to the documentation and moved the ticket to Done.

            Show
            vaikunth Vaikunth Thukral added a comment - Thanks for the review Igor Gaponenko , I've added your suggestion to the documentation and moved the ticket to Done.

              People

              Assignee:
              vaikunth Vaikunth Thukral
              Reporter:
              vaikunth Vaikunth Thukral
              Reviewers:
              Igor Gaponenko
              Watchers:
              Igor Gaponenko, Vaikunth Thukral
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: