  1. Data Management
  2. DM-26754

Migrating the configuration system of the Qserv partitioning tools to JSON

    Details

    • Type: Improvement
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: Qserv
    • Labels:
      None

      Description

      The current configuration system of the partitioning tools (Git package https://github.com/lsst/partition) is based on a custom DSL. The language is loosely modeled on JSON, but it is neither JSON nor YAML. This non-standard configuration system causes a number of problems when the tools are used in the LSST production environment (and beyond), such as:

      • it's impossible to use standard tools and libraries for making, editing and validating configurations
      • writing the configurations by hand is error-prone, and has already caused problems when preparing various pre-production catalogs for ingestion into Qserv.

      Another problem is that the configuration system allows nearly all of the configuration file's parameters to be overridden on the command line. Though such flexibility may be handy for limited debugging or quick testing, it's highly unsafe in production. In any production scenario, all essential parameters affecting data transformations must be captured in a source code system for bookkeeping, monitoring and data provenance purposes. These tasks are better addressed if the configuration system is orthogonal.

      Hence, the goal of this effort is to migrate the configuration system to JSON. This would make it possible to rely on the numerous standard tools for making, editing and validating configurations.

      Here is an example of the JSON-based configuration file:

      {"mr":{
        "num-workers":12, "block-size":128, "pool-size":32768
       },
       "part":{
        "num-stripes":340, "num-sub-stripes":3,
        "chunk":"chunkId", "sub-chunk":"subChunkId",
        "pos_ra":"ra", "pos_dec":"dec",
        "overlap":0.01667
       },
       "in":{
        "null":"\\N", "delimiter":"\t", "escape":"\\",
        "field":[
         "designation",
         "ra",
         "dec",
         "cntr",
         "source_id"
        ]
       },
       "out": {
        "null":"\\N", "delimiter":",", "escape":"\\", "no-quote":false
       }
      }
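
As an illustration of the "standard tools" point, the example above can be read and sanity-checked with nothing but Python's built-in json module. This is only a minimal sketch, not how the partitioning tools themselves consume the file: the function name load_config and the list of required sections are assumptions inferred from the example, and the configuration is abridged to keep the sketch short.

```python
import json

# The example configuration from this ticket (abridged field list), kept in a
# raw string so the JSON escape sequences (\\N, \t, \\) reach the parser as-is.
CONFIG_TEXT = r"""
{"mr":{
  "num-workers":12, "block-size":128, "pool-size":32768
 },
 "part":{
  "num-stripes":340, "num-sub-stripes":3,
  "chunk":"chunkId", "sub-chunk":"subChunkId",
  "pos_ra":"ra", "pos_dec":"dec",
  "overlap":0.01667
 },
 "in":{
  "null":"\\N", "delimiter":"\t", "escape":"\\",
  "field":["designation", "ra", "dec", "cntr", "source_id"]
 },
 "out":{
  "null":"\\N", "delimiter":",", "escape":"\\", "no-quote":false
 }
}
"""

# Hypothetical list of required top-level sections, taken from the example only.
REQUIRED_SECTIONS = ("mr", "part", "in", "out")


def load_config(text):
    """Parse JSON text and check that the expected top-level sections exist."""
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed input.
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ValueError("configuration must be a JSON object")
    missing = [name for name in REQUIRED_SECTIONS if name not in config]
    if missing:
        raise ValueError("missing configuration sections: " + ", ".join(missing))
    return config


config = load_config(CONFIG_TEXT)
print(config["part"]["num-stripes"])
print(config["in"]["field"])
```

A malformed file (for instance, a missing closing brace) is rejected by the parser with the position of the error, which is the kind of check that is hard to get with a hand-rolled DSL.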
      

            Activity

            npease Nate Pease added a comment -

            Overall it seems fine. I've made a few suggestions, you can decide what to do with those.

              People

              Assignee:
              gapon Igor Gaponenko
              Reporter:
              gapon Igor Gaponenko
              Reviewers:
              Nate Pease
              Watchers:
              Fritz Mueller, Hsin-Fang Chiang, Igor Gaponenko, Nate Pease
