Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-24290

Enable running ci_hsc_gen3 contents on AWS

    Details

    • Story Points:
      4
    • Team:
      External
    • Urgent?:
      No

      Description

      Title might not summarize the end goal in the best way so: this issue collects bugs and improvements I plan on making in order to be able to execute the same list of commands that are execute when ci_hsc_gen3 is scons'd. This boils down to to making the necessary changes such that the following list of commands execute correctly, given the stated conditions, irregardless of whether the commands can be executed via a scons flag or not:

      If provided a correct butler.yaml, AWS credentials, a S3 Bucket as repo root and an connectable RDS instance

      makeButlerRepo.py --override s3://bucketname/ -c butler.yaml
      registerInstrument.py s3://bucketname/
      makeGen3Skymap.py s3://bucketname/ -C ~/ci_hsc_gen3/configs/skymap.py skymaps
      ingestExternalData.py s3://bucketname/ ci_hsc_gen3/resources/external.yaml
      pipetask run -d "patch = 69" -b s3://bucketname/ 
          -i calib/hsc,raw/hsc,masks/hsc,ref_cats,skymaps,shared/ci_hsc 
          -o shared/ci_hsc_output 
          --register-dataset-types 
          -p ci_hsc_gen3/pipelines/CiHsc.yaml
      

      This currently is not possible due to the following bugs or issues:

      1.  Bugs

      a. PostgreSQL database namespace not being handled correctly

      In daf/butler/registry/databases/postgresql.py L68 the namespace is set to a non-existent DSN "schema" key.

      As-is this breaks PostgreSQL database code when a namespace is not supplied explicitly. This was not noticed since all tests provide an explicit namespace

      Psycopg2 manual never explicitly lists parameters but has a "see Pconninfo" pointer to libq library. The manual page on PConninfo states that Pconninfo lists all PQconnectdb values that were set during connection time, omitting the ones that weren't. A list of recognized PQconnectdbParams can be found here. The list does not contain a key valued "schema".

      Leaving the namespace as None does not break functionality but is not one of the PostgreSQL recommended secure schema usage patters and it makes an awkward _str_ output.
      A solution is to query and use the database default schema. If not properly configured server-side this will default to 'public' which is not a recommended secure pattern so maybe we should consider replacing "public" with "None" and letting psycopg2 fail with a non-existing schema or raising an error when we encounter "public" if we want to strongly enforce the usage of secure schema patterns.

      b. pipe_tasks assumes Butler exists strictly on-local-disk

      Remove and/or replace all os.path.exists checks in pipe_tasks. For example the makeGen3SkyMap.py:

      # Verify any supplied paths actually exist on disk
      if not os.path.exists(args.butler):
          print("Butler path specified does not exists")
          sys.exit(1)
      

      This stopped being the case when S3Datastore was implemented.

      2. Improvements

      a. ingestExternalData.py script in ci_hsc_gen3 hard-codes "symlink" transfer type

      Which is fine in context of **ci_hsc_gen3  scons but makes no sense in non-POSIX compliant cases. Add a flag to ingestExternalData.py that defaults to symlink but allows other transfer types to be specified.

      b. makeButlerRepo.py creates a "." key when a fresh S3 Bucket is targeted

      Does not break functionality but shouldn't happen. The "." alias for absolute path shouldn't appear as a Key in the data repository root. Investigate and fix.

      3. Extra goals

      • get a scons flag that wraps all the above instead of an external script or individual commands
      • Add a way to turn checksums off. Talking to Michelle Gower brought to my attention that this flag does not seem to be added to butler.yaml when DAF_BUTLER_CONFIG_PATH is set, but does if -c is explicitly provided. I suspect --override is missing/not applied in one case but is in the other. In any case if I want to get scons -aws (point directly above) then I'll have to mess with the SConfigure anyhow so I might as well investigate.
      • See if timings for individual steps can be added as a flag to scons. Potentially useful but not reported by current scons is the timing breakdown of individual commands.

        Attachments

          Activity

          Hide
          dinob Dino Bektesevic added a comment -

          1.  Bugs
              a. PostgreSQL database namespace not being handled correctly Fixed
              b. pipe_tasks assumes Butler exists strictly on-local-disk Fixed

          2. Improvements
              a. ingestExternalData.py script in ci_hsc_gen3 hard-codes "symlink" transfer type Fixed, changed to auto mode.
              b. makeButlerRepo.py creates a "." key when a fresh S3 Bucket is targeted  Fixed

          3. Extra goals
              a. get a scons flag that wraps all the above instead of an external script or individual commands Fixed
              b. Add a way to turn checksums off from scons Won't Fix
              c. add profiling to scons Fixed

          I think I have covered everything I wanted from this ticket so I'm closing it as done. Thanks for the reviews and input!

          Show
          dinob Dino Bektesevic added a comment - 1.  Bugs     a. PostgreSQL database namespace not being handled correctly Fixed     b. pipe_tasks assumes Butler exists strictly on-local-disk Fixed 2. Improvements     a. ingestExternalData.py script in ci_hsc_gen3 hard-codes "symlink" transfer type Fixed , changed to auto mode.     b. makeButlerRepo.py creates a "." key when a fresh S3 Bucket is targeted  Fixed 3. Extra goals     a. get a scons flag that wraps all the above instead of an external script or individual commands Fixed     b. Add a way to turn checksums off from scons Won't Fix     c. add profiling to scons Fixed I think I have covered everything I wanted from this ticket so I'm closing it as done. Thanks for the reviews and input!
          Hide
          tjenness Tim Jenness added a comment -

          Looks okay to me. Couple of comments on GitHub.

          Show
          tjenness Tim Jenness added a comment - Looks okay to me. Couple of comments on GitHub.
          Hide
          tjenness Tim Jenness added a comment -

          For ingest the new "auto" mode should now work for both posix and s3 so that is a reasonable default to use. I have changed the 2to3 conversion script to use it.

          Show
          tjenness Tim Jenness added a comment - For ingest the new "auto" mode should now work for both posix and s3 so that is a reasonable default to use. I have changed the 2to3 conversion script to use it.
          Hide
          dinob Dino Bektesevic added a comment - - edited

          DAF_BUTLER_CONFIG_PATH sounds like that same issue. I feel like a solution to that problem may just be to get a scons flag that optionally supplies override to makeButlerRepo.py. Then the same end result can be achieved for the two approaches without the need for the extra steps. I don't know if people have strong feeling as to how many flags are too many however.

           

          The 2b surprised me too, that wasn't the case on the old 2019_w38 build we have, so something must've changed in between. I don't believe fix will be difficult, it will just require me stepping through a clean repo generation to see where the mishap happens.

          Show
          dinob Dino Bektesevic added a comment - - edited DAF_BUTLER_CONFIG_PATH sounds like that same issue. I feel like a solution to that problem may just be to get a scons flag that optionally supplies override to makeButlerRepo.py. Then the same end result can be achieved for the two approaches without the need for the extra steps. I don't know if people have strong feeling as to how many flags are too many however.   The 2b surprised me too, that wasn't the case on the old 2019_w38 build we have, so something must've changed in between. I don't believe fix will be difficult, it will just require me stepping through a clean repo generation to see where the mishap happens.
          Hide
          tjenness Tim Jenness added a comment -

          Is the DAF_BUTLER_CONFIG_PATH problem the one described in DM-23260 ?

          Maybe each datastore should have a default transfer mode?

          I'm intrigued by the 2b "." key problem. Looking forward to seeing what that is in more detail.

          Show
          tjenness Tim Jenness added a comment - Is the DAF_BUTLER_CONFIG_PATH problem the one described in DM-23260 ? Maybe each datastore should have a default transfer mode? I'm intrigued by the 2b "." key problem. Looking forward to seeing what that is in more detail.

            People

            • Assignee:
              dinob Dino Bektesevic
              Reporter:
              dinob Dino Bektesevic
              Reviewers:
              Jim Bosch, Tim Jenness
              Watchers:
              Dino Bektesevic, Hsin-Fang Chiang, Jim Bosch, Kian-Tat Lim, Michelle Gower, Nate Pease, Tim Jenness
            • Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:
                Start date:

                Summary Panel