Data Management / DM-11118

Build stubbed out verify_ap
Details

    • Story Points: 6
    • Sprint: Alert Production F17 - 7
    • Team: Alert Production
    Description

We have the start of a design for verify_ap. It will require changing the API for the prototype ap pipeline, but it should contain placeholders for anything we know we need but do not have yet.

      This will give us a better idea of what the processing flow looks like, and it also gives us a place to add metrics calculations and to stick things like association.

      We wrote down the following pseudocode as a starting point:

main():
          find input data
              "--dataset"
              load_dataset_config --> yaml file with stuff like "HiTS": "hits_data", "mapper": "obs_package"
              (for starters, self-contained datasets that we know will work)
              input_repo_dir = eups.getPackageDir(datasetname)
              ?? eups setup obs_package ?? (supposedly this can be done within python?)
          "--output"
              copy stub input repo to output
          "--dataIdString" (note: often just "--id" as a command-line string in other places)
          ap_pipe.ingest_raws(data_location, repo_location)
          ap_pipe.ingest_calibs(data_location, repo_location)
          ap_pipe.process(repo_location, rerun, parallelization_level, dataIdString)
              --> how to get templates? in same repo with input data?
          ap_pipe.diff(repo_location, rerun, parallelization_level, dataIdString)
          ap_pipe.assoc(repo_location, rerun)
          ap_pipe.afterburner?? --> and/or, the afterburner could live in the same package as main and be called as verify_ap_afterburner
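
      As a concreteness check, here is a minimal Python sketch of what main() could look like, assuming the ap_pipe stub functions named above, lsst.utils.getPackageDir for the package lookup, and a hypothetical dataset_config.yaml; the rerun name and parallelization level are placeholders, not settled choices:

          import argparse
          import shutil

          import yaml
          from lsst.utils import getPackageDir

          import ap_pipe  # the prototype pipeline whose API this driver would call


          def load_dataset_config(path):
              """Read the dataset-to-package map, e.g., {"HiTS": "hits_data"}."""
              with open(path) as f:
                  return yaml.safe_load(f)


          def main():
              parser = argparse.ArgumentParser()
              parser.add_argument("--dataset", required=True)
              parser.add_argument("--output", required=True)
              parser.add_argument("--dataIdString", default="")
              args = parser.parse_args()

              # Locate the input data: dataset name -> EUPS package -> directory.
              datasets = load_dataset_config("dataset_config.yaml")
              input_repo_dir = getPackageDir(datasets[args.dataset])

              # Copy the stub input repo to the output location, then run each stage.
              shutil.copytree(input_repo_dir, args.output)
              ap_pipe.ingest_raws(input_repo_dir, args.output)
              ap_pipe.ingest_calibs(input_repo_dir, args.output)
              ap_pipe.process(args.output, "rerun", 1, args.dataIdString)
              ap_pipe.diff(args.output, "rerun", 1, args.dataIdString)
              ap_pipe.assoc(args.output, "rerun")


          if __name__ == "__main__":
              main()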
      

      Activity

            Krzysztof Findeisen added a comment -

            The code is finally ready for review. It should be equivalent to the pseudocode above, except that it's organized a little differently (e.g., the first six lines correspond to Dataset.__init__).

            reiss, can you review for general programming quality and not-too-horrible-deviations-from-Pythonicity? mrawls, can you review for consistency with the original design, and whether the stubs in pipedriver will work with your plan for ap_pipe?
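
            As a reading aid, here is a minimal sketch of how those first six pseudocode lines could map onto Dataset.__init__; the config file name, attribute name, and error handling are illustrative assumptions, not the reviewed code:

                import yaml
                from lsst.utils import getPackageDir


                class Dataset:
                    """A self-contained input dataset, identified by name (e.g., "HiTS")."""

                    def __init__(self, dataset_name):
                        # "--dataset": translate the dataset name into an EUPS package
                        # using the YAML config from the pseudocode.
                        with open("dataset_config.yaml") as f:
                            datasets = yaml.safe_load(f)   # e.g., {"HiTS": "hits_data"}
                        if dataset_name not in datasets:
                            raise ValueError("Unsupported dataset: " + dataset_name)
                        # The package must already be set up with eups; running
                        # `eups setup` from within Python is still an open question above.
                        self.input_repo_dir = getPackageDir(datasets[dataset_name])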
            David Reiss added a comment -

            Please see my comments on the PR. Other than those, it all looks really good to me. I have a couple of design questions, but I am not sure whether those choices were explicit or arbitrary; please see the comments. I could not run the tests just now because my stack is having difficulty rebuilding, but please ensure that the tests are run and pass in CI.

            Krzysztof Findeisen added a comment -

            I've tried to reply to your design questions. Let me know if you want to argue about anything.

            As stated in the PR message, I cannot run test_dataset until we have a working dataset; that may turn up more bugs like the missing returns. krughoff said that need not block work on this ticket (presumably because this package is not yet part of the Stack or CI?).

            Meredith Rawls added a comment -

            As I said over on GitHub, I'm very happy with how this looks, and you've done an impressive job building this from scratch. I will defer to reiss to mark it as "Reviewed" once he is happy with all of the dataset interface business.

            Krzysztof Findeisen added a comment -

            The repository created by this ticket is now called ap_verify. With this change, all test cases pass, and the main program runs until it hits a NotImplementedError.
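
            For context, a stubbed stage in this style might look like the sketch below; the function name and message are illustrative, not the actual ap_verify code:

                def assoc(repo_location, rerun):
                    """Placeholder for source association."""
                    # The driver runs each stage in order until it reaches a stub
                    # like this one, which fails deliberately.
                    raise NotImplementedError("Source association is not yet implemented.")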
            David Reiss added a comment -

            Sounds great to me. Consider it done!

            People

              Assignee: Krzysztof Findeisen
              Reporter: Simon Krughoff (Inactive)
              Reviewers: David Reiss, Meredith Rawls
              Watchers (4): David Reiss, Krzysztof Findeisen, Meredith Rawls, Simon Krughoff (Inactive)
              Votes: 0

              Dates

                Created:
                Updated:
                Resolved:

                Jenkins

                  No builds found.