Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-38952

Add ci_middleware package

    XMLWordPrintable

    Details

      Description

      We have a lot of missing test coverage in the middleware packages that have let a lot of bugs slip through onto main recently, especially in execution butler, QuantumGraph generation, and corners of the butler query system exercised only by those. It's very hard to write complete unit tests for these components in the packages that implement, because the most complicated bits of logic involve interactions between our realistically-complicated data model and a realistically complicated pipeline. We've written many test utility suites with made-up data across several packages but none of them are really adequate (or easy to extend without breaking lots of existing tests).

      But we also have a powerful tool we've barely used - the mock-execution system created on DM-31253. We also have plenty of realistic test data in various repos that would collapse down to something pretty small if we exported just the database content and replaced files with tiny stubs.

      My plan here is to create a new ci_middleware package with that kind of test data (probably in the form of an export YAML and library code to make a repo from it) and test scripts. It'd depend on on drp_pipe to get some real pipelines (AP pipelines are at present a subset of DRP from the middleware-complexity standpoint) and some real obs package (probably obs_subaru, since that's what will support the more complex pipelines and calibrations right now). It'll also have required dependencies on daf_butler, pipe_base, and obs_base, and an optional dependency on ctrl_bps and and particular BPS extension packages we want to test there, with those optional tests skipped if the appropriate packages are set up.

      My first goal here is being able to test QG generation, but I definitely want to be able to test SingleQuantumExecutor against QBB and execution butler here as well, by running the real DRP pipelines in mock mode. Integrating testing involving different ResourcePath schemes was not something in mind, but I don't know much about setting up test environments for that, and I'm not opposed to the idea.

      If the exported test data is on the large side, this might involve splitting of a testdata_middleware for that.

        Attachments

          Issue Links

            Activity

            Hide
            jbosch Jim Bosch added a comment -

            One downside of this approach is that we won't be able see the tests in this package in the coverage reports on PRs.  But I definitely plan to make sure I can run coverage locally while working on a ticket to see what it looks like in the files I'm working on, and I'll take unreported test coverage over no test coverage any day.

            Show
            jbosch Jim Bosch added a comment - One downside of this approach is that we won't be able see the tests in this package in the coverage reports on PRs.  But I definitely plan to make sure I can run coverage locally while working on a ticket to see what it looks like in the files I'm working on, and I'll take unreported test coverage over no test coverage any day.
            Hide
            tjenness Tim Jenness added a comment -

            pipetask does at least have a coverage option now. If you are using ci_builder for this I'm not sure if ci_builder knows about the coverage option yet.

            Show
            tjenness Tim Jenness added a comment - pipetask does at least have a coverage option now. If you are using ci_builder for this I'm not sure if ci_builder knows about the coverage option yet.
            Hide
            jbosch Jim Bosch added a comment -

            I think this is ready for review. It doesn't do everything I'd like the new test package to eventually do (anything involving CALIBRATION collections is the most notable absence, and I think we could use this to test some BPS functionality, too), but it's a good start that already found a couple more bugs in storage class conversion.  It takes about 25 minutes to run in Jenkins, and it covers most of the middleware things we rely on ci_hsc for while also testing a lot more.

            It's also worth noting that this tests direct full-Butler execution and QBB, but it does not test execution butler at all. I could add that pretty easily, I think, but it just didn't seem worthwhile with removal around the corner.

            Recommended review order is to start with the ci_middleware README, and then review the ci_middleware PR as a monolith (not commit-by-commit; too much added in early commits disappears by later ones, but there are commits in between I don't want to squash).

            After that, the changes in pipe_base, ctrl_mpexec, and daf_butler will make more sense, and are probably best reviewed commit-by-commit.

            Show
            jbosch Jim Bosch added a comment - I think this is ready for review. It doesn't do everything I'd like the new test package to eventually do (anything involving CALIBRATION collections is the most notable absence, and I think we could use this to test some BPS functionality, too), but it's a good start that already found a couple more bugs in storage class conversion.  It takes about 25 minutes to run in Jenkins, and it covers most of the middleware things we rely on ci_hsc for while also testing a lot more. It's also worth noting that this tests direct full-Butler execution and QBB, but it does not test execution butler at all. I could add that pretty easily, I think, but it just didn't seem worthwhile with removal around the corner. Recommended review order is to start with the ci_middleware README , and then review the ci_middleware PR as a monolith (not commit-by-commit; too much added in early commits disappears by later ones, but there are commits in between I don't want to squash). After that, the changes in pipe_base , ctrl_mpexec , and daf_butler will make more sense, and are probably best reviewed commit-by-commit.
            Hide
            tjenness Tim Jenness added a comment -

            I haven't looked at every line of ci_middleware but it does look impressive. It took 11 minutes to run on my laptop (same on my desktop because it can't use many cores).

            Show
            tjenness Tim Jenness added a comment - I haven't looked at every line of ci_middleware but it does look impressive. It took 11 minutes to run on my laptop (same on my desktop because it can't use many cores).
            Hide
            jbosch Jim Bosch added a comment - - edited

            I'd forgotten one branch, analysis_drp, which is just fixes to deprecation warnings that were bugging me while testing ci_middleware.

            Show
            jbosch Jim Bosch added a comment - - edited I'd forgotten one branch, analysis_drp , which is just fixes to deprecation warnings that were bugging me while testing ci_middleware.
            Hide
            jbosch Jim Bosch added a comment -

            Running pipelines_check (how had I not done that before?) revealed another problem: the small amount of refactoring I did in ctrl_mpexec prevented the instrument defined in the pipeline from being automatically included in the data query in QG generation. Could you take a look at the last commit in each of pipe_base and ctrl_mpexec for the fix?

            I'm running a full ci_<everything> Jenkins again, so anytime in the next 7 hours would be fine.

            Show
            jbosch Jim Bosch added a comment - Running pipelines_check (how had I not done that before?) revealed another problem: the small amount of refactoring I did in ctrl_mpexec prevented the instrument defined in the pipeline from being automatically included in the data query in QG generation. Could you take a look at the last commit in each of pipe_base and ctrl_mpexec for the fix? I'm running a full ci_<everything> Jenkins again, so anytime in the next 7 hours would be fine.
            Hide
            tjenness Tim Jenness added a comment -

            New code looks okay but I wonder about adding Instrument.from_pipeline to remove some duplicate code and there is precedent for adding this type of constructor to Instrument.

            Show
            tjenness Tim Jenness added a comment - New code looks okay but I wonder about adding Instrument.from_pipeline to remove some duplicate code and there is precedent for adding this type of constructor to Instrument.
            Hide
            kannawad Arun Kannawadi added a comment -

            Ah, I created DM-39602 which fixes those deprecation warnings in `analysis_drp` and in `faro`.

            Show
            kannawad Arun Kannawadi added a comment - Ah, I created DM-39602 which fixes those deprecation warnings in `analysis_drp` and in `faro`.

              People

              Assignee:
              jbosch Jim Bosch
              Reporter:
              jbosch Jim Bosch
              Reviewers:
              Tim Jenness
              Watchers:
              Arun Kannawadi, Jim Bosch, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.