DM-21885

Convert concrete MetricTasks to PipelineTasks

    Details

      Description

      The lsst.verify.MetricTask system was designed to be as forward-compatible with PipelineTask as possible, so it should be straightforward to convert it, its subclasses, and any related tasks that are still needed in Gen3.

      This is an umbrella ticket for all of that work; I don't know exactly what that would entail, but would like to have any relevant tickets linked here somehow.  I've taken a wild guess at SPs, but would love to have that updated by someone who actually has a decent basis for estimation.

            Activity

            krzys Krzysztof Findeisen added a comment - - edited

            Jim Bosch, I'm still unclear on the requirements for this work. Do we need MetricTasks to be able to work with both Gen 2 and Gen 3 pipelines during a transition period, or is this a replacement of Gen-2-only tasks with Gen-3-only? The distinction affects what the best implementation is.

            jbosch Jim Bosch added a comment -

            I think that's up to Eric Bellm and John Swinbank, but if there is no transition period we'd need to schedule the work carefully so that we don't lose the ability to run complete end-to-end tests in at least one middleware system for too long.

            krzys Krzysztof Findeisen added a comment - - edited

            I've ticketed all the work related to MetricTask itself, but I'm not clear on the status of MetricsControllerTask. This Gen 2-only task does the following:

            1. Identify the datasets needed by each MetricTask [handled by QuantumGraph]
            2. Attach SQuaSH-mandated metadata to the measurements; currently, the data ID and instrument name
            3. Catch exceptions and None returns from invalid/inapplicable MetricTasks
            4. Persist the measurements [DM-21875]

            At the time that MetricsControllerTask was designed, I assumed that #2 would be handled by Gen 3 provenance; can Jim Bosch or Angelo Fausti comment on that?

            I'm not sure if #3 needs a custom executor for pipelines containing MetricTasks, or if it will be handled using a run-everything-you-can strategy in the default executor. The key thing is that independent MetricTasks should not stop each other from running.
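A minimal sketch of the run-everything-you-can behavior described in #3, assuming nothing about the real executor: `run_independent` and the toy task callables are hypothetical names, not part of lsst.verify or the Gen 3 middleware. Each task is called in isolation; an exception or a None return is recorded, and the loop moves on to the next task.

```python
import logging

log = logging.getLogger("metric_sketch")


def run_independent(tasks):
    """Run every task, recording failures instead of propagating them.

    ``tasks`` maps a name to a zero-argument callable (hypothetical stand-ins
    for MetricTasks). Returns (measurements, skipped), both dicts keyed by name.
    """
    measurements = {}
    skipped = {}
    for name, task in tasks.items():
        try:
            result = task()
        except Exception as exc:  # one bad task must not stop the others
            log.warning("%s failed: %s", name, exc)
            skipped[name] = repr(exc)
            continue
        if result is None:
            # None is treated as "metric not applicable", not as an error
            skipped[name] = "not applicable"
        else:
            measurements[name] = result
    return measurements, skipped


def broken_task():
    """Toy task that always fails, to exercise the error path."""
    raise ValueError("missing input")


measurements, skipped = run_independent({
    "good": lambda: 1.5,
    "inapplicable": lambda: None,
    "broken": broken_task,
})
```

Whether this logic belongs in a custom executor or in the default one is exactly the open question; the sketch only shows the isolation property that either choice would need to provide.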

            afausti Angelo Fausti added a comment -

            Krzysztof Findeisen, I don't know much about Gen 3 provenance yet, but making sure that the metadata we want to associate with metrics is present in the registry sounds like the first step. I also learned that there's a provenance WG forming: https://ldm-722.lsst.io/v/DM-20309/index.html

            jbosch Jim Bosch added a comment - - edited

            Just doing the Butler.put in Gen3 on the Measurement objects will associate them with data IDs in the Gen3 registry database (and the data ID includes the instrument name in Gen3 for datasets keyed by visit and/or detector).  If you want those to be inserted into the Measurement objects themselves somehow before they're written, that would not be covered, and we'd need to discuss how best to handle that.  Inserting metadata beyond the data ID and instrument name into the registry database is something we have plans for (DM-21773) but have not yet implemented.
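To illustrate the distinction, here is a toy sketch in which `ToyRegistry`, `ToyMeasurement`, the dataset type name, and the data ID values are all hypothetical stand-ins, not the real Butler or lsst.verify classes. The put call records the data ID next to the stored object, while the object itself is left unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMeasurement:
    """Hypothetical stand-in for a measurement; not the lsst.verify class."""
    metric: str
    value: float
    notes: dict = field(default_factory=dict)


class ToyRegistry:
    """Hypothetical in-memory store keyed by (dataset type, data ID)."""

    def __init__(self):
        self._datasets = {}

    @staticmethod
    def _key(dataset_type, data_id):
        # Sort the data ID items so that key order does not matter.
        return (dataset_type, tuple(sorted(data_id.items())))

    def put(self, obj, dataset_type, data_id):
        # The data ID is stored as the lookup key, outside the object.
        self._datasets[self._key(dataset_type, data_id)] = obj

    def get(self, dataset_type, data_id):
        return self._datasets[self._key(dataset_type, data_id)]


registry = ToyRegistry()
data_id = {"instrument": "DemoCam", "visit": 1234, "detector": 42}
meas = ToyMeasurement("demo.metric", 100.0)
registry.put(meas, "toy_metric_value", data_id)

# The association lives in the registry; the measurement carries no data ID.
stored = registry.get("toy_metric_value", data_id)
```

Copying the data ID into the object (here, something like `meas.notes.update(data_id)` before the put) would be an explicit extra step, which is the case described above as not covered.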

            I think we'll want to take a look at the expected-failure modes for MetricTasks you refer to in #3 as use cases for PipelineTasks in general, and define some rules that would allow them to work with generic activators.  We've done a tiny bit of work in that area so far, but have long known that we need more sophistication in classifying and handling failures.
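One possible shape for such rules, purely as a sketch: the exception classes and the `classify` function below are hypothetical and are not existing middleware or lsst.verify names. The idea is that a task signals the kind of failure through the exception type, and a generic activator maps that type to an action without knowing anything about the task.

```python
class ExpectedSkip(Exception):
    """Hypothetical: the task legitimately has nothing to measure."""


class TaskFailure(Exception):
    """Hypothetical: the task hit a real problem worth reporting."""


def classify(exc):
    """Map an exception to an action a generic activator could take."""
    if isinstance(exc, ExpectedSkip):
        return "skip"      # log quietly, continue with other quanta
    if isinstance(exc, TaskFailure):
        return "report"    # record the failure, continue with other quanta
    return "abort"         # unknown error type: stop and surface it


actions = [
    classify(exc)
    for exc in (ExpectedSkip(), TaskFailure("bad input"), KeyError("x"))
]
```

The three-way split is an assumption made for illustration; the actual categories would come out of the use-case review proposed above.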

             

            krzys Krzysztof Findeisen added a comment -

            "Inserting metadata beyond the data ID and instrument name"

            This is also a feature of MetricsControllerTask (DM-16736), but AFAIK it has never been used, so I'm fine with dropping it until DM-21773.

            swinbank John Swinbank added a comment -

            Hey Krzysztof Findeisen — I'm a bit confused by what happened on this ticket.

            Am I correct in understanding that the idea is there's no work actually being performed here, but rather that it automatically becomes done when all of its blockers are completed? That's why it would be worth 0 SPs.

            However... it's marked as done, and DM-21937, which is blocking it, is still in progress.

            krzys Krzysztof Findeisen added a comment - - edited

            Yes, most of the Gen 3 work is organized in terms of such tickets.

            As for DM-21937, I think it's premature to return to that discussion until Middleware does a bit more development, but we did have a workaround that could be used for the initial Gen 3 conversion, so I propose unlinking it from this ticket.


              People

              Assignee:
              krzys Krzysztof Findeisen
              Reporter:
              jbosch Jim Bosch
              Watchers:
              Angelo Fausti, Jim Bosch, John Swinbank, Krzysztof Findeisen
              Votes:
              0

                Dates

                Created:
                Updated:
                Resolved:
