  1. Data Management
  2. DM-17398

Support execution of incomplete graphs

    Details

    • Story Points:
      0.5
    • Sprint:
      BG3_S19_01
    • Team:
      Data Access and Database

      Description

      Michelle discovered an interesting issue while trying to run pipetask on a single Quantum (extracted from a full graph). The graph's traverse() method tries to determine the prerequisites for executing a quantum by looking at its inputs and determining which other quanta produced them. This works on a full graph, but when a single quantum is extracted from a graph its prerequisites are no longer in the same graph, and traverse() currently cannot handle this (it crashes).

       

        Attachments

          Activity

          salnikov Andy Salnikov added a comment - - edited

          I'd like to solve this without any extra command line options if possible (we already have too many of those).

          At the core of the problem is that the Quantum has inputs which do not come from the butler (dataset_id is None) and the Quantum that produces those inputs is not in the graph. I think the right long-term approach for the "split graph" is to have a different kind of executor which works on a single Quantum and does not need all the graph-related complications. As we don't have that executor now, we need some workaround so that the existing executor does not break in this case.

          One possible workaround is to check the number of Quanta in a graph: if there is only one Quantum, it can be assumed that this is (possibly) a split graph, and any missing inputs can be ignored (with no prerequisites built for them). A single-quantum graph can of course also be built normally without splitting, but in that case it will not have non-existing inputs, so it should be fine.

          Another option is of course to add one more command line option and ignore non-existing inputs iff this option is present.
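
          To make the two options concrete, here is a minimal sketch of the prerequisite lookup and both workarounds. It is not the actual pipetask or traverse() code: DatasetRef, Quantum, find_prerequisites and the ignore_missing parameter are hypothetical stand-ins invented for illustration, as are the task and dataset names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class DatasetRef:
    """Hypothetical stand-in for a dataset reference."""
    name: str
    dataset_id: Optional[int] = None  # None means "not in the butler yet"


@dataclass
class Quantum:
    """Hypothetical stand-in for a Quantum with its inputs and outputs."""
    label: str
    inputs: List[DatasetRef] = field(default_factory=list)
    outputs: List[DatasetRef] = field(default_factory=list)


def find_prerequisites(
    quanta: List[Quantum], ignore_missing: Optional[bool] = None
) -> Dict[str, Set[str]]:
    """Map each quantum label to the labels of quanta producing its inputs.

    If ignore_missing is None, use the heuristic from the comment above:
    a graph with exactly one quantum is assumed to be a (possibly) split
    graph, so inputs without a producer in the graph are skipped.
    Passing True or False explicitly models the command line option.
    """
    if ignore_missing is None:
        ignore_missing = len(quanta) == 1

    # Index every output dataset by the quantum that produces it.
    producers = {out.name: q.label for q in quanta for out in q.outputs}

    prerequisites: Dict[str, Set[str]] = {}
    for quantum in quanta:
        needed: Set[str] = set()
        for ref in quantum.inputs:
            if ref.dataset_id is not None:
                continue  # already exists in the butler, no prerequisite
            producer = producers.get(ref.name)
            if producer is None:
                if ignore_missing:
                    continue  # producer was split off into another graph
                raise KeyError(f"no quantum in graph produces {ref.name!r}")
            needed.add(producer)
        prerequisites[quantum.label] = needed
    return prerequisites


# Hypothetical two-task graph: taskA reads an existing dataset and
# produces an intermediate one that taskB consumes.
existing = DatasetRef("existing", dataset_id=1)
intermediate = DatasetRef("intermediate")
task_a = Quantum("taskA", inputs=[existing], outputs=[intermediate])
task_b = Quantum("taskB", inputs=[intermediate], outputs=[DatasetRef("final")])

full_graph = find_prerequisites([task_a, task_b])
split_graph = find_prerequisites([task_b])
```

          In this sketch the full graph resolves taskB's prerequisite to taskA, the single-quantum graph yields no prerequisites for taskB, and calling find_prerequisites([task_b], ignore_missing=False) raises, which mirrors the crash described in the ticket.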

          salnikov Andy Salnikov added a comment -

          Jim Bosch, can you look at my quick and dirty patch? It fixes Michelle's immediate problem, but as I said we'll need something better for the future. I tried to come up with a unit test, but it requires a registry, so it cannot be done until we have that.

          jbosch Jim Bosch added a comment -

          Looks fine for an interim solution, and given the temporary nature of this fix I'm not too bothered by the lack of a test.

          Long term, now that all of the work involved in running a single Quantum has been split out into SingleQuantumExecutor, I think the right solution involves either having a separate command-line interface for that, or Michelle Gower et al. calling it in Python from some custom command-line tool to be run only by Pegasus.

          salnikov Andy Salnikov added a comment -

          Thanks for the review! Merged and done.

          mgower Michelle Gower added a comment -

          Thanks for the fix. It worked. For the long-term solution, we should have a bigger discussion. We will need to run some Python code before and after running the quantum, so we'd prefer not to have to use a subprocess call to run the quantum. Also, long term, it is worrisome that running a single quantum cannot be accomplished by "just" executing the task's runQuantum method.


            People

            Assignee:
            salnikov Andy Salnikov
            Reporter:
            salnikov Andy Salnikov
            Reviewers:
            Jim Bosch
            Watchers:
            Andy Salnikov, Jim Bosch, Michelle Gower, Vaikunth Thukral

              Dates

              Created:
              Updated:
              Resolved:
