  1. Data Management
  2. DM-17148

pipeline task activator squashes import errors

    Details

    • Story Points:
      1
    • Sprint:
      BG3_S19_01, BG3_S19_02
    • Team:
      Data Access and Database

      Description

      If the activator is used to run or list tasks, it does not find, or hides results for, modules that failed to import, and reports to the user that the module does not exist. It should at least produce a warning that Python had a problem importing the module, so the user can track down the cause.

        Attachments

          Issue Links

            Activity

            salnikov Andy Salnikov added a comment -

            One complication in implementing this: when importing something I need to try several locations for a module. If the module does not exist at some location, importlib.import_module() raises ImportError, which I currently catch and ignore (and continue searching in other locations). If the module is there but its import fails, importlib.import_module() raises an error for that failure. That error can be any exception, including ImportError (e.g. when the module imports something which does not exist). To make diagnostics useful I need to distinguish between an ImportError raised because the requested module does not exist and an ImportError raised by the imported code.
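One way to draw that distinction is to split "locate the module" from "execute the module". The following is only a minimal sketch, not the code from the pull request: the function name load_module is hypothetical, and treating an unimportable parent package as "not found" is a simplifying assumption. It uses importlib.util.find_spec to check whether the module exists and loader.exec_module to run it, so any exception raised by the module's own code can be wrapped separately.

```python
import importlib.util
import sys


def load_module(module_name):
    """Return the imported module, or None if it cannot be located.

    Hypothetical helper for illustration. Raises ImportError, chained to
    the original exception, when the module exists but executing it fails.
    """
    if module_name in sys.modules:
        return sys.modules[module_name]
    try:
        # find_spec returns None when a top-level module is not found.
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised for dotted names whose parent package cannot be imported.
        # Assumption: treat this as "not found"; a fuller version would
        # inspect the exception to see what exactly was missing.
        return None
    if spec is None or spec.loader is None:
        return None
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    try:
        spec.loader.exec_module(module)
    except Exception as exc:
        # The module was found, so this failure comes from the imported code.
        sys.modules.pop(module_name, None)
        raise ImportError(f"Import of module {module_name} failed") from exc
    return module
```

A caller searching several locations can keep going when it gets None and let the chained ImportError propagate, or log it, when the module exists but is broken.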

            salnikov Andy Salnikov added a comment -

            I think I managed to improve things using different methods from importlib.

            Now if pipetask fails to import some module during list, it will print a warning message (this is after adding import DoesNotExist to a module):

            $ pipetask list
            ctrl.mpexec.taskLoader WARN: import of module lsst.ctrl.mpexec.examples.test1task failed: No module named 'DoesNotExist'
             
            Task class name                                                           Kind     
            ------------------------------------------------------------------------- ---------
            lsst.ctrl.mpexec.examples.calexpToCoaddTask.CalexpToCoaddTask             PipelineTask
            lsst.ctrl.mpexec.examples.patchSkyMapTask.PatchSkyMapTask                 PipelineTask
            lsst.ctrl.mpexec.examples.rawToCalexpTask.RawToCalexpTask                 PipelineTask
            ...
            

            If the problem happens when building a pipeline, a more detailed exception is raised:

            $ pipetask build -t test1task.Test1Task
            Failed to build pipeline: Import of module test1task failed
            Traceback (most recent call last):
              File "/project/salnikov/gen3-middleware/ctrl_mpexec/python/lsst/ctrl/mpexec/taskLoader.py", line 215, in loadTaskClass
                spec.loader.exec_module(module)
              File "<frozen importlib._bootstrap_external>", line 678, in exec_module
              File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
              File "/project/salnikov/gen3-middleware/ctrl_mpexec/python/lsst/ctrl/mpexec/examples/test1task.py", line 11, in <module>
                import DoesNotExist
            ModuleNotFoundError: No module named 'DoesNotExist'
             
            The above exception was the direct cause of the following exception:
             
            Traceback (most recent call last):
              File "/project/salnikov/gen3-middleware/ctrl_mpexec/bin/pipetask", line 26, in <module>
                sys.exit(CmdLineFwk().parseAndRun())
              File "/project/salnikov/gen3-middleware/ctrl_mpexec/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 115, in parseAndRun
                pipeline = self.makePipeline(taskFactory, args)
              File "/project/salnikov/gen3-middleware/ctrl_mpexec/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 259, in makePipeline
                pipeBuilder.addTask(action.value, action.label)
              File "/project/salnikov/gen3-middleware/pipe_base/python/lsst/pipe/base/pipelineBuilder.py", line 115, in addTask
                taskClass, taskName = self._taskFactory.loadTaskClass(taskName)
              File "/project/salnikov/gen3-middleware/ctrl_mpexec/python/lsst/ctrl/mpexec/taskFactory.py", line 79, in loadTaskClass
                taskClass, fullTaskName, taskKind = self.taskLoader.loadTaskClass(taskName)
              File "/project/salnikov/gen3-middleware/ctrl_mpexec/python/lsst/ctrl/mpexec/taskLoader.py", line 218, in loadTaskClass
                raise ImportError(f"Import of module {module_name} failed") from exc
            ImportError: Import of module test1task failed
            

            I also improved diagnostics for other conditions: the module is not found, the module is loaded but the task class is not there, or the class is found but is not a Task class (these all result in exceptions).
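The class-lookup conditions listed above could be checked along these lines. This is an illustrative sketch under stated assumptions, not the TaskLoader implementation: the Task class here is a local stand-in for the real base class, find_task_class is a hypothetical name, and the exception types and messages are guesses, since the ticket only says that each condition results in an exception.

```python
import types


class Task:
    """Stand-in for the real Task base class (assumption for this sketch)."""


def find_task_class(module, class_name, base=Task):
    """Return the class named class_name from module, with distinct errors.

    Hypothetical helper; the exception types are illustrative choices.
    """
    try:
        obj = getattr(module, class_name)
    except AttributeError as exc:
        # Module loaded fine, but the requested name is not defined in it.
        raise ImportError(
            f"Module {module.__name__} does not define {class_name}"
        ) from exc
    if not isinstance(obj, type):
        # The name exists but refers to a function, constant, etc.
        raise TypeError(f"{module.__name__}.{class_name} is not a class")
    if not issubclass(obj, base):
        # A class was found, but it does not derive from the expected base.
        raise TypeError(
            f"{module.__name__}.{class_name} is not a subclass of {base.__name__}"
        )
    return obj
```

Keeping each condition in its own branch lets every message name exactly which step of the lookup failed.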

            salnikov Andy Salnikov added a comment -

            Nate, can you review my updates? I think it's a straightforward change, and I tested it with some manually broken modules. There are also a few small, unrelated, trivial renames in separate commits.

            salnikov Andy Salnikov added a comment -

            Nate Lust, JIRA is slow picking up the PR; here it is: https://github.com/lsst/ctrl_mpexec/pull/10

            nlust Nate Lust added a comment -

            My only comment is that in a few places in strings you refer to things like "a task subclass". Those should probably be changed to PipelineTask, as a task is a related but different thing, and we would not want to confuse a user.

            salnikov Andy Salnikov added a comment -

            Thanks for the review! I know it's confusing, but TaskLoader can handle not only PipelineTask but any subclass of Task too (including CmdLineTask). This code implements one of the early designs (coming from the initial NCSA experiment, I believe), which could run CmdLineTasks similarly to today's PipelineTask. I think we also discussed the possibility that SuperTask could be dynamically composed of sub-Tasks based on configuration, so loading Task classes could be done with the same interface. I think none of that matters anymore, and as CmdLineTask disappears we'll need to simplify TaskLoader to get rid of all those options, but for now I am leaving it as it is. Merged and done.


              People

              • Assignee:
                salnikov Andy Salnikov
                Reporter:
                nlust Nate Lust
                Reviewers:
                Nate Lust
                Watchers:
                Andy Salnikov, Nate Lust, Vaikunth Thukral
