  1. Data Management
  2. DM-19831

QuantumGraph pickle files seem to carry environment info or try to import packages

    Details

    • Team:
      Data Access and Database

      Description

      The shared stack on lsst-dev has more packages installed than the standard Conda environment (e.g. dask). Gen3 QuantumGraph pickle files seem to carry information that makes loading them import those packages. Taking a QuantumGraph pickle file made with the shared stack on lsst-dev and using it in another environment without those extra packages (e.g. on a "clean" computer with only the stack installed and nothing more) resulted in errors like the one below:

       

        File "/opt/lsst/software/stack/stack/miniconda3-4.5.12-1172c30/Linux64/pex_config/17.0.1-1-g703d48b+6/python/lsst/pex/config/config.py", line 990, in loadFromStream
          exec(stream, {}, local)
        File "<string>", line 37, in <module>
      ModuleNotFoundError: No module named 'PIL'
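
The traceback shows the persisted config being run through `exec(stream, {}, local)`, so any `import` line in the saved stream executes at load time. The following is a minimal sketch of that failure mode, not the pex_config implementation: `PlaceholderConfig`, `load_from_stream`, and the module name `module_missing_on_this_host` are hypothetical stand-ins.

```python
class PlaceholderConfig:
    """Hypothetical stand-in for a real Config class."""


def load_from_stream(config, stream):
    # Mirrors the call seen in the traceback: the saved stream is Python
    # source, so every statement in it (including imports) runs on load.
    local = {"config": config}
    exec(stream, {}, local)


# A saved stream written on a host where the extra module was installed.
saved_stream = (
    "import module_missing_on_this_host\n"  # hypothetical module name
    "config.value = 1\n"
)

try:
    load_from_stream(PlaceholderConfig(), saved_stream)
    missing_module = None
except ModuleNotFoundError as exc:
    # The load fails before any config field is set.
    missing_module = exc.name
```

The import fails even though nothing in the config body uses the module, which matches the symptom reported here.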
      

          Activity

          hchiang2 Hsin-Fang Chiang added a comment -

          In the Gen2 era, a persisted config file would carry all of its imports. That may be relevant here.

          swinbank John Swinbank added a comment -

          Since this is a middleware issue, setting team to DAX and hoping Fritz Mueller will ensure it gets handled.

          salnikov Andy Salnikov added a comment - - edited

          This is definitely a pex_config issue. Is pex_config a DAX responsibility now? I can look at it, but it will take time to learn all the intricacies (and I will likely break many things along the way). If there are pex_config experts out there, maybe we should ask them first?

           

          jbosch Jim Bosch added a comment -

          Nate Lust may be able to comment on this; this may be fallout from the fix for another bug that he worked on.

          Part of me thinks we should just try to avoid all of this by moving away from pickle for QuantumGraph (and Pipeline) sooner rather than later (I think we need to do that eventually anyway).

           

          salnikov Andy Salnikov added a comment -

          I don't think the QuantumGraph pickle is the problem, but rather the persistence of pex_config as Python code (pickling is based on that). I'm OK with looking for a better replacement for pickle. I thought about that when I implemented pickling but decided that pickle was better than JSON or YAML, mainly because of pex_config.
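
To illustrate how pickling can be "based on" persistence as Python code, here is a hedged sketch of an object whose `__reduce__` stores a source string and replays it with `exec` on reconstruction. `SourceBackedConfig`, `to_source`, and `rebuild_from_source` are hypothetical names, not pex_config API.

```python
def rebuild_from_source(cls, source):
    # Reconstruction replays the saved Python source against a fresh
    # instance, so any import line in that source would run here.
    config = cls()
    exec(source, {}, {"config": config})
    return config


class SourceBackedConfig:
    """Hypothetical config whose pickled state is a string of Python code."""

    def __init__(self):
        self.value = 0

    def to_source(self):
        # Serialize the state as assignments against a `config` variable.
        return f"config.value = {self.value!r}\n"

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it on load.
        return (rebuild_from_source, (type(self), self.to_source()))


original = SourceBackedConfig()
original.value = 42

# Replay the reduce recipe by hand; pickle.loads performs the same call.
rebuild, args = original.__reduce__()
restored = rebuild(*args)
```

With a scheme like this, the pickle container itself is incidental: swapping it for JSON or YAML would still leave a code string that must be executed to rebuild the config.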

          tjenness Tim Jenness added a comment -

          Pex_config pickle files are the output of config.saveToStream stored as a big string, so it's definitely possible to reproduce that in YAML, and there is a helper routine for converting the string back to the Config.

          I'm not really sure how pex_config decides which Python imports it needs to include in the output stream, so I'm not sure why PIL was added in this case. I did a quick check of serializing a config and it only included this at the top:

          import __main__
          assert type(config)==__main__.Complex, 'config is of type %s.%s instead of __main__.Complex' % (type(config).__module__, type(config).__name__)
          

          Maybe Kian-Tat Lim has some insight into what triggers the imports being included. We could also consider going through the saved stream one line at a time, evaluating each line until we hit the assert line. That way we would at least import everything that can be imported without breaking, and defer breakage until something tries to use the missing import.
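
The line-by-line idea could be sketched roughly as follows. This is an illustration under stated assumptions, not a proposed pex_config patch: it assumes the leading lines of the stream are plain `import X` statements, and `MissingModule`, `load_tolerating_missing_imports`, and `DemoConfig` are hypothetical names.

```python
import types


class MissingModule(types.ModuleType):
    """Placeholder that fails only when an attribute is actually used."""

    def __getattr__(self, attr):
        raise ModuleNotFoundError(
            f"No module named {self.__name__!r} (needed for .{attr})",
            name=self.__name__,
        )


def load_tolerating_missing_imports(config, stream):
    namespace = {"config": config}
    missing = []
    body = []
    in_prologue = True
    for line in stream.splitlines():
        stripped = line.strip()
        if in_prologue and stripped.startswith("import "):
            try:
                exec(stripped, namespace)  # try each import on its own
            except ImportError:
                name = stripped.split()[1]
                top = name.split(".")[0]
                # Bind a placeholder so later references fail lazily.
                namespace.setdefault(top, MissingModule(top))
                missing.append(name)
            continue
        if stripped and not stripped.startswith("#"):
            in_prologue = False  # first real statement, e.g. the assert
        body.append(line)
    # Everything from the assert line onward runs as a single block.
    exec("\n".join(body), namespace)
    return missing


class DemoConfig:
    """Hypothetical stand-in for a real Config class."""


demo_stream = "\n".join([
    "import json",
    "import module_missing_on_this_host",  # hypothetical module name
    "assert type(config).__name__ == 'DemoConfig'",
    "config.value = json.loads('3')",
])
demo_config = DemoConfig()
skipped = load_tolerating_missing_imports(demo_config, demo_stream)
```

In this sketch the load succeeds and reports the skipped import; an error would surface only if the config body touched an attribute of the missing module. Imports written as `import a.b as c` or `from a import b` are not handled.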


            People

            • Assignee:
              Unassigned
              Reporter:
              hchiang2 Hsin-Fang Chiang
              Watchers:
              Andy Salnikov, Fritz Mueller, Hsin-Fang Chiang, Jim Bosch, John Swinbank, Tim Jenness