RFC-823

How do we maintain the stack version of scarlet?

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None

      Description

      DM-30756 involved stripping autograd from a sub-module in scarlet, which effectively required a whole new code base with only a few shared primitives. The main scarlet code is now seen as geared more toward joint reprocessing of images from ground, space, and spectral data, which needs GPUs/TPUs for good performance, so Peter Melchior and his group are going to move scarlet "main" to PyTorch/JAX so that it can run on GPUs.

      The question now is where we keep the CPU "lite" version of scarlet that we will use for the LSST Science Pipelines, and how we structure it. An important constraint is that a version must be available outside of the stack, so that people outside of DM can continue to run a CPU version of scarlet without having to install the entire stack.

      I am proposing to update the stack scarlet with the current main from Peter's repo, including the modifications for the lite module, and to maintain it as a separate fork going forward, no longer keeping it in sync with the soon-to-be GPU version that will be maintained outside of LSST. As with the current "main" branch, it can be installed using setup.py, and we will still use the lsst-dev branch inside the stack. There is a lot of scarlet code that we don't use or need at all in the stack, so we could remove all of it to make the package more lightweight and easier for other developers to maintain and understand. We probably want to rename the package to scarlet_lite so that there is less confusion and people can have both the heavy and lite versions of scarlet installed at the same time. Does this seem reasonable, or does anyone have a better idea about how to make this split?
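
As a rough illustration of why a distinct package name helps, the sketch below checks which variants are importable in the current environment. It assumes the renamed package would be importable as `scarlet_lite` (the name proposed above, not an existing release); the helper `available_scarlet_variants` is hypothetical and only uses the standard library.

```python
import importlib.util


def available_scarlet_variants(candidates=("scarlet", "scarlet_lite")):
    """Return the candidate top-level package names importable here.

    The default names are assumptions: "scarlet" is the existing package,
    "scarlet_lite" is the name proposed in this RFC.
    """
    found = []
    for name in candidates:
        # find_spec returns None when a top-level package is not installed
        if importlib.util.find_spec(name) is not None:
            found.append(name)
    return found


if __name__ == "__main__":
    # With distinct names, both variants can be listed side by side.
    print(available_scarlet_variants())
```

Because the two names never collide, code that needs the CPU version can import it explicitly without caring whether the GPU version is also present.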

            Activity

            fred3m Fred Moolekamp added a comment -

            Thank you for bringing up testing, Jim Bosch. That's actually another big argument in favor of the split that I had thought about but forgot to mention. I'd love to have more extensive testing in scarlet to make sure that no regressions or bugs are introduced. We had a decent compromise in the main scarlet package, but it turned out to be insufficient for testing scarlet on a suitable subset of data to catch regressions that would affect us, and there was pushback against implementing more extensive testing in the main package. I agree completely that scarlet should be set up as a typical stack package; I just want to make sure that it can be installed outside of the stack (e.g. with a setup.py/setup.cfg as opposed to scons). But given the points made by Kian-Tat Lim the other day in favor of creating a new package, I do see all of the tests being run in the same way as a typical stack package, and the addition of a ci_scarlet package is a great idea.

            There will be a small but non-zero amount of work to format the new package, since scarlet followed black formatting and used snake case, but I think that was going to be necessary anyway, and I can use PyCharm to greatly simplify renaming variables and functions.

            tjenness Tim Jenness added a comment -

            A new package is absolutely allowed to use snake case, and black is consistent with the style guide as well (we use it in middleware packages).

            Parejkoj John Parejko added a comment -

            Would the rc_subset package be enough data to serve as a useful larger scarlet test? It has 6 detectors with 8 visits in each of grizy from HSC.

            fred3m Fred Moolekamp added a comment -

            Maybe, but I don't think that picking the right dataset for scarlet testing will be trivial. Ideally, it should be run on something like a subset of DC2 data, so that we know the truth for the objects being deblended and failures are easier to identify.
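
To make the "known truth" point concrete, here is a minimal sketch of the kind of check a truth catalog would allow. Everything in it is an assumption for illustration: the function name `flag_deblend_failures`, the use of flux as the compared quantity, and the 20% tolerance are hypothetical, and the inputs are assumed to be already matched one-to-one.

```python
def flag_deblend_failures(measured_flux, true_flux, rel_tol=0.2):
    """Return indices of objects whose measured flux strays from truth.

    Assumes the two sequences are matched one-to-one; rel_tol is an
    arbitrary illustrative threshold, not a value used by the pipelines.
    """
    failures = []
    for idx, (measured, truth) in enumerate(zip(measured_flux, true_flux)):
        if truth == 0:
            # Relative error is undefined; flag any non-zero measurement.
            if measured != 0:
                failures.append(idx)
            continue
        if abs(measured - truth) / abs(truth) > rel_tol:
            failures.append(idx)
    return failures
```

With real data the truth values would come from the simulation catalog, which is exactly what a dataset without known truth cannot provide.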

            fred3m Fred Moolekamp added a comment -

            To summarize, the plan is to create a new scarlet_lite package that will only contain the bits of the scarlet package that are needed in Rubin. There will also be a ci_scarlet package created to better track deblender performance, but its creation is not necessary to close this RFC.


              People

              Assignee:
              fred3m Fred Moolekamp
              Reporter:
              fred3m Fred Moolekamp
              Watchers:
              Eli Rykoff, Fred Moolekamp, Jim Bosch, John Parejko, Kian-Tat Lim, Tim Jenness
