Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-285

Revise front-end for warping

    XMLWordPrintable

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None

      Description

      The warping implementation (currently MakeCoaddTempExpTask) relies on a SelectImagesTask to sort out the inputs. Unfortunately, this selection is performed inside the MakeCoaddTempExp.run method, which means MakeCoaddTempExpTask gets parallelised over patches rather than over individual warps (unless the user has some additional mechanics for setting the command-lines, essentially doing the selection themselves), which is inefficient. The SelectImagesTask hierarchy grew up as a MySQL database query before other things got tacked on to support butler-based selection, and could use simplification.

      I propose to move the selection of images into the TaskRunner.getTargetList method, which will allow for parallelisation over individual warps. I will also refactor the SelectImagesTask hierarchy and add a sub-class to perform selection of images using the spatial database introduced in DM-3472.

      There is a SelectSdssImagesTask in obs_sdss that I propose to drop because it relies on a particular database that I don't have access to and cannot recreate, and that code probably hasn't been used in a while. If someone objects to dropping it, I will attempt to modify it to conform with the SelectImagesTask changes, but testing and fixing it will have to be the responsibility of whoever wants to keep it.

      While I'm modifying the warping front-end, I also propose to move the config parameters that are operational in nature (e.g., doWrite, doOverwrite) out of the Config. Their presence in the Config causes trouble because there are legitimate reasons for changing their values between command-line incantations, but doing so requires adding --clobber-config which should only be required for changes of an algorithmic nature.

        Attachments

          Issue Links

            Activity

            Hide
            jbosch Jim Bosch added a comment -

            I agree that doing the selection before we actually start the warping makes sense, but I don't like the idea of putting it in the TaskRunner instead of the Task itself. I expect SuperTask will help with this in the future, but in the meantime, would be it feasible to have the TaskRunner call a method in the Task to do the selection, at least?

            A cleaner but much less user-friendly interim solution might be to have a separate command-line task for generating the selection, and writing it out as a list to a new butlerized dataset; this would better handle the case where selection itself is an expensive operation that may want to be parallelized.

            I'm all for moving the overwrite options out of the config. Do you have any thoughts on where we could put them instead, or do you think it's just better to not have them for now?

            Show
            jbosch Jim Bosch added a comment - I agree that doing the selection before we actually start the warping makes sense, but I don't like the idea of putting it in the TaskRunner instead of the Task itself. I expect SuperTask will help with this in the future, but in the meantime, would be it feasible to have the TaskRunner call a method in the Task to do the selection, at least? A cleaner but much less user-friendly interim solution might be to have a separate command-line task for generating the selection, and writing it out as a list to a new butlerized dataset; this would better handle the case where selection itself is an expensive operation that may want to be parallelized. I'm all for moving the overwrite options out of the config. Do you have any thoughts on where we could put them instead, or do you think it's just better to not have them for now?
            Hide
            price Paul Price added a comment -

            We can have the TaskRunner call a method in the Task to do the selection; I don't think that's too different from what I was planning.

            For moving the overwrite options out of the Config, I was thinking to put them in a separate Config that wouldn't be persisted. Alternatively, they could be separate command-line options, but I like the idea of grouping them together in a Config. I think we should do something similar for the "debug" options (which would allow us to declare and document all the possible options and set them on the command-line), but let's not discuss that here.

            Show
            price Paul Price added a comment - We can have the TaskRunner call a method in the Task to do the selection; I don't think that's too different from what I was planning. For moving the overwrite options out of the Config , I was thinking to put them in a separate Config that wouldn't be persisted. Alternatively, they could be separate command-line options, but I like the idea of grouping them together in a Config . I think we should do something similar for the "debug" options (which would allow us to declare and document all the possible options and set them on the command-line), but let's not discuss that here.
            Hide
            ktl Kian-Tat Lim added a comment -

            I have been aiming towards a solution like the one Jim Bosch suggested, with two tasks (the first of which I've been calling a DataIdGenerator) communicating via a list of dataIds, although not necessarily via a persisted Butler dataset.

            Paul Price has an account in the lsst-db server that holds the krughoff_SDSS_quality_db database (which was copied from a public SDSS release, I believe) The credentials need to be saved in either both /.lsst/db-auth.paf and /.lsst/db-auth.py or else ~/.mysql.cnf (I think that works).

            Options for configuring I/O, including what datasets are written, should ideally be in a Butler or SuperTask configuration, separate from the algorithmic Task configuration.

            I'm not sure if Gregory Dubois-Felsmann and possibly Rob Kooper [X] and Andy Salnikov will be able to weigh in by the planned end of this RFC. It may need to be extended a day or two.

            Show
            ktl Kian-Tat Lim added a comment - I have been aiming towards a solution like the one Jim Bosch suggested, with two tasks (the first of which I've been calling a DataIdGenerator) communicating via a list of dataIds, although not necessarily via a persisted Butler dataset. Paul Price has an account in the lsst-db server that holds the krughoff_SDSS_quality_db database (which was copied from a public SDSS release, I believe) The credentials need to be saved in either both /.lsst/db-auth.paf and /.lsst/db-auth.py or else ~/.mysql.cnf (I think that works). Options for configuring I/O, including what datasets are written, should ideally be in a Butler or SuperTask configuration, separate from the algorithmic Task configuration. I'm not sure if Gregory Dubois-Felsmann and possibly Rob Kooper [X] and Andy Salnikov will be able to weigh in by the planned end of this RFC. It may need to be extended a day or two.
            Hide
            yusra Yusra AlSayyad added a comment -

            I still use SelectSdssImagesTask and will take responsibility for testing it after your SelectImageTask refactor.

            Show
            yusra Yusra AlSayyad added a comment - I still use SelectSdssImagesTask and will take responsibility for testing it after your SelectImageTask refactor.
            Hide
            price Paul Price added a comment -

            Certainly the long-term solution is to put the selection into the workflow system using SuperTask, but at the moment we don't have SuperTask or a workflow system. It's important to fix the parallelisation of warping in the code we're using now, because it will allow higher efficiency in the coming HSC production run. Also, the code in question has grown in fits and starts and has needed refactoring for a long time.

            What I do should not impact development of the workflow system using SuperTask, but will help inform its development.

            Show
            price Paul Price added a comment - Certainly the long-term solution is to put the selection into the workflow system using SuperTask , but at the moment we don't have SuperTask or a workflow system. It's important to fix the parallelisation of warping in the code we're using now, because it will allow higher efficiency in the coming HSC production run. Also, the code in question has grown in fits and starts and has needed refactoring for a long time. What I do should not impact development of the workflow system using SuperTask , but will help inform its development.
            Hide
            salnikov Andy Salnikov added a comment -

            I'm not sure what exactly to say here. Indeed our expectations currently are that SuperTask will help to solve this sort of issues (likely with some help from task authors). How exactly it will be done I cannot say, it's a complicated issue and we need to think more about all possible options. Cases like this are certainly interesting for developing functional scheduling system and I'll look at it more to understand all details.

            Regarding configuration - SuperTask does not do anything special yet, all configuration issues are handled by Task level. K-T is right that we will most likely reuse (task) configuration for defining things that are "non-algorithmic" parameters, this is again not at all settled yet.

            Show
            salnikov Andy Salnikov added a comment - I'm not sure what exactly to say here. Indeed our expectations currently are that SuperTask will help to solve this sort of issues (likely with some help from task authors). How exactly it will be done I cannot say, it's a complicated issue and we need to think more about all possible options. Cases like this are certainly interesting for developing functional scheduling system and I'll look at it more to understand all details. Regarding configuration - SuperTask does not do anything special yet, all configuration issues are handled by Task level. K-T is right that we will most likely reuse (task) configuration for defining things that are "non-algorithmic" parameters, this is again not at all settled yet.
            Hide
            price Paul Price added a comment -

            Are there further arguments against this proposal, or need to hold off longer to give people a chance to respond before I adopt?

            Show
            price Paul Price added a comment - Are there further arguments against this proposal, or need to hold off longer to give people a chance to respond before I adopt?
            Hide
            jbosch Jim Bosch added a comment -

            Paul Price, are you opposed to having a separate selection CmdLineTask we run first, that writes out a dataset (perhaps an ExposureCatalog) that lists what would be coadded? I think that's sounding like it's the most future-proof, and I imagine it could easily go as part of coaddDrivers.py and hence not make that driver any harder to use, and I even think having that list-of-exposures dataset could be good for provenance (since we wouldn't have to regenerate that list in assembleCoadd's TaskRunner and just hope it's consistent with the list from makeCoaddTempExp's TaskRunner).

            Show
            jbosch Jim Bosch added a comment - Paul Price , are you opposed to having a separate selection CmdLineTask we run first, that writes out a dataset (perhaps an ExposureCatalog ) that lists what would be coadded? I think that's sounding like it's the most future-proof, and I imagine it could easily go as part of coaddDrivers.py and hence not make that driver any harder to use, and I even think having that list-of-exposures dataset could be good for provenance (since we wouldn't have to regenerate that list in assembleCoadd's TaskRunner and just hope it's consistent with the list from makeCoaddTempExp's TaskRunner).
            Hide
            price Paul Price added a comment -

            That's essentially what the spatial database is. I don't see a difference between your proposal and mine except for the implementation details.

            Show
            price Paul Price added a comment - That's essentially what the spatial database is. I don't see a difference between your proposal and mine except for the implementation details.
            Hide
            jbosch Jim Bosch added a comment - - edited

            The difference between the spatial database and my list is an extra level of filtering that takes it from the list of possible exposures included to the list of actual exposures included (from either more precise spatial checks or any kind of quality check). I'd like to put that filtering in Task code if at all possible. But I take your point that even if we do that, the coaddition task TaskRunners will need to load the outputs using the butler and iterate over them to generate data IDs.

            Show
            jbosch Jim Bosch added a comment - - edited The difference between the spatial database and my list is an extra level of filtering that takes it from the list of possible exposures included to the list of actual exposures included (from either more precise spatial checks or any kind of quality check). I'd like to put that filtering in Task code if at all possible. But I take your point that even if we do that, the coaddition task TaskRunners will need to load the outputs using the butler and iterate over them to generate data IDs.
            Hide
            price Paul Price added a comment -

            I think I see the merits of having a "list of actual exposures": it factors out the common operation of selecting exposures from the warping and coadding tasks. OK, so then I see it going like this:

            1. singleFrameDriver/processCcd adds entries to the spatial database.
            2. selectImagesToCoadd selects entries from the spatial database and writes a selection database.
            3. makeMatchedWarp iterates over exposures,patches in the selection database.
            4. assembleCoadd iterates over patches,filters in the selection database.
            5. multiband...

            Do you have a particular preference for the format of the selection database? You mentioned an ExposureCatalog, but I'm leaning towards SQLite for consistency with the spatial database and more flexibility. I can see the spatial and selection databases morphing into a proper database that tracks quality and processing metrics for each stage (e.g., section 2.1 of The Pan-STARRS Data Processing System) down the road (we definitely need something like that).

            Show
            price Paul Price added a comment - I think I see the merits of having a "list of actual exposures": it factors out the common operation of selecting exposures from the warping and coadding tasks. OK, so then I see it going like this: 1. singleFrameDriver/processCcd adds entries to the spatial database. 2. selectImagesToCoadd selects entries from the spatial database and writes a selection database. 3. makeMatchedWarp iterates over exposures,patches in the selection database. 4. assembleCoadd iterates over patches,filters in the selection database. 5. multiband... Do you have a particular preference for the format of the selection database? You mentioned an ExposureCatalog , but I'm leaning towards SQLite for consistency with the spatial database and more flexibility. I can see the spatial and selection databases morphing into a proper database that tracks quality and processing metrics for each stage (e.g., section 2.1 of The Pan-STARRS Data Processing System ) down the road (we definitely need something like that).
            Hide
            jbosch Jim Bosch added a comment -

            I hadn't really thought about extending it to a real database, but it's an interesting idea. I'd like to get Kian-Tat Lim and DAX opinions on that.

            Conceptually, I think this is just a per tract+patch+filter list of visit+ccd data IDs, right? I think the simplest way to represent that is a regular butlerized file (maybe just BaseCatalog rather then ExposureCatalog, if that's all it is). Mapping generic data IDs to a schema is probably a bit thorny right now, but I think we'll have to provide generic capabilities for that someday and hopefully we can do it well now.

            So I guess the big question is whether you think it'd be useful to essentially denormalize some of the exposure metadata into those files so it's useful for something other than the coaddition tasks. If that's the case, ExposureCatalog or a real database starts to make more sense. I think the LSST plan is not (thus far) to use a database for a lot of the provenance stuff (or, if we do, to make it write-only by the pipelines), but other people can comment much more usefully on that than I can.

            Show
            jbosch Jim Bosch added a comment - I hadn't really thought about extending it to a real database, but it's an interesting idea. I'd like to get Kian-Tat Lim and DAX opinions on that. Conceptually, I think this is just a per tract+patch+filter list of visit+ccd data IDs, right? I think the simplest way to represent that is a regular butlerized file (maybe just BaseCatalog rather then ExposureCatalog, if that's all it is). Mapping generic data IDs to a schema is probably a bit thorny right now, but I think we'll have to provide generic capabilities for that someday and hopefully we can do it well now. So I guess the big question is whether you think it'd be useful to essentially denormalize some of the exposure metadata into those files so it's useful for something other than the coaddition tasks. If that's the case, ExposureCatalog or a real database starts to make more sense. I think the LSST plan is not (thus far) to use a database for a lot of the provenance stuff (or, if we do, to make it write-only by the pipelines), but other people can comment much more usefully on that than I can.
            Hide
            tjenness Tim Jenness added a comment -

            This RFC closed last week. What's the consensus?

            Show
            tjenness Tim Jenness added a comment - This RFC closed last week. What's the consensus?
            Hide
            price Paul Price added a comment -

            I'm gonna do it.

            Show
            price Paul Price added a comment - I'm gonna do it.
            Hide
            tjenness Tim Jenness added a comment -

            Is DM-5766 the triggered work? (It's currently a Blocks relationship not "Is Triggering" – it can be both).

            Show
            tjenness Tim Jenness added a comment - Is DM-5766 the triggered work? (It's currently a Blocks relationship not "Is Triggering" – it can be both).
            Hide
            price Paul Price added a comment -

            Yes.

            Show
            price Paul Price added a comment - Yes.
            Hide
            tjenness Tim Jenness added a comment -

            Paul Price what's going on with this RFC? The triggered work closed as invalid. Has the RFC been withdrawn?

            Show
            tjenness Tim Jenness added a comment - Paul Price what's going on with this RFC? The triggered work closed as invalid. Has the RFC been withdrawn?
            Hide
            tjenness Tim Jenness added a comment -

            Paul Price please comment on the status of this RFC. It has no triggered work associated with it.

            Show
            tjenness Tim Jenness added a comment - Paul Price please comment on the status of this RFC. It has no triggered work associated with it.
            Hide
            price Paul Price added a comment -

            I think it belongs to Yusra AlSayyad now.

            Show
            price Paul Price added a comment - I think it belongs to Yusra AlSayyad now.
            Hide
            yusra Yusra AlSayyad added a comment - - edited

            Without reading the comment history too carefully, I can probably do this on DM-15236 which is the ticket to "refactor makeCoaddTempExp in prep for Gen 3 middleware."

            Show
            yusra Yusra AlSayyad added a comment - - edited Without reading the comment history too carefully, I can probably do this on DM-15236 which is the ticket to "refactor makeCoaddTempExp in prep for Gen 3 middleware."
            Hide
            tjenness Tim Jenness added a comment -

            Ok, I've attached that ticket.

            Show
            tjenness Tim Jenness added a comment - Ok, I've attached that ticket.
            Hide
            yusra Yusra AlSayyad added a comment -

            I just marked DM-15236 as invalid because the the work was actually done on DM-17028https://github.com/lsst/pipe_tasks/commit/62e839c2e8686beb4b9883a710c8e7fb8af9ab5b

            Jim Bosch's original comment, "I expect SuperTask will help with this in the future," has been realized.  There's still some work to be done this cycle to retire the Gen 2 version.  I'll attach the appropriate ticket when we get there. 

             

            Show
            yusra Yusra AlSayyad added a comment - I just marked DM-15236 as invalid because the the work was actually done on DM-17028 :  https://github.com/lsst/pipe_tasks/commit/62e839c2e8686beb4b9883a710c8e7fb8af9ab5b Jim Bosch 's original comment, "I expect SuperTask will help with this in the future," has been realized.  There's still some work to be done this cycle to retire the Gen 2 version.  I'll attach the appropriate ticket when we get there.   
            Hide
            tjenness Tim Jenness added a comment -

            Yusra AlSayyad RFCs that have no work to be done keep popping up at the DMCCB for discussion. If you need to do more work on this RFC implementation would it be possible to create a ticket for it (it doesn't have to cover all the work, just some work to allow the RFC to keep on hiding).

            Show
            tjenness Tim Jenness added a comment - Yusra AlSayyad RFCs that have no work to be done keep popping up at the DMCCB for discussion. If you need to do more work on this RFC implementation would it be possible to create a ticket for it (it doesn't have to cover all the work, just some work to allow the RFC to keep on hiding).
            Hide
            yusra Yusra AlSayyad added a comment -

            I'm comfortable calling this implemented now. I expect deprecation of the Gen 2 version this Fall. 

            Show
            yusra Yusra AlSayyad added a comment - I'm comfortable calling this implemented now. I expect deprecation of the Gen 2 version this Fall. 

              People

              Assignee:
              price Paul Price
              Reporter:
              price Paul Price
              Watchers:
              Andy Salnikov, Hsin-Fang Chiang, Jim Bosch, John Swinbank, Kian-Tat Lim, Paul Price, Russell Owen, Tim Jenness, Yusra AlSayyad
              Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:
                Planned End:

                  Jenkins

                  No builds found.