Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-275

Add daf_ingest to lsst_distrib

    XMLWordPrintable

    Details

    • Type: RFC
    • Status: Withdrawn
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None
    • Location:
      This ticket.

      Description

      As part of RFC-57 (DM-1674) the ingestion code was removed from datarel and placed in daf_ingest. daf_ingest was never added back into lsst_distrib. This RFC proposes that we put it back. I am assuming this is not controversial as it's nominally a rebranding of something that was already in lsst_distrib.

      The one wrinkle is that daf_ingest does not build with Python 3 (see DM-8268).

        Attachments

          Issue Links

            Activity

            Hide
            ktl Kian-Tat Lim added a comment -

            +1.

            Show
            ktl Kian-Tat Lim added a comment - +1.
            Hide
            mfisherlevine Merlin Fisher-Levine added a comment -

            +2

            Show
            mfisherlevine Merlin Fisher-Levine added a comment - +2
            Hide
            swinbank John Swinbank added a comment -
            • How much maintenance effort is required to add daf_ingest to lsst_distrib? That is, beyond making in Python 3 compatible, does it need further work?
            • Do we have a pressing requirement for this package? If we add it, who will use it?
            • Maybe a silly question, but... what does it actually do? I mean, I can speculate based on the name, but there seems to be a total absence of useful documentation (which is something that should certainly be fixed before it goes into _distrib).
            Show
            swinbank John Swinbank added a comment - How much maintenance effort is required to add daf_ingest to lsst_distrib? That is, beyond making in Python 3 compatible, does it need further work? Do we have a pressing requirement for this package? If we add it, who will use it? Maybe a silly question, but... what does it actually do? I mean, I can speculate based on the name, but there seems to be a total absence of useful documentation (which is something that should certainly be fixed before it goes into _distrib).
            Hide
            mfisherlevine Merlin Fisher-Levine added a comment -

            I assumed from the name that this was necessary for ingest scripts to run. If that's correct, I would argue that this is essential and should be shipped with lsst_distrib.

            If it's not then I don't know what it's for and I withdraw any opinion on the subject (other than that the package has a misleading name!).

            Show
            mfisherlevine Merlin Fisher-Levine added a comment - I assumed from the name that this was necessary for ingest scripts to run. If that's correct, I would argue that this is essential and should be shipped with lsst_distrib. If it's not then I don't know what it's for and I withdraw any opinion on the subject (other than that the package has a misleading name!).
            Hide
            tjenness Tim Jenness added a comment -

            John Swinbank it passes tests and there isn't much code in there so it may not be a large maintenance effort. As to the user, DM-5766 implies that daf_ingest is needed.

            As for "what does it do", I admit I am guilty of trying to clean up a loose end from RFC-57. The lack of documentation is a problem. The tests are for ingesting a catalog and for indexing exposures. Fritz Mueller I guess one outcome of this discussion is that you need to find someone to add some basic documentation explaining the package.

            If DM-5766 is a long way off or doesn't really need daf_ingest then I'm happy to let it drop for now. If DM-5766 does not need daf_ingest then please clarify that ticket.

            As for the name, it was chosen based on the datarel code used to ingest data releases.

            Show
            tjenness Tim Jenness added a comment - John Swinbank it passes tests and there isn't much code in there so it may not be a large maintenance effort. As to the user, DM-5766 implies that daf_ingest is needed. As for "what does it do", I admit I am guilty of trying to clean up a loose end from RFC-57 . The lack of documentation is a problem. The tests are for ingesting a catalog and for indexing exposures. Fritz Mueller I guess one outcome of this discussion is that you need to find someone to add some basic documentation explaining the package. If DM-5766 is a long way off or doesn't really need daf_ingest then I'm happy to let it drop for now. If DM-5766 does not need daf_ingest then please clarify that ticket. As for the name, it was chosen based on the datarel code used to ingest data releases.
            Hide
            tjenness Tim Jenness added a comment -

            See also this confluence page: https://confluence.lsstcorp.org/display/SQRE/Bulge+Survey+Processing where daf_ingest was used and a comment was made that it is not yet part of lsst_distrib.

            Show
            tjenness Tim Jenness added a comment - See also this confluence page: https://confluence.lsstcorp.org/display/SQRE/Bulge+Survey+Processing where daf_ingest was used and a comment was made that it is not yet part of lsst_distrib .
            Hide
            tjenness Tim Jenness added a comment -

            Also, the rename of the ingestion parts of datarel to daf_ingest was mentioned in the stack history page: https://confluence.lsstcorp.org/display/DM/DM+Stack+Package+History (making it a bit odd that the rename lost that code from lsst_distrib).

            daf_ingest is also mentioned in our "stack tour" at https://confluence.lsstcorp.org/display/LSWUG/Tour+of+the+Software+Stack

            Show
            tjenness Tim Jenness added a comment - Also, the rename of the ingestion parts of datarel to daf_ingest was mentioned in the stack history page: https://confluence.lsstcorp.org/display/DM/DM+Stack+Package+History (making it a bit odd that the rename lost that code from lsst_distrib ). daf_ingest is also mentioned in our "stack tour" at https://confluence.lsstcorp.org/display/LSWUG/Tour+of+the+Software+Stack
            Hide
            tjenness Tim Jenness added a comment -
            Show
            tjenness Tim Jenness added a comment - See also this community post: https://community.lsst.org/t/database-ingest-of-exposure-metadata/481
            Hide
            mfisherlevine Merlin Fisher-Levine added a comment - - edited

            -1 (=0 net from me)

            Show
            mfisherlevine Merlin Fisher-Levine added a comment - - edited -1 (=0 net from me)
            Hide
            swinbank John Swinbank added a comment -

            DM-5766 is a good argument; I'd forgotten that daf_ingest was a prerequisite for that. I think we should encourage the relevant product owner (hi, Jim Bosch) to comment on its relative priority, but I do think that the spatial indexing is likely something we should carefully preserve.

            I'm less clear about the catalogue ingestion code. I got the impression (possibly mistakenly) when I talked to Fritz Mueller about database ingestion a couple of months ago that this code wasn't really on his radar, and my guess was that this code had rotted. The major driver from SciPi will likely be ingesting results from upcoming ... data goodness(? QA? science validation? I'm lost) processing: I think it's a DAX call whether or not this is the appropriate code to use for that.

            Show
            swinbank John Swinbank added a comment - DM-5766 is a good argument; I'd forgotten that daf_ingest was a prerequisite for that. I think we should encourage the relevant product owner (hi, Jim Bosch ) to comment on its relative priority, but I do think that the spatial indexing is likely something we should carefully preserve. I'm less clear about the catalogue ingestion code. I got the impression (possibly mistakenly) when I talked to Fritz Mueller about database ingestion a couple of months ago that this code wasn't really on his radar, and my guess was that this code had rotted. The major driver from SciPi will likely be ingesting results from upcoming ... data goodness(? QA? science validation? I'm lost) processing: I think it's a DAX call whether or not this is the appropriate code to use for that.
            Hide
            afausti Angelo Fausti added a comment -

            I think https://community.lsst.org/t/unified-exposure-metadata/1214 is also relevant, as far as I know daf_ingest is not ingesting the exposure metadata and a unified metadata as implemented in DM-5503 was a requirement for that.

            Show
            afausti Angelo Fausti added a comment - I think https://community.lsst.org/t/unified-exposure-metadata/1214 is also relevant, as far as I know daf_ingest is not ingesting the exposure metadata and a unified metadata as implemented in DM-5503 was a requirement for that.
            Hide
            price Paul Price added a comment -

            I hope to get to DM-5766 soon. It would be good to get it done while Yusra AlSayyad is also playing with the warping.

            Show
            price Paul Price added a comment - I hope to get to DM-5766 soon. It would be good to get it done while Yusra AlSayyad is also playing with the warping.
            Hide
            jbosch Jim Bosch added a comment - - edited

            DM-5766 adds very useful functionality I'd love to have in the stack soon. I imagine implementing that will require adding daf_ingest as a dependency of pipe_tasks, which would bring it in as a dependency of lsst_apps, not just datarel. It looks like that will bring in sphgeom as well, which I don't anticipate adding any problems.

            The actual ingest-processing-to-SQL-databases code has probably bitrotted quite a bit and I don't think it has ever been fully integrated with the rest of the stack (at least not for modern pipeline outputs; it's possible it was used to ingest the old Stripe 82 processing into the PDAC database). I don't think we need that code right now (at least in Science Pipelines), but it's a very important piece of the puzzle in integrating Science Pipelines with DAX/SUIT in future incarnations of the PDAC.

            I don't see a clear gain or harm from adding daf_ingest to datarel earlier than DM-5766 (it's just unnecessary work vs. letting part of RFC-57 linger).

            Show
            jbosch Jim Bosch added a comment - - edited DM-5766 adds very useful functionality I'd love to have in the stack soon. I imagine implementing that will require adding daf_ingest as a dependency of pipe_tasks , which would bring it in as a dependency of lsst_apps , not just datarel . It looks like that will bring in sphgeom as well, which I don't anticipate adding any problems. The actual ingest-processing-to-SQL-databases code has probably bitrotted quite a bit and I don't think it has ever been fully integrated with the rest of the stack (at least not for modern pipeline outputs; it's possible it was used to ingest the old Stripe 82 processing into the PDAC database). I don't think we need that code right now (at least in Science Pipelines), but it's a very important piece of the puzzle in integrating Science Pipelines with DAX/SUIT in future incarnations of the PDAC. I don't see a clear gain or harm from adding daf_ingest to datarel earlier than DM-5766 (it's just unnecessary work vs. letting part of RFC-57 linger).
            Hide
            swinbank John Swinbank added a comment -

            Making pipe_tasks depend on daf_ingest because the former is actively using functionality provided by the latter certainly sits better with me than adding the dependency to _distrib on the basis that it might be useful someday.

            I'd prefer not to be adding dependencies which pull in the likely-bitrotted table-ingest code, but I don't think it's worth either trying to veto the dependency or to split daf_ingest into multiple packages on that basis.

            Show
            swinbank John Swinbank added a comment - Making pipe_tasks depend on daf_ingest because the former is actively using functionality provided by the latter certainly sits better with me than adding the dependency to _distrib on the basis that it might be useful someday. I'd prefer not to be adding dependencies which pull in the likely-bitrotted table-ingest code, but I don't think it's worth either trying to veto the dependency or to split daf_ingest into multiple packages on that basis.
            Hide
            tjenness Tim Jenness added a comment -

            I will withdraw this RFC on the basis that DM-5766 will pull daf_ingest into lsst_apps when the time is right and there is no need in the short term to add it to lsst_distrib just because the code from datarel used to be in lsst_distrib.

            I do not believe that DM-5766 will require an RFC to bring in daf_ingest based on the discussion in this RFC. If you feel that it does require one we can reopen this RFC at the time.

            Show
            tjenness Tim Jenness added a comment - I will withdraw this RFC on the basis that DM-5766 will pull daf_ingest into lsst_apps when the time is right and there is no need in the short term to add it to lsst_distrib just because the code from datarel used to be in lsst_distrib . I do not believe that DM-5766 will require an RFC to bring in daf_ingest based on the discussion in this RFC. If you feel that it does require one we can reopen this RFC at the time.

              People

              Assignee:
              tjenness Tim Jenness
              Reporter:
              tjenness Tim Jenness
              Watchers:
              Angelo Fausti, Fritz Mueller, Hsin-Fang Chiang, Jim Bosch, John Swinbank, Kian-Tat Lim, Paul Price, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:
                Planned End:

                  Jenkins

                  No builds found.