Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-810

Clarifications concerning synthetic source injection changes

    XMLWordPrintable

    Details

    • Type: RFC
    • Status: Withdrawn
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None

      Description

      Seeking a couple clarifications about synthetic source injection:

      In RFC-764 we adopted a plan to move code associated with synthetic source injection into a new package. We weren't super clear about if tasks or just low-level code should be moved though.  I propose to move only low-level code to the new package, e.g., GalSim stuff, matching stuff, etc. Associated driver tasks would for now remain in pipe_tasks, ap_pipe, ....

      Some relevant discussion from #dm-source-injection: https://lsstc.slack.com/archives/C01LBE9CPBR/p1634157997004200

       

      The other piece we didn't fully address in the original RFC was renaming "fakes" -> "synthetic sources". We did decide that the new package name would be "source_injection" at least in part to avoid using "fake", but there are many references in the science pipelines that currently refer instead to fakes. E.g.,

      pipe/tasks/insertFakes.py
      pipe/tasks/matchFakes.py
      ap/pipe/createApFakes.py

      just to pick a few.

      For completeness, I propose to rename all of these with "Fakes" -> "SyntheticSources" (or similar).

      I also propose to rename the "FAKE" mask plane to "SYNTHETIC".

      Again, some relevant discussion from #dm-source-injection: https://lsstc.slack.com/archives/C01LBE9CPBR/p1634158723010200

       

        Attachments

          Issue Links

            Activity

            Hide
            ebellm Eric Bellm added a comment -

            I am opposed to renaming "fakes" to "synthetic sources." The
            latter adds no new semantic information and is substantially longer
            to type and say. "Fake source injection" is commonly used in the
            literature and will not cause any confusion.

            Show
            ebellm Eric Bellm added a comment - I am opposed to renaming "fakes" to "synthetic sources." The latter adds no new semantic information and is substantially longer to type and say. "Fake source injection" is commonly used in the literature and will not cause any confusion.
            Hide
            lskelvin Lee Kelvin added a comment - - edited

            I'm in favour of switching away from use of the name 'fake', and any derivatives. Beyond any negative connotations that the term fake may imply, it also does not encompass the full capability of the package. For example, groups I have been working with have been experimenting with injecting real galaxies (e.g., suitably degraded HST imaging) into HSC data.

            For some time now, both DM and DESC have been using the phrase 'synthetic source injection', which seems to have caught on. I'd be in favour of using 'synthetic' for the dataset type (e.g., synthetic_calexp / synthetic_deepCoadd), and SYNTHETIC for the mask plane name.

            As an alternative, I think 'injected' also works well (injected_calexp / injected_deepCoadd, INJECTED mask plane), although I don't have a strong preference between injected and synthetic.

            Show
            lskelvin Lee Kelvin added a comment - - edited I'm in favour of switching away from use of the name 'fake', and any derivatives. Beyond any negative connotations that the term fake may imply, it also does not encompass the full capability of the package. For example, groups I have been working with have been experimenting with injecting real galaxies (e.g., suitably degraded HST imaging) into HSC data. For some time now, both DM and DESC have been using the phrase 'synthetic source injection', which seems to have caught on. I'd be in favour of using 'synthetic' for the dataset type (e.g., synthetic_calexp / synthetic_deepCoadd ), and SYNTHETIC for the mask plane name. As an alternative, I think 'injected' also works well ( injected_calexp / injected_deepCoadd , INJECTED mask plane), although I don't have a strong preference between injected and synthetic.
            Hide
            ebellm Eric Bellm added a comment -

            Out of curiosity, I did an ADS fulltext search on the following (quoted) terms:

            "synthetic source": 392 results
            "synthetic source injection": 15 results
            "fake source": 545 results
            "fake source injection": 45 results

            Whether or not a change to "synthetic sources" is accepted, I like Lee's suggestion of "injected" as the dataset/mask name--I think the casual science user will find it more intuitive.

            Show
            ebellm Eric Bellm added a comment - Out of curiosity, I did an ADS fulltext search on the following (quoted) terms: "synthetic source": 392 results "synthetic source injection": 15 results "fake source": 545 results "fake source injection": 45 results Whether or not a change to "synthetic sources" is accepted, I like Lee's suggestion of "injected" as the dataset/mask name--I think the casual science user will find it more intuitive.
            Hide
            mrawls Meredith Rawls added a comment -

            This is a tricky one. In principle, I like "synthetic" a bit better than "fake," and "injected" a bit less. "Fake" has significant presence in our codebase already, and I'm not sure a switch is worth it. I can go either way, so my real vote here is for consistency.

            We should choose one word and use it for everything, including the mask plane. Fake, synthetic, injected, potato, whatever. (To respond to Lee's point, I don't much care if you're injecting "real" galaxies, the result is still fake/synthetic images/data.)

            I suggest tying the implementation of this RFC to DM-30210. That ticket is about fixing how the contract-checker expands names of data products such as coadds (e.g., deepCoadd vs. goodSeeingCoadd), but similar contract-checking issues may apply when data products are prefixed with, e.g., "fakes_". I am wary to embark on any kind of data product naming change throughout the stack before this is addressed.

            Show
            mrawls Meredith Rawls added a comment - This is a tricky one. In principle, I like "synthetic" a bit better than "fake," and "injected" a bit less. "Fake" has significant presence in our codebase already, and I'm not sure a switch is worth it. I can go either way, so my real vote here is for consistency. We should choose one word and use it for everything, including the mask plane. Fake, synthetic, injected, potato, whatever. (To respond to Lee's point, I don't much care if you're injecting "real" galaxies, the result is still fake/synthetic images/data.) I suggest tying the implementation of this RFC to DM-30210 . That ticket is about fixing how the contract-checker expands names of data products such as coadds (e.g., deepCoadd vs. goodSeeingCoadd), but similar contract-checking issues may apply when data products are prefixed with, e.g., "fakes_". I am wary to embark on any kind of data product naming change throughout the stack before this is addressed.
            Hide
            jmeyers314 Joshua Meyers added a comment -

            Thanks for the discussion Eric, Lee, and Meredith. I think my vote is mostly also in the direction of consistency and not towards a particular name (my impression was we were headed towards "synthetic" already, but maybe that's not actually clear). Waiting on the outcome of DM-30210 sounds reasonable to me.

            With that in mind, and also because this should probably be decided in a data management context (and I'm actually only here in a DESC context at the moment) I propose to drop the name change proposal without any particular prejudice.

            Since the "only move low-level code to the source_injection package" was ambiguously okay from RFC-764 and hasn't seemed to elicit any particular response here I think it's okay to move forward with that part of the plan.

            So I'll mark this RFC as withdrawn in the next day or two unless I hear any more discussion.

             

            Show
            jmeyers314 Joshua Meyers added a comment - Thanks for the discussion Eric, Lee, and Meredith. I think my vote is mostly also in the direction of consistency and not towards a particular name (my impression was we were headed towards "synthetic" already, but maybe that's not actually clear). Waiting on the outcome of DM-30210 sounds reasonable to me. With that in mind, and also because this should probably be decided in a data management context (and I'm actually only here in a DESC context at the moment) I propose to drop the name change proposal without any particular prejudice. Since the "only move low-level code to the source_injection package" was ambiguously okay from RFC-764 and hasn't seemed to elicit any particular response here I think it's okay to move forward with that part of the plan. So I'll mark this RFC as withdrawn in the next day or two unless I hear any more discussion.  

              People

              Assignee:
              jmeyers314 Joshua Meyers
              Reporter:
              jmeyers314 Joshua Meyers
              Watchers:
              Eric Bellm, Joshua Meyers, Lee Kelvin, Meredith Rawls
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:
                Planned End:

                  Jenkins

                  No builds found.