Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-906

APDB column renaming

    XMLWordPrintable

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None

      Description

      RFC-807 proposed extensive renaming and reformatting of the column names of the major DRP tables. These changes were propagated into DP0.2. Due to the short deadline, comparable changes were not made to the column names in the DIA tables or in the APDB. Even within DP0.2, this has led to inconsistencies (e.g., ForcedSource.psfFlux, DIASource.psFlux, and DIASource.psfFlux_flag).

      This RFC is to make the DIA and Solar System column naming scheme consistent with the RFC-807 conventions as well as to make a few other changes that we have identified as necessary for AP.

      The sdm_schemas pull request for DM-37196 provides the full list of proposed changes:
      https://github.com/lsst/sdm_schemas/pull/107/commits

      Outline of the proposed changes:

      Consistency with RFC-807 and DP0.2:

      • psFlux → psfFlux
      • separate band with an underscore (e.g., u_psfFluxMean not uPsfFluxMean)
      • filtername → band

      Additions and corrections:

      • add apdb.yaml to the public-facing schema browser
      • correct units to be nJy rather than nanomaggy
      • add time_processed and time_withdrawn columns to DIASource and DIAForcedSource in APDB to support catchup processing and withdrawal of bad records (see DM-SST notes, https://confluence.lsstcorp.org/display/DM/2023-01-30+DM-SST+Agenda+and+Meeting+Notes)
      • add Visit, CcdVisit, and Instrument tables to APDB. Note that we are retaining the denormalized band column in DIASource and DIAForcedSource for user convenience.

      New renaming:

      • use ra, dec (not "decl") for coordinates as proposed in RFC-863
      • "dip" → "dipole"
      • DIAObject.radecTai → DIAObject.radecEpoch
      • totFlux → scienceFlux throughout
      • DiaSource.diffFlux → DiaSource.snapDiffFlux
      • DiaSource.midPointTai → DiaSource.midPointTime

      While for simplicity we have used apdb.yaml as the illustration for the proposed changes, when this RFC closes we will propagate these changes to imsim.yaml and (where necessary) hsc.yaml as well.

        Attachments

          Issue Links

            Activity

            Hide
            ebellm Eric Bellm added a comment -

            A discussion today highlighted a lack of clarity in this RFC: we are proposing to make the column name consistency adjustments in all DIA tables, including those for future DRP runs (as encapsulated in imsim.yaml). Otherwise the column name formats will remain inconsistent between DRP tables in future DRs, as they are in DP0.2.

            Show
            ebellm Eric Bellm added a comment - A discussion today highlighted a lack of clarity in this RFC: we are proposing to make the column name consistency adjustments in all DIA tables, including those for future DRP runs (as encapsulated in imsim.yaml ). Otherwise the column name formats will remain inconsistent between DRP tables in future DRs, as they are in DP0.2.
            Hide
            Parejkoj John Parejko added a comment -

            Responding to a comment from Meredith Rawls on DM-37196 (the implementation ticket), since it's a topic for this RFC-906.

            I am concerned about the change from midPointTai to midPointMjd. The former is a time scale, and the other is a time format. I recognize Gregory's assertion (both in an RFC comment and deep in a therein referenced Slack thread) that the times we report are in fact MJDs, but isn't it MJD in TAI?

            The original RFC proposed midPointTime which I think is much less of an eyebrow-raising change, and also gives us the flexibility to gracefully specify exactly what flavor (scale + format) of time we are reporting in documentation rather than hardwiring it in column/schema names. I worry that calling it midPointMjd might cause a user to infer this is MJD in UTC (not TAI) and do their science wrong, since UTC is the astropy default.

            Note this conversation on slack in February/March. The "MJD"ness is the important thing for users to be able to interpret this value (it's a double, not an SQL time). All of our reported times are TAI; MJD is not necessarily in TAI. We seem to have de-facto settled on MJD as our preferred time reporting format. As I said on that slack thread, I could be convinced to make that `midPointTime`, but we have expMidptMJD and obsStartMJD in the Visit and CcdVisit tables as precedent.

            Show
            Parejkoj John Parejko added a comment - Responding to a comment from Meredith Rawls on DM-37196 (the implementation ticket), since it's a topic for this RFC-906 . I am concerned about the change from midPointTai to midPointMjd. The former is a time scale, and the other is a time format. I recognize Gregory's assertion (both in an RFC comment and deep in a therein referenced Slack thread) that the times we report are in fact MJDs, but isn't it MJD in TAI? The original RFC proposed midPointTime which I think is much less of an eyebrow-raising change, and also gives us the flexibility to gracefully specify exactly what flavor (scale + format) of time we are reporting in documentation rather than hardwiring it in column/schema names. I worry that calling it midPointMjd might cause a user to infer this is MJD in UTC (not TAI) and do their science wrong, since UTC is the astropy default. Note this conversation on slack in February/March . The "MJD"ness is the important thing for users to be able to interpret this value (it's a double, not an SQL time). All of our reported times are TAI; MJD is not necessarily in TAI. We seem to have de-facto settled on MJD as our preferred time reporting format. As I said on that slack thread, I could be convinced to make that `midPointTime`, but we have expMidptMJD and obsStartMJD in the Visit and CcdVisit tables as precedent.
            Hide
            mjuric Mario Juric added a comment - - edited

            FWIW, Meredith Rawls brings up a good point...

            E.g., in solar system community, observation times are commonly sent in UTC (e.g. see https://www.minorplanetcenter.net/iau/info/OpticalObs.html) and orbits are given in TT (e.g., see https://www.minorplanetcenter.net/iau/info/MPOrbitFormat.html). And our orbits table will also have the epoch given as MJD but in TT (as is convention).

            Having TAI explicitly in the name of columns with time quantities would help with this class of mistakes – thinking TAI is UTC, or TT, etc. They're quite insidious to track down when they happen (we've had this bug with SDSS for about ~5 years until we noticed).

            Show
            mjuric Mario Juric added a comment - - edited FWIW, Meredith Rawls brings up a good point... E.g., in solar system community, observation times are commonly sent in UTC (e.g. see https://www.minorplanetcenter.net/iau/info/OpticalObs.html)  and orbits are given in TT (e.g., see https://www.minorplanetcenter.net/iau/info/MPOrbitFormat.html ). And our orbits table will also have the epoch given as MJD but in TT (as is convention). Having TAI explicitly in the name of columns with time quantities would help with this class of mistakes – thinking TAI is UTC, or TT, etc. They're quite insidious to track down when they happen (we've had this bug with SDSS for about ~5 years until we noticed).
            Hide
            Parejkoj John Parejko added a comment - - edited

            I've updated most of the double-based time fields to be explicitly MjdTai in their names, spelled out expressed as Modified Julian Date, International Atomic Time in the relevant docstrings, fixed the DiaSource.midPoint capitalization and changed Visit/CcdVisit expMidpt/expMidptMjdTai to match DiaSource as just midpoint:

            raDecTai->raDecMjdTai
            midPointMjd->midpointMjdTai
            obsStartMJD->objStartMjdTai
            expMidptMJD->midpointMjdTai
            expMidpt->midpoint -- an SQL timestamp
            

            See this sdm_schemas commit for the exact changes.

            This leaves unchanged fields like SSObject.discoverySubmissionDate and SSObject.firstObservationDate which are currently unused. It also doesn't change MPCORB.epoch which comes straight from the MPC database schema and I think we have to keep it matching that (even though it's TT, not TAI).

            Do we need to have duplicates of the old names of these columns in imsim.yaml/hsc.yaml, or are they basically unused in analysis at present? I don't see any uses of them other than the code in postprocess.py that writes them.

            Show
            Parejkoj John Parejko added a comment - - edited I've updated most of the double -based time fields to be explicitly MjdTai in their names, spelled out expressed as Modified Julian Date, International Atomic Time in the relevant docstrings, fixed the DiaSource.midPoint capitalization and changed Visit/CcdVisit expMidpt/expMidptMjdTai to match DiaSource as just midpoint : raDecTai->raDecMjdTai midPointMjd->midpointMjdTai obsStartMJD->objStartMjdTai expMidptMJD->midpointMjdTai expMidpt->midpoint -- an SQL timestamp See this sdm_schemas commit for the exact changes. This leaves unchanged fields like SSObject.discoverySubmissionDate and SSObject.firstObservationDate which are currently unused. It also doesn't change MPCORB.epoch which comes straight from the MPC database schema and I think we have to keep it matching that (even though it's TT, not TAI). Do we need to have duplicates of the old names of these columns in imsim.yaml/hsc.yaml , or are they basically unused in analysis at present? I don't see any uses of them other than the code in postprocess.py that writes them.
            Hide
            ctslater Colin Slater added a comment -

            I agree with John's names that specify TAI. I had previously hoped that we could just say that all MJDs are in TAI, but if the MPC is using TT then we really need to guard against mixing those up.

            Show
            ctslater Colin Slater added a comment - I agree with John's names that specify TAI. I had previously hoped that we could just say that all MJDs are in TAI, but if the MPC is using TT then we really need to guard against mixing those up.

              People

              Assignee:
              ebellm Eric Bellm
              Reporter:
              ebellm Eric Bellm
              Watchers:
              Andy Salnikov, Brianna Smart, Colin Slater, Eric Bellm, Fritz Mueller, Gregory Dubois-Felsmann, Ian Sullivan, John Parejko, Kian-Tat Lim, Leanne Guy, Mario Juric, Yusra AlSayyad
              Votes:
              0 Vote for this issue
              Watchers:
              12 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:
                Planned End:

                  Jenkins

                  No builds found.