Request For Comments / RFC-648

Handling time in Gen3 middleware

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None

      Description

      The Gen3 middleware will enable the use of time in the query/selector used for quantum graph generation.  This requires that times be present in the registry database and that users be given a means of specifying times.

      There are four obvious candidates for the representation of times in the registry database:

      • Database internal DATETIME/TIMESTAMP fields, typically required to be UTC.  Allows use of database functions for handling times.
      • astropy.time internal format composed of two double-precision floating-point numbers representing a Julian Date, in either UTC or TAI timescales.
      • A single double-precision floating-point number representing an MJD, in UTC or TAI.
      • lsst.daf.base.DateTime internal format composed of a single 64-bit integer counting nanoseconds since the Unix epoch in either UTC or TAI timescales.
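
      As a rough check on two of these candidates, the sketch below computes the range of a signed 64-bit nanosecond counter and the spacing between adjacent double-precision values near a present-day MJD. The example MJD of 58000 is an arbitrary illustrative value, and only the Python standard library is used (math.ulp needs Python 3.9 or later).

```python
import math

NS_PER_DAY = 86_400 * 10**9
DAYS_PER_JULIAN_YEAR = 365.25

# Range of a signed 64-bit nanosecond counter, in years either side of its epoch.
int64_range_years = (2**63 - 1) / NS_PER_DAY / DAYS_PER_JULIAN_YEAR

# Spacing between adjacent float64 values near a present-day MJD, in seconds.
mjd_example = 58_000.0  # illustrative value (September 2017)
float_mjd_step_seconds = math.ulp(mjd_example) * 86_400

print(f"int64 nanoseconds span about +/-{int64_range_years:.0f} years")
print(f"float64 MJD step near {mjd_example:.0f}: {float_mjd_step_seconds * 1e9:.0f} ns")
```

      Under these assumptions a 64-bit nanosecond count reaches roughly 292 years either side of its epoch, while an MJD held in a single double resolves somewhat better than a microsecond.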

      Similarly, there are three candidates for human-specified time literals:

      • ISO8601 string (YYYY-MM-DDTHH:mm:ss.ssssssZ).
      • An MJD numeric literal, possibly with a prefix or suffix to indicate the type.
      • A nanosecond numeric literal, again possibly with a prefix/suffix.

      From an implementation standpoint, it is desirable for a time literal to be identifiable as such by the parser, and for all time literals to be translated into a single internal representation within the quantum graph generator.

      I think it is also desirable for the quantum graph selector expression to be close to ADQL/SQL rather than inventing a new language.  Our requirements say that user-facing times do not need to be TAI, but pipeline-internal times should be TAI.

      Note that the database representation can differ between the registry database and any metadata database published to the DAC and science users, although we may be able to reduce duplication by having those two be implemented using a single underlying set of database tables.

      I propose that the human-specified literals be either ISO8601 with an explicit timezone indicator or MJD (single-column) numeric, both in the UTC timescale.  If a type indicator is necessary for the latter, prefixing the number with the literal string "MJD" could be acceptable, though it deviates from SQL/ADQL.  I also expect these literal forms to be used in FITS headers where needed.  Literal nanoseconds are too user-hostile to be supported.
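
      A minimal sketch of how a parser might recognize the two proposed literal forms is shown here. The grammar is an assumption: the regular expressions, the function name parse_time_literal, and the "MJD" prefix handling are all hypothetical, and the standard-library datetime used for the result ignores leap seconds, so this only illustrates literal recognition, not a rigorous UTC conversion.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical literal grammar: an ISO 8601 timestamp with an explicit
# timezone indicator, or a number prefixed with "MJD".
ISO_RE = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d{1,6}))?(Z|[+-]\d{2}:\d{2})$"
)
MJD_RE = re.compile(r"^MJD\s*(\d+(?:\.\d*)?)$")

# MJD 40587.0 corresponds to 1970-01-01T00:00:00 UTC.
MJD_UNIX_EPOCH = 40587.0


def parse_time_literal(text: str) -> datetime:
    """Return a timezone-aware UTC datetime for a recognized time literal.

    Leap seconds are ignored (a limitation of the standard library).
    """
    text = text.strip()
    match = ISO_RE.match(text)
    if match:
        year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
        # Pad the fractional part to microseconds; absent means zero.
        micro = int((match.group(7) or "").ljust(6, "0"))
        zone = match.group(8)
        if zone == "Z":
            tz = timezone.utc
        else:
            sign = 1 if zone[0] == "+" else -1
            offset = timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
            tz = timezone(sign * offset)
        local = datetime(year, month, day, hour, minute, second, micro, tzinfo=tz)
        return local.astimezone(timezone.utc)
    match = MJD_RE.match(text)
    if match:
        days = float(match.group(1)) - MJD_UNIX_EPOCH
        return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(days=days)
    raise ValueError(f"not a recognized time literal: {text!r}")


print(parse_time_literal("2020-01-01T02:00:00+02:00"))
print(parse_time_literal("MJD58000.5"))
```

      A bare number such as 58000.5 is rejected in this sketch, which reflects the concern that an untyped numeric literal cannot be told apart from an ordinary number.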

      As long as the registry database is non-public, I propose that the database representation be either DB-native DATETIME in UTC or lsst.daf.base.DateTime integer nanoseconds, assuming that the DateTime class can be used and new time-handling/translation code is not needed.  The schema creator should choose between native and nanoseconds depending on the expected usage of the column.
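
      To illustrate the integer-nanosecond option, the sketch below stores times in a 64-bit integer column and selects rows by a half-open time range. It uses an in-memory SQLite database purely for convenience; the table name, column name, and values are hypothetical and do not reflect the actual registry schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and column names, for illustration only.
conn.execute(
    "CREATE TABLE exposure (id INTEGER PRIMARY KEY, begin_ns BIGINT NOT NULL)"
)
rows = [
    (1, 1_500_000_000_000_000_000),
    (2, 1_500_000_030_000_000_000),
    (3, 1_600_000_000_000_000_000),
]
conn.executemany("INSERT INTO exposure VALUES (?, ?)", rows)

# Select exposures that begin within a 60-second window [lo, hi).
lo = 1_500_000_000_000_000_000
hi = lo + 60 * 10**9
selected = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM exposure WHERE begin_ns >= ? AND begin_ns < ? ORDER BY id",
        (lo, hi),
    )
]
print(selected)
```

      Integer comparisons of this kind are exact, but the database's built-in date functions cannot be applied to such a column directly, which is the trade-off the schema creator would weigh against a native DATETIME.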

      Since DateTime provides conversions to and from UTC and MJD, it can be used as the internal representation within the parser and quantum graph generator.

            Activity

            tjenness Tim Jenness added a comment -

            I really don't want a daf_base dependency so could we use microsecond integers rather than nanosecond integers and use astropy.time? There is no scenario I can imagine that would be dealing with multiple datasets per nanosecond (and I'm sure that milliseconds would be fine).

            tjenness Tim Jenness added a comment -

            Kian-Tat Lim, I'm happy for this RFC to be adopted with an integer time if we use astropy and not daf_base. I'll leave it up to you to determine whether we really need nanoseconds in the registry rather than microseconds.

            ktl Kian-Tat Lim added a comment - - edited

            The following code is small and self-explanatory enough that it can be placed in daf_butler without needing a new package.

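            import astropy.time

            # Reference epoch: the Unix epoch date, interpreted in the TAI timescale.
            epoch = astropy.time.Time("1970-01-01T00:00:00", format="isot", scale="tai")


            def from_isot_utc(isotutc: str) -> int:
                """Convert an ISO 8601 UTC string to integer nanoseconds since the epoch."""
                moment = astropy.time.Time(isotutc, format="isot", scale="utc")
                return int(round((moment - epoch).to_value("sec") * 1e9))


            def to_isot_utc(nsecs: int) -> str:
                """Convert integer nanoseconds since the epoch to an ISO 8601 UTC string."""
                moment = astropy.time.TimeDelta(nsecs * 1e-9, format="sec") + epoch
                # Format the seconds field with six decimal places (microseconds).
                moment.precision = 6
                return moment.utc.isot + "Z"


            def from_mjd_tai(mjd: float) -> int:
                """Convert an MJD in the TAI timescale to integer nanoseconds since the epoch."""
                moment = astropy.time.Time(mjd, format="mjd", scale="tai")
                return int(round((moment - epoch).to_value("sec") * 1e9))


            def to_mjd_tai(nsecs: int) -> float:
                """Convert integer nanoseconds since the epoch to an MJD in the TAI timescale."""
                moment = astropy.time.TimeDelta(nsecs * 1e-9, format="sec") + epoch
                return moment.mjd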

            The results from this code are consistent with daf_base DateTime in the domain of interest, and they are consistent with DateTime's MJD conversion accuracy of 1 microsecond or better.
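
            The microsecond-level accuracy mentioned above can be illustrated with plain floating-point arithmetic. The sketch below assumes the elapsed time passes through a single double-precision value before being scaled to nanoseconds, as in the `to_value("sec") * 1e9` pattern; the elapsed time chosen is an arbitrary illustrative value, and `math.ulp` needs Python 3.9 or later.

```python
import math

# Illustrative elapsed time since 1970, in seconds, held in one float64.
elapsed_seconds = 1.5e9 + 0.123456789

# Same conversion pattern as the helpers above: scale to nanoseconds, round, cast.
nanoseconds = int(round(elapsed_seconds * 1e9))

# Gap between adjacent float64 values at this magnitude, in nanoseconds.
step_ns = math.ulp(elapsed_seconds * 1e9)

# The exact integer answer, for comparison with the float-derived one.
exact_ns = 1_500_000_000_123_456_789
error_ns = abs(nanoseconds - exact_ns)

print(f"float64 step: {step_ns:.0f} ns, conversion error: {error_ns} ns")
```

            At present-day magnitudes the resulting integers are spaced a few hundred nanoseconds apart, so values produced this way are stored as nanoseconds but are only meaningful to somewhat better than a microsecond.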

            I'm not sure why I was made the assignee for this RFC, but I will go ahead and adopt it, with the already-blocked ticket as the implementation.

            salnikov Andy Salnikov added a comment -

            Do we have a clear agreement on what units we should use for the integer representation in the database? I think Tim was saying that microseconds will work OK, but K-T's examples use nanoseconds. Could we summarize the decision unambiguously here?

             

            ktl Kian-Tat Lim added a comment -

            Please use nanoseconds in the database, but it should be expected that values from the user, and many values in, e.g., headers, are only accurate to the microsecond.


              People

              Assignee:
              ktl Kian-Tat Lim
              Reporter:
              ktl Kian-Tat Lim
              Watchers:
              Andy Salnikov, Christopher Stephens [X] (Inactive), Christopher Waters, Colin Slater, Eric Bellm, Fritz Mueller, Jim Bosch, John Parejko, Kian-Tat Lim, Russell Owen, Tim Jenness
