Handling time in Gen3 middleware

Details

• Type: RFC
• Status: Implemented
• Resolution: Done
• Component/s:
• Labels: None

Description

The Gen3 middleware will enable the use of time in the query/selector used for quantum graph generation.  This requires times to be present in the registry database and means to be provided for users to specify times.

There are four obvious candidates for the representation of times in the registry database:

• Database internal DATETIME/TIMESTAMP fields, typically required to be UTC.  Allows use of database functions for handling times.
• astropy.time internal format composed of two double-precision floating-point numbers representing an MJD in either UTC or TAI timescales.
• A single double-precision floating-point number representing an MJD, in UTC or TAI.
• lsst.daf.base.DateTime internal format composed of a single 64-bit integer counting nanoseconds since the Unix epoch in either UTC or TAI timescales.
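
To make the trade-off between the single-double MJD and the 64-bit nanosecond count concrete, the following sketch estimates the time resolution of a double near a present-day MJD and the span covered by a signed 64-bit nanosecond counter. It uses only the Python standard library; the sample MJD of 58000 is an arbitrary illustrative value, not something fixed by this RFC.

```python
import math

SECONDS_PER_DAY = 86_400

# Illustrative MJD near the present day (an assumption for this sketch).
sample_mjd = 58000.0

# Spacing between adjacent doubles at this magnitude, converted to seconds.
mjd_resolution_s = math.ulp(sample_mjd) * SECONDS_PER_DAY

# Range of a signed 64-bit nanosecond counter on either side of its epoch.
int64_span_years = (2**63 / 1e9) / (SECONDS_PER_DAY * 365.25)

print(f"double MJD resolution: {mjd_resolution_s * 1e6:.2f} microseconds")
print(f"int64 nanosecond span: about {int64_span_years:.0f} years each way")
```

Under these assumptions a single double resolves somewhat better than one microsecond, while the integer form keeps full nanosecond resolution over a span of a few centuries around the epoch.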

Similarly, there are three candidates for human-specified time literals:

• ISO8601 string (YYYY-MM-DDTHH:mm:ss.ssssssZ).
• An MJD numeric literal, possibly with a prefix or suffix to indicate the type.
• A nanosecond numeric literal, again possibly with a prefix/suffix.

There are implementation desires for the time literal to be identifiable as such to the parser and for all time literals to be translated into a single internal representation within the quantum graph generator.

I think it is also desirable for the quantum graph selector expression to be close to ADQL/SQL rather than inventing a new language.  Our requirements say that user-facing times do not need to be TAI, but pipeline-internal times should be TAI.

Note that the database representation can differ between the registry database and any metadata database published to the DAC and science users, although we may be able to reduce duplication by having those two be implemented using a single underlying set of database tables.

I propose that the human-specified literals be either ISO8601 with an explicit timezone indicator or MJD (single-column) numeric, both in the UTC timescale.  If a type indicator is necessary for the latter, prefixing the number with the literal string "MJD" could be acceptable, though it deviates from SQL/ADQL.  I also expect these literal forms to be used in FITS headers where needed.  Literal nanoseconds are too user-hostile to be supported.
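
As a rough illustration of how a parser might recognize the two proposed literal forms and reduce them to one internal value, here is a minimal sketch using only the Python standard library. The function name `parse_time_literal`, the exact "MJD" prefix syntax, and the regular expression are hypothetical. The sketch also treats UTC as having no leap seconds and counts nanoseconds from the Unix epoch in UTC, which is a simplification and not the TAI handling that a real implementation would need.

```python
import re
from datetime import datetime, timezone
from fractions import Fraction

MJD_UNIX_EPOCH = 40587              # MJD of 1970-01-01T00:00:00 UTC
NS_PER_DAY = 86_400 * 10**9
_UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
# Hypothetical literal syntax: "MJD" followed by a decimal number.
_MJD_LITERAL = re.compile(r"^MJD\s*([0-9]+(?:\.[0-9]+)?)$")


def parse_time_literal(text: str) -> int:
    """Return integer nanoseconds since the Unix epoch (leap seconds ignored)."""
    text = text.strip()
    match = _MJD_LITERAL.match(text)
    if match:
        # Exact rational arithmetic avoids float rounding of the day fraction.
        days = Fraction(match.group(1)) - MJD_UNIX_EPOCH
        return round(days * NS_PER_DAY)
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    moment = datetime.fromisoformat(text)
    if moment.tzinfo is None:
        raise ValueError("time literal needs an explicit timezone indicator")
    delta = moment - _UNIX_EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10**9 + delta.microseconds * 1000


# Both literal forms name the same instant and map to the same integer.
print(parse_time_literal("2017-09-04T12:00:00Z"))
print(parse_time_literal("MJD58000.5"))
```

Requiring the prefix keeps an MJD literal distinguishable from an ordinary number, at the cost of the deviation from SQL/ADQL noted above.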

As long as the registry database is non-public, I propose that the database representation be either DB-native DATETIME in UTC or lsst.daf.base.DateTime integer nanoseconds, assuming that the DateTime class can be used and new time-handling/translation code is not needed.  The schema creator should choose between native and nanoseconds depending on the expected usage of the column.
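
The integer-nanosecond option can be sketched with an in-memory SQLite database from the Python standard library. The table name `exposure`, the column `begin_ns`, and the sample values are hypothetical and are not part of the registry schema. The point is only that a time selection reduces to plain integer comparisons once literals have been translated.

```python
import sqlite3

# Hypothetical table holding one integer-nanosecond time column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exposure (exposure_id INTEGER PRIMARY KEY, begin_ns INTEGER NOT NULL)"
)

start_ns = 1_504_526_400 * 10**9       # illustrative instant, in nanoseconds
rows = [
    (1, start_ns),                     # at the window start
    (2, start_ns + 30 * 10**9),        # 30 seconds later
    (3, start_ns + 86_400 * 10**9),    # one day later
]
conn.executemany("INSERT INTO exposure VALUES (?, ?)", rows)

# Select exposures that begin within a 60-second half-open window.
window_end_ns = start_ns + 60 * 10**9
selected = [
    row[0]
    for row in conn.execute(
        "SELECT exposure_id FROM exposure "
        "WHERE begin_ns >= ? AND begin_ns < ? ORDER BY exposure_id",
        (start_ns, window_end_ns),
    )
]
print(selected)  # [1, 2]
```

A native DATETIME column would instead allow the database's own date functions in the query, which is the usage consideration the schema creator has to weigh.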

Since DateTime provides conversions to and from UTC and MJD, it can be used as the internal representation within the parser and quantum graph generator.

Activity

Tim Jenness added a comment -

I really don't want a daf_base dependency so could we use microsecond integers rather than nanosecond integers and use astropy.time? There is no scenario I can imagine that would be dealing with multiple datasets per nanosecond (and I'm sure that milliseconds would be fine).

Tim Jenness added a comment -

Kian-Tat Lim I'm happy for this RFC to be adopted with an integer time if we use astropy and not daf_base. I'll leave it up to you to determine whether we really need nanoseconds in the registry rather than microseconds.

Kian-Tat Lim added a comment - edited

The following code is small and self-explanatory enough that it can be placed in daf_butler without needing a new package.

    import astropy.time

    epoch = astropy.time.Time("1970-01-01T00:00:00", format="isot", scale="tai")

    def from_isot_utc(isotutc: str) -> int:
        moment = astropy.time.Time(isotutc, format="isot", scale="utc")
        return int(round((moment - epoch).to_value("sec") * 1e9))

    def to_isot_utc(nsecs: int) -> str:
        moment = astropy.time.TimeDelta(nsecs * 1e-9, format="sec") + epoch
        moment.precision = 6
        return moment.utc.isot + "Z"

    def from_mjd_tai(mjd: float) -> int:
        moment = astropy.time.Time(mjd, format="mjd", scale="tai")
        return int(round((moment - epoch).to_value("sec") * 1e9))

    def to_mjd_tai(nsecs: int) -> float:
        moment = astropy.time.TimeDelta(nsecs * 1e-9, format="sec") + epoch
        moment.precision = 6
        return moment.mjd

The results from this code are consistent with daf_base DateTime in the domain of interest, and they are consistent with DateTime's MJD conversion accuracy of 1 microsecond or better.

I'm not sure why I was made the assignee for this RFC, but I will go ahead and adopt it, with the already-blocked ticket as the implementation.

Andy Salnikov added a comment -

Do we have a clear agreement on which units we should use for the integer representation in the database? I think Tim was saying that microseconds will work OK, but K-T's examples use nanoseconds. Could we summarize the decision unambiguously here?

Kian-Tat Lim added a comment -

Please use nanoseconds in the database, but it should be expected that values from the user, and many values in, e.g., headers, are only accurate to the microsecond.


People

• Assignee: Kian-Tat Lim
• Reporter: Kian-Tat Lim
• Watchers: Andy Salnikov, Christopher Stephens, Christopher Waters, Colin Slater, Eric Bellm, Fritz Mueller, Jim Bosch, John Parejko, Kian-Tat Lim, Russell Owen, Tim Jenness