This is a followup to
RFC-906 (updating the APDB schema), to clarify the large number of changes that result from it (as implemented on DM-37196). These changes will result a non-backwards compatible break once merged: datasets prior to the merge will likely not be able to be processed with code from after the merge, and many notebooks and other tools will not work without updates to their data models. As part of this RFC, I am asking to waive the deprecation period for API changes, as there doesn't seem to be a point of maintaining old APIs when the output data model has also changed.
The changes caused by
RFC-906 resulted in 14 PRs, 10 in lsst_distrib packages, and 4 in testdata and CI packages. All of the modified packages and their respective PRs are listed in a comment on DM-37196, where the implementation work for RFC-906 is occurring.
The most significant breaking change is that declination coordinate fields are now `dec` everywhere, instead of often (but not always!) being `decl` in our output products. Once
DM-37196 merges, there will be no more uses of `decl` anywhere in the science pipelines code. This includes the ExposureSummaryStats data product, isolated star catalogs, DiaSource/DiaObject/forced catalogs, visit summary tables, solar system catalogs generated from outside sources, and more. This change implements at least the spirit of RFC-863, though there are some details yet to be decided on that RFC.
The names of many difference imaging-related flux fields have also changed (e.g. PSFlux->psfFlux, TOTFlux->scienceFlux). See the sdm_schemas PR for a detailed breakdown of all such changes. In order to keep our code and data model self-consistent, I also changed the names of many diaCalculation plugins and functions in meas_base, so that for example, the code calculating "psfFluxMean" is named "WeightedMeanDiaPsfFlux", not "WeightedMeanDiaPsFlux". I am not attempting to deprecate the old methods or make this API change backwards compatible, given that the data model changes mean that code outside of lsst_distrib will have to be updated anyway.
I have replaced "filterName" with "band" to make the schemas more consistent with our butler field names, which takes care of
Any code outside of lsst_distrib (e.g. analysis_ap, notebooks, RubinTV, summit packages) that accessed any of the changed fields will have to be updated to work with data processed after the merge. It is up to the developers of those external tools whether they put in hooks to attempt to allow operating on data processed both before and after this change, or to make a clean break with the past.
What to do about analysis_tools is my biggest open question: there is an ongoing sprint on that package and it will be used for the upcoming bootcamp, but it does have 9 instances of the decl->dec change, and may have more as the sprint continues. I propose to merge
DM-37196 (once it is reviewed) on Tuesday, May 30th, just after the bootcamp is completed, so that these changes do not impact the bootcamp itself, and so that they are included in the end of May weekly. The bootcamp is working with earlier data, so it is in their interest to have analysis_tools maintain compatibility with that old data.
A thought: this could be an opportunity for us to start a new /repo/main butler repository on USDF, given that many of the datasets in the current repo will not work with analysis code going forward.
I will announce the merge and write up a post for Community about how "data processed beginning with weekly X is incompatible with previous processing".