Details
-
Type: RFC
-
Status: Adopted
-
Resolution: Unresolved
-
Component/s: DM
-
Labels: None
Description
Several times in the past few years, we have introduced problems where new versions of the stack cannot read data processed with older versions of the stack (e.g., DM-10302, DM-23584). This is partly because we lack tests that read old data with new versions of the stack; I believe it is also because a "guarantee" that old data remain readable is not explicitly stated in the dev guide.
This RFC proposes to remedy these problems by (a) making it explicit that we will support reading some subset of old data; and (b) adding a new repo (or repos) that will be tested regularly in Jenkins. These tests should be lightweight enough to run with every Jenkins run. This repo will contain a "blessed" set of older data. At a minimum, these data should include samples of the data that are available in /datasets on lsst-dev. These data have already been "blessed" via the RFC process as worthy of being generally publicized to stack developers. This RFC would make the implicit promise that these data are readable with future versions of the stack explicit and tested.
Specifically, as a straw man, I propose:
- Two repos, backcomp and testdata_backcomp (yes, these are horrible names; this is a straw man).
- The testdata_backcomp repo would be setup optional and LFS-backed, with samples of files from HSC RC2 processing, ImSim DC2 processing, DECam processing, etc. (if it is in /datasets, it should be included unless there is a reason not to).
- These sample files will include something like a single example of raw, flat, dark, bias, fringe, calexp, calexp_background, src, skyCorr, deepCoadd_calexp, deepCoadd_meas for each obs set / generation of processing that we support.
- The test repo will have very basic tests: load a butler, get various datasets, and perhaps do a minimum of manipulation (realize a background; check the coadd input table; load PSFs and WCSs).
- We can start by adding HSC RC2, ImSim DC2, and DECam, and expand from there.
- It will be the responsibility of the developer adding new blessed data to /datasets to add a suitable sample to these repos to ensure that these data are readable in the future. This should hopefully be easy, and also worthwhile because spending a few minutes up front to add data here will save a lot of pain further on.
- The decision to remove old datasets from this testing suite, explicitly breaking backwards compatibility promises, should have an RFC and an associated community post.
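As a rough illustration of the "very basic tests" described above, the following sketch loops over a list of sample datasets and records any that fail to load. It is not part of the proposal: it only assumes a butler-like object with a `get(dataset_type, data_id)` method, and `find_unreadable`, `InMemoryButler`, and the `visit`/`ccd` data ID keys are hypothetical names introduced here so the example runs without the stack installed.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Dataset types named in the proposal: one sample of each per supported
# obs set / generation of processing.
SAMPLE_DATASET_TYPES = [
    "raw", "flat", "dark", "bias", "fringe",
    "calexp", "calexp_background", "src", "skyCorr",
    "deepCoadd_calexp", "deepCoadd_meas",
]

Sample = Tuple[str, Dict[str, Any]]
Failure = Tuple[str, Dict[str, Any], str]


def find_unreadable(butler: Any, samples: Iterable[Sample]) -> List[Failure]:
    """Try to load every sample; return the ones that could not be read.

    ``butler`` is assumed (not guaranteed) to expose
    ``get(dataset_type, data_id)``; any exception counts as unreadable.
    """
    failures: List[Failure] = []
    for dataset_type, data_id in samples:
        try:
            loaded = butler.get(dataset_type, data_id)
        except Exception as exc:  # any read error is a compatibility break
            failures.append((dataset_type, data_id, f"{type(exc).__name__}: {exc}"))
            continue
        if loaded is None:
            failures.append((dataset_type, data_id, "get() returned None"))
    return failures


class InMemoryButler:
    """Hypothetical stand-in for a real butler, used only for this sketch."""

    def __init__(self, contents: Dict[Tuple[str, Tuple], Any]) -> None:
        self._contents = contents

    def get(self, dataset_type: str, data_id: Dict[str, Any]) -> Any:
        # Raises KeyError when the requested sample is missing.
        return self._contents[(dataset_type, tuple(sorted(data_id.items())))]


if __name__ == "__main__":
    data_id = {"visit": 1, "ccd": 2}  # hypothetical data ID keys
    key = tuple(sorted(data_id.items()))
    butler = InMemoryButler({("calexp", key): object()})
    for failure in find_unreadable(butler, [("calexp", data_id), ("src", data_id)]):
        print("unreadable:", failure)
```

In a real test package the stand-in would be replaced by a butler pointed at the sample repository, and the test would simply assert that the returned list is empty. Collecting failures rather than stopping at the first one keeps a single Jenkins run informative when several dataset types break at once.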
Issue Links
- is triggering
-
DM-29623 Document detailed policies for data backwards compatibility
- To Do
-
DM-29624 Stand up initial Gen3 data backwards compatibility test packages
- To Do
- relates to
-
DM-36240 Setup repository demonstrating we can read historical files
- In Progress
-
DM-36425 Discuss purpose of tables produced in postprocess.py
- Done
Current Gen3 development includes the concept that changes will be accompanied by a utility to migrate existing repositories when the need arises. Once Gen3 is accepted (and Gen2 has been fully deprecated), updates to Gen3 will fall under the Change Control Board (CCB). Rather than guaranteeing that DM will not "break" existing repositories, I think it should be left to the CCB to decide whether changes that would "break" existing repos should be allowed (i.e., in some unforeseen case, developers would have to demonstrate that providing migration support is so onerous that a change requiring users to abandon existing repos wholesale should be allowed).