Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: daf_butler
- Story Points: 1
- Team: Architecture
- Urgent?: No
Description
It would be very helpful for the OODS if butler recorded ingest time as well as observation time. Storing ingest time and allowing queries such as "give me all datasets in this collection that were stored more than N days ago" would significantly streamline OODS data expiry. The returned refs could immediately be passed to pruneDatasets. Currently OODS works by going behind gen2 butler's back and deleting the files directly from datastore without involving butler.
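The expiry selection described above can be sketched as follows. This is only an illustration of the "stored more than N days ago" logic, not butler API: `DatasetRecord`, `expired_refs`, and the `ingest_time` field are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical stand-in for a dataset ref with a recorded ingest time."""

    dataset_id: int
    collection: str
    ingest_time: datetime  # assumed to be timezone-aware UTC


def expired_refs(records, collection, max_age_days, now=None):
    """Return the records in `collection` ingested more than `max_age_days` ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        r for r in records
        if r.collection == collection and r.ingest_time < cutoff
    ]
```

In a real deployment the filtering would presumably happen inside the registry query rather than in Python, and the result would be what gets handed to pruneDatasets.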
Activity
Field | Original Value | New Value |
---|---|---|
Labels | gen3-middleware | gen3-middleware gen3-registry-incompatibility |
Assignee | | Tim Jenness [ tjenness ] |
Team | Data Access and Database [ 10204 ] | Architecture [ 10304 ] |
Status | To Do [ 10001 ] | In Progress [ 3 ] |
Reviewers | | Jim Bosch [ jbosch ] |
Status | In Progress [ 3 ] | In Review [ 10004 ] |
Status | In Review [ 10004 ] | Reviewed [ 10101 ] |
Story Points | 4 | 1 |
Resolution | | Done [ 10000 ] |
Status | Reviewed [ 10101 ] | Done [ 10002 ] |
Jim Bosch, this seems like a low-hanging-fruit schema change that I should try to do.
I think we decided that this should go in registry to allow querying. It could also go in datastore, but that would only be useful if there were some behind-the-scenes datastore job that could delete artifacts without telling registry.
What I'm not sure about is which table the date should be attached to. Should it be a new table? If not, which existing table should get the new column? Is it the "dataset" table (https://github.com/lsst/daf_butler/blob/master/python/lsst/daf/butler/registry/datasets/byDimensions/tables.py#L200)? Should we let the database automatically fill in the timestamp?
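To make the last question concrete, here is a minimal sketch of a database-filled timestamp using SQLite from the Python standard library. It is not the daf_butler schema: the table layout, the `ingest_date` column name, and the `older_than` helper are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, heavily simplified "dataset" table. The database fills in
# ingest_date (UTC, "YYYY-MM-DD HH:MM:SS") when the insert does not supply it.
conn.execute(
    """
    CREATE TABLE dataset (
        id INTEGER PRIMARY KEY,
        collection TEXT NOT NULL,
        ingest_date TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
    """
)

# Row 1: no ingest_date given, so the default applies.
conn.execute("INSERT INTO dataset (collection) VALUES ('raw')")
# Row 2: an explicit, backdated timestamp to simulate an old ingest.
conn.execute(
    "INSERT INTO dataset (collection, ingest_date) "
    "VALUES ('raw', datetime('now', '-40 days'))"
)


def older_than(conn, collection, days):
    """Return ids of datasets in `collection` ingested more than `days` days ago."""
    rows = conn.execute(
        "SELECT id FROM dataset "
        "WHERE collection = ? AND ingest_date < datetime('now', ?) "
        "ORDER BY id",
        (collection, f"-{int(days)} days"),
    )
    return [row[0] for row in rows]
```

Letting the database supply the default keeps the timestamp out of client code, at the cost of relying on the database server's clock; whether that trade-off suits registry is exactly the open question here.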