Convert old refcats to nJy for RFC-549

XMLWordPrintable

Details

• Type: RFC
• Status: Implemented
• Resolution: Done
• Component/s:
• Labels:
None

Description

As fallout from RFC-549, we have to transition our existing reference catalogs from having fluxes in Jansky (most only implicitly: most don't have units on their flux fields at all!) to nanoJansky. We do not have versioning for reference catalogs (the afw.table version has a different purpose and is not propagated to the butler), so we have no way to identify "old" refcats other than attempting to guess from the various fields they do or don't contain.

I propose to do this by writing a script that will convert refcat files to have nJy units and having LoadReferenceObjectsTask.loadSkyCircle raise an exception pointing to that script if it encounters a refcat that has anything other than "nJy" for flux units. This will greatly simplify this transition and will allow us to also fix the fluxSigma->fluxErr values in those refcats, while maintaining the old refcats in-place for older versions of the stack.

For example, on lsst-dev, we would copy each of the existing directories in /datasets/refcats/htm, appending "-nJy"; run the script on those new directories; and update the obs package config defaults to point to the new refcats. Those old refcats could then be kept in place until at least v18.

All newly created refcats will have fluxes in units of nJy.

This would be made much easier if we could drop support for astrometry.net reference catalogs at the same time (RFC-562): a.net refcats are persisted in a way that makes them much harder to modify in-place. I believe the only important a.net refcat that we are still using is in testdata_jointcal, and I have a plan for dealing with it soon.

Activity

Hide
Paul Price added a comment -

But a config file is a tiny fraction of the size of an entire catalog!

I am willing to believe there's an argument to dump the old catalogs, but that's not it.

Show
Paul Price added a comment - But a config file is a tiny fraction of the size of an entire catalog! I am willing to believe there's an argument to dump the old catalogs, but that's not it.
Hide
Simon Krughoff added a comment - - edited

My argument against an adapter layer was the dual work of creating that layer and then removing it. I also know, from experience, that removing code seems to be harder than adding it. It seemed at the time I was discussing with John Parejko that translating the catalogs we know about and providing tools for users to do so on their own has the least overall overhead. We need to do that work eventually anyway.

I understand the hesitance, but it really does seem cleaner to me to make v17 be the last version guaranteed to support the old units.

I do think this basically adheres to our code deprecation which as I understand is:
1) announce deprecation
2) make a release without deprecation
3) work toward deprecation
4) fully deprecate in the following release.

Show
Simon Krughoff added a comment - - edited My argument against an adapter layer was the dual work of creating that layer and then removing it. I also know, from experience, that removing code seems to be harder than adding it. It seemed at the time I was discussing with John Parejko that translating the catalogs we know about and providing tools for users to do so on their own has the least overall overhead. We need to do that work eventually anyway. I understand the hesitance, but it really does seem cleaner to me to make v17 be the last version guaranteed to support the old units. I do think this basically adheres to our code deprecation which as I understand is: 1) announce deprecation 2) make a release without deprecation 3) work toward deprecation 4) fully deprecate in the following release.
Hide
Colin Slater added a comment -

I don't think flag day counts as deprecation process; the key is having an overlap period where both methods work.

Even if we thought we could get away with a disruptive change this time, I think it would be much better to use this as a chance to institute a file versioning process. We're going to want to make more changes like this in the future, and it will be a problem if we always feel like a big-bang change is the only way to do it.

I would be much happier with something like:

• We change the refcat creation code to write nJy in the units fields, and add an entry in the refcat config (the file that stores things like the HTM level and indexing method) that specifies this is refcat format 1.
• The refcat loader and tasks that call it are changed use nJy, and if the loader reads a refcat without the version string it assumes the catalog is in Jy (refcat format 0), outputs appropriate warnings to the user, and does the conversion to output nJy to the calling task.
• After this is merged, we can convert our on-disk refcats. This doesn't need to be done urgently, since old and new formats are supported.
• After ~6 months or so, we remove support for refcat format 0. A second RFC is not required, but we will need to notify users.
Show
Colin Slater added a comment - I don't think flag day counts as deprecation process; the key is having an overlap period where both methods work. Even if we thought we could get away with a disruptive change this time, I think it would be much better to use this as a chance to institute a file versioning process. We're going to want to make more changes like this in the future, and it will be a problem if we always feel like a big-bang change is the only way to do it. I would be much happier with something like: We change the refcat creation code to write nJy in the units fields, and add an entry in the refcat config (the file that stores things like the HTM level and indexing method) that specifies this is refcat format 1. The refcat loader and tasks that call it are changed use nJy, and if the loader reads a refcat without the version string it assumes the catalog is in Jy (refcat format 0), outputs appropriate warnings to the user, and does the conversion to output nJy to the calling task. After this is merged, we can convert our on-disk refcats. This doesn't need to be done urgently, since old and new formats are supported. After ~6 months or so, we remove support for refcat format 0. A second RFC is not required, but we will need to notify users.
Hide
John Swinbank added a comment -

I don't think flag day counts as deprecation process; the key is having an overlap period where both methods work.

I broadly agree with that. My understanding is the request for a “flag day” transition is effectively a request to circumvent the deprecation process. I think that's permissible, as long as the DM-CCB agrees.

If we were to decide to implement an adapter, I don't see an argument for not following the procedure in the Developer Guide. Specifically, that would mean reading (and inferring units for) version “0“ catalogs while issuing a deprecation warnings for (at least) one release, then dropping support for those catalogs on the next major release.

We're within a gnat's crotchet of the 17.0 release and those deprecation warnings are not currently being issued. I don't think it's plausible to suggest that we'll hold the release for them. That means that we'd either have to include deprecation notices in 17.1 (or 18.0) and then not drop support for version 0 catalogs until after the 18.0 (or 19.0) release.

Given that timeline, I personally prefer making a clean break and just dropping the legacy code. However, I'm sympathetic to Colin's point; if he or others believe that will cause major inconvenience to external users, then I think that's a serious concern.

Show
John Swinbank added a comment - I don't think flag day counts as deprecation process; the key is having an overlap period where both methods work. I broadly agree with that. My understanding is the request for a “flag day” transition is effectively a request to circumvent the deprecation process. I think that's permissible, as long as the DM-CCB agrees. If we were to decide to implement an adapter, I don't see an argument for not following the procedure in the Developer Guide . Specifically, that would mean reading (and inferring units for) version “0“ catalogs while issuing a deprecation warnings for (at least) one release, then dropping support for those catalogs on the next major release. We're within a gnat's crotchet of the 17.0 release and those deprecation warnings are not currently being issued. I don't think it's plausible to suggest that we'll hold the release for them. That means that we'd either have to include deprecation notices in 17.1 (or 18.0) and then not drop support for version 0 catalogs until after the 18.0 (or 19.0) release. Given that timeline, I personally prefer making a clean break and just dropping the legacy code. However, I'm sympathetic to Colin's point; if he or others believe that will cause major inconvenience to external users, then I think that's a serious concern.
Hide
John Parejko added a comment -

To try to move this along, I've implemented some "check and convert" code in loadReferenceObjectsTask, and refactored the conversion script I wrote to use it. It still needs some tests of the Indexed ref loader, but the commandline code behaves as expected. This code does not use a version identifier, but instead checks the existing flux fields for whether they have nJy units. If someone could help me implement the config versioning that Colin Slater describes above, I think we could have the behavior that Colin Slater describes.

Take a look at the branch:

Show
John Parejko added a comment - To try to move this along, I've implemented some "check and convert" code in loadReferenceObjectsTask , and refactored the conversion script I wrote to use it. It still needs some tests of the Indexed ref loader, but the commandline code behaves as expected. This code does not use a version identifier, but instead checks the existing flux fields for whether they have nJy units. If someone could help me implement the config versioning that Colin Slater describes above, I think we could have the behavior that Colin Slater describes. Take a look at the branch: https://github.com/lsst/meas_algorithms/compare/tickets/DM-17029#diff-282f082f7e9aa1fd4b2eddad9048d760

People

Assignee:
John Parejko
Reporter:
John Parejko
Watchers:
Colin Slater, Gabriele Comoretto [X] (Inactive), Jim Bosch, John Parejko, John Swinbank, Kian-Tat Lim, Leanne Guy, Margaret Gelman, Meredith Rawls, Paul Price, Robert Lupton, Simon Krughoff, Tim Jenness, Wil O'Mullane, Yusra AlSayyad
0 Vote for this issue
Watchers:
15 Start watching this issue

Dates

Created:
Updated:
Resolved:
Planned End:

CI Builds

No builds found.