# afw.Image.ExposureF('file.fits.fz[i]') returns the image in 'file.fits.fz[1]'

XMLWordPrintable

#### Details

• Type: Bug
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
None
• Story Points:
3
• Sprint:
Science Pipelines DM-S15-2, Science Pipelines DM-S15-3, Science Pipelines DM-S15-4
• Team:
Data Release Production

#### Description

It seems that afwImage.ExposureF ignores the extension number when this is passed on as part of the filename and uses the image in extension number 1. This is not the case with afwImage.MaskedImageF which correctly uses the input extension number passed in the same way.

The problem has been checked on OSX Yosemite 10.10.3 with
the is illustrated in the following code https://gist.github.com/anonymous/d10c4a79d94c1393a493

which also requires the following image in the working directory:
http://www.astro.washington.edu/users/krughoff/data/c4d_130830_040651_ooi_g_d1.fits.fz

#### Activity

Hide
John Swinbank added a comment -

I have a fix for this, but it's a bit rough because I don't fully understand the guts of FITS yet. I'll clean it up shortly.

Show
John Swinbank added a comment - I have a fix for this, but it's a bit rough because I don't fully understand the guts of FITS yet. I'll clean it up shortly.
Hide
John Swinbank added a comment -

I certainly agree that it's a bug that Exposure and MaskedImage should do the same thing here, and I have a fix which will cause the behaviour of MaskedImageF and ExposureF to be the same in this case. It's pretty straightforward.

However, when trying to write the test case for this fix, I ended up wondering what the correct behaviour here actually is. Before committing anything, I'd appreciate input from the gurus.

Both MaskedImage and Exposure extend Image to include a mask and a variance image. These are stored in successive HDUs in the FITS file. This implies that you can't sensibly load arbitrary multi-HDU FITS images into these types, as you'll end up with image N+2 being the variance for image N. That is:

 >>> import lsst.afw.image as afwImage >>> im1 = afwImage.MaskedImageF("c4d_130830_040651_ooi_g_d1.fits.fz[1]") >>> im3 = afwImage.MaskedImageF("c4d_130830_040651_ooi_g_d1.fits.fz[3]") >>> assert(im1.get(0,0)[2] = im3.get(0,0)[0])

This is, of course, an issue with trying to leverage a complex data model into FITS. (I think that's something Tim Jenness has opinions on...)

Anyway, my question is: what's the desired behaviour? Would it actually be more helpful to refuse to load files like this as MaskedImage types, since they clearly aren't really?

(And my apologies if this is stating the obvious to everybody else – I'm looking at this code for the first time.)

Show
Hide
Jim Bosch added a comment -

I think this may be working as intended - or at least it may be harder to fix in a safe way than one might expect. The reason is that Exposure isn't just saved as the three HDUs that comprise a MaskedImage; there are many other binary table HDUs that can follow afterwards to store the additional objects attached to any exposure (how many depends on the nature of the Psf, the Wcs, and whether it has CoaddInputs tables). And those additional binary tables sometimes store the indices of other HDUs they depend on. It might be that those are preserved when we pass an HDU index to cfitsio as part of the filename, but I'm not sure they are. Anyhow, I think of Exposure as an LSST-specific archive that happens to be written in FITS so that non-LSST code can read it; I don't really think of it as something we should expect non-LSST code to write.

If we do need to read non-LSST images into Exposures, I think it's safest to write a custom loader function that reads the external file in as a MaskedImage along with a PropertySet for the header, then uses the PropertySet (and whatever else it needs) to create the Wcs and possibly Psf, then uses those to create an Exposure.

Show
Hide
John Swinbank added a comment -

I think this may be working as intended - or at least it may be harder to fix in a safe way than one might expect.

It depends what we mean by fix, of course. Based on the above, I'm not really sure what that would be.

However: the issue in this case is simply that Exposure reads metadata from the primary HDU, whereas MaskedImage doesn't. Line 180 of MaskedImage.cc causes an Exposure to roll back to the primary HDU, and then line 184 fast-forwards it to the first HDU which has data in it, which isn't necessarily where we came from. My simple fix would be:

 --- a/src/image/MaskedImage.cc +++ b/src/image/MaskedImage.cc @@ -177,8 +177,10 @@ image::MaskedImage::MaskedImage(  if (metadata) {  // Read primary metadata - only if user asks for it.  // If the primary HDU is not empty, this may be the same as imageMetadata. + int prevHdu = fitsfile.getHdu();  fitsfile.setHdu(1);  fitsfile.readMetadata(*metadata); + fitsfile.setHdu(prevHdu);  }

I think based on the comments above that there's no danger to this, and it would at least make Exposure and MaskedImage consistent, which I think is a win (although I still think that just throwing an error and refusing to proceed might be better). Jim Bosch, are there problems I'm not anticipating?

Show
Hide
Jim Bosch added a comment -

The only concrete problem I can think of is an extreme corner case: if we did have an Exposure with more than just the MaskedImage HDUs saved in the middle of a long string of HDUs, I'd be worried that some of those extra binary table HDUs might not be read in properly. Or that we might get very weird behavior if we tried to read in an Exposure in a case where we happened to have the correct three MaskedImage HDUs followed by some binary table HDUs that aren't the tables Exposure expects.

While both of those cases are rare, I think the philosophy of what Exposure represents means that the right fix is to throw an exception here - and perhaps try to detect other cases where we're trying to load an Exposure that wasn't written by our code (but that might break some existing code, including unit tests, and while I think we ultimately want to break and fix those, now may not be the best time for that).

Show
Jim Bosch added a comment - The only concrete problem I can think of is an extreme corner case: if we did have an Exposure with more than just the MaskedImage HDUs saved in the middle of a long string of HDUs, I'd be worried that some of those extra binary table HDUs might not be read in properly. Or that we might get very weird behavior if we tried to read in an Exposure in a case where we happened to have the correct three MaskedImage HDUs followed by some binary table HDUs that aren't the tables Exposure expects. While both of those cases are rare, I think the philosophy of what Exposure represents means that the right fix is to throw an exception here - and perhaps try to detect other cases where we're trying to load an Exposure that wasn't written by our code (but that might break some existing code, including unit tests, and while I think we ultimately want to break and fix those, now may not be the best time for that).
Hide
John Swinbank added a comment - - edited

... try to detect other cases where we're trying to load an Exposure that wasn't written by our code ...

The best approach would seem to be to make this check absolutely explicit, rather than an ad-hoc heuristic. That is, when we write an Exposure to FITS, write a header that explicitly states that it's an LSST Exposure, and then refuse to read anything which doesn't have that header.

I'd imagine that that would break lots of stuff, but I worry that doing anything else would be a minefield for the unwary when the LSST codebase starts being more widely used outside the project.

My proposal is to "Won't Fix" this one, and raise a low-priority issue for safe loading of exposures and other complex data stuffed into FITS files. Does that work for everybody?

(...and I thought this was a quick bug I'd get sorted on the train to work this morning!)

Show
John Swinbank added a comment - - edited ... try to detect other cases where we're trying to load an Exposure that wasn't written by our code ... The best approach would seem to be to make this check absolutely explicit, rather than an ad-hoc heuristic. That is, when we write an Exposure to FITS, write a header that explicitly states that it's an LSST Exposure, and then refuse to read anything which doesn't have that header. I'd imagine that that would break lots of stuff, but I worry that doing anything else would be a minefield for the unwary when the LSST codebase starts being more widely used outside the project. My proposal is to "Won't Fix" this one, and raise a low-priority issue for safe loading of exposures and other complex data stuffed into FITS files. Does that work for everybody? (...and I thought this was a quick bug I'd get sorted on the train to work this morning!)
Hide
Jim Bosch added a comment -

That sounds fine to me. I mostly agree with your proposed approach for marking Exposure's we've written, but whenever I've made similar changes to afw::table in the past it's been considered absolutely essential that we maintain backwards compatibility for Exposures written by older versions of our code, so I think we'll have to find some way of preserving that (which will, of course, reduce the amount of work in the breakage-cleaning step), even if we start tagging all new Exposures with a special header field. There are some header fields we already write if we save anything beyond the usual MaskedImage exposures, so being willing to load Exposures that either have those OR are exactly the three HDUs that go into a MaskedImage (and no more) would probably suffice.

Show
Jim Bosch added a comment - That sounds fine to me. I mostly agree with your proposed approach for marking Exposure's we've written, but whenever I've made similar changes to afw::table in the past it's been considered absolutely essential that we maintain backwards compatibility for Exposures written by older versions of our code, so I think we'll have to find some way of preserving that (which will, of course, reduce the amount of work in the breakage-cleaning step), even if we start tagging all new Exposures with a special header field. There are some header fields we already write if we save anything beyond the usual MaskedImage exposures, so being willing to load Exposures that either have those OR are exactly the three HDUs that go into a MaskedImage (and no more) would probably suffice.
Hide
Simon Krughoff added a comment - - edited

You are both right, of course, that the fix is not straight forward. I should have realized the issue from the start, but the fact that I didn't means we have to do something, in my opinion. The real issue is that if you use the string constructor and point Exposure to a non-MEF FITS file, it does something reasonable. That makes me think "Oh if I can point it at a single image, surely I can also point it at a single extension of an MEF."

If I had my way, the behavior would be:
1. Single image – Preserve current behavior
2. MEF, extension not specified – treat as persisted Exposure (or MaskedImage), possibly warning if there are more than 3 image extensions
3. MEF, extension specified – treat exactly as #1 and ignore the rest of the extensions.

This would make the behavior more intuitive and make MaskedImage and Exposure behave similarly.

Show
Simon Krughoff added a comment - - edited You are both right, of course, that the fix is not straight forward. I should have realized the issue from the start, but the fact that I didn't means we have to do something, in my opinion. The real issue is that if you use the string constructor and point Exposure to a non-MEF FITS file, it does something reasonable. That makes me think "Oh if I can point it at a single image, surely I can also point it at a single extension of an MEF." If I had my way, the behavior would be: 1. Single image – Preserve current behavior 2. MEF, extension not specified – treat as persisted Exposure (or MaskedImage), possibly warning if there are more than 3 image extensions 3. MEF, extension specified – treat exactly as #1 and ignore the rest of the extensions. This would make the behavior more intuitive and make MaskedImage and Exposure behave similarly.
Hide
John Swinbank added a comment -

What I didn't realise at the time I wrote the above is that we already are saving headers which indicate there's something special going on. It seems to me we can get a long way towards making things safer by just providing warnings when there's something unexpected going on. I dropped in a couple of calls to logging and a quick hack to do Simon's #3 above, and now get output like the following:

 In [1]: import lsst.afw.image as afwImage   In [2]: im = afwImage.MaskedImageF('c4d_130830_040651_ooi_g_d1.fits.fz')  WARNING: Expected extension type not found: IMAGE  WARNING: Mask unreadable; using default  WARNING: Expected extension type not found: VARIANCE   In [3]: im = afwImage.MaskedImageF('c4d_130830_040651_ooi_g_d1.fits.fz[10]')  WARNING: Expected extension type not found: IMAGE  WARNING: Unexpected HDU; using default mask & variance   In [4]: im = afwImage.MaskedImageF('genuine_exposure.fits')   In [5]:

I dunno if that's more intuitive, but it would mean that Rahul Biswas would at least have had some indication that things weren't doing what he expected. Thoughts?

Show
Hide
Jim Bosch added a comment -

This looks good to me (both Simon Krughoff's proposal and John Swinbank's partial implementation). I suspect we'll want a way for users to be able to indicate that the warnings should be suppressed, however - I imagine that means an extra boolean argument on some constructors that probably already have too many arguments, unless someone else has a better idea.

Show
Jim Bosch added a comment - This looks good to me (both Simon Krughoff 's proposal and John Swinbank 's partial implementation). I suspect we'll want a way for users to be able to indicate that the warnings should be suppressed, however - I imagine that means an extra boolean argument on some constructors that probably already have too many arguments, unless someone else has a better idea.
Hide
John Swinbank added a comment -

If we use lsst::log (rather than pex::logging), we could send these warnings to a special logger and then users could turn them off in their logging configuration rather than by using arguments. I'd prefer that, I think – partly because these constructors already have too many arguments, as you suggest, but also because my impression from the discussion above is that it almost always will be a mistake to (ab)use the data in this way, and it seems better to make shouting about it not particularly optional. (You plural probably have a better idea of what typical uses might be than I do though, so feel free to disagree.)

I do note that lsst::log depends on log4cxx, so it would be an interesting test case of bringing that into lsst_apps.

Show
John Swinbank added a comment - If we use lsst::log (rather than pex::logging ), we could send these warnings to a special logger and then users could turn them off in their logging configuration rather than by using arguments. I'd prefer that, I think – partly because these constructors already have too many arguments, as you suggest, but also because my impression from the discussion above is that it almost always will be a mistake to (ab)use the data in this way, and it seems better to make shouting about it not particularly optional. (You plural probably have a better idea of what typical uses might be than I do though, so feel free to disagree.) I do note that lsst::log depends on log4cxx , so it would be an interesting test case of bringing that into lsst_apps.
Hide
Jim Bosch added a comment -

I think we could do that with pex::logging too, but it probably is time to start using lsst::log in new code, including in afw.

Show
Jim Bosch added a comment - I think we could do that with pex::logging too, but it probably is time to start using lsst::log in new code, including in afw.
Hide
Simon Krughoff added a comment -

Thanks for the implementation, John Swinbank. It looks good to me too. Just a heads up that the version of log4cxx in the stack does not build on OSX, so we'll have to fix that to use lsst::log.

Show
Simon Krughoff added a comment - Thanks for the implementation, John Swinbank . It looks good to me too. Just a heads up that the version of log4cxx in the stack does not build on OSX, so we'll have to fix that to use lsst::log.
Hide
John Swinbank added a comment -

...heads up that the version of log4cxx in the stack does not build on OSX...

Actually, I fixed that in DM-2581, for some value of fixed – it builds, but the tests fail. That's why I say this'll be "interesting", as in "interesting times"!

The version in the 10_1_rc3 release definitely won't build on OSX, though, and it's likely that will be the next release; any changes here should definitely not land until that's gone out.

Show
John Swinbank added a comment - ...heads up that the version of log4cxx in the stack does not build on OSX... Actually, I fixed that in DM-2581 , for some value of fixed – it builds, but the tests fail. That's why I say this'll be "interesting", as in "interesting times"! The version in the 10_1_rc3 release definitely won't build on OSX, though, and it's likely that will be the next release; any changes here should definitely not land until that's gone out.
Hide
John Swinbank added a comment -

This is now ready for review on afw's tickets/DM-2599. Simon, would you mind taking a look?

I found there were quite a few corner cases in the simple scheme you outlined above. I wrote a moderately extensive test to cover them, but do keep an eye out for anything I missed.

Per K-T's advice, I didn't use lsst::log, but rather pex::logging (although I did get as far as getting the former running on OS X without any obvious errors). I did create a separate logger so that the warnings can be filtered if they become irritating (I much prefer this to adding more ctor arguments).

Show
John Swinbank added a comment - This is now ready for review on afw 's tickets/DM-2599 . Simon, would you mind taking a look? I found there were quite a few corner cases in the simple scheme you outlined above. I wrote a moderately extensive test to cover them, but do keep an eye out for anything I missed. Per K-T's advice, I didn't use lsst::log, but rather pex::logging (although I did get as far as getting the former running on OS X without any obvious errors). I did create a separate logger so that the warnings can be filtered if they become irritating (I much prefer this to adding more ctor arguments).
Hide
Simon Krughoff added a comment -

The only thing I have to say is that there should be a test to make sure the MaskedImage constructor and Exposure constructor behave identically. Other than that, good to merge.

I couldn't run the tests since it took me so long to get to it. Specifically, the change to remove mention of eups in the tests has broken tests. As of w.2025.22 this has been fixed.

Show
Simon Krughoff added a comment - The only thing I have to say is that there should be a test to make sure the MaskedImage constructor and Exposure constructor behave identically. Other than that, good to merge. I couldn't run the tests since it took me so long to get to it. Specifically, the change to remove mention of eups in the tests has broken tests. As of w.2025.22 this has been fixed.
Hide
John Swinbank added a comment -

Thanks Simon. In fact, MaskedImage and Exposure aren't quite identical, as the latter always sets needAllHdus to False. I rearranged the tests slightly to account for that & will merge once I have a clean buildbot run.

Show
John Swinbank added a comment - Thanks Simon. In fact, MaskedImage and Exposure aren't quite identical, as the latter always sets needAllHdus to False . I rearranged the tests slightly to account for that & will merge once I have a clean buildbot run.
Hide
Simon Krughoff added a comment -

Got it. Sounds good to me.

Show
Simon Krughoff added a comment - Got it. Sounds good to me.

#### People

Assignee:
John Swinbank
Reporter:
Rahul Biswas
Reviewers:
Simon Krughoff
Watchers:
Jim Bosch, John Swinbank, Rahul Biswas, Simon Krughoff, Tim Jenness