Details
-
Type:
Story
-
Status: Done
-
Resolution: Done
-
Fix Version/s: None
-
Component/s: obs_decam, pipe_tasks
-
Labels:None
-
Story Points:14
-
Epic Link:
-
Team:Data Facility
Description
This ticket is for implementing changes in obs_decam in order to run processCcd with raw DECam data.
Changes are mostly in DecamMapper and a new class DecamIsrTask is added. A test to retrieve defects with Butler is also added. (testdata_decam is at lsst-dev /lsst8/testdata_decam/ )
The pixels with bit 1 (bad, hot/dead pixel/column) from the Community Pipeline Bad Pixel Masks are used as bad pixels. The CP BPM fits files are directly used as defect files. Due to their large size, they probably should not go into obs_decam repository so they are treated similarly as other calibration products.
With the changes of this ticket, the following ingest defects files into calibRegistry:
ingestCalibs.py . --calibType defect path-to-bpm/*fits
|
and the following should run past ISR:
processCcd.py . --config isr.doFringe=False isr.assembleCcd.setGain=False calibrate.doPhotoCal=False calibrate.doAstrometry=False calibrate.measurePsf.starSelector.name="secondMoment" |
Running fringe correction with DECam raw data will be in future tickets (DM-4223 and possibly more). Also this ticket does not cover implementing or porting new ISR functionalities that haven't yet been included in ip_isr (such as crosstalk).
Attachments
Issue Links
Activity
Russell Owen, I found you were the last person editing about defects. Would you mind reviewing this ticket?
The changes are at branch tickets/DM-4191 of pipe_tasks and obs_decam.
Some notes:
- The last commit currently only includes one of the persisted amplifier information fits files. I do so because github may not want to show other changes otherwise. The rest of the files will be added back before merge.
- DECam defects files are ingested in the same way as other calibration products, using ingestCalibs. Their validity ranges are set there as well. The information is stored in calibRegistry.sqlite3 just like flat/bias/dark/fringe, not separately in defectRegistry.sqlite3. DecamMapper is edited to treat defects like other calibration products as well.
- Community Pipeline MasterCal bias and flat products has their edges trimmed 4 extra pixels on each edge in the data section.
- I reuse setEdgeBits from meas.algorithms.detection.SourceDetectionTask in doing ISR. It looks a bit wierd to use something from SourceDetectionTask but that method is exactly what I intend to do there. Or should I write another method here? Or just don't mask the edge pixels here as it will be done in later steps of processing anyway?
Review of DM-4191
It is exciting to see time-varying defects for a camera. Now if we can just get time-varying amplifier data...but that is outside the scope of this ticket.
pipe_tasks
The changes to ingestCalibs look sensible and well documented. However, I have a few minor requests:
The help for --calibType is as follows: ""If not None, use the input calibration type instead of finding it in fits header". None only occurs if the user omits the argument, and I find the help confusing. Please consider something like the following, instead: "Type of calibration image; if omitted, determine from the OBSTYPE header of the calibration image".
The following is slightly inefficient, since it creates a sequence from the pair of lists before iterating over it.
for thisDate, nextDate in zip(dates[:-1], dates[1:]):
|
This is only a concern if the lists of dates are long. Still, using itertools.izip would eliminate the worry. If you make that change, please also change the "else" clause, as well.
Please restore the blank line after the initial line of the doc string for resolveOverlaps. Python doc strings should have a one-line summary, followed by a blank line, followed by additional information.
obs_decam
I am a bit confused by some of these changes.
decamMapper.py
why do you only add "ccdnum" to self.mappings["raw"] if not ingesting to create a registry? I assume that process is harmed by knowing about ccdnum, but I am curious as to why, and it would be nice to explain this as a comment in the code.
The standard CamaraMapper version of _setTimes talks about changing a calib image, which is a "calexp", but your version talks about "the calib object in an raw Exposure". Several suggestions:
- Clarify image types are changed. Raw or calexp or...? If yours is different than the standard CameraMapper it would be nice to have an explanation in the doc string as to why that is so.
- Emphasize what is different about your override vs. the standard method.
- Change "in an raw" to "in a raw"
- Add a blank line after the first sentence
isr.py
I don't quite understand how trimming works. You trim the input exposure, not the bias and flat images, which implies you trim the data when applying bias and flat, rather than the other way around, despite the help strings for the config params saying the opposite. So...
If bias and flat are smaller than raw data, then I suggest you compute the amount of trim required based on the dimensions of the provided data. Why have config fields for this? The results are more likely to be correct. As it stands, if you get the values wrong you'll get an exception, and if you can compute a result, surely that's safer and simpler? Especially since if you ever do get master flats and biases that are full sized the code will "just work" with no reconfiguration.
If bias and flat are the same size or later than raw data then I really don't understand the code, and it could use some additional documentation.
A few more code notes:
I suggest you replace this:
from lsst.ip.isr import IsrTask
|
from lsst.ip.isr import isr as lsstIsr
|
with this:
from lsst.ip.isr import IsrTask, biasCorrection, flatCorrection
|
and ditch all use of "lsstIsr" (which I found confusing, as I thought you might be importing lsstSim's version of bias and flat correction).
afwImage.PARENT is redundant in the following, and should probably be removed, because we (I, to be honest) went to great lengths to make the default PARENT and that default is used nearly everywhere:
getBBox(afwImage.PARENT)
|
If you do decide you need to keep the config fields, please consider removing the term "MasterCal" from the names. These are presumably the bias and flats we have, and it doesn't matter where they come from. I also suggest clarifying that these are used to trim the data down to the size of the bias and flat images. Finally, I suggest changing "on each edge in the..." to "on each edge of the..." in the help strings.
Please explain why you had to override the run method. The standard run method also updates the variance plane. Please put this explanation in the doc string for the method.
I think your use of SourceDetectionTask.setEdgeBits is reasonable. It is a pity that this function is not available some other way, since it is a common need, but refactoring that out into a free function is presumably out of scope for this ticket. You could file a ticket about that, if you are willing. Note that coaddition also needs to do this, though it has some extra needs.
Thank you very much for your review!
If I understand correctly, because "ccdnum" is not used directly in the template of raw dataset, it is not automatically included in keyDict of raw dataset mapping. So when ccdnum is given in the argument id of processCcd, it would have an error from pipe_base argumentParser like this:
processCcd.py: error: "Unrecognized ID key 'ccdnum'; valid keys are: ['date', 'filter', 'hdu', 'visit']" |
My intent is to allow using the data ID key "ccdnum", and I got the fix of adding it to keyDict from
https://dev.lsstcorp.org/trac/ticket/2859
However if keyDict is updated when ingesting data into registry, Butler would do a lookup in the registry (trying to find ccdnum I guess) and such a lookup would fail. (For example, RuntimeError: No registry for lookup or RuntimeError: No unique lookup for ['ccdnum'] : 0 matches)
So I was thinking just to add the key when a registry exists, and I haven't been able to come up with a better way. But just this morning I realized this may be also related to the fact that DECam data are multi-extension fits, and the zeroth extension does not have a CCDNUM. I might tweak DecamIngestTask or DecamParseTask a bit to make it better. I will look into it more.
The bias and flat products are smaller than the raw data (sorry about my confusing documentations). Calculating the amount of trim based on the input sounds a good idea. The bias and flat products seem always trimmed, or at least those from the past two years. I was thinking getting it from config might be more efficient, but true it would be nice if the code will just work when flat and bias are no longer trimmed.
The standard ISR updates the variance plane after CCD assembly, but AssembleCcdTask needs it to setGain. So in the current branch, updateVariance is done twice. Is that bad?
I will also work on other things you mentioned, especially improving all docstrings and comments.
Thank you for the explanation about ccdnum. I suspect your solution is fine, but a bit more documentation as a comment would be appreciated. There may be a better way to do this; you could ask on community.lsst.org.
I think setting variance twice is not very efficient. Is there a more direct way to set gain? If it is just an issue of remapping an existing keyword then I suggest doing that remapping in the Mapper for raw images; I think std_raw is the method to override.
One ideally wants to set the variance plane after bias subtraction (though the standard IsrTask has it after dark subtraction).
CCD assembly does an "effective gain" calculation, but this isn't very useful (an approximation of an approximation, and you're never going to use that information anyway because it's embedded in the variance plane). I suggest setting AssembleCcdConfig.setGain=False; then you don't need the variance plane in CCD assembly, and can set the variance a single time after bias subtraction.
Filed issue DM-4232 about setting the variance plane.
With a more careful look, the condition of needing a registry to add "ccdnum" to self.mappings["raw"].keyDict should not be necessary. (Otherwise obs_cfht and obs_subaru probably would have had problems too!) The problem was because the keyword CCDNUM is missing in the zeroth extension of the MEF and getDestination in ingest tries to retrieve it using Butler, Butler tries to find a ccdnum by looking up the registry. But is doesn't make sense for Butler to want to do a lookup in the registry when ingesting. It was more of an omission when I edited DecamIngest in DM-3462. I added a commit to fix that.
CameraMapper._setTimes is used and only used in CameraMapper._standardizeExposure for setting properties in Exposure's Calib object (the object returned by Exposure.getCalib()). On second thought, I removed the overriding of _setTimes, but added std_raw in DecamMapper. All keywords of the zeroth extension but not existent in its specific HDU are now copied to the exposure's metadata.
Other changes include:
- Improve comments, help message, and docstrings to be more clear. Also fix the format.
- Use itertools.izip
- DecamIsrConfig is removed and the number of edge trim is computed based on the dimensions of the input. It will just work even when Community Pipeline changes his mind to trim their caliration products a bit differently.
- Ditch the use of lsstIsr.
- Remove afwImage.PARENT as it is the default.
- The run method overriding is also removed from DecamIsrTask. Use config isr.assembleCcd.setGain=False for now. (Thanks Paul for explaining the problem and creating the new ticket!)
Russell Owen, I just wanted to check if you have further comments?
This looks great. Thanks for the improvements. I have a few minor things to ask about, but nothing that would stop you from merging now if you like.
pipe_tasks
You say "For defects, the products are valid from the date they are registered until the next date. For other calibration products, a midpoint is used to fix the conflicts.". On the whole this is pretty clear, but I suggest trying to use a different phrase than "fix the conflicts". Do we actually expect conflicts? I thought data products had a start date, but not an end date, so conflicts will not occur. Perhaps you could say "For most calibration products, the calibration data with whose start date is nearest the date of the observation is used. However, defects files are valid from their start date until they are superseded by subsequent defect data".
The --calibType argument help is much clearer, but I still suggest explicitly saying that the OBSTYPE header entry is used (as per my original review).
obs_decam
Would it make sense to shorten the key "ccdnum" down to simply "ccd"? (Certainly not if you already use the key "ccd" for something else).
Why in std_raw do you copy over all metadata? I'm referring to the following snippet:
for kw in md0.paramNames():
|
if kw not in md.paramNames():
|
md.add(kw, md0.get(kw))
|
isr.py: I am a bit surprised that you require that the amount bias and flats are trimmed is the same in x and y. After all, you had separate parameters for those earlier. But if you are positive this is a safe assumption then I have no objection. If you do want to allow them to be different the code would be easy to expand to return a tuple of x, y trim amounts.
One more thing that Simon Krughoff pointed out: in the paf file you use "[%(hdu)d]" for raw images, but "[%(ccdnum)d]" for all other image types. It seems as if the keys "hdu" and "ccdnum" are being used to mean the same thing. I assume this is intentional, but it seems odd and I just wanted to be sure
Thanks a lot for your speedy reply! Sorry I didn't mean to rush you.
I see what you mean the wording about "fixing the conflicts" is not clear, I will edit that.
As far as I know at this moment pipe_tasks/ingestCalibs is only used by obs_decam. I was thinking if other obs_* packages want to use it in the future, they will likely have to override CalibsPasreTask.getCalibType and they might want to use a different header keyword or they can also use filenames. That's why I didn't want to say "OBSTYPE" header explicitly.
Would it make sense to shorten the key "ccdnum" down to simply "ccd"? (Certainly not if you already use the key "ccd" for something else).
The short answer is "ccd" is used for something else.
A bit longer answer is that there was some confusing history about the use of "ccdnum" and "ccd" in obs_decam.
With my limited knowledge I think that use of "ccd" can be eliminated altogether and then replace ccdnum by ccd (I think I know how and sometimes I really hope to do so). However I'm not completely sure about all the consequenses. Also I don't want to add more confusions.
Why in std_raw do you copy over all metadata?
There are more header keywords existing only in the zeroth header but not in others. So far I only encountered problems due to EXPTIME and MJD-OBS. I was thinking it doesn't hurt to copy all non-repeating keywords. (a bit waste of space...)
isr.py: I am a bit surprised that you require that the amount bias and flats are trimmed is the same in x and y. After all, you had separate parameters for those earlier. But if you are positive this is a safe assumption then I have no objection. If you do want to allow them to be different the code would be easy to expand to return a tuple of x, y trim amounts.
True, I guess I preferred to have a strict assumption (and force people to look at the code again when Community Pipeline changes their settings )
in the paf file you use "[%(hdu)d]" for raw images, but "[%(ccdnum)d]" for all other image types.
The keys "hdu" and "ccdnum" have different meanings. "[%(ccdnum)d]" in the instcal/dqmask/wtmap template is a known bug. "[%(ccdnum)d]" in the bias/flat template is correct (so far as I understand) and was my convenient choice. If we want to be proactive to future Community Pipeline changes, we may want to change all of them to "hdu" (changes in ingest task needed).
use a different phrase than "fix the conflicts". Do we actually expect conflicts?
I guess the confusion might have also come from the method name "resolveOverlaps". Previously it was indeed for fixing overlaps in validity ranges. The validity range is set through the file header or filename information or/and specified through user's input (validity period in days), so the ranges can indeed overlap. For example if there are daily calibration products but the user input a validity period of 10 days.
But the wording was not accurate for defects products. I renamed the method name and edited the comments. Hopefully it's more clear now.
A bit more on hdu and ccdnum...
"hdu" refers to the HDU number in the FITS files. They are not necessarily ordered by ccdnum, which corresponds to a specific and fixed CCD of DECam.
In Community Pipeline bias and flat data I've checked so far, they are ordered nicely and always the same (hdu == ccdnum). Raw data have the order shuffled.
Thanks a lot for your careful review and thoughtful comments! Changes are merged to master.
pipe_tasks
python/lsst/pipe/tasks/ingestCalibs.py | 47 ++++++++++++++++++++++++---------- |
1 file changed, 34 insertions(+), 13 deletions(-) |
obs_decam
README.md | 2 +- |
config/ingestCalibs.py | 2 +- |
config/processCcd.py | 11 ++ |
decam/camGeom/N1.fits | 2 +- |
.... (in total 62 fits files of the updated CCD files) |
decam/segmentfile.txt | 248 +++++++++++++++++----------------- |
policy/DecamMapper.paf | 36 ++++- |
python/lsst/obs/decam/decamMapper.py | 83 ++++++++++-- |
python/lsst/obs/decam/ingest.py | 22 ++- |
python/lsst/obs/decam/ingestCalibs.py | 23 ++++ |
python/lsst/obs/decam/isr.py | 80 ++++++++++- |
tests/getRaw.py | 7 + |
72 files changed, 440 insertions(+), 214 deletions(-) |
Passed Jenkins: https://ci.lsst.codes/job/stack-os-matrix/5205/