Fix Version/s: None
Team:Data Release Production
I've been trying to implement parallel overscan correction on the AuxTel images, and found that for some amps the overscan is not subtracted correctly. The attached Scan_C02.png shows that subtracting the median of the parallel overscan region from the median of the data region gives a very flat curve. But Scan_Correct_Co2.png shows that subtracting the overscanResult.overscanValue returned by fitOverscan removes only part of the curve. The cause is this assumption (lines 224-229 in overscan.py):
- The serial overscan correction has removed the majority
- of the signal in the parallel overscan region, so the
- mean should be close to zero. The noise in both should
- be similar, so we can use the noise from the serial
- overscan region to set the threshold for bleed
This turns out to not be a very good assumption when there is a lot of structure in the parallel overscan region. To fix this, I changed line 229 from:
thresholdLevel = self.config.numSigmaClip * serialResults.overscanSigmaResidual
to:
thresholdLevel = np.median(maskIm.image.array) + self.config.numSigmaClip * np.std(maskIm.image.array)
This fixed the problem, as shown in Scan_C02_Fixed.png and Scan_Correct_Fixed.png. After this, the parallel overscan subtraction looks quite good.
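To illustrate why the zero-mean assumption over-masks, here is a minimal sketch on synthetic data. It is not the ip_isr code: the helper names, the array shape, the 20-count residual offset, the noise level of 5 counts, and the `>=` comparison are all assumptions made for the example.

```python
import numpy as np


def threshold_zero_mean(serial_sigma, num_sigma):
    """Original form: assumes the parallel overscan has a mean near zero."""
    return num_sigma * serial_sigma


def threshold_with_offset(pixels, num_sigma):
    """Proposed form: fold in the median level of the region itself."""
    return np.median(pixels) + num_sigma * np.std(pixels)


rng = np.random.default_rng(0)

# Synthetic parallel overscan (assumed values): a residual offset of 20 counts
# is left after the serial correction, with Gaussian noise of 5 counts.
overscan = rng.normal(loc=20.0, scale=5.0, size=(20, 100))

num_sigma = 3.0
old_threshold = threshold_zero_mean(serial_sigma=5.0, num_sigma=num_sigma)
new_threshold = threshold_with_offset(overscan, num_sigma=num_sigma)

# Fraction of pixels that each threshold would flag as bleed.
old_masked_fraction = np.mean(overscan >= old_threshold)
new_masked_fraction = np.mean(overscan >= new_threshold)

print(f"old threshold {old_threshold:.1f}: {old_masked_fraction:.1%} flagged")
print(f"new threshold {new_threshold:.1f}: {new_masked_fraction:.1%} flagged")
```

With a non-zero residual level, the original threshold sits below most of the ordinary pixels, so a large share of the region is flagged as bleed. That would be consistent with only part of the curve being subtracted.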
Christopher Waters, I can do a PR with this fix if you want, or maybe there is a better solution.
DM-37357 Update masking in parallel overscan
Aaron Roodman noted a similar effect on BOT data (DM-37357), which shows that my assertions in the comment you copied cannot be right. The serial overscan correction removes some offset (applied row-by-row, but I don't think there's much real difference between those rows), leaving whatever remnant parallel signal there is. As you're showing here, it's not a simple mean-zero thing, so we do need to fold in the median offset as you do above.
I'd like to spend a bit of time to get this right in different cases, after looking at some real bleeds (see attached figure; taken from LATISS exposure 2021012000074). Your method will work perfectly fine on an obvious bright bleed (bottom profile, corresponding to the fifth overscan down, C04 due to alphabetical sorting). I'm concerned about the crosstalk ghosts (remaining profiles, corresponding to C02, C01, and C00). These appear to stay unmasked; the calculated thresholds (using the above equation) are ~74 for C02, ~30k for C01 (caused by the bright defect column that caused me to shift the profile box), and ~34 for C00. We may be able to use the fact that these are obviously crosstalk ghosts to attempt to fix this with a two-pass algorithm: identify bright bleeds with a high threshold, and then look at the target locations for those bleeds in the other amplifiers and check if the per-column values there are consistent with those outside that region.
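The two-pass idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names are hypothetical, the ghost is assumed to land at the same column indices in the target amplifier (the real mapping depends on the readout geometry), and the "consistent with those outside that region" check is implemented here as a MAD-based comparison of per-column medians.

```python
import numpy as np


def find_bleed_columns(overscan, high_threshold):
    """Pass 1: columns whose median overscan level exceeds a high threshold."""
    col_profile = np.median(overscan, axis=0)
    return np.flatnonzero(col_profile > high_threshold)


def flag_ghost_columns(target_overscan, candidate_cols, num_sigma=3.0):
    """Pass 2: keep candidate columns that differ from the rest of the amp.

    The reference level and scatter come from the per-column medians
    outside the candidate region (scatter is MAD scaled to a Gaussian sigma).
    """
    candidate_cols = np.asarray(candidate_cols, dtype=int)
    col_profile = np.median(target_overscan, axis=0)

    outside = np.ones(col_profile.size, dtype=bool)
    outside[candidate_cols] = False

    reference = np.median(col_profile[outside])
    scatter = 1.4826 * np.median(np.abs(col_profile[outside] - reference))

    deviant = np.abs(col_profile[candidate_cols] - reference) > num_sigma * scatter
    return candidate_cols[deviant]


rng = np.random.default_rng(1)

# Source amp: obvious bright bleed in columns 40-42 (synthetic values).
source = rng.normal(20.0, 5.0, size=(20, 100))
source[:, 40:43] += 5000.0

# Target amp: faint crosstalk ghost at the same columns, well below any
# threshold that would be computed from the whole region.
target = rng.normal(30.0, 5.0, size=(20, 100))
target[:, 40:43] += 25.0

bleed_cols = find_bleed_columns(source, high_threshold=1000.0)
ghost_cols = flag_ghost_columns(target, bleed_cols)
print("bleed columns:", bleed_cols, "ghost columns:", ghost_cols)
```

Because the second pass only tests columns already implicated by a bright bleed elsewhere, it can use a much tighter per-column criterion without flagging ordinary structure across the whole amplifier.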
This leads to the last issue that isn't yet handled, which is the case of fully masked columns. I think there should be a way to use the same "per-column checks" to identify good and bad regions that can be used for interpolation.
In the case of fully masked columns, we could just set those columns to the value of the median of the whole parallel overscan region. I'm worried that if there are multiple fully masked columns, then the linear interpolation might fail.
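For comparison, here is a minimal sketch of both options for fully masked columns: filling with the median of the whole region, and linear interpolation across the good columns. The helper names and the synthetic gradient are assumptions for illustration and are not taken from overscan.py. The interpolation uses `np.interp`, which bridges a run of adjacent masked columns from the nearest good columns on either side and holds the end value constant at the edges.

```python
import numpy as np


def column_values(overscan, mask):
    """Per-column median of unmasked pixels; NaN where a column is fully masked."""
    good = ~mask
    col = np.full(overscan.shape[1], np.nan)
    for j in np.flatnonzero(good.any(axis=0)):
        col[j] = np.median(overscan[good[:, j], j])
    return col


def fill_with_global_median(col, overscan, mask):
    """Option A: fully masked columns take the median of all unmasked pixels."""
    filled = col.copy()
    filled[np.isnan(col)] = np.median(overscan[~mask])
    return filled


def fill_by_interpolation(col):
    """Option B: linear interpolation over good columns.

    Needs at least one good column; np.interp raises ValueError otherwise.
    """
    x = np.arange(col.size)
    ok = ~np.isnan(col)
    return np.interp(x, x[ok], col[ok])


# Synthetic overscan with a smooth gradient across columns (assumed shape).
overscan = np.tile(10.0 + 0.5 * np.arange(50), (10, 1))
mask = np.zeros(overscan.shape, dtype=bool)
mask[:, 20:23] = True  # three adjacent fully masked columns
mask[:, 49] = True     # fully masked column at the edge

col = column_values(overscan, mask)
flat_fill = fill_with_global_median(col, overscan, mask)
interp_fill = fill_by_interpolation(col)
print(flat_fill[20:23], interp_fill[20:23], interp_fill[49])
```

In this toy case the interpolation follows the gradient across the three-column gap, while the global median leaves a step wherever the local level differs from the region-wide median. Which behaviour is preferable on real data with wide masked regions is still an open question here.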
I've marked this ticket as Invalid as it duplicates DM-37357, which is the branch that I've pushed fixes onto.
The plots above are from exposure 2022121200752.