Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-10155

Implement in-place SourceCatalog operators in PhotoCalib

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: afw

      Description

      The two `countsTo*` methods that operate in-place on SourceCatalogs in PhotoCalib are marked "not implemented" due RFC-322. Once that RFC is completed and the necessary key name changes are in place, we can finally implement those methods.

      We might want to consolidate them into a single countsToFluxAndMagnitude, since if we're iterating over the catalog anyway, might as well do all the math at once.

        Attachments

          Issue Links

            Activity

            Hide
            Parejkoj John Parejko added a comment -

            This is close, but I just tested it on an stack-produced afw table, and the "do all the columns" version fails because of base_Blendedness_abs_child_instFlux, which doesn't have a corresponding *_instFluxErr field. I wonder if I should add a check for the word "Blendedness" in the field name, or just ignore fields that don't have error, or what?

            Show
            Parejkoj John Parejko added a comment - This is close, but I just tested it on an stack-produced afw table, and the "do all the columns" version fails because of base_Blendedness_abs_child_instFlux , which doesn't have a corresponding *_instFluxErr field. I wonder if I should add a check for the word "Blendedness" in the field name, or just ignore fields that don't have error, or what?
            Hide
            lauren Lauren MacArthur added a comment - - edited

            I also ran into this when writing a “calibrate catalog using PhotoCalib” function for pipe_analysis. I ended up doing the kind of hack-y thing for any *instFlux field that did not have an associated *instFluxErr described here: https://github.com/lsst-dm/pipe_analysis/blob/master/python/lsst/pipe/analysis/utils.py#L941-L957

            Show
            lauren Lauren MacArthur added a comment - - edited I also ran into this when writing a “calibrate catalog using PhotoCalib” function for pipe_analysis . I ended up doing the kind of hack-y thing for any *instFlux field that did not have an associated *instFluxErr described here: https://github.com/lsst-dm/pipe_analysis/blob/master/python/lsst/pipe/analysis/utils.py#L941-L957
            Hide
            jbosch Jim Bosch added a comment -

            I think we want to be able to just handle instFlux fields without errors by calibrating them without magnitude/calibrated-flux errors.

            Show
            jbosch Jim Bosch added a comment - I think we want to be able to just handle instFlux fields without errors by calibrating them without magnitude/calibrated-flux errors.
            Hide
            Parejkoj John Parejko added a comment -

            As part of this work, I'm also greatly accelerating the other PhotoCalib SourceCatalog operations by vectorizing them to make AST-backed PhotoCalibs much faster. Looks like roughly a factor of >15 for TransformBoundedField, and little change for constant PhotoCalibs.

            Show
            Parejkoj John Parejko added a comment - As part of this work, I'm also greatly accelerating the other PhotoCalib SourceCatalog operations by vectorizing them to make AST-backed PhotoCalibs much faster. Looks like roughly a factor of >15 for TransformBoundedField, and little change for constant PhotoCalibs.
            Hide
            Parejkoj John Parejko added a comment -

            Jim Bosch: can you please review this (about 250 lines)? I took your advice above re: fields without errors.

            It isn't as fast as I'd hoped: it takes about 30ms to calibrate every field on my test catalog (photoCalib.calibrateCatalog(src)) and 22ms to do just one field (photoCalib.calibrateCatalog(src)), which tells me the "find the fields" part isn't taking much time. The `evaluateCatalog` step should take about 1ms, given the performance of e.g. photoCalib.instFluxToMagnitude(src, 'slot_PsfFlux'). I've attached my profiling script, if you have any ideas.

            Show
            Parejkoj John Parejko added a comment - Jim Bosch : can you please review this (about 250 lines)? I took your advice above re: fields without errors. It isn't as fast as I'd hoped: it takes about 30ms to calibrate every field on my test catalog ( photoCalib.calibrateCatalog(src) ) and 22ms to do just one field ( photoCalib.calibrateCatalog(src) ), which tells me the "find the fields" part isn't taking much time. The `evaluateCatalog` step should take about 1ms, given the performance of e.g. photoCalib.instFluxToMagnitude(src, 'slot_PsfFlux') . I've attached my profiling script, if you have any ideas.
            Show
            Parejkoj John Parejko added a comment - Jenkins run: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/29708/pipeline
            Hide
            jbosch Jim Bosch added a comment -

            Looks good!

            My best guess at where the time is going is that it's in the pybind11 layer, but that really is just a guess.

            Show
            jbosch Jim Bosch added a comment - Looks good! My best guess at where the time is going is that it's in the pybind11 layer, but that really is just a guess.
            Hide
            Parejkoj John Parejko added a comment -

            It looks like most of that time is going into BaseTable::copyRecord(record, schemaMapper), as we guessed on the phone. No real way to avoid that, I don't think.

            Thank you for the review suggestions: I think the result is much cleaner.

            Merged and done.

            Show
            Parejkoj John Parejko added a comment - It looks like most of that time is going into BaseTable::copyRecord(record, schemaMapper) , as we guessed on the phone. No real way to avoid that, I don't think. Thank you for the review suggestions: I think the result is much cleaner. Merged and done.

              People

              • Assignee:
                Parejkoj John Parejko
                Reporter:
                Parejkoj John Parejko
                Reviewers:
                Jim Bosch
                Watchers:
                Jim Bosch, John Parejko, John Swinbank, Lauren MacArthur, Yusra AlSayyad
              • Votes:
                0 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Summary Panel