DM-31013

Fix metrics that are reporting NaN

    Details

    • Type: Bug
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: faro
    • Labels: None

      Description

      For several weeks now, many faro metrics have been reported as NaN. Simon did some initial investigation and found that the JSON documents that are dispatched contain NaNs, so the problem occurs before dispatch. Is this due to the script that does the reconstitution? Or are the metric measurements in the Butler repo empty or NaN?

      The scripts should be manually rerun on a more recent weekly to determine and fix the cause.

            Activity

            lguy Leanne Guy added a comment - - edited

            Dashboard: https://chronograf-demo.lsst.codes/sources/2/dashboards/70?refresh=Paused&tempVars%5BDataset%5D=validation_data_hsc&tempVars%5BFilter%5D=HSC-R&tempVars%5BTract%5D=9441&lower=now%28%29%20-%2030d

            Only AM1 is reporting non-NaN numbers.

            Dataset: validation_data_hsc

            jcarlin Jeffrey Carlin added a comment - - edited

            The metrics that are no longer showing results are: PA1, TE1, "Galaxy photometry repeatability," "Stellar photometry repeatability," and "Stellar locus width." There are three different explanations:

            1a. PA1 stopped showing up after DM-25789 was merged. That new implementation requires 3 or more measurements of each object to calculate an RMS. Since validation_data_hsc has at most 2 matched-visit measurements for any star, PA1 is not measured.

            1b. The modelPhotRep metrics have recently been added to `faro` (DM-30823), but use the same `photRepeat` function as PA1, which requires 3 or more measurements to calculate repeatability.

            2. TE1 stopped showing up after DM-26988 was merged. The new implementation of TE1/TE2 calculates these metrics on coadds. Since validation_data_hsc does not include coadds, the ellipticity residual metrics are not calculated.

            3. The other panel (Stellar locus width) is empty in the dashboard because that metric was previously calculated by pipe_analysis. The stellar locus width is implemented in `faro`, but only works on multi-band coadds, which aren't present in validation_data_hsc.

            SUMMARY: Repeatability metrics don't appear because there are not enough visits in validation_data_hsc. Ellipticity residual and stellar locus metrics don't appear because they require coadds. (Also the stellar locus width uses gri bands, while validation_data_hsc has only iry.)
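
The repeatability explanation in 1a and 1b can be illustrated with a minimal sketch. This is not the `faro` implementation: the function names, the use of a sample standard deviation, and the median summary are assumptions made for illustration. Only the threshold of 3 measurements comes from the comment above. It shows why a dataset in which every object has at most 2 measurements yields NaN for the whole metric.

```python
import math
import statistics

# Threshold taken from the comment above; everything else is hypothetical.
MIN_MEASUREMENTS = 3


def repeatability_rms(mags, min_measurements=MIN_MEASUREMENTS):
    """RMS scatter of repeated magnitudes for one object, or NaN if too few."""
    finite = [m for m in mags if math.isfinite(m)]
    if len(finite) < min_measurements:
        # Not enough matched-visit measurements to estimate an RMS.
        return float("nan")
    mean = sum(finite) / len(finite)
    variance = sum((m - mean) ** 2 for m in finite) / (len(finite) - 1)
    return math.sqrt(variance)


def dataset_repeatability(objects):
    """Median per-object RMS; NaN when no object passes the threshold."""
    valid = [r for r in map(repeatability_rms, objects) if math.isfinite(r)]
    if not valid:
        return float("nan")
    return statistics.median(valid)


# Every object has only 2 measurements, so the metric is NaN.
two_visit_objects = [[20.0, 20.1], [18.5, 18.4]]
print(dataset_repeatability(two_visit_objects))
```

With at least one object measured 3 or more times, the same function returns a finite value, which is the behavior a dataset with more overlapping visits would be expected to show.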

            lguy Leanne Guy added a comment -

            Good work Jeff. I'm glad to see this is not a bug and just a limitation of the dataset.

            jcarlin Jeffrey Carlin added a comment -

            Slight amendment to the statements above – `validation_data_hsc` is typically run through coaddition, but the script that we're using to convert from gen2to3 is only running singleFrame processing. But even if coadds were available, they cover small regions, so the TE1 and stellar locus metrics may not be possible. And there are still too few overlapping visits for matched-visit metrics.

            I'd like to explore a little more to confirm that this is true, and maybe update DMTN-091 with more details. Unless Simon Krughoff already knows the number of visits, how much overlap, etc.?

            krughoff Simon Krughoff added a comment -

            I want to comment here that we are currently working on having lsst.verify drop non-finite values when the JSON job document is created. This should allow us to keep the values in the repository without running into problems when uploading to SQuaSH.

            afausti Angelo Fausti added a comment -

            This is now fixed via DM-31131. We decided to preserve the NaN values and represent them as null in the JSON job document. In the SQuaSH database they are stored as SQLAlchemy NULL, and when writing to InfluxDB the recommended approach is to drop the NaN values.
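
A minimal sketch of the two treatments described above, using only the Python standard library. It does not reproduce the lsst.verify or SQuaSH code: the helper names `nan_to_null` and `drop_non_finite` are hypothetical, and the AM1 value is made up for illustration. The first helper keeps NaN measurements but serializes them as JSON null; the second drops them, as recommended before writing to InfluxDB.

```python
import json
import math


def nan_to_null(measurements):
    """Replace non-finite floats with None so json.dumps emits null."""
    return {
        name: (value if math.isfinite(value) else None)
        for name, value in measurements.items()
    }


def drop_non_finite(measurements):
    """Keep only finite values, e.g. before writing to a time-series store."""
    return {
        name: value
        for name, value in measurements.items()
        if math.isfinite(value)
    }


# Illustrative values only; metric names are taken from the discussion above.
measurements = {"AM1": 5.2, "PA1": float("nan"), "TE1": float("nan")}

# allow_nan=False makes json.dumps raise on any NaN that slips through,
# instead of emitting the non-standard token NaN.
job_document = json.dumps(nan_to_null(measurements), allow_nan=False)
influx_fields = drop_non_finite(measurements)

print(job_document)
print(influx_fields)
```

Passing `allow_nan=False` is a defensive choice in this sketch: by default `json.dumps` writes `NaN`, which is not valid JSON and can be rejected by strict parsers downstream.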

            jcarlin Jeffrey Carlin added a comment -

            Closed as Done – the changes in DM-31131 fix the issue. Thanks, Angelo!


              People

              Assignee:
              jcarlin Jeffrey Carlin
              Reporter:
              lguy Leanne Guy
              Watchers:
              Angelo Fausti, Jeffrey Carlin, Leanne Guy, Simon Krughoff