Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-33892

Create matched difference faro metrics for DC2

    XMLWordPrintable

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: faro, pipe_tasks
    • Labels:
      None
    • Story Points:
      15
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      Implement metrics and the tasks to generate them using aggregated statistics from matches to truth (DM-33221).

        Attachments

          Issue Links

            Activity

            Hide
            dtaranu Dan Taranu added a comment -

            Thanks, Keith. Just to clarify, the magnitude bin indices match the upper (bright) limit of the selection, i.e. mag15 means 15<=mag<16. They are sequential integer indices just because I chose 1 mag steps. Having said that, they could be redefined in the future if there's a compelling reason to do so (e.g. if popular opinion votes for 14.5<mag<15.5 as the least-surprising definition of '15th magnitude sources'), and sequential indices are generic enough that they'd work for other kinds of selections.

            Show
            dtaranu Dan Taranu added a comment - Thanks, Keith. Just to clarify, the magnitude bin indices match the upper (bright) limit of the selection, i.e. mag15 means 15<=mag<16. They are sequential integer indices just because I chose 1 mag steps. Having said that, they could be redefined in the future if there's a compelling reason to do so (e.g. if popular opinion votes for 14.5<mag<15.5 as the least-surprising definition of '15th magnitude sources'), and sequential indices are generic enough that they'd work for other kinds of selections.
            Hide
            dtaranu Dan Taranu added a comment - - edited

            To test, I added the pipeline_full.yaml to $FARO_DIR/pipelines/measurement/measurement_tract_matched.yaml, lacking a better name. I then added it to step3 in imsim's DRP.yaml in drp_pipe. While Jenkins passed on MacOS, it timed out on CentOS with lots of sqlite errors, presumably because it was trying to put a lot of metric outputs in a short time frame (each metric task takes <0.5s). I'm going to drop the drp_pipe PR and defer it to DM-34880.

            I do still want to add the pipeline (I'm planning to generate the metrics myself and submit them for DC2 reruns until DM-34880 is resolved), and also I realized I forgot to add a unit test, so I added one as well. Keith Bechtol, could you have a quick look at those?

            Show
            dtaranu Dan Taranu added a comment - - edited To test, I added the pipeline_full.yaml to $FARO_DIR/pipelines/measurement/measurement_tract_matched.yaml , lacking a better name. I then added it to step3 in imsim's DRP.yaml in drp_pipe . While Jenkins passed on MacOS, it timed out on CentOS with lots of sqlite errors, presumably because it was trying to put a lot of metric outputs in a short time frame (each metric task takes <0.5s). I'm going to drop the drp_pipe PR and defer it to DM-34880 . I do still want to add the pipeline (I'm planning to generate the metrics myself and submit them for DC2 reruns until DM-34880 is resolved), and also I realized I forgot to add a unit test, so I added one as well. Keith Bechtol , could you have a quick look at those?
            Hide
            dtaranu Dan Taranu added a comment - - edited

            One hopefully final note - as I was adding the unit test, I realized that I could probably split the TractTableValueMeasurementTask I added into a measurement subtask (in TractTableMeasurementTasks.py) and call it with the existing TractTableMeasurementTask. The upside to this is that you could use the TractTableValueMeasurementTask with other tasks having different dimensions. These don't exist yet, but there's no reason why summary statistics couldn't be generated for dimensions other than tracts.

            I don't think it would be a completely trivial change - there's probably a reason I didn't do this in the first place that I'm forgetting about - so I'm just noting this here for now and deferring filing a ticket for it until there's a concrete use case (or in case someone wonders why they're separate tasks).

            Show
            dtaranu Dan Taranu added a comment - - edited One hopefully final note - as I was adding the unit test, I realized that I could probably split the TractTableValueMeasurementTask I added into a measurement subtask (in TractTableMeasurementTasks.py ) and call it with the existing TractTableMeasurementTask . The upside to this is that you could use the TractTableValueMeasurementTask with other tasks having different dimensions. These don't exist yet, but there's no reason why summary statistics couldn't be generated for dimensions other than tracts. I don't think it would be a completely trivial change - there's probably a reason I didn't do this in the first place that I'm forgetting about - so I'm just noting this here for now and deferring filing a ticket for it until there's a concrete use case (or in case someone wonders why they're separate tasks).
            Hide
            kbechtol Keith Bechtol added a comment -

            Thank you for clarifying the point about the magnitude bin naming conventions.

            $FARO_DIR/pipelines/measurement/measurement_tract_matched.yaml looks good to me. I had looked at the pipeline files previously to understand how TractTableValueMeasurementTask would be invoked.

            Thank you for adding the unit test. I should have suggested that earlier.

            Show
            kbechtol Keith Bechtol added a comment - Thank you for clarifying the point about the magnitude bin naming conventions. $FARO_DIR/pipelines/measurement/measurement_tract_matched.yaml looks good to me. I had looked at the pipeline files previously to understand how TractTableValueMeasurementTask would be invoked. Thank you for adding the unit test. I should have suggested that earlier.
            Hide
            dtaranu Dan Taranu added a comment -

            Merged after passing Jenkins again, this time with no sqlite errors. It's kind of annoying that the problem seems to be intermittent but I'll still leave the pipeline changes for DM-34880 until it's clear that they won't pop up again randomly.

            Show
            dtaranu Dan Taranu added a comment - Merged after passing Jenkins again , this time with no sqlite errors. It's kind of annoying that the problem seems to be intermittent but I'll still leave the pipeline changes for DM-34880 until it's clear that they won't pop up again randomly.

              People

              Assignee:
              dtaranu Dan Taranu
              Reporter:
              dtaranu Dan Taranu
              Reviewers:
              Jeffrey Carlin, Keith Bechtol
              Watchers:
              Dan Taranu, Jeffrey Carlin, Keith Bechtol
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.