  1. Data Management
  2. DM-5945

Implement validate_drp static plots in Bokeh as proof-of-concept for SQUASH

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      This ticket will implement a plot from validate_drp in the QA Dashboard as a proof-of-concept for how existing matplotlib plots can be re-implemented in Bokeh with data from the QA database.

      Stretch goals (maybe for a future ticket) will be to overplot the validate_drp output of one job against another’s to understand performance changes.

            Activity

            jsick Jonathan Sick created issue -
            jsick Jonathan Sick made changes -
            Field Original Value New Value
            Epic Link DM-5555 [ 23391 ]
            jsick Jonathan Sick made changes -
            Status To Do [ 10001 ] In Progress [ 3 ]
            jsick Jonathan Sick made changes -
            Story Points 0.5
            jsick Jonathan Sick made changes -
            Story Points 0.5 1.4
            jsick Jonathan Sick made changes -
            Story Points 1.4 1.9
            jhoblitt Joshua Hoblitt made changes -
            Labels squash
            jsick Jonathan Sick made changes -
            Epic Link DM-5555 [ 23391 ] DM-6196 [ 24712 ]
            afausti Angelo Fausti added a comment -

            Jonathan Sick, this could be interesting for serializing plot information in validate_drp: there is an open Bokeh issue about supporting JSON data structures that follow the Vega-Lite specification.
            https://github.com/bokeh/bokeh/issues/4844

            jsick Jonathan Sick added a comment -

            That's really interesting! We could add a class, somewhat like Blobs in the currently proposed measurement API, that describes plots as Vega-Lite JSON. That way we could replace the matplotlib code in validate_drp with Vega-Lite that can be rendered both client-side from validate_drp and server-side with Bokeh. I guess we could create a ticket to start making and shipping Vega-Lite to SQUASH now, and then switch the detailed data plots to it once Bokeh is able to render Vega-Lite.
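
As a rough illustration of the idea, a plot description could be carried as plain JSON following the Vega-Lite layout. The sketch below is only an assumption about how such a class might look: `PlotSpec`, its fields, the sample rows, and the choice of the v5 schema URL are all hypothetical and are not part of validate_drp or the measurement API.

```python
import json
from dataclasses import dataclass, field


@dataclass
class PlotSpec:
    """Hypothetical container, analogous to a blob, holding one plot's data."""

    identifier: str
    name: str
    values: list = field(default_factory=list)  # rows of plot data

    def to_vega_lite(self, x, y):
        """Return a Vega-Lite scatter-plot specification as a plain dict."""
        return {
            "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
            "description": self.name,
            "data": {"values": self.values},
            "mark": "point",
            "encoding": {
                "x": {"field": x, "type": "quantitative"},
                "y": {"field": y, "type": "quantitative"},
            },
        }


# Made-up sample rows; the field names are illustrative only.
spec = PlotSpec(
    identifier="plot-0001",
    name="Astrometric residual vs. magnitude (illustrative)",
    values=[{"mag": 18.2, "dist": 6.1}, {"mag": 20.5, "dist": 9.8}],
)
payload = json.dumps(spec.to_vega_lite(x="mag", y="dist"), indent=2)
print(payload)
```

Because the payload is ordinary JSON, it could be shipped alongside the measurements and rendered by any client that understands Vega-Lite.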

            afausti Angelo Fausti added a comment -

            I agree. We can start by producing some of the validate_drp static plots following the Vega-Lite specification.

            jsick Jonathan Sick made changes -
            Assignee Jonathan Sick [ jsick ] Angelo Fausti [ afausti ]
            afausti Angelo Fausti made changes -
            Summary "Implement validate_drp plot in Bokeh as proof-of-concept for QA Dashboard" -> "Implement validate_drp static plots in Bokeh as proof-of-concept for QA Dashboard"
            afausti Angelo Fausti made changes -
            Link This issue is child task of DM-7441 [ DM-7441 ]
            afausti Angelo Fausti made changes -
            Summary "Implement validate_drp static plots in Bokeh as proof-of-concept for QA Dashboard" -> "Implement validate_drp static plots in Bokeh as proof-of-concept for SQUASH"
            afausti Angelo Fausti made changes -
            Story Points 1.9 10
            afausti Angelo Fausti added a comment - - edited

            Changing the Bokeh app to the directory format, which makes it convenient to handle multiple apps:
            http://bokeh.pydata.org/en/latest/docs/user_guide/server.html#directory-format

            The proposed reorganization has three independent apps served by the same Bokeh server: astrometry plots, photometry plots, and regression testing.

             
            astrometry/
               main.py
               theme.yaml
               
            photometry/
               main.py
               theme.yaml
             
            regression/  (old metrics app)
               main.py
               theme.yaml
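
A minimal sketch of how one server could pick up all three apps, assuming the layout above: `bokeh serve` accepts several application directories on one command line, and each app is then served under its directory name. The helper names `find_bokeh_apps` and `serve_command` are hypothetical and exist only for illustration.

```python
from pathlib import Path


def find_bokeh_apps(root):
    """Return the sorted names of subdirectories of root that contain a main.py."""
    return sorted(
        p.name
        for p in Path(root).iterdir()
        if p.is_dir() and (p / "main.py").is_file()
    )


def serve_command(root):
    """Build the argument list that serves every app under root from one server.

    The app names are relative to root, so the command is meant to be run
    with root as the working directory.
    """
    return ["bokeh", "serve", *find_bokeh_apps(root)]
```

With the directory layout shown above, `serve_command(".")` would yield `bokeh serve astrometry photometry regression`, which matches the `/astrometry`, `/photometry`, and `/regression` paths used later in this ticket.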
             
            
            

            jsick Jonathan Sick added a comment - - edited

            In the final JSON sample for the full REST JSON, blobs is an object/dict. Do you want the keys of this dict to be the blob identifier? That would make it easy to look up a blob from a measurement.

            If so, the actions that post-qa needs to take to shim its native format to the SQUASH format are:

            1. Convert the blobs array to an object keyed by identifier.
            2. Convert the measurements array to an array of objects with two fields: 1) the metric name, and 2) an array of the corresponding measurements.

            Thinking about the last one, it may make more sense to simply make measurements an object keyed by metric name:

            {
              "measurements": {
                "AM1": [], <- array of measurement objects
                ...
              }
            }
            

            Moving discussion to DM-7043.
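
The shim described in this comment could look roughly like the sketch below. It implements the alternative layout, with measurements as an object keyed by metric name. The function name `shim_to_squash` is hypothetical, and the code assumes that every blob carries an "identifier" key and every measurement a "metric" key.

```python
from collections import defaultdict


def shim_to_squash(job):
    """Reshape a job dict: blobs keyed by identifier, measurements keyed by metric.

    Assumes each blob has an "identifier" key and each measurement has a
    "metric" key. The input dict is left unmodified.
    """
    # Step 1: blobs array -> object keyed by identifier.
    blobs = {blob["identifier"]: blob for blob in job.get("blobs", [])}

    # Step 2: measurements array -> object keyed by metric name, where each
    # value is the list of measurements recorded for that metric.
    by_metric = defaultdict(list)
    for measurement in job.get("measurements", []):
        by_metric[measurement["metric"]].append(measurement)

    shimmed = dict(job)
    shimmed["blobs"] = blobs
    shimmed["measurements"] = dict(by_metric)
    return shimmed
```

Keying by metric name keeps several measurements of the same metric (for example, one per filter) together in one list.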

            afausti Angelo Fausti made changes -
            Comment [ @jsick currently the job JSON has a structure like this

            {code:java}

             "measurements": [
                            {
                                "metric": "AM1",
                                "value": 7.15136555363356
                            },
                            {
                                "metric": "AM2",
                                "value": 6.80681963522785
                            },
                            {
                                "metric": "PA1",
                                "value": 14.9064428565398
                            }
                        ]


            {code}
            I imagine replacing the scalar measurement with the new measurement JSON:


            {code:java}
            data['measurements'][0].keys()

            [u'blobs',
             u'parameters',
             u'metric',
             u'value',
             u'extras',
             u'spec_name',
             u'filter_name',
             u'identifier',
             u'unit']

            {code}

            For measurements that are made in different filters, or that depend on the filter, we can have multiple measurements for
             the same metric. That means we should have a list of measurements for each metric, e.g.:

            {code:java}

            "measurements": [
                            {
                                "metric": "AM1",
                    ---> "measurement": [] <---
                            },
                            {
                                "metric": "AM2",
                                "measurement": []
                            },
                            {
                                "metric": "PA1",
                                "measurement": []
                            }
                        ]


            {code}
            For the datasets produced for each job we also need the blob JSON, for example:

            {code:java}

            data['blobs'][0].keys()

            [u'identifier', u'data', u'name']


            {code}


            {code:java}

            {
                        "ci_id": "2",
                        "ci_name": "demo",
                        "ci_dataset": "cfht",
                        "ci_label": "centos-7",
                        "date": "2016-06-02T05:21:57.298935Z",
                        "ci_url": "https://ci.lsst.codes/job/validate_drp/dataset=cfht,label=centos-7/2/",
                        "status": 0,
               ---> "blobs": {}, <---
                        "measurements": [
                            {
                                "metric": "AM1",
                                "measurement": [ ]
                            },
                            {
                                "metric": "AM2",
                                "measurement": [ ]
                            },
                            {
                                "metric": "PA1",
                                "measurement": [ ]
                            }
                        ],


            {code}
            If that sounds reasonable, I can mock it to continue development.

            The important thing for me now is to be able to retrieve the measurement JSON from the SQUASH API given the ci_id,
            ci_dataset, and metric, and then call a URL to load the corresponding Bokeh app:


            {code:java}
            https://angelo-squash-bokeh.lsst.codes/photometry?metric=PA1&ci_dataset=cfht&ci_id=1

            {code}
            ]
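
A small sketch of how such app URLs could be built and read back, using only the Python standard library. The helper names `app_url` and `parse_app_url` are hypothetical. The host and the query parameter names are taken from the example URL quoted in this ticket.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Test environment host named in this ticket.
BASE_URL = "https://angelo-squash-bokeh.lsst.codes"


def app_url(app, metric, ci_dataset, ci_id):
    """Build the URL that loads a Bokeh app for one metric, dataset, and job."""
    query = urlencode({"metric": metric, "ci_dataset": ci_dataset, "ci_id": ci_id})
    return f"{BASE_URL}/{app}?{query}"


def parse_app_url(url):
    """Return (app name, query parameters) recovered from an app URL."""
    parts = urlsplit(url)
    # parse_qs returns a list per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.path.strip("/"), params


url = app_url("photometry", "PA1", "cfht", 1)
print(url)
```

Note that query values come back as strings, so `ci_id` would need converting if an integer is required.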
            afausti Angelo Fausti added a comment -

            In order to extend SQUASH drill-down capabilities, we implemented the validate_drp static plots in Bokeh.

            They are available in my test environment:

            https://angelo-squash-bokeh.lsst.codes/astrometry
            https://angelo-squash-bokeh.lsst.codes/photometry

            Note that the parameters metric, dataset, and job ID are fixed in those examples.

            In DM-8478 we will continue this implementation by connecting those plots with the measurements in the regression app:

            https://angelo-squash-bokeh.lsst.codes/regression

            afausti Angelo Fausti made changes -
            Resolution Done [ 10000 ]
            Status In Progress [ 3 ] Done [ 10002 ]

              People

              Assignee:
              afausti Angelo Fausti
              Reporter:
              jsick Jonathan Sick
              Watchers:
              Angelo Fausti, Jonathan Sick
