Data Management / DM-13179

Handling nan values in verify JSON outputs

    Details

    • Type: Story
    • Status: To Do
    • Resolution: Unresolved
    • Fix Version/s: None
    • Component/s: squash, verify
    • Labels: None

      Description

      Jonathan Sick, Simon Krughoff: while testing dispatch_verify.py with the Flask-based SQuaSH RESTful API, I noticed some nan values in the verify output measurements and blobs.

      They are present, for example, in the test data I have in the squash-rest-api repository, and the problem can be reproduced using the example notebook at:

      https://github.com/lsst-sqre/squash-rest-api/blob/master/tests/test_api.ipynb

      According to the JSON specification, RFC 4627, section 2.4 (https://tools.ietf.org/html/rfc4627#section-2.4),

      "Numeric values that cannot be represented as sequences of digits
         (such as Infinity and NaN) are not permitted"
      

      and in fact MySQL rejects the nan values in data blobs that get stored in a MySQL JSON field, which surfaces through SQLAlchemy as an exception. Here's an example of the error message in that case:

       
      sqlalchemy.exc.InvalidRequestError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (pymysql.err.InternalError) (3140, 'Invalid JSON text: "Invalid value." at position 388 in value for column \'blob.data\'.') [SQL: 'INSERT INTO `blob` (identifier, name, data, job_id) VALUES (%(identifier)s, %(name)s, %(data)s, %(job_id)s)'] [parameters: {'identifier': 'aa2d119283944b99a56841c48a6d5967', 'name': 'validate_drp.AF3_minimum', 'data': '{"annulus": {"description": "Inner and outer radii of selection annulus.", "unit": "arcmin", "value": [199.0, 201.0], "label": "annulus radii"}, "rms ... (464 characters truncated) ... e": [17.0, 21.5], "label": null}, "D": {"description": "Radial distance of annulus (arcmin)", "unit": "arcmin", "value": 200.0, "label": "Distance"}}', 'job_id': 2}] (Background on this error at: http://sqlalche.me/e/2j85)
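
      One possible way such values reach the output: Python's standard json module writes non-finite floats as bare NaN/Infinity tokens unless told otherwise. The sketch below only illustrates that behaviour; that the verify outputs are serialized this way is an assumption, and the blob content is made up for illustration.

```python
import json

# Hypothetical blob fragment containing a non-finite value.
blob = {"value": float("nan")}

# Default behaviour: the non-finite float is written as the bare token NaN,
# which is not valid JSON according to the RFC.
lenient = json.dumps(blob)

# Strict behaviour: allow_nan=False raises ValueError instead of emitting NaN.
try:
    json.dumps(blob, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc
```

      Serializing with allow_nan=False would at least make the problem fail early, at write time, rather than at database insert time.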
      
      

      Currently I work around this in SQuaSH, but the workaround is ugly:

      https://github.com/lsst-sqre/squash-rest-api/blob/master/app/models.py#L364

      and

      https://github.com/lsst-sqre/squash-rest-api/blob/master/app/models.py#L413

      Could we solve this up front to make sure that we don't have any nan or Infinity values in the verify outputs?

      What's your suggestion for that?

      Perhaps nan could be replaced by null in the verify outputs; I could then modify the value field (float) in the measurement table to accept MySQL null values. Data blobs with null values should not cause any problem, since they are stored in a MySQL JSON field.
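
      A minimal sketch of the nan-to-null replacement, assuming the output is available as nested dicts and lists before serialization. The function name replace_non_finite and the example_quantity entry are hypothetical and not part of verify; only the "D" entry mirrors the blob shown in the error message above.

```python
import json
import math


def replace_non_finite(obj):
    """Recursively replace NaN and +/-Infinity floats with None."""
    if isinstance(obj, float) and not math.isfinite(obj):
        return None
    if isinstance(obj, dict):
        return {key: replace_non_finite(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [replace_non_finite(item) for item in obj]
    return obj


# Hypothetical blob: "example_quantity" is invented for illustration.
blob = {
    "example_quantity": {"unit": "arcmin", "value": [float("nan"), 12.5]},
    "D": {"unit": "arcmin", "value": 200.0},
}

# allow_nan=False guarantees that no NaN/Infinity token slips through.
clean_text = json.dumps(replace_non_finite(blob), allow_nan=False)
```

      With this approach nan and Infinity both become null in the serialized text, so a consumer can no longer tell them apart; whether that loss matters is a design choice for the verify outputs.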


            People

            • Assignee: Angelo Fausti (afausti)
            • Reporter: Angelo Fausti (afausti)
            • Watchers: Angelo Fausti, Jonathan Sick, Krzysztof Findeisen, Simon Krughoff