Data Management / DM-27454

ap_verify unable to push to SQuaSH


Details

    • Bug
    • Status: Won't Fix
    • Resolution: Done
    • None
    • squash
    • None

    Description

      The ap_verify runs on 2020-11-06 for both cosmos_pdr2-master^gen2 and hits2015-master^gen2 were unable to dispatch their metrics to SQuaSH. (cosmos_pdr2-master^gen3 doesn't push yet.) The uploads failed with:

      500 Server Error: Internal Server Error for url: https://squash-restful-api.lsst.codes/

        Activity

          ktl Kian-Tat Lim added a comment -

          Similar problems with 500 errors occurred on 2021-02-23 and 2021-04-26.

          ktl Kian-Tat Lim added a comment - - edited

          A different problem on 2021-07-08, this time in validate_drp rather than ap_verify:

          verify.squash.post ERROR: HTTPSConnectionPool(host='squash-restful-api.lsst.codes', port=443): Read timed out. (read timeout=900.0)
          

          This went away on retry.
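
Since the timeout cleared on a manual retry, a bounded automatic retry around the upload step is one possible mitigation. The following is a minimal sketch only, not the code used by ap_verify, validate_drp, or the SQuaSH client. The names `TransientUploadError` and `upload_with_retry` are hypothetical, and `send` stands in for whatever callable actually performs the upload.

```python
import time


class TransientUploadError(Exception):
    """Hypothetical marker for failures worth retrying (HTTP 5xx, read timeout)."""


def upload_with_retry(send, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or the attempts run out.

    send is any zero-argument callable that raises TransientUploadError
    on a retryable failure; it is a stand-in for the real upload call.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except TransientUploadError:
            if attempt == attempts:
                raise  # out of retries: surface the last failure
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A caller would wrap the real upload so that only server-side errors and timeouts are converted to `TransientUploadError`. Client-side errors such as bad credentials should fail immediately rather than be retried.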

          tjenness Tim Jenness added a comment -

          Can this ticket be closed?


          ktl Kian-Tat Lim added a comment -

          We have still seen occasional SQuaSH failures, but we can close this and open a new one if we see any problems with Sasquatch.

          ktl Kian-Tat Lim added a comment -

          Each of the individual push failures was fixed, but the overall causes (OOM, disk failure, other node problems) may persist. Nevertheless, closing as SQuaSH has become Sasquatch.

          People

            afausti Angelo Fausti
            ktl Kian-Tat Lim
            Angelo Fausti, Kian-Tat Lim, Tim Jenness