Data Management / DM-31956

mergeExecutionButler task hits database deadlock intermittently


    Details

    • Story Points:
      1
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      When running a pipeline (at least through bps), the final mergeExecutionButler step sometimes fails due to a database deadlock. A typical error looks like this:

      lsst.daf.butler._butler INFO: Transferring 6635 datasets into Butler(collections=[], run=None, datastore='file:///repo/main/', registry='PostgreSQL@lsstdb1:main_20210215')
      lsst.daf.butler.cli.utils ERROR: Caught an exception, details are in traceback:
      Traceback (most recent call last):
        File "/software/lsstsw/stack_20210813/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1771, in _execute_context
          self.dialect.do_execute(
        File "/software/lsstsw/stack_20210813/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 717, in do_execute
          cursor.execute(statement, parameters)
      psycopg2.errors.DeadlockDetected: deadlock detected
      DETAIL:  Process 13613 waits for ExclusiveLock on relation 203222 of database 16384; blocked by process 13779.
      Process 13779 waits for RowShareLock on relation 202991 of database 16384; blocked by process 13613.
      HINT:  See server log for query details.

      I tried again a couple of times and got similar errors before it finally succeeded on the fourth try without my changing anything. I guess whatever was causing the deadlock had cleared up by then. I do note, however, that in each of the three failures the relation numbers in the error messages were the same, though of course the process numbers were different. That is, the errors all said

      Process XXXXX waits for ExclusiveLock on relation 203222 of database 16384; blocked by process YYYYY.
      ...
      Process YYYYY waits for RowShareLock on relation 202991 of database 16384; blocked by process XXXXX.

      So XXXXX and YYYYY were naturally different in each job, but 203222 and 202991 were always the same, in case that helps.
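
      If it helps with the diagnosis, the relation OIDs in those messages can be mapped to table names by querying pg_catalog.pg_class on the registry database. A minimal sketch, assuming read access to the registry; the connection URL below is a placeholder:

      # Sketch: translate the relation OIDs from the deadlock message into table names.
      # The connection URL is a placeholder; point it at the actual registry database.
      import sqlalchemy

      engine = sqlalchemy.create_engine("postgresql://user@lsstdb1/registry_db")
      with engine.connect() as conn:
          rows = conn.execute(
              sqlalchemy.text("SELECT oid, relname FROM pg_class WHERE oid IN (203222, 202991)")
          )
          for oid, relname in rows:
              print(oid, relname)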

        Attachments

          Issue Links

            Activity

            jbosch Jim Bosch added a comment -

            Andy Salnikov, this is a tiny change, but because it's a subtle concurrency-related change, I'm hoping you can actually take a look at Database.sync in its entirety, and see if you agree with my conclusion that the outcome of a race condition that occurs before, after, or during it but within the same (possibly outer) transaction is either:

            • one process "winning"
            • an exception

            i.e. not any kind of silently bad write.

            PR is here: https://github.com/lsst/daf_butler/pull/579

            I'm going to repurpose DM-27224 for fixing the other rarer deadlocks noted in my last post; I'll comment there with my ideas.

            salnikov Andy Salnikov added a comment -

            I think dropping the table lock is the right thing to do here. I am not 100% sure that this solves all issues, though; my reading of everything makes me think that a lot of it depends on transaction isolation, and if that is not handled consistently we may still get into trouble. One note for sync() is that there are a couple of other ways to implement it:

            • SELECT ... FOR UPDATE first to put a lock on the record (or its index entry), with the INSERT/UPDATE after that (a rough sketch follows this list)
            • SELECT followed by INSERT/UPDATE+COMMIT, and if that fails, another SELECT; this obviously needs to be done in a separate transaction.
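
            A minimal sketch of the first alternative, assuming a hypothetical key/value table sync_demo(name TEXT PRIMARY KEY, value TEXT); the table, columns, and connection URL are placeholders rather than the actual daf_butler schema:

            # Sketch of the SELECT ... FOR UPDATE variant; illustrative only.
            import sqlalchemy

            engine = sqlalchemy.create_engine("postgresql://user@lsstdb1/registry_db")

            def sync_row(name, value):
                with engine.begin() as conn:  # one transaction, committed on exit
                    row = conn.execute(
                        sqlalchemy.text("SELECT value FROM sync_demo WHERE name = :name FOR UPDATE"),
                        {"name": name},
                    ).fetchone()
                    if row is None:
                        # No existing row: insert it.  A concurrent insert of the same
                        # key surfaces as an IntegrityError, not a silent bad write.
                        conn.execute(
                            sqlalchemy.text("INSERT INTO sync_demo (name, value) VALUES (:name, :value)"),
                            {"name": name, "value": value},
                        )
                    elif row[0] != value:
                        # Row exists with a different value; the row lock taken by
                        # FOR UPDATE is held until commit, so it cannot change underneath us.
                        conn.execute(
                            sqlalchemy.text("UPDATE sync_demo SET value = :value WHERE name = :name"),
                            {"name": name, "value": value},
                        )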
            salnikov Andy Salnikov added a comment - - edited

            Just one more comment - removing the table lock should greatly reduce the chance of a deadlock, but it does not eliminate it completely. When doing INSERT/UPDATE, some portion of the table (or its index) still needs to be locked (an existing read lock is promoted to exclusive). If two concurrent clients want to insert/update the same or neighboring rows, we may still end up in a deadlock.
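
            Since any remaining deadlocks are transient, one generic mitigation (not the fix adopted on this ticket) is a client-side retry when PostgreSQL aborts the transaction. A hedged sketch:

            # Sketch: retry a transaction when PostgreSQL reports a deadlock.
            # `connect` is any callable returning a fresh psycopg2 connection;
            # `work` runs the actual statements against the cursor it is given.
            import time
            import psycopg2.errors

            def run_with_retry(connect, work, attempts=4, delay=1.0):
                for attempt in range(attempts):
                    conn = connect()
                    try:
                        with conn:  # commits on success, rolls back on exception
                            with conn.cursor() as cur:
                                return work(cur)
                    except psycopg2.errors.DeadlockDetected:
                        if attempt == attempts - 1:
                            raise
                        time.sleep(delay)  # back off before retrying
                    finally:
                        conn.close()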

            jbosch Jim Bosch added a comment -

            I do think this logic should be safe enough under any isolation level (other than READ UNCOMMITTED, which isn't really useful for anything), but here are the caveats I'm aware of:

            • With READ COMMITTED (the default in PostgreSQL, and what we are using there), using sync at the start of a transaction doesn't mean its guarantees persist to the end of the transaction, as code calling sync often assumes (e.g. sync an "instrument" record, then follow up by inserting "detector" records assuming the instrument hasn't changed).  But I think that's just the usual guarantee of READ COMMITTED, and our exposure to actual problems here is pretty small, because foreign keys will prevent any remotely probable invalid inserts.
            • With REPEATABLE READ or SERIALIZABLE isolation, sync's guarantees do persist until the end of the current transaction, but one can get serialization errors from the database when conflicting concurrent writes occur. That problem can already happen with READ COMMITTED if the concurrent write happens between the INSERT ... ON CONFLICT IGNORE and SELECT run by sync, and if it does we raise RuntimeError (or, as you noted, another deadlock exception).  With the stronger-guarantee isolation levels, the type of exception raised would change to something low-level emitted by SQLAlchemy, and it would also be possible to see that after sync ends but before its transaction is committed.
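
            To make that window concrete, here is a hedged sketch of the insert-then-select sequence described in the bullet above, again against a hypothetical sync_demo(name TEXT PRIMARY KEY, value TEXT) table rather than the real registry schema (PostgreSQL spells the conflict clause ON CONFLICT ... DO NOTHING):

            # Illustrative only; the real Database.sync is more general than this.
            import sqlalchemy

            engine = sqlalchemy.create_engine("postgresql://user@lsstdb1/registry_db")

            def sync_row(name, value):
                with engine.begin() as conn:
                    conn.execute(
                        sqlalchemy.text(
                            "INSERT INTO sync_demo (name, value) VALUES (:name, :value) "
                            "ON CONFLICT (name) DO NOTHING"
                        ),
                        {"name": name, "value": value},
                    )
                    # Under READ COMMITTED, a conflicting concurrent write that commits
                    # between the INSERT and this SELECT shows up here and is turned
                    # into an exception rather than a silent bad write.
                    row = conn.execute(
                        sqlalchemy.text("SELECT value FROM sync_demo WHERE name = :name"),
                        {"name": name},
                    ).fetchone()
                    if row is None or row[0] != value:
                        raise RuntimeError(f"Conflicting write detected for {name!r}")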

            I think it might be worth trying to use SERIALIZABLE isolation to resolve the first point in the future; there are two challenges:

            • if it yields database exceptions for possibly conflicting writes that don't actually conflict (especially those INSERT ... ON CONFLICT IGNORE statements we rely on), we'd need to do retries, and that's a big restructuring (this was the original idea for DM-27224)
            • we can't change the isolation level after a transaction has begun, so we'd need to either make it illegal to use sync inside a transaction again, use SERIALIZABLE isolation by default, or demand that higher-level code (e.g. in obs_base) explicitly mark transactions that need to be started with SERIALIZABLE (a configuration sketch follows this list).
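
            For reference, a minimal sketch of choosing the isolation level up front with SQLAlchemy, either engine-wide or per connection; the URL is a placeholder and this is not how daf_butler configures its engines today:

            # The isolation level has to be selected before the transaction starts.
            import sqlalchemy

            # Engine-wide default:
            engine = sqlalchemy.create_engine(
                "postgresql://user@lsstdb1/registry_db",
                isolation_level="SERIALIZABLE",
            )

            # Or per connection, before beginning the transaction:
            with engine.connect().execution_options(isolation_level="SERIALIZABLE") as conn:
                with conn.begin():
                    ...  # sync() and the rest of the writes would run here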

            I'll note on DM-27224 that there are comments here that should be taken into account (your comments and this last one by me, at least) when deciding what to do in the long term, and merge this fix now.

            yusra Yusra AlSayyad added a comment -

            Marking preops because it was backported for DP0.2


              People

              Assignee:
              jbosch Jim Bosch
              Reporter:
              kherner Kenneth Herner
              Reviewers:
              Andy Salnikov
              Watchers:
              Andy Salnikov, Eli Rykoff, Ian Sullivan, Jim Bosch, Kenneth Herner, Meredith Rawls, Tim Jenness, Yusra AlSayyad
              Votes:
              0
              Watchers:
              8

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.