
#### Details

• Type: Bug
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s: None
• Labels: None
• Story Points: 1
• Team: Data Release Production
• Urgent?: No

#### Description

When running a pipeline (at least through bps) the final mergeExecutionButler step will sometimes fail due to a database deadlock. A typical error looks like this:

    lsst.daf.butler._butler INFO: Transferring 6635 datasets into Butler(collections=[], run=None, datastore='file:///repo/main/', registry='PostgreSQL@lsstdb1:main_20210215')
    lsst.daf.butler.cli.utils ERROR: Caught an exception, details are in traceback:
    Traceback (most recent call last):
      File "/software/lsstsw/stack_20210813/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1771, in _execute_context
        self.dialect.do_execute(
      File "/software/lsstsw/stack_20210813/conda/miniconda3-py38_4.9.2/envs/lsst-scipipe/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 717, in do_execute
        cursor.execute(statement, parameters)
    psycopg2.errors.DeadlockDetected: deadlock detected
    DETAIL:  Process 13613 waits for ExclusiveLock on relation 203222 of database 16384; blocked by process 13779.
    Process 13779 waits for RowShareLock on relation 202991 of database 16384; blocked by process 13613.
    HINT:  See server log for query details.

I tried again a couple of times and got similar errors before it finally succeeded on the fourth try, without my changing anything. I guess whatever was causing the deadlock had cleared up by then. I do note, however, that in each of the three failures the relation numbers in the error messages were the same, though of course the process numbers were different. That is, the errors all said

    Process XXXXX waits for ExclusiveLock on relation 203222 of database 16384; blocked by process YYYYY.
    Process YYYYY waits for RowShareLock on relation 202991 of database 16384; blocked by process XXXXX.

So XXXXX and YYYYY were naturally different in each job, but 203222 and 202991 were always the same, in case that helps.

#### Activity

Jim Bosch added a comment -

Andy Salnikov, this is a tiny change, but because it's a subtle concurrency-related change, I'm hoping you can actually take a look at Database.sync in its entirety, and see if you agree with my conclusion that the outcome of a race condition that occurs before, after, or during it but within the same (possibly outer) transaction is either:

• one process "winning"
• an exception

i.e. not any kind of silently bad write.

PR is here: https://github.com/lsst/daf_butler/pull/579

I'm going to repurpose DM-27224 for fixing the other rarer deadlocks noted in my last post; I'll comment there with my ideas.

Andy Salnikov added a comment -

I think dropping the table lock is the right thing to do here. I am not 100% sure that this solves all of the issues, though; my reading of everything makes me think that a lot of it depends on the transaction isolation level, and if that is not handled consistently we may still get into trouble. One note for sync() is that there are a couple of other ways to implement it:

• SELECT ... FOR UPDATE first, to put a lock on a record (or index), then INSERT/UPDATE after that
• SELECT followed by INSERT/UPDATE+COMMIT, and if that fails do another SELECT; but this obviously needs to be done in a separate transaction.
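The second alternative can be sketched as follows. This is a minimal illustration only, using SQLite as a stand-in for PostgreSQL; the table and the `sync_record` function are hypothetical names for illustration, not the real daf_butler Database.sync API:

```python
import sqlite3

def sync_record(conn: sqlite3.Connection, name: str, value: str) -> str:
    """SELECT first; if absent, INSERT in its own transaction; on a
    conflict (another client won the race), SELECT again."""
    row = conn.execute(
        "SELECT value FROM config WHERE name = ?", (name,)
    ).fetchone()
    if row is not None:
        return row[0]
    try:
        with conn:  # separate transaction; commits on success
            conn.execute(
                "INSERT INTO config (name, value) VALUES (?, ?)",
                (name, value),
            )
        return value
    except sqlite3.IntegrityError:
        # A concurrent writer inserted the row between our SELECT and
        # INSERT; a second SELECT now sees the committed winner.
        row = conn.execute(
            "SELECT value FROM config WHERE name = ?", (name,)
        ).fetchone()
        return row[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE config (name TEXT PRIMARY KEY, value TEXT)")
print(sync_record(conn, "instrument", "HSC"))   # inserts, prints 'HSC'
print(sync_record(conn, "instrument", "LSST"))  # existing row wins: 'HSC'
```

The key point is that the INSERT commits on its own, so a losing client observes the winner's committed row on the second SELECT rather than deadlocking against it.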
Andy Salnikov added a comment - - edited

Just one more comment: removing the table lock should greatly reduce the chance of a deadlock, but it does not eliminate it completely. When doing an INSERT/UPDATE, some portion of the table (or its index) still needs to be locked (an existing read lock is promoted to exclusive). If two concurrent clients want to insert/update the same or neighboring rows, we may still get into a situation with a deadlock.

Jim Bosch added a comment -

I do think this logic should be safe enough under any isolation level (other than READ UNCOMMITTED, which isn't really useful for anything), but here are the caveats I'm aware of:

• With READ COMMITTED (the default in PostgreSQL, and what we are using there), using sync at the start of a transaction doesn't mean its guarantees persist to the end of the transaction, as code calling sync often assumes (e.g. sync an "instrument" record, then follow up by inserting "detector" records assuming the instrument hasn't changed).  But I think that's just the usual guarantee of READ COMMITTED, and our exposure to actual problems here is pretty small, because foreign keys will prevent any remotely probable invalid inserts.
• With REPEATABLE READ or SERIALIZABLE isolation, sync's guarantees do persist until the end of the current transaction, but one can get serialization errors from the database when conflicting concurrent writes occur. That problem can already happen with READ COMMITTED if the concurrent write happens between the INSERT ... ON CONFLICT IGNORE and SELECT run by sync, and if it does we raise RuntimeError (or, as you noted, another deadlock exception).  With the stronger-guarantee isolation levels, the type of exception raised would change to something low-level emitted by SQLAlchemy, and it would also be possible to see that after sync ends but before its transaction is committed.
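For concreteness, the INSERT-then-SELECT pattern described above can be sketched roughly like this. This is an illustrative stand-in, not the real Database.sync: SQLite's `INSERT OR IGNORE` plays the role of PostgreSQL's `INSERT ... ON CONFLICT DO NOTHING`, and the table/function names are invented:

```python
import sqlite3

def sync(conn: sqlite3.Connection, name: str, value: str) -> str:
    """Insert-if-absent, then read back; raise rather than silently
    accept a conflicting committed row."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO config (name, value) VALUES (?, ?)",
            (name, value),
        )
        row = conn.execute(
            "SELECT value FROM config WHERE name = ?", (name,)
        ).fetchone()
    if row is None:
        # Under READ COMMITTED a concurrent DELETE could land between the
        # INSERT and the SELECT; surface that as an error, not a bad write.
        raise RuntimeError(f"Row for {name!r} vanished during sync.")
    if row[0] != value:
        raise RuntimeError(f"Existing row for {name!r} conflicts: {row[0]!r}.")
    return row[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE config (name TEXT PRIMARY KEY, value TEXT)")
sync(conn, "instrument", "HSC")  # inserts
sync(conn, "instrument", "HSC")  # idempotent no-op
```

Either a concurrent writer's row matches and sync is a no-op, or it differs and the caller gets an exception; neither path produces a silent bad write, which is the invariant being reviewed here.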

I think it might be worth trying to use SERIALIZABLE isolation to resolve the first point in the future; there are two challenges:

• if it yields database exceptions for possibly conflicting writes that don't actually conflict (especially those INSERT ... ON CONFLICT IGNORE statements we rely on), we'd need to do retries, and that's a big restructuring (this was the original idea for DM-27224)
• we can't change the isolation level after a transaction has begun, so we'd need to either make it illegal to use sync inside a transaction again, use SERIALIZABLE isolation by default, or demand that higher-level code (e.g. in obs_base) explicitly mark transactions that need to be started with SERIALIZABLE.
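The retry restructuring mentioned in the first point might look roughly like the following sketch. `SerializationError` is a placeholder for the real driver exception (PostgreSQL aborts conflicting SERIALIZABLE transactions with SQLSTATE 40001, which psycopg2 exposes as a serialization-failure error); the wrapper and demo functions are hypothetical:

```python
class SerializationError(Exception):
    """Stand-in for the driver's serialization-failure exception."""

def run_with_retries(transaction, max_attempts=3):
    """Re-run the whole transaction callable when the database aborts it
    with a serialization failure, up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationError:
            if attempt == max_attempts:
                raise
            # In real code: roll back, optionally back off, then retry.

# Demo: a transaction that fails twice with a serialization error,
# then succeeds on the third attempt.
attempts = []
def flaky_transaction():
    attempts.append(1)
    if len(attempts) < 3:
        raise SerializationError("could not serialize access")
    return "committed"

print(run_with_retries(flaky_transaction))  # committed
```

The restructuring cost is that everything inside the transaction must be safe to re-execute from scratch, which is exactly why this was deferred to DM-27224.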

I'll note on DM-27224 that there are comments here that should be taken into account (your comments and this last one by me, at least) when deciding what to do in the long term, and merge this fix now.

Yusra AlSayyad added a comment -

Marking preops because it was backported for DP0.2.

#### People

Assignee:
Jim Bosch
Reporter:
Kenneth Herner
Reviewers:
Andy Salnikov
Watchers:
Andy Salnikov, Eli Rykoff, Ian Sullivan, Jim Bosch, Kenneth Herner, Meredith Rawls, Tim Jenness, Yusra AlSayyad