Details
-
Type:
Story
-
Status: Done
-
Resolution: Done
-
Fix Version/s: None
-
Component/s: ctrl_mpexec
-
Story Points:2
-
Epic Link:
-
Sprint:DB_F20_09
-
Team:Data Access and Database
-
Urgent?:No
Description
This option was added to the argument parser on DM-21849 but not implemented due to problems with dataset deletion.
Attachments
Issue Links
Activity
It's supposed to delete the previous run in a chain (and the datasets within it) instead of just shunting them off to the side by taking the run out of the chain. Or if we change that UI (since everyone is confused by it), we can consider this ticket to be about providing some option to just blow away the last run when running again.
Jim Bosch can you put a story point estimate on this ticket? It seems like it's a trivial thing that Andy Salnikov could do in an afternoon but I might be wrong (especially if "revamp the entire command line" is part of the plan).
Let's make it the trivial thing. I think we're already planning to tell users this interface needs feedback and improvement at Gen2 deprecation time, and that overhaul shouldn't block this ticket.
Reading docstring for Butler.pruneDatasets I see this:
disassociate : bool`, optional
|
Disassociate pruned datasets from ``self.tags`` (or the collections
|
given via the ``tags`` argument). Ignored if ``refs`` is ``...``.
|
but I think that refs is not supposed to be ... in this method? Jim Bosch, does the docstring need a fix?
Definitely needs a fix, but I'm not sure what I meant to say with the last sentence. Probably best to just remove the last sentence.
This "trivial" fix gives me lots of troubles, mostly in that I had to do non-trivial things to test it. Now I think my tests are OK, but I'm getting an exception with my implementation of "purge" option. I though I can implement it with a simple call:
butler.pruneCollection(replaced, purge=True, unstore=True)
|
(where "replaced" is a collection name and it is RUN collection). This crashes with FOREIGN KEY violation:
Traceback (most recent call last):
|
File "/software/lsstsw/stack_20200515/python/miniconda3-4.7.12/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
|
cursor, statement, parameters, context
|
File "/software/lsstsw/stack_20200515/python/miniconda3-4.7.12/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
|
cursor.execute(statement, parameters)
|
sqlite3.IntegrityError: FOREIGN KEY constraint failedThe above exception was the direct cause of the following exception:Traceback (most recent call last):
|
File "tests/test_cmdLineFwk.py", line 573, in testSimpleQGraphReplaceRun
|
fwk.runPipeline(copy.deepcopy(qgraph), taskFactory, args)
|
File "/project/salnikov/gen3-middleware/ctrl_mpexec/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 669, in runPipeline
|
butler = _ButlerFactory.makeWriteButler(args)
|
File "/project/salnikov/gen3-middleware/ctrl_mpexec/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 363, in makeWriteButler
|
butler.pruneCollection(replaced, purge=True, unstore=True)
|
File "/software/lsstsw/stack_20200515/stack/miniconda3-4.7.12-46b24e8/Linux64/daf_butler/19.0.0-148-g38150800+de92faa26f/python/lsst/daf/butler/_butler.py", line 959, in pruneCollection
|
self.registry.removeCollection(name)
|
File "/software/lsstsw/stack_20200515/stack/miniconda3-4.7.12-46b24e8/Linux64/daf_butler/19.0.0-148-g38150800+de92faa26f/python/lsst/daf/butler/core/utils.py", line 253, in inner
|
return func(self, *args, **kwargs)
|
File "/software/lsstsw/stack_20200515/stack/miniconda3-4.7.12-46b24e8/Linux64/daf_butler/19.0.0-148-g38150800+de92faa26f/python/lsst/daf/butler/registry/_registry.py", line 398, in removeCollection
|
self._collections.remove(name)
|
File "/software/lsstsw/stack_20200515/stack/miniconda3-4.7.12-46b24e8/Linux64/daf_butler/19.0.0-148-g38150800+de92faa26f/python/lsst/daf/butler/registry/collections/_base.py", line 410, in remove
|
{self._collectionIdName: record.key})
|
File "/software/lsstsw/stack_20200515/stack/miniconda3-4.7.12-46b24e8/Linux64/daf_butler/19.0.0-148-g38150800+de92faa26f/python/lsst/daf/butler/registry/interfaces/_database.py", line 1299, in delete
|
return self._connection.execute(sql, *rows).rowcount
|
File "/software/lsstsw/stack_20200515/python/miniconda3-4.7.12/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 984, in execute
|
return meth(self, multiparams, params)
|
File "/software/lsstsw/stack_20200515/python/miniconda3-4.7.12/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 293, in _execute_on_connection
|
return connection._execute_clauseelement(self, multiparams, params)
|
File "/software/lsstsw/stack_20200515/python/miniconda3-4.7.12/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1103, in _execute_clauseelement
|
distilled_params,
|
File "/software/lsstsw/stack_20200515/python/miniconda3-4.7.12/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1288, in _execute_context
|
e, statement, parameters, cursor, context
|
File "/software/lsstsw/stack_20200515/python/miniconda3-4.7.12/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1482, in _handle_dbapi_exception
|
sqlalchemy_exception, with_traceback=exc_info[2], from_=e
|
File "/software/lsstsw/stack_20200515/python/miniconda3-4.7.12/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
|
raise exception
|
File "/software/lsstsw/stack_20200515/python/miniconda3-4.7.12/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
|
cursor, statement, parameters, context
|
File "/software/lsstsw/stack_20200515/python/miniconda3-4.7.12/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
|
cursor.execute(statement, parameters)
|
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) FOREIGN KEY constraint failed
|
[SQL: DELETE FROM collection WHERE collection.collection_id = ?]
|
[parameters: (5,)]
|
(Background on this error at: http://sqlalche.me/e/gkpj)
|
Jim Bosch, are my assumptions totally wrong, do I need to do something else before (or instead of) pruneCollection?
I thought what you tried world work. AFK now, but will look at exception when back at it.
I think a collection has to be removed from chain collection before it can be removed itself. The schema:
CREATE TABLE collection_chain (
|
parent BIGINT,
|
position SMALLINT,
|
child BIGINT NOT NULL,
|
dataset_type_name VARCHAR(128),
|
PRIMARY KEY (parent, position),
|
CONSTRAINT collection_chain_len_dataset_type_name CHECK (length(dataset_type_name)<=128 AND length(dataset_type_name)>=1),
|
CONSTRAINT fkey_collection_chain_collection_collection_id_parent FOREIGN KEY(parent) REFERENCES collection (collection_id) ON DELETE CASCADE,
|
CONSTRAINT fkey_collection_chain_collection_collection_id_child FOREIGN KEY(child) REFERENCES collection (collection_id)
|
);
|
does not have ON DELETE CASCADE for child FK, was it intentional?
That was intentional, because deleting a RUN child collection also implies deleting datasets, and that requires datastore involvement and hence can't be done exclusively through ON DELETE.
But if the problem is that it is trying to delete a CHAINED collection, that explains it - I was imagining that the replaced variable in your code was the name of a RUN collection, because --prune-replaced should be popping the first RUN off of the CHAINED collection stack, and then deleting that RUN, not the CHAINED collection. Or is there a scenario I had not considered where the first collection on that stack is also a CHAINED collection?
replaced is indeed a RUN collection, I'm not trying to delete CHAINED collection. But deleting RUN collection also need to "unlink" it from a CHAINED collection, and that step has to be done explicitly if there is no ON DELETE CASCADE on the child key in the collection_chain table. I don't think this has to do something with datastore, dropping datasets is done explicitly by pruneCollection.
Oh, sorry, I misunderstood your last post, and your analysis was exactly right. But I'm not sure if we want to add ON DELETE CASCADE to fix this; if a user tries to do --prune-replaced on a RUN that is referenced by some other CHAINED collection than the one they are modifying, it might be a sign (especially in a multi-user environment) that the RUN is important in some other context they were not aware of, and they should think twice (and then maybe use a butler command to delete it if they are sure; that doesn't exist yet, but it's DM-26684). Of course, we would need to make sure we remove it from the CHAINED collection that the user is trying to modify before deleting it, then, and I'm not sure if that's easy with the way the code is structured right now.
Ah, OK, it makes sense. I think workaround should be simple, I can call butler.registry.setCollectionChain with whatever remains in the chain after popping first element, let me try that.
Should be ready for review. I think my understanding of what the options should do is right but please check that it really should work that way.
ctrl_mpexec: https://github.com/lsst/ctrl_mpexec/pull/72
daf_butler: https://github.com/lsst/daf_butler/pull/372
Andy Salnikov am I correct in thinking that this ticket can be closed?
Indeed, I thought I closed it already.
What does the --prune-replaced option do?