  1. Data Management
  2. DM-34951

Redesign Registry/Datastore boundary w.r.t. deletion and location lookups


Details

    • Architecture
    • No

    Description

      We currently require a dataset to exist in Registry in order for it to exist in a Datastore's internal records or the dataset_location table, via a foreign key. Maintaining that constraint when datasets are deleted is tricky, leading to a "trash" system that:

      • is problematic in multi-user contexts (one user's delete operation can attempt to empty trash left over from a different user's failure - which might be disastrous if the different users have different permissions on the backing storage);
      • sort of defeats the purpose of the constraint anyway, allowing datasets to be deleted from the Registry without actually deleting them from Datastore's backing storage by allowing them to merely be trashed in Datastore's records.

      In addition, all of our schemes for shared-database-free batch execution (see DMTN-177) also violate this constraint in spirit, by allowing Datastore files to exist long before their database records (for either Registry or Datastore side) exist.

      And, finally, one of the main reasons we had this constraint - preservation of unique dataset IDs by delegating their creation to Registry - has effectively gone away with UUIDs.

      A better solution to this problem needs serious thought, but I think it ought to involve the following principles:

      • A dataset may exist in either Registry or Datastore without existing in the other. For example, in no-shared-database batch, output datasets will exist only in Datastore until they are "transferred back", and some intermediate datasets may be deleted from a Datastore (or never moved to permanent storage) while being retained in Registry for provenance.
      • If a dataset's file exists in a Datastore, its records must exist in some location managed by the data repository (findable via configuration in the data repository root). That will often be the Registry database (after processing outputs are transferred back), but it could also be a per-quantum file during execution, a per-run file used in Rucio-Butler communications, or some kind of future more scalable database used during execution. The important point here is that Datastore should never have to check underlying storage for file existence to satisfy an existence check call, except perhaps in some limited failure-recovery modes where we have reason to believe that things have gotten out of sync.
      • We should only provide atomic-operation guarantees in deletion that we can easily and efficiently delegate to the systems we use under the hood (i.e. filesystems, object stores, and SQL databases); we just need to be clear about the guarantees that we do provide. I think a database transaction rollback does count as "easy and efficient", while moving files in an object store to a temporary location prior to actually deleting them (in order to make that deletion reversible) probably does not, and doing something similar with hard-links on filesystems that support them is somewhere in between.
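      As a rough illustration of the kind of guarantee that is cheap to delegate, the sketch below removes a batch of Datastore records inside a single SQL transaction, so that a failure partway through leaves every record in place. It uses the standard-library sqlite3 module purely for demonstration; the `file_record` table, its `dataset_id` column, and the `delete_records` function are hypothetical names invented for this example and are not part of the existing Registry or Datastore schema. It says nothing about how file deletion in the backing storage should be ordered relative to the record deletion.

```python
import sqlite3
import uuid


def delete_records(conn: sqlite3.Connection, dataset_ids: list) -> None:
    """Delete all of the given records in one transaction, or none of them.

    ``file_record`` is a hypothetical table used only for this sketch.
    """
    # Used as a context manager, a sqlite3 connection commits if the body
    # succeeds and rolls back if the body raises.
    with conn:
        for dataset_id in dataset_ids:
            cursor = conn.execute(
                "DELETE FROM file_record WHERE dataset_id = ?",
                (str(dataset_id),),
            )
            if cursor.rowcount != 1:
                # Raising here undoes the deletions made earlier in the loop.
                raise LookupError(f"no record for dataset {dataset_id}")


def count_records(conn: sqlite3.Connection) -> int:
    """Return the number of rows currently in the hypothetical table."""
    return conn.execute("SELECT COUNT(*) FROM file_record").fetchone()[0]


def make_demo_connection(dataset_ids: list) -> sqlite3.Connection:
    """Create an in-memory database holding one record per dataset ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE file_record (dataset_id TEXT PRIMARY KEY, path TEXT)")
    with conn:
        conn.executemany(
            "INSERT INTO file_record VALUES (?, ?)",
            [(str(dataset_id), f"{dataset_id}.dat") for dataset_id in dataset_ids],
        )
    return conn
```

      Calling `delete_records` with one ID that has no record raises `LookupError` and leaves the table unchanged, which is the whole guarantee: the database does the bookkeeping, and we only have to document what it covers.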

      I think we should also consider dropping the dataset_location table, and instead have:

      • a Python interface for vectorized Datastore existence and storage-metadata checks (e.g. URI, checksum, file size) that's accessible to the Registry query system through the bridge interface;
      • an interface for providing access to SQLAlchemy tables or subqueries that provide Datastore records with these columns (which would return none when records are not in the Registry, or indicate in some other way when some records may not be in the Registry).
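      To make the first of these more concrete, here is a minimal sketch of what a vectorized lookup could look like, assuming one call takes many dataset IDs and returns storage metadata for each, with `None` marking datasets the Datastore has no record of. All names here (`StorageInfo`, `DatastoreLookup`, `InMemoryLookup`, `lookup`) are hypothetical and chosen for illustration; this is not the existing bridge interface, and the SQLAlchemy side is not shown.

```python
from __future__ import annotations

import uuid
from collections.abc import Iterable, Mapping
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class StorageInfo:
    """Hypothetical storage metadata for one dataset's file."""

    uri: str
    checksum: Optional[str]
    size: int


class DatastoreLookup(Protocol):
    """Hypothetical vectorized existence and storage-metadata interface."""

    def lookup(
        self, dataset_ids: Iterable[uuid.UUID]
    ) -> dict[uuid.UUID, Optional[StorageInfo]]:
        """Return metadata per dataset ID, or None where no record exists."""
        ...


class InMemoryLookup:
    """Toy implementation backed by a dict of records.

    It answers purely from its records and never touches backing storage.
    """

    def __init__(self, records: Mapping[uuid.UUID, StorageInfo]):
        self._records = dict(records)

    def lookup(
        self, dataset_ids: Iterable[uuid.UUID]
    ) -> dict[uuid.UUID, Optional[StorageInfo]]:
        # One pass over the requested IDs; missing IDs map to None.
        return {
            dataset_id: self._records.get(dataset_id) for dataset_id in dataset_ids
        }
```

      A caller such as the query system could then pass a whole batch of IDs at once and treat a `None` value as "not known to this Datastore", rather than issuing one existence check per dataset.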

      This ticket is not necessarily the one where we do the work - I expect its output product to be some combination of technote, Confluence page, and follow-up tickets.

      tjenness, this is something I've been thinking about in a shallow sense for quite a while, and this ticket is my attempt to capture the thinking I have done in case you'd like to take it the rest of the way (or push back on parts you disagree with). I'm also happy to pick it up myself later, but I'm not planning to do that until the DM-31725 (etc) query system work is complete.

          Activity

            ktl Kian-Tat Lim added a comment -

            Some things I think could be clarified in the text, and some things to pay attention to in the implementation, but overall this looks good.

            jbosch Jim Bosch added a comment -

            Thanks; I'll address individual PR comments later, but a few of them - on a shared secret between server-side Registry and client-side Datastore, and the possibility that big registry queries could hit Gafaelfawr pretty hard - make me want to revisit some aspects of this. I may need to bug you, Tim, or Russ at some point to make sure I've understood the possibilities correctly.

            ktl Kian-Tat Lim added a comment -

            To be clear, you can't share a secret between server-side Registry and client-side Datastore, only between a server-side Registry and a server-side Datastore, which is what you were discussing ("a Datastore server cannot easily perform this job").

            jbosch Jim Bosch added a comment -

            I believe I've addressed all review comments on the PR, though a few involved questions from me.

            I've largely left the overall proposal unchanged - while the text now acknowledges a shared secret between a Datastore server and Registry server as a solution to the signed-URL trust issue, I still think it's simpler and more efficient to just have a Registry server (with some server-side Datastore logic it can call via a bridge class).

            The only change to the prototype code comes as a response to ktl's question about where journal files live and how we control access to them; I'm proposing now that they live within the Datastore root (or something very similar to that, like a different object store bucket with the same access controls), and hence they may need signed URLs to be manipulated by the Datastore client. The prototype now provides a way for those signed URLs to be obtained. But this isn't the only way I think we could handle writing the journal files; it's just the way that seemed simplest to me, so I'm very much open to feedback on this.

            jbosch Jim Bosch added a comment -

            I've created DM-39443 to address the remaining issue with write consistency guarantees.  I'm going to merge this in its current state because I think conversation on all other topics has converged or is best deferred to implementation time.


            People

              jbosch Jim Bosch
              jbosch Jim Bosch
              Kian-Tat Lim
              Jim Bosch, Kian-Tat Lim, Tim Jenness