DM-32131

Merge Cassandra branch of APDB

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: dax_apdb
    • Labels:
    • Story Points:
      8
    • Epic Link:
    • Sprint:
      DB_F21_06
    • Team:
      Data Access and Database
    • Urgent?:
      No

      Description

      Since DM-31458 has merged, now is a good time to do the actual Cassandra merge. Cassandra development was on a special branch u/andy-slac/cassandra-2 (and u/andy-slac/cassandra before that). I expect the usual merge conflicts because of changes on master. I want to add a few tests that can be run with a single-node Cassandra cluster running on my desktop to make sure that things are merged OK. Doing this sort of check in a unit test is problematic: bringing up an actual Cassandra cluster is not a trivial task, and writing a mock for Cassandra is far from trivial too.
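
One way to keep such tests out of the regular unit-test run is to skip them unless a Cassandra node is reachable. The following is only a minimal sketch of that idea, not code from dax_apdb: the environment variable name APDB_TEST_CASSANDRA_HOST and the test class are hypothetical, and the only check made is a TCP connection to the default Cassandra native-protocol port (9042).

```python
import os
import socket
import unittest

# Hypothetical environment variable naming the test node; not defined by dax_apdb.
HOST_ENV = "APDB_TEST_CASSANDRA_HOST"
CQL_PORT = 9042  # default Cassandra native-protocol port


def cassandra_reachable(host, port=CQL_PORT, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    if not host:
        return False
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


@unittest.skipUnless(
    cassandra_reachable(os.environ.get(HOST_ENV)),
    f"set {HOST_ENV} to a reachable Cassandra node to run this test",
)
class CassandraSmokeTest(unittest.TestCase):
    """Placeholder test case; real checks would exercise the merged Apdb code."""

    def test_port_open(self):
        self.assertTrue(cassandra_reachable(os.environ[HOST_ENV]))
```

With the variable unset the whole class is reported as skipped, so the test can live next to ordinary unit tests without requiring a cluster in CI.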

            Activity

            cmorrison Chris Morrison [X] (Inactive) added a comment -

            Ah, sorry, do you have the error output of the pipeline? The problem is that pipeline steps have been added and modified that are Gen3 only.

            salnikov Andy Salnikov added a comment -

            Here is what I see:

            $ ap_verify.py --dataset ap_verify_ci_cosmos_pdr2 --output ap_verify_ci_cosmos_pdr2-gen3 --gen3
            ap.verify.ingestion.ingestDataset INFO: Data ingested
            ap.verify.ap_verify.main INFO: Running pipeline...
            lsst.daf.butler.cli.utils ERROR: Caught an exception, details are in traceback:
            Traceback (most recent call last):
              File "/home/salnikov/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/ctrl_mpexec/22.0.1-26-gdc29f13+4f3462a7cb/python/lsst/ctrl/mpexec/cli/cmd/commands.py", line 106, in run
                qgraph = script.qgraph(pipelineObj=pipeline, **kwargs)
              File "/home/salnikov/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/ctrl_mpexec/22.0.1-26-gdc29f13+4f3462a7cb/python/lsst/ctrl/mpexec/cli/script/qgraph.py", line 148, in qgraph
                qgraph = f.makeGraph(pipelineObj, args)
              File "/home/salnikov/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/ctrl_mpexec/22.0.1-26-gdc29f13+4f3462a7cb/python/lsst/ctrl/mpexec/cmdLineFwk.py", line 571, in makeGraph
                qgraph = graphBuilder.makeGraph(pipeline, collections, run, args.data_query, metadata=metadata)
              File "/home/salnikov/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/pipe_base/22.0.1-29-gc8092ca+64bf416945/python/lsst/pipe/base/graphBuilder.py", line 926, in makeGraph
                scaffolding = _PipelineScaffolding(pipeline, registry=self.registry)
              File "/home/salnikov/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/pipe_base/22.0.1-29-gc8092ca+64bf416945/python/lsst/pipe/base/graphBuilder.py", line 413, in __init__
                datasetTypes = PipelineDatasetTypes.fromPipeline(pipeline, registry=registry)
              File "/home/salnikov/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/pipe_base/22.0.1-29-gc8092ca+64bf416945/python/lsst/pipe/base/pipeline.py", line 960, in fromPipeline
                for taskDef in pipeline:
              File "/home/salnikov/lsst_stack/stack/miniconda3-py38_4.9.2-0.7.0/Linux64/pipe_base/22.0.1-29-gc8092ca+64bf416945/python/lsst/pipe/base/pipeline.py", line 625, in toExpandedPipeline
                raise pipelineIR.ContractError(f"Contract(s) '{contract.contract}' were not "
            lsst.pipe.base.pipelineIR.ContractError: Contract(s) 'imageDifference.connections.coaddName == fracDiaSourcesToSciSources.connections.coaddName' were not satisfied
            ap.verify.pipeline_driver.runApPipeGen3 INFO: Pipeline complete.
            

            krzys Krzysztof Findeisen added a comment - - edited

            That looks like DM-26140 changes being applied out of sync. Can you make sure both your stack and the ap_verify datasets are up-to-date?
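
For readers unfamiliar with pipeline contracts, the failure in the traceback above can be illustrated with a small stand-alone sketch. This is not the pipe_base implementation: the ContractError class, check_contracts, and task_config below are simplified stand-ins, and the coaddName values are illustrative only. The assumption is that a contract is a boolean expression evaluated with task labels bound to their configurations, so it fails when two tasks end up with different coaddName settings.

```python
from types import SimpleNamespace


class ContractError(Exception):
    """Stand-in for lsst.pipe.base.pipelineIR.ContractError."""


def check_contracts(contracts, configs):
    """Evaluate each contract expression with task labels bound to configs."""
    for expr in contracts:
        # The expressions are trusted pipeline text in this sketch, so eval is acceptable.
        if not eval(expr, {"__builtins__": {}}, configs):
            raise ContractError(f"Contract(s) '{expr}' were not satisfied")


def task_config(coadd_name):
    """Build a minimal config object exposing connections.coaddName."""
    return SimpleNamespace(connections=SimpleNamespace(coaddName=coadd_name))


contract = (
    "imageDifference.connections.coaddName == "
    "fracDiaSourcesToSciSources.connections.coaddName"
)

# Both tasks agree: the contract holds.
in_sync = {
    "imageDifference": task_config("goodSeeing"),
    "fracDiaSourcesToSciSources": task_config("goodSeeing"),
}

# One task was updated and the other was not: the contract fails.
out_of_sync = {
    "imageDifference": task_config("goodSeeing"),
    "fracDiaSourcesToSciSources": task_config("deep"),
}
```

Calling check_contracts([contract], in_sync) returns without error, while check_contracts([contract], out_of_sync) raises ContractError with the same kind of message as in the traceback.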

            salnikov Andy Salnikov added a comment -

            Indeed, my fault, I forgot to update ap_verify_ci_cosmos_pdr2. After checking out the latest master HEAD I can now run the --gen3 version without problems. Sorry for the noise.

            salnikov Andy Salnikov added a comment -

            Thanks for the review and all the suggestions! I have merged everything after re-running Jenkins.

            To share my ideas on further development, there are a few things that I'm planning to work on in the near-to-medium future:

            • Schema definition: I think we should use the schema as defined in a central repository (sdm_schema) instead of maintaining our own set of YAML files. This probably means that sdm_schema needs an update to reflect what you already have in ap_association, and I think DPDD may also need a similar update. The schema for PPDB should probably be based on the same source; we'll see whether any coordination between APDB and PPDB is needed in that respect.
            • The Apdb interface will need an extension to support new query types for replication of APDB data to PPDB. Not much is known there yet; it will become clearer when I start designing that process. This should not affect the current Apdb methods used by the AP pipeline.
            • Are there any new features that you want from the Apdb interface which are not yet covered?

              People

              Assignee:
              salnikov Andy Salnikov
              Reporter:
              salnikov Andy Salnikov
              Reviewers:
              Chris Morrison [X] (Inactive)
              Watchers:
              Andy Salnikov, Chris Morrison [X] (Inactive), Krzysztof Findeisen
