Data Management / DM-27113

Convert RC2 w_2020_38 to gen3 with w_2020_42 stack


    Details

      Description

      The conversion was originally attempted with w_2020_40, but it failed. A schema was renamed from rc2w38_ssw40 to rc2w38_ssw42, and another attempt was made with the w_2020_42 stack.

        Attachments

          Issue Links

            Activity

            madamow Monika Adamow created issue -
            madamow Monika Adamow made changes -
            Field Original Value New Value
            Epic Link DM-22990 [ 428665 ]
            madamow Monika Adamow added a comment -

            Error during the conversion:


            INFO  2020-10-08T13:57:22.332-0500 convertRepo - Ingesting 207 ps1_pv3_3pi_20170110 datasets into run refcats.
            Traceback (most recent call last):
              File "./gen3-hsc-rc2/bootstrap.py", line 354, in <module>
                main()
              File "./gen3-hsc-rc2/bootstrap.py", line 350, in main
                continue_=options.continue_, reruns=reruns)
              File "./gen3-hsc-rc2/bootstrap.py", line 300, in run
                visits=makeVisitList(tracts, filters)
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/obs_base/20.0.0-52-g73d9071+9bf1eb8e0a/python/lsst/obs/base/gen2to3/convertRepo.py", line 559, in run
                converter.ingest()
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/obs_base/20.0.0-52-g73d9071+9bf1eb8e0a/python/lsst/obs/base/gen2to3/repoConverter.py", line 493, in ingest
                run = self.getRun(datasetType.name, calibDate)
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/obs_base/20.0.0-52-g73d9071+9bf1eb8e0a/python/lsst/obs/base/gen2to3/standardRepoConverter.py", line 202, in getRun
                raise ValueError(f"No default run for repo at {self.root}, and no "
            ValueError: No default run for repo at /datasets/hsc/repo, and no override for dataset fgcmLookUpTable.
            


            tjenness Tim Jenness made changes -
            Watchers Monika Adamow [ Monika Adamow ] Jim Bosch, Michelle Gower, Monika Adamow [ Jim Bosch, Michelle Gower, Monika Adamow ]
            tjenness Tim Jenness added a comment -

            In ci_hsc_gen2 we have a config file for the conversion which sets a default collection for specific datasets:

            # This file contains overrides for obs.base.gen2to3.ConvertRepoTask to export
            # the jointcal_* datasets that are in the root of the Gen2 repo into a special
            # "HSC/external" RUN collection, since it doesn't make sense to put them any of
            # the other RUNs generated from that conversion.  This doesn't go in the
            # obs_subaru config overrides because having those datasets in the root is
            # unique to ci_hsc_gen2.
             
            from lsst.obs.subaru import HyperSuprimeCam
             
            collection = HyperSuprimeCam.makeCollectionName("external")
            config.runs["jointcal_wcs"] = collection
            config.runs["jointcal_photoCalib"] = collection
            

            so maybe we need something like that for fgcmLookUpTable?
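A hypothetical version of that override for fgcmLookUpTable, following the jointcal pattern (the "fgcm" collection name here is purely illustrative, not an established convention):

```python
# Hypothetical convertRepo config override, modeled on the jointcal
# example above; the "fgcm" collection name is a placeholder, not an
# agreed-upon convention.
from lsst.obs.subaru import HyperSuprimeCam

config.runs["fgcmLookUpTable"] = HyperSuprimeCam.makeCollectionName("fgcm")
```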

            tjenness Tim Jenness made changes -
            Team Data Release Production [ 10301 ]
            tjenness Tim Jenness made changes -
            Assignee Monika Adamow [ madamow ] Jim Bosch [ jbosch ]
            tjenness Tim Jenness made changes -
            Watchers Jim Bosch, Michelle Gower, Monika Adamow, Tim Jenness [ Jim Bosch, Michelle Gower, Monika Adamow, Tim Jenness ] Eli Rykoff, Jim Bosch, Michelle Gower, Monika Adamow, Tim Jenness [ Eli Rykoff, Jim Bosch, Michelle Gower, Monika Adamow, Tim Jenness ]
            tjenness Tim Jenness made changes -
            Epic Link DM-22990 [ 428665 ]
            jbosch Jim Bosch added a comment -

            Eli Rykoff, I think the question we need answered here is basically how we'd expect the FGCM lookup table to get to Gen3 repos in a pure-Gen3 world, because that tells us what collection we should use in the converter to make it look more-or-less like it was produced in a pure-Gen3 world.

            fgcmLookUpTable is something that will be produced by a PipelineTask running on as much of a survey as we can, but is then useful for calibrating even observations that were not in that set, right?

            Is there one dataset per instrument [per FGCM run]?

            Or is this something produced by modtran-ish things to make a file that some human has to manually ingest?

            erykoff Eli Rykoff added a comment -

            Ah, so this error is the same thing I mentioned on Slack: https://lsstc.slack.com/archives/C2JPT1KB7/p1601913036259400 where this used to be a `WARN` and is now an exception. I'm still not 100% behind that change...

            Anyway, looking forward: fgcmLookUpTable is a "curated calibration". There should be one per instrument, and though it could be updated over time, that is not a matter of a validity range; it's a choice for a given processing run.

            In the case of RC2, there is actually a default lookup table in the root repo, but that assumes we have r2 and i2, which we don't have for RC2. So that's why the RC2 instructions have it regenerated within the rerun. I'm not sure what the best thing to do about this is except wait for RC3.

            jbosch Jim Bosch added a comment -

            Interesting.  So it's a curated calibration - in the sense that we should put what we think is the best one for an instrument in an obs_x_data package, even?  But it doesn't have a validity range, either because it has temporal dependence inside it or because it doesn't actually depend on anything temporal?

            erykoff Eli Rykoff added a comment -

            It's quite large (a few GB); is that okay for an obs_x_data package? I don't know how these things are carried around. Something to note, though, is that it can be regenerated for a particular repo without too much trouble; it just takes some time.

            As for validity range, the only temporal dependence is implied (e.g., there would be an r and r2 filter which were installed at different times), but that is handled by the fgcm code itself.

            What might change is that (say) we replace a filter (as in HSC); or we get better measurements of the mirror reflectivity or CCD QEs, etc.

            In the future, though, it is possible that explicit temporal dependence is added, especially with regards to mirror reflectivity (I hope the CCD QE and filter throughputs don't change!) but that is not currently supported.

            jbosch Jim Bosch made changes -
            Remote Link This issue links to "Page (Confluence)" [ 26140 ]
            tjenness Tim Jenness added a comment -

            Monika Adamow is blocked on this ticket. Is the simplest possible answer to add a line to the conversion configuration that declares a collection to use like we do with jointcal_wcs above?

            erykoff Eli Rykoff added a comment - - edited

            All dataset types need to be listed explicitly in the conversion config as of w40, as I mentioned above. So fgcmLookUpTable needs to be added to the conversion config, along with anything else that wasn't explicitly mentioned. There will be more than just this; I'm sure this is just the first.

            It used to be that it would warn and continue anyway.

            I think that whether or not this is a curated calibration is a red herring. Right now it's just a dataset, and it must be configured.

            jbosch Jim Bosch added a comment -

            So, the best solution I can think of for now is to put this in the "unbounded" calibration run, and revisit this in the future (at least on DM-27147), via an obs_subaru config override:

            from lsst.obs.subaru import HyperSuprimeCam
            config.runs["fgcmLookUpTable"] = HyperSuprimeCam.makeUnboundedCalibrationRunName()

            That's an awkward location to expect pipetask invocations to find this, and we should probably fix that by finding a way to

            a) mark this as a calibration when we register the dataset type (currently I think ConvertRepoTask does that IFF it finds a dataset in a calibration repo)

            b) certify it into CALIBRATION collections like HSC/calib with an unbounded validity range (currently something only done by writeCuratedCalibrations).

            But we don't need any of those other steps to unblock RC2 Gen3 conversion, so maybe they should be another ticket; this does get cleaner if we decide to make this a regular curated calibration.  Treating it the same way as the HSC yBackground datasets might be another option, but I confess I don't actually remember exactly how we handle that.
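A rough sketch of what steps a) and b) might look like against the Gen3 registry API (the repo path, collection name, and query arguments are assumptions, not a tested recipe):

```python
# Sketch only: certifying converted fgcmLookUpTable datasets into a
# CALIBRATION collection with an unbounded validity range. The repo
# path and the "HSC/calib" collection name are placeholders.
from lsst.daf.butler import Butler, CollectionType, Timespan

butler = Butler("/path/to/gen3/repo", writeable=True)
registry = butler.registry

# a) the dataset type would need to be registered with
#    isCalibration=True (ConvertRepoTask currently only does this for
#    datasets it finds in a calibration repo).

# b) certify into a CALIBRATION collection with an unbounded timespan:
registry.registerCollection("HSC/calib", type=CollectionType.CALIBRATION)
refs = registry.queryDatasets("fgcmLookUpTable", collections=...)
registry.certify("HSC/calib", list(refs), Timespan(begin=None, end=None))
```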

            tjenness Tim Jenness added a comment -

            Jim Bosch so to be completely clear, you want those two lines to be added to the end of obs_subaru/config/hsc/convertRepo.py ?

            jbosch Jim Bosch added a comment -

            Yes, that's my proposal.

            mgower Michelle Gower added a comment -

            Since this started as an RC2 gen3 conversion ticket, please make a separate ticket for the other steps mentioned in the above comment.  We'll reassign this ticket to Monika Adamow.   She will make a ticket branch of obs_subaru and make this change and any other similar simple change that pops up while trying to run the conversion. 

            mgower Michelle Gower made changes -
            Assignee Jim Bosch [ jbosch ] Monika Adamow [ madamow ]
            erykoff Eli Rykoff added a comment -

            The script itself patches the config as well: https://github.com/lsst-dm/gen3-hsc-rc2/blob/master/bootstrap.py#L236-L252

            And this excludes the yBackground, as best I can tell.

            I think there will be other hiccups. If I'm reading the code and configs correctly, datasetIncludePatterns only fires if a rerun isn't specified; and if a rerun isn't specified, we're going to have to explicitly include or exclude all dataset types, configs, etc., that are in the RC2 repo.
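For context, the knobs being discussed here are config fields on ConvertRepoTask; a hedged sketch (field names as I read them from obs_base at the time; the pattern values are illustrative only, not the actual RC2 configuration):

```python
# Illustrative convertRepo config fragment; datasetIncludePatterns and
# datasetIgnorePatterns are the fields discussed above, but these
# particular pattern values are placeholders.
config.datasetIncludePatterns = ["raw", "ref_cat", "fgcmLookUpTable"]
config.datasetIgnorePatterns = ["*_camera", "yBackground"]
```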

            jbosch Jim Bosch added a comment -

            I was originally thinking that obs_subaru was a better place for this config override than bootstrap.py because it'd be more general.  But now I see that bootstrap.py ignores yBackground in the configs precisely because it has other logic to add it later, and that logic is exactly the "extra steps" I referenced above.  So maybe with that as our model, we should apply the config override for FGCM there, too, and just get it all done now.  I take Michelle Gower's point that we don't want scope creep on a ticket like this, but now that I see that established pattern I think it's barely more difficult to just get it all done.  If there are no objections, I'll just put that on a branch of gen3-hsc-rc2 today (probably in about an hour and a half) and ask Monika Adamow to test it.

            I think that there will be other hiccups, if I'm reading the code and configs correctly, then the datasetIncludePatterns only fires if a rerun isn't specified. And if a rerun isn't specified we're going to have to explicitly include or exclude all dataset types, configs, etc, that are in the RC2 repo.

            This should only be true of datasets that are in a root Gen2 repo, and right now I still consider it a feature rather than a bug that we're forced to figure out what to do with any datasets people are putting there.  I might recant if we discover that the state of Gen2 repos in the wild is even more varied than I suspect, but I think there are a finite number of weird testdata repos in git LFS (which we've just about worked our way through), plus the big shared ones (NCSA, Princeton, NAOJ, NERSC, CCIN2P3) that are mostly standard and where non-standard datasets are a bigger problem anyway.

            jbosch Jim Bosch added a comment -

            Monika Adamow, I've just pushed the most minimal change that should fix this (just ignoring this dataset) to branch u/jbosch/DM-27113 of gen3-hsc-rc2, and I think it's ready for testing again.

            I'll open a new ticket for a more complete fix; after looking a bit more, doing that well will take more work than belongs on this ticket.


            madamow Monika Adamow made changes -
            Description The conversion was originally attempted with w_2020_40, but it failed. A schema was renamed from rc2w38_ssw40 to rc2w38_ssw42 and another attempt was made with w_2020_42 stack.
            madamow Monika Adamow made changes -
            Description The conversion was originally attempted with w_2020_40, but it failed. A schema was renamed from rc2w38_ssw40 to rc2w38_ssw42 and another attempt was made with w_2020_42 stack. The conversion was originally attempted with w_2020_40, but it failed. A schema was renamed from rc2w38_ssw40 to rc2w38_ssw42 and another attempt was made with the w_2020_42 stack.
            madamow Monika Adamow made changes -
            Description The conversion was originally attempted with w_2020_40, but it failed. A schema was renamed from rc2w38_ssw40 to rc2w38_ssw42 and another attempt was made with the w_2020_42 stack. The conversion was originally attempted with w_2020_40, but it failed. A schema was renamed from rc2w38_ssw40 to rc2w38_ssw42, and another attempt was made with the w_2020_42 stack.
            madamow Monika Adamow made changes -
            Attachment rc2w38ssw42.log [ 45880 ]
            madamow Monika Adamow added a comment -

            The fix in u/jbosch/DM-27113 worked, but the conversion failed with another error. After discussing it with Michelle Gower, we decided to try with the w_2020_42 stack (the schema was renamed). It failed with the same error. The log file is attached to this ticket (rc2w38_ssw42.log).


            tjenness Tim Jenness added a comment -

            There is an error about a missing run name at the top, but I assume that's not important.

            This is the other error:

            INFO  2020-10-20T16:39:02.262-0500 convertRepo - Ingesting 1 deepCoadd_skyMap dataset into run skymaps.
            Traceback (most recent call last):
              File "/software/lsstsw/stack_20200922/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1278, in _execute_context
                cursor, statement, parameters, context
              File "/software/lsstsw/stack_20200922/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
                cursor.execute(statement, parameters)
            psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "dataset_tags_0008000000_unq_dataset_type_id_collection_5cf81e3b"
            DETAIL:  Key (dataset_type_id, collection_id, skymap)=(16, 238, hsc_rings_v1) already exists.
             
             
            The above exception was the direct cause of the following exception:
             
            Traceback (most recent call last):
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/daf_butler/19.0.0-174-g77e5f269+ff10c6d78d/python/lsst/daf/butler/registry/_registry.py", line 671, in insertDatasets
                refs = list(storage.insert(runRecord, expandedDataIds))
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/daf_butler/19.0.0-174-g77e5f269+ff10c6d78d/python/lsst/daf/butler/registry/datasets/byDimensions/_storage.py", line 83, in insert
                self._db.insert(self._tags, *tagsRows)
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/daf_butler/19.0.0-174-g77e5f269+ff10c6d78d/python/lsst/daf/butler/registry/interfaces/_database.py", line 1222, in insert
                self._connection.execute(table.insert(), *rows)
              File "/software/lsstsw/stack_20200922/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1014, in execute
                return meth(self, multiparams, params)
              File "/software/lsstsw/stack_20200922/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
                return connection._execute_clauseelement(self, multiparams, params)
              File "/software/lsstsw/stack_20200922/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1133, in _execute_clauseelement
                distilled_params,
              File "/software/lsstsw/stack_20200922/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1318, in _execute_context
                e, statement, parameters, cursor, context
              File "/software/lsstsw/stack_20200922/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1512, in _handle_dbapi_exception
                sqlalchemy_exception, with_traceback=exc_info[2], from_=e
              File "/software/lsstsw/stack_20200922/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
                raise exception
              File "/software/lsstsw/stack_20200922/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1278, in _execute_context
                cursor, statement, parameters, context
              File "/software/lsstsw/stack_20200922/conda/miniconda3-py37_4.8.2/envs/lsst-scipipe/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
                cursor.execute(statement, parameters)
            sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "dataset_tags_0008000000_unq_dataset_type_id_collection_5cf81e3b"
            DETAIL:  Key (dataset_type_id, collection_id, skymap)=(16, 238, hsc_rings_v1) already exists.
             
            [SQL: INSERT INTO rc2w38_ssw42.dataset_tags_0008000000 (dataset_type_id, dataset_id, collection_id, skymap) VALUES (%(dataset_type_id)s, %(dataset_id)s, %(collection_id)s, %(skymap)s)]
            [parameters: {'dataset_type_id': 16, 'dataset_id': 439257, 'collection_id': 238, 'skymap': 'hsc_rings_v1'}]
            (Background on this error at: http://sqlalche.me/e/13/gkpj)
             
            The above exception was the direct cause of the following exception:
             
            Traceback (most recent call last):
              File "./gen3-hsc-rc2/bootstrap.py", line 356, in <module>
                main()
              File "./gen3-hsc-rc2/bootstrap.py", line 352, in main
                continue_=options.continue_, reruns=reruns)
              File "./gen3-hsc-rc2/bootstrap.py", line 302, in run
                visits=makeVisitList(tracts, filters)
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/obs_base/20.0.0-58-g47b63df+0e9af1ef10/python/lsst/obs/base/gen2to3/convertRepo.py", line 570, in run
                converter.ingest()
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/obs_base/20.0.0-58-g47b63df+0e9af1ef10/python/lsst/obs/base/gen2to3/repoConverter.py", line 505, in ingest
                run=run)
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/daf_butler/19.0.0-174-g77e5f269+ff10c6d78d/python/lsst/daf/butler/core/utils.py", line 261, in inner
                return func(self, *args, **kwargs)
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/daf_butler/19.0.0-174-g77e5f269+ff10c6d78d/python/lsst/daf/butler/_butler.py", line 1239, in ingest
                run=run)
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/daf_butler/19.0.0-174-g77e5f269+ff10c6d78d/python/lsst/daf/butler/core/utils.py", line 261, in inner
                return func(self, *args, **kwargs)
              File "/software/lsstsw/stack_20200922/stack/miniconda3-py37_4.8.2-cb4e2dc/Linux64/daf_butler/19.0.0-174-g77e5f269+ff10c6d78d/python/lsst/daf/butler/registry/_registry.py", line 678, in insertDatasets
                f"dimension row is missing.") from err
            lsst.daf.butler.registry._exceptions.ConflictingDefinitionError: A database constraint failure was triggered by inserting one or more datasets of type DatasetType('deepCoadd_skyMap', {skymap}, SkyMap) into collection 'skymaps'. This probably means a dataset with the same data ID and dataset type already exists, but it may also mean a dimension row is missing.
            + ./query_results.py postgresql://madamow@lsst-pg-prod1.ncsa.illinois.edu:5432/lsstdb1 rc2w38_ssw42
            select count(*), c.name from rc2w38_ssw42.run r, rc2w38_ssw42.collection c, rc2w38_ssw42.dataset ds where ds.run_id=r.collection_id and r.collection_id=c.collection_id group by c.name order by c.name
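The Postgres UniqueViolation behind the ConflictingDefinitionError above is an ordinary UNIQUE-constraint failure on (dataset_type_id, collection_id, skymap). A minimal self-contained illustration of the same failure mode, using sqlite3 as a stand-in (toy table name and columns only, not the actual registry schema):

```python
import sqlite3

# Toy stand-in for the registry's dataset_tags_0008000000 table; this
# only reproduces the constraint behaviour, not the real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dataset_tags ("
    " dataset_type_id INTEGER, collection_id INTEGER, skymap TEXT,"
    " UNIQUE (dataset_type_id, collection_id, skymap))"
)
row = (16, 238, "hsc_rings_v1")
conn.execute("INSERT INTO dataset_tags VALUES (?, ?, ?)", row)
try:
    # Second insert of the same key: effectively what converting a
    # second Gen2 repo containing the same deepCoadd_skyMap did.
    conn.execute("INSERT INTO dataset_tags VALUES (?, ?, ?)", row)
except sqlite3.IntegrityError as err:
    print("duplicate key:", err)
```

The duplicate insert is rejected and the table keeps a single row, mirroring the `duplicate key value violates unique constraint` message in the log.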
            

            madamow Monika Adamow made changes -
            Comment [ I can fix the run name. ]
            jbosch Jim Bosch added a comment -

            I have a theory: there are equivalent deepCoadd_skyMap datasets in multiple Gen2 repos, we have no code dedicated to deduplicating them, and a

            config.runs["deepCoadd_skyMap"]

            override is telling the converter to put them all in one Gen3 collection. I think we want some of those to land in rerun-named collections instead, which means that override is active somewhere it shouldn't be in the converter code. But I'd like to poke around the Gen2 repos and conversion code to test that theory to be sure.
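The failure mode Jim describes can be sketched generically: several source repos each carry an equivalent dataset, and mapping them all into one target collection needs a deduplication guard on the (dataset type, data ID) key. An illustrative, self-contained sketch of that idea (toy function and data, not the actual obs_base converter code):

```python
def ingest_all(repos, target_collection):
    """Ingest (dataset_type, data_id) pairs from several repos into one
    collection, skipping keys that have already been converted."""
    seen = set()
    ingested = []
    for repo in repos:
        for dataset_type, data_id in repo:
            key = (dataset_type, data_id, target_collection)
            if key in seen:  # equivalent dataset already ingested
                continue
            seen.add(key)
            ingested.append(key)
    return ingested

# Two repos both carrying the same skymap dataset:
repo_a = [("deepCoadd_skyMap", "hsc_rings_v1")]
repo_b = [("deepCoadd_skyMap", "hsc_rings_v1")]
result = ingest_all([repo_a, repo_b], "skymaps")
```

Without the `seen` check, the second repo's copy triggers exactly the kind of duplicate-key insert seen in the traceback.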

            jbosch Jim Bosch added a comment -

            I have a potential fix on branch u/jbosch/DM-27113 of obs_base. Monika Adamow, could you test that? If it works, we should make sure it gets through Jenkins ci_hsc before merging it.

            madamow Monika Adamow added a comment -

            Jim Bosch, one problem fixed, another popped up.

             

            INFO  2020-10-22T16:29:42.040-0500 convertRepo - Calibration validity gap closed from 2017-09-04 00:00:00.000 to 2017-09-05 00:00:00.000
            Traceback (most recent call last):
              File "./gen3-hsc-rc2/bootstrap.py", line 356, in <module>
                main()
              File "./gen3-hsc-rc2/bootstrap.py", line 352, in main
                continue_=options.continue_, reruns=reruns)
              File "./gen3-hsc-rc2/bootstrap.py", line 302, in run
                visits=makeVisitList(tracts, filters)
              File "/scratch/madamow/rc2w38_convert_w42/obs_base/python/lsst/obs/base/gen2to3/convertRepo.py", line 583, in run
                chain.append(spec.parent)
            AttributeError: 'Rerun' object has no attribute 'parent'
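The AttributeError suggests the converter walked a chain of rerun specs and reached a `Rerun` without a `parent` attribute. A generic, self-contained sketch of the defensive-lookup pattern (hypothetical class and function names, not the actual obs_base fix):

```python
class Rerun:
    """Toy stand-in for a converter rerun spec; some instances may lack
    a parent attribute, which is what raised the AttributeError."""
    def __init__(self, name, parent=None):
        self.name = name
        if parent is not None:  # only set the attribute when present
            self.parent = parent

def parent_chain(spec):
    """Collect names up the parent chain, stopping when none remains."""
    chain = []
    while spec is not None:
        chain.append(spec.name)
        spec = getattr(spec, "parent", None)  # defensive lookup
    return chain

leaf = Rerun("w42", parent=Rerun("base"))
```

`getattr(spec, "parent", None)` returns None instead of raising when the attribute is absent, so the walk terminates cleanly for specs with no parent.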
            

             

            madamow Monika Adamow made changes -
            Status To Do [ 10001 ] → In Progress [ 3 ]
            jbosch Jim Bosch added a comment -

            I've pushed another commit to the u/jbosch/DM-27113 branch of obs_base that will hopefully fix that one.

            madamow Monika Adamow added a comment -

            Thanks Jim Bosch! The conversion is complete.

            madamow Monika Adamow made changes -
            Epic Link DM-22990 [ 428665 ]
            madamow Monika Adamow made changes -
            Summary Convert RC2 w_2020_38 to gen3 with w_2020_40 stack → Convert RC2 w_2020_38 to gen3 with w_2020_42 stack
            madamow Monika Adamow made changes -
            Attachment rc2w38ssw42_att3_success.log [ 45936 ]
            madamow Monika Adamow made changes -
            Resolution Done [ 10000 ]
            Status In Progress [ 3 ] → Done [ 10002 ]
            tjenness Tim Jenness made changes -
            Assignee Monika Adamow [ madamow ]
            tjenness Tim Jenness made changes -
            Assignee Monika Adamow [ madamow ]
            tjenness Tim Jenness made changes -
            Component/s obs_base [ 10719 ]
            tjenness Tim Jenness made changes -
            Labels gen3-middleware
            yusra Yusra AlSayyad made changes -
            Team Data Release Production [ 10301 ] → Data Facility [ 12219 ]

              People

              Assignee:
              madamow Monika Adamow
              Reporter:
              madamow Monika Adamow
              Watchers:
              Eli Rykoff, Jim Bosch, Michelle Gower, Monika Adamow, Tim Jenness
              Votes:
              0

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.