Data Management / DM-28991

Fix gen3 jointcal refcat area

    Details

    • Type: Story
    • Status: Won't Fix
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: jointcal
    • Labels:
      None
    • Story Points:
      2
    • Team:
      Data Access and Database
    • Urgent?:
      No

      Description

      As implemented in DM-27869, jointcal can only load refcats that overlap the specified tract: when the QuantumGraph is built, the gen3 `ReferenceObjectLoader` is given datarefs only to the refcat shards that touch the tract. Nate Lust suggested fixing this by making refcats PrerequisiteInputs and defining a lookupFunction that finds every refcat that touches every visit+detector combination.
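      To make the gap concrete, here is a toy sketch of why selecting shards by tract overlap can miss shards that a dithered detector needs. Everything in it is hypothetical: the `Box` class, the 4x4 grid of unit "shards", and the detector footprints are illustrative stand-ins for real sky regions, not jointcal or butler code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle standing in for a sky region (hypothetical)."""

    x0: float
    x1: float
    y0: float
    y1: float

    def overlaps(self, other: "Box") -> bool:
        # Strict inequalities: boxes that only share an edge do not overlap.
        return (
            self.x0 < other.x1
            and other.x0 < self.x1
            and self.y0 < other.y1
            and other.y0 < self.y1
        )


# Toy layout: a 4x4 grid of unit refcat shards; the tract covers the central 2x2.
tract = Box(1, 3, 1, 3)
shards = {(i, j): Box(i, i + 1, j, j + 1) for i in range(4) for j in range(4)}

# Two detector footprints from dithered visits; each straddles a tract edge.
detectors = [Box(0.5, 1.5, 1.2, 1.8), Box(2.5, 3.5, 2.2, 2.8)]

# Selection by tract overlap (the behaviour this ticket describes).
by_tract = {key for key, shard in shards.items() if shard.overlaps(tract)}

# Selection by overlap with any detector (the behaviour the ticket asks for).
by_detector = {
    key
    for key, shard in shards.items()
    if any(shard.overlaps(det) for det in detectors)
}

# Shards that a detector needs but that the tract-only selection never loads.
missed = by_detector - by_tract
print(sorted(missed))
```

      In this toy layout the shards just outside the tract edge are needed by the detectors but never selected by the tract-only rule, which is the situation that real test data would need to reproduce.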

      Doing this as a follow-on to DM-27869 so I can get that merged as a starting point.

      This may become entirely trivial if DM-21904 is completed.

            Activity

            nlust Nate Lust added a comment - - edited

            I'm cross-posting this from a personal message in case it proves useful for someone else to see, or for long-term records. Below is a script that looks up the refcats as required; its comments note how this would differ as part of a lookup function.

            from lsst.daf.butler import Butler

            butlerPath = '/datasets/hsc/gen3repo/rc2w02_ssw03/'
            collections = 'HSC/runs/RC2/w_2021_02'  # this will be provided to you
            butler = Butler(butlerPath)
            # registry is what you will be provided
            registry = butler.registry
            # You will be given a quantum dataId, but for this example we look one up: tract 9813 in i band
            quantumId = list(registry.queryDataIds(("tract", "band", "instrument", "skymap"), where="tract=9813 AND skymap='hsc_rings_v1' AND instrument='HSC' AND band='i'"))[0]
            # Now, with the given quantum dataId, look up the visits that correspond to that tract
            visitRefs = set(registry.queryDataIds(("visit", "instrument"), dataId=quantumId))
            # Get the refcats that overlap each visit
            # Note: the dataset type name will be self.name in the lookup function
            dsTypeName = 'gaia_dr2_20200414'

            # Accumulate the refcat DatasetRefs for all visits into one set
            refCats = set()
            for vr in visitRefs:
                refCats.update(registry.queryDatasets(dsTypeName, collections=collections, dataId=vr))
            # These are the DatasetRefs for the refcats that correspond to the visits overlapping the tract
            print(refCats)
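
            The script's comments describe what a lookup function would be handed (a registry, the collections, a quantum dataId, and the dataset type). A minimal sketch of that shape follows. The four-argument signature, the name `lookup_visit_refcats`, and the `StubRegistry` class are assumptions made for illustration; the stub only mimics the two query calls used in the script so the sketch can run without the LSST stack, and the visit numbers and shard names are made up.

```python
class StubRegistry:
    """Hypothetical in-memory stand-in for a butler registry (illustration only)."""

    def __init__(self, visits_by_tract, shards_by_visit):
        self._visits_by_tract = visits_by_tract
        self._shards_by_visit = shards_by_visit

    def queryDataIds(self, dimensions, dataId=None):
        # Return one visit-level data ID per visit overlapping the tract.
        return [
            {"instrument": dataId["instrument"], "visit": visit}
            for visit in self._visits_by_tract[dataId["tract"]]
        ]

    def queryDatasets(self, datasetType, collections=None, dataId=None):
        # Return (dataset type, shard) pairs standing in for DatasetRefs.
        return [
            (datasetType, shard) for shard in self._shards_by_visit[dataId["visit"]]
        ]


def lookup_visit_refcats(datasetType, registry, quantumDataId, collections):
    """Collect refcat refs for every visit associated with the quantum's tract.

    The argument list is an assumed signature, inferred from the script's comments.
    """
    refs = set()
    for visit_id in registry.queryDataIds(("visit", "instrument"), dataId=quantumDataId):
        refs.update(
            registry.queryDatasets(datasetType, collections=collections, dataId=visit_id)
        )
    return refs


# Made-up visits (100, 101) and shard names, purely to exercise the sketch.
registry = StubRegistry(
    visits_by_tract={9813: [100, 101]},
    shards_by_visit={100: ["shard_a", "shard_b"], 101: ["shard_b", "shard_c"]},
)
quantum = {"instrument": "HSC", "skymap": "hsc_rings_v1", "tract": 9813, "band": "i"}
refs = lookup_visit_refcats(
    "gaia_dr2_20200414", registry, quantum, "HSC/runs/RC2/w_2021_02"
)
```

            Using a set means a shard shared by several visits is returned once. With a real registry the loop body would be the same two queries as in the script above.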
            

            jbosch Jim Bosch added a comment -

            I have a probable fix for this in place on tickets/DM-29615, where I was changing nearby code in similar ways and saw a TODO comment for this ticket. I've tested that it doesn't break anything, and I'm pretty convinced that means it's doing the right thing, but I haven't directly confirmed that it addresses the problem (e.g. by inspecting the QuantumGraph in an RC2 run and plotting various regions), and haven't begun to think about how to unit test it (and have no plans to on that ticket), so I think it's worth keeping this ticket around for some kind of testing, even if that makes it a lower priority.

            jbosch Jim Bosch added a comment -

            The last comment notes that this was done on DM-29615, and this ticket was kept around just to test that it worked.  We've certainly done one-off testing by this point, and I'm pretty sure the regular jointcal unit tests have got this covered now, too.  John Parejko can reopen if he disagrees.

            Parejkoj John Parejko added a comment -

            Jointcal unittests don't cover this aspect: it needs to have refcat data extending past the tract boundary (or equivalent mocks), and I don't have testdata like that. The relevant method is `lookupVisitRefCats`, added by you; see discussion here. Jointcal at least outputs the relevant metrics in gen3 now, so a gen2/gen3 comparison on a full tracted+multiple dithers would probably tell us for sure that we're getting it right. I think Lauren MacArthur's work on DM-29821 demonstrated that we were loading the same refcat area in gen2/gen3 on 9813 (assuming that's a good comparison tract?), so we're probably ok here.


              People

              Assignee:
              Parejkoj John Parejko
              Reporter:
              Parejkoj John Parejko
              Watchers:
              Ian Sullivan, Jim Bosch, John Parejko, Nate Lust