Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-31382

Run faro in ci_hsc and ci_imsim

    XMLWordPrintable

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Story Points:
      5
    • Team:
      Data Release Production
    • Urgent?:
      No

      Description

      This will give the faro devs more confidence that their tickets will run on the RC2/DC2 reprocessings and thus real data release productions.

        Attachments

          Issue Links

            Activity

            Hide
            jcarlin Jeffrey Carlin added a comment -

            Yep, that's exactly what we would ideally want, Tim. Keith Bechtol and I have scheduled discussion of how to proceed with this for our next V&V team meeting.

            Show
            jcarlin Jeffrey Carlin added a comment - Yep, that's exactly what we would ideally want, Tim. Keith Bechtol and I have scheduled discussion of how to proceed with this for our next V&V team meeting.
            Hide
            dtaranu Dan Taranu added a comment - - edited

            I have updated the PR and will be running jenkins shortly (both ci_imsim and ci_hsc_gen3 are passing locally).

            I initially just wanted to make minimal changes to allow faro to pass on both, but it was getting difficult to follow the logic of the more complex run methods so I made some steps towards DM-31061 by changing those with catalogs, photoCalibs, astromCalibs arguments to instead use a Dict[str, List[CalibratedCatalog], where CalibratedCatalog is a frozen dataclass with a SourceCatalog and optional calibs. My logic for this is that as far as I can tell, all of the subsequent methods using these lists of catalogs and calibs expect them all to be the same length and in the same order (ie for zipping), so it makes more sense to have one single object storing related data.

            I did add some much-needed type hints where I could but did not go further in implementing the use of dicts of CalibratedCatalog across faro for a couple of reasons:

            1. Once faro starts using object tables instead of source catalogs (DM-31459, DM-31460) most of this will be a moot point. This is virtually a requirement for some tasks like wPerp to run at all. Although I 'fixed' it so that it can run locally, wPerp currently uses over 100GB of memory to merge catalogs, which is exactly what objectTable is doing anyway. So I don't want to put a lot of effort into refactoring code that's just going to be wiped soon anyway. You may want to reconsider the priority of DM-31061 in light of this.

            2. I'm not sure if you actually want to homogenize on a Dict[str, List] everywhere (where str is a band). You might want to consider attaching the dataId to the CalibratedCatalog (or, in the future, object table) instead. But either way the run methods need some kind of validation or separation between those that expect exactly one catalog vs multiple, one band vs multiple, etc.

            Show
            dtaranu Dan Taranu added a comment - - edited I have updated the PR and will be running jenkins shortly (both ci_imsim and ci_hsc_gen3 are passing locally). I initially just wanted to make minimal changes to allow faro to pass on both, but it was getting difficult to follow the logic of the more complex run methods so I made some steps towards DM-31061 by changing those with catalogs, photoCalibs, astromCalibs arguments to instead use a Dict[str, List[CalibratedCatalog] , where CalibratedCatalog is a frozen dataclass with a SourceCatalog and optional calibs. My logic for this is that as far as I can tell, all of the subsequent methods using these lists of catalogs and calibs expect them all to be the same length and in the same order (ie for zipping), so it makes more sense to have one single object storing related data. I did add some much-needed type hints where I could but did not go further in implementing the use of dicts of CalibratedCatalog across faro for a couple of reasons: 1. Once faro starts using object tables instead of source catalogs ( DM-31459 , DM-31460 ) most of this will be a moot point. This is virtually a requirement for some tasks like wPerp to run at all. Although I 'fixed' it so that it can run locally, wPerp currently uses over 100GB of memory to merge catalogs, which is exactly what objectTable is doing anyway. So I don't want to put a lot of effort into refactoring code that's just going to be wiped soon anyway. You may want to reconsider the priority of DM-31061 in light of this. 2. I'm not sure if you actually want to homogenize on a Dict [str, List] everywhere (where str is a band). You might want to consider attaching the dataId to the CalibratedCatalog (or, in the future, object table) instead. But either way the run methods need some kind of validation or separation between those that expect exactly one catalog vs multiple, one band vs multiple, etc.
            Hide
            dtaranu Dan Taranu added a comment -

            Successful Jenkins run. Keith Bechtol, are you happy with the new changes? There was a fair amount added from the initial PR.

            Show
            dtaranu Dan Taranu added a comment - Successful Jenkins run . Keith Bechtol , are you happy with the new changes? There was a fair amount added from the initial PR.
            Hide
            kbechtol Keith Bechtol added a comment -

            Thanks Dan Taranu for these updates. I reviewed again the additional changes in the faro package as well as the updates to add faro to ci_hsc_gen3 and ci_imsim testing. It will be an enormous aid to our development cycle to be able to easy perform this CI testing for the whole set of metrics. Also, the changes to the faro package in terms of homogenizing and simplifying the arguments passed to the run methods are very much appreciated.

            Show
            kbechtol Keith Bechtol added a comment - Thanks Dan Taranu for these updates. I reviewed again the additional changes in the faro package as well as the updates to add faro to ci_hsc_gen3 and ci_imsim testing. It will be an enormous aid to our development cycle to be able to easy perform this CI testing for the whole set of metrics. Also, the changes to the faro package in terms of homogenizing and simplifying the arguments passed to the run methods are very much appreciated.
            Hide
            dtaranu Dan Taranu added a comment -

            Merged and seemingly working correctly for the last week.

            Show
            dtaranu Dan Taranu added a comment - Merged and seemingly working correctly for the last week.

              People

              Assignee:
              dtaranu Dan Taranu
              Reporter:
              yusra Yusra AlSayyad
              Reviewers:
              Keith Bechtol
              Watchers:
              Dan Taranu, Jeffrey Carlin, Keith Bechtol, Tim Jenness, Yusra AlSayyad
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.