  1. Data Management
  2. DM-26039

Missing ObsCore fields for image metadata in Butler Gen3

    Details

    • Type: Improvement
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: ImgServ
    • Labels:
    • Team:
      Data Access and Database

      Description

      The missing (or desired) fields are denoted by # FIXME in the generation script for ObsCore data for images:

      https://github.com/lsst/dax_imgserv/blob/tickets/DM-21433/integration/ci_hsc/gen_img_obscore.py

       

      Here are the FIXME line items:

            r["target_name"] = ? # FIXME: to be scheduler field ID 

            r["obs_id"] = ? # FIXME: to be online OBSID

            r["s_fov"] =  ?  # FIXME: fov - the diameter of a circle around s_ra, s_dec

       

      For reference on the above fields, see Gregory Dubois-Felsmann's notes: https://confluence.lsstcorp.org/pages/viewpage.action?spaceKey=~gpdf&title=Satisfying+ObsCore+from+the+Gen3+Butler+Schema
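The s_fov FIXME above describes the value as the diameter of a circle around s_ra, s_dec. A minimal sketch of one way to derive such a number, assuming the footprint is available as a list of (ra, dec) corner vertices in degrees. The function names and the vertex-list input are hypothetical and are not taken from gen_img_obscore.py or the Butler API:

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def s_fov_from_vertices(s_ra, s_dec, vertices):
    """Diameter (degrees) of the circle centered on (s_ra, s_dec) that
    reaches the farthest vertex. `vertices` is an iterable of (ra, dec)."""
    return 2 * max(
        angular_separation_deg(s_ra, s_dec, ra, dec) for ra, dec in vertices
    )
```

This yields a circle centered on the given position, which is not necessarily the smallest enclosing circle of the footprint; for a roughly symmetric detector footprint the two should be close.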

       

      As of now, the script fetches all the other fields from the ci_hsc test dataset without loading the Exposure object of the corresponding FITS file into memory. It takes about 1 min (on my low-end PC) to process the dataset and generate 68 rows of ObsCore data.

       

      FYI, the generated raw CSV output has been attached here for reference. This raw CSV output is to be combined with SQL DDL and DML scripts to insert the data into a local PostgreSQL database, which has been done successfully, albeit using the psql tool.
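As an illustration of the CSV-to-DML step, here is a minimal sketch that turns CSV text with a header row into a parameterized INSERT statement plus row tuples. The table name `obscore_images` and the function name are hypothetical, and the attached scripts may do this differently. The `%s` placeholder style is the one accepted by DB-API drivers such as psycopg2:

```python
import csv
import io


def build_insert(csv_text, table="obscore_images"):
    """Return (sql, rows) for inserting CSV rows into `table`.

    Column names are taken from the CSV header and interpolated into the
    SQL text, so the header must come from a trusted source. Empty cells
    are mapped to None so they are inserted as NULL.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    rows = [
        tuple(row[c] if row[c] != "" else None for c in columns)
        for row in reader
    ]
    return sql, rows
```

With a DB-API cursor, the result could be passed to `cursor.executemany(sql, rows)`; values arrive as strings, so the driver or the column types must handle conversion.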

       

            Activity

            kennylo Kenny Lo created issue -
            tjenness Tim Jenness added a comment -

            I may be confused, but obs_id for an exposure is the name of the exposure in the Gen3 record. You can see this by looking at https://github.com/lsst/obs_base/blob/master/python/lsst/obs/base/_instrument.py#L499 – it's the unique string identifying the observation (from the OBSID header).

            The target name I can add (I already calculate it in ObservationInfo).

            Field of view seems to be a slightly different topic in that, from your point of view, it's fixed for the camera. Butler calculates the regions for visits and puts them in the visit table, but the field of view is never going to change for LSSTCam or ComCam, so you can have a lookup by instrument name and use a constant.

            gpdf Gregory Dubois-Felsmann added a comment -

            s_fov will differ for single-epoch (CCD) images vs. coadd patches. I think the lookup should be by instrument AND Butler dataset type.
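A minimal sketch of the lookup suggested in the comments above, keyed by instrument and Butler dataset type. The function name is hypothetical, and the numbers in the example table are placeholders for illustration only, not measured fields of view:

```python
# Placeholder constants in degrees; real values would have to come from
# the camera geometry and the skymap patch size.
EXAMPLE_S_FOV_DEG = {
    ("HSC", "calexp"): 0.25,     # single-epoch (CCD) image, placeholder
    ("HSC", "deepCoadd"): 0.5,   # coadd patch, placeholder
}


def lookup_s_fov(instrument, dataset_type, table=EXAMPLE_S_FOV_DEG):
    """Return the constant s_fov (degrees) for an instrument/dataset type pair."""
    try:
        return table[(instrument, dataset_type)]
    except KeyError:
        raise KeyError(
            f"no s_fov constant for instrument={instrument!r}, "
            f"dataset_type={dataset_type!r}"
        ) from None
```

Raising on an unknown pair, rather than returning a default, keeps a missing entry from silently producing a wrong s_fov in the ObsCore table.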

            gpdf Gregory Dubois-Felsmann added a comment -

            If target_name is going to be based on the scheduler field ID (for single-epoch images), it should have a prefix that defines a namespace, e.g., "Sched_00307", to allow these to be distinguished from other possible targets - for instance, it could be a special ID for TOO observations ("TOO_00017") and for calibration frames it could be "DomeScreen" or some such.

            I don't know if we have to understand all of that now, though?  Just make sure there's a prefix.
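The prefix convention described above could be sketched as follows. The function name and the five-digit zero padding are assumptions inferred from the "Sched_00307" and "TOO_00017" examples, not an agreed format:

```python
def namespaced_target_name(namespace, ident, width=5):
    """Build a target_name such as 'Sched_00307' from a namespace and an ID.

    Integer IDs are zero-padded to `width` digits; any other ID is used
    as-is after conversion to str.
    """
    if not namespace:
        raise ValueError("namespace prefix must not be empty")
    if isinstance(ident, int):
        return f"{namespace}_{ident:0{width}d}"
    return f"{namespace}_{ident}"
```

For example, `namespaced_target_name("Sched", 307)` gives `"Sched_00307"`. Names without a numeric ID, such as "DomeScreen" for calibration frames, would bypass this helper.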

            tjenness Tim Jenness added a comment -

            All I have is the OBJECT header. You get whatever it is they put in there.

            What goes in there is an upstream issue.

            tjenness Tim Jenness added a comment -

            Kenny Lo, you've tagged this ticket as ImgServ and Data Access team and written it as if it is about implementing fixes in ImgServ. What I'm going to do is implement the fix for target name on DM-24575 and leave this ticket to you for handling the change (and, as we have discussed, field of view is not directly a registry problem).

            tjenness Tim Jenness made changes -
            Link This issue is blocked by DM-24575 [ DM-24575 ]
            tjenness Tim Jenness made changes -
            Assignee Tim Jenness [ tjenness ] Kenny Lo [ kennylo ]
            kennylo Kenny Lo made changes -
            Status To Do [ 10001 ] In Progress [ 3 ]
            kennylo Kenny Lo made changes -
            Link This issue has to be finished together with DM-24575 [ DM-24575 ]
            tjenness Tim Jenness added a comment -

            DM-24575 is now in review – I added science program and target name (as well as ra/dec).

            gpdf Gregory Dubois-Felsmann made changes -
            Labels ObsCore
            tjenness Tim Jenness added a comment -

            Kenny Lo let me know if anything further needs to be done by me.

            kennylo Kenny Lo added a comment -

            Tim Jenness FYI, I've been running/debugging the script on the HSC RC2 datasets on the verification cluster.  Will let you know if I run into anything.  

            fritzm Fritz Mueller made changes -
            Resolution Done [ 10000 ]
            Status In Progress [ 3 ] Done [ 10002 ]
            fritzm Fritz Mueller added a comment -

            Kenny has now transitioned off the DAX team.  Any further work beyond current state on this will be separately ticketed and staffed by the SQuaRE team.


              People

              Assignee:
              kennylo Kenny Lo
              Reporter:
              kennylo Kenny Lo
              Reviewers:
              Gregory Dubois-Felsmann, Kenny Lo
              Watchers:
              Fritz Mueller, Gregory Dubois-Felsmann, Kenny Lo, Kian-Tat Lim, Tim Jenness