  1. Data Management
  2. DM-13701

Provide HTM/PS1 refcats for validation_data_(hsc, decam, cfht)

    Details

      Description

      Please provide appropriate HTM-format PS1 refcats for validation_data_(hsc, decam, cfht).

      This ticket simply covers creating and uploading those catalogs: they'll be adopted, and the old catalogs retired, on DM-14868.

        Attachments

          Issue Links

            Activity

            Parejkoj John Parejko added a comment -

            It looks like the readme already gives everything we need to know about the sky circle we need to search to find the necessary shards: https://github.com/lsst/validation_data_hsc

            ra=21:20, dec=00:00, radius="HSC pupil radius + 0.8 degrees"
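A minimal sketch of the sky-circle test implied above, using only the Python standard library. It assumes the RA in the readme is given in sexagesimal hours (so 21:20 is 320 degrees); the numeric radius is a made-up placeholder, since the thread does not give the HSC pupil radius. This only checks whether a single position falls inside the circle. Selecting shards properly requires testing HTM trixel overlap with an HTM library, which is not shown here.

```python
import math


def sexagesimal_hours_to_degrees(text):
    """Convert an 'HH:MM[:SS]' right ascension string to decimal degrees."""
    parts = [float(p) for p in text.split(":")]
    parts += [0.0] * (3 - len(parts))  # pad missing minutes/seconds
    hours = parts[0] + parts[1] / 60.0 + parts[2] / 3600.0
    return hours * 15.0  # 1 hour of RA = 15 degrees


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, via the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def in_search_circle(ra, dec, center_ra, center_dec, radius_deg):
    """True if (ra, dec) lies within radius_deg of the circle center."""
    return angular_separation_deg(ra, dec, center_ra, center_dec) <= radius_deg


# Assumption: RA is in hours. Dec 00:00 is 0 degrees.
CENTER_RA_DEG = sexagesimal_hours_to_degrees("21:20")
CENTER_DEC_DEG = 0.0
# Assumption: placeholder only; substitute the real pupil radius + 0.8 deg.
ASSUMED_RADIUS_DEG = 1.55
```

For example, `in_search_circle(321.0, 0.5, CENTER_RA_DEG, CENTER_DEC_DEG, ASSUMED_RADIUS_DEG)` checks a position roughly 1.1 degrees from the center.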

            swinbank John Swinbank added a comment -

            I'm expanding the scope of this ticket to cover new-style PS1 refcats for all of validation_data_(hsc, decam, cfht).

            I'm simultaneously reducing it in scope to merely providing those refcats, not other changes necessary to replace and retire the astrometry.net refcat. Michael Wood-Vasey will take care of that on DM-14868.

            cmorrison Chris Morrison added a comment -

            Added both sdss and ps1 indexed reference catalogs to validation_data_cfht/decam/hsc. John Parejko will incorporate those catalogs into the testdata_jointcal as part of this ticket.

            Parejkoj John Parejko added a comment -

            John Swinbank: testdata_jointcal won't block Chris Morrison's merging of the new matcher, since it contains already-processed data. We should get a new ticket to move these new refcats into testdata_jointcal (leaving only the lsstSim data, which I hope to replace with something from DC2 eventually) and update jointcal to use them.

            wmwood-vasey Michael Wood-Vasey added a comment -

            Chris Morrison: These new ref_cats directories should go in the base directory, not in data/. The data/ directory is the butler repository that stores ingested and processed images, and it gets regenerated every time. So, by analogy with the astrometry_net_data directory, just create these at the base level in the repo. E.g.,

            validation_data_cfht/ref_cats
            validation_data_decam/ref_cats
            validation_data_hsc/ref_cats

            Then when data are re-processed and the data/ directory is regenerated, ref_cats/ will get linked into the regenerated repo. I'll update that part of the scripts as part of DM-14868 if you can move the ref_cats here in this DM-13701.
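A rough sketch of the linking step described above, assuming a plain symlink from the regenerated data/ repository back to the top-level ref_cats directory. The function name `link_ref_cats` and the layout are hypothetical; the actual processing scripts (to be updated on DM-14868) may do this differently.

```python
import os
import tempfile


def link_ref_cats(package_dir, repo_dir):
    """Symlink <package_dir>/ref_cats into <repo_dir>/ref_cats.

    Hypothetical helper: package_dir stands for a validation_data_*
    checkout and repo_dir for its regenerated data/ butler repository.
    """
    source = os.path.join(package_dir, "ref_cats")
    target = os.path.join(repo_dir, "ref_cats")
    if not os.path.isdir(source):
        raise FileNotFoundError(source)
    os.makedirs(repo_dir, exist_ok=True)
    if os.path.islink(target):
        os.remove(target)  # replace a stale link left by a previous run
    os.symlink(os.path.abspath(source), target)
    return target


# Demonstration in a throwaway directory standing in for a real checkout.
with tempfile.TemporaryDirectory() as package_dir:
    os.makedirs(os.path.join(package_dir, "ref_cats"))
    repo_dir = os.path.join(package_dir, "data")
    link = link_ref_cats(package_dir, repo_dir)
    linked_ok = os.path.islink(link) and os.path.isdir(link)
```

Keeping the catalogs outside data/ means the repository can be deleted and rebuilt without copying the shards again.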

            wmwood-vasey Michael Wood-Vasey added a comment -

            Please also add a brief entry for the new ref_cats directory to the "Files" section of respective README.txt files.

            wmwood-vasey Michael Wood-Vasey added a comment -

            Chris Morrison Do you have sets of color terms you would recommend using with these new catalogs? E.g., SDSS->CFHT, PS1->CFHT, etc.

            cmorrison Chris Morrison added a comment -

            Hey Michael Wood-Vasey I do not have color terms I could recommend.

            wmwood-vasey Michael Wood-Vasey added a comment -

            validation_data_cfht looks good to go.

            I'm testing validation_data_decam today.

            cmorrison Chris Morrison added a comment -

            Okay, I'm trying to fix up HSC today, but it has been difficult to download and get the branch updated with your comments. I'll push to that branch once I address your comments.

            I'll try to merge all the pull requests at once after testing them in Jenkins just to make completely sure that nothing has been broken.

            wmwood-vasey Michael Wood-Vasey added a comment -

            validation_data_decam ref_cats look good.

            wmwood-vasey Michael Wood-Vasey added a comment -

            Ah, I had been just looking at the PS1 files. I just took a look at the SDSS HTM files.

            Could you re-generate those to have lower-case filter names?

            The '_flux' and '_fluxSigma' columns currently have names like 'G_flux' and 'G_fluxSigma'. Those should instead be 'g_flux' and 'g_fluxSigma'. Similarly for U, G, R, I, Z -> u, g, r, i, z.

            cmorrison Chris Morrison added a comment -

            The files are copied from the datasets directory on lsst-dev and would have to be regenerated on a different ticket.

            wmwood-vasey Michael Wood-Vasey added a comment -

            I don't understand.

            wmwood-vasey Michael Wood-Vasey added a comment -

            Oh, somebody else (Hsin-Fang Chiang?) generated the HTM-formatted files and put them in

            /datasets/refcats/htm/ps1_pv3_3pi_20170110/
            /datasets/refcats/htm/sdss-dr9-fink-v5b

            and you've copied out those files to put them in the validation_data_* repos.

            wmwood-vasey Michael Wood-Vasey added a comment - - edited

            You could explicitly rename the columns in the FITS files when you add them.

            cmorrison Chris Morrison added a comment -

            I'm not too worried about this, as it could be handled in a config. Also, I believe the reference catalogs will be updated soon, as Russel is adding proper motions to the refcats. I think the most important thing this ticket accomplishes is figuring out which HTM trixels overlap the validation data. Those numbers will not change in subsequent versions.

            wmwood-vasey Michael Wood-Vasey added a comment -

            I would suggest leaving the SDSS HTM catalogs out for now and just add the PS1 ones.

            The PS1 files carry that same information about the overlap.

            cmorrison Chris Morrison added a comment -

            One of the reasons I am including them is so John Parejko can change his jointcal tests one step at a time instead of changing both the format and the reference data at once. The original plan was to just have PS1 there, but that changed after talking to him.

            Parejkoj John Parejko added a comment -

            If it makes the work easier, I'm ok with going straight to PS1, and I'll just sort out the jointcal test changes later.

            I too would prefer the column names match the filter names (e.g. U and u are quite different filters!).

            wmwood-vasey Michael Wood-Vasey added a comment - - edited

            I'm going to take the liberty of renaming the columns and pushing that to DM-13701.

             

            #!/usr/bin/env python
             
            """
            Rename mistakenly labeled U_flux, U_fluxSigma, G_flux etc.
            to u_flux, u_fluxSigma, g_flux, ...
            """
             
            import sys
             
            from astropy.io import fits
             
            from numpy.lib.recfunctions import rename_fields
             
            verbose = False
             
             
            def rename_filters(table, old_filters=('U', 'G', 'R', 'I', 'Z'), new_filters=('u', 'g', 'r', 'i', 'z')):
                """Rename _flux, _fluxSigma columns of table."""
                rename_pairs = {}
                for old, new in zip(old_filters, new_filters):
                    rename_pairs["%s_flux" % old] = "%s_flux" % new
                    rename_pairs["%s_fluxSigma" % old] = "%s_fluxSigma" % new
             
                print("Renaming: ")
                print(rename_pairs)
                new_table = rename_fields(table, rename_pairs)
             
                return new_table
             
             
            for file in sys.argv[1:]:
                with fits.open(file) as hdu_list:
                    table = hdu_list[1].data
                    if verbose:
                        print(table.dtype)
                    new_table = rename_filters(table)
                    if verbose:
                        print(new_table.dtype)
                    hdu_list[1].data = new_table
                    hdu_list.writeto('newtable.fits')
            

            wmwood-vasey Michael Wood-Vasey added a comment -

            Oh, that doesn't quite work because FITS_rec wraps numpy.recarray and I need to update the Column definitions too, but I'll do that this evening.

            cmorrison Chris Morrison added a comment -

            The ticket is currently passing Jenkins including ci_hsc. Let me know when you've changed the columns and I will merge the ticket.

            wmwood-vasey Michael Wood-Vasey added a comment -

            I haven't had a chance to look at the validation_data_hsc catalogs.

            wmwood-vasey Michael Wood-Vasey added a comment -

            Need to use table.columns.change_name within the FITS_rec and ColDefs world. I just attached the renaming script to this ticket for reference.

            rename_filters.py

            wmwood-vasey Michael Wood-Vasey added a comment -

            I've updated the filter names for the ref_cats/sdss-dr9-fink-v5b HTM files and pushed to the tickets/DM-13701 branch for each of validation_data_[cfht,decam,hsc].

            wmwood-vasey Michael Wood-Vasey added a comment -

            I'll be able to actually try running the new HTM ref_cats with validation_data_hsc tomorrow.

            wmwood-vasey Michael Wood-Vasey added a comment -

            validation_data_hsc is running and looks good.

            wmwood-vasey Michael Wood-Vasey added a comment -

            These look good. Merge them!

            Thanks for doing all of the bookkeeping to figure out which HTM files were needed.

            cmorrison Chris Morrison added a comment -

            Confirmed shards for all validation datasets are correct and successfully find astrometric/photometric solutions.

            Jenkins run: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/28575/pipeline

            wmwood-vasey Michael Wood-Vasey added a comment -

            FWIW, that Jenkins run doesn't actually use these new catalogs. That won't happen until DM-14868.

            cmorrison Chris Morrison added a comment -

            Yep, it was more to test that something else didn't silently break when adding the catalogs.

            wmwood-vasey Michael Wood-Vasey added a comment -

            +1. Always good to test.


              People

              • Assignee:
                cmorrison Chris Morrison
                Reporter:
                Parejkoj John Parejko
                Reviewers:
                Michael Wood-Vasey
                Watchers:
                Angelo Fausti, Chris Morrison, Hsin-Fang Chiang, John Parejko, John Swinbank, Michael Wood-Vasey, Simon Krughoff