# Provide HTM/PS1 refcats for validation_data_(hsc, decam, cfht)

## Details

• Type: Improvement
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
• Story Points: 7
• Sprint: AP F18-3

## Description

Please provide appropriate HTM-format PS1 refcats for validation_data_(hsc, decam, cfht).

This ticket simply covers creating and uploading those catalogs: they'll be adopted, and the old catalogs retired, on DM-14868.

## Attachments

1. rename_filters.py (2 kB)

## Activity

John Parejko added a comment -

It looks like the README already gives everything we need to know about the sky circle we need to search to find the necessary shards: https://github.com/lsst/validation_data_hsc

ra=21:20, dec=00:00, radius="HSC pupil radius + 0.8 degrees"

John Swinbank added a comment -

I'm expanding the scope of this ticket to cover new-style PS1 refcats for all of validation_data_(hsc, decam, cfht).

I'm simultaneously reducing it in scope to merely providing those refcats, not other changes necessary to replace and retire the astrometry.net refcat. Michael Wood-Vasey will take care of that on DM-14868.

Chris Morrison added a comment -

Added both sdss and ps1 indexed reference catalogs to validation_data_cfht/decam/hsc. John Parejko will incorporate those catalogs into the testdata_jointcal as part of this ticket.

John Parejko added a comment -

John Swinbank: testdata_jointcal won't block Chris Morrison's merging of the new matcher, since it contains already-processed data. We should get a new ticket to move these new refcats into testdata_jointcal (leaving only the lsstSim data that I hope to replace with something from DC2 eventually) and update jointcal to use them.

Michael Wood-Vasey added a comment -

Chris Morrison These new ref_cats directories should go in the base directory, not in data/. The data/ directory is the butler repository that stores ingested and processed images, and it gets regenerated every time. So, by analogy with the astrometry_net_data directory, just create these at the base level of the repo. E.g.,

validation_data_cfht/ref_cats
validation_data_decam/ref_cats
validation_data_hsc/ref_cats

Then when data are re-processed and the data/ directory is regenerated, the ref_cats/ directory will get linked into the regenerated repo. I'll update that part of the scripts as part of DM-14868 if you can move the ref_cats here in this DM-13701.
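
The linking step described above might look roughly like the following. This is only a sketch: the helper name `link_ref_cats` and its default directory names are hypothetical, and the actual scripts updated on DM-14868 may well do this differently.

```python
from pathlib import Path


def link_ref_cats(repo_root, butler_dir="data", ref_cats_dir="ref_cats"):
    """Symlink <repo_root>/ref_cats into the (re)generated butler repo.

    Hypothetical helper: the directory names are assumptions taken from
    the layout suggested in this ticket.
    """
    repo_root = Path(repo_root)
    target = (repo_root / ref_cats_dir).resolve()
    if not target.is_dir():
        raise FileNotFoundError("no ref_cats directory at %s" % target)

    link = repo_root / butler_dir / ref_cats_dir
    # The butler directory is regenerated on every run, so recreate it if needed.
    link.parent.mkdir(parents=True, exist_ok=True)
    # Replace a stale link left over from a previous run.
    if link.is_symlink():
        link.unlink()
    link.symlink_to(target, target_is_directory=True)
    return link
```

Because the catalogs themselves stay at the base level, deleting and regenerating data/ only removes the link, never the reference catalogs.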

Michael Wood-Vasey added a comment -

Please also add a brief entry for the new ref_cats directory to the "Files" section of the respective README.txt files.

Michael Wood-Vasey added a comment -

Chris Morrison Do you have sets of color terms you would recommend using with these new catalogs? E.g., SDSS->CFHT, PS1->CFHT, etc.

Chris Morrison added a comment -

Hey Michael Wood-Vasey I do not have color terms I could recommend.

Michael Wood-Vasey added a comment -

validation_data_cfht looks good to go.

I'm testing validation_data_decam today.

Chris Morrison added a comment -

Okay, I'm trying to fix up HSC today, but it has been difficult to download and get the branch updated with your comments. I'll push to that branch once I address your comments.

I'll try to merge all the pull requests at once after testing them in Jenkins, just to make completely sure that nothing has been broken.

Michael Wood-Vasey added a comment -

validation_data_decam ref_cats look good.

Michael Wood-Vasey added a comment -

Ah, I had been just looking at the PS1 files. I just took a look at the SDSS HTM files.

Could you re-generate those to have lower-case filter names?

The '_flux' and '_fluxSigma' columns currently have names like 'G_flux' and 'G_fluxSigma'. Those should instead be 'g_flux' and 'g_fluxSigma'. Similarly for U, G, R, I, Z -> u, g, r, i, z.

Chris Morrison added a comment -

The files are copied from the datasets directory on lsst-dev and would have to be regenerated on a different ticket.

Michael Wood-Vasey added a comment -

I don't understand.

Michael Wood-Vasey added a comment -

Oh, somebody else (Hsin-Fang Chiang?) generated the HTM-formatted files and put them in

/datasets/refcats/htm/ps1_pv3_3pi_20170110/
/datasets/refcats/htm/sdss-dr9-fink-v5b

and you've copied out those files to put them in the validation_data_* repos.
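
As an illustration of that copy step, here is a rough sketch. It assumes each HTM refcat directory holds one `<shard id>.fits` file per trixel plus shared `config.py` and `master_schema.fits` files; that layout, the helper name `copy_shards`, and the idea of passing in a precomputed list of shard ids are all assumptions, not a record of how the files were actually copied.

```python
import shutil
from pathlib import Path

# Assumed layout: "<shard id>.fits" per HTM trixel, plus these shared files.
SHARED_FILES = ("config.py", "master_schema.fits")


def copy_shards(src_dir, dest_dir, shard_ids, shared_files=SHARED_FILES):
    """Copy the listed shards (and shared files) from src_dir to dest_dir.

    Returns (copied, missing) lists of file names so that absent shards
    are reported rather than silently skipped.
    """
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    names = ["%s.fits" % shard_id for shard_id in shard_ids] + list(shared_files)
    copied, missing = [], []
    for name in names:
        source = src_dir / name
        if source.is_file():
            shutil.copy2(source, dest_dir / name)
            copied.append(name)
        else:
            missing.append(name)
    return copied, missing
```

A call might look like `copy_shards("/datasets/refcats/htm/ps1_pv3_3pi_20170110", "validation_data_hsc/ref_cats/ps1_pv3_3pi_20170110", shard_ids)`, where `shard_ids` is whatever list of overlapping trixels was worked out for that dataset.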

Michael Wood-Vasey added a comment - - edited

You could explicitly rename the columns in the FITS files when you add them.

Chris Morrison added a comment -

I'm not too worried about this, as it could be handled in a config. Also, I believe the reference catalogs will be updated soon, as Russel is adding proper motions to the refcats. I think the most important thing this ticket accomplishes is figuring out which HTM trixels overlap the validation data. Those numbers will not change in subsequent versions.
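
To make the overlap question concrete, here is a rough sketch of the kind of sky-circle test involved, using the field center quoted from the validation_data_hsc README earlier in this ticket. It assumes that ra=21:20 is given in hours:minutes (i.e. 320 degrees), and the radius used in the example is a placeholder. It also tests individual points rather than HTM trixel boundaries, so it illustrates the geometry only and is not how the shards were actually selected.

```python
import math


def hms_to_deg(hms):
    """Convert an 'HH:MM[:SS]' right ascension to degrees (assumes hours)."""
    parts = [float(p) for p in hms.split(":")]
    parts += [0.0] * (3 - len(parts))
    hours = parts[0] + parts[1] / 60.0 + parts[2] / 3600.0
    return hours * 15.0


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); all angles in degrees."""
    ra1, dec1, ra2, dec2 = (math.radians(x) for x in (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def points_in_circle(points, center_ra, center_dec, radius_deg):
    """Return the names of the (ra, dec) points inside the search circle."""
    return [name for name, (ra, dec) in points.items()
            if angular_separation_deg(ra, dec, center_ra, center_dec) <= radius_deg]


center_ra = hms_to_deg("21:20")   # assumption: hours:minutes
center_dec = 0.0
radius_deg = 1.55                 # placeholder, NOT a value from the README
```

A real selection would ask the HTM indexer which trixels intersect the circle; the point test here only shows why the list of shards depends solely on the field center and radius, and not on the catalog version.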

Michael Wood-Vasey added a comment -

I would suggest leaving the SDSS HTM catalogs out for now and just adding the PS1 ones.

The PS1 files carry that same information about the overlap.

Chris Morrison added a comment -

One of the reasons I am including them is so John Parejko can change his jointcal tests one step at a time, instead of changing both the format and the reference data at once. The original plan was to just have PS1 there, but that changed after talking to him.

John Parejko added a comment -

If it makes the work easier, I'm ok with going straight to PS1, and I'll just sort out the jointcal test changes later.

I too would prefer the column names match the filter names (e.g. U and u are quite different filters!).

Michael Wood-Vasey added a comment - - edited

I'm going to take the liberty of renaming the columns and pushing that to DM-13701.

    #!/usr/bin/env python
    """
    Rename mistakenly labeled U_flux, U_fluxSigma, G_flux etc. to u_flux, u_fluxSigma, g_flux, ...
    """
    import sys

    from astropy.io import fits
    from numpy.lib.recfunctions import rename_fields

    verbose = False


    def rename_filters(table, old_filters=('U', 'G', 'R', 'I', 'Z'), new_filters=('u', 'g', 'r', 'i', 'z')):
        """Rename _flux, _fluxSigma columns of table."""
        rename_pairs = {}
        for old, new in zip(old_filters, new_filters):
            rename_pairs["%s_flux" % old] = "%s_flux" % new
            rename_pairs["%s_fluxSigma" % old] = "%s_fluxSigma" % new

        print("Renaming: ")
        print(rename_pairs)
        new_table = rename_fields(table, rename_pairs)

        return new_table


    for file in sys.argv[1:]:
        with fits.open(file) as hdu_list:
            table = hdu_list[1].data
            if verbose:
                print(table.dtype)
            new_table = rename_filters(table)
            if verbose:
                print(new_table.dtype)
            hdu_list[1].data = new_table
            hdu_list.writeto('newtable.fits')

Michael Wood-Vasey added a comment -

Oh, that doesn't quite work because FITS_rec wraps numpy.recarray and I need to update the Column definitions too, but I'll do that this evening.

Chris Morrison added a comment -

The ticket is currently passing Jenkins including ci_hsc. Let me know when you've changed the columns and I will merge the ticket.

Michael Wood-Vasey added a comment -

I haven't had a chance to look at the validation_data_hsc catalogs.

Michael Wood-Vasey added a comment -

Need to use table.columns.change_name within the FITS_rec and ColDefs world. I just attached the renaming script to this ticket for reference.

Michael Wood-Vasey added a comment -

I've updated the filter names for the ref_cats/sdss-dr9-fink-v5b HTM files and pushed to the tickets/DM-13701 branch for each of validation_data_[cfht,decam,hsc].

Michael Wood-Vasey added a comment -

I'll be able to actually try running the new HTM ref_cats with validation_data_hsc tomorrow.

Michael Wood-Vasey added a comment -

validation_data_hsc is running and looks good.

Michael Wood-Vasey added a comment -

These look good. Merge them!

Thanks for doing all of the bookkeeping to figure out which HTM files were needed.

Chris Morrison added a comment -

Confirmed shards for all validation datasets are correct and successfully find astrometric/photometric solutions.

Jenkins run: https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/detail/stack-os-matrix/28575/pipeline

Michael Wood-Vasey added a comment -

FWIW, that Jenkins run doesn't actually use these new catalogs. That won't happen until DM-14868.

Chris Morrison added a comment -

Yep, it was more to test that something else didn't silently break when adding the catalogs.

Michael Wood-Vasey added a comment -

+1. Always good to test.


## People

• Assignee: Chris Morrison
• Reporter: John Parejko
• Reviewers: Michael Wood-Vasey
• Watchers: Angelo Fausti, Chris Morrison, Hsin-Fang Chiang, John Parejko, John Swinbank, Michael Wood-Vasey, Simon Krughoff