Neither of these datasets is currently used for verification/validation purposes, nor is either regularly processed, and their READMEs and associated files are all based on gen2. Making them usable in gen3 is not worth the time because we have much better alternatives that are already being processed regularly. Keeping them in place is potentially confusing, as they are described as "test data for exercising the LSST stack through single frame and coadd processing", which has not really been true for a year or more.
I propose we mark both datasets as deprecated in their package README files and move them to lsst-dm/legacy-*, following the removal procedure described here.
validation_data_hsc has been supplanted for verification/validation testing by the rc2_subset, which was intentionally selected to be more useful for coadd testing. ci_hsc is similarly more useful for testing the full pipeline in a CI setting. I stopped using this dataset for jointcal testing in May 2020, and I believe that I was the last real user of it.
validation_data_decam consists of Community Pipeline instcals, which we do not support in gen3. For demonstrating that we can process DECam data, we have the well-documented ap_verify_ci_hits2015 (3 visits with 2 detectors each, ~6GB repo) and ap_verify_hits2015 (>80 full visits, ~220GB repo) datasets, which can be processed in gen3 with a single ap_verify command. testdata_jointcal will drop use of validation_data_decam shortly (because instcals cannot be processed in gen3, and there is neither the time nor the desire to add support for them), and I believe there have been no other users for several years.
I am not proposing we deprecate/remove validation_data_cfht, as I was able to process it in gen3 singleFrame+jointcal on DM-32373, and it provides our only source of CFHT test data. Although it is postISR data from the Elixir pipeline, I think it is still useful for demonstrating running our pipeline on a non-LSST/DECam/HSC data source.