Details
- Type: RFC
- Status: Implemented
- Resolution: Done
- Component/s: DM
- Labels: None
Description
The ap_verify framework is designed to run standardized datasets, which must be downloaded as separate Git-LFS packages (ap_verify_hits2015, ap_verify_ci_cosmos_pdr2, etc.). The current UI specifies these packages on the command line using a system of "proper" names (HiTS2015, CI-CosmosPDR2, etc.). Although the mapping between these names and their corresponding repositories is documented, having two names for each dataset is confusing for new users and adds friction for experienced users. It also adds developer overhead: ap_verify itself must maintain a list of "supported" datasets, and the ap_verify_testdata dataset needs special-casing so that it has no user-visible name.
I propose that these names be phased out of the UI in favor of using the Git package name in all contexts. Specifically:
- Use of the existing names would be deprecated in Pipelines release 22. The --dataset argument to ap_verify.py, ingest_dataset.py, and add_gen3_repo.py would accept both old-style names and package names, and warn in the former case.
- SQuaSH uploads from ap_verify would immediately switch to using the repository name in the ci_dataset field. This would create a new stream of data, though the old names could still be inspected through the Chronograf UI. AFAIK there is no requirement that all ap_verify results be viewable as a single time series; if there is one, this change can be left out without affecting the others.
- Old-style names would no longer be supported as of release 24, and all underlying code and configs would be removed at that point.
Doing so will make ap_verify both easier to use and easier to maintain.
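For illustration, the deprecation-period behavior of --dataset could be sketched as follows. This is a minimal sketch, not the ap_verify implementation: the function names (resolve_dataset, make_parser) and the choice of FutureWarning are assumptions, and the mapping contains only the two name pairs mentioned above.

```python
import argparse
import warnings

# Old-style name -> Git package name. Only the two pairs named in this
# proposal are listed; a real table would cover every supported dataset.
_DEPRECATED_NAMES = {
    "HiTS2015": "ap_verify_hits2015",
    "CI-CosmosPDR2": "ap_verify_ci_cosmos_pdr2",
}


def resolve_dataset(name):
    """Return the Git package name for ``name`` (hypothetical helper).

    Old-style names are translated and trigger a deprecation warning;
    anything else is assumed to already be a package name.
    """
    if name in _DEPRECATED_NAMES:
        package = _DEPRECATED_NAMES[name]
        warnings.warn(
            f"Dataset name {name!r} is deprecated; use {package!r} instead.",
            FutureWarning,  # assumed category, chosen so end users see it
        )
        return package
    return name


def make_parser():
    """Build a parser whose --dataset accepts both naming schemes."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--dataset",
        type=resolve_dataset,
        required=True,
        help="Git package name of the dataset (old-style names are deprecated).",
    )
    return parser
```

Once old-style names are dropped, the mapping and the warning branch can be deleted and resolve_dataset reduces to returning its argument, which is the maintenance saving described above.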
Support wholeheartedly!