We may be able to optimize dataset lookups by remembering which dataset types are present in various collections.
This could happen down in daf_butler, for every collection, via some kind of materialized summary view of the dataset_collection_ tables, managed either by Python code or by database support for materialized joins. It could also happen by using CHAINED collections as the "user visible" collections created by the ingest and conversion scripts, because CHAINED collections already permit the dataset types looked up in child collections to be restricted. The former is more work but seems cleaner in the long term (relying only on the latter leaves some potential for surprising behavior), and we may want to do the latter as well anyway.
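The first approach (a summary maintained at the database level) can be sketched with a toy example: a summary table recording which dataset types are present in which collections, kept in sync by a trigger. The table and column names here (`dataset_collection`, `collection_summary`) are simplified stand-ins, not the actual registry schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
-- Hypothetical, simplified stand-in for the dataset_collection_* tables.
CREATE TABLE dataset_collection (
    dataset_id INTEGER,
    dataset_type TEXT,
    collection TEXT
);
-- Materialized summary: which dataset types appear in which collections.
CREATE TABLE collection_summary (
    collection TEXT,
    dataset_type TEXT,
    PRIMARY KEY (collection, dataset_type)
);
-- Keep the summary in sync on insert (the database-managed variant;
-- the Python-managed variant would do this in application code).
CREATE TRIGGER summarize AFTER INSERT ON dataset_collection
BEGIN
    INSERT OR IGNORE INTO collection_summary (collection, dataset_type)
    VALUES (NEW.collection, NEW.dataset_type);
END;
""")
cur.executemany(
    "INSERT INTO dataset_collection VALUES (?, ?, ?)",
    [
        (1, "raw", "ingest/run1"),
        (2, "raw", "ingest/run1"),
        (3, "calexp", "proc/run2"),
    ],
)
# A lookup for "calexp" can now skip collections that never contain it,
# without scanning the (much larger) dataset_collection table.
rows = cur.execute(
    "SELECT collection FROM collection_summary WHERE dataset_type = ?",
    ("calexp",),
).fetchall()
print(rows)  # [('proc/run2',)]
```

The summary stays small (one row per collection/dataset-type pair) no matter how many datasets a collection holds, which is what makes it useful as a pre-filter for lookups.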
This would probably require schema changes, though likely only additions, and it may be doable in a way that allows both the old and new schemas to be supported via configuration changes.
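Configuration-controlled support for both schemas might look something like the following. This is purely illustrative; the keys are assumptions for the sake of the sketch, not existing daf_butler configuration.

```yaml
# Hypothetical registry configuration fragment (keys are assumptions).
registry:
  schema_version: 2      # new schema, with the summary tables added
  # schema_version: 1    # old schema; lookups fall back to full scans
```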