just to add to what Andy said:
I think we can make some solid guesses on indexing but we won't know for sure until we see this functionality in action w/ data volumes and distributions that are representative of production.
If the above two generalized SQLs account for most of the queries hitting collection tables, the UNIQUE constraint should be enough for good performance. You just have to make sure the index has the columns in the right order to get efficient plans for both SQLs (collection_name, dataset_type_id, instrument, detector, visit). That assumes join order of second SQL starts w/ collection table which may not be the case. We'll have to see.
If you plan to return a significant number of rows from collection tables leading to lots of index -> table lookups to get additional dataset_id column, it might be worth adding dataset_id to keep from touching the table at all. I think that would mean an additional index since constraint couldn't be enforced with covering index which obviously slows DML so we'd definitely need to test. Regardless, it's premature to do that.
You definitely don't want a bunch of single column indexes unless you have a lot of SQL with single column predicates or possibly varying subset of predicates against collection_name, dataset_type_id, instrument, detector, visit.
I'd be very cautious of UNION if you are returning much data from UNION'd portion of SQL. That implies an additional sort or hashing to get distinct result set and since hashing is an option, you can't depend on any kind of implied order. If that isn't absolutely necessary, use UNION ALL.
If foreign keys are indexed, that absolutely makes joins more efficient if join order is parent > child. If not, those indexes still might be necessary to prevent locking issues (row-exclusive vs table-exclusive) on DML.