It's been a while, and I have some fresh perspective, especially now that I've engineered the task documentation.
It sounds like there are now possibly two different things that we're talking about. I think the original request was to document Butler datasets, so I'm going to stick with that scope. Documenting our databases and data products ("LSST Data Model") also needs to be done, but from what I can see that's a different thing and needs a different ticket.
For Butler datasets, I now believe that I can create canonical documentation topics in pipelines.lsst.io for each dataset. These topics will be linked to the tasks that generate and transform them. I think that from the ground up we can document how each task modifies a table schema or metadata, for example, and that information can flow into both the published documentation for a task and the canonical documentation for a Butler dataset.
What we mentioned last November still stands: we can't publish a table of dataset columns that's 100% relevant to any particular pipeline. But with the system I've started to build, we can certainly give users all the tools they need to identify what columns might be part of their datasets, and expose knowledge about the task that generated those columns and what those columns mean. Again, this strategy is particular to the pipelines.lsst.io documentation and Butler datasets.