  Data Management / DM-6655

Webpage of flags produced by various stack products



      Description

      SDSS has a handy webpage with descriptions of all of their bitmask flags:

      http://www.sdss.org/dr12/algorithms/bitmasks/#ListofBitmasks

      It would be exceptionally useful for LSST to produce a similar webpage. I could see it being auto-built from our current flag documentation, which would also help us identify places where our current docstrings are lacking (as many of them are).
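
      As a rough sketch of what that auto-build step could look like (assuming lsst.afw.table's schema-extraction API and some representative output catalog on disk; src.fits below is just a placeholder):

          import html

          import lsst.afw.table as afwTable

          # Read a representative output catalog and pull every Flag field, plus its
          # docstring, out of the schema.
          catalog = afwTable.SourceCatalog.readFits("src.fits")  # placeholder path
          flags = {
              name: item.field.getDoc() or "(no documentation)"
              for name, item in catalog.schema.extract("*").items()
              if item.field.getTypeString() == "Flag"
          }

          # Emit a simple HTML table that a documentation build could publish.
          with open("flags.html", "w") as out:
              out.write("<table>\n<tr><th>Flag</th><th>Description</th></tr>\n")
              for name in sorted(flags):
                  out.write(f"<tr><td>{html.escape(name)}</td>"
                            f"<td>{html.escape(flags[name])}</td></tr>\n")
              out.write("</table>\n")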


            Activity

            Jonathan Sick added a comment (edited)

            It's been a while and I have some fresh perspective, especially having engineered task documentation.

            It sounds like there are now possibly two different things that we're talking about. I think the original request was to document Butler datasets, and so I'm going to stick with that scope. Documenting our databases and data products ("LSST Data Model") also needs to be done, but that's a different thing and needs a different ticket from what I can see.

            For Butler datasets, I now believe that I can create canonical documentation topics in pipelines.lsst.io for each dataset. These topics will be linked to the tasks that generate and transform them. I think that from the ground-up we can document how each task modifies a table schema or modifies metadata, for example, and that information can flow into both the published documentation for a task, and also the canonical documentation for a Butler dataset.

            What we mentioned last November still stands: we can't publish a table of dataset columns that's 100% relevant to any particular pipeline. But with the system I've started to build, we can certainly give users all the tools they need to identify which columns might be part of their datasets, and expose knowledge about the task that generated those columns and what those columns mean. Again, this strategy is particular to the pipelines.lsst.io documentation and Butler datasets.

            John Parejko added a comment

            Since I originally filed this: I was specifically looking for a web page that documents the _flag fields that get set in our catalogs. We should be able to extract that from a "typical" run on some data (say, ci_hsc). More broadly, it would be very useful if that web page had descriptions of all of the fields our "typical" catalogs contain.

            It seems silly to me that a user has to go and read in a table and look at its schema to understand what sorts of fields the LSST software could produce.
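
            Something along these lines would do it, assuming a Gen3 butler repository left behind by a ci_hsc run (the repo path, collection name, and dataset type below are placeholders):

                from lsst.daf.butler import Butler

                # Placeholder repo path and collection: whatever a ci_hsc run leaves behind.
                butler = Butler("DATA", collections="HSC/runs/ci_hsc")
                ref = next(iter(butler.registry.queryDatasets("src")))  # any one source catalog
                src = butler.get(ref)

                # Print every *_flag column together with its schema docstring.
                for name, item in sorted(src.schema.extract("*_flag").items()):
                    print(f"{name}: {item.field.getDoc() or '(no documentation)'}")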

            Jonathan Sick added a comment

            Got it. I think I have that covered.

            Gregory Dubois-Felsmann added a comment

            It seems to me that we definitely need both Task-level documentation and LSST-data-model-level documentation on this. There's no guarantee that any packed flag words in the released data model will be the outputs of single algorithmic Tasks - we may well combine flags from multiple Tasks. (Of course, as a matter of good implementation practice the combiner itself might be a Task, but its selections of what to combine would be more likely to be configuration.)

            Ideally we could point back from the data model documentation, for most (perhaps all, depending on whether any combining is done) flags, to the Task documentation that defines its algorithmic meaning.
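
            To make the combiner idea concrete (the flag-to-bit mapping below is a purely hypothetical configuration, not anything that exists in the stack):

                # Hypothetical configuration: which upstream boolean flag columns feed which
                # bit of a packed flag word in the released data model. Keeping this mapping
                # explicit is what lets the data-model docs point back to the defining Task.
                FLAG_BITS = {
                    "base_PixelFlags_flag_saturated": 0,     # set by single-frame measurement
                    "base_PixelFlags_flag_interpolated": 1,  # set by single-frame measurement
                    "deblend_skipped": 2,                    # set by the deblender
                }


                def pack_flags(record):
                    """Combine the configured boolean columns of one record into a flag word."""
                    word = 0
                    for name, bit in FLAG_BITS.items():
                        if record[name]:
                            word |= 1 << bit
                    return word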

            Gregory Dubois-Felsmann added a comment

            I'm taking another look at this; where do we stand on the production of flag values in the output of the SDM standardization? Has documentation of those flags been thought about as part of that work? Colin Slater, maybe?


              People

              Assignee:
              Jonathan Sick
              Reporter:
              John Parejko
              Watchers:
              Colin Slater, Eric Bellm, Gregory Dubois-Felsmann, Hsin-Fang Chiang, Jim Bosch, John Parejko, John Swinbank, Jonathan Sick, Krzysztof Suberlak, Leanne Guy, Simon Krughoff, Zeljko Ivezic
              Votes:
              1

