  1. Data Management
  2. DM-14877

Develop content for a Data Management glossary

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Team:
      DM Science

      Description

      One single glossary of terms for data management is needed, especially for commissioning teams and science collaborations using the data management software, services, and data products. Several DM glossaries exist already:

      https://confluence.lsstcorp.org/display/LSWUG/DMS+Glossary

      https://confluence.lsstcorp.org/display/LSWUG/Astro+Glossary

      https://dev.lsstcorp.org/trac/wiki/glossary

      https://dev.lsstcorp.org/trac/attachment/wiki/Glossary/EAGlossary.pdf

      Project-level acronyms: http://ls.st/Document-11921*

      Project-level glossary: http://ls.st/Document-14412*   

      The goal is to consolidate these and provide one single glossary.

        Attachments

          Issue Links

            Activity

            ctslater Colin Slater added a comment -

            I posted a new version with some revised definitions, deleted several of the undefined items that refer to deep inner workings of the code, and added more references.

            mgraham Melissa Graham added a comment - - edited

             

            A new version of the Glossary's data file has been uploaded, MLG_Definitions_20180905.csv, which builds on CTS_definitions_20180723.csv. All of the above comments and suggestions have been incorporated. A small number of outstanding issues remain before we can push this along.

             

             

            Outstanding Issues.

            Can we get document references for:

            icExp - An image after ISR, background subtraction, and PSF determination, produced by CharacterizeImageTask.

            supertask - A means of packaging an algorithm or series of algorithms and describing how it processes data so that it can be executed in multiple computing environments ranging from notebook to desktop to distributed clusters.

            task - Tasks are the basic unit of code re-use in the LSST Stack. They are derived from the pipeline task base class, and they perform a well-defined, logically contained bit of functionality. Tasks are often used as building blocks for Command-line Tasks, which are effectively data processing pipelines. For further details, see How to Write a Task in the source code documentation.

            This one seems too vague; should we remove it, or can someone improve it?

            template table - A database table that contains no data, and is used to create short-lived tables using syntax "CREATE TABLE x LIKE y".
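To make the template table entry concrete, here is a minimal sketch of the pattern it describes. The quoted "CREATE TABLE x LIKE y" form is accepted by MySQL-family engines; SQLite does not support it, so this self-contained example substitutes "CREATE TABLE ... AS SELECT ... WHERE 0", which copies column names but not constraints or indexes. The table and function names are hypothetical and are not taken from any DM schema.

```python
import sqlite3


def make_short_lived_table(conn, template, name):
    """Create an empty TEMP table whose columns mirror `template`."""
    # Identifiers cannot be bound as SQL parameters, so they are quoted inline.
    # WHERE 0 selects no rows, so only the column layout is copied.
    conn.execute(
        f'CREATE TEMP TABLE "{name}" AS SELECT * FROM "{template}" WHERE 0'
    )


def column_names(conn, table):
    """Return the column names of `table` in declaration order."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


conn = sqlite3.connect(":memory:")
# Hypothetical template table: it defines columns but never holds rows.
conn.execute(
    "CREATE TABLE object_template (object_id INTEGER, ra REAL, decl REAL)"
)
make_short_lived_table(conn, "object_template", "object_chunk_42")
conn.execute("INSERT INTO object_chunk_42 VALUES (1, 10.5, -30.25)")
```

The template itself stays empty throughout; only the short-lived copy receives rows, and the TEMP table disappears when the connection closes.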

             

            Items Added

            collection - A data collection in the second-generation (Gen2) Butler (referred to as a repository in earlier generations) consists of hierarchically organized data files, an inventory or registry of the contents (i.e., metadata from the data files) stored in an sqlite3 file, and a Mapper file that specifies to the LSST Stack software the camera model to apply when accessing the data in the repository.

            repository - A data repository consists of hierarchically organized data files, an inventory or registry of the contents (i.e., metadata from the data files) stored in an sqlite3 file, and a Mapper file that specifies to the LSST Stack software the camera model to apply when accessing the data in the repository. With the second-generation (Gen2) Butler, the term repository will be replaced by collection.

            Data Release - The approximately annual reprocessing of all LSST data, and the installation of the resulting data products in the LSST Data Access Centers, which marks the start of the two-year proprietary period.

            mini-broker - A tool within the LSST Science Platform that provides a limited amount of alert filtering capabilities.

            sqlite3 - A software package external to DM, sqlite3 provides a SQL interface compliant with the DB-API 2.0 specification for SQLite, a self-contained public-domain SQL database engine.

            Science Platform - A set of integrated web applications and services deployed at the LSST Data Access Centers (DACs) through which the scientific community will access, visualize, and perform next-to-the-data analysis of the LSST data products.

            Special Program - Any LSST mini-survey or deep drilling field that is observed independently of the Wide-Fast-Deep (WFD) main survey.

            Wide-Fast-Deep - The main survey of the LSST to cover at least 18000 square degrees of the southern sky.
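The repository and sqlite3 entries above describe a registry of file metadata stored in an sqlite3 file and accessed through a DB-API 2.0 interface. A minimal sketch of that idea, using the sqlite3 module from the Python standard library, might look like the following. The table layout, column names, and file paths are assumptions made for illustration; the actual registry schema is not given in this ticket.

```python
import sqlite3

# Hypothetical registry: one row of metadata per data file.
# An in-memory database stands in for the on-disk sqlite3 file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw (visit INTEGER, ccd INTEGER, band TEXT, path TEXT)"
)
rows = [
    (100, 1, "r", "raw/v100/c01.fits"),
    (100, 2, "r", "raw/v100/c02.fits"),
    (101, 1, "i", "raw/v101/c01.fits"),
]
# DB-API 2.0 parameter binding with qmark placeholders.
conn.executemany("INSERT INTO raw VALUES (?, ?, ?, ?)", rows)
conn.commit()


def find_paths(conn, visit):
    """Return the file paths recorded for one visit, ordered by CCD."""
    cur = conn.execute(
        "SELECT path FROM raw WHERE visit = ? ORDER BY ccd", (visit,)
    )
    return [path for (path,) in cur.fetchall()]
```

Calling find_paths(conn, 100) looks up files by their metadata rather than by walking the directory hierarchy, which is the role the definitions give to the registry.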

            Items Removed

            Data Repository - A structured directory hierarchy of data products, an inventory (or registry) of their metadata, and a Mapper file indicating the camera from which the data were obtained.
            --> this has been replaced by 'repository' and 'collection'

             

            tjenness Tim Jenness added a comment -

            supertask is now pipelinetask.

            mgraham Melissa Graham added a comment -

            Slightly updated version of MLG_Definitions_20180905.csv uploaded, incorporating the following immediate feedback from Slack's #dm-sst channel.

            Updated Definitions

            supertask - Deprecated term; see PipelineTask.

            PipelineTask - A means of packaging an algorithm or series of algorithms and describing how it processes data so that it can be executed in multiple computing environments ranging from notebook to desktop to distributed clusters.

            task - Tasks are the basic unit of code re-use in the LSST Stack. They are derived from the pipeline task base class, and they perform a well-defined, logically contained bit of functionality. Tasks come standard with logging, processing metadata, and debugging features. Tasks are often used as building blocks for Command-line Tasks, which are effectively data processing pipelines. For further details, see How to Write a Task in the source code documentation.

            template table - removed

            icExp - removed
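The task definition above can be illustrated with a small, generic sketch: a base class that supplies a logger and a metadata record, subclasses that each implement one contained step, and a helper that chains them into a pipeline. This is not the LSST Stack API. Every class and function name here (SketchTask, SubtractMedianTask, ClipNegativeTask, run_pipeline) is hypothetical and only mirrors the shape the definition describes.

```python
import logging
import time


class SketchTask:
    """Hypothetical stand-in for a task base class; not the LSST Stack API."""

    def __init__(self, name=None):
        self.name = name or type(self).__name__
        self.log = logging.getLogger(self.name)  # per-task logger
        self.metadata = {}                       # per-task processing metadata

    def run(self, data):
        raise NotImplementedError

    def __call__(self, data):
        # Wrap run() so every task records the same bookkeeping.
        start = time.perf_counter()
        self.log.info("starting")
        result = self.run(data)
        self.metadata["elapsed_s"] = time.perf_counter() - start
        self.metadata["n_inputs"] = len(data)
        return result


class SubtractMedianTask(SketchTask):
    def run(self, data):
        ordered = sorted(data)
        # Middle element of the sorted values (upper median for even lengths).
        median = ordered[len(ordered) // 2]
        return [x - median for x in data]


class ClipNegativeTask(SketchTask):
    def run(self, data):
        # Replace negative values with zero.
        return [max(x, 0) for x in data]


def run_pipeline(tasks, data):
    """Feed the output of each task into the next one."""
    for task in tasks:
        data = task(data)
    return data
```

For example, run_pipeline([SubtractMedianTask(), ClipNegativeTask()], [3, 1, 2]) subtracts the median and then clips negatives, and each task's metadata dictionary afterwards records how many inputs it saw.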

            mgraham Melissa Graham added a comment - - edited

            RFC-520 has been completed, with all feedback integrated.

            Final versions of the DM Glossary contents have been uploaded to this ticket (dated 20180925, provided in both CSV and Mac Numbers formats). 

            This ticket is now complete. The DM Glossary terms are ready for Tim Jenness to take away and implement in MagicDraw. Note that the files also contain many non-DM terms, and that these terms have been tagged with the most applicable LSST subsystem, but otherwise not reviewed or updated.

             


              People

              Assignee:
              mgraham Melissa Graham
              Reporter:
              lguy Leanne Guy
              Watchers:
              Austin Roberts, Colin Slater, Gregory Dubois-Felsmann, Jim Bosch, John Swinbank, Kian-Tat Lim, Leanne Guy, Melissa Graham, Tim Jenness
