Data Management / DM-10740

Load WISE n-band catalogs into PDAC

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Story Points:
      12
    • Sprint:
      DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8, DB_S17_9, DB_S17_10
    • Team:
      Data Access and Database

      Description

      This ticket documents data preparation and loading for the following WISE catalogs:

      || Catalog name || PDAC database || PDAC table || IRSA download page ||
      | WISE All-Sky Single Exposure (L1b) Source Table | wise_4band_00 | allsky_4band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-4band-psd/ |
      | WISE 3-Band Cryo Single Exposure (L1b) Source Table | wise_3band_00 | allsky_3band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-3band-psd/ |
      | WISE Post-Cryo Single Exposure (L1b) Source Table | wise_2band_00 | allsky_2band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-2band-psd/ |

            Activity

            fritzm Fritz Mueller created issue -
            fritzm Fritz Mueller made changes -
            Field Original Value New Value
            Epic Link DM-10682 [ 32634 ]
            fritzm Fritz Mueller made changes -
            Rank Ranked higher
            fritzm Fritz Mueller made changes -
            Status To Do [ 10001 ] In Progress [ 3 ]
            fritzm Fritz Mueller made changes -
            Story Points 12
            gpdf Gregory Dubois-Felsmann made changes -
            Remote Link This issue links to "Page (Confluence)" [ 15209 ]
            fritzm Fritz Mueller made changes -
            Sprint DB_S17_5 [ 615 ] DB_S17_5, DB_S17_6 [ 615, 619 ]
            fritzm Fritz Mueller made changes -
            Rank Ranked higher
            gapon Igor Gaponenko made changes -
            Description h1. Scope

            This ticket documents data preparation and loading for the following WISE catalogs:
            || Catalog name || PDAC database || PDAC table ||
            | WISE All-Sky Single Exposure (L1b) Source Table | wise_4band_00 | allsky_4band_p1bs_psd |
            | WISE 3-Band Cryo Single Exposure (L1b) Source Table | wise_3band_00 | allsky_3band_p1bs_psd |
            | WISE Post-Cryo Single Exposure (L1b) Source Table | wise_2band_00 | allsky_2band_p1bs_psd |
            gapon Igor Gaponenko made changes -
            Link This issue relates to DM-9372 [ DM-9372 ]
            gapon Igor Gaponenko made changes -
            Link This issue relates to DM-9736 [ DM-9736 ]
            fritzm Fritz Mueller made changes -
            Comment [ h1. Downloading *WISE All-Sky Single Exposure (L1b) Source Table* data

            \\
            | *Table name* | allsky_4band_p1bs_psd |
            | *Temporary data folder (NCSA)* | /datasets/gapon/wise/allsky_4band_p1bs_psd/ |
            | *Catalog data (IPAC)* | [https://irsa.ipac.caltech.edu/data/download/wise-4band-psd/] |
            | *Schema file (IPAC)* | [https://irsa.ipac.caltech.edu/data/download/wise-4band-psd/wise_allsky_4band_p1bs_psd-schema.txt] |

            h2. Catalog schema

            The original schema file was downloaded as:
            {code}
            /datasets/gapon/wise/allsky_4band_p1bs_psd/allsky_4band_p1bs_psd.txt
            {code}
            It was then translated into a MySQL schema definition (SQL DDL) file with this tool (available in the GitHub package [txt2schema.py|https://github.com/lsst-dm/db_pdac_wise/tree/master/tools]):
            {code:bash}
            % python \
              /datasets/gapon/development/db_pdac_wise/tools/txt2schema.py \
                allsky_4band_p1bs_psd.txt \
                > allsky_4band_p1bs_psd.schema
            {code}
            *ATTENTION*: the *PRIMARY KEY* definition had to be added manually to the generated schema file:
            {code:sql}
            CREATE TABLE `allsky_4band_p1bs_psd` (

                `source_id` CHAR(16) DEFAULT NULL,
                ....
                PRIMARY KEY(`source_id`)

            ) ENGINE=MyISAM;
            {code}
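The manual edit described above could also be scripted. The following is only a minimal sketch: `add_primary_key` is a hypothetical helper, not part of `txt2schema.py`, and it assumes the generated file holds a single `CREATE TABLE` statement whose closing line starts with `) ENGINE=`.

```python
import re


def add_primary_key(ddl: str, column: str = "source_id") -> str:
    """Insert a PRIMARY KEY clause before the closing ') ENGINE=' line.

    Hypothetical helper; assumes one CREATE TABLE statement per file.
    """
    clause = f"PRIMARY KEY(`{column}`)"
    if clause in ddl:
        return ddl  # already present, nothing to do
    # Locate the newline that precedes the closing ") ENGINE=..." line.
    match = re.search(r"\n\)\s*ENGINE\s*=", ddl)
    if match is None:
        raise ValueError("no closing ') ENGINE=' line found")
    head = ddl[:match.start()].rstrip()
    # The last column definition needs a trailing comma before the key clause.
    if not head.endswith(","):
        head += ","
    return f"{head}\n    {clause}\n{ddl[match.start() + 1:]}"
```

Calling the function a second time on its own output returns the text unchanged, so it is safe to re-run on a schema file that was already patched by hand.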

            h2. Catalog data

            The download script was started from *lsst-xfer*. The files are placed in the temporary data folder at:
            {code}
            /datasets/gapon/wise/allsky_4band_p1bs_psd/downloaded/
            {code}
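Once a download finishes, the number of files and the total volume in this folder can be cross-checked against what IRSA publishes. A minimal sketch follows; `summarize_download` is a hypothetical helper, and the `*.bz2` pattern is an assumption based on the compression format named in this comment.

```python
from pathlib import Path


def summarize_download(folder, pattern="*.bz2"):
    """Return (file count, total size in bytes) of matching files under folder."""
    files = [p for p in Path(folder).rglob(pattern) if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)
```

For example, `summarize_download("/datasets/gapon/wise/allsky_4band_p1bs_psd/downloaded/")` would return the pair to compare with the expected file count and size.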

            The operation succeeded with:
            * *2304* compressed files (*bz2*)
            * *2.5 TB* of data in total ]
            fritzm Fritz Mueller made changes -
            Comment [ h1. Processing downloaded files of catalog *allsky_4band_p1bs_psd*

            h2. Uncompressing the files

            This stage took nearly 2 days of wall-clock time on the LSST Evaluation Cluster at NCSA. The resulting files were written to the same folder that held the compressed ones, and the compressed originals were replaced by their uncompressed versions. The folder now holds *10 TB* of data:
            {code}
            /datasets/gapon/wise/allsky_4band_p1bs_psd/downloaded/data/
            {code}
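The replace-in-place step can be sketched as follows. This is not the script that was run on the cluster; `decompress_in_place` is a hypothetical helper that streams one `.bz2` file to a temporary name, renames it once complete, and only then deletes the compressed original.

```python
import bz2
import os
import shutil


def decompress_in_place(path, chunk_size=16 * 1024 * 1024):
    """Decompress a .bz2 file next to itself and remove the original."""
    if not path.endswith(".bz2"):
        raise ValueError(f"not a bz2 file: {path}")
    target = path[:-len(".bz2")]
    partial = target + ".partial"
    # Stream in chunks so that multi-gigabyte files never sit in memory.
    with bz2.open(path, "rb") as src, open(partial, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
    os.replace(partial, target)  # publish the complete file in one step
    os.remove(path)              # drop the compressed original
    return target
```

Because every file is independent, a function like this could be mapped over the file list by many workers at once, which is one way to shorten the wall-clock time of this stage.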

            h2. Translating files from the *unl* format into the *TSV* format

            This stage is needed because *TSV* is the only input data format supported by the LSST DB catalog partitioning tools. The output is placed at:
            {code}
            /datasets/gapon/wise/allsky_4band_p1bs_psd/tsv
            {code}
            The translation tool *unl2tsv* is in GitHub at:
            * [https://github.com/lsst-dm/db_pdac_wise/tree/master/tools]

            This operation finished with *11 TB* of data within the output folder.
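For illustration only, the kind of translation this stage performs can be sketched as below. This is not the *unl2tsv* implementation. It assumes that each *unl* row is a single line of `|`-separated fields ending with a terminating `|`, that field values contain no tabs or escaped separators, and that empty fields should be written as `\N`; all three are assumptions, not facts taken from the tool.

```python
def unl_line_to_tsv(line, null="\\N"):
    """Convert one '|'-delimited row into a tab-separated row."""
    body = line.rstrip("\n")
    if body.endswith("|"):
        body = body[:-1]  # drop the row terminator (assumed format)
    fields = body.split("|")
    # Empty fields become the NULL marker (assumed convention).
    return "\t".join(f if f != "" else null for f in fields) + "\n"


def unl_to_tsv(src_path, dst_path):
    """Translate a whole file line by line without loading it into memory."""
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(unl_line_to_tsv(line))
```

Replacing an empty field with a two-character marker is one reason the output of such a translation can be slightly larger than its input.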
            ]
            fritzm Fritz Mueller made changes -
            Comment [ h1. Partitioning catalog *allsky_4band_p1bs_psd*

            {color:red}TBC{color} ]
            fritzm Fritz Mueller made changes -
            Comment [ h1. Loading catalog *allsky_4band_p1bs_psd* into *PDAC*

            {color:red}TBC{color} ]
            fritzm Fritz Mueller made changes -
            Comment [ h1. Downloading *WISE 3-Band Cryo Single Exposure (L1b) Source Table* data

            \\
            | *Table name* | allsky_3band_p1bs_psd |
            | *Temporary data folder (NCSA)* | /datasets/gapon/wise/allsky_3band_p1bs_psd/ |
            | *Catalog data (IPAC)* | [https://irsa.ipac.caltech.edu/data/download/wise-3band-psd/] |
            | *Schema file (IPAC)* | [https://irsa.ipac.caltech.edu/data/download/wise-3band-psd/wise_allsky_3band_p1bs_psd-schema.txt] |

            h2. Catalog schema

            The original schema file was downloaded as:
            {code}
            /datasets/gapon/wise/allsky_3band_p1bs_psd/allsky_3band_p1bs_psd.txt
            {code}
            It was then translated into a MySQL schema definition (SQL DDL) file with this tool (available in the GitHub package [txt2schema.py|https://github.com/lsst-dm/db_pdac_wise/tree/master/tools]):
            {code:bash}
            % python \
              /datasets/gapon/development/db_pdac_wise/tools/txt2schema.py \
                allsky_3band_p1bs_psd.txt \
                > allsky_3band_p1bs_psd.schema
            {code}
            *ATTENTION*: the *PRIMARY KEY* definition had to be added manually to the generated schema file:
            {code:sql}
            CREATE TABLE `allsky_3band_p1bs_psd` (

                `source_id` CHAR(16) DEFAULT NULL,
                ....
                PRIMARY KEY(`source_id`)

            ) ENGINE=MyISAM;
            {code}

            h2. Catalog data

            The download script was started from *lsst-xfer*. The files are placed in the temporary data folder at:
            {code}
            /datasets/gapon/wise/allsky_3band_p1bs_psd/downloaded/
            {code}

            The operation succeeded with:
            * *2304* compressed files (*bz2*)
            * *1 TB* of data in total ]
            fritzm Fritz Mueller made changes -
            Comment [ h1. Processing downloaded files of catalog *allsky_3band_p1bs_psd*

            h2. Uncompressing the files

            The resulting files were written to the same folder that held the compressed ones, and the compressed originals were replaced by their uncompressed versions. The folder now holds *3.3 TB* of data:
            {code}
            /datasets/gapon/wise/allsky_3band_p1bs_psd/downloaded/data/
            {code}

            h2. Translating files from the *unl* format into the *TSV* format

            This stage is needed because *TSV* is the only input data format supported by the LSST DB catalog partitioning tools. The output is placed at:
            {code}
            /datasets/gapon/wise/allsky_3band_p1bs_psd/tsv
            {code}
            The translation tool *unl2tsv* is in GitHub at:
            * [https://github.com/lsst-dm/db_pdac_wise/tree/master/tools]

            This operation finished with *3.4 TB* of data within the output folder.
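The sizes reported in this ticket give a rough idea of how much disk space each stage needs for a catalog of a given compressed size. The sketch below only restates that arithmetic. The numbers are the rounded figures quoted in the comments, so the ratios are approximate, and the dictionary keys and the name `growth_factors` are illustrative.

```python
# Sizes in TB as reported in this ticket (rounded values).
SIZES_TB = {
    "allsky_4band_p1bs_psd": {"bz2": 2.5, "unl": 10.0, "tsv": 11.0},
    "allsky_3band_p1bs_psd": {"bz2": 1.0, "unl": 3.3, "tsv": 3.4},
}


def growth_factors(sizes):
    """Return the size ratio introduced by each processing stage."""
    return {
        "uncompress": sizes["unl"] / sizes["bz2"],
        "translate": sizes["tsv"] / sizes["unl"],
    }


for table, sizes in SIZES_TB.items():
    f = growth_factors(sizes)
    print(f"{table}: x{f['uncompress']:.2f} after uncompressing, "
          f"x{f['translate']:.2f} after TSV translation")
```

By these figures, uncompressing multiplies the volume by roughly 3 to 4, while the TSV translation adds at most about 10 percent on top of that.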
            ]
            fritzm Fritz Mueller made changes -
            Comment [ h1. Partitioning catalog *allsky_3band_p1bs_psd*

            {color:red}TBC{color} ]
            fritzm Fritz Mueller made changes -
            Comment [ h1. Loading catalog *allsky_3band_p1bs_psd* into *PDAC*

            {color:red}TBC{color} ]
            gapon Igor Gaponenko made changes -
            Description This ticket documents data preparation and loading for the following WISE catalogs:
            || Catalog name || PDAC database || PDAC table ||
            | WISE All-Sky Single Exposure (L1b) Source Table | wise_4band_00 | allsky_4band_p1bs_psd |
            | WISE 3-Band Cryo Single Exposure (L1b) Source Table | wise_3band_00 | allsky_3band_p1bs_psd |
            | WISE Post-Cryo Single Exposure (L1b) Source Table | wise_2band_00 | allsky_2band_p1bs_psd |
            This ticket documents data preparation and loading for the following WISE catalogs:
            || Catalog name || PDAC database || PDAC table || IRSA download page ||
            | WISE All-Sky Single Exposure (L1b) Source Table | wise_4band_00 | allsky_4band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-4band-psd/ |
            | WISE 3-Band Cryo Single Exposure (L1b) Source Table | wise_3band_00 | allsky_3band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-3band-psd/ |
            | WISE Post-Cryo Single Exposure (L1b) Source Table | wise_2band_00 | allsky_2band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-2band-psd/ |
            fritzm Fritz Mueller made changes -
            Remote Link This issue links to "Page (Confluence)" [ 15234 ]
            gapon Igor Gaponenko made changes -
            Link This issue is triggering DM-11027 [ DM-11027 ]
            gapon Igor Gaponenko made changes -
            Link This issue is triggering DM-10988 [ DM-10988 ]
            fritzm Fritz Mueller made changes -
            Sprint DB_S17_5, DB_S17_6 [ 615, 619 ] DB_S17_5, DB_S17_6, DB_S17_7 [ 615, 619, 627 ]
            fritzm Fritz Mueller made changes -
            Rank Ranked higher
            gapon Igor Gaponenko made changes -
            Link This issue is triggering IHS-378 [ IHS-378 ]
            fritzm Fritz Mueller made changes -
            Rank Ranked higher
            fritzm Fritz Mueller made changes -
            Sprint DB_S17_5, DB_S17_6, DB_S17_7 [ 615, 619, 627 ] DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8 [ 615, 619, 627, 634 ]
            fritzm Fritz Mueller made changes -
            Sprint DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8 [ 615, 619, 627, 634 ] DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8, DB_S17_9 [ 615, 619, 627, 634, 640 ]
            fritzm Fritz Mueller made changes -
            Rank Ranked higher
            fritzm Fritz Mueller made changes -
            Sprint DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8, DB_S17_9 [ 615, 619, 627, 634, 640 ] DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8, DB_S17_9, DB_S17_10 [ 615, 619, 627, 634, 640, 650 ]
            fritzm Fritz Mueller made changes -
            Rank Ranked higher
            gapon Igor Gaponenko made changes -
            Comment [ h1. Loading catalog allsky_3band_p1bs_psd into PDAC

            The loading protocol is explained in detail at:
            * [https://confluence.lsstcorp.org/display/DM/Loading+WISE+catalogs+into+PDAC]

            The rest of this section presents a short summary of actions and tests taken during this loading.

            Dataset configuration
            Database name: *wise_3band_00*
            Do this on lsst-dev:

            {code:bash}
            % cd /datasets/gapon/development/db_pdac_wise/scripts
            % ln -s dataset.bash.wise_3band_00 dataset.bash
            {code}
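Note that a plain `ln -s` fails if `dataset.bash` already exists, for instance when switching from another catalog. A minimal sketch of a re-runnable alternative is shown below; `select_dataset` is a hypothetical helper, not part of the `db_pdac_wise` scripts, and it assumes the `dataset.bash.<database>` naming used in the command above.

```python
import os


def select_dataset(scripts_dir, database, link_name="dataset.bash"):
    """Point scripts_dir/link_name at dataset.bash.<database>."""
    target = f"{link_name}.{database}"
    if not os.path.exists(os.path.join(scripts_dir, target)):
        raise FileNotFoundError(target)
    link = os.path.join(scripts_dir, link_name)
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)       # clean up a leftover from an interrupted run
    os.symlink(target, tmp)  # relative link, like `ln -s` run inside scripts_dir
    os.replace(tmp, link)    # swap the new link in, replacing any old one
    return link
```

For example, `select_dataset("/datasets/gapon/development/db_pdac_wise/scripts", "wise_3band_00")` would leave `dataset.bash` pointing at `dataset.bash.wise_3band_00`.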
            ]
            gapon Igor Gaponenko made changes -
            Comment [ h1. Loading catalog allsky_4band_p1bs_psd into PDAC

            The loading protocol is explained in detail at:
            * [https://confluence.lsstcorp.org/display/DM/Loading+WISE+catalogs+into+PDAC]

            The rest of this section presents a short summary of actions and tests taken during this loading.

            {color:red}TBC...{color}

            ]
            gapon Igor Gaponenko made changes -
            Link This issue relates to DM-12523 [ DM-12523 ]
            gapon Igor Gaponenko made changes -
            Resolution Done [ 10000 ]
            Status In Progress [ 3 ] Done [ 10002 ]
            gapon Igor Gaponenko made changes -
            Link This issue relates to DM-12910 [ DM-12910 ]

              People

              Assignee:
              gapon Igor Gaponenko
              Reporter:
              fritzm Fritz Mueller
              Watchers:
              Fritz Mueller, Igor Gaponenko