Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: None
- Labels:
- Story Points: 12
- Epic Link:
- Sprint: DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8, DB_S17_9, DB_S17_10
- Team: Data Access and Database
Description
This ticket documents data preparation and loading for the following WISE catalogs:
Catalog name | PDAC database | PDAC table | IRSA download page |
---|---|---|---|
WISE All-Sky Single Exposure (L1b) Source Table | wise_4band_00 | allsky_4band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-4band-psd/ |
WISE 3-Band Cryo Single Exposure (L1b) Source Table | wise_3band_00 | allsky_3band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-3band-psd/ |
WISE Post-Cryo Single Exposure (L1b) Source Table | wise_2band_00 | allsky_2band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-2band-psd/ |
Attachments
Issue Links
- is triggering
  - DM-10988 Evaluating benefits of SSD and NVMe storage technologies for Qserv "secondary index" (Done)
  - DM-11027 Investigate options for speeding up data ingestion into Qserv "secondary index" (Done)
- relates to
  - DM-9736 Quotes around column names in SQL statements cause a parser error. (Won't Fix)
  - DM-9372 Load WISE catalog data in PDAC (Done)
  - DM-12523 Load NEOWISE-R Year 1 Single Exposure (L1b) Source Table into PDAC (Done)
  - DM-12910 Summary of WISE catalogs loaded into PDAC in 2017 (Done)
Activity
Field | Original Value | New Value |
---|---|---|
Epic Link |
|
Rank | Ranked higher |
Status | To Do [ 10001 ] | In Progress [ 3 ] |
Story Points | 12 |
Remote Link | This issue links to "Page (Confluence)" [ 15209 ] |
Sprint | DB_S17_5 [ 615 ] | DB_S17_5, DB_S17_6 [ 615, 619 ] |
Rank | Ranked higher |
Description |
h1. Scope
This ticket documents data preparation and loading for the following WISE catalogs:
|| Catalog name || PDAC database || PDAC table ||
| WISE All-Sky Single Exposure (L1b) Source Table | wise_4band_00 | allsky_4band_p1bs_psd |
| WISE 3-Band Cryo Single Exposure (L1b) Source Table | wise_3band_00 | allsky_3band_p1bs_psd |
| WISE Post-Cryo Single Exposure (L1b) Source Table | wise_2band_00 | allsky_2band_p1bs_psd | |
Description |
h1. Scope
This ticket documents data preparation and loading for the following WISE catalogs:
|| Catalog name || PDAC database || PDAC table ||
| WISE All-Sky Single Exposure (L1b) Source Table | wise_4band_00 | allsky_4band_p1bs_psd |
| WISE 3-Band Cryo Single Exposure (L1b) Source Table | wise_3band_00 | allsky_3band_p1bs_psd |
| WISE Post-Cryo Single Exposure (L1b) Source Table | wise_2band_00 | allsky_2band_p1bs_psd | |
This ticket documents data preparation and loading for the following WISE catalogs:
|| Catalog name || PDAC database || PDAC table ||
| WISE All-Sky Single Exposure (L1b) Source Table | wise_4band_00 | allsky_4band_p1bs_psd |
| WISE 3-Band Cryo Single Exposure (L1b) Source Table | wise_3band_00 | allsky_3band_p1bs_psd |
| WISE Post-Cryo Single Exposure (L1b) Source Table | wise_2band_00 | allsky_2band_p1bs_psd | |
Comment |
[ h1. Downloading *WISE All-Sky Single Exposure (L1b) Source Table* data
\\
| *Table name* | allsky_4band_p1bs_psd |
| *Temporary data folder (NCSA)* | /datasets/gapon/wise/allsky_4band_p1bs_psd/ |
| *Catalog data (IPAC)* | [https://irsa.ipac.caltech.edu/data/download/wise-4band-psd/] |
| *Schema file (IPAC)* | [https://irsa.ipac.caltech.edu/data/download/wise-4band-psd/wise_allsky_4band_p1bs_psd-schema.txt] |

h2. Catalog schema

The original schema file was downloaded as:
{code}
/datasets/gapon/wise/allsky_4band_p1bs_psd/allsky_4band_p1bs_psd.txt
{code}
It was then translated into a MySQL schema definition (SQL DDL) file using this tool (available in the GitHub package [txt2schema.py|https://github.com/lsst-dm/db_pdac_wise/tree/master/tools]):
{code:bash}
% python \
    /datasets/gapon/development/db_pdac_wise/tools/txt2schema.py \
    allsky_4band_p1bs_psd.txt \
    > allsky_4band_p1bs_psd.schema
{code}
*ATTENTION*: the *PRIMARY KEY* definition had to be added manually to the generated schema file:
{code:sql}
CREATE TABLE `allsky_4band_p1bs_psd` (
  `source_id` CHAR(16) DEFAULT NULL,
  ....
  PRIMARY KEY(`source_id`)
) ENGINE=MyISAM;
{code}

h2. Catalog data

The download script was started from *lsst-xfer*. The files were placed in the temporary data folder at:
{code}
/datasets/gapon/wise/allsky_4band_p1bs_psd/downloaded/
{code}
The operation succeeded with:
* *2304* compressed files (*bz2*)
* *2.5 TB* of data in total
] |
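The manual *PRIMARY KEY* addition described above could also be scripted. The following is a minimal sketch, not part of the db_pdac_wise tools: the function name {{add_primary_key}} is hypothetical, and it assumes the generated schema file holds a single CREATE TABLE statement whose column list is closed by the last ")" in the file.

```python
import re


def add_primary_key(ddl: str, column: str = "source_id") -> str:
    """Return the DDL with PRIMARY KEY(`column`) appended to the column list.

    Assumption: `ddl` contains one CREATE TABLE statement and the last ')'
    in the text closes its column list (as in the example above).
    """
    # Leave the DDL untouched if a primary key is already declared.
    if re.search(r"PRIMARY\s+KEY", ddl, flags=re.IGNORECASE):
        return ddl
    close = ddl.rfind(")")
    if close == -1:
        raise ValueError("no closing parenthesis found in DDL")
    body = ddl[:close].rstrip()
    # The previous column definition needs a trailing comma.
    if not body.endswith(","):
        body += ","
    return f"{body}\n  PRIMARY KEY(`{column}`)\n{ddl[close:]}"
```

Applied to the text of a generated schema file, the function returns the patched DDL; running it a second time leaves the text unchanged.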
Comment |
[ h1. Processing downloaded files of catalog *allsky_4band_p1bs_psd*
h2. Uncompressing the files

This stage took nearly 2 days of wall-clock time on the LSST Evaluation Cluster at NCSA. The resulting files were put into the same folder as the compressed ones, replacing the originals. The total amount of data in the folder is *10 TB*:
{code}
/datasets/gapon/wise/allsky_4band_p1bs_psd/downloaded/data/
{code}

h2. Translating files from the *unl* format into the *TSV* format

This stage is needed because *TSV* is the only input data format supported by the LSST DB catalog partitioning tools. The output is placed at:
{code}
/datasets/gapon/wise/allsky_4band_p1bs_psd/tsv
{code}
The translation tool *unl2tsv* is in GitHub at:
* [https://github.com/lsst-dm/db_pdac_wise/tree/master/tools]

This operation finished with *11 TB* of data in the output folder.
] |
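To illustrate the kind of translation this stage performs, here is a rough sketch. It is not the *unl2tsv* tool itself, and the input format is an assumption: each *unl* row is taken to be one line of "|"-separated fields ending with a trailing "|", and empty fields are written as the MySQL NULL marker \N. The function names are hypothetical.

```python
def unl_line_to_tsv(line: str, null: str = r"\N") -> str:
    """Convert one '|'-delimited row into a tab-separated row.

    Assumptions (not confirmed by the ticket): rows end with a trailing
    '|', fields contain no embedded tabs or escaped delimiters, and an
    empty field means NULL.
    """
    fields = line.rstrip("\n").split("|")
    # A trailing delimiter produces one extra empty field; drop it.
    if fields and fields[-1] == "":
        fields.pop()
    return "\t".join(field if field != "" else null for field in fields)


def convert(src, dst) -> int:
    """Stream rows from the text file object `src` to `dst`; return the row count."""
    rows = 0
    for line in src:
        dst.write(unl_line_to_tsv(line) + "\n")
        rows += 1
    return rows
```

Streaming line by line keeps memory use flat, which matters for input on the multi-terabyte scale reported above.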
Comment |
[ h1. Partitioning catalog *allsky_4band_p1bs_psd*
{color:red}TBC{color} ] |
Comment |
[ h1. Loading catalog *allsky_4band_p1bs_psd* into *PDAC*
{color:red}TBC{color} ] |
Comment |
[ h1. Downloading *WISE 3-Band Cryo Single Exposure (L1b) Source Table* data
\\
| *Table name* | allsky_3band_p1bs_psd |
| *Temporary data folder (NCSA)* | /datasets/gapon/wise/allsky_3band_p1bs_psd/ |
| *Catalog data (IPAC)* | [https://irsa.ipac.caltech.edu/data/download/wise-3band-psd/] |
| *Schema file (IPAC)* | [https://irsa.ipac.caltech.edu/data/download/wise-3band-psd/wise_allsky_3band_p1bs_psd-schema.txt] |

h2. Catalog schema

The original schema file was downloaded as:
{code}
/datasets/gapon/wise/allsky_3band_p1bs_psd/allsky_3band_p1bs_psd.txt
{code}
It was then translated into a MySQL schema definition (SQL DDL) file using this tool (available in the GitHub package [txt2schema.py|https://github.com/lsst-dm/db_pdac_wise/tree/master/tools]):
{code:bash}
% python \
    /datasets/gapon/development/db_pdac_wise/tools/txt2schema.py \
    allsky_3band_p1bs_psd.txt \
    > allsky_3band_p1bs_psd.schema
{code}
*ATTENTION*: the *PRIMARY KEY* definition had to be added manually to the generated schema file:
{code:sql}
CREATE TABLE `allsky_3band_p1bs_psd` (
  `source_id` CHAR(16) DEFAULT NULL,
  ....
  PRIMARY KEY(`source_id`)
) ENGINE=MyISAM;
{code}

h2. Catalog data

The download script was started from *lsst-xfer*. The files were placed in the temporary data folder at:
{code}
/datasets/gapon/wise/allsky_3band_p1bs_psd/downloaded/
{code}
The operation succeeded with:
* *2304* compressed files (*bz2*)
* *1 TB* of data in total
] |
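The file count and total size reported above can be cross-checked after a download. This is a minimal sketch under stated assumptions: the helper name {{summarize_downloads}} is hypothetical, and it simply counts files with the ".bz2" suffix anywhere under the given folder.

```python
from pathlib import Path


def summarize_downloads(folder, suffix: str = ".bz2"):
    """Return (file_count, total_bytes) for files ending in `suffix` under `folder`."""
    files = [p for p in Path(folder).rglob(f"*{suffix}") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)
```

For this catalog, the returned count would be compared against the expected 2304 files before moving on to decompression.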
Comment |
[ h1. Processing downloaded files of catalog *allsky_3band_p1bs_psd*
h2. Uncompressing the files

The resulting files were put into the same folder as the compressed ones, replacing the originals. The total amount of data in the folder is *3.3 TB*:
{code}
/datasets/gapon/wise/allsky_3band_p1bs_psd/downloaded/data/
{code}

h2. Translating files from the *unl* format into the *TSV* format

This stage is needed because *TSV* is the only input data format supported by the LSST DB catalog partitioning tools. The output is placed at:
{code}
/datasets/gapon/wise/allsky_3band_p1bs_psd/tsv
{code}
The translation tool *unl2tsv* is in GitHub at:
* [https://github.com/lsst-dm/db_pdac_wise/tree/master/tools]

This operation finished with *3.4 TB* of data in the output folder.
] |
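The decompression step could look roughly like the sketch below. The ticket does not say which tool was actually used, so this is only an illustration with Python's standard {{bz2}} module; the function name {{decompress_bz2}} is hypothetical. Unlike the procedure described above, it does not delete the compressed source file.

```python
import bz2
import shutil
from pathlib import Path


def decompress_bz2(src, dst, chunk_size: int = 1 << 20) -> int:
    """Stream-decompress the bz2 file `src` into `dst`; return the output size in bytes."""
    # Copy in fixed-size chunks so that large files never sit fully in memory.
    with bz2.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
    return Path(dst).stat().st_size
```

Each file is independent, so a loop over the downloaded files can be spread across several worker processes or cluster nodes.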
Comment |
[ h1. Partitioning catalog *allsky_3band_p1bs_psd*
{color:red}TBC{color} ] |
Comment |
[ h1. Loading catalog *allsky_3band_p1bs_psd* into *PDAC*
{color:red}TBC{color} ] |
Description |
This ticket documents data preparation and loading for the following WISE catalogs:
|| Catalog name || PDAC database || PDAC table ||
| WISE All-Sky Single Exposure (L1b) Source Table | wise_4band_00 | allsky_4band_p1bs_psd |
| WISE 3-Band Cryo Single Exposure (L1b) Source Table | wise_3band_00 | allsky_3band_p1bs_psd |
| WISE Post-Cryo Single Exposure (L1b) Source Table | wise_2band_00 | allsky_2band_p1bs_psd | |
This ticket documents data preparation and loading for the following WISE catalogs:
|| Catalog name || PDAC database || PDAC table || IRSA download page ||
| WISE All-Sky Single Exposure (L1b) Source Table | wise_4band_00 | allsky_4band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-4band-psd/ |
| WISE 3-Band Cryo Single Exposure (L1b) Source Table | wise_3band_00 | allsky_3band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-3band-psd/ |
| WISE Post-Cryo Single Exposure (L1b) Source Table | wise_2band_00 | allsky_2band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-2band-psd/ | |
Remote Link | This issue links to "Page (Confluence)" [ 15234 ] |
Sprint | DB_S17_5, DB_S17_6 [ 615, 619 ] | DB_S17_5, DB_S17_6, DB_S17_7 [ 615, 619, 627 ] |
Rank | Ranked higher |
Link | This issue is triggering IHS-378 [ IHS-378 ] |
Rank | Ranked higher |
Sprint | DB_S17_5, DB_S17_6, DB_S17_7 [ 615, 619, 627 ] | DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8 [ 615, 619, 627, 634 ] |
Sprint | DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8 [ 615, 619, 627, 634 ] | DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8, DB_S17_9 [ 615, 619, 627, 634, 640 ] |
Rank | Ranked higher |
Sprint | DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8, DB_S17_9 [ 615, 619, 627, 634, 640 ] | DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8, DB_S17_9, DB_S17_10 [ 615, 619, 627, 634, 640, 650 ] |
Rank | Ranked higher |
Comment |
[ h1. Loading catalog allsky_3band_p1bs_psd into PDAC
The loading protocol is explained in detail at:
* [https://confluence.lsstcorp.org/display/DM/Loading+WISE+catalogs+into+PDAC]

The rest of this section presents a short summary of the actions and tests performed during this loading.

Dataset configuration

Database name: *wise_3band_00*

Run this on lsst-dev:
{code:bash}
% cd /datasets/gapon/development/db_pdac_wise/scripts
% ln -s dataset.bash.wise_3band_00 dataset.bash
{code}
] |
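When switching between catalogs, the {{dataset.bash}} link has to be re-pointed, and a plain {{ln -s}} fails if the link already exists. The sketch below shows one way to do the switch in a single rename step. It is an illustration only: the function name {{select_dataset}} and the temporary link name {{dataset.bash.tmp}} are hypothetical and not part of the db_pdac_wise scripts.

```python
import os
from pathlib import Path


def select_dataset(scripts_dir, dataset: str) -> Path:
    """Point `dataset.bash` in `scripts_dir` at `dataset.bash.<dataset>`."""
    scripts = Path(scripts_dir)
    target = f"dataset.bash.{dataset}"
    if not (scripts / target).exists():
        raise FileNotFoundError(target)
    link = scripts / "dataset.bash"
    tmp = scripts / "dataset.bash.tmp"  # hypothetical scratch name
    if tmp.is_symlink() or tmp.exists():
        tmp.unlink()
    # Relative link target, matching `ln -s` run from inside the folder.
    os.symlink(target, tmp)
    # Renaming over the old link replaces it in one step on POSIX systems.
    os.replace(tmp, link)
    return link
```

For the configuration above, the call would be {{select_dataset("/datasets/gapon/development/db_pdac_wise/scripts", "wise_3band_00")}}.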
Comment |
[ h1. Loading catalog allsky_4band_p1bs_psd into PDAC
The loading protocol is explained in detail at:
* [https://confluence.lsstcorp.org/display/DM/Loading+WISE+catalogs+into+PDAC]

The rest of this section presents a short summary of the actions and tests performed during this loading.

{color:red}TBC...{color} ] |
Resolution | Done [ 10000 ] | |
Status | In Progress [ 3 ] | Done [ 10002 ] |