Details
-
Type:
Story
-
Status: Done
-
Resolution: Done
-
Fix Version/s: None
-
Component/s: Qserv
-
Labels:
-
Story Points:16
-
Epic Link:
-
Sprint:DB_F16_7, DB_F16_8, DB_F16_9
-
Team:Data Access and Database
Description
Create a database populated with complete catalogs resulting from processing of the SDSS Stripe 82 data at both NCSA and IN2P3 sites. The database has to be created at NCSA on the following database server lsst-db.ncsa.illinois.edu
- 2016-08-10: CORRECTION: that was the original plan. Due to the space limitation on that server the database will be installed on Nebula volume lsst-dm-gapon attached to image lsst-gapon-0 (floating IP address 141.142.211.152). See details on this setup in the comments stream below.)
The database name will be: gapon_SDRP_Stripe82
The database will be populated with the contents of the following databases:
- NCSA (lsst-db.ncsa.illinois.edu):
daues_SDRP_Stripe82_ncsa
daues_SDRP_dedupe_byfilter_0
daues_SDRP_dedupe_byfilter_1
daues_SDRP_dedupe_byfilter_2
daues_SDRP_dedupe_byfilter_3
daues_SDRP_dedupe_byfilter_4
- IN2P3 (ccdb02.in2p3.fr):
lsst_prod_DC_2013_2
lsst_prod_dedupe_byfilter_g
lsst_prod_dedupe_byfilter_i
lsst_prod_dedupe_byfilter_r
lsst_prod_dedupe_byfilter_u
lsst_prod_dedupe_byfilter_z
Additional requirements:
- The duplicate entries (due to the overlap in the RA range) will need to be carefully assessed and eliminated.
- the referential integrity of the resulted database will need to be tested
Attachments
Issue Links
Activity
Field | Original Value | New Value |
---|---|---|
Epic Link |
|
Description |
Create a database populated with complete catalogs resulting from processing of the *SDSS Stripe 82* data at both NCSA and IN2P3 sites. The database has to be created at NCSA on the following database server:
* lsst-db.ncsa.illinois.edu The database name will be: * gapon_SDRP_Stripe82 The database will be populated with the contents of the following databases: *NCSA* (lsst-db.ncsa.illinois.edu): | daues_SDRP_Stripe82_ncsa | | daues_SDRP_dedupe_byfilter_0 | | daues_SDRP_dedupe_byfilter_1 | | daues_SDRP_dedupe_byfilter_2 | | daues_SDRP_dedupe_byfilter_3 | | daues_SDRP_dedupe_byfilter_4 | *IN2P3* (ccdb02.in2p3.fr): | lsst_prod_DC_2013_2 | | lsst_prod_dedupe_byfilter_g | | lsst_prod_dedupe_byfilter_i | | lsst_prod_dedupe_byfilter_r | | lsst_prod_dedupe_byfilter_u | | lsst_prod_dedupe_byfilter_z | Additional requirements: * The duplicate entries (due to the overlap in the RA range) will need to be carefully assessed and eliminated. * the referential integrity of the resulted database will need to be tested |
Create a database populated with complete catalogs resulting from processing of the *SDSS Stripe 82* data at both NCSA and IN2P3 sites. The database has to be created at NCSA on the following database server:
* *lsst-db.ncsa.illinois.edu* The database name will be: * *gapon_SDRP_Stripe82* The database will be populated with the contents of the following databases: *NCSA* (lsst-db.ncsa.illinois.edu): {code} daues_SDRP_Stripe82_ncsa daues_SDRP_dedupe_byfilter_0 daues_SDRP_dedupe_byfilter_1 daues_SDRP_dedupe_byfilter_2 daues_SDRP_dedupe_byfilter_3 daues_SDRP_dedupe_byfilter_4 {code} *IN2P3* (ccdb02.in2p3.fr): {code} lsst_prod_DC_2013_2 lsst_prod_dedupe_byfilter_g lsst_prod_dedupe_byfilter_i lsst_prod_dedupe_byfilter_r lsst_prod_dedupe_byfilter_u lsst_prod_dedupe_byfilter_z {code} Additional requirements: * The duplicate entries (due to the overlap in the RA range) will need to be carefully assessed and eliminated. * the referential integrity of the resulted database will need to be tested |
Description |
Create a database populated with complete catalogs resulting from processing of the *SDSS Stripe 82* data at both NCSA and IN2P3 sites. The database has to be created at NCSA on the following database server:
* *lsst-db.ncsa.illinois.edu* The database name will be: * *gapon_SDRP_Stripe82* The database will be populated with the contents of the following databases: *NCSA* (lsst-db.ncsa.illinois.edu): {code} daues_SDRP_Stripe82_ncsa daues_SDRP_dedupe_byfilter_0 daues_SDRP_dedupe_byfilter_1 daues_SDRP_dedupe_byfilter_2 daues_SDRP_dedupe_byfilter_3 daues_SDRP_dedupe_byfilter_4 {code} *IN2P3* (ccdb02.in2p3.fr): {code} lsst_prod_DC_2013_2 lsst_prod_dedupe_byfilter_g lsst_prod_dedupe_byfilter_i lsst_prod_dedupe_byfilter_r lsst_prod_dedupe_byfilter_u lsst_prod_dedupe_byfilter_z {code} Additional requirements: * The duplicate entries (due to the overlap in the RA range) will need to be carefully assessed and eliminated. * the referential integrity of the resulted database will need to be tested |
Create a database populated with complete catalogs resulting from processing of the *SDSS Stripe 82* data at both NCSA and IN2P3 sites. The database has to be created at NCSA on the following database server:
* *lsst-db.ncsa.illinois.edu* The database name will be: * *gapon_SDRP_Stripe82* The database will be populated with the contents of the following databases: * *NCSA* (lsst-db.ncsa.illinois.edu): {code} daues_SDRP_Stripe82_ncsa daues_SDRP_dedupe_byfilter_0 daues_SDRP_dedupe_byfilter_1 daues_SDRP_dedupe_byfilter_2 daues_SDRP_dedupe_byfilter_3 daues_SDRP_dedupe_byfilter_4 {code} * *IN2P3* (ccdb02.in2p3.fr): {code} lsst_prod_DC_2013_2 lsst_prod_dedupe_byfilter_g lsst_prod_dedupe_byfilter_i lsst_prod_dedupe_byfilter_r lsst_prod_dedupe_byfilter_u lsst_prod_dedupe_byfilter_z {code} Additional requirements: * The duplicate entries (due to the overlap in the RA range) will need to be carefully assessed and eliminated. * the referential integrity of the resulted database will need to be tested |
Story Points | 8 | 16 |
Status | To Do [ 10001 ] | In Progress [ 3 ] |
Sprint | DB_F16_7 [ 232 ] | DB_F16_7, DB_F16_8 [ 232, 233 ] |
Rank | Ranked higher |
Description |
Create a database populated with complete catalogs resulting from processing of the *SDSS Stripe 82* data at both NCSA and IN2P3 sites. The database has to be created at NCSA on the following database server:
* *lsst-db.ncsa.illinois.edu* The database name will be: * *gapon_SDRP_Stripe82* The database will be populated with the contents of the following databases: * *NCSA* (lsst-db.ncsa.illinois.edu): {code} daues_SDRP_Stripe82_ncsa daues_SDRP_dedupe_byfilter_0 daues_SDRP_dedupe_byfilter_1 daues_SDRP_dedupe_byfilter_2 daues_SDRP_dedupe_byfilter_3 daues_SDRP_dedupe_byfilter_4 {code} * *IN2P3* (ccdb02.in2p3.fr): {code} lsst_prod_DC_2013_2 lsst_prod_dedupe_byfilter_g lsst_prod_dedupe_byfilter_i lsst_prod_dedupe_byfilter_r lsst_prod_dedupe_byfilter_u lsst_prod_dedupe_byfilter_z {code} Additional requirements: * The duplicate entries (due to the overlap in the RA range) will need to be carefully assessed and eliminated. * the referential integrity of the resulted database will need to be tested |
Create a database populated with complete catalogs resulting from processing of the *SDSS Stripe 82* data at both NCSA and IN2P3 sites. -The database has to be created at NCSA on the following database server:
* *lsst-db.ncsa.illinois.edu* - ** 2016-08-10: *CORRECTION*: that was the original plan. Due to the space limitation on that server the database will be installed on Nebula volume *lsst-dm-gapon* attached to image *lsst-gapon-0* (floating IP address *141.142.211.152*. See details on this setup in the comments stream below.) The database name will be: * *gapon_SDRP_Stripe82* The database will be populated with the contents of the following databases: * *NCSA* (lsst-db.ncsa.illinois.edu): {code} daues_SDRP_Stripe82_ncsa daues_SDRP_dedupe_byfilter_0 daues_SDRP_dedupe_byfilter_1 daues_SDRP_dedupe_byfilter_2 daues_SDRP_dedupe_byfilter_3 daues_SDRP_dedupe_byfilter_4 {code} * *IN2P3* (ccdb02.in2p3.fr): {code} lsst_prod_DC_2013_2 lsst_prod_dedupe_byfilter_g lsst_prod_dedupe_byfilter_i lsst_prod_dedupe_byfilter_r lsst_prod_dedupe_byfilter_u lsst_prod_dedupe_byfilter_z {code} Additional requirements: * The duplicate entries (due to the overlap in the RA range) will need to be carefully assessed and eliminated. * the referential integrity of the resulted database will need to be tested |
Description |
Create a database populated with complete catalogs resulting from processing of the *SDSS Stripe 82* data at both NCSA and IN2P3 sites. -The database has to be created at NCSA on the following database server:
* *lsst-db.ncsa.illinois.edu* - ** 2016-08-10: *CORRECTION*: that was the original plan. Due to the space limitation on that server the database will be installed on Nebula volume *lsst-dm-gapon* attached to image *lsst-gapon-0* (floating IP address *141.142.211.152*. See details on this setup in the comments stream below.) The database name will be: * *gapon_SDRP_Stripe82* The database will be populated with the contents of the following databases: * *NCSA* (lsst-db.ncsa.illinois.edu): {code} daues_SDRP_Stripe82_ncsa daues_SDRP_dedupe_byfilter_0 daues_SDRP_dedupe_byfilter_1 daues_SDRP_dedupe_byfilter_2 daues_SDRP_dedupe_byfilter_3 daues_SDRP_dedupe_byfilter_4 {code} * *IN2P3* (ccdb02.in2p3.fr): {code} lsst_prod_DC_2013_2 lsst_prod_dedupe_byfilter_g lsst_prod_dedupe_byfilter_i lsst_prod_dedupe_byfilter_r lsst_prod_dedupe_byfilter_u lsst_prod_dedupe_byfilter_z {code} Additional requirements: * The duplicate entries (due to the overlap in the RA range) will need to be carefully assessed and eliminated. * the referential integrity of the resulted database will need to be tested |
Create a database populated with complete catalogs resulting from processing of the *SDSS Stripe 82* data at both NCSA and IN2P3 sites. -The database has to be created at NCSA on the following database server *lsst-db.ncsa.illinois.edu*-
* 2016-08-10: *CORRECTION*: that was the original plan. Due to the space limitation on that server the database will be installed on Nebula volume *lsst-dm-gapon* attached to image *lsst-gapon-0* (floating IP address *141.142.211.152*. See details on this setup in the comments stream below.) The database name will be: *gapon_SDRP_Stripe82* The database will be populated with the contents of the following databases: * *NCSA* (lsst-db.ncsa.illinois.edu): {code} daues_SDRP_Stripe82_ncsa daues_SDRP_dedupe_byfilter_0 daues_SDRP_dedupe_byfilter_1 daues_SDRP_dedupe_byfilter_2 daues_SDRP_dedupe_byfilter_3 daues_SDRP_dedupe_byfilter_4 {code} * *IN2P3* (ccdb02.in2p3.fr): {code} lsst_prod_DC_2013_2 lsst_prod_dedupe_byfilter_g lsst_prod_dedupe_byfilter_i lsst_prod_dedupe_byfilter_r lsst_prod_dedupe_byfilter_u lsst_prod_dedupe_byfilter_z {code} Additional requirements: * The duplicate entries (due to the overlap in the RA range) will need to be carefully assessed and eliminated. * the referential integrity of the resulted database will need to be tested |
Description |
Create a database populated with complete catalogs resulting from processing of the *SDSS Stripe 82* data at both NCSA and IN2P3 sites. -The database has to be created at NCSA on the following database server *lsst-db.ncsa.illinois.edu*-
* 2016-08-10: *CORRECTION*: that was the original plan. Due to the space limitation on that server the database will be installed on Nebula volume *lsst-dm-gapon* attached to image *lsst-gapon-0* (floating IP address *141.142.211.152*. See details on this setup in the comments stream below.) The database name will be: *gapon_SDRP_Stripe82* The database will be populated with the contents of the following databases: * *NCSA* (lsst-db.ncsa.illinois.edu): {code} daues_SDRP_Stripe82_ncsa daues_SDRP_dedupe_byfilter_0 daues_SDRP_dedupe_byfilter_1 daues_SDRP_dedupe_byfilter_2 daues_SDRP_dedupe_byfilter_3 daues_SDRP_dedupe_byfilter_4 {code} * *IN2P3* (ccdb02.in2p3.fr): {code} lsst_prod_DC_2013_2 lsst_prod_dedupe_byfilter_g lsst_prod_dedupe_byfilter_i lsst_prod_dedupe_byfilter_r lsst_prod_dedupe_byfilter_u lsst_prod_dedupe_byfilter_z {code} Additional requirements: * The duplicate entries (due to the overlap in the RA range) will need to be carefully assessed and eliminated. * the referential integrity of the resulted database will need to be tested |
Create a database populated with complete catalogs resulting from processing of the *SDSS Stripe 82* data at both NCSA and IN2P3 sites. -The database has to be created at NCSA on the following database server *lsst-db.ncsa.illinois.edu*-
* 2016-08-10: *CORRECTION*: that was the original plan. Due to the space limitation on that server the database will be installed on Nebula volume *lsst-dm-gapon* attached to image *lsst-gapon-0* (floating IP address *141.142.211.152*). See details on this setup in the comments stream below.) The database name will be: *gapon_SDRP_Stripe82* The database will be populated with the contents of the following databases: * *NCSA* (lsst-db.ncsa.illinois.edu): {code} daues_SDRP_Stripe82_ncsa daues_SDRP_dedupe_byfilter_0 daues_SDRP_dedupe_byfilter_1 daues_SDRP_dedupe_byfilter_2 daues_SDRP_dedupe_byfilter_3 daues_SDRP_dedupe_byfilter_4 {code} * *IN2P3* (ccdb02.in2p3.fr): {code} lsst_prod_DC_2013_2 lsst_prod_dedupe_byfilter_g lsst_prod_dedupe_byfilter_i lsst_prod_dedupe_byfilter_r lsst_prod_dedupe_byfilter_u lsst_prod_dedupe_byfilter_z {code} Additional requirements: * The duplicate entries (due to the overlap in the RA range) will need to be carefully assessed and eliminated. * the referential integrity of the resulted database will need to be tested |
Comment |
[ h1. Initializing database gapon_SDRP_Stripe82 (NCSA)
The database will be initialized as a clone of an existing database *daues_SDRP_Stripe82_ncsa.sql*. The procedure has the following steps: # making a *mysqldump* of the source database # creating the destination database # loading the dump into the destination database Initially the dump files were placed at (NCSA): {code} /nfs/lsst8/gapon {code} After switching the database server from *lsst-db* to a Nebula image/volume the dumps were made (uploaded) onto the Nebula volume first. A reason of that was that neither LSST NFS mount poibnt was available to the Nebula images nor the Nebula storage volumes could be mounted on the public development machines allocated to LSST at NCSA. h2. Loading options * in theory a more efficient way of cloning a database would be by chaining two operations using UNIX pipe w/o compressing/decompressing the stream: {code:bash} mysqldump ... daues_SDRP_Stripe82_ncsa | mysql ... gapon_SDRP_Stripe82 {code} The main reason why I chose to go via an intermediate file is because it allows a quicker turnaround time for making other clones (should this be needed) based on the dump file. The second reason was related to the security restrictions imposed by MySQL for views. For instance, a straightforward import of the above taken snapshot failed with the following message: {code} ERROR 1227 (42000) at line 140273: Access denied; you need (at least one of) the SUPER privilege(s) for this operation {code} Further investigation has pointed out onto the following line in the dump: {code} /*!50013 DEFINER=`daues`@`%` SQL SECURITY DEFINER */ {code} Hence, a solution was to: {code:bash} % cat daues_SDRP_Stripe82_ncsa.sql | sed 's/daues/gapon/g' | mysql gapon_SDRP_Stripe82 {code} At this point the database was properly seeded. h3. Observations The performance of the database loading was rather low: 20 MB/s or less. And there was a substantial CPU usage on the database server: {code} % top PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 5236 mysql 20 0 972804 216772 4044 S 83.0 1.3 689:45.22 mysqld 8307 gapon 20 0 238796 6632 2768 S 16.3 0.0 95:42.25 mysql {code} ] |
Comment |
[ h1. Loading deep forced sources into gapon_SDRP_Stripe82 (NCSA)
Loading data of the *u* band (filter index *0*): {code:sql} USE gapon_SDRP_Stripe82; CREATE TABLE RunDeepForcedSource LIKE daues_SDRP_dedupe_byfilter_0.RunDeepForcedSource; INSERT INTO RunDeepForcedSource SELECT * FROM daues_SDRP_dedupe_byfilter_0.RunDeepForcedSource; {code} The query finished in 32 hours: {code} Query OK, 1729599791 rows affected (1 day 8 hours 15 min 12.97 sec) Records: 1729599791 Duplicates: 0 Warnings: 0 {code} *Observations*: the aggregate performance was (700 GB of data from the table + 190 GB of index) / (1 day 8 hours 15 min) = *7.7 MB/s*. *UPDATED: 2016-08-08*: ran out of space on the database server. Giving up with this option. Investigating other options for setting up a database server with the large disk. ] |
Sprint | DB_F16_7, DB_F16_8 [ 232, 233 ] | DB_F16_7, DB_F16_8, DB_F16_9 [ 232, 233, 249 ] |
Rank | Ranked higher |
Labels | pdac |
Resolution | Done [ 10000 ] | |
Status | In Progress [ 3 ] | Done [ 10002 ] |
Remote Link | This issue links to "Page (Confluence)" [ 14327 ] |