Details
- Type: Story
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: None
- Labels:
- Story Points: 12
- Epic Link:
- Sprint: DB_S17_5, DB_S17_6, DB_S17_7, DB_S17_8, DB_S17_9, DB_S17_10
- Team: Data Access and Database
Description
This ticket documents data preparation and loading for the following WISE catalogs:
| Catalog name | PDAC database | PDAC table | IRSA download page |
|---|---|---|---|
| WISE All-Sky Single Exposure (L1b) Source Table | wise_4band_00 | allsky_4band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-4band-psd/ |
| WISE 3-Band Cryo Single Exposure (L1b) Source Table | wise_3band_00 | allsky_3band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-3band-psd/ |
| WISE Post-Cryo Single Exposure (L1b) Source Table | wise_2band_00 | allsky_2band_p1bs_psd | http://irsa.ipac.caltech.edu/data/download/wise-2band-psd/ |
Attachments
Issue Links
- is triggering
  - DM-10988 Evaluating benefits of SSD and NVMe storage technologies for Qserv "secondary index" - Done
  - DM-11027 Investigate options for speeding up data ingestion into Qserv "secondary index" - Done
- relates to
  - DM-9736 Quotes around column names in SQL statements cause a parser error. - Won't Fix
  - DM-9372 Load WISE catalog data in PDAC - Done
  - DM-12523 Load NEOWISE-R Year 1 Single Exposure (L1b) Source Table into PDAC - Done
  - DM-12910 Summary of WISE catalogs loaded into PDAC in 2017 - Done
Activity
Loading catalog allsky_3band_p1bs_psd into PDAC
The loading protocol is explained in detail at:
The rest of this section presents a short summary of the actions and tests taken during this loading.
Dataset configuration
Database name: wise_3band_00
Do this at lsst-dev:
% cd /datasets/gapon/development/db_pdac_wise/scripts
% ln -s dataset.bash.wise_3band_00 dataset.bash
Creating database and initializing CSS
All these steps were conducted on the master node of the cluster lsst-qserv-master01:
% /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/run_create_database.bash \
    --delete --verbose >& run_create_database.log&

% /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/setup_css.bash \
    >& setup_css.log

% source /datasets/gapon/stack/loadLSST.bash
% setup -t qserv-dev qserv_distrib; setup -j -r /datasets/gapon/development/qserv
% qserv-admin.py "DUMP EVERYTHING;" | grep wise_3band_00
/DBS/wise_3band_00 READY
/DBS/wise_3band_00/.packed.json {"partitioningId":"0000000028","releaseStatus":"RELEASED","storageClass":"L2"}
/DBS/wise_3band_00/TABLES \N
/DBS/wise_3band_00/TABLES/allsky_3band_p1bs_psd READY
/DBS/wise_3band_00/TABLES/allsky_3band_p1bs_psd/partitioning \N
/DBS/wise_3band_00/TABLES/allsky_3band_p1bs_psd/partitioning/.packed.json {"dirColName":"source_id","dirDb":"wise_3band_00","dirTable":"allsky_3band_p1bs_psd","latColName":"decl","lonColName":"ra","subChunks":"1"}
/DBS/wise_3band_00/TABLES/allsky_3band_p1bs_psd/schema (
/DBS/wise_3band_00/TABLES/allsky_3band_p1bs_psd/sharedScan \N
/DBS/wise_3band_00/TABLES/allsky_3band_p1bs_psd/sharedScan/.packed.json {"lockInMem":"1"}
Configuring MySQL server and Docker container
Make sure each worker server has the 16k connection limit:
% /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/run_show_processlist.bash --total
lsst-qserv-master01: 2 / 16384
lsst-qserv-db01: 1 / 16384
lsst-qserv-db02: 1 / 16384
lsst-qserv-db03: 1 / 16384
...
lsst-qserv-db28: 1 / 16384
lsst-qserv-db29: 1 / 16384
lsst-qserv-db30: 1 / 16384
% docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
75d553b6cd06 qserv/qserv:tickets_DM-11755_master "/bin/sh -c /qserv..." 3 days ago Up 3 days qserv

% docker inspect qserv
"Mounts": [
    {
        "Type": "bind",
        "Source": "/qserv/data",
        "Destination": "/qserv/data",
        "Mode": "",
        "RW": true,
        "Propagation": ""
    },
    {
        "Type": "bind",
        "Source": "/qserv/log",
        "Destination": "/qserv/run/var/log",
        "Mode": "",
        "RW": true,
        "Propagation": ""
    },
    {
        "Type": "bind",
        "Source": "/qserv/tmp",
        "Destination": "/qserv/run/tmp",
        "Mode": "",
        "RW": true,
        "Propagation": ""
    }
],
CONCLUSION: the most important mapping, /qserv/run/tmp, which is used by the wmgr service for storing intermediate files, is properly configured.
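The mount check above can be scripted rather than eyeballed. A minimal sketch (assumes the JSON layout produced by `docker inspect`, as shown above; the function name is hypothetical):

```python
import json

def check_tmp_mount(inspect_json: str) -> str:
    """Return the host path bound to /qserv/run/tmp, or raise.

    inspect_json is the raw output of `docker inspect <container>`,
    which is a JSON list with one element per container.
    """
    mounts = json.loads(inspect_json)[0]["Mounts"]
    for m in mounts:
        if m["Destination"] == "/qserv/run/tmp":
            # The wmgr scratch area must be a writable bind mount.
            assert m["Type"] == "bind" and m["RW"], "not a writable bind mount"
            return m["Source"]
    raise RuntimeError("/qserv/run/tmp is not mounted")
```

For example, `check_tmp_mount(subprocess.check_output(["docker", "inspect", "qserv"]))` would return `/qserv/tmp` for the container configuration shown above.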
Loading partitioned tables
% nohup /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/run_load_object.bash \
    --verbose \
    >& run_load_object.log&
Testing results:
% for f in `ls /tmp/wise_3band_00/log/qserv-db*_load_object.log`; do tail -1 $f; done
[Fri Nov 3 02:40:01 CDT 2017] ** Finished loading **
[Fri Nov 3 02:35:00 CDT 2017] ** Finished loading **
[Fri Nov 3 02:34:50 CDT 2017] ** Finished loading **
[Fri Nov 3 02:36:59 CDT 2017] ** Finished loading **
[Fri Nov 3 02:42:29 CDT 2017] ** Finished loading **
[Fri Nov 3 02:48:35 CDT 2017] ** Finished loading **
[Fri Nov 3 02:42:06 CDT 2017] ** Finished loading **
[Fri Nov 3 02:36:06 CDT 2017] ** Finished loading **
[Fri Nov 3 02:32:24 CDT 2017] ** Finished loading **
[Fri Nov 3 02:31:45 CDT 2017] ** Finished loading **
[Fri Nov 3 02:30:49 CDT 2017] ** Finished loading **
[Fri Nov 3 02:30:40 CDT 2017] ** Finished loading **
[Fri Nov 3 02:30:49 CDT 2017] ** Finished loading **
[Fri Nov 3 02:30:46 CDT 2017] ** Finished loading **
[Fri Nov 3 02:32:07 CDT 2017] ** Finished loading **
[Fri Nov 3 02:36:00 CDT 2017] ** Finished loading **
[Fri Nov 3 02:43:55 CDT 2017] ** Finished loading **
[Fri Nov 3 02:43:17 CDT 2017] ** Finished loading **
[Fri Nov 3 02:38:38 CDT 2017] ** Finished loading **
[Fri Nov 3 02:35:56 CDT 2017] ** Finished loading **
[Fri Nov 3 02:35:10 CDT 2017] ** Finished loading **
[Fri Nov 3 02:44:31 CDT 2017] ** Finished loading **
[Fri Nov 3 02:47:58 CDT 2017] ** Finished loading **
[Fri Nov 3 02:39:12 CDT 2017] ** Finished loading **
[Fri Nov 3 02:39:31 CDT 2017] ** Finished loading **
[Fri Nov 3 02:37:09 CDT 2017] ** Finished loading **
[Fri Nov 3 02:36:53 CDT 2017] ** Finished loading **
[Fri Nov 3 02:37:57 CDT 2017] ** Finished loading **
[Fri Nov 3 02:41:49 CDT 2017] ** Finished loading **
[Fri Nov 3 02:46:58 CDT 2017] ** Finished loading **

% for f in `ls /tmp/wise_3band_00/log/qserv-db*_load_object.log`; do tail -1 $f; done | wc -l
30
Generating the secondary index
Harvesting triplets
% nohup \
    /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/run_dump_triplet.bash \
    --verbose \
    >& run_dump_triplet.log&
Inspect log files and cross-check the number of per-worker chunks against the number of success reports in the log files:
% for n in `seq --format "qserv-db%02.f" 1 30`; do ssh -n $n \
    'wc -l /tmp/wise_3band_00/chunks.txt; tail /tmp/wise_3band_00/log/dump_triplet.* | grep Finished | wc -l'
done
1514 /tmp/wise_3band_00/chunks.txt
1514
1518 /tmp/wise_3band_00/chunks.txt
1518
1516 /tmp/wise_3band_00/chunks.txt
1516
1509 /tmp/wise_3band_00/chunks.txt
1509
1519 /tmp/wise_3band_00/chunks.txt
1519
1519 /tmp/wise_3band_00/chunks.txt
1519
1524 /tmp/wise_3band_00/chunks.txt
1524
1521 /tmp/wise_3band_00/chunks.txt
1521
1520 /tmp/wise_3band_00/chunks.txt
1520
1523 /tmp/wise_3band_00/chunks.txt
1523
1515 /tmp/wise_3band_00/chunks.txt
1515
1514 /tmp/wise_3band_00/chunks.txt
1514
1520 /tmp/wise_3band_00/chunks.txt
1520
1519 /tmp/wise_3band_00/chunks.txt
1519
1514 /tmp/wise_3band_00/chunks.txt
1514
1517 /tmp/wise_3band_00/chunks.txt
1517
1518 /tmp/wise_3band_00/chunks.txt
1518
1513 /tmp/wise_3band_00/chunks.txt
1513
1512 /tmp/wise_3band_00/chunks.txt
1512
1515 /tmp/wise_3band_00/chunks.txt
1515
1513 /tmp/wise_3band_00/chunks.txt
1513
1513 /tmp/wise_3band_00/chunks.txt
1513
1510 /tmp/wise_3band_00/chunks.txt
1510
1509 /tmp/wise_3band_00/chunks.txt
1509
1511 /tmp/wise_3band_00/chunks.txt
1511
1510 /tmp/wise_3band_00/chunks.txt
1510
1504 /tmp/wise_3band_00/chunks.txt
1504
1502 /tmp/wise_3band_00/chunks.txt
1502
1514 /tmp/wise_3band_00/chunks.txt
1514
1519 /tmp/wise_3band_00/chunks.txt
1519
Pulling triplets from the worker nodes to master
Pulling triplet files from all worker nodes into the master w/o loading them:
% nohup \
    /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/load_secondary_index.bash \
    --verbose \
    --get-remote-triplets \
    --no-triplets-loading \
    >& pull_triplets.log&
When the pull finished, the results were inspected to see if the number of TSV files matched expectations:
% wc -l /datasets/gapon/wise/allsky_3band_p1bs_psd/partitioned_x16/all_chunks_unique_sorted.txt
45490
% sudo ls /qserv/data/dumps/wise_3band_00/ | grep tsv | wc -l
45445
Investigating:
% for f in `sudo ls /qserv/data/dumps/wise_3band_00/`; do
    echo ${f:11:-4}; done | sort -un > wise_3band_00.triplet.unique_chunks_sorted

% diff \
    /datasets/gapon/wise/allsky_3band_p1bs_psd/partitioned_x16/all_chunks_unique_sorted.txt \
    ./wise_3band_00.triplet.unique_chunks_sorted \
    | grep '<' | awk '{print $2}' > wise_3band_00.triplet.missing_chunks_sorted
% wc -l wise_3band_00.triplet.missing_chunks_sorted
45 wise_3band_00.triplet.missing_chunks_sorted
% cat
13706
21165
23184
28622
30822
32706
42913
48610
52439
54080
56521
68766
72167
73107
88795
92569
100048
100819
101842
103103
111272
113410
114674
116464
119864
126557
128702
141178
141857
144646
150679
151697
153842
155773
165283
171015
182357
195929
198776
202885
203510
208257
208259
219034
225101
ATTENTION: those 45 chunks are missing. Check whether they are related to any particular host.
The next command checks whether the input to the triplet harvesting script is consistent with the resulting number of TSV files transferred to the master:
% for n in `seq --format "qserv-db%02.f" 1 30`; do
    ssh -n $n 'cat /tmp/wise_3band_00/chunks.txt'; done | wc -l
45445
Now look at which chunks were successfully loaded. This information is extracted from the log files of the loader:
% for f in `ls /tmp/wise_3band_00/log/qserv-db*_load_object.log`; do
    fgrep 'loaded chunks' $f; done > wise_3band_00.loaded_chunks

% for l in `cat wise_3band_00.loaded_chunks`; do
    echo ${l}
done | sort -un > wise_3band_00.loaded_chunks.sorted
# replace '[INFO]' with '0' in the first line of the file before comparing results

% diff wise_3band_00.triplet.unique_chunks_sorted wise_3band_00.loaded_chunks.sorted
<no differences found>
CONCLUSION OF THIS STAGE: the triplet files match the number of chunks loaded into the database.
Now look at the output of the partitioner to see which files it produced:
% for chunk in `cat wise_3band_00.triplet.missing_chunks_sorted`; do
    fgrep chunk_${chunk}_ \
        /datasets/gapon/wise/allsky_3band_p1bs_psd/partitioned_x16/all_chunks_and_overlaps.txt
done | wc -l
438

% for chunk in `cat wise_3band_00.triplet.missing_chunks_sorted`; do
    fgrep chunk_${chunk}. \
        /datasets/gapon/wise/allsky_3band_p1bs_psd/partitioned_x16/all_chunks_and_overlaps.txt
done | wc -l
0

THAT EXPLAINS THE DISCREPANCY: the "missing" chunks are just overlaps around the spatial area of the WISE catalog, so this is not a problem.
CONCLUSION: all numbers match the expectations.
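The two fgrep patterns above distinguish overlap-only files from plain chunk files purely by name. A sketch of that classification (the exact file names, e.g. chunk_13706_overlap.txt, are an assumption inferred from the `chunk_<id>_` versus `chunk_<id>.` patterns used above):

```python
import re

def classify(filename):
    """Classify a partitioner output file as 'chunk', 'overlap', or None.

    Assumes hypothetical names of the form chunk_<id>.txt (chunk data)
    and chunk_<id>_overlap.txt (overlap-only data).
    """
    m = re.match(r"chunk_(\d+)(_overlap)?\.txt$", filename)
    if not m:
        return None
    return "overlap" if m.group(2) else "chunk"
```

A chunk id that appears only in overlap files (the first fgrep matches, the second does not) has no data of its own, which is exactly the situation observed for the 45 "missing" chunks.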
Pre-sorting and merging TSV files
NOTE: This stage is required due to the 30% reduction of the table size demonstrated in DM-11027. Also, the sort-merge has no significant effect on the total duration of the procedure.
Creating 64 groups of TSV files which will be processed in parallel (because the second sort-merge procedure is CPU intensive):
% /bin/sudo -u qserv -i
% cd /qserv/data/dumps/
% mkdir wise_3band_00_tsv64
% ls wise_3band_00 | grep tsv > wise_3band_00_tsv64.txt
% wc -l wise_3band_00_tsv64.txt
45445 wise_3band_00_tsv64.txt

% cd wise_3band_00_tsv64
% split -d -l 1674 ../wise_3band_00_tsv64.txt ''
% ls -1 | wc -l
28

% ls -al
-rw-rw-r-- 1 qserv qserv 36756 Nov 3 20:44 00
...
-rw-rw-r-- 1 qserv qserv 35090 Nov 3 20:44 26
-rw-rw-r-- 1 qserv qserv  5187 Nov 3 20:44 27

% wc -l * | tail -1
45445 total
Sort in 64 streams and store the sorted files in a separate folder:
% cd /qserv/data/dumps/
% mkdir wise_3band_00_sorted
% date
Fri Nov 3 20:47:27 CDT 2017

% for c in `ls wise_3band_00_tsv64`; do
    for f in `cat wise_3band_00_tsv64/${c}`; do
        sort wise_3band_00/${f} > wise_3band_00_sorted/${f}
    done&
done
Analyzing results after all 64 jobs finished:
% date
Fri Nov 3 21:15:09 CDT 2017
% ls -1 wise_3band_00_sorted | wc -l
45445
% du -hs wise_3band_00_sorted
95G wise_3band_00_sorted
The next step is the merge-sort. Due to the large number of input files this operation is split into 2 steps:
- sort-merging files within each stream (as defined in the previous step); all 64 streams are processed simultaneously
- a final merge of the 64 files (produced by the previous stage) into a single file
Parallel merge of 64 streams:
% cd wise_3band_00_sorted
% mkdir ../wise_3band_00_sorted_merged
% for f in `ls ../wise_3band_00_tsv64`; do
    mkdir -p ../wise_3band_00_tmp_${f}
    cat ../wise_3band_00_tsv64/${f} | tr '\n' '\0' | sort -m -s -k1 -T ../wise_3band_00_tmp_${f} -o ../wise_3band_00_sorted_merged/${f}.tsv --files0-from=- &
done
NOTE: the translation of newlines to NUL is required to feed null-terminated (input) file names into the sort tool. Also note that each stream uses its own temporary directory on the current disk instead of relying on the small /tmp file system.
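The `sort -m` invocation above performs a k-way merge of inputs that are already sorted. The same idea in Python, for illustration only (heapq.merge holds just one pending line per input stream, which is why the merge phase is cheap compared to the initial per-file sorts):

```python
import heapq

def merge_sorted_streams(streams):
    """Lazily k-way merge pre-sorted iterables, analogous to `sort -m`."""
    return heapq.merge(*streams)

def merge_sorted_files(input_paths, output_path):
    """Merge pre-sorted text files into one sorted output file."""
    files = [open(p) for p in input_paths]
    try:
        with open(output_path, "w") as out:
            # writelines consumes the merged stream lazily, so memory
            # usage stays flat no matter how large the inputs are.
            out.writelines(merge_sorted_streams(files))
    finally:
        for f in files:
            f.close()
```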
Analyzing the status after it finished:
% ls -1 wise_3band_00_sorted_merged/ | wc -l
28
% du -hs wise_3band_00_sorted_merged/
95G wise_3band_00_sorted_merged/
% wc -l wise_3band_00_sorted_merged/*.tsv | tail -1
3703319374 total
Final merge of the 64 stream files into one large file:
% mkdir wise_3band_00_tmp
% nohup sort -m -s -k1 -T wise_3band_00_tmp/ -o wise_3band_00_sorted_merged.tsv wise_3band_00_sorted_merged/*&
% date
Analyzing the status after it finished:
% ls -alh wise_3band_00_sorted_merged.tsv
-rw-rw-r-- 1 qserv qserv 95G Nov 3 22:15 wise_3band_00_sorted_merged.tsv
% wc -l wise_3band_00_sorted_merged.tsv
3703319374 wise_3band_00_sorted_merged.tsv
Splitting the large file into smaller chunks
% mkdir wise_3band_00_sorted_merged_split
% cd wise_3band_00_sorted_merged_split
% split -d -l 37033194 ../wise_3band_00_sorted_merged.tsv ''
Analyzing the status after it finished:
% ls -1 | wc -l
91
% du -hs .
% wc -l * | tail -1
Total number of files: 100
Average file size: < 1 GB
Lines per file: 37033194
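The `-l` value passed to split was evidently chosen to produce about 100 output files. A quick check of that arithmetic (the row count is taken from the wc -l output above; the target of 100 files is an assumption, not stated explicitly in the log):

```python
import math

total_rows = 3703319374   # wc -l of wise_3band_00_sorted_merged.tsv
target_files = 100        # assumed target file count

# Ceiling division so the last (partial) file still fits the target count.
lines_per_file = math.ceil(total_rows / target_files)
files_produced = math.ceil(total_rows / lines_per_file)
```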
Rename folders and the 100 newly created TSV files to allow the previously developed tool to work:
% mv wise_3band_00 wise_3band_00_imported
% ln -s wise_3band_00_sorted_merged_split wise_3band_00
% cd wise_3band_00_sorted_merged_split
% for f in `ls`; do mv ${f} ${f}.tsv; done

Loading:
% cd /home/gapon
% nohup /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/load_secondary_index.bash \
    --verbose \
    >& load_secondary_index_load_only.log&
|
Monitoring the progress:
% iostat -m 1 | grep sdb
...
sdb 369.00 5.38 66.95 5 66
sdb 441.00 5.50 76.78 5 76
sdb 521.00 7.50 99.96 7 99
sdb 379.00 5.38 68.05 5 68
...

The operation finished in less than 5 hours.
The log files are unremarkable.
The row count matches the number of objects in the tables:
SELECT COUNT(*) FROM qservMeta.wise_3band_00__allsky_3band_p1bs_psd;
3703319374
The size of the table files is:
% sudo ls -alh /qserv/data/mysql/qservMeta | grep wise_3band_00
-rw-rw---- 1 qserv qserv 1014 Nov 3 19:22 wise_3band_00__allsky_3band_p1bs_psd.frm
-rw-rw---- 1 qserv qserv 210G Nov 4 18:10 wise_3band_00__allsky_3band_p1bs_psd.ibd

Cleaning up:
% cd /qserv/data/dumps
% df -h /data
Setting up the empty chunk list file
The file was generated using a trivial Python script based on the earlier-produced list of unique chunks:
# Build the set of chunk numbers that actually exist in the catalog.
chunks = set()
with open("all_chunks_unique_sorted.txt", "r") as f:
    for chunk in f:
        chunks.add(int(chunk))

# Print every chunk number in the valid range that is NOT populated.
minChunk = 0
maxChunk = 1000000
for chunk in range(minChunk, maxChunk):
    if chunk not in chunks:
        print(chunk)
The generator was used like this:
% cd /datasets/gapon/wise/allsky_3band_p1bs_psd/partitioned_x16
% python generate_empty_chunk_list.py > empty_wise_3band_00.txt
% wc -l empty_wise_3band_00.txt
954510 empty_wise_3band_00.txt
% wc -l all_chunks_unique_sorted.txt
45490 all_chunks_unique_sorted.txt
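A quick sanity check on the two counts above: the empty-chunk list and the unique-chunk list should exactly partition the [0, 1000000) chunk id range scanned by the generator:

```python
max_chunk = 1000000     # upper bound used by the generator script
unique_chunks = 45490   # wc -l all_chunks_unique_sorted.txt
empty_chunks = 954510   # wc -l empty_wise_3band_00.txt

# Every chunk id in range is either populated or empty, never both.
assert unique_chunks + empty_chunks == max_chunk
```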
|
Then the resulting file was installed on the master node:
% /bin/sudo -u qserv \
    cp /datasets/gapon/wise/allsky_3band_p1bs_psd/partitioned_x16/empty_wise_3band_00.txt \
    /qserv/data/qserv/
Enabling the database in Qserv
% /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/run_qserv_enable_database.bash \
    --verbose \
    >& run_qserv_enable_database.log&
ATTENTION: Qserv needs to be restarted to recognize the database
Testing
Testing the number of objects returned by the shared scan query versus the one reported from the secondary index:
SELECT COUNT(*) FROM wise_3band_00.allsky_3band_p1bs_psd;
+----------------+
| SUM(QS1_COUNT) |
+----------------+
|     3703319374 |
+----------------+
1 row in set (2 min 28.15 sec)
CONCLUSION: the number matches the count of secondary index entries.
Loading catalog allsky_4band_p1bs_psd into PDAC
The loading protocol is explained in detail at:
The rest of this section presents a short summary of the actions and tests taken during this loading.
Dataset configuration
Database name: wise_4band_00
Do this at lsst-dev:
% cd /datasets/gapon/development/db_pdac_wise/scripts
% ln -s dataset.bash.wise_4band_00 dataset.bash
Creating database and initializing CSS
All these steps were conducted on the master node of the cluster lsst-qserv-master01:
% /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/run_create_database.bash \
    --delete --verbose >& run_create_database.log&

% /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/setup_css.bash \
    >& setup_css.log

% source /datasets/gapon/stack/loadLSST.bash
% setup -t qserv-dev qserv_distrib; setup -j -r /datasets/gapon/development/qserv
% qserv-admin.py "DUMP EVERYTHING;" | grep wise_4band_00
/DBS/wise_4band_00 READY
/DBS/wise_4band_00/.packed.json {"partitioningId":"0000000029","releaseStatus":"RELEASED","storageClass":"L2"}
/DBS/wise_4band_00/TABLES \N
/DBS/wise_4band_00/TABLES/allsky_4band_p1bs_psd READY
/DBS/wise_4band_00/TABLES/allsky_4band_p1bs_psd/partitioning \N
/DBS/wise_4band_00/TABLES/allsky_4band_p1bs_psd/partitioning/.packed.json {"dirColName":"source_id","dirDb":"wise_4band_00","dirTable":"allsky_4band_p1bs_psd","latColName":"decl","lonColName":"ra","subChunks":"1"}
/DBS/wise_4band_00/TABLES/allsky_4band_p1bs_psd/schema (
/DBS/wise_4band_00/TABLES/allsky_4band_p1bs_psd/sharedScan \N
/DBS/wise_4band_00/TABLES/allsky_4band_p1bs_psd/sharedScan/.packed.json {"lockInMem":"1"}
Configuring MySQL server and Docker container
These actions are required to ensure each worker is properly configured. See the loading protocol for the other n-band catalogs above for details.
Loading partitioned tables
% nohup /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/run_load_object.bash \
    --verbose \
    >& run_load_object.log&

Testing results:
% for f in `ls /tmp/wise_4band_00/log/qserv-db*_load_object.log`; do tail -1 $f; done
[Sat Nov 18 18:29:21 CST 2017] ** Finished loading **
[Sat Nov 18 18:12:13 CST 2017] ** Finished loading **
...

% for f in `ls /tmp/wise_4band_00/log/qserv-db*_load_object.log`; do tail -1 $f; done | wc -l
30
|
Generating the secondary index
Harvesting triplets
% nohup \
    /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/run_dump_triplet.bash \
    --verbose \
    >& run_dump_triplet.log&

Inspect log files and cross-check the number of per-worker chunks against the number of success reports in the log files:
% for n in `seq --format "qserv-db%02.f" 1 30`; do
    ssh -n $n 'wc -l /tmp/wise_4band_00/chunks.txt; tail /tmp/wise_4band_00/log/dump_triplet.* | grep Finished | wc -l'
done
4878 /tmp/wise_4band_00/chunks.txt
4878
4878 /tmp/wise_4band_00/chunks.txt
4878
...
4877 /tmp/wise_4band_00/chunks.txt
4877
Pulling triplets from the worker nodes to master
Pulling triplet files from all worker nodes into the master w/o loading them:
% nohup \
    /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/load_secondary_index.bash \
    --verbose \
    --get-remote-triplets \
    --no-triplets-loading \
    >& pull_triplets.log&

When the pull finished, the results were inspected to see if the number of TSV files matched expectations:
% wc -l /datasets/gapon/wise/allsky_4band_p1bs_psd/partitioned_x16/all_chunks_unique_sorted.txt
146332 /datasets/gapon/wise/allsky_4band_p1bs_psd/partitioned_x16/all_chunks_unique_sorted.txt
% sudo ls /qserv/data/dumps/wise_4band_00/ | grep tsv | wc -l
146332
% /bin/sudo -u qserv du -hs /qserv/data/dumps/wise_4band_00/
242G /qserv/data/dumps/wise_4band_00/
Pre-sorting and merging TSV files
NOTE: This stage is required due to the 30% reduction of the table size demonstrated in DM-11027. Also, the sort-merge has no significant effect on the total duration of the procedure.
Creating 88 groups of TSV files which will be processed in parallel (because the second sort-merge procedure is CPU intensive):
% /bin/sudo -u qserv -i
% cd /qserv/data/dumps/
% mkdir wise_4band_00_tsv64
% ls wise_4band_00 | grep tsv > wise_4band_00_tsv64.txt
% wc -l wise_4band_00_tsv64.txt
146332 wise_4band_00_tsv64.txt

% cd wise_4band_00_tsv64
% split -d -l 1674 ../wise_4band_00_tsv64.txt ''
% ls -1 | wc -l
88

% wc -l * | tail -1
146332 total
Sort in 88 streams and store the sorted files in a separate folder:
% cd /qserv/data/dumps/
% mkdir wise_4band_00_sorted
% for c in `ls wise_4band_00_tsv64`; do
    for f in `cat wise_4band_00_tsv64/${c}`; do
        sort wise_4band_00/${f} > wise_4band_00_sorted/${f}
    done&
done
Analyzing results after all 88 jobs finished:
% date
% ls -1 wise_4band_00_sorted | wc -l
146332
% du -hs wise_4band_00_sorted
242G wise_4band_00_sorted
|
The next step is the merge-sort. Due to the large number of input files this operation is split into 2 steps:
- sort-merging files within each stream (as defined in the previous step); all 88 streams are processed simultaneously
- a final merge of the 88 files (produced by the previous stage) into a single file
Parallel merge of 88 streams:
% cd wise_4band_00_sorted
% mkdir ../wise_4band_00_sorted_merged
% for f in `ls ../wise_4band_00_tsv64`; do
    mkdir -p ../wise_4band_00_tmp_${f}
    cat ../wise_4band_00_tsv64/${f} | tr '\n' '\0' | sort -m -s -k1 -T ../wise_4band_00_tmp_${f} -o ../wise_4band_00_sorted_merged/${f}.tsv --files0-from=- &
done
NOTE: the translation of newlines to NUL is required to feed null-terminated (input) file names into the sort tool. Also note that each stream uses its own temporary directory on the current disk instead of relying on the small /tmp file system.
Analyzing the status after it finished:
% ls -1 wise_4band_00_sorted_merged/ | wc -l
88
% du -hs wise_4band_00_sorted_merged/
241G ../wise_4band_00_sorted_merged
% wc -l wise_4band_00_sorted_merged/*.tsv | tail -1
All 88 files have reasonable content in their last lines.
Final merge of the 88 stream files into one large file:
% mkdir wise_4band_00_tmp
% nohup sort -m -s -k1 -T wise_4band_00_tmp/ -o wise_4band_00_sorted_merged.tsv wise_4band_00_sorted_merged/*&
Analyzing the status after it finished:
% ls -alh wise_4band_00_sorted_merged.tsv
-rw-rw-r-- 1 qserv qserv 241G Nov 20 18:49 wise_4band_00_sorted_merged.tsv
% wc -l wise_4band_00_sorted_merged.tsv
9479433101 wise_4band_00_sorted_merged.tsv
|
Splitting the large file into smaller chunks
% mkdir wise_4band_00_sorted_merged_split
% cd wise_4band_00_sorted_merged_split
% split -d -l 148116143 ../wise_4band_00_sorted_merged.tsv ''
Analyzing the status after it finished:
% ls -1 | wc -l
% du -hs .
241G .
% wc -l * | tail -1
9479433101 total

Total number of files: 64
Average file size: 3.8 GB
Lines per file: 148M
Rename folders and the 64 newly created TSV files to allow the previously developed tool to work:
% mv wise_4band_00 wise_4band_00_imported
% ln -s wise_4band_00_sorted_merged_split wise_4band_00
% cd wise_4band_00_sorted_merged_split
% for f in `ls`; do mv ${f} ${f}.tsv; done
Loading:
% cd /home/gapon
% nohup /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/load_secondary_index.bash \
    --verbose \
    >& load_secondary_index_load_only.log&
|
When finished:
% /bin/sudo -u qserv ls -alh /qserv/data/mysql/qservMeta | grep wise_4band_00
-rw-rw---- 1 qserv qserv 1014 Nov 19 21:28 wise_4band_00__allsky_4band_p1bs_psd.frm
-rw-rw---- 1 qserv qserv 537G Nov 21 04:21 wise_4band_00__allsky_4band_p1bs_psd.ibd

The log files are unremarkable.
The row count matches the number of objects in the tables:
SELECT COUNT(*) FROM qservMeta.wise_4band_00__allsky_4band_p1bs_psd;
9479433101
|
Cleaning up:
% cd /qserv/data/dumps
% df -h /data
Setting up the empty chunk list file
The file was generated using a trivial Python script based on the earlier-produced list of unique chunks:
# Build the set of chunk numbers that actually exist in the catalog.
chunks = set()
with open("all_chunks_unique_sorted.txt", "r") as f:
    for chunk in f:
        chunks.add(int(chunk))

# Print every chunk number in the valid range that is NOT populated.
minChunk = 0
maxChunk = 1000000
for chunk in range(minChunk, maxChunk):
    if chunk not in chunks:
        print(chunk)
The generator was used like this:
% cd /datasets/gapon/wise/allsky_4band_p1bs_psd/partitioned_x16
% python generate_empty_chunk_list.py > empty_wise_4band_00.txt
% wc -l empty_wise_4band_00.txt
853668 empty_wise_4band_00.txt
% wc -l all_chunks_unique_sorted.txt
146332 all_chunks_unique_sorted.txt
Then the resulting file was installed on the master node:
% /bin/sudo -u qserv \
    cp /datasets/gapon/wise/allsky_4band_p1bs_psd/partitioned_x16/empty_wise_4band_00.txt \
    /qserv/data/qserv/
Enabling the database in Qserv
% /bin/sudo -u qserv \
    /datasets/gapon/development/db_pdac_wise/scripts/run_qserv_enable_database.bash \
    --verbose \
    >& run_qserv_enable_database.log&

ATTENTION: Qserv needs to be restarted to recognize the database
Testing
Testing the number of objects returned by the shared scan query versus the one reported from the secondary index:
SELECT COUNT(*) FROM wise_4band_00.allsky_4band_p1bs_psd;
+----------------+
| SUM(QS1_COUNT) |
+----------------+
|     9479433101 |
+----------------+
1 row in set (11 min 24.86 sec)
CONCLUSION: the number matches the count of secondary index entries.
All three catalogs have been successfully loaded. Proceeding to DM-12523.
Woohoo!!
Loading catalog allsky_2band_p1bs_psd into PDAC
The loading protocol is explained in detail at:
The rest of this section presents a short summary of the actions and tests taken during this loading.
Dataset configuration
Database name: wise_2band_00
Do this at lsst-dev:
Creating database and initializing CSS
All these steps were conducted on the master node of the cluster lsst-qserv-master01:
--delete --verbose >& run_create_database.log&
>& setup_css.log
Configuring MySQL server and Docker container
lsst-qserv-master01: 14 / 16384
lsst-qserv-db01: 2 / 16384
lsst-qserv-db02: 2 / 16384
lsst-qserv-db03: 2 / 16384
...
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
% docker inspect qserv
{
},
{
},
{
}
],
Loading partitioned tables
--verbose \
>& run_load_object.log&
Testing results (30 log files in total):
[Sat Jun 10 13:59:28 CDT 2017] ** Finished loading **
[Sat Jun 10 13:50:58 CDT 2017] ** Finished loading **
[Sat Jun 10 13:39:20 CDT 2017] ** Finished loading **
[Sat Jun 10 13:33:27 CDT 2017] ** Finished loading **
[Sat Jun 10 13:28:02 CDT 2017] ** Finished loading **
[Sat Jun 10 13:28:10 CDT 2017] ** Finished loading **
[Sat Jun 10 13:24:06 CDT 2017] ** Finished loading **
[Sat Jun 10 13:22:07 CDT 2017] ** Finished loading **
[Sat Jun 10 13:12:26 CDT 2017] ** Finished loading **
[Sat Jun 10 13:12:15 CDT 2017] ** Finished loading **
[Sat Jun 10 13:16:00 CDT 2017] ** Finished loading **
[Sat Jun 10 13:14:34 CDT 2017] ** Finished loading **
[Sat Jun 10 13:18:59 CDT 2017] ** Finished loading **
[Sat Jun 10 13:11:46 CDT 2017] ** Finished loading **
[Sat Jun 10 13:13:28 CDT 2017] ** Finished loading **
[Sat Jun 10 13:25:58 CDT 2017] ** Finished loading **
[Sat Jun 10 13:30:49 CDT 2017] ** Finished loading **
[Sat Jun 10 13:22:43 CDT 2017] ** Finished loading **
[Sat Jun 10 13:21:04 CDT 2017] ** Finished loading **
[Sat Jun 10 13:23:58 CDT 2017] ** Finished loading **
[Sat Jun 10 13:25:21 CDT 2017] ** Finished loading **
[Sat Jun 10 13:38:21 CDT 2017] ** Finished loading **
[Sat Jun 10 13:37:53 CDT 2017] ** Finished loading **
[Sat Jun 10 13:31:47 CDT 2017] ** Finished loading **
[Sat Jun 10 13:25:03 CDT 2017] ** Finished loading **
[Sat Jun 10 13:27:11 CDT 2017] ** Finished loading **
[Sat Jun 10 13:25:20 CDT 2017] ** Finished loading **
[Sat Jun 10 13:34:24 CDT 2017] ** Finished loading **
[Sat Jun 10 13:43:53 CDT 2017] ** Finished loading **
[Sat Jun 10 13:57:27 CDT 2017] ** Finished loading **
Generating the secondary index (FIRST ATTEMPT)
Harvesting triplets:
--verbose \
>& run_dump_triplet.log&
Loading triplets into the index:
--verbose \
--get-remote-triplets \
>& load_secondary_index.log&
Performance degradation after loading 60% of TSV files
The number of files which have been loaded:
63104
SHOW PROCESSLIST;
Actually, the gradual performance degradation was seen much earlier. Later it became more obvious, before the loading literally stalled on file idx_object_200800.tsv as per the log file analysis:
The last file had been loading for at least 2 hours. The file is noticeably larger than the previous ones:
-rw-r--r-- 1 qserv qserv 23608289 Jun 10 20:54 idx_object_200796.tsv
-rw-r--r-- 1 qserv qserv 30216240 Jun 10 21:11 idx_object_200797.tsv
-rw-r--r-- 1 qserv qserv 44797664 Jun 10 21:28 idx_object_200798.tsv
-rw-r--r-- 1 qserv qserv 73420288 Jun 10 21:44 idx_object_200799.tsv
-rw-r--r-- 1 qserv qserv 148547683 Jun 10 22:02 idx_object_200800.tsv
-rw-r--r-- 1 qserv qserv 156758880 Jun 10 22:19 idx_object_200801.tsv
-rw-r--r-- 1 qserv qserv 75832749 Jun 10 22:35 idx_object_200802.tsv
-rw-r--r-- 1 qserv qserv 42320964 Jun 10 22:53 idx_object_200803.tsv
-rw-r--r-- 1 qserv qserv 30272316 Jun 10 23:10 idx_object_200804.tsv
Reconfiguring InnoDB
Further analysis of the database service configuration showed that the server runs with the default configuration of the InnoDB storage engine. This configuration is far from optimal for large bulk inserts, as seen with:
| Variable_name | Value |
| innodb_buffer_pool_dump_pct | 100 |
| innodb_buffer_pool_filename | ib_buffer_pool |
| innodb_buffer_pool_instances | 8 |
| innodb_buffer_pool_size | 134217728 |
Among those, the most problematic is the last parameter, innodb_buffer_pool_size, which is set to only 128 MB; MySQL suggests setting it to up to 80% of the machine's RAM.
Unfortunately this parameter is NOT DYNAMIC for the version of the MariaDB service run inside the Qserv container:
% mysql ...
Server version: 10.1.21-MariaDB Source distribution
The documentation suggests that the parameter is DYNAMIC as of version 10.2.2 or higher. Therefore the only solution is to abort the index loading, reconfigure the service, and resume the loading process from the file which takes too long to load.
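For reference, the reconfiguration applied below (innodb_buffer_pool_size=96G, innodb_buffer_pool_instances=1) corresponds to a my.cnf fragment along these lines (a sketch only; the 96G figure is taken from the diff below, and the exact config file path inside the container is not shown in this log):

```ini
[mysqld]
# Let InnoDB cache the bulk-loaded index pages in RAM.
# MySQL/MariaDB docs suggest up to ~80% of RAM on a dedicated host.
innodb_buffer_pool_size = 96G
# A single large pool instance; multiple instances mainly help
# highly concurrent workloads, not one bulk-load stream.
innodb_buffer_pool_instances = 1
```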
Aborting the loading:
Though, this didn't really kill the loading thread within the service, as per:
SHOW PROCESSLIST;
Besides, the server is still running on the same node as earlier:
Tasks: 485 total, 1 running, 484 sleeping, 0 stopped, 0 zombie
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4806 qserv 20 0 3242864 255704 4808 S 9.6 0.2 732:49.71 mysqld
...
Reconfiguring and restarting the service by logging into the Docker container:
# added two parameters
18,20d17
< innodb_buffer_pool_size=96G
< innodb_buffer_pool_instances=1
[ ok ] Reloading service MySQL.
# UNFORTUNATELY this did NOT stop the server
stop
# The previous operation got stuck as well. Killing the server process the brute-force way:
# Cleaning up the PID because of
[ ok ] MySQL running (27236).
total 4
drwxr-xr-x 2 qserv qserv 23 Jun 16 16:47 .
drwxr-xr-x 4 qserv qserv 99 Jun 14 11:39 ..
-rw-rw---- 1 qserv qserv 6 Jun 16 16:47 mysqld.pid
[FAIL] MySQL is not running, but lock exists ... failed!
total 0
drwxr-xr-x 2 qserv qserv 97 Jun 14 11:39 .
drwxr-xr-x 3 qserv qserv 19 Apr 27 17:41 ..
-rw-r--r-- 1 qserv qserv 0 Jun 14 11:39 cmsd
-rw-r--r-- 1 qserv qserv 0 Jun 14 11:38 mysql
-rw-r--r-- 1 qserv qserv 0 Jun 14 11:39 mysql-proxy
-rw-r--r-- 1 qserv qserv 0 Jun 14 11:39 qserv-watcher
-rw-r--r-- 1 qserv qserv 0 Jun 14 11:39 qserv-wmgr
-rw-r--r-- 1 qserv qserv 0 Jun 14 11:39 xrootd
[FAIL] MySQL is not running ... failed!
Starting MySQL
# However, the database daemon appears to be back:
UID PID PPID C STIME TTY TIME CMD
...
...
Testing the service and checking the new configuration from the host machine's shell:
Tasks: 484 total, 1 running, 483 sleeping, 0 stopped, 0 zombie
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
19440 qserv 20 0 0.103t 8.846g 10148 S 0.3 7.1 1:53.85 mysqld
NOTE: the new virtual size of the process.
Checking the configuration parameters:
SHOW VARIABLES LIKE 'innodb_buffer_pool%';
| Variable_name | Value |
| innodb_buffer_pool_dump_pct | 100 |
| innodb_buffer_pool_filename | ib_buffer_pool |
| innodb_buffer_pool_instances | 1 |
| innodb_buffer_pool_size | 103079215104 |
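As a sanity check, the reported value corresponds exactly to the configured 96 GiB expressed in bytes:

```shell
# 96 GiB expressed in bytes: 96 * 1024^3
echo $((96 * 1024 * 1024 * 1024))
# prints 103079215104
```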
Resuming the index loading
Test the number of objects in the secondary index table:
Checking the index loading status:
-rw-r--r-- 1 qserv qserv 0 Jun 16 11:35 load_idx_object_200797.tsv.log
-rw-r--r-- 1 qserv qserv 0 Jun 16 11:47 load_idx_object_200798.tsv.log
-rw-r--r-- 1 qserv qserv 0 Jun 16 12:18 load_idx_object_200799.tsv.log
-rw-r--r-- 1 qserv qserv 58 Jun 16 16:33 load_idx_object_200800.tsv.log
This is the DESIRED result, meaning that loading of the last TSV file was aborted and the transaction was rolled back, which is also supported by this message in the server's log file:
...
2017-06-16 16:57:18 7f969c974700 InnoDB: Rollback of non-prepared transactions completed
...
Making sure the new loading won't destroy the previously loaded content. This is the SQL DDL statement which creates the table:
Putting away TSV files which were successfully loaded as per the master log file:
63104
And now move the files for real (NOTE: the last file idx_object_200800.tsv must be put back because it was not fully loaded when the process was aborted):
done
63103
43788
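This step can be sketched as follows (the `tsv` and `tsv_loaded` directory names and the `loaded.txt` list of successfully loaded files are assumptions, not the actual paths used on the master node):

```shell
# Move every successfully loaded TSV file (hypothetical list: loaded.txt)
# into a holding folder, then return the last, partially loaded file
# so it gets reloaded from scratch.
mkdir -p tsv_loaded
while read f; do
    mv "tsv/$f" tsv_loaded/
done < loaded.txt
# idx_object_200800.tsv was aborted mid-load, so put it back for reloading
mv tsv_loaded/idx_object_200800.tsv tsv/
```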
Saved the previous log file and restarted the operation:
--verbose >& load_secondary_index.log&
Table size:
-rw-rw---- 1 qserv qserv 1014 Jun 14 12:02 wise_2band_00__allsky_2band_p1bs_psd.frm
-rw-rw---- 1 qserv qserv 334G Jun 16 18:36 wise_2band_00__allsky_2band_p1bs_psd.ibd
Disk space status:
Filesystem Size Used Avail Use% Mounted on
ATTENTION: the previously loaded files may need to be removed to free up disk space if needed:
Monitoring the loading process to see whether the new configuration has sped it up:
Tasks: 487 total, 1 running, 486 sleeping, 0 stopped, 0 zombie
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
19440 qserv 20 0 0.103t 8.945g 10412 S 11.0 7.1 2:36.26 mysqld
% iostat -k 10
...
0.23 0.00 0.06 1.39 0.00 98.33
sda 184.10 0.00 10396.05 0 103960
dm-0 183.60 0.00 10378.45 0 103784
0.22 0.00 0.06 1.44 0.00 98.28
sda 189.00 0.00 11197.50 0 111975
dm-0 189.10 0.00 11200.70 0 112007
...
The last operation FAILED due to disk space limitations. At this point the loading was stopped, and a major upgrade of the master node hardware was ordered as per JIRA [IHS-378].
The second attempt to generate the secondary index after the hardware upgrade
NOTE: this attempt will also involve an additional step of pre-sorting the contents of the TSV triplet files before loading them into the table. The main benefit of this is a significant table size reduction as demonstrated in DM-11027 (jump straight to the Conclusions section).
Verifying the status of the secondary index generation and cleaning it up
Compare the total number of loaded Objects versus the number of entries in the incomplete secondary index.
Obtaining the total number of loaded objects via Qserv proxy:
| 7337642955 |
Obtaining the number of the secondary index entries via a direct connection to the master's MySQL service:
| 6738261333 |
NOTE: it took about 3 hours to get the result of the query.
Wipe out the secondary index:
Get rid of the local (master) cache of the intermediate TSV files:
Pulling triplets from the worker nodes to master
ATTENTION: this operation assumes that all triplets created during the first (failed) attempt were still available. This was verified prior to proceeding with this stage; the details are not reflected here.
Harvesting triplets:
--verbose \
>& run_dump_triplet.log&
Verify if the dump completed on all worker nodes:
Loading triplets into the index:
--verbose \
--get-remote-triplets \
--no-triplets-loading \
>& pull_triplets.log&
Verifying the status of triplets after the file migration operation finished. Verifying that the total number of TSV files matches the number of partitions:
106891
Verifying that the total number of objects matches expectations (the actual counting was a bit more involved because the shell could not expand the argument list of 106891 files):
7337642955
CONCLUSION: both numbers match the expectations.
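A common way to do such counting without hitting the shell's argument-list limit is to stream NULL-terminated file names through xargs; a sketch, with the `tsv` directory name being an assumption:

```shell
# Count the total number of lines (objects) across all TSV triplet files
# without expanding 106891 names on a single command line.
find tsv -name 'idx_object_*.tsv' -print0 | xargs -0 cat | wc -l
```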
Pre-sorting and merging TSV files
NOTE: This stage is required due to the 30% reduction of the table size demonstrated in DM-11027. Also, the sort-merge has no significant effect on the total duration of the procedure.
Creating 64 groups of TSV files which will be processed in parallel (because the sort-merge procedure is CPU intensive):
106891 wise_2band_00_tsv64.txt
-rw-rw-r-- 1 qserv qserv 36809 Oct 27 15:42 00
...
-rw-rw-r-- 1 qserv qserv 35141 Oct 27 15:42 62
-rw-rw-r-- 1 qserv qserv 30009 Oct 27 15:42 63
106891 total
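Such grouping can be done with GNU split in round-robin mode; a sketch reproducing the 00..63 group names shown above (the exact command used is not recorded in the ticket):

```shell
# Distribute the 106891 file names from the master list into 64 roughly
# equal groups named 00..63 (round-robin, numeric suffixes, empty prefix).
split -n r/64 -d -a 2 wise_2band_00_tsv64.txt ""
```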
Sort in 64 streams and store the sorted files in a separate folder:
% date
Fri Oct 27 15:46:06 CDT 2017
Analyzing results after all 64 jobs finished:
% date
Fri Oct 27 16:25:52 CDT 2017
% ls -1 wise_2band_00_sorted | wc -l
106891
% du -hs wise_2band_00_sorted
187G wise_2band_00_sorted
Total run time of the stage: 40 min
The next step is to do the merge-sort. Due to a large number of input files this operation is split into 2 steps:
Parallel merge of 64 streams:
done
Fri Oct 27 17:54:27 CDT 2017
NOTE: the translation of newlines to NULL is required to feed NULL-terminated input file names into the sort tool. Also note that each stream uses its own temporary directory on the current disk instead of relying on the small /tmp file system.
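A per-stream merge along these lines (the stream number, directory names, and the numeric first-column sort key are assumptions made for illustration):

```shell
# Merge the pre-sorted TSV files of stream 00 in a single pass:
# feed NULL-terminated file names to sort, merge on the numeric first
# column, and keep temporaries in a per-stream directory on this disk.
ls wise_2band_00_sorted/00/*.tsv \
  | tr '\n' '\0' \
  | sort -m -k1,1n -T ./tmp00 --files0-from=- \
  > wise_2band_00_sorted_merged/00.tsv
```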
Analyzing the status after it finished:
Fri Oct 27 18:16:33 CDT 2017
64
187G wise_2band_00_sorted_merged/
7337642955 total
Total run time of the stage: 22 minutes
Final merge of the 64 stream files into a single large file:
Fri Oct 27 18:42:34 CDT 2017
Analyzing the status after it finished:
Fri Oct 27 20:47:04 CDT 2017
-rw-rw-r-- 1 qserv qserv 187G Oct 27 20:41 wise_2band_00_sorted_merged.tsv
7337642955 wise_2band_00_sorted_merged.tsv
Total run time of the stage: 2 hours
Splitting the large file into smaller chunks
Fri Oct 27 20:52:56 CDT 2017
Fri Oct 27 20:59:36 CDT 2017
Analyzing the status after it finished:
91
187G .
7337642955 total
Total number of files: 91
Average file size: 2.2 GB
Lines per file: 80663508
Total run time of the stage: 7 minutes
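The line-preserving split can be done with GNU split's l/N mode, which never breaks a line across chunks (the exact command is not recorded in the ticket; output names are assumptions):

```shell
# Split the sorted 187 GB file into 91 chunks without ever breaking a
# line; chunks come out as chunk_00 .. chunk_90.
split -n l/91 -d -a 2 wise_2band_00_sorted_merged.tsv chunk_
```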
Rename the folders and the newly created 91 TSV files so that the previously developed tool can work:
Loading:
Fri Oct 27 21:27:06 CDT 2017
--verbose \
>& load_secondary_index_load_only.log&
Monitoring the progress:
sdb 123.12 5.44 5.93 1444284 1573500
sdb 489.00 6.75 93.35 6 93
sdb 583.00 7.75 110.45 7 110
sdb 395.00 8.00 74.53 8 74
sdb 554.00 4.50 104.20 4 104
sdb 626.00 8.50 120.70 8 120
sdb 794.00 7.62 165.61 7 165
sdb 575.00 10.00 109.70 10 109
sdb 755.00 9.62 143.21 9 143
...
-rw-rw---- 1 qserv qserv 1014 Oct 27 03:40 wise_2band_00__allsky_2band_p1bs_psd.frm
-rw-rw---- 1 qserv qserv 6.3G Oct 27 21:33 wise_2band_00__allsky_2band_p1bs_psd.ibd
The operation finished within less than 7 hours.
The log files are unremarkable.
The row count matches the number of objects in tables:
| 7337642955 |
The size of the table files is:
-rw-rw---- 1 qserv qserv 1014 Oct 27 03:40 wise_2band_00__allsky_2band_p1bs_psd.frm
-rw-rw---- 1 qserv qserv 416G Oct 28 04:22 wise_2band_00__allsky_2band_p1bs_psd.ibd
Cleaning up:
Filesystem Size Used Avail Use% Mounted on
Setting up the empty chunk list file
The file was generated using a trivial Python script based on the earlier produced list of unique chunks:
The generator was used like this:
% python generate_empty_chunk_list.py > empty_wise_2band_00.txt
893059 empty_wise_2band_00.txt
106941 all_chunks_unique_sorted.txt
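The script itself is not attached to the ticket; a minimal reconstruction, assuming the chunk ID space runs from 0 to 999999 (consistent with 106941 populated + 893059 empty = 1000000):

```python
# generate_empty_chunk_list.py (a reconstruction, not the original script)
# Emits IDs of empty chunks: the complement of the populated-chunk list
# within the assumed total of 1,000,000 chunk IDs.

TOTAL_CHUNKS = 1000000  # assumption: 106941 populated + 893059 empty

def empty_chunks(populated, total=TOTAL_CHUNKS):
    """Return sorted IDs of chunks absent from the populated list."""
    populated = set(populated)
    return [c for c in range(total) if c not in populated]

# Usage (file name as in the ticket):
#   populated = [int(line) for line in open('all_chunks_unique_sorted.txt')]
#   for c in empty_chunks(populated):
#       print(c)
```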
Then installed the resulting files on the master node:
Enabling the database in Qserv
--verbose \
>& run_qserv_enable_database.log&
ATTENTION: Qserv needs to be restarted to recognize the database