Fix Version/s: None
Team: Data Access and Database
The goal of this ticket is to enable communication between CC-IN2P3 and the Qserv team in order to prepare for Pan-STARRS data ingestion into Qserv. This data ingestion step is necessary for the large-scale tests of Qserv foreseen for summer 2016.
Specifically, we need to understand:
- What is the size of the data set to be imported to CC-IN2P3?
- Where is the Pan-STARRS data set to be imported currently located?
- What mechanisms will the host of Pan-STARRS data make available to CC-IN2P3 for downloading the data set?
- Does the envisaged ingestion mechanism into Qserv require that the data transit through the Qserv master server, or will each Qserv worker be able to ingest its own chunk of data?
- After the ingestion process is finished, do we need to keep a copy of the ingested data outside of Qserv?
Given the likely size of the data set involved, this project will probably require that we (both Qserv and CC-IN2P3 experts) set up specific mechanisms and equipment for efficient transport, storage and ingestion of these data. Timely planning and several testing campaigns seem necessary for this project to make progress.