Status: To Do
Fix Version/s: None
Sprint:DB_F20_09, DB_S21_12, DB_F21_06
Team:Data Access and Database
Many complex queries in Qserv would benefit from having persistent engine-independent table-level statistics. The technique is explained for MariaDB in https://mariadb.com/kb/en/engine-independent-table-statistics/
The statistic collection phase is a data-intensive operation that requires the MySQL/MariaDB server to read the content of each table. Altogether it could become a lengthy procedure for large catalogs. Based on early experimentations with query ANALYZE TABLE, the performance of each such statement is about 15 MB/s for the table read operations (measured by iostat) while consuming 100% CPU. For a typical 12-cores worker machine, this translates into approximately 200 MB/s of the worker-level performance. The resulting catalog-level performance would depend on the total size of the catalog and the number of worker machines in the catalog (NOTE: an assumption here is that the underlying data filesystem is based on a decent hardware RAID).
Hence, the main goal of the proposed effort is to extend the Replication system to allow analyzing and retrieving statistics. The main interface for the new tools will be the REST API of the Master Replication Controller.
The following REST services will be added into the system:
|POST||/replication/sql/table/stat||Build or re-build the statistics for a select table of a catalog|
|GET||/replication/sql/table/stat||Return info on the table statistics (if any)|
|DELETE||/replication/sql/table/stat||Delete table statistics (if any)|
In addition, there will be a command-line tool that would implement similar functionality.
- The operations implemented in this context would be allowed on the published catalogs only.
- All three above-mentioned services would allow extra flexibility of choosing a scope of the operation, which could be one of:
- A specific group of tables, the regular one or the partitioned (implying all chunks of a partitioned table, including replicas) distributed across all workers.
- A subset of the partitioned tables (to differentiate between chunk or overlap tables)
- A single table like Object of chunk 123 (including all replicas).
- The GET method would also support an option for returning results at various levels of utilization, including:
- (default) A short YES/NO status of the availability of stats for each table.
- An extended status that included a table and index-level info.
- And complete status including column-level stats ( This option would be available only for a single table due to a large amount of data returned for each such table).
- (possibly) A column-level stat accros a specific group of tables.