Details
Type: Improvement
Status: Done
Resolution: Done
Fix Version/s: None
Component/s: Qserv
Labels: None
Epic Link:
Sprint: DB_S22_12
Team: Data Access and Database
Description
The problem
The current implementation of the query processor in the Qserv czar does not retain any specific error information (beyond the general FAILED or ABORTED status) for failed queries in the czar's database. Error conditions are only written to the logging stream (LSST Logger) and/or reported directly to the users who initiated the queries. This model complicates later analysis and tracking of failures by Qserv data administrators and others. This ticket is meant to address this problem.
In the current implementation of Qserv, errors are recorded in the temporary tables qservResult.message_<message-id>, which are deleted immediately after the error conditions are reported to users.
Two improvements to the error processing logic are proposed below.
Extended status codes
In addition to the above-mentioned status codes FAILED or ABORTED, add specific codes for the following error conditions:
- PARSER_ERROR: for queries that weren't successfully parsed
- LARGE_RESULT: for queries that failed because they exceeded the result set limit set in the Qserv configuration
- COMM_ERROR: for queries that failed due to persistent (non-recoverable) communication errors with the workers
- WORKER_ERROR: for queries that failed due to error conditions reported by the workers
More error codes for the well-defined conditions could be added if needed.
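The proposed codes could be modeled as a closed enumeration, so that reports can group failures by code instead of parsing message text. The following is only a minimal sketch in Python: the class name QueryStatus, the function classify_failure, and the cause strings are hypothetical and are not taken from the Qserv code base. Only the status names come from the list above.

```python
from enum import Enum


class QueryStatus(Enum):
    """Final query statuses named in this ticket (hypothetical Python model)."""

    FAILED = "FAILED"
    ABORTED = "ABORTED"
    PARSER_ERROR = "PARSER_ERROR"
    LARGE_RESULT = "LARGE_RESULT"
    COMM_ERROR = "COMM_ERROR"
    WORKER_ERROR = "WORKER_ERROR"


# Hypothetical failure causes; these keys are illustrative assumptions,
# not identifiers used by Qserv.
_CAUSE_TO_STATUS = {
    "parse": QueryStatus.PARSER_ERROR,
    "result_too_large": QueryStatus.LARGE_RESULT,
    "worker_comm": QueryStatus.COMM_ERROR,
    "worker_reported": QueryStatus.WORKER_ERROR,
}


def classify_failure(cause: str) -> QueryStatus:
    """Map a failure cause to a specific status, falling back to FAILED."""
    return _CAUSE_TO_STATUS.get(cause, QueryStatus.FAILED)
```

Falling back to the general FAILED status keeps the scheme open to the additional codes mentioned above: an unrecognized cause is still recorded, just without a specific code.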
Error messages
Extend the table schema of qservMeta.QInfo by adding a text column error_message that will store a copy of the error messages from the tables qservResult.message_<message-id>.
As an alternative, investigate the possibility of replacing the temporary tables qservResult.message_<message-id> with permanent ones.
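The first option might look roughly like the sketch below. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in so that the example is self-contained; it does not reflect the database engine or the real schema behind qservMeta.QInfo. The columns queryId and status, the helper record_failure, and the sample message are assumptions made for illustration. Only the table name QInfo and the column error_message come from the proposal.

```python
import sqlite3

# In-memory stand-in for the czar's metadata database (assumption).
conn = sqlite3.connect(":memory:")

# Simplified, hypothetical version of the existing table.
conn.execute(
    "CREATE TABLE QInfo (queryId INTEGER PRIMARY KEY, status TEXT NOT NULL)"
)

# The proposed schema change: a nullable text column for the error message.
conn.execute("ALTER TABLE QInfo ADD COLUMN error_message TEXT")


def record_failure(db, query_id, status, message):
    """Store the final status and a copy of the error message for one query."""
    db.execute(
        "UPDATE QInfo SET status = ?, error_message = ? WHERE queryId = ?",
        (status, message, query_id),
    )
    db.commit()


# Usage: a query is registered, then fails and is recorded with details.
conn.execute("INSERT INTO QInfo (queryId, status) VALUES (?, ?)", (1, "EXECUTING"))
record_failure(conn, 1, "PARSER_ERROR", "syntax error near 'FORM'")

row = conn.execute(
    "SELECT status, error_message FROM QInfo WHERE queryId = 1"
).fetchone()
```

Keeping the message in the same row as the status means a single query over the table can answer both "how did it fail" and "why", without depending on a temporary table that may already have been dropped.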
Other improvements
It would be nice to retain:
- the spatial coverage of the queries (the number of chunks involved)
- the number of rows in a result set
- the number of bytes in the result set
The Web Dashboard will be extended to display the additional info. Igor Gaponenko will work on that.
Issue Links
- blocks
  DM-34784 Add "SELECT" user queries that fail parsing or analysis to QInfo and QMessages. (In Progress)
- is triggering
  DM-36223 The byte and row counter columns in Qserv's QInfo should be 64-bit integers instead of 32-bit ones (Done)
  DM-35188 Functionality and performance improvement of the Qserv query monitor in the Dashboard (Done)
A few comments.
I really hate parsing arbitrary text strings when producing reports from the database tables. Note that this table is mostly meant to be used by us (not by users) for (presumably automated) error analysis.