Details
- Type: Improvement
- Status: Done
- Resolution: Done
- Fix Version/s: None
- Component/s: None
- Labels: None
- Story Points: 8
- Epic Link:
- Sprint: DB_F22_6
- Team: Data Access and Database
Description
The problem to be addressed
Many services provided by the REST API of the Qserv Replication/Ingest system for the dependent workflows require passing various JSON objects specifying the scope and parameters of a request. The schema of such objects depends on the current version of the REST API. A similar problem exists for services invoked via the HTTP method GET. In this case, the names and values of the request's query may also depend on the version of the API. The result JSON objects that are returned by many services are also version-specific.
The dependency of clients on a specific version of the API is especially important at the current stage of development, in which the Replication/Ingest system is still evolving. The API has to change as new (or adjusted) functions are added to the system in light of the experience gained from ingesting a variety of data products at large scale into Qserv. Such changes still happen frequently (once a month, or once every few months), and the ingest workflows have to adjust to the new state of the API each time. To help the workflows, the API has a special REST service provided by the Master Replication Controller and the worker Replication/Ingest services:
GET /meta/version
This service is documented at: https://confluence.lsstcorp.org/display/DM/Ingest%3A+11.1.1.2+Obtaining+the+current+version+of+the+API. According to that document, ingest workflows are advised to call this service to check the current version of the API before interacting with the API any further. Knowing the actual version allows a workflow to adjust the parameters it sends to the API, or to refuse to operate if the current version of the API is incompatible with the workflow's expectations.
This mechanism does not cover another class of JSON configuration objects: those stored in JSON files. These files are used for configuring databases, tables, indexes, and other artifacts created or managed via the API. Normally, the files are created with a specific version of the API in mind, and they may be reused later for re-ingesting the same artifacts in various Qserv instances. Later attempts to reuse the files for ingesting catalogs into (or managing existing artifacts in) an upgraded Qserv and Replication/Ingest system may fail unexpectedly because of hidden incompatibilities between the outdated schemas of the JSON objects and the new API. Quite often it's impossible to solve this problem at the level of the ingest workflows, which are unaware of the API version implied by these objects. A similar problem exists with the JSON objects composed by the ingest workflows at run-time.
Another issue with the current versioning model is that each evolution of the API (which increments its version) updates only a small subset of methods. The API presently has over 50 methods, of which fewer than half are used by the known ingest workflows. Besides those, there are other non-ingest applications, such as the Qserv Web Dashboard and the monitoring applications, that depend on the API.
This effort is meant to address the problems described above by introducing an extended per-method versioning mechanism into the interaction model between the ingest workflows and the Replication/Ingest system's services.
The proposed solution
In addition to the existing "rigid" version match, clients will be given a more flexible mechanism for checking and enforcing the versions of the methods they call. In the new model, when calling a method, a client may also send an optional parameter carrying the version number that the method is expected to support. This number doesn't have to be the same as the version of the whole API. Note that different methods of the API may evolve (if they have to evolve at all) at different rates. Therefore, a given method should be able to support the following range of versions (inclusive at both ends of the interval):
- from: the version number when the expected schema of the service's input or output was modified
- to: the current version of the REST API
Therefore, for a call to be accepted by a method, the version number sent by the client doesn't need to be an exact match for the current version of the API. The request is accepted as long as the number falls within the above-mentioned range. This has a clear benefit for the ingest workflows: it reduces the need for frequent changes in the workflows made simply because the API version was incremented due to changes in (or the addition of) unrelated REST services. A similar benefit applies to the non-ingest clients.
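The acceptance rule amounts to an inclusive range check. Below is a minimal sketch of it; the function and parameter names are hypothetical and are not taken from the Replication/Ingest code base, and the version numbers in the usage lines are made up for illustration.

```python
def is_version_accepted(requested: int, min_version: int, current_version: int) -> bool:
    """Return True if 'requested' lies in [min_version, current_version].

    min_version     - the version at which the method's input/output schema was modified
    current_version - the current version of the REST API
    """
    return min_version <= requested <= current_version


# Hypothetical numbers: the method's schema changed in version 12,
# and the API as a whole is now at version 15.
print(is_version_accepted(13, min_version=12, current_version=15))  # client written for 13
print(is_version_accepted(11, min_version=12, current_version=15))  # client written for 11
```

With these numbers a client written for version 13 keeps working through API versions 14 and 15, while a client written for version 11 is turned away.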
If the version number sent to the service falls outside the supported range, the service responds with the following error:
"error": "The requested version <num> of the API is not in the range supported by the service.",
"error_ext": {
  "min_version": <min-num-inclusive>,
  "max_version": <the-current-version-of-the-api>
},
For the sake of compatibility with existing applications, this mechanism has been made optional. If no version info is passed to a method then the API will return a warning in the resulting JSON object, along with the version range accepted by the method:
"warning": "No version number was provided in the request's query/body.",
The called service will still attempt to execute the request. If such a request fails, however, it is harder to determine whether the failure was caused by a version incompatibility or by something else. The hope is that all clients of the API will eventually be migrated to the extended versioning interface.
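On the client side, the version-related outcomes can be told apart by inspecting the parsed JSON reply. The sketch below is an illustration only: the function name and the outcome labels are hypothetical, and it assumes that "error", "error_ext", and "warning" appear as top-level keys of the reply, as the fragments in this ticket suggest. The ticket does not show which field carries the version range in the warning case, so the sketch does not try to read it.

```python
from typing import Any, Dict, Optional, Tuple


def classify_response(response: Dict[str, Any]) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Classify a parsed JSON reply with respect to version handling.

    Returns (outcome, supported_range), where outcome is one of:
      "version_rejected" - the service refused the version sent by the client
      "unversioned"      - the request carried no version number
      "no_version_issue" - nothing version-related was reported
    """
    error_ext = response.get("error_ext")
    if (response.get("error")
            and isinstance(error_ext, dict)
            and all(key in error_ext for key in ("min_version", "max_version"))):
        return "version_rejected", (error_ext["min_version"], error_ext["max_version"])
    if response.get("warning"):
        return "unversioned", None
    # The request may still have failed for reasons unrelated to versioning.
    return "no_version_issue", None
```

A workflow could, for example, log the "unversioned" outcome as a reminder to migrate, and abort with the reported range on "version_rejected".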
The extended mechanism also allows per-method version detection by means of the special version 0. An advanced ingest workflow written for, say, version N can send "probes" with version=0 to the REST methods it requires during a pre-flight compatibility check. For such requests the REST API is guaranteed not to execute the methods; it only returns the error message described above, along with the range of versions supported by each "probed" method. The workflow can then check whether its version N falls within the range of each method and act accordingly.
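A pre-flight check of this kind could look like the sketch below. It is an illustration under stated assumptions: the transport is abstracted as a caller-supplied `probe` function (hypothetical) that calls one method with version=0 and returns the parsed JSON reply, and the method names and ranges in the usage lines are made up.

```python
from typing import Callable, Dict, Iterable, Tuple

# A probe calls the given method with version=0 and returns the parsed JSON reply.
Probe = Callable[[str], dict]


def incompatible_methods(client_version: int,
                         methods: Iterable[str],
                         probe: Probe) -> Dict[str, Tuple[int, int]]:
    """Return the methods whose supported version range excludes client_version."""
    rejected = {}
    for method in methods:
        ext = probe(method)["error_ext"]
        low, high = ext["min_version"], ext["max_version"]
        if not (low <= client_version <= high):
            rejected[method] = (low, high)
    return rejected


# Stand-in for real HTTP calls; the paths and ranges are invented for the example.
fake_ranges = {"GET /svc-a": (10, 15), "POST /svc-b": (14, 15)}


def fake_probe(method: str) -> dict:
    low, high = fake_ranges[method]
    return {"error": "...", "error_ext": {"min_version": low, "max_version": high}}


print(incompatible_methods(12, fake_ranges, fake_probe))
```

An empty result means every probed method accepts the workflow's version; otherwise the workflow knows exactly which methods block it and which ranges they accept.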
In general, it's expected that clients of the API will send the version number for which the client was written (or last updated).
Passing the version number
The way the optional version number is passed to a service depends on the HTTP method required for calling the service. For all POST, PUT, and DELETE requests the version is expected in the body of the request:
{"version": <number>,
 ...
}
For the GET method, the version number is looked up in the request's query string (a part of the request's URL):
http://<host>:<port>/<svc-path>?version=<num>
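A client helper that places the version where each kind of request expects it might look like the following sketch. The helper name, the example host, and the "database" parameter are hypothetical; the sketch also assumes the URL passed in has no query string of its own.

```python
import json
from typing import Any, Dict, Optional, Tuple
from urllib.parse import urlencode


def build_request(http_method: str, url: str, version: int,
                  params: Optional[Dict[str, Any]] = None) -> Tuple[str, Optional[bytes]]:
    """Return (url, body) with the version number placed according to the HTTP method."""
    payload = dict(params or {})
    payload["version"] = version
    if http_method.upper() == "GET":
        # GET: the version travels in the query string; there is no body.
        return f"{url}?{urlencode(payload)}", None
    # POST, PUT, DELETE: the version travels in the JSON body.
    return url, json.dumps(payload).encode("utf-8")


print(build_request("GET", "http://example.org:8080/svc", 12))
print(build_request("POST", "http://example.org:8080/svc", 12, {"database": "db1"}))
```

Keeping this logic in one place means a workflow only has to record the version it was written for once.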
Further discussion
The clear benefits of this proposal are:
- The unified approach to the version handling would work equally for version numbers sent in the request's query, in the request's body originating from some pre-existing JSON file, or formed by the caller at run-time.
- Backward compatibility with existing clients (no need to rewrite those right away).
- The ongoing support for the existing (fixed) version checking on the client's side.
The proposed extension merely adds an optional "version negotiation" step, which a client is free to use or not.
As for alternatives, the one most frequently used by other applications is to put the version number into the name of the corresponding resource:
GET /v1/service1
PUT /v1/service2
POST /v3/service3
DELETE /v2/service4
etc.
Though this might look similar to what's proposed in this document, a version-based naming convention would have several disadvantages compared with this proposal. In particular:
- It won't allow the automatic version discovery for the services at run-time (one would have to manually read the documentation or check the source code of the services).
- It won't help with the versioning of the JSON files, for which a redundant (and potentially conflicting) versioning mechanism would have to be implemented ("back to square one" of this ticket's goal).
- And finally, it would complicate the implementation of the REST servers, as it would require the service router (the one that gives names to the resources) to be aware of the implementation of the services. Note that no such problem exists in the proposed solution, as all version-specific details are confined within the code of the modules themselves. The names of the services are now orthogonal to their functionality.
PRs: