DM-14122 (Data Management)

Make LTD Keeper PATCH /builds/<id> asynchronous

    Details

      Description

      After a client uploads documentation to LTD’s S3 bucket, it must send a PATCH /builds/<id> request to tell the API that the build is complete. LTD Keeper then determines whether the new build corresponds to an edition. If so, it copies the build over to become the new edition in the S3 bucket. For large sites this takes a significant amount of time because the data is actually copied within the S3 bucket. As a result, the PATCH call routinely times out and produces a 503 server error.

      The solution is to make this post-processing asynchronous with respect to the client API handling. LTD Keeper is a Flask app, and we already have experience using Celery in Kubernetes to implement asynchronous tasks for Flask apps.

      The goals of this ticket are:

      1. Add Celery to LTD Keeper and its Kubernetes deployment.
      2. Refactor the edition update into a Celery task.

          Activity

          jsick Jonathan Sick added a comment -

          Changelog:

          This release includes the Celery task queuing system and major internal updates to the application structure and dependencies.

          DM-14122.

          API updates

          • Endpoints that launch asynchronous queue tasks now provide a queue_url field. This is a URL to an endpoint that provides status information on the queued task. For example, after PATCHing an edition with a new build, you can watch the queue_url to see when the rebuild is complete. The queue_urls are provided by the new GET /queue/(id) endpoint.
          • We don't yet provide a way to query the queue in general; you can only get a queue_url by being the user that triggered the task.
          • Endpoints, especially PATCH /editions/(id), should no longer time out (500 error) for large documentation projects.
          • The /editions/(id) resource includes a new pending_rebuild field. This field acts as a semaphore and is set to true if there is a pending rebuild task. You can't PATCH the edition's build_url when pending_rebuild is true. If necessary, an operator can PATCH pending_rebuild to false if the Celery task that rebuilds the edition failed.
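
As an illustration of watching a queue_url, here is a minimal client-side polling sketch. The response shape is an assumption: the changelog does not specify the JSON returned by GET /queue/(id), so the "status" key and the Celery-style state names used below are hypothetical, and authentication is left out entirely.

```python
import json
import time
import urllib.request


def fetch_json(url):
    """GET a URL and decode its JSON body (authentication is not handled)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_task(queue_url, fetch=fetch_json, interval=2.0, timeout=300.0,
                  sleep=time.sleep):
    """Poll queue_url until the task reports a terminal status.

    ASSUMPTION: the endpoint returns a JSON object with a "status" key whose
    terminal values are "SUCCESS" and "FAILURE". Adjust to the real schema.
    """
    deadline = time.monotonic() + timeout
    while True:
        info = fetch(queue_url)
        status = info.get("status")
        if status in ("SUCCESS", "FAILURE"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task at {queue_url} is still {status!r}")
        sleep(interval)
```

The fetch and sleep parameters are injectable so the loop can be exercised without a network or real delays. A client would typically call wait_for_task() with the queue_url from the PATCH response before relying on the rebuilt edition.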

          Deployment updates

          • New deployment: keeper-redis. This deployment consists of a single Redis container (official redis:4-alpine image). There is no persistent storage or high availability at this time (this was judged a fair trade-off since the Celery queue is inherently transient).
          • New service: keeper-redis. This service fronts the keeper-redis deployment.
          • New deployment: keeper-worker-deployment. This deployment mirrors keeper-deployment, except that the run command starts a Celery worker for the LTD Keeper application. This deployment can be scaled up to provide additional workers. The keeper-worker-deployment is not fronted by a service since the Celery workers pull tasks from keeper-redis.
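
To show how both keeper-deployment and keeper-worker-deployment could locate the broker, here is a small sketch that builds a Redis URL for the keeper-redis service. This is not taken from LTD Keeper's configuration. It assumes the standard Kubernetes service environment variables (KEEPER_REDIS_SERVICE_HOST and KEEPER_REDIS_SERVICE_PORT), falls back to the service's DNS name, and the function name and database index are hypothetical.

```python
import os


def redis_broker_url(env=os.environ, db=0):
    """Build a redis:// URL for the keeper-redis service.

    ASSUMPTION: Kubernetes injected the service variables for a service
    named "keeper-redis"; otherwise fall back to cluster DNS and the
    default Redis port.
    """
    host = env.get("KEEPER_REDIS_SERVICE_HOST", "keeper-redis")
    port = env.get("KEEPER_REDIS_SERVICE_PORT", "6379")
    return f"redis://{host}:{port}/{db}"


print(redis_broker_url({}))  # → redis://keeper-redis:6379/0
```

A URL of this form could be given to Celery as its broker setting in both the web and worker containers, which is why the workers need no service of their own: they only make outbound connections to Redis.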

          Internal updates

          • Dependency updates:
              • Flask 0.12.2
              • Requests 2.18.4
              • uwsgi 2.0.17
              • Flask-SQLAlchemy 2.3.2
              • PyMySQL 0.8.0
              • Flask-Migrate 2.1.1
          • Switched from Flask-Script to flask.cli. The Makefile now fronts most of the Flask commands for convenience during development. Run make help to learn more.
          • Application architecture improvements:
              • Moved the Flask application factory out of __init__.py to keeper.appfactory.
              • Moved the get_auth_token route to the api blueprint.
              • Moved DB connection object to keeper.models.db.
              • Added Product.from_url() and Edition.from_url() methods for consistency with Build.from_url().
          • Logging updates:
              • Now we specifically set up the keeper logger instead of the root logger. This keeps things manageable when turning on debug-level logging.
              • New app configuration for logging level. Debug-level logging is used in the development and testing profiles, while info-level logging is used in production.
          • New Celery app factory in keeper.celery.
          • New Celery task queuing infrastructure in keeper.taskrunner. In a request context, application code can add an asynchronous task by calling append_task_to_chain() with a Celery task signature. These task signatures are persisted, within the request context, in flask.g.tasks. Just before a route handler returns, it should call launch_task_chain(), which launches the task chain asynchronously. The advantage of this whole-context chain is that it orders asynchronous tasks: editions are rebuilt before the dashboard is created. If a task is known to be fully independent of other tasks, it could just be launched immediately.
          • New Celery tasks:
              • keeper.tasks.editionrebuild.rebuild_edition(): copies a build on S3 onto the edition.
              • keeper.tasks.dashboardbuild.build_dashboard(): triggers LTD Dasher.
          • Replaced Edition.rebuild() with Edition.set_pending_rebuild to use the new rebuild_edition task.
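
The request-scoped chain described above can be modelled with the standard library alone. This is a simplified sketch of the pattern, not the keeper.taskrunner implementation: a SimpleNamespace stands in for flask.g, plain callables stand in for Celery task signatures, and a single-worker thread pool stands in for the Celery worker. The function bodies and arguments are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from types import SimpleNamespace


def append_task_to_chain(ctx, func, *args):
    """Queue a callable on the request-scoped context (stand-in for flask.g)."""
    if not hasattr(ctx, "tasks"):
        ctx.tasks = []
    ctx.tasks.append((func, args))


def launch_task_chain(ctx, executor):
    """Submit the queued tasks as one job that runs them in insertion order."""
    tasks = list(getattr(ctx, "tasks", []))
    ctx.tasks = []

    def run_chain():
        return [func(*args) for func, args in tasks]

    return executor.submit(run_chain)


# Hypothetical stand-ins for the real Celery tasks.
def rebuild_edition(edition_id):
    return f"rebuilt edition {edition_id}"


def build_dashboard(product):
    return f"built dashboard for {product}"


def handle_patch(executor):
    """Model of a route handler: queue work, then launch once before returning."""
    g = SimpleNamespace()  # stand-in for flask.g
    append_task_to_chain(g, rebuild_edition, 42)
    append_task_to_chain(g, build_dashboard, "example-product")
    return launch_task_chain(g, executor)


with ThreadPoolExecutor(max_workers=1) as pool:
    future = handle_patch(pool)
    print(future.result())
    # → ['rebuilt edition 42', 'built dashboard for example-product']
```

Because the handler launches the chain exactly once, the dashboard build always runs after the edition rebuild, however many helper functions appended tasks during the request. With Celery itself, the analogous ordering would come from chaining task signatures rather than from a thread pool.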
          jsick Jonathan Sick added a comment -

          Released as version 1.9.0.


            People

            Assignee:
            jsick Jonathan Sick
            Reporter:
            jsick Jonathan Sick