Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-2176

Worker management service - impl

    XMLWordPrintable

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: Qserv
    • Labels:
      None

      Description

      We need to replace direct worker-mysql communication and other administrative channels with a special service which will control all worker communication. Some light-weight service running alongside other worker servers, probably HTTP-based. Data loading, start/stop should be handled by this service.

        Attachments

          Issue Links

            Activity

            Hide
            salnikov Andy Salnikov added a comment -

            There are few issues related to data loading via HTTP:

            • data volume loaded in each request is potentially large
            • data is formatted currently (and likely in the future) according to representation accepted by LOAD DATA INFILE ..., together with the data we need to pass some extra info (line terminators, column separators, etc.)
            • with large data volumes there is a chance that connection could be broken while data is transferred from client to worker, this should result in clean rollback as if no data were loaded at all so that clients can retry and succeed
              • one simple way to achieve that is to store loaded data into a temporary file and then load it using LOAD DATA INFILE ....
            • we probably want to avoid unnecessary encoding/decoding of the data if possible at all (except for encryption which will be required)
              • if we only need to send file data then we could use application/octet-stream content type to transfer data in raw format
              • but since we also need to pass extra info we'll likely need to use multipart/form-data for request (combined with application/octet-stream for file data and some other content type for other parameters}}
              • I'm not sure if this could result in a performance hit, need to investigate

            It clearly needs some research, I think I'll postpone the implementation of the data loading part and open new story for it.

            Show
            salnikov Andy Salnikov added a comment - There are few issues related to data loading via HTTP: data volume loaded in each request is potentially large data is formatted currently (and likely in the future) according to representation accepted by LOAD DATA INFILE ... , together with the data we need to pass some extra info (line terminators, column separators, etc.) with large data volumes there is a chance that connection could be broken while data is transferred from client to worker, this should result in clean rollback as if no data were loaded at all so that clients can retry and succeed one simple way to achieve that is to store loaded data into a temporary file and then load it using LOAD DATA INFILE ... . we probably want to avoid unnecessary encoding/decoding of the data if possible at all (except for encryption which will be required) if we only need to send file data then we could use application/octet-stream content type to transfer data in raw format but since we also need to pass extra info we'll likely need to use multipart/form-data for request (combined with application/octet-stream for file data and some other content type for other parameters}} I'm not sure if this could result in a performance hit, need to investigate It clearly needs some research, I think I'll postpone the implementation of the data loading part and open new story for it.
            Hide
            salnikov Andy Salnikov added a comment -

            Fabrice, Jacek, can I ask two of you to review the ticket? I propose to split it as:

            • Fabrice: anything under admin/ (adding one more service)
            • Jacek: anything under core/modules/wmgr/ (Flask-based stuff)

            PR: https://github.com/lsst/qserv/pull/52

            It's functional, but it's not quite feature-complete, there is a separate ticket to implement data loading part, and some development on xrootd side is needed to implement remaining xrootd requests. And of course client part for this new service will be done separately.

            Show
            salnikov Andy Salnikov added a comment - Fabrice, Jacek, can I ask two of you to review the ticket? I propose to split it as: Fabrice: anything under admin/ (adding one more service) Jacek: anything under core/modules/wmgr/ (Flask-based stuff) PR: https://github.com/lsst/qserv/pull/52 It's functional, but it's not quite feature-complete, there is a separate ticket to implement data loading part, and some development on xrootd side is needed to implement remaining xrootd requests. And of course client part for this new service will be done separately.
            Hide
            jammes Fabrice Jammes added a comment -

            Maybe do something about the dropDB function, or open a ticket about it?

            I don't like to have duplicated code to validate table name. Some people may also implement it in a wrong way or forget to do it, the DB package should have the responsability of it.

            Show
            jammes Fabrice Jammes added a comment - Maybe do something about the dropDB function, or open a ticket about it? I don't like to have duplicated code to validate table name. Some people may also implement it in a wrong way or forget to do it, the DB package should have the responsability of it.
            Hide
            salnikov Andy Salnikov added a comment -

            Thinking about it again I believe that name validation should NOT be in lsst.db. That kind of knowledge may be too specific and database-dependent (or even policy-dependent) so we should keep that one level higher (in the web service).

            Jacek Becla, do you still want to review PR?

            Show
            salnikov Andy Salnikov added a comment - Thinking about it again I believe that name validation should NOT be in lsst.db . That kind of knowledge may be too specific and database-dependent (or even policy-dependent) so we should keep that one level higher (in the web service). Jacek Becla , do you still want to review PR?
            Hide
            jbecla Jacek Becla added a comment -

            I'd like to but I won't have time today, so if i am blocking you just go ahead!

            Show
            jbecla Jacek Becla added a comment - I'd like to but I won't have time today, so if i am blocking you just go ahead!
            Hide
            salnikov Andy Salnikov added a comment -

            Well, I can still merge it even from CERN, so there is no rush if you can still review it before the end of the sprint.

            I'll switch it back under review for now.

            Show
            salnikov Andy Salnikov added a comment - Well, I can still merge it even from CERN, so there is no rush if you can still review it before the end of the sprint. I'll switch it back under review for now.
            Hide
            jbecla Jacek Becla added a comment -

            This comment is triggered because I saw _specialDbs in dbMgr.py.

            Just checking.. are we assuming that entire mysqld space belongs to the worker? I am asking because at some point we were considering having some prefix (like a unique worker id) for each database managed by a given worker, which would allow non-qserv managed databases to happily coexist. But I do realize it is much simpler if we require that entire mysqld data_dir space is ours.

            Show
            jbecla Jacek Becla added a comment - This comment is triggered because I saw _specialDbs in dbMgr.py. Just checking.. are we assuming that entire mysqld space belongs to the worker? I am asking because at some point we were considering having some prefix (like a unique worker id) for each database managed by a given worker, which would allow non-qserv managed databases to happily coexist. But I do realize it is much simpler if we require that entire mysqld data_dir space is ours.
            Hide
            jbecla Jacek Becla added a comment -

            I noticed few times I paused and thought when seeing things like "dbCreate": are we actually creating a database here (e.g. create database x), or are we just creating metadata for it? Perhaps changing names to something like "dbCreateInMeta", or using a word "register" instead of "create" when we are creating db in metadata would help?

            Same comment about deleteDb.

            Show
            jbecla Jacek Becla added a comment - I noticed few times I paused and thought when seeing things like "dbCreate": are we actually creating a database here (e.g. create database x), or are we just creating metadata for it? Perhaps changing names to something like "dbCreateInMeta", or using a word "register" instead of "create" when we are creating db in metadata would help? Same comment about deleteDb.
            Hide
            jbecla Jacek Becla added a comment -

            you are using pretty wide screen, some like eg in dbMgr.py are quite long. I'd suggest to keep things shorter

            Show
            jbecla Jacek Becla added a comment - you are using pretty wide screen, some like eg in dbMgr.py are quite long. I'd suggest to keep things shorter
            Hide
            jbecla Jacek Becla added a comment -

            Looking at dbMgr.py... is the name like "dbDelete" conforming to our standard? I thought verb should come first. BTW, the comment right after dbDelete is "drop". Let's unify this

            Show
            jbecla Jacek Becla added a comment - Looking at dbMgr.py... is the name like "dbDelete" conforming to our standard? I thought verb should come first. BTW, the comment right after dbDelete is "drop". Let's unify this
            Hide
            jbecla Jacek Becla added a comment -

            I think having a quick readme, or some example would be useful.

            Show
            jbecla Jacek Becla added a comment - I think having a quick readme, or some example would be useful.
            Hide
            jbecla Jacek Becla added a comment - - edited

            Looks good, some minor comments in PR and in comments here. Thanks

            Show
            jbecla Jacek Becla added a comment - - edited Looks good, some minor comments in PR and in comments here. Thanks
            Hide
            salnikov Andy Salnikov added a comment -

            Thanks Jacek, few answers:

            Re _specialDbs:

            • yes, my assumption is that everything in worker database "belongs" to worker in a sense that we can do whatever we want with the contents that we are allowed to see/modify. I was not aware there is an attempt to separate different workers using the same mysql server. This could potentially be done of course if needed either with prefix or simply using separate accounts with corresponding permissions.

            Re createDb vs registerDb:

            • agree, I'll update names to make them clearer. My current naming reflects REST resources and operations that we do on them, but I can change it.

            Re deleteDb vs dbDelete:

            • this is again influenced mostly by REST resource+operation scheme but it does not have to be that way. I can change that as well.

            I'll add README.md. Full documentation is in a separate ticket.

            Show
            salnikov Andy Salnikov added a comment - Thanks Jacek, few answers: Re _specialDbs: yes, my assumption is that everything in worker database "belongs" to worker in a sense that we can do whatever we want with the contents that we are allowed to see/modify. I was not aware there is an attempt to separate different workers using the same mysql server. This could potentially be done of course if needed either with prefix or simply using separate accounts with corresponding permissions. Re createDb vs registerDb: agree, I'll update names to make them clearer. My current naming reflects REST resources and operations that we do on them, but I can change it. Re deleteDb vs dbDelete: this is again influenced mostly by REST resource+operation scheme but it does not have to be that way. I can change that as well. I'll add README.md. Full documentation is in a separate ticket.
            Hide
            salnikov Andy Salnikov added a comment - - edited

            you are using pretty wide screen, some like eg in dbMgr.py are quite long. I'd suggest to keep things shorter

            Everything is below 110 columns, which is the upper limit that our coding standard sets. Typically long lines are those which contain message text (like in logging or exceptions). I find them harder to read if they are split into multiple short lines. But if standard would say that we should try hard to keep everything below 80 characters for readability reasons I'd be happy to oblige

            Show
            salnikov Andy Salnikov added a comment - - edited you are using pretty wide screen, some like eg in dbMgr.py are quite long. I'd suggest to keep things shorter Everything is below 110 columns, which is the upper limit that our coding standard sets. Typically long lines are those which contain message text (like in logging or exceptions). I find them harder to read if they are split into multiple short lines. But if standard would say that we should try hard to keep everything below 80 characters for readability reasons I'd be happy to oblige
            Hide
            salnikov Andy Salnikov added a comment -

            I added a bunch of small fixes to resolve Fabrice's and Jacek's comments. Merged and pushed. New service is started now if you run qserv-start.sh (even in mono mode), it's not used yet, but you can play with it if you like.

            Show
            salnikov Andy Salnikov added a comment - I added a bunch of small fixes to resolve Fabrice's and Jacek's comments. Merged and pushed. New service is started now if you run qserv-start.sh (even in mono mode), it's not used yet, but you can play with it if you like.

              People

              Assignee:
              salnikov Andy Salnikov
              Reporter:
              salnikov Andy Salnikov
              Reviewers:
              Jacek Becla
              Watchers:
              Andy Salnikov, Fabrice Jammes, Jacek Becla
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  CI Builds

                  No builds found.