DM-7398

Define policy on providing a shared software stack to developers

    Details

    • Type: Story
    • Status: Invalid
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Team: System Management

      Description

      Historically (and simplified), we have provided a shared system for use by developers in the form of lsst-dev. It's generally taken as read that this should provide an up-to-date installation of the LSST stack in a "shared" form — that is, configured so that developers can declare their own special versions of packages and share them with their peers.

      In X16, Science Pipelines agreed to temporarily take over maintenance of the lsst-dev shared stack from SQuaRE in order to keep our developers moving while the latter group was overloaded. This has worked reasonably well: we've rolled out a system which automatically installs recent stack releases as they are tagged, which has been in use for several months now.

      However, there are known issues with this means of maintaining the stack, in particular regarding long-term support (how long should historical builds be operational?), upgrades to the underlying Anaconda installation (should this be kept in sync with the lsstsw package list, for example?) and stack "hygiene" (if developers are allowed to tag their own products, who is responsible for cleaning up after them?). DM-7361 illustrates the issues faced. In short, more effort is required to support and update this system in the long term. Who is responsible?

      However, the issue goes beyond the narrow scope of lsst-dev. Moving forward, we're likely to see developers expecting to use some form of shared stack that's available on the Nebula infrastructure, the commissioning cluster, and so on. There are additional questions over what it means to "share" a stack in this context, since these systems will presumably not share the same filesystem. What will be provided here? Who is responsible for providing it?

            Activity

            swinbank John Swinbank added a comment -

            Worth noting that we should expand the scope of this ticket beyond just providing a “shared stack” to include other packages and tools that developers will need to get their work done. For example, this will likely include the (versioned) test data repositories like afwdata, validate_drp, etc.

            womullan Wil O'Mullane added a comment -

            Margaret Gelman, Donald Petravick should probably comment on this. My assumption would be that shared resources are controlled by NCSA. I would not mix lsst-dev, where you need flexibility, with operational systems, where you will NOT have flexibility. A start was made in LDM-564 on describing releases and how we might deliver containers as the delivery mechanism for them. For the commissioning cluster, providing JupyterLab should allow a user to spin up a specific container as the kernel with a specific version of the stack etc. We do need to standardize on the toolset, filesystems etc. available there, but the idea is to have multiple potential kernels available and even to allow modification in situ for - this needs to be worked out with Chuck for commissioning, and Simon Krughoff is working on that with him.

            This issue calls for a policy, but I am not sure this is a single policy: it will differ for each environment. Plus, some of the questions are more release related (e.g. stack hygiene); those will be the job of the new release manager.

            petravick Donald Petravick added a comment -

            My first impulse is that NCSA is responsible for file system support for the stack for developers. I understand there are two use cases: containers and EUPS file system distributions. I understand there are potential problems with EUPS making plenty of open() and stat() calls.

            The other matter is data repositories. I see this as rich territory for discussion. We imagine the test data needed for production-scale testing is in the production file systems in the data backbone. We imagine lesser-scale test data to be resident in the /datasets file system at NCSA. We are interested in understanding the release manager’s processes and in discussion when (s)he arrives.

            womullan Wil O'Mullane added a comment -

            Michelle Butler [X], Margaret Gelman: is this in a way related to RFC-444? Could we close this with an update to the developer guide stating what is available (filesystems etc.), keeping it up to date, and also explaining how to request updates? Perhaps I should reassign this to one of you?

            swinbank John Swinbank added a comment -

            Just to reiterate the point above — at the moment, the shared stack is maintained by Pipelines. I'd like this off my plate, and I don't think it's a Pipelines job, but it does require some degree of familiarity with how the stack works and developer expectations. I don't know if that expertise is currently available within the LDF group. Whether it is or not, the scope of the work here certainly goes beyond just updating the Developer Guide — there's a bunch of ongoing maintenance to be aware of, and outstanding questions that need answering/work that needs doing.

            womullan Wil O'Mullane added a comment -

            Marked invalid since this now moves to SLAC, but Richard Dubois may care to take note.

            tjenness Tim Jenness added a comment -

            I will clarify that it is still Pipelines who curate the shared stack at NCSA despite them not wanting to be involved. I'm sure Yusra AlSayyad is hoping that USDF will be different.


              People

              Assignee:
              mbutler Michelle Butler [X] (Inactive)
              Reporter:
              swinbank John Swinbank
              Watchers:
              Donald Petravick, Frossie Economou, Gabriele Comoretto [X] (Inactive), Hsin-Fang Chiang, John Parejko, John Swinbank, Tim Jenness, Wil O'Mullane
