RFC-552

Provide extra information for each Science Pipelines repository

    Details

    • Type: RFC
    • Status: Adopted
    • Resolution: Unresolved
    • Component/s: LSST
    • Labels:
      None

      Description

      To include all packages resolved via lsst_distrib in the product tree, each package's git repository needs to provide the following information:

      • Short name of the package: text to display in the (yellow) box of the product tree (max 18 characters)
        • Suggestion: it could follow the namespace naming rules, but with the first letter capitalized
      • Key string: 2 to 7 characters, unique, the shorter the better; it is used by the automatic process that builds the product tree
        • Examples: JCAL for jointcal, ASTMT for astro_metadata_translator, DAFB for daf_butler
      • Reference person(s): someone who has the best knowledge of the software package and usually maintains it; up to 2 references can be given
        • Proposed format: github_username (Full Name) [github_username (Full Name)]

       

      The proposal is to add this information in a file; my first thought is an info.yaml in the top-level folder of each repository.
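
The constraints listed above could be checked mechanically once the file contents are loaded. The following is a minimal sketch, not part of the proposal: the field names `short_name`, `key` and `references` are hypothetical (the RFC does not fix a schema), and the metadata is assumed to be already parsed into a plain Python dict, so no YAML library is needed here.

```python
import re

# Hypothetical format check for "github_username (Full Name)".
REFERENCE_RE = re.compile(r"^(?P<username>[A-Za-z0-9-]+) \((?P<full_name>[^()]+)\)$")


def validate_package_info(info):
    """Return a list of problems found in one package's metadata dict."""
    problems = []

    # Short name: displayed in the product tree box, max 18 characters.
    short_name = info.get("short_name", "")
    if not 1 <= len(short_name) <= 18:
        problems.append("short_name must be 1 to 18 characters")

    # Key string: 2 to 7 characters.
    key = info.get("key", "")
    if not 2 <= len(key) <= 7:
        problems.append("key must be 2 to 7 characters")

    # Reference person(s): one or two entries in the proposed format.
    references = info.get("references", [])
    if not 1 <= len(references) <= 2:
        problems.append("one or two reference persons are expected")
    for reference in references:
        if REFERENCE_RE.match(reference) is None:
            problems.append(f"bad reference format: {reference!r}")

    return problems


def find_duplicate_keys(infos):
    """Return the keys used by more than one package; keys must be unique."""
    seen = {}
    for package, info in infos.items():
        seen.setdefault(info.get("key"), []).append(package)
    return {key: pkgs for key, pkgs in seen.items() if len(pkgs) > 1}
```

Uniqueness of the key cannot be checked per repository, which is why `find_duplicate_keys` takes the metadata of all packages at once.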

        Attachments

          Issue Links

            Activity

            gcomoretto Gabriele Comoretto [X] (Inactive) created issue -
            Parejkoj John Parejko added a comment -

            Key string, 2 to 7 character, it shall be unique, the shorter the better, it has to be used for automatic process building the product tree
            Examples, JCAL for jointcal, ASTMT for astro_metadata_translator, DAFB for daf_butler,

            Why is this necessary? The product names are already relatively short and contain abbreviations themselves. I don't think we need more abbreviations.

            Parejkoj John Parejko made changes -
            Field Original Value New Value
            Link This issue relates to DM-4875 [ DM-4875 ]
            Parejkoj John Parejko made changes -
            Link This issue relates to DM-14875 [ DM-14875 ]
            Parejkoj John Parejko made changes -
            Link This issue relates to DM-4619 [ DM-4619 ]
            krzys Krzysztof Findeisen added a comment -

            Some packages have grown to the point where no single person can really be considered an expert on the whole thing (afw, I'm looking at you). Could that field be multivalued?

            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment - - edited

            Examples of names that are not short:

            • meas_extensions_astrometryNet
            • meas_extensions_simpleShape

            The reference person may have deep knowledge of some parts of the package and know whom to refer to for the other parts. Two people could also be listed as references.

            Parejkoj John Parejko added a comment -

            An example of non short name is:
            meas_extensions_astrometryNet

            Why is that too long?

            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment - - edited

            It does not fit in a product tree box.

            At the same time, you may want to word the name differently.

            From a logical point of view, the name of a product and the name of its GitHub repository are two different things.

             

            swinbank John Swinbank added a comment -

            I think it would help to better assess this RFC if we had more information on exactly how this information will be used. What will be reading these YAML (or whatever) files? What will it be doing with the results?

            Other questions:

            • Why do packages need both a short name and a key string?
            • Why isn't the package short name the same as the repository name?
            • Why use a GitHub, rather than an LSST, username?
            • What's the distinct role of the “reference person” for the package that isn't covered by the responsible T/CAM, product owner or science/technical lead?
            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment -

            As specified in the first line of the RFC, this information is used to populate the product tree, in an automatic way.

            The short name is used to be displayed in the product tree boxes.

            The short key is used internally. It could be generated automatically, but I think it is preferable to have it chosen by whoever sets up the new repository (the reference person(s) is already set up).

            Can you be more clear on the third bullet?

            The reference person could be contacted in case of problems and could take care of the issues assigned to that software package. Of course, all activities need to be coordinated with the relevant T/CAM, product owner and science/technical lead, in the same way as they are now.

            Parejkoj John Parejko added a comment -

            I'm sorry, can you please explain more why we need the short name and short key, and why the existing product names are not sufficient? I don't know what a "product tree box" is, but it seems like adding extra names for packages would be even more confusing.

            swinbank John Swinbank added a comment -

            Can you be more clear on the third bullet?

            Third bullet was:

            Why use a GitHub, rather than an LSST, username?

            My point here is simply that we have an LSST identity service which follows us across the project: we have a single username that applies to project.lsst.org, Jira, Confluence, the Data Facility, etc. And then there's GitHub, which is anomalous.

            One can imagine that, in years to come, we might migrate our code off GitHub. However, one assumes that the LSST project IDM will remain supported through the duration of the project. That seems to point to the latter being the more stable, more supported system.

            Why would we prefer GitHub in this circumstance?

            swinbank John Swinbank added a comment -

            As specified in the first line of the RFC, this information is used to populate the product tree, in an automatic way... The short name is used to be displayed in the product tree... (etc)

            This RFC seems to be oriented around narrow technical considerations about producing a particular arrangement of yellow boxes on a PDF.

            I suggest that's not really the most important question. Why do we want to collect this information? Who does it need to be made available to? What will they do with it?

            Once we've addressed those, then we can discuss questions of formatting.

            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment - - edited

            John Parejko: I can derive the information from what is available on GitHub, and just leave open the possibility to provide inputs (a different name), if that is the outcome of the RFC.

            John Swinbank: the main reason to provide this information is to have a complete product tree to add to our baselined documentation. This was done in 2017, but after the product tree review this summer (2018) the lower-level product tree is missing. Adding this information will let us keep the complete product tree up to date each time it changes.

            The user information can be provided in whichever way we prefer. Since we are getting the information from GitHub, the GitHub username seems to me to make more sense, and GitHub is not going to be unsupported in the short or mid term. However, if we prefer, the LSST identity manager username can be used instead. It is just a matter of agreeing on it.

            jsick Jonathan Sick added a comment -

            I'll amplify others who doubt the value of a new short name for each repository. If I understand you correctly, the issue is that our repo names are too long for the LaTeX that's currently being generated by https://github.com/lsst/LDM-294/blob/master/makeProductTree.py ? If that's the case, it might be better to have tools like makeProductTree.py (and others) deal with our actual names by wrapping text, and so on. Creating an extra layer of naming seems like it might obfuscate things and create misunderstandings in the long run.
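
As a rough illustration of the text-wrapping idea, here is a minimal sketch that breaks a repository name into lines that fit a box, preferring to break after underscores. The function name `wrap_package_name` and the 18-character default width are assumptions for illustration; this is not how makeProductTree.py currently works.

```python
def wrap_package_name(name, width=18):
    """Break a package name into lines of at most `width` characters,
    preferring to break after underscores."""
    lines = []
    current = ""
    # Split into pieces that each end with the underscore they followed.
    for piece in name.replace("_", "_\n").split("\n"):
        # A single piece longer than the box has to be split hard.
        while len(piece) > width:
            if current:
                lines.append(current)
                current = ""
            lines.append(piece[:width])
            piece = piece[width:]
        if len(current) + len(piece) <= width:
            current += piece
        else:
            lines.append(current)
            current = piece
    if current:
        lines.append(current)
    return lines


print(wrap_package_name("meas_extensions_astrometryNet"))
# ['meas_extensions_', 'astrometryNet']
```

The resulting lines would still need to be joined with whatever line-break markup the diagram generator expects.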

            For naming those who are responsible for any given repository, I'll point out that GitHub already has a CODEOWNERS file (https://help.github.com/articles/about-codeowners/). Though yes, it ties into GitHub identities rather than LSST IT's Kerberos, it is at least a standard that other tooling can leverage if we so decide in the future. There's still the larger question of whether we know exactly who these responsible people are, as John Swinbank mentions. A useful source of metadata that already exists here is the default assignee for Jira components that map to these repositories.

            Lastly, I'll just state general interest in this space. For www.lsst.io I intend to index our GitHub projects (https://sqr-013.lsst.io for the very old concept). Now my interest is mostly in generating codemeta so that our software can be directly cited. My thinking has been that such a codemeta.json metadata file can be mostly automatically generated from existing metadata sources, though it may be useful to embed a stub codemeta.json file in repositories as well. So although this is highly tangential to the current RFC, I just want to point out that there are other folks thinking about adding metadata files to Git repositories, and that there may be useful standards we can build on.

            jsick Jonathan Sick added a comment -

            Ok, we chatted offline about this and it seems that this metadata is necessary just to get the configuration management work off the ground, so it's not really my place to micromanage that. We think that integration with SQuaRE's project metadata initiative makes sense, but no need to tie those projects together at the outset (especially since the timeline for really starting work on www.lsst.io is a bit farther out).

            swinbank John Swinbank added a comment - - edited

            So, here's why I think this is useful, and what that can tell us about implementation.

            First of all, what I hope to gain from this is a clear mapping from package/repository to a responsible team. I use the word “team” rather than “individual” deliberately: most directly, I see this as a tool to help us understand who is on the hook for delivering what. The responsible team links directly to the budget and to the T/CAM, who has the responsibility for assessing, prioritising and scheduling issues. This directly forces us to resolve and document perennially vexed questions like “who owns obs_decam?, pex_config?, ... etc”: that is the primary value here.

            In the past, the topic of “package experts” has come up (e.g. RFC-150). I personally still regard this as a distraction from our regular workflow, but — given the success of that RFC — I've no objection to including an expert, with the caveat that everything they do in that role has to be agreed with their T/CAM.

            In general, we should expect external users of the codebase to install high level “meta-packages” (lsst_distrib, or its successors) and to expect the project to have a centralised and coherent approach to handling the issues they encounter. Except for a few of the real enthusiasts, I don't expect science users to be crawling through metadata on individual packages, and certainly not to be using that to develop “targeted” bug reports.

            It's a nice side effect that we can build a diagram showing which packages relate to which high-level product, but I don't think that's of fundamental importance and it shouldn't drive the implementation. Simply “adding a complete product tree to our baselined documentation” isn't a fundamental good or something to strive for unless that, in turn, is enabling something else (if so: what?).

            I quite strongly agree with comments to the effect that introducing two (!) new names for each package is unnecessary and likely to be counterproductive and confusing.

            ktl Kian-Tat Lim added a comment -

            It's kind of ironic to say that the Jira component owner could be used to seed the package "reference person"; I was rather hoping that the opposite would be the case, as I think many packages lack proper Jira component owners (if they even have Jira components).

            I was also hoping that the "short name" could be reused as a standardized namespace abbreviation in Python and C++, which is otherwise generated in an ad hoc fashion.

            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment -

            I think that adding the owner to the list of metadata collected is a good idea for the time being.

            Adding a technical reference, a short name, and a key can be optional. If not provided, they can be derived with a Python function.
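
One possible shape for such a fallback is sketched below. The rules used here are assumptions chosen for illustration (capitalize the first letter for the short name; strip underscores, uppercase and truncate for the key, with a numeric suffix on collision). They do not reproduce hand-picked keys such as JCAL or DAFB, and a one-character repository name would yield a key shorter than the 2-character minimum.

```python
def derive_short_name(repo_name, max_length=18):
    """Default short name: repository name with its first letter capitalized,
    truncated to the width of a product tree box."""
    short = repo_name[:1].upper() + repo_name[1:]
    return short[:max_length]


def derive_key(repo_name, taken, max_length=7):
    """Default key: uppercase name without underscores, truncated to
    `max_length`; a numeric suffix is added if the key is already taken."""
    base = repo_name.replace("_", "").upper()[:max_length]
    candidate = base
    suffix = 1
    while candidate in taken:
        tail = str(suffix)
        candidate = base[: max_length - len(tail)] + tail
        suffix += 1
    return candidate


taken = set()
for repo in ["jointcal", "daf_butler"]:
    key = derive_key(repo, taken)
    taken.add(key)
    print(repo, derive_short_name(repo), key)
```

Passing the set of keys already in use keeps the derived keys unique across packages, at the cost of the result depending on processing order.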

            As Kian-Tat Lim pointed out, maintaining Jira components may be easier when this RFC is implemented.

            krzys Krzysztof Findeisen added a comment - - edited

            Kian-Tat Lim exactly how do you see the "standardized namespace abbreviation" working? Would a developer who wants to abbreviate, say, lsst.daf.persistence be expected to look up the short name in the daf_persistence repository?

            It seems to me that if developers have to look up a particular package's abbreviation, that means they're not familiar with the (an?) abbreviated form, and standardization provides no benefit in terms of reader friction.

            ktl Kian-Tat Lim added a comment -

            Krzysztof Findeisen Yes, a developer would be expected to look it up if he/she were unfamiliar. At least there would be a suggestion of an appropriate namespace abbreviation that would be in the repository. Presumably the same abbreviation would be used throughout the package's code and in its documentation. Right now, there's not even a potential single source of truth for such an abbreviation. I'd be fine (and I think Gabriele Comoretto [X] could adapt) if this lived somewhere well-known in package/doc/ rather than package/info.yaml.

            In my mind, standardization means that abbreviations for common dependencies can be learned and become familiar over time rather than potentially having to be relearned for each new package using the dependency because the depending package developer chose a different abbreviation. (The reduction in expenditure of creative energy by the depending package developer in coming up with new abbreviations is admittedly small and is traded off against the lookup cost.)

            To me, this is not a major motivation for the RFC; it is a side-effect.

            bvan Brian Van Klaveren added a comment -

            GitHub already has a semi-standard way of representing "code owners" and it seems it can also be broken down by directory:

            https://blog.github.com/2017-07-06-introducing-code-owners/

            I'd suggest trying to leverage that, if possible, because the pull requests seem to potentially have that integrated in as well.

            IMO, a unique 7-character key will be insufficient to convey any useful information about a repo except for some kind of identity mapping, so I'd suggest treating it as a black-box ID and just hashing the name at that point, or using the GitHub ID (https://api.github.com/users/lsst/repos) or something similar, neither of which requires the user to come up with a unique 7-character name.
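
            A minimal sketch of the hashing idea, assuming SHA-1 truncated to 7 hexadecimal characters. The function name derive_key and the choice of hash are illustrative assumptions, not something the product tree tooling is known to use:

```python
import hashlib


def derive_key(repo_name: str, length: int = 7) -> str:
    """Derive an opaque, fixed-length key from a repository name.

    Hypothetical helper: the hash (SHA-1) and the length are assumptions.
    """
    digest = hashlib.sha1(repo_name.encode("utf-8")).hexdigest()
    # Keep only the first `length` hex characters, upper-cased for display.
    return digest[:length].upper()


# Package names taken from the RFC description.
for name in ("jointcal", "astro_metadata_translator", "daf_butler"):
    print(name, derive_key(name))
```

            Truncating a hash does not by itself guarantee uniqueness, so a tool taking this approach would still need to check the derived keys for collisions.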

            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment - - edited

            Using the code owner, as suggested by Jonathan Sick and Brian Van Klaveren, seems to me a good idea. I suspect that this is not the same owner that John Swinbank is referring to.

            krughoff Simon Krughoff added a comment - - edited

            Actually, I think it was Jonathan Sick who mentioned code owners in this post earlier in the thread, and I think it's the same thing Brian Van Klaveren is linking to.

            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment -

            Yes, sorry.

            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment -

            I have been analyzing the inputs provided so far with Jonathan Sick.

            It seems that a reasonable way forward is to extract the metadata from existing files, rather than adding extra files to the repositories.

            First, we can extend the README files to include a standardized "Info" section that includes a listing of metadata. From the above comments, we can add these fields:

            • wbsowner (required): the name of the T/CAM or of the person responsible for the budget. This information is not readily available anywhere else.
            • short_name (optional): used for display in the product tree. If not provided, it will be derived programmatically from the existing information.
            • key (optional): used by the product tree tool to uniquely identify each product. If not provided, it will be derived programmatically from the existing information.

            We can incorporate this information into the README template. By structuring this information, it can be parsed and extracted directly from the README file. It will also be easily readable by anyone browsing source repositories.
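
            A minimal sketch of how such a section could be parsed. The layout is not specified here, so this assumes a Markdown-style "Info" heading followed by "field: value" lines; the sample README, the package name example_pkg, and the name "Jane Doe" are hypothetical:

```python
import re

REQUIRED_FIELDS = {"wbsowner"}
OPTIONAL_FIELDS = {"short_name", "key"}


def parse_info_section(readme_text: str) -> dict:
    """Extract `field: value` pairs from the 'Info' section of a README.

    Assumed layout: a '#'-style heading named 'Info', then one field per
    line, optionally written as a bullet ('- field: value').
    """
    info = {}
    in_info = False
    for line in readme_text.splitlines():
        stripped = line.strip()
        heading = re.match(r"^#+\s*(.+?)\s*$", stripped)
        if heading:
            # Any heading starts a new section; only 'Info' is of interest.
            in_info = heading.group(1).lower() == "info"
            continue
        if in_info:
            field = re.match(r"^[-*]?\s*(\w+)\s*:\s*(.+)$", stripped)
            if field and field.group(1) in REQUIRED_FIELDS | OPTIONAL_FIELDS:
                info[field.group(1)] = field.group(2).strip()
    missing = REQUIRED_FIELDS - set(info)
    if missing:
        raise ValueError(f"missing required field(s): {sorted(missing)}")
    return info


SAMPLE_README = """\
# example_pkg

A hypothetical package.

## Info

- wbsowner: Jane Doe
- short_name: Example

## Usage
"""

print(parse_info_section(SAMPLE_README))
```

            Optional fields that are absent are simply left out of the result, so the caller can fall back to deriving them from the repository name.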

            Second, we can leverage the CODEOWNERS file already supported by GitHub to establish a "reference person" who on a practical basis is deeply involved in the development and maintenance of the code and can be assigned to review code changes. The person(s) named in the CODEOWNERS file can also be the default assignee for issues in the corresponding Jira component. See RFC-150.

            So, if John Swinbank and the other commenters are OK with the proposal, we will set the RFC to adopted and open a couple of implementation issues:

            • Extend README template to include this metadata in a standard and structured way.
            • Update existing READMEs.

            We will also invite everybody to add the code owner information in each repository in GitHub. See https://help.github.com/articles/about-codeowners/. For the time being, we can consider this information as not mandatory.

            The due date is extended by a couple more days.

             

            gcomoretto Gabriele Comoretto [X] (Inactive) made changes -
            Planned End 10/Dec/18 6:52 PM 13/Dec/18 6:52 PM
            Parejkoj John Parejko added a comment -

            I still don't understand what the justification for the short_name and key is. If the goal is to identify our software in some managerial document, I would think it more important to use the actual name of the software, not some other (shortened) name.

            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment -

            The name information is usually available from the README. If not, the GitHub repository name can also be considered the official name, so there is no need to add extra metadata for it. If you feel that a short name is not relevant, don't add it.

            swinbank John Swinbank added a comment -
            • short_name (optional): used to display in the product tree. If not provided it will be derived programmatically from the existing information
            • key (optional): used by the product tree tool to uniquely identify each product. If not provided it will be derived programmatically from the existing information

            I'd still prefer to omit these unless we really understand what they are for. In what circumstances is it not appropriate to use the product name as the short name? Whose responsibility is it to “feel that a short name is not relevant”?

            establish a "reference person" who on a practical basis is deeply involved in the development and maintenance of the code and can be assigned to review code changes. The person(s) named in the CODEOWNERS file can also be the default assignee for issues in the corresponding Jira component. See RFC-150.

            No, please don't do this. Specifically:

            • There should be no implication that the person listed as an “owner” should be assigned to review changes. It's not fair on any individual to have them as the “default reviewer” for a huge product like pipe_tasks or afw.
            • The conclusion re default assignees in RFC-150 seems to be “it would be best if the default assignee for new stories would actually be Unassigned, not TCAM, and not the component expert”.
            womullan Wil O'Mullane added a comment -

            The key I think we can definitely derive. The short_name is perhaps a misnomer: it's really a display_name. It's used by my script to make the product tree, which I would still like to print out large for review purposes.

             

            Agree not to have default assignee - but we did want to identify "someone" in general responsible for the packages.

            swinbank John Swinbank added a comment - - edited

            Hi Wil O'Mullane — I'm not sure what value is gained by producing a poster-sized display of all our code repositories. This seems like it's going beyond the level of abstraction usefully captured by the product tree.

            However, if we agree that this is the goal, then I suggest that “display names” for each repository can be more conveniently stored with (or generated by) the product tree generation scripts, rather than in the repositories themselves. Regular developers, who aren't involved with generating this sort of display, should never need to see them, define them, or use them.

            Agree not to have default assignee - but we did want to identify "someone" in general responsible for the packages.

            Yes, agreed. This is the conclusion of RFC-150 (which I disagree with, but I don't want to reopen that discussion). Identifying an “expert” who is not a default assignee or default reviewer is fine.

            womullan Wil O'Mullane added a comment -

            When I did the original product tree, I thought we agreed that tying in all repos was the correct thing to do. We can roll up and display at any level we like once we have a consistent set of relationships.

            Ah yes, you mean put a lookup map (JSON) with the script generating the tree. That could work.
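
            A minimal sketch of such a lookup map, assuming a JSON object that maps repository names to display names and falls back to the repository name when no entry exists. The display names shown and the function display_name are hypothetical:

```python
import json

# Hypothetical lookup map kept alongside the tree-generation script.
# Only repositories that need a different display name are listed.
DISPLAY_NAMES_JSON = """
{
    "astro_metadata_translator": "Metadata Translator",
    "daf_butler": "Butler"
}
"""


def display_name(repo_name: str, overrides: dict) -> str:
    """Return the display name for a repository.

    Falls back to the repository name itself when no override is defined.
    """
    return overrides.get(repo_name, repo_name)


overrides = json.loads(DISPLAY_NAMES_JSON)
for repo in ("jointcal", "astro_metadata_translator", "daf_butler"):
    print(repo, "->", display_name(repo, overrides))
```

            With this arrangement the repositories themselves carry no display metadata, and developers who never generate the tree never need to touch the map.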

            swinbank John Swinbank added a comment -

            Ah yes you mean put a lookup map(json) with the script generating the tree that could work.

            .

            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment -

            If I understand correctly from the last comments, the only information we need is:

            • wbsowner
            • expert
            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment - - edited

            I had a look at the milestone project (in lsst-dm), but I did not find any JSON file with project names.

            John Swinbank can you indicate where I can find this information?

            swinbank John Swinbank added a comment - - edited

            Gabriele Comoretto [X] — sorry I didn't see your comment earlier.

            I think we discussed this at the DM-CCB yesterday. The only JSON file in https://github.com/lsst-dm/milestones is used to provide information about milestones. I'm not aware of any existing JSON file which defines project (product?) names, although I guess maybe https://github.com/lsst/repos/blob/master/etc/repos.yaml is close?

            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment -

            I think repos.yaml is used for build purposes. Adding information there for documentation purposes is not the right thing to do, in my opinion.

            Since I will be parsing README files in any case, I think that this is the most suitable location for the (optional) display_name information, in addition to wbsowner and expert as concluded above.

            gcomoretto Gabriele Comoretto [X] (Inactive) made changes -
            Remote Link This issue links to "Page (Confluence)" [ 19570 ]
            gcomoretto Gabriele Comoretto [X] (Inactive) made changes -
            Remote Link This issue links to "Page (Confluence)" [ 19592 ]
            gcomoretto Gabriele Comoretto [X] (Inactive) added a comment -

            As per DMCCB #3 discussion.

            gcomoretto Gabriele Comoretto [X] (Inactive) made changes -
            Status Proposed [ 10805 ] Adopted [ 10806 ]
            gcomoretto Gabriele Comoretto [X] (Inactive) made changes -
            Link This issue is triggering DM-17682 [ DM-17682 ]
            gcomoretto Gabriele Comoretto [X] (Inactive) made changes -
            Remote Link This issue links to "Page (Confluence)" [ 19652 ]
            ktl Kian-Tat Lim made changes -
            Assignee Gabriele Comoretto [ gcomoretto ] Kian-Tat Lim [ ktl ]
            ktl Kian-Tat Lim made changes -
            Remote Link This issue links to "Page (Confluence)" [ 27441 ]

              People

              Assignee:
              ktl Kian-Tat Lim
              Reporter:
              gcomoretto Gabriele Comoretto [X] (Inactive)
              Watchers:
              Brian Van Klaveren, Gabriele Comoretto [X] (Inactive), John Parejko, Jonathan Sick, Kian-Tat Lim, Simon Krughoff, Tim Jenness, Wil O'Mullane