Uploaded image for project: 'Request For Comments'
  1. Request For Comments
  2. RFC-62

Make config files and command line specifications of configurations consistent

    XMLWordPrintable

    Details

    • Type: RFC
    • Status: Implemented
    • Resolution: Done
    • Component/s: DM
    • Labels:
      None
    • Location:
      PR

      Description

      There is an irritating inconsistency between how you specify configuration parameters on the command line (config.foo.bar) and in a configuration file (root.foo.bar). This is especially annoying when you use --show config to obtain the name, cut and paste the answer into a file, and are rewarded with:

      error: cannot load config file 'ptfConfig.py': name 'config' is not defined

      This RFC proposes that we adopt a consist naming convention. I prefer config everywhere, but (as Kian-Tat Lim points out) that's a slightly more complex change than adopting root everywhere. On the other hand, I think that adopting config is much clearer, and wouldn't tempt us to change the the pipe_base's -show config[=glob] option.

        Attachments

          Issue Links

            Activity

            Hide
            ktl Kian-Tat Lim added a comment -

            Note that the command line syntax is actually --config foo.bar.

            Show
            ktl Kian-Tat Lim added a comment - Note that the command line syntax is actually --config foo.bar .
            Hide
            rowen Russell Owen added a comment -

            All tasks and standard "configurables" use "config" as the constructor argument and instance variable. Thus if we used config in persisted config files there would be a one-to-one correspondence between the top-level task's config, the command line arguments and the persisted config file, which eliminates a source of confusion. Thus I am strongly in favor of this RFC.

            If we chose to use root instead of config then it just pushes the mis-match deeper into tasks (since root is clearly a poor name for an config). I admit the analogy is imperfect inside subtasks, but I still feel using "config" instead of "root" is less confusing.

            I have two suggested refinements:

            • Allow "root" as well as "config" when unpersisting configs, for backwards compatibility.
            • Eliminate the name argument used in Config's methods that persist and unpersist configs. When writing configs always use "config" instead of "root". If a name is required for hierarchical persistence and unpersistence then support that using a private method.
            Show
            rowen Russell Owen added a comment - All tasks and standard "configurables" use "config" as the constructor argument and instance variable. Thus if we used config in persisted config files there would be a one-to-one correspondence between the top-level task's config, the command line arguments and the persisted config file, which eliminates a source of confusion. Thus I am strongly in favor of this RFC. If we chose to use root instead of config then it just pushes the mis-match deeper into tasks (since root is clearly a poor name for an config). I admit the analogy is imperfect inside subtasks, but I still feel using "config" instead of "root" is less confusing. I have two suggested refinements: Allow "root" as well as "config" when unpersisting configs, for backwards compatibility. Eliminate the name argument used in Config's methods that persist and unpersist configs. When writing configs always use "config" instead of "root". If a name is required for hierarchical persistence and unpersistence then support that using a private method.
            Hide
            price Paul Price added a comment -

            I would prefer config to root as well, but am concerned about the large number of existing files that already use root. Because of this, I suggest that if we go that way, we should support both, at least for a release or two (though that's potentially dangerous, since "config" may be used in a config file already), with deprecation warnings.

            Standardising on root would be far easier (I don't think it's only "slightly" less complex), as it only involves changing our code, as opposed to users' code that is outside our control.

            Show
            price Paul Price added a comment - I would prefer config to root as well, but am concerned about the large number of existing files that already use root . Because of this, I suggest that if we go that way, we should support both , at least for a release or two (though that's potentially dangerous, since "config" may be used in a config file already), with deprecation warnings. Standardising on root would be far easier (I don't think it's only "slightly" less complex), as it only involves changing our code, as opposed to users' code that is outside our control.
            Hide
            rowen Russell Owen added a comment -

            I think being able to read a config override file that uses "root" or "config" is probably as simple as adding this to the loading code: "root = config:. Then they are both (pointers to) the same object, so overrides to one will have the same effect as overrides to the other.

            Show
            rowen Russell Owen added a comment - I think being able to read a config override file that uses "root" or "config" is probably as simple as adding this to the loading code: "root = config:. Then they are both (pointers to) the same object, so overrides to one will have the same effect as overrides to the other.
            Hide
            price Paul Price added a comment -

            I don't think it should be that simple. You should put them in as separate objects so you can detect which one is used in the file and, based on which one is used, prompt a deprecation warning. Using separate objects also allows you to detect whether the user is using config in a way not consistent with the expectation that it should be the same as root.

            Show
            price Paul Price added a comment - I don't think it should be that simple. You should put them in as separate objects so you can detect which one is used in the file and, based on which one is used, prompt a deprecation warning. Using separate objects also allows you to detect whether the user is using config in a way not consistent with the expectation that it should be the same as root .
            Hide
            rowen Russell Owen added a comment -

            A deprecation warning and detection of root and config in the same file would be nice, but do we really need them enough to justify the additional code complexity.

            Show
            rowen Russell Owen added a comment - A deprecation warning and detection of root and config in the same file would be nice, but do we really need them enough to justify the additional code complexity.
            Hide
            price Paul Price added a comment -

            Yes. There is enough user code out there already that we need to be careful in changing this very important interface.

            Show
            price Paul Price added a comment - Yes. There is enough user code out there already that we need to be careful in changing this very important interface.
            Hide
            rowen Russell Owen added a comment -

            I'm not sure what you mean by code. The only case where "root" is used that I know of is when persisting and unpersisting configs. The latter is probably the main area of concern, and usually consists of reading in small config override files. As long as they are read in correctly I don't think it matters if they use "root" or "config", does it?

            Show
            rowen Russell Owen added a comment - I'm not sure what you mean by code. The only case where "root" is used that I know of is when persisting and unpersisting configs. The latter is probably the main area of concern, and usually consists of reading in small config override files. As long as they are read in correctly I don't think it matters if they use "root" or "config", does it?
            Hide
            price Paul Price added a comment -

            The config override files are the user code I refer to. Note that they are not just key-value pairs, but they are (by design) code, so that there's nothing to keep one from already containing a config variable that is not equal to root. If you now expect that root and config are equal, there is the possibility that you will break something or (worse) cause unexpected behaviour without any indication that there is something amiss. The responsible thing to do, then, is to attempt to detect unexpected conditions and provide warnings. The code to do that is not long or difficult to write or maintain.

            Show
            price Paul Price added a comment - The config override files are the user code I refer to. Note that they are not just key-value pairs, but they are (by design) code , so that there's nothing to keep one from already containing a config variable that is not equal to root . If you now expect that root and config are equal, there is the possibility that you will break something or (worse) cause unexpected behaviour without any indication that there is something amiss. The responsible thing to do, then, is to attempt to detect unexpected conditions and provide warnings. The code to do that is not long or difficult to write or maintain.
            Hide
            jbosch Jim Bosch added a comment -

            I strongly agree that the result of --show config should use the same name as persisted config files. I think I slightly prefer using root in both places, because I think that's the right name: it's a variable that's referring to the root of the configuration tree. Using "config" there would be completely redundant; everything in that file is configuration. I see no real conflict with the fact that within a task we use self.config - variable names should be interpreted in the context of what they mean to the code around them, not what objects they contain (and while it wouldn't be a big deal in this case, I think trying to make all variables that refer to the same object have the same name in different parts of the codebase generally goes against the principle of encapsulation, because that object may represent something different to different parts of the codebase, and some of the code that uses it shouldn't know as much about it).

            Show
            jbosch Jim Bosch added a comment - I strongly agree that the result of --show config should use the same name as persisted config files. I think I slightly prefer using root in both places, because I think that's the right name: it's a variable that's referring to the root of the configuration tree. Using "config" there would be completely redundant; everything in that file is configuration. I see no real conflict with the fact that within a task we use self.config - variable names should be interpreted in the context of what they mean to the code around them, not what objects they contain (and while it wouldn't be a big deal in this case, I think trying to make all variables that refer to the same object have the same name in different parts of the codebase generally goes against the principle of encapsulation, because that object may represent something different to different parts of the codebase, and some of the code that uses it shouldn't know as much about it).
            Hide
            rowen Russell Owen added a comment -

            My objection to "root" is that it isn't obvious what it refers to. "config" is much clearer. For a user looking at some random bit of .py code there is no particular reason to understand that "root" means "the root level of a config". To my eyes "root" is meaningless jargon (root of WHAT?), which makes our stack harder for new users to understand.

            By comparison I feel that "config" is quite clear. If a user sees a file containing config.foo.bar = value then it's pretty clear that a configuration field is being set.

            Show
            rowen Russell Owen added a comment - My objection to "root" is that it isn't obvious what it refers to. "config" is much clearer. For a user looking at some random bit of .py code there is no particular reason to understand that "root" means "the root level of a config". To my eyes "root" is meaningless jargon (root of WHAT?), which makes our stack harder for new users to understand. By comparison I feel that "config" is quite clear. If a user sees a file containing config.foo.bar = value then it's pretty clear that a configuration field is being set.
            Hide
            price Paul Price added a comment -

            I agree with Jim Bosch that root is clear as the root of the configuration hierarchy (though I have a very mild preference for config). I propose changing the output of --show config to use root. That solves the problem of invalidating existing user code.

            Aside: this has been a known problem for 3 years (https://dev.lsstcorp.org/trac/ticket/2044), and the same proposed solution was made there in the code review:

            I'm a little worried that --show config produces a config that is not loadable without renaming or use of load(root="config"). But, since this ticket is just about finding things in the config and not reloading it, this is OK for now (and a simple fix if needed later).

            Show
            price Paul Price added a comment - I agree with Jim Bosch that root is clear as the root of the configuration hierarchy (though I have a very mild preference for config ). I propose changing the output of --show config to use root . That solves the problem of invalidating existing user code. Aside: this has been a known problem for 3 years ( https://dev.lsstcorp.org/trac/ticket/2044 ), and the same proposed solution was made there in the code review: I'm a little worried that --show config produces a config that is not loadable without renaming or use of load(root="config") . But, since this ticket is just about finding things in the config and not reloading it, this is OK for now (and a simple fix if needed later).
            Hide
            rowen Russell Owen added a comment -

            If we switch to root in the task output then the output of --show config will show root.foo.bar = x instead of config.foo.bar = x for every config item. This output appears among many other messages unrelated to the config, making the data hard enough to spot already. I think it will be much, much harder to spot and more confusing if the word config changes to root.

            I disagree with Jim (which is a bad sign, since he's usually right) when he says it should be obvious in context that a .py file is a config override file. I have not found that to be true. I admit that many config override files are in a directory named "config", but that's just for obs-package-specific override files, not files in test directories or private directories where a user is messing about. And even if it were always true, a user may not know where a file is located when trying to understand that file.

            It is even worse for the "--show config" option when running a task. There is no context in that situation. The user sees a lot more output when using that flag, but which are the lines the user is looking for? Why should the user expect them to start with root? Wouldn't "config" be easier to spot? I certainly felt so, which was why I took the bold and possibly obnoxious stance of forcing "config" even when config files used "root" instead.

            I understand the argument for "the root of the config", but we're calling it "root" not "configRoot", so the result is a conceptual purity that I feel obscures the underlying reality that we are setting an attribute of an instance of a Config. I think we all agree that configRoot is too wordy, so we must abbreviate it, thus losing information. The two obvious abbreviations are "config" and "root" and I feel the former is much clearer. If we use "config" everywhere then the user sees "config" and should mentally add "root" to get "the root of the configuration". As it stands, the user sees "root", but where is the clue that tells them this is a configuration object?

            Show
            rowen Russell Owen added a comment - If we switch to root in the task output then the output of --show config will show root.foo.bar = x instead of config.foo.bar = x for every config item. This output appears among many other messages unrelated to the config, making the data hard enough to spot already. I think it will be much, much harder to spot and more confusing if the word config changes to root. I disagree with Jim (which is a bad sign, since he's usually right) when he says it should be obvious in context that a .py file is a config override file. I have not found that to be true. I admit that many config override files are in a directory named "config", but that's just for obs-package-specific override files, not files in test directories or private directories where a user is messing about. And even if it were always true, a user may not know where a file is located when trying to understand that file. It is even worse for the "--show config" option when running a task. There is no context in that situation. The user sees a lot more output when using that flag, but which are the lines the user is looking for? Why should the user expect them to start with root? Wouldn't "config" be easier to spot? I certainly felt so, which was why I took the bold and possibly obnoxious stance of forcing "config" even when config files used "root" instead. I understand the argument for "the root of the config", but we're calling it "root" not "configRoot", so the result is a conceptual purity that I feel obscures the underlying reality that we are setting an attribute of an instance of a Config. I think we all agree that configRoot is too wordy, so we must abbreviate it, thus losing information. The two obvious abbreviations are "config" and "root" and I feel the former is much clearer. If we use "config" everywhere then the user sees "config" and should mentally add "root" to get "the root of the configuration". As it stands, the user sees "root", but where is the clue that tells them this is a configuration object?
            Hide
            ktl Kian-Tat Lim added a comment -

            config shall be the common root.

            Show
            ktl Kian-Tat Lim added a comment - config shall be the common root.
            Hide
            rowen Russell Owen added a comment -

            I will implement this.

            Show
            rowen Russell Owen added a comment - I will implement this.

              People

              Assignee:
              rhl Robert Lupton
              Reporter:
              rhl Robert Lupton
              Watchers:
              Gregory Dubois-Felsmann, Jim Bosch, John Swinbank, Kian-Tat Lim, Paul Price, Robert Lupton, Russell Owen
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:
                Planned End:

                  Jenkins

                  No builds found.