Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-31120

Change butler logger to python logging

    XMLWordPrintable

    Details

    • Story Points:
      4
    • Team:
      Architecture
    • Urgent?:
      No

      Description

      Currently the butler sends all log output to lsst.log. Now that there is a way to forward all log messages from lsst.log to python (DM-30996) change the way that butler (and derivatives such as ctrl_mpexec) handles logging. This will make everything consistent regardless of whether lsst.log is available and will simplify the implementation of DM-30977.

        Attachments

          Issue Links

            Activity

            No builds found.
            tjenness Tim Jenness created issue -
            tjenness Tim Jenness made changes -
            Field Original Value New Value
            Link This issue relates to DM-30977 [ DM-30977 ]
            tjenness Tim Jenness made changes -
            Link This issue relates to DM-30996 [ DM-30996 ]
            tjenness Tim Jenness made changes -
            Status To Do [ 10001 ] In Progress [ 3 ]
            Hide
            tjenness Tim Jenness added a comment -

            Andy Salnikov sorry for another review but you are the obvious person to look at this.

            This changes butler to forward python lsst.log messages directly to python logging and uses the C++ pylog appender for C++ log messages. Hopefully this will help with Kian-Tat Lim's worries regarding performance since only the C++ messages are coming back to python.

            I have had to add a new Formatter class to give me timezone-aware timestamps with millisec precision. I'm not entirely sure whether the default should be UTC or local timezone but for now I use localtime and assume that software can parse the tzoffset string.

            Here is the output from the start of pipelines_check to prove that everything is working:

            INFO 2021-07-16T08:54:09.034-07:00 ctrl.mpexec.cmdLineFwk ()(cmdLineFwk.py:559) - QuantumGraph contains 3 quanta for 3 tasks, graph ID: '1626450849.033102-56546'
            INFO 2021-07-16T08:54:09.960-07:00 conda.common.io ()(io.py:246) - overtaking stderr and stdout
            INFO 2021-07-16T08:54:13.091-07:00 conda.common.io ()(io.py:261) - stderr and stdout yielding back
            WARNING 2021-07-16T08:54:17.012-07:00 afw.image.MaskedImageFitsReader (isr:{instrument: 'HSC', detector: 10, exposure: 903342, ...})(MaskedImageFitsReader.cc:94) - Expected extension type not found: IMAGE
            WARNING 2021-07-16T08:54:17.154-07:00 afw.image.MaskedImageFitsReader (isr:{instrument: 'HSC', detector: 10, exposure: 903342, ...})(MaskedImageFitsReader.cc:94) - Expected extension type not found: IMAGE
            WARNING 2021-07-16T08:54:18.869-07:00 isr (isr:{instrument: 'HSC', detector: 10, exposure: 903342, ...})(logContinued.py:119) - Constructing linearizer from cameraGeom information.
            INFO 2021-07-16T08:54:18.870-07:00 isr (isr:{instrument: 'HSC', detector: 10, exposure: 903342, ...})(isrTask.py:1410) - Converting exposure to floating point values.
            INFO 2021-07-16T08:54:19.082-07:00 isr (isr:{instrument: 'HSC', detector: 10, exposure: 903342, ...})(isrTask.py:1475) - Assembling CCD from amplifiers.
            

            This has lsst.log C++ messages, lsst.log messages from python and standard python logging.

            I do wonder if we should force the level for the conda logger to debug so we don't see that message.

            Show
            tjenness Tim Jenness added a comment - Andy Salnikov sorry for another review but you are the obvious person to look at this. This changes butler to forward python lsst.log messages directly to python logging and uses the C++ pylog appender for C++ log messages. Hopefully this will help with Kian-Tat Lim 's worries regarding performance since only the C++ messages are coming back to python. I have had to add a new Formatter class to give me timezone-aware timestamps with millisec precision. I'm not entirely sure whether the default should be UTC or local timezone but for now I use localtime and assume that software can parse the tzoffset string. Here is the output from the start of pipelines_check to prove that everything is working: INFO 2021-07-16T08:54:09.034-07:00 ctrl.mpexec.cmdLineFwk ()(cmdLineFwk.py:559) - QuantumGraph contains 3 quanta for 3 tasks, graph ID: '1626450849.033102-56546' INFO 2021-07-16T08:54:09.960-07:00 conda.common.io ()(io.py:246) - overtaking stderr and stdout INFO 2021-07-16T08:54:13.091-07:00 conda.common.io ()(io.py:261) - stderr and stdout yielding back WARNING 2021-07-16T08:54:17.012-07:00 afw.image.MaskedImageFitsReader (isr:{instrument: 'HSC', detector: 10, exposure: 903342, ...})(MaskedImageFitsReader.cc:94) - Expected extension type not found: IMAGE WARNING 2021-07-16T08:54:17.154-07:00 afw.image.MaskedImageFitsReader (isr:{instrument: 'HSC', detector: 10, exposure: 903342, ...})(MaskedImageFitsReader.cc:94) - Expected extension type not found: IMAGE WARNING 2021-07-16T08:54:18.869-07:00 isr (isr:{instrument: 'HSC', detector: 10, exposure: 903342, ...})(logContinued.py:119) - Constructing linearizer from cameraGeom information. INFO 2021-07-16T08:54:18.870-07:00 isr (isr:{instrument: 'HSC', detector: 10, exposure: 903342, ...})(isrTask.py:1410) - Converting exposure to floating point values. INFO 2021-07-16T08:54:19.082-07:00 isr (isr:{instrument: 'HSC', detector: 10, exposure: 903342, ...})(isrTask.py:1475) - Assembling CCD from amplifiers. This has lsst.log C++ messages, lsst.log messages from python and standard python logging. I do wonder if we should force the level for the conda logger to debug so we don't see that message.
            tjenness Tim Jenness made changes -
            Reviewers Andy Salnikov [ salnikov ]
            Status In Progress [ 3 ] In Review [ 10004 ]
            Hide
            salnikov Andy Salnikov added a comment -

            Looks good, few minor comments.

            I think local time zone is what regular people prefer when running pipetask locally. Maybe we could have a global flag (envvar?) to use UTC when running in production environment?

            conda messages are indeed annoying, maybe we should set threshold to WARNING for them by default? And I'm not sure where do they come from.

            Show
            salnikov Andy Salnikov added a comment - Looks good, few minor comments. I think local time zone is what regular people prefer when running pipetask locally. Maybe we could have a global flag (envvar?) to use UTC when running in production environment? conda messages are indeed annoying, maybe we should set threshold to WARNING for them by default? And I'm not sure where do they come from.
            salnikov Andy Salnikov made changes -
            Status In Review [ 10004 ] Reviewed [ 10101 ]
            Hide
            tjenness Tim Jenness added a comment -

            It looks like the RSP always uses UTC as its primary timezone so that will all happen automatically. An env var is easy to add later if we need to but I'll default it to local for now.

            conda messages come from inside the conda API that we use to introspect the list of conda packages. I think I'd just want to put

            logging.getLogger("conda").setLevel(logging.WARNING)
            

            in the butler log setup and they would disappear. Patching the conda code will take longer but it does seem odd that they chose INFO for those messages.

            Show
            tjenness Tim Jenness added a comment - It looks like the RSP always uses UTC as its primary timezone so that will all happen automatically. An env var is easy to add later if we need to but I'll default it to local for now. conda messages come from inside the conda API that we use to introspect the list of conda packages. I think I'd just want to put logging.getLogger( "conda" ).setLevel(logging.WARNING) in the butler log setup and they would disappear. Patching the conda code will take longer but it does seem odd that they chose INFO for those messages.
            Hide
            ktl Kian-Tat Lim added a comment -

            My main worry is that formatting all DEBUG-and-above C++ messages (even if they are thrown away at the Python level) will impact performance.

            Show
            ktl Kian-Tat Lim added a comment - My main worry is that formatting all DEBUG-and-above C++ messages (even if they are thrown away at the Python level) will impact performance.
            Hide
            tjenness Tim Jenness added a comment -

            When you use --log-level it sets both the level of the python logger of that name and the level of the lsst logger of that name. Do appenders see everything independently of the effective level of the log message? Can the appender call `isEnabledFor` itself in C++ before it calls python?

            Show
            tjenness Tim Jenness added a comment - When you use --log-level it sets both the level of the python logger of that name and the level of the lsst logger of that name. Do appenders see everything independently of the effective level of the log message? Can the appender call `isEnabledFor` itself in C++ before it calls python?
            Hide
            salnikov Andy Salnikov added a comment -

            I think we are doing it right already by changing `log4cxx` log level/threshold in sync with Python. Logging macros that we define for C++ side do check effective logging level before formatting. And appender is not even called if message level is below logger threshold.

            Show
            salnikov Andy Salnikov added a comment - I think we are doing it right already by changing `log4cxx` log level/threshold in sync with Python. Logging macros that we define for C++ side do check effective logging level before formatting. And appender is not even called if message level is below logger threshold.
            tjenness Tim Jenness made changes -
            Resolution Done [ 10000 ]
            Status Reviewed [ 10101 ] Done [ 10002 ]
            Hide
            ktl Kian-Tat Lim added a comment -

            Oh, somehow I thought I saw that the C++ side was defaulting to DEBUG and relying on the Python side for filtering. But if that's not the case, then OK.

            Show
            ktl Kian-Tat Lim added a comment - Oh, somehow I thought I saw that the C++ side was defaulting to DEBUG and relying on the Python side for filtering. But if that's not the case, then OK.
            Hide
            salnikov Andy Salnikov added a comment -

            K-T is right as usual, and I missed that part. Python logging and lsst.log are initialized with different level, if --logging-level is not used then lsst.log will use DEBUG.
            This call https://github.com/lsst/daf_butler/blob/master/python/lsst/daf/butler/cli/cliLog.py#L111 should use "INFO" to match Python basicConfig().

            Show
            salnikov Andy Salnikov added a comment - K-T is right as usual, and I missed that part. Python logging and lsst.log are initialized with different level, if --logging-level is not used then lsst.log will use DEBUG. This call https://github.com/lsst/daf_butler/blob/master/python/lsst/daf/butler/cli/cliLog.py#L111 should use "INFO" to match Python basicConfig().
            Hide
            tjenness Tim Jenness added a comment -

            Doesn't this line override things though?

            https://github.com/lsst/daf_butler/blob/master/python/lsst/daf/butler/cli/cliLog.py#L136

            I don't see lsst.log accepting debug messages from python lsst.log usage (in the sense that isEnabledFor is not letting them through).

            Show
            tjenness Tim Jenness added a comment - Doesn't this line override things though? https://github.com/lsst/daf_butler/blob/master/python/lsst/daf/butler/cli/cliLog.py#L136 I don't see lsst.log accepting debug messages from python lsst.log usage (in the sense that isEnabledFor is not letting them through).
            Hide
            salnikov Andy Salnikov added a comment -

            You are right, I missed that explicit call to _setLogLevel. Sorry for my confusion, indeed things look fine as they are now.

            Show
            salnikov Andy Salnikov added a comment - You are right, I missed that explicit call to _setLogLevel. Sorry for my confusion, indeed things look fine as they are now.
            krzys Krzysztof Findeisen made changes -
            Link This issue relates to DM-31180 [ DM-31180 ]
            lauren Lauren MacArthur made changes -
            Link This issue relates to DM-31530 [ DM-31530 ]

              People

              Assignee:
              tjenness Tim Jenness
              Reporter:
              tjenness Tim Jenness
              Reviewers:
              Andy Salnikov
              Watchers:
              Andy Salnikov, Kian-Tat Lim, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.