# Use lsst::log in pipeline tasks

XMLWordPrintable

#### Details

• Type: RFC
• Status: Implemented
• Resolution: Done
• Component/s:
• Labels:
None

#### Description

This is a continuation of RFC-29, and narrower in scope.

I would like to switch from using pex.logging to lsst.log in pipe_base. After that, all pipeline tasks (task or cmdLineTask) can switch to use lsst.log. With lsst::log (via log4cxx implementation) there are 6 logging levels (trace, debug, info, warn, error, fatal). Arbitrary integers will no longer be accepted as levels. In this framework it’s encouraged to use loggers with distinct, hierarchical names instead of custom levels.

In Python, I plan to use the recently added lsst.log Python interface, not through Python logging and LogHandler. For example:

  logger = lsst.log.Log.getLogger(“a.named.logger”)  logger.setLevel(Log.INFO)  logger.trace("This is TRACE")  logger.debug("This is DEBUG")  logger.info("This is INFO")  logger.warn("This is WARN")  logger.error("This is ERROR")  logger.fatal("Format %d %g %s", 3, 2.71828, "foo") 

For most tasks, little modifications are needed in Python, as long as the task’s own log attribute is used for logging. The “logdebug” method in pex_logging will be replaced by “debug”, and most will be format changes.

For CmdLineTasks, the namespace logging level control from the command line -- loglevel is preserved. The command line option -- logdest will be removed; the file destination is instead set through log4cxx configuration. As I understand, many people use this option for its default formatter with dataId attached with each log record. After switching to lsst::log, dataId can still be included in the logs. I’m thinking two possibilities: an command line option to include the dataId in log records, or an command line option to more easily supply user’s own log4cxx configuration file in which format can be set.

In C++, macros are used, e.g.

  LOG_LOGGER logger = LOG_GET(“a.named.logger“); 

For logging one can use either one of the two macro families, the sprintf-based interface:

  LOGL_INFO(logger, "This is INFO");  LOGL_WARN(“a.named.logger“, "This is WARN");  LOGL_FATAL(logger, "This is FATAL %d %.4f %s", 65, 42.123, "logging"); 

and the iostream-based interface:

  LOGLS_INFO(logger, "Format " << 3 << " " << 2.71828 << " foo c++");  LOGLS_WARN(“a.named.logger“, "This warning is” << “scary”); 

The transition plan is to do this in stages: (1) individual logging not chained to CmdLineTask (e.g. daf_butlerUtils, or hard-coded and non-configurable logging), as they are not configurable through the command line task interface anyway. (2) Python-side logging for tasks/CmdLineTasks plus log in C ++. (3) Debug/Trace in C ++. They are typically run with unit tests and each package can be handled separately.

(Proposed documentations)

### Use of logger names and log levels

Logger names should generally start with the (fully-qualified) name of the module/file containing the logger. If the logger is saved as a variable in a class, it is often appropriate to name the logger after the class. Logger names use "." as component separators, not "::", even in C++.

Example:

 log = lsst.log.Log.getLogger("meas.algorithms.starSelector") log.info("This is information about the star selector algorithm execution. %f", 3.14) 

In Python, comma-separated arguments are preferred over the string formatting so it is done only if the log record is to be printed.

Log levels should be used as follows:

• FATAL: for severe errors that may prevent further execution of the component
• ERROR: for errors that may still allow the execution to continue
• WARNING: for conditions that may indicate a problem but that allow continued execution
• INFO: for information that is of interest during normal execution including production
• DEBUG: for information that is of interest to developers but not of interest during production
• TRACE: for detailed information when debugging

For loggers used at DEBUG and TRACE levels, it is often desirable to add further components to the logger name; these would indicate which specific portion of the code or algorithm that the logged information pertains to.

 debugLogger = lsst.log.Log.getLogger("meas.algorithms.starSelector.catalogReader") debugLogger.debug("Catalog reading took %f seconds", finish - start) 

The idea here is that the author understands the intent of the log message and can simply name it, without worrying about its relative importance or priority compared with other log messages in the same component. A person debugging the code would typically be looking at it and so would be able to determine the appropriate name to enable. The hierarchy allows all components to be easily enabled or disabled together.

As an alternative for TRACE loggers where there are different messages at increasing levels of verbosity but no specific component names that would be appropriate, or where increasing verbosity spans a number of levels of the component hierarchy, logger names can be prefixed with "TRACE1", "TRACE2", "TRACE3", etc. to indicate increasing verbosity.

 traceLogger = lsst.log.Log.getLogger("TRACE2.meas.algorithms.starSelector") traceLogger.debug("On %d-th iteration of star selection", iteration) innerTraceLogger = lsst.log.getLogger("TRACE2.meas.algorithms.starSelector.catalogReader") innerTraceLogger.debug("Reading catalog %s", catalogName) 

In C++

 LOG_LOGGER traceLogger = LOG_GET(“TRACE2.meas.algorithms.starSelector“); LOGL_DEBUG(traceLogger, "On %d-th iteration of star selection", iteration); LOG_LOGGER innerTraceLogger = LOG_GET(“TRACE2.meas.algorithms.starSelector.catalogReader"); LOGL_DEBUG(innerTraceLogger, "Reading catalog %s", catalogName); 

Notice that all loggers in the hierarchy under a given component at a given trace level can be enabled easily using, e.g., "TRACE2.lsst.meas.algorithms.starSelector".

Getting a logger object and logging to that is preferred over logging using a string as the logger name. The latter can be used, for examples:

 lsst.log.log("meas.algorithms.starSelector.psfCandidate", lsst.log.WARN, "Failed to make a psfCandidate") lsst.log.log("TRACE4.meas.algorithms.starSelector.psfCandidate", lsst.log.DEBUG, "Making a psfCandidate from star %d=", starId) 

 LOGL_WARN("meas.algorithms.starSelector.psfCandidate", "Failed to make a psfCandidate") LOGL_DEBUG("TRACE4.meas.algorithms.starSelector.psfCandidate", "Making a psfCandidate from star %d=", starId) 

Developers are encouraged to insert log messages whenever and wherever they might be useful, with appropriate component names and levels.

#### Activity

No builds found.
Hsin-Fang Chiang created issue -
Hsin-Fang Chiang made changes -
Field Original Value New Value
Description This is a continuation of RFC-29, and narrower in scope.

I would like to switch from using {{pex.logging}} to {{lsst.log}} in {{pipe_base}}. After that, all pipeline tasks ({{task}} or {{cmdLineTask}}) can switch to use {{lsst.log}}. With lsst::log (via log4cxx implementation) there are 6 logging levels (trace, debug, info, warn, error, fatal). Arbitrary integers will no longer be accepted as levels. In this framework it’s encouraged to use loggers with distinct, hierarchical names instead of custom levels.

In Python, I plan to use the recently added lsst.log Python interface, not through Python logging and LogHandler. For example:
{code:java}
logger = lsst.log.Log.getLogger(“a.named.logger”)
logger.setLevel(Log.INFO)
logger.trace("This is TRACE")
logger.debug("This is DEBUG")
logger.info("This is INFO")
logger.warn("This is WARN")
logger.error("This is ERROR")
logger.fatal("Format %d %g %s", 3, 2.71828, "foo")
{code}

For most tasks, little modifications are needed in Python, as long as the task’s own {{log}} attribute is used for logging. The “logdebug” method in pex_logging will be replaced by “debug”, and most will be format changes.

For CmdLineTasks, the namespace logging level control from the command line {{--loglevel}} is preserved. The command line option {{--logdest}} will be removed; the file destination is instead set through log4cxx configuration. As I understand, many people use this option for its default formatter with dataId attached with each log record. After switching to {{lsst::log}}, dataId can still be included in the logs. I’m thinking two possibilities: an command line option to include the dataId in log records, or an command line option to more easily supply user’s own log4cxx configuration file in which format can be set.

In C++, macros are used, e.g.
{code:java}
LOG_LOGGER logger = LOG_GET(“a.named.logger“);
{code}

For logging one can use either one of the two macro families, the sprintf-based interface:
{code:java}
LOGL_INFO(logger, "This is INFO");
LOGL_WARN(“a.named.logger“, "This is WARN");
LOGL_FATAL(logger, "This is FATAL %d %.4f %s", 65, 42.123, "logging");
{code}

and the iostream-based interface:
{code:java}
LOGLS_INFO(logger, "Format " << 3 << " " << 2.71828 << " foo c++");
LOGLS_WARN(“a.named.logger“, "This warning is” << “scary”);
{code}

As I understand pex.logging {{Trace}}/{{Debug}} classes have been deprecated (RFC-86), despite still used in the codebase. Can I use log levels combined with named loggers to replace {{Trace}} with integer 1-11 (TRACE1-TRACE11)? What mapping between Trace and log level may be accepted? How about <3: INFO, 4-7: DEBUG, >8: TRACE? Or do all of them map to DEBUG or TRACE?

I plan to break down works into several tickets and get them reviewed separately, but merge them altogether when pipe_tasks is done. I will consult package masters as questions come up. I think some individual logging uses not chained to CmdLineTask (e.g. daf_butlerUtils, or hard-coded and non-configurable logging) can be merged independently, as they are not configurable through command line task interface anyway. Would that be fine?
This is a continuation of RFC-29, and narrower in scope.

I would like to switch from using {{pex.logging}} to {{lsst.log}} in {{pipe_base}}. After that, all pipeline tasks ({{task}} or {{cmdLineTask}}) can switch to use {{lsst.log}}. With lsst::log (via log4cxx implementation) there are 6 logging levels (trace, debug, info, warn, error, fatal). Arbitrary integers will no longer be accepted as levels. In this framework it’s encouraged to use loggers with distinct, hierarchical names instead of custom levels.

In Python, I plan to use the recently added lsst.log Python interface, not through Python logging and LogHandler. For example:
{code:java}
logger = lsst.log.Log.getLogger(“a.named.logger”)
logger.setLevel(Log.INFO)
logger.trace("This is TRACE")
logger.debug("This is DEBUG")
logger.info("This is INFO")
logger.warn("This is WARN")
logger.error("This is ERROR")
logger.fatal("Format %d %g %s", 3, 2.71828, "foo")
{code}

For most tasks, little modifications are needed in Python, as long as the task’s own {{log}} attribute is used for logging. The “logdebug” method in pex_logging will be replaced by “debug”, and most will be format changes.

For CmdLineTasks, the namespace logging level control from the command line {{-- loglevel}} is preserved. The command line option {{-- logdest}} will be removed; the file destination is instead set through log4cxx configuration. As I understand, many people use this option for its default formatter with dataId attached with each log record. After switching to {{lsst::log}}, dataId can still be included in the logs. I’m thinking two possibilities: an command line option to include the dataId in log records, or an command line option to more easily supply user’s own log4cxx configuration file in which format can be set.

In C++, macros are used, e.g.
{code:java}
LOG_LOGGER logger = LOG_GET(“a.named.logger“);
{code}

For logging one can use either one of the two macro families, the sprintf-based interface:
{code:java}
LOGL_INFO(logger, "This is INFO");
LOGL_WARN(“a.named.logger“, "This is WARN");
LOGL_FATAL(logger, "This is FATAL %d %.4f %s", 65, 42.123, "logging");
{code}

and the iostream-based interface:
{code:java}
LOGLS_INFO(logger, "Format " << 3 << " " << 2.71828 << " foo c++");
LOGLS_WARN(“a.named.logger“, "This warning is” << “scary”);
{code}

As I understand pex.logging {{Trace}}/{{Debug}} classes have been deprecated (RFC-86), despite still used in the codebase. Can I use log levels combined with named loggers to replace {{Trace}} with integer 1-11 (TRACE1-TRACE11)? What mapping between Trace and log level may be accepted? How about <3: INFO, 4-7: DEBUG, >8: TRACE? Or do all of them map to DEBUG or TRACE?

I plan to break down works into several tickets and get them reviewed separately, but merge them altogether when pipe_tasks is done. I will consult package masters as questions come up. I think some individual logging uses not chained to CmdLineTask (e.g. daf_butlerUtils, or hard-coded and non-configurable logging) can be merged independently, as they are not configurable through command line task interface anyway. Would that be fine?
Tim Jenness made changes -
 Link This issue relates to RFC-29 [ RFC-29 ]
Hide
Russell Owen added a comment - - edited

I think that using the log package is not acceptable until it offers more named trace levels – I think 9 would do it. Until then we cannot map existing code that uses Trace to the new logging package.

Given how many numeric levels are available in the new log package this should not be hard to fix. Once that is done I think it's a good idea to switch.

See DM-3532 for a detailed explanation.

Show
Russell Owen added a comment - - edited I think that using the log package is not acceptable until it offers more named trace levels – I think 9 would do it. Until then we cannot map existing code that uses Trace to the new logging package. Given how many numeric levels are available in the new log package this should not be hard to fix. Once that is done I think it's a good idea to switch. See DM-3532 for a detailed explanation.
Russell Owen made changes -
 Link This issue relates to DM-3532 [ DM-3532 ]
Hide
John Swinbank added a comment -

Two questions:

• Can you please say more about why this change is a good idea? All I can find is the statement on RFC-29 that being based on log4cxx will be "more flexible and maintainable" (although it doesn't seem as if log4cxx is particularly actively maintained upstream). Maybe the benefits are so manifold and obvious that it's not worth writing them down, but some examples might be fun.
• Can you please clarify the transition plan? I think from the last paragraph that you are intending to do all of the work yourself, including modifying all tasks and other code throughout the codebase to use the new system, and then merge them all as a big-bang. Have I understood that correctly?
Show
John Swinbank added a comment - Two questions: Can you please say more about why this change is a good idea? All I can find is the statement on RFC-29 that being based on log4cxx will be "more flexible and maintainable" (although it doesn't seem as if log4cxx is particularly actively maintained upstream ). Maybe the benefits are so manifold and obvious that it's not worth writing them down, but some examples might be fun. Can you please clarify the transition plan? I think from the last paragraph that you are intending to do all of the work yourself, including modifying all tasks and other code throughout the codebase to use the new system, and then merge them all as a big-bang. Have I understood that correctly?
Hide
Paul Price added a comment -

I'm all for this — it's been a long time in coming, and I'm glad to see it just popping up over the horizon.

If you want grunt effort on this, it might make a good project for the hack sessions at the project meeting next month.

Show
Paul Price added a comment - I'm all for this — it's been a long time in coming, and I'm glad to see it just popping up over the horizon. If you want grunt effort on this, it might make a good project for the hack sessions at the project meeting next month.
Hide
Jim Bosch added a comment -

I'm concerned about losing support for Debug/Trace in C++; it's still not clear to me whether or not the new log package provides a drop-in replacement, and if it does, I think we need some documentation demonstrating how to make the switch.

However, I'm not sure that stuff is so closely tied to the Python-side logging used in Tasks that it should block moving to the new log package there, especially if that allows us to add important features or fix bugs in logging. I tend to use the Debug/Trace stuff to debug low-level code that's run by unit tests, and (given that the code needs to be recompiled in many cases to be able print those messages) I don't think being able to control it from the command-line when running tasks is particularly important.

Show
Jim Bosch added a comment - I'm concerned about losing support for Debug/Trace in C++; it's still not clear to me whether or not the new log package provides a drop-in replacement, and if it does, I think we need some documentation demonstrating how to make the switch. However, I'm not sure that stuff is so closely tied to the Python-side logging used in Tasks that it should block moving to the new log package there, especially if that allows us to add important features or fix bugs in logging. I tend to use the Debug/Trace stuff to debug low-level code that's run by unit tests, and (given that the code needs to be recompiled in many cases to be able print those messages) I don't think being able to control it from the command-line when running tasks is particularly important.
Hide
Hsin-Fang Chiang added a comment -

Russell Owen may you please say more about what goal is achieved by having 9 DEBUG/TRACE levels? I don't understand why it cannot be replaced with fewer levels and loggers of different names, and distinct hierarchical names can be used for finer control and debug. Also if we need to add custom levels in log, I would like to understand the use case.

I find the named hierarchy and level inheritance from log4cxx pretty sweet (or at least, better handled compared to pex_logging in my own experience). That may also answer John Swinbank's first question. I am not claiming log4cxx is the best logging pacakge out there but the comparison is with pex_logging.

Show
Hsin-Fang Chiang added a comment - Russell Owen may you please say more about what goal is achieved by having 9 DEBUG/TRACE levels? I don't understand why it cannot be replaced with fewer levels and loggers of different names, and distinct hierarchical names can be used for finer control and debug. Also if we need to add custom levels in log , I would like to understand the use case. I find the named hierarchy and level inheritance from log4cxx pretty sweet (or at least, better handled compared to pex_logging in my own experience). That may also answer John Swinbank 's first question. I am not claiming log4cxx is the best logging pacakge out there but the comparison is with pex_logging.
Hide
Hsin-Fang Chiang added a comment -

To John Swinbank's second question on the transition plan, yes in Python-side up to pipe_tasks. So far I think the transision can be 3 stages: (1) individual logging not chained to CmdLineTask (e.g. daf_butlerUtils, or hard-coded and non-configurable logging), (2) Python-side logging for tasks/CmdLineTasks plus log in C++ , and (3) Debug/Trace in C++ . As Jim Bosch pointed out the Python-side and C++-side are not that closely tied, if they don't have to be used from the CmdLineTask command-line interface. I would prefer doing it in stages other than one big-bang (any objections?). And I will sure need help and will ask many questions, so warnings I will make noises and bug science pipeline developers, and may pass untypical cases to more knowledgeable developers. Time in the hack sessions at the project meeting next month would be great.

Show
Hsin-Fang Chiang added a comment - To John Swinbank 's second question on the transition plan, yes in Python-side up to pipe_tasks . So far I think the transision can be 3 stages: (1) individual logging not chained to CmdLineTask (e.g. daf_butlerUtils, or hard-coded and non-configurable logging), (2) Python-side logging for tasks/CmdLineTasks plus log in C++ , and (3) Debug/Trace in C++ . As Jim Bosch pointed out the Python-side and C++-side are not that closely tied, if they don't have to be used from the CmdLineTask command-line interface. I would prefer doing it in stages other than one big-bang (any objections?). And I will sure need help and will ask many questions, so warnings I will make noises and bug science pipeline developers, and may pass untypical cases to more knowledgeable developers. Time in the hack sessions at the project meeting next month would be great.
Hide
Russell Owen added a comment - - edited

Regarding named trace levels, I went into this in detail in DM-3532. This quote is relevant:
One issue with C++ code is how to handle trace levels used by pex_logging Trace, TTrace and Debug. In pex_logging these were integers 1-9 that mapped were negated to produce log levels. Thus smaller trace levels were more important, ranging from almost as important as INFO to not quite as unimportant as DEBUG. We could simply replace them with huge integers, e.g. 1->19000 to 9->11000, but that's clearly ugly. On a HipChat discussion Tim Jenness proposed adding constants TRACE1-TRACE9 to C++ and Python (and for use on the command line). That sounds best to me since it allows us to exactly replicate the old behavior in a simple way, while avoiding use of the huge integer values for logging levels. I also propose to eliminate the TRACE constant (existing code can use TRACE5 instead).

That ticket contains more discussion that is relevant, including a further comment by me near the end.

The existing named levels are simply not sufficient to map to the levels our current code is using. We need more named levels. To me this seems like a really minor change with a major benefit for use of trace and for converting old code.

Show
Russell Owen added a comment - - edited Regarding named trace levels, I went into this in detail in DM-3532 . This quote is relevant: One issue with C++ code is how to handle trace levels used by pex_logging Trace, TTrace and Debug. In pex_logging these were integers 1-9 that mapped were negated to produce log levels. Thus smaller trace levels were more important, ranging from almost as important as INFO to not quite as unimportant as DEBUG. We could simply replace them with huge integers, e.g. 1->19000 to 9->11000, but that's clearly ugly. On a HipChat discussion Tim Jenness proposed adding constants TRACE1-TRACE9 to C++ and Python (and for use on the command line). That sounds best to me since it allows us to exactly replicate the old behavior in a simple way, while avoiding use of the huge integer values for logging levels. I also propose to eliminate the TRACE constant (existing code can use TRACE5 instead). That ticket contains more discussion that is relevant, including a further comment by me near the end. The existing named levels are simply not sufficient to map to the levels our current code is using. We need more named levels. To me this seems like a really minor change with a major benefit for use of trace and for converting old code.
Hide
Hsin-Fang Chiang added a comment -

Okay, asked in a different way, why 9 levels, not 8, or 7, or 11? And instead of logging everything into the same logger, why not use multiple loggers of different names? I understand the proposed idea is not a one-to-one exact mapping to the old code, but I wonder whether they can be refactored using the new framework?

Show
Hsin-Fang Chiang added a comment - Okay, asked in a different way, why 9 levels, not 8, or 7, or 11? And instead of logging everything into the same logger, why not use multiple loggers of different names? I understand the proposed idea is not a one-to-one exact mapping to the old code, but I wonder whether they can be refactored using the new framework?
Hide
Hsin-Fang Chiang added a comment -

Maybe it would be helpful if someone may say more what these 9 levels mean. How does a developer use the 9 levels?

Show
Hsin-Fang Chiang added a comment - Maybe it would be helpful if someone may say more what these 9 levels mean. How does a developer use the 9 levels?
Hide
Russell Owen added a comment -

Here are some reasons:

• 9 levels is a very logical number because it naturally maps to names, e.g. TRACE1-TRACE9. If we have fewer levels we need to leave gaps or choose other kinds of names (which is fine with me if somebody wants to try to come up with names).
• We have existing code that uses those levels.
• In the trace system, log names are intended to be orthogonal to trace levels. That allows one to enable different trace levels for "foo.bar", "foo", and "something.else".

As to documentation, here is some: https://dev.lsstcorp.org/trac/wiki/TraceLevels

Show
Russell Owen added a comment - Here are some reasons: 9 levels is a very logical number because it naturally maps to names, e.g. TRACE1-TRACE9. If we have fewer levels we need to leave gaps or choose other kinds of names (which is fine with me if somebody wants to try to come up with names). We have existing code that uses those levels. In the trace system, log names are intended to be orthogonal to trace levels. That allows one to enable different trace levels for "foo.bar", "foo", and "something.else". As to documentation, here is some: https://dev.lsstcorp.org/trac/wiki/TraceLevels
Hide
Kian-Tat Lim added a comment -

In many ways, logger "foo.bar.trace1" level "TRACE" is isomorphic to logger "foo.bar" level "TRACE1". The only cogent argument I have heard for the latter is that often developers want "a little bit more information" without knowing exactly what that information is/will be (and not wanting to consult documentation or the code to determine what it is). Having incremental levels may be useful in that case. This is similar to "-v", "-vv", "-vvv" or similar structures.

I have not been convinced that just because we have done things this way in the past we should continue to do them that way. But we have been stuck on this for a long time. I think there are two alternatives: "give in" and create a bunch of new levels or rewrite a piece of code that uses TRACEn levels to demonstrate that it would work acceptably well without them.

Show
Kian-Tat Lim added a comment - In many ways, logger "foo.bar.trace1" level "TRACE" is isomorphic to logger "foo.bar" level "TRACE1". The only cogent argument I have heard for the latter is that often developers want "a little bit more information" without knowing exactly what that information is/will be (and not wanting to consult documentation or the code to determine what it is). Having incremental levels may be useful in that case. This is similar to "-v", "-vv", "-vvv" or similar structures. I have not been convinced that just because we have done things this way in the past we should continue to do them that way. But we have been stuck on this for a long time. I think there are two alternatives: "give in" and create a bunch of new levels or rewrite a piece of code that uses TRACEn levels to demonstrate that it would work acceptably well without them.
Hide
Russell Owen added a comment - - edited

Kian-Tat Lim I admit I don't understand your proposal for "foo.bar.trace1". That seems a strange conflation of "what to log" and "at what level". What if we have "foo.bar.baz" and "foo.bar" and want to set separate levels for each of them?

I don't really know to what extent tracing is proving useful in the stack. I admit I don't use it very often. But when I tried to convert existing code it really felt to me as if we either had to have a "reasonable number" of trace levels (3-10) or risk losing the ability to usefully display trace messages from existing code.

Perhaps coming up with specific names for trace levels would be better than using TRACE1-9. It would certainly be more self-documenting and eliminate the confusion of "smaller is more important". I am not against the idea, I just worry it may be more work than it is worth.

Show
Russell Owen added a comment - - edited Kian-Tat Lim I admit I don't understand your proposal for "foo.bar.trace1". That seems a strange conflation of "what to log" and "at what level". What if we have "foo.bar.baz" and "foo.bar" and want to set separate levels for each of them? I don't really know to what extent tracing is proving useful in the stack. I admit I don't use it very often. But when I tried to convert existing code it really felt to me as if we either had to have a "reasonable number" of trace levels (3-10) or risk losing the ability to usefully display trace messages from existing code. Perhaps coming up with specific names for trace levels would be better than using TRACE1-9. It would certainly be more self-documenting and eliminate the confusion of "smaller is more important". I am not against the idea, I just worry it may be more work than it is worth.
Hide
Hsin-Fang Chiang added a comment -

For using loggers of different names, originally I was thinking using foo.bar.componentA, foo.bar.componentB, foo.bar.componentB.baz, etc. when possible. (Btw "foo.bar.baz" and "foo.bar" can have different levels.) But yes refactoring existing codes to that scheme does require some knowledge of the code and understanding the intention of each log record, and it's not clear whether that's acceptable. So, how about the following:

For existing codes that use multiple Trace levels (TRACE1-TRACE11), finer levels can be implicitly defined, TRACEn levels and the behavior of Trace verbosity are preserved; see the u/hfc/RFC-203 branch in meas_algorithms for an example based on the lsst::log Doxygen document Fine-level Debugging Example. The test tests/cr.py can be run with any verbosity and the wanted Trace feature is supported.

This example also demonstrates that converting existing pex.logging TRACEn to lsst.log can be done step by step. In transition plan stage (3), each test can be handled separately.

Questions: how about codes that use only one or two Trace levels? Are the exact TRACEn level important in those cases? If only one level is used in a piece of code, can I replace them with TRACE directly? If there are two levels, can I replace them with DEBUG and TRACE?

Also I updated the descriptions based on comments so far.

Show
Hsin-Fang Chiang added a comment - For using loggers of different names, originally I was thinking using foo.bar.componentA, foo.bar.componentB, foo.bar.componentB.baz, etc. when possible. (Btw "foo.bar.baz" and "foo.bar" can have different levels.) But yes refactoring existing codes to that scheme does require some knowledge of the code and understanding the intention of each log record, and it's not clear whether that's acceptable. So, how about the following: For existing codes that use multiple Trace levels (TRACE1-TRACE11), finer levels can be implicitly defined, TRACEn levels and the behavior of Trace verbosity are preserved; see the u/hfc/ RFC-203 branch in meas_algorithms for an example based on the lsst::log Doxygen document Fine-level Debugging Example . The test tests/cr.py can be run with any verbosity and the wanted Trace feature is supported. This example also demonstrates that converting existing pex.logging TRACEn to lsst.log can be done step by step. In transition plan stage (3), each test can be handled separately. Questions: how about codes that use only one or two Trace levels? Are the exact TRACEn level important in those cases? If only one level is used in a piece of code, can I replace them with TRACE directly? If there are two levels, can I replace them with DEBUG and TRACE? Also I updated the descriptions based on comments so far.
Hsin-Fang Chiang made changes -
 Description This is a continuation of RFC-29, and narrower in scope. I would like to switch from using {{pex.logging}} to {{lsst.log}} in {{pipe_base}}. After that, all pipeline tasks ({{task}} or {{cmdLineTask}}) can switch to use {{lsst.log}}. With lsst::log (via log4cxx implementation) there are 6 logging levels (trace, debug, info, warn, error, fatal). Arbitrary integers will no longer be accepted as levels. In this framework it’s encouraged to use loggers with distinct, hierarchical names instead of custom levels. In Python, I plan to use the recently added lsst.log Python interface, not through Python logging and LogHandler. For example: {code:java}   logger = lsst.log.Log.getLogger(“a.named.logger”)   logger.setLevel(Log.INFO)   logger.trace("This is TRACE")   logger.debug("This is DEBUG")   logger.info("This is INFO")   logger.warn("This is WARN")   logger.error("This is ERROR")   logger.fatal("Format %d %g %s", 3, 2.71828, "foo") {code} For most tasks, little modifications are needed in Python, as long as the task’s own {{log}} attribute is used for logging. The “logdebug” method in pex_logging will be replaced by “debug”, and most will be format changes. For CmdLineTasks, the namespace logging level control from the command line {{-- loglevel}} is preserved. The command line option {{-- logdest}} will be removed; the file destination is instead set through log4cxx configuration. As I understand, many people use this option for its default formatter with dataId attached with each log record. After switching to {{lsst::log}}, dataId can still be included in the logs. I’m thinking two possibilities: an command line option to include the dataId in log records, or an command line option to more easily supply user’s own log4cxx configuration file in which format can be set. In C++, macros are used, e.g. {code:java}     LOG_LOGGER logger = LOG_GET(“a.named.logger“); {code} For logging one can use either one of the two macro families, the sprintf-based interface: {code:java}     LOGL_INFO(logger, "This is INFO");     LOGL_WARN(“a.named.logger“, "This is WARN");     LOGL_FATAL(logger, "This is FATAL %d %.4f %s", 65, 42.123, "logging"); {code} and the iostream-based interface: {code:java}     LOGLS_INFO(logger, "Format " << 3 << " " << 2.71828 << " foo c++");     LOGLS_WARN(“a.named.logger“, "This warning is” << “scary”); {code} As I understand pex.logging {{Trace}}/{{Debug}} classes have been deprecated (RFC-86), despite still used in the codebase. Can I use log levels combined with named loggers to replace {{Trace}} with integer 1-11 (TRACE1-TRACE11)? What mapping between Trace and log level may be accepted? How about <3: INFO, 4-7: DEBUG, >8: TRACE? Or do all of them map to DEBUG or TRACE? I plan to break down works into several tickets and get them reviewed separately, but merge them altogether when pipe_tasks is done. I will consult package masters as questions come up. I think some individual logging uses not chained to CmdLineTask (e.g. daf_butlerUtils, or hard-coded and non-configurable logging) can be merged independently, as they are not configurable through command line task interface anyway. Would that be fine? This is a continuation of RFC-29, and narrower in scope. I would like to switch from using {{pex.logging}} to {{lsst.log}} in {{pipe_base}}. After that, all pipeline tasks ({{task}} or {{cmdLineTask}}) can switch to use {{lsst.log}}. With lsst::log (via log4cxx implementation) there are 6 logging levels (trace, debug, info, warn, error, fatal). Arbitrary integers will no longer be accepted as levels. In this framework it’s encouraged to use loggers with distinct, hierarchical names instead of custom levels. In Python, I plan to use the recently added lsst.log Python interface, not through Python logging and LogHandler. For example: {code:java}   logger = lsst.log.Log.getLogger(“a.named.logger”)   logger.setLevel(Log.INFO)   logger.trace("This is TRACE")   logger.debug("This is DEBUG")   logger.info("This is INFO")   logger.warn("This is WARN")   logger.error("This is ERROR")   logger.fatal("Format %d %g %s", 3, 2.71828, "foo") {code} For most tasks, little modifications are needed in Python, as long as the task’s own {{log}} attribute is used for logging. The “logdebug” method in pex_logging will be replaced by “debug”, and most will be format changes. For CmdLineTasks, the namespace logging level control from the command line {{-- loglevel}} is preserved. The command line option {{-- logdest}} will be removed; the file destination is instead set through log4cxx configuration. As I understand, many people use this option for its default formatter with dataId attached with each log record. After switching to {{lsst::log}}, dataId can still be included in the logs. I’m thinking two possibilities: an command line option to include the dataId in log records, or an command line option to more easily supply user’s own log4cxx configuration file in which format can be set. In C++, macros are used, e.g. {code:java}     LOG_LOGGER logger = LOG_GET(“a.named.logger“); {code} For logging one can use either one of the two macro families, the sprintf-based interface: {code:java}     LOGL_INFO(logger, "This is INFO");     LOGL_WARN(“a.named.logger“, "This is WARN");     LOGL_FATAL(logger, "This is FATAL %d %.4f %s", 65, 42.123, "logging"); {code} and the iostream-based interface: {code:java}     LOGLS_INFO(logger, "Format " << 3 << " " << 2.71828 << " foo c++");     LOGLS_WARN(“a.named.logger“, "This warning is” << “scary”); {code} -As I understand pex.logging {{Trace}}/{{Debug}} classes have been deprecated (RFC-86), despite still used in the codebase. Can I use log levels combined with named loggers to replace {{Trace}} with integer 1-11 (TRACE1-TRACE11)? What mapping between Trace and log level may be accepted? How about <3: INFO, 4-7: DEBUG, >8: TRACE? Or do all of them map to DEBUG or TRACE?- The fine-level verbosity feature of pex.logging {{Trace}}/{{Debug}} TRACEn is supported. The transition plan is to do this in stages: (1) individual logging not chained to CmdLineTask (e.g. daf_butlerUtils, or hard-coded and non-configurable logging), as they are not configurable through the command line task interface anyway. (2) Python-side logging for tasks/CmdLineTasks plus log in C ++. (3) Debug/Trace in C ++. They are typically run with unit tests and each package can be handled separately.
Hide
Hsin-Fang Chiang added a comment -

As there have been no more comments or objections, I take it as we are all happy with the updated resolution. I'm marking this Adopted and will start the transition.

I doubt the transition can finish before the AHM next month. We may finish this together in the AHM hack sessions (thanks Paul!)

Show
Hsin-Fang Chiang added a comment - As there have been no more comments or objections, I take it as we are all happy with the updated resolution. I'm marking this Adopted and will start the transition. I doubt the transition can finish before the AHM next month. We may finish this together in the AHM hack sessions (thanks Paul!)
Hsin-Fang Chiang made changes -
 Resolution Done [ 10000 ] Status Proposed [ 10805 ] Adopted [ 10806 ]
Hsin-Fang Chiang made changes -
 Link This issue is triggering DM-6999 [ DM-6999 ]
Tim Jenness made changes -
 Link This issue is triggered by DM-6520 [ DM-6520 ]
Hide
Russell Owen added a comment -

I didn't see the response until now. I am very puzzled why all this machinery has to be added to compensate for the lack of named trace levels. Looking at the "fine debugging" example it feels like adding far more complexity and hard to understand code that would be better handled by having a few named trace levels. Given that the log package supports thousands of levels I remain totally utterly mystified as to why we can't just have our trace levels back.

@rhl I've been surprised not to hear more from you on this, as Trace is your invention. I stopped using it much when it got incorporated into pex_logging and maybe it's time to stop worrying about it, but if you think we should be using it, do you also feel we need more trace levels than TRACE (and possibly DEBUG)?

Show
Russell Owen added a comment - I didn't see the response until now. I am very puzzled why all this machinery has to be added to compensate for the lack of named trace levels. Looking at the "fine debugging" example it feels like adding far more complexity and hard to understand code that would be better handled by having a few named trace levels. Given that the log package supports thousands of levels I remain totally utterly mystified as to why we can't just have our trace levels back. @rhl I've been surprised not to hear more from you on this, as Trace is your invention. I stopped using it much when it got incorporated into pex_logging and maybe it's time to stop worrying about it, but if you think we should be using it, do you also feel we need more trace levels than TRACE (and possibly DEBUG)?
Hide
Russell Owen added a comment -

I want to add that I apologize for making such a fuss about this. However, I would like to better understand the objection to multiple trace levels. Is it that they are too confusing (since they are just numbers, and coming up with good names for those levels would be difficult, which is true), or that there are too many levels (which is true), or that they seem to solve the same problem as the other logging levels (I am less convinced of that, at least above DEBUG level), or is it that adding named trace levels to the log package is difficult for some reason (surely not that)?

I do think we have more trace levels than we actually need. I'm just not sure how far you can cut it down and still get reasonable behavior. Some trace levels (especially in C++ code) were for tight loops and might better be eliminated than changed to a more verbose level. Especially if that code is not compiled out by default.

Show
Russell Owen added a comment - I want to add that I apologize for making such a fuss about this. However, I would like to better understand the objection to multiple trace levels. Is it that they are too confusing (since they are just numbers, and coming up with good names for those levels would be difficult, which is true), or that there are too many levels (which is true), or that they seem to solve the same problem as the other logging levels (I am less convinced of that, at least above DEBUG level), or is it that adding named trace levels to the log package is difficult for some reason (surely not that)? I do think we have more trace levels than we actually need. I'm just not sure how far you can cut it down and still get reasonable behavior. Some trace levels (especially in C++ code) were for tight loops and might better be eliminated than changed to a more verbose level. Especially if that code is not compiled out by default.
Hide
Hsin-Fang Chiang added a comment - - edited

Robert Lupton and I talked about this this morning; my conclusion is that the proposed resolution about Debug/Trace logging is not acceptable, and defining fine-grained debugging levels implicitly behind logger names and getting verbosity control as in my example code is not sufficient. Formal, explicit fine-grained debug/trace levels need to be added. My previous understanding about the debugging/tracing need was incomplete. So I would like to remove any proposed changes about pex.logging Debug/Trace from this RFC and defer decisions on Debug/Trace logging to the future. Codes that use pex.logging Trace/Debug stay put with this RFC.

A question to all still: does the migration of the Python-side logging used in Tasks (mostly DM-6999) need to wait for the C++ Debug/Trace resolution? If the answer is no and developers can live without the Trace level control from the command line task interface (It should stay usable from unit tests), I would like to narrow down this RFC to do only Stage (1) and (2) I previously described. Do I understand correctly that Jim Bosch is fine with that, but Kian-Tat Lim doesn't like it. What do you all think?

I didn't know Trace wasn't in pex_logging in the first place. Is it possible to separate it out from pex_logging? (I should emphasize that I'm not proposing to do so in this RFC. I'm just curious and would like to learn more history.)

Show
Hsin-Fang Chiang added a comment - - edited Robert Lupton and I talked about this this morning; my conclusion is that the proposed resolution about Debug/Trace logging is not acceptable, and defining fine-grained debugging levels implicitly behind logger names and getting verbosity control as in my example code is not sufficient. Formal, explicit fine-grained debug/trace levels need to be added. My previous understanding about the debugging/tracing need was incomplete. So I would like to remove any proposed changes about pex.logging Debug/Trace from this RFC and defer decisions on Debug/Trace logging to the future. Codes that use pex.logging Trace/Debug stay put with this RFC. A question to all still: does the migration of the Python-side logging used in Tasks (mostly DM-6999 ) need to wait for the C++ Debug/Trace resolution? If the answer is no and developers can live without the Trace level control from the command line task interface (It should stay usable from unit tests), I would like to narrow down this RFC to do only Stage (1) and (2) I previously described. Do I understand correctly that Jim Bosch is fine with that, but Kian-Tat Lim doesn't like it. What do you all think? I didn't know Trace wasn't in pex_logging in the first place. Is it possible to separate it out from pex_logging? (I should emphasize that I'm not proposing to do so in this RFC. I'm just curious and would like to learn more history.)
Hsin-Fang Chiang made changes -
 Summary Use lsst::log in pipeline tasks and deprecate pex_logging Use lsst::log in pipeline tasks
Hide
Robert Lupton added a comment -

I discussed this with Jim Bosch, so it's not just me! I had thought that we'd accepted Russell's comments on this RFC; another consideration is that the indentation is/should be controlled by the trace level, so adding ".traceN" to the component isn't going to work.

Show
Robert Lupton added a comment - I discussed this with Jim Bosch, so it's not just me! I had thought that we'd accepted Russell's comments on this RFC; another consideration is that the indentation is/should be controlled by the trace level, so adding ".traceN" to the component isn't going to work.
Hide
Jim Bosch added a comment -

To clarify my position, I do agree with Robert Lupton and Russell Owen that we need more trace/debug levels than were in the original proposal (but less than we have now).

I also don't have a problem with switching the Task code to use the new log package now and migrating Trace/Debug a bit later, since I don't think we tend to enable Trace/Debug logging through the Task interface. But just because I'm not aware of us doing that doesn't mean that no one does.

Show
Jim Bosch added a comment - To clarify my position, I do agree with Robert Lupton and Russell Owen that we need more trace/debug levels than were in the original proposal (but less than we have now). I also don't have a problem with switching the Task code to use the new log package now and migrating Trace/Debug a bit later, since I don't think we tend to enable Trace/Debug logging through the Task interface. But just because I'm not aware of us doing that doesn't mean that no one does.
Hide
Russell Owen added a comment -

Hsin-Fang Chiang my understanding is that Trace is part of pex_logging.

I worry that using a mix of pex_logging and the new log package in the stack will make it harder to control verbosity, because it has to be controlled for two separate packages. I also worry that log messages from each package might easily end up in separate locations and with different formatting. If these are easily solved then perhaps it is not a problem, but my first thought is we should not attempt a mix of log packages, especially in one product. In any case, I certainly hope we can resolve how to handle trace soon, preferably on this RFC. Perhaps we can discuss it at the meeting next week and come to a conclusion? I know how frustrating it is to get bogged down in these discussions, and as one of people causing this roadblock, I'd like to do what I can to break it free.

Show
Russell Owen added a comment - Hsin-Fang Chiang my understanding is that Trace is part of pex_logging. I worry that using a mix of pex_logging and the new log package in the stack will make it harder to control verbosity, because it has to be controlled for two separate packages. I also worry that log messages from each package might easily end up in separate locations and with different formatting. If these are easily solved then perhaps it is not a problem, but my first thought is we should not attempt a mix of log packages, especially in one product. In any case, I certainly hope we can resolve how to handle trace soon, preferably on this RFC. Perhaps we can discuss it at the meeting next week and come to a conclusion? I know how frustrating it is to get bogged down in these discussions, and as one of people causing this roadblock, I'd like to do what I can to break it free.
Hide
Robert Lupton added a comment -

Once we decide on a new-style API I'm happy to see it implemented in the Tasks before we do this in C++. If this is a two-stage operation (first log, then trace) I'm OK with that – but I hope it'll go fast. The great log/trace confusion has led to people not using logging/tracing as much as they should.

Show
Robert Lupton added a comment - Once we decide on a new-style API I'm happy to see it implemented in the Tasks before we do this in C++. If this is a two-stage operation (first log, then trace) I'm OK with that – but I hope it'll go fast. The great log/trace confusion has led to people not using logging/tracing as much as they should.
Hide
Russell Owen added a comment -

Jim Bosch said: "I also don't have a problem with switching the Task code to use the new log package now and migrating Trace/Debug a bit later, since I don't think we tend to enable Trace/Debug logging through the Task interface. But just because I'm not aware of us doing that doesn't mean that no one does."

Just to expand on this for the sake of anyone who doesn't know: trace is presently handled as normal logging but with specific numeric levels, so enabling trace from the command line is just like enabling any other logging. But whether users are actually bothering to use those trace levels...that I don't know.

Also, trace and logging are both used in C++ and Python. I think at one time Trace in C++ could be compiled out below a level that one specified at build time, but (at risk of further derailing this RFC) I don't know if the current system supports that, nor if it is a requirement for the new system.

So...all that said, I'm happy to let others decide whether to convert the log messages first and the trace messages later or wait and do it all at once. Either way, I hope we can quickly decide what to do about trace.

Show
Russell Owen added a comment - Jim Bosch said: "I also don't have a problem with switching the Task code to use the new log package now and migrating Trace/Debug a bit later, since I don't think we tend to enable Trace/Debug logging through the Task interface. But just because I'm not aware of us doing that doesn't mean that no one does." Just to expand on this for the sake of anyone who doesn't know: trace is presently handled as normal logging but with specific numeric levels, so enabling trace from the command line is just like enabling any other logging. But whether users are actually bothering to use those trace levels...that I don't know. Also, trace and logging are both used in C++ and Python. I think at one time Trace in C++ could be compiled out below a level that one specified at build time, but (at risk of further derailing this RFC) I don't know if the current system supports that, nor if it is a requirement for the new system. So...all that said, I'm happy to let others decide whether to convert the log messages first and the trace messages later or wait and do it all at once. Either way, I hope we can quickly decide what to do about trace.
Hide
Hsin-Fang Chiang added a comment -

I made another example and would like to test the water here. In https://github.com/lsst/afw/compare/u/hfc/RFC-203
some afw convolution codes are converted to use lsst.log. In master, a logger named lsst.afw.convolve is used in four source files, and Trace levels 3,4,5,6 are used. If we can name loggers differently for different components, two levels are enough to distinguish them and control them separately in tests/convolve.py.

Russell Owen would you mind taking a look and see if something like that would be acceptable? In this case, is it important to have the same logger name for all convolution codes?

(ps. just to show this branch can build even though some codes use pex.logging and some use log: https://ci.lsst.codes/job/stack-os-matrix/compiler=gcc,label=centos-7,python=py2/14304//console )

Show
Hsin-Fang Chiang added a comment - I made another example and would like to test the water here. In https://github.com/lsst/afw/compare/u/hfc/RFC-203 some afw convolution codes are converted to use lsst.log . In master, a logger named lsst.afw.convolve is used in four source files, and Trace levels 3,4,5,6 are used. If we can name loggers differently for different components, two levels are enough to distinguish them and control them separately in tests/convolve.py . Russell Owen would you mind taking a look and see if something like that would be acceptable? In this case, is it important to have the same logger name for all convolution codes? (ps. just to show this branch can build even though some codes use pex.logging and some use log: https://ci.lsst.codes/job/stack-os-matrix/compiler=gcc,label=centos-7,python=py2/14304//console )
Hide
Hsin-Fang Chiang added a comment -

Also, trace and logging are both used in C++ and Python.

Yes you are absolutely right about that and thanks for clarifying. When I first wrote this RFC I didn't notice pex.logging Trace/Debug is used in the Python side too. They are rather rare so I didn't change the descriptions. For my transition plan stage (2), those pex.logging Trace/Debug in Python will stay. It is indeed more accurate to say my proposal is to convert the log messages first and the trace messages later.

Show
Hsin-Fang Chiang added a comment - Also, trace and logging are both used in C++ and Python. Yes you are absolutely right about that and thanks for clarifying. When I first wrote this RFC I didn't notice pex.logging Trace/Debug is used in the Python side too. They are rather rare so I didn't change the descriptions. For my transition plan stage (2), those pex.logging Trace/Debug in Python will stay. It is indeed more accurate to say my proposal is to convert the log messages first and the trace messages later.
Hide
Hsin-Fang Chiang added a comment -

If it can help, we can also have the log level control of both pex_logging and lsst.log through the command line task interface during the transition time. For example one command line argument for log messages based on lsst.log, and another one for trace messages based on pex.logging. (Was it similar before RFC-86?) Alternatively we can use a configuration file to control lsst.log log messages, and leave the --loglevel for pex.logging.Debug/Trace messages.

Show
Hsin-Fang Chiang added a comment - If it can help, we can also have the log level control of both pex_logging and lsst.log through the command line task interface during the transition time. For example one command line argument for log messages based on lsst.log, and another one for trace messages based on pex.logging. (Was it similar before RFC-86 ?) Alternatively we can use a configuration file to control lsst.log log messages, and leave the --loglevel for pex.logging.Debug/Trace messages.
Hide
Hsin-Fang Chiang added a comment -

Once we decide on a new-style API I'm happy to see it implemented in the Tasks before we do this in C++.

Robert Lupton may you please say more about what you meant by the new-style API? Not SuperTask, right...?

Show
Hsin-Fang Chiang added a comment - Once we decide on a new-style API I'm happy to see it implemented in the Tasks before we do this in C++. Robert Lupton may you please say more about what you meant by the new-style API? Not SuperTask, right...?
Hide
Robert Lupton added a comment -

No, not SuperTask, the interface to the logging/trace. Within a task something like:

self.log.warn("warning")
self.trace(1, "foo.bar", "cccc") # or self.log.trace1("foo.bar", "cccc") if you prefer

The warn message (or info, or fatal, or ...) gets the component as part of the output; the trace should prepend that to the component (so "pipe.tasks.isr.foo.bar" for that example).

I'm not sure if debug is useful in addition to trace; I might make it a synonym for trace(0, ...)

In C++, the trace calls can be compiled out at levels >= N (leaving no cost behind, so they can go in inner loops — rather like asserts and NDEBUG)

Show
Robert Lupton added a comment - No, not SuperTask, the interface to the logging/trace. Within a task something like: self.log.warn("warning") self.trace(1, "foo.bar", "cccc") # or self.log.trace1("foo.bar", "cccc") if you prefer The warn message (or info, or fatal, or ...) gets the component as part of the output; the trace should prepend that to the component (so "pipe.tasks.isr.foo.bar" for that example). I'm not sure if debug is useful in addition to trace; I might make it a synonym for trace(0, ...) In C++, the trace calls can be compiled out at levels >= N (leaving no cost behind, so they can go in inner loops — rather like asserts and NDEBUG)
Hide
Robert Lupton added a comment -

I looked at the example https://github.com/lsst/afw/compare/u/hfc/RFC-203 and don't see how you kept the functionality of the multiple trace levels.

Show
Robert Lupton added a comment - I looked at the example https://github.com/lsst/afw/compare/u/hfc/RFC-203 and don't see how you kept the functionality of the multiple trace levels.
Hide
Hsin-Fang Chiang added a comment -

Thank you Robert Lupton, that's a good guideline for Task's logging/tracing interface.

In the afw example I am wondering if we may use multiple loggers to reduce the number of trace levels needed. Take KernelImagesForRegion.cc where TRACE6 is used for logger named lsst.afw.math.convolve. To see those messages in KernelImagesForRegion.cc the verbosity needs to be >= 6 for the logger lsst.afw.math.convolve. If instead we use a more specific logger named lsst.afw.math.convolve.KernelImagesForRegion, we turn those messages on and off by setting that specific logger directly.

So it's not adding trace levels but having multiple loggers and allowing separate controls over them. If changing (appending) logger names like this is not acceptable, maybe more trace levels would be needed than otherwise?

Show
Hsin-Fang Chiang added a comment - Thank you Robert Lupton , that's a good guideline for Task's logging/tracing interface. In the afw example I am wondering if we may use multiple loggers to reduce the number of trace levels needed. Take KernelImagesForRegion.cc where TRACE6 is used for logger named lsst.afw.math.convolve . To see those messages in KernelImagesForRegion.cc the verbosity needs to be >= 6 for the logger lsst.afw.math.convolve . If instead we use a more specific logger named lsst.afw.math.convolve.KernelImagesForRegion , we turn those messages on and off by setting that specific logger directly. So it's not adding trace levels but having multiple loggers and allowing separate controls over them. If changing (appending) logger names like this is not acceptable, maybe more trace levels would be needed than otherwise?
Hide
Kian-Tat Lim added a comment -

Here is a draft of what I would like added to the documentation for lsst.log. This is a final decision on what we will include and how we will use the interface at this time. If this is shown to be inadequate, we will modify the package later, but we need to make progress on this and not let the perfect be the enemy of the good.

### Use of logger names and log levels

Logger names should generally start with the (fully-qualified) name of the module/file containing the logger. If the logger is saved as a variable in a class, it is often appropriate to name the logger after the class. Logger names use "." as component separators, not "::", even in C++.

Example:

 log = lsst.log.Log.getLogger("lsst.meas.algorithms.starSelector") log.info("This is information about the star selector algorithm execution. %f", 3.14) 

Log levels should be used as follows:

• ERROR: for errors that prevent further execution of the component
• WARNING: for conditions that may indicate a problem but that allow continued execution
• INFO: for information that is of interest during normal execution including production
• DEBUG: for information that is of interest to developers but not of interest during production
• TRACE: for detailed information when debugging

For loggers used at DEBUG and TRACE levels, it is often desirable to add further components to the logger name; these would indicate which specific portion of the code or algorithm that the logged information pertains to.

 debugLogger = lsst.log.Log.getLogger("lsst.meas.algorithms.starSelector.catalogReader") debugLogger.debug("Catalog reading took %f seconds", finish - start) 

The idea here is that the author understands the intent of the log message and can simply name it, without worrying about its relative importance or priority compared with other log messages in the same component. A person debugging the code would typically be looking at it and so would be able to determine the appropriate name to enable. The hierarchy allows all components to be easily enabled or disabled together.

As an alternative for TRACE loggers where there are different messages at increasing levels of verbosity but no specific component names that would be appropriate, or where increasing verbosity spans a number of levels of the component hierarchy, logger names can be prefixed with "TRACE1", "TRACE2", "TRACE3", etc. to indicate increasing verbosity.

 traceLogger = lsst.log.getLogger("TRACE2.lsst.meas.algorithms.starSelector") traceLogger.trace("On %d-th iteration of star selection", iteration) innerTraceLogger = lsst.log.getLogger("TRACE2.lsst.meas.algorithms.starSelector.catalogReader") innerTraceLogger.trace("Reading catalog %s", catalogName) lsst.log.log("TRACE4.lsst.meas.algorithms.starSelector.catalogReader", lsst.log.TRACE, "Reading %d-th row of catalog", rowNumber) 

Notice that all loggers in the hierarchy under a given component at a given trace level can be enabled easily using, e.g., "TRACE2.lsst.meas.algorithms.starSelector".

Developers are encouraged to insert log messages whenever and wherever they might be useful, with appropriate component names and levels.

Syntactic sugar in command line argument parsers or lsst.log methods could be used to 1) enable all TRACEi levels with i less than n when TRACEn is enabled and 2) synthesize the appropriate prefixed logger name when a logger name and TRACEn level is given in an lsst.log.log() call. We will not write such syntactic sugar until there is a clearly demonstrated need.

Show
Kian-Tat Lim added a comment - Here is a draft of what I would like added to the documentation for lsst.log . This is a final decision on what we will include and how we will use the interface at this time. If this is shown to be inadequate, we will modify the package later, but we need to make progress on this and not let the perfect be the enemy of the good. Use of logger names and log levels Logger names should generally start with the (fully-qualified) name of the module/file containing the logger. If the logger is saved as a variable in a class, it is often appropriate to name the logger after the class. Logger names use "." as component separators, not "::", even in C++. Example: log = lsst.log.Log.getLogger( "lsst.meas.algorithms.starSelector" ) log.info( "This is information about the star selector algorithm execution. %f" , 3.14 ) Log levels should be used as follows: ERROR: for errors that prevent further execution of the component WARNING: for conditions that may indicate a problem but that allow continued execution INFO: for information that is of interest during normal execution including production DEBUG: for information that is of interest to developers but not of interest during production TRACE: for detailed information when debugging For loggers used at DEBUG and TRACE levels, it is often desirable to add further components to the logger name; these would indicate which specific portion of the code or algorithm that the logged information pertains to. debugLogger = lsst.log.Log.getLogger( "lsst.meas.algorithms.starSelector.catalogReader" ) debugLogger.debug( "Catalog reading took %f seconds" , finish - start) The idea here is that the author understands the intent of the log message and can simply name it, without worrying about its relative importance or priority compared with other log messages in the same component. A person debugging the code would typically be looking at it and so would be able to determine the appropriate name to enable. The hierarchy allows all components to be easily enabled or disabled together. As an alternative for TRACE loggers where there are different messages at increasing levels of verbosity but no specific component names that would be appropriate, or where increasing verbosity spans a number of levels of the component hierarchy, logger names can be prefixed with "TRACE1", "TRACE2", "TRACE3", etc. to indicate increasing verbosity. traceLogger = lsst.log.getLogger( "TRACE2.lsst.meas.algorithms.starSelector" ) traceLogger.trace( "On %d-th iteration of star selection" , iteration) innerTraceLogger = lsst.log.getLogger( "TRACE2.lsst.meas.algorithms.starSelector.catalogReader" ) innerTraceLogger.trace( "Reading catalog %s" , catalogName) lsst.log.log( "TRACE4.lsst.meas.algorithms.starSelector.catalogReader" , lsst.log.TRACE, "Reading %d-th row of catalog" , rowNumber) Notice that all loggers in the hierarchy under a given component at a given trace level can be enabled easily using, e.g., "TRACE2.lsst.meas.algorithms.starSelector". Developers are encouraged to insert log messages whenever and wherever they might be useful, with appropriate component names and levels. Syntactic sugar in command line argument parsers or lsst.log methods could be used to 1) enable all TRACEi levels with i less than n when TRACEn is enabled and 2) synthesize the appropriate prefixed logger name when a logger name and TRACEn level is given in an lsst.log.log() call. We will not write such syntactic sugar until there is a clearly demonstrated need.
Hide
Russell Owen added a comment -

Will any levels be completely compiled out for trace messages in C++ code? If so, I am guessing that would be TRACE and not DEBUG. If so, then trace messages will usually be unavailable without extra work.

Is Robert Lupton is comfortable with that? I can't say I'm thrilled by it, but perhaps we should not rely heavily on trace messages being compiled out.

Show
Russell Owen added a comment - Will any levels be completely compiled out for trace messages in C++ code? If so, I am guessing that would be TRACE and not DEBUG . If so, then trace messages will usually be unavailable without extra work. Is Robert Lupton is comfortable with that? I can't say I'm thrilled by it, but perhaps we should not rely heavily on trace messages being compiled out.
Hide
Hsin-Fang Chiang added a comment -

As I understand DEBUG/TRACE logging is optimized at compiling with log4cxx but not completely compiled out. Currently the pex_logging Trace is not compiled out either. With lsst.log, macros are used in C++, and more may be done in the future if needed.

Show
Hsin-Fang Chiang added a comment - As I understand DEBUG/TRACE logging is optimized at compiling with log4cxx but not completely compiled out. Currently the pex_logging Trace is not compiled out either. With lsst.log, macros are used in C++, and more may be done in the future if needed.
Hide
Paul Price added a comment -

If we're going to have a trace capability, then it absolutely must compile out. If it doesn't, there's no point in using it instead of printf. I don't think it should be hard to do — just add #ifndef NDEBUG into each of the trace macros. But maybe there aren't trace-specific macros?

Show
Paul Price added a comment - If we're going to have a trace capability, then it absolutely must compile out. If it doesn't, there's no point in using it instead of printf . I don't think it should be hard to do — just add #ifndef NDEBUG into each of the trace macros. But maybe there aren't trace-specific macros?
Hide
Robert Lupton added a comment -

I thought that we'd agreed to support

lsst.log.debug("TRACE4.lsst.meas.algorithms.starSelector.catalogReader", "Reading %d-th row of catalog", rowNumber)

rather than

lsst.log.log("TRACE4.lsst.meas.algorithms.starSelector.catalogReader", lsst.log.TRACE, "Reading %d-th row of catalog", rowNumber)

I really want to make this easy for developers to use, and the different interface (sometimes call .log() with an extra argument, sometimes call .trace()) is confusing. In task code, that'd be self.debug rather than lsst.log.debug (yes,
that's a namespace pollution but removing the extra level of indirection is worth it).

For trace names, I'd permit the omission of the leading "lsst."

I thought that we'd agreed to merge debug and trace (with debug == trace0), so

lsst.log.debug("lsst.meas.algorithms.starSelector.catalogReader", "Reading %d-th row of catalog", rowNumber)

and

lsst.log.debug("TRACE0.lsst.meas.algorithms.starSelector.catalogReader", "Reading %d-th row of catalog", rowNumber)

are equivalent, and log.trace() doesn't exist.

Re compiling out: the old code compiled out trace above a defined level (more fine-grained than Paul's NDEBUG comment). I think that an acceptable overhead was demonstrated for the trace implementation based on log4cxx, but we should check (and are there regression tests on the timing? Probably not yet). To answer Paul: there are lots of reasons to use debug rather than printf, especially if it's just as easy to use, which it needs to be; see above.

Show
Robert Lupton added a comment - I thought that we'd agreed to support lsst.log.debug("TRACE4.lsst.meas.algorithms.starSelector.catalogReader", "Reading %d-th row of catalog", rowNumber) rather than lsst.log.log("TRACE4.lsst.meas.algorithms.starSelector.catalogReader", lsst.log.TRACE, "Reading %d-th row of catalog", rowNumber) I really want to make this easy for developers to use, and the different interface (sometimes call .log() with an extra argument, sometimes call .trace()) is confusing. In task code, that'd be self.debug rather than lsst.log.debug (yes, that's a namespace pollution but removing the extra level of indirection is worth it). For trace names, I'd permit the omission of the leading "lsst." I thought that we'd agreed to merge debug and trace (with debug == trace0), so lsst.log.debug("lsst.meas.algorithms.starSelector.catalogReader", "Reading %d-th row of catalog", rowNumber) and lsst.log.debug("TRACE0.lsst.meas.algorithms.starSelector.catalogReader", "Reading %d-th row of catalog", rowNumber) are equivalent, and log.trace() doesn't exist. Re compiling out: the old code compiled out trace above a defined level (more fine-grained than Paul's NDEBUG comment). I think that an acceptable overhead was demonstrated for the trace implementation based on log4cxx, but we should check (and are there regression tests on the timing? Probably not yet). To answer Paul: there are lots of reasons to use debug rather than printf, especially if it's just as easy to use, which it needs to be; see above.
Hide
Hsin-Fang Chiang added a comment -

I did some minor edits with Kian-Tat Lim's draft:

• Omit the leading "lsst." in the logger names.
• Use log.debug(), and drop log.trace(), for multi-level verbosity in tracing.

How about this?

### Use of logger names and log levels

Logger names should generally start with the (fully-qualified) name of the module/file containing the logger. If the logger is saved as a variable in a class, it is often appropriate to name the logger after the class. Logger names use "." as component separators, not "::", even in C++.

Example:

 log = lsst.log.Log.getLogger("meas.algorithms.starSelector") log.info("This is information about the star selector algorithm execution. %f", 3.14) 

Log levels should be used as follows:

• ERROR: for errors that prevent further execution of the component
• WARNING: for conditions that may indicate a problem but that allow continued execution
• INFO: for information that is of interest during normal execution including production
• DEBUG: for information that is of interest to developers but not of interest during production
• TRACE: for detailed information when debugging

For loggers used at DEBUG and TRACE levels, it is often desirable to add further components to the logger name; these would indicate which specific portion of the code or algorithm that the logged information pertains to.

 debugLogger = lsst.log.Log.getLogger("meas.algorithms.starSelector.catalogReader") debugLogger.debug("Catalog reading took %f seconds", finish - start) 

The idea here is that the author understands the intent of the log message and can simply name it, without worrying about its relative importance or priority compared with other log messages in the same component. A person debugging the code would typically be looking at it and so would be able to determine the appropriate name to enable. The hierarchy allows all components to be easily enabled or disabled together.

As an alternative for TRACE loggers where there are different messages at increasing levels of verbosity but no specific component names that would be appropriate, or where increasing verbosity spans a number of levels of the component hierarchy, logger names can be prefixed with "TRACE1", "TRACE2", "TRACE3", etc. to indicate increasing verbosity.

 traceLogger = lsst.log.Log.getLogger("TRACE2.meas.algorithms.starSelector") traceLogger.debug("On %d-th iteration of star selection", iteration) innerTraceLogger = lsst.log.getLogger("TRACE2.meas.algorithms.starSelector.catalogReader") innerTraceLogger.debug("Reading catalog %s", catalogName) 

In C++

 LOG_LOGGER traceLogger = LOG_GET(“TRACE2.meas.algorithms.starSelector“); LOGL_DEBUG(traceLogger, "On %d-th iteration of star selection", iteration); LOG_LOGGER innerTraceLogger = LOG_GET(“TRACE2.meas.algorithms.starSelector.catalogReader"); LOGL_DEBUG(innerTraceLogger, "Reading catalog %s", catalogName); 

Notice that all loggers in the hierarchy under a given component at a given trace level can be enabled easily using, e.g., "TRACE2.lsst.meas.algorithms.starSelector".

Developers are encouraged to insert log messages whenever and wherever they might be useful, with appropriate component names and levels.

About the interface confusion, from reading DM-3532 and RFC-29 I thought getting a logger object and logging to that is preferred over logging using a string as the logger name. The latter can be used, maybe in rarer occasions (not sure if it's used at all in Python side?). Examples:

 lsst.log.log("meas.algorithms.starSelector.psfCandidate", lsst.log.WARN, "Failed to make a psfCandidate") lsst.log.log("TRACE4.meas.algorithms.starSelector.psfCandidate", lsst.log.DEBUG, "Making a psfCandidate from star %d=", starId) 

 LOGL_WARN("meas.algorithms.starSelector.psfCandidate", "Failed to make a psfCandidate") LOGL_DEBUG("TRACE4.meas.algorithms.starSelector.psfCandidate", "Making a psfCandidate from star %d=", starId) 

In the Task code, self.log is the logger, so it looks like this:

 self.log.info("Not applying color terms because %s", applyCTReason) 

Show
Hsin-Fang Chiang added a comment - I did some minor edits with Kian-Tat Lim 's draft: Omit the leading "lsst." in the logger names. Use log.debug(), and drop log.trace(), for multi-level verbosity in tracing. How about this? Use of logger names and log levels Logger names should generally start with the (fully-qualified) name of the module/file containing the logger. If the logger is saved as a variable in a class, it is often appropriate to name the logger after the class. Logger names use "." as component separators, not "::", even in C++. Example: log = lsst.log.Log.getLogger( "meas.algorithms.starSelector" ) log.info( "This is information about the star selector algorithm execution. %f" , 3.14 ) Log levels should be used as follows: ERROR: for errors that prevent further execution of the component WARNING: for conditions that may indicate a problem but that allow continued execution INFO: for information that is of interest during normal execution including production DEBUG: for information that is of interest to developers but not of interest during production TRACE: for detailed information when debugging For loggers used at DEBUG and TRACE levels, it is often desirable to add further components to the logger name; these would indicate which specific portion of the code or algorithm that the logged information pertains to. debugLogger = lsst.log.Log.getLogger( "meas.algorithms.starSelector.catalogReader" ) debugLogger.debug( "Catalog reading took %f seconds" , finish - start) The idea here is that the author understands the intent of the log message and can simply name it, without worrying about its relative importance or priority compared with other log messages in the same component. A person debugging the code would typically be looking at it and so would be able to determine the appropriate name to enable. The hierarchy allows all components to be easily enabled or disabled together. As an alternative for TRACE loggers where there are different messages at increasing levels of verbosity but no specific component names that would be appropriate, or where increasing verbosity spans a number of levels of the component hierarchy, logger names can be prefixed with "TRACE1", "TRACE2", "TRACE3", etc. to indicate increasing verbosity. traceLogger = lsst.log.Log.getLogger( "TRACE2.meas.algorithms.starSelector" ) traceLogger.debug( "On %d-th iteration of star selection" , iteration) innerTraceLogger = lsst.log.getLogger( "TRACE2.meas.algorithms.starSelector.catalogReader" ) innerTraceLogger.debug( "Reading catalog %s" , catalogName) In C++ LOG_LOGGER traceLogger = LOG_GET(“TRACE2.meas.algorithms.starSelector“); LOGL_DEBUG(traceLogger, "On %d-th iteration of star selection" , iteration); LOG_LOGGER innerTraceLogger = LOG_GET(“TRACE2.meas.algorithms.starSelector.catalogReader"); LOGL_DEBUG(innerTraceLogger, "Reading catalog %s" , catalogName); Notice that all loggers in the hierarchy under a given component at a given trace level can be enabled easily using, e.g., "TRACE2.lsst.meas.algorithms.starSelector". Developers are encouraged to insert log messages whenever and wherever they might be useful, with appropriate component names and levels. About the interface confusion, from reading DM-3532 and RFC-29 I thought getting a logger object and logging to that is preferred over logging using a string as the logger name. The latter can be used, maybe in rarer occasions (not sure if it's used at all in Python side?). Examples: lsst.log.log( "meas.algorithms.starSelector.psfCandidate" , lsst.log.WARN, "Failed to make a psfCandidate" ) lsst.log.log( "TRACE4.meas.algorithms.starSelector.psfCandidate" , lsst.log.DEBUG, "Making a psfCandidate from star %d=" , starId) LOGL_WARN( "meas.algorithms.starSelector.psfCandidate" , "Failed to make a psfCandidate" ) LOGL_DEBUG( "TRACE4.meas.algorithms.starSelector.psfCandidate" , "Making a psfCandidate from star %d=" , starId) In the Task code, self.log is the logger, so it looks like this: self .log.info( "Not applying color terms because %s" , applyCTReason)
Hide
Russell Owen added a comment -

This seems clear, and in agreement with @ktl's ruling.

Hsin-Fang Chiang could you please clarify this answer "As I understand DEBUG/TRACE logging is optimized at compiling with log4cxx but not completely compiled out. Currently the pex_logging Trace is not compiled out either. With lsst.log, macros are used in C+, and more may be done in the future if needed."? What do you mean by "is optimized at compiling"? If DEBUG and TRACE levels are optimized treated specially then are they available in normal use, or do we have to compile specially in order to see C+ DEBUG and TRACE level messages?

If we need do need to use special compilation, I suggest it would be better to only reserve that for TRACE, not DEBUG.

Show
Russell Owen added a comment - This seems clear, and in agreement with @ktl's ruling. Hsin-Fang Chiang could you please clarify this answer "As I understand DEBUG/TRACE logging is optimized at compiling with log4cxx but not completely compiled out. Currently the pex_logging Trace is not compiled out either. With lsst.log, macros are used in C+ , and more may be done in the future if needed."? What do you mean by "is optimized at compiling"? If DEBUG and TRACE levels are optimized treated specially then are they available in normal use, or do we have to compile specially in order to see C + DEBUG and TRACE level messages? If we need do need to use special compilation, I suggest it would be better to only reserve that for TRACE, not DEBUG.
Hide
Hsin-Fang Chiang added a comment -

Both DEBUG and TRACE c++ macros in lsst.log use log4cxx's LOG4CXX_UNLIKELY that provides optimization hint to the compiler to optimize for the expression being false.. So I don't think stack-users need to compile differently to see either DEBUG or TRACE messages.

Show
Hsin-Fang Chiang added a comment - Both DEBUG and TRACE c++ macros in lsst.log use log4cxx's LOG4CXX_UNLIKELY that provides optimization hint to the compiler to optimize for the expression being false. . So I don't think stack-users need to compile differently to see either DEBUG or TRACE messages.
Hide
Russell Owen added a comment -

That sounds very useful. Thanks Hsin-Fang Chiang

Show
Russell Owen added a comment - That sounds very useful. Thanks Hsin-Fang Chiang
Hide
Hsin-Fang Chiang added a comment -

The first batch of changes is almost ready in DM-6999 and DM-6986, and I'm aiming to merge them this week, pending CI and some more manual checks. If there are any concerns about this, please shout! (p.s. The first batch does not remove pex.logging tracing use. It changes pipe_base and some python codes directly related to that.)

Show
Hsin-Fang Chiang added a comment - The first batch of changes is almost ready in DM-6999 and DM-6986 , and I'm aiming to merge them this week, pending CI and some more manual checks. If there are any concerns about this, please shout! (p.s. The first batch does not remove pex.logging tracing use. It changes pipe_base and some python codes directly related to that.)
Hide
Hsin-Fang Chiang added a comment -

I am moving the to-be-added documentations to Descriptions of this RFC, so it will be clearer and more easily tractable.

Show
Hsin-Fang Chiang added a comment - I am moving the to-be-added documentations to Descriptions of this RFC, so it will be clearer and more easily tractable.
Hsin-Fang Chiang made changes -
 Description This is a continuation of RFC-29, and narrower in scope. I would like to switch from using {{pex.logging}} to {{lsst.log}} in {{pipe_base}}. After that, all pipeline tasks ({{task}} or {{cmdLineTask}}) can switch to use {{lsst.log}}. With lsst::log (via log4cxx implementation) there are 6 logging levels (trace, debug, info, warn, error, fatal). Arbitrary integers will no longer be accepted as levels. In this framework it’s encouraged to use loggers with distinct, hierarchical names instead of custom levels. In Python, I plan to use the recently added lsst.log Python interface, not through Python logging and LogHandler. For example: {code:java}   logger = lsst.log.Log.getLogger(“a.named.logger”)   logger.setLevel(Log.INFO)   logger.trace("This is TRACE")   logger.debug("This is DEBUG")   logger.info("This is INFO")   logger.warn("This is WARN")   logger.error("This is ERROR")   logger.fatal("Format %d %g %s", 3, 2.71828, "foo") {code} For most tasks, little modifications are needed in Python, as long as the task’s own {{log}} attribute is used for logging. The “logdebug” method in pex_logging will be replaced by “debug”, and most will be format changes. For CmdLineTasks, the namespace logging level control from the command line {{-- loglevel}} is preserved. The command line option {{-- logdest}} will be removed; the file destination is instead set through log4cxx configuration. As I understand, many people use this option for its default formatter with dataId attached with each log record. After switching to {{lsst::log}}, dataId can still be included in the logs. I’m thinking two possibilities: an command line option to include the dataId in log records, or an command line option to more easily supply user’s own log4cxx configuration file in which format can be set. In C++, macros are used, e.g. {code:java}     LOG_LOGGER logger = LOG_GET(“a.named.logger“); {code} For logging one can use either one of the two macro families, the sprintf-based interface: {code:java}     LOGL_INFO(logger, "This is INFO");     LOGL_WARN(“a.named.logger“, "This is WARN");     LOGL_FATAL(logger, "This is FATAL %d %.4f %s", 65, 42.123, "logging"); {code} and the iostream-based interface: {code:java}     LOGLS_INFO(logger, "Format " << 3 << " " << 2.71828 << " foo c++");     LOGLS_WARN(“a.named.logger“, "This warning is” << “scary”); {code} -As I understand pex.logging {{Trace}}/{{Debug}} classes have been deprecated (RFC-86), despite still used in the codebase. Can I use log levels combined with named loggers to replace {{Trace}} with integer 1-11 (TRACE1-TRACE11)? What mapping between Trace and log level may be accepted? How about <3: INFO, 4-7: DEBUG, >8: TRACE? Or do all of them map to DEBUG or TRACE?- The fine-level verbosity feature of pex.logging {{Trace}}/{{Debug}} TRACEn is supported. The transition plan is to do this in stages: (1) individual logging not chained to CmdLineTask (e.g. daf_butlerUtils, or hard-coded and non-configurable logging), as they are not configurable through the command line task interface anyway. (2) Python-side logging for tasks/CmdLineTasks plus log in C ++. (3) Debug/Trace in C ++. They are typically run with unit tests and each package can be handled separately. This is a continuation of RFC-29, and narrower in scope. I would like to switch from using {{pex.logging}} to {{lsst.log}} in {{pipe_base}}. After that, all pipeline tasks ({{task}} or {{cmdLineTask}}) can switch to use {{lsst.log}}. With lsst::log (via log4cxx implementation) there are 6 logging levels (trace, debug, info, warn, error, fatal). Arbitrary integers will no longer be accepted as levels. In this framework it’s encouraged to use loggers with distinct, hierarchical names instead of custom levels. In Python, I plan to use the recently added lsst.log Python interface, not through Python logging and LogHandler. For example: {code:java}   logger = lsst.log.Log.getLogger(“a.named.logger”)   logger.setLevel(Log.INFO)   logger.trace("This is TRACE")   logger.debug("This is DEBUG")   logger.info("This is INFO")   logger.warn("This is WARN")   logger.error("This is ERROR")   logger.fatal("Format %d %g %s", 3, 2.71828, "foo") {code} For most tasks, little modifications are needed in Python, as long as the task’s own {{log}} attribute is used for logging. The “logdebug” method in pex_logging will be replaced by “debug”, and most will be format changes. For CmdLineTasks, the namespace logging level control from the command line {{-- loglevel}} is preserved. The command line option {{-- logdest}} will be removed; the file destination is instead set through log4cxx configuration. As I understand, many people use this option for its default formatter with dataId attached with each log record. After switching to {{lsst::log}}, dataId can still be included in the logs. I’m thinking two possibilities: an command line option to include the dataId in log records, or an command line option to more easily supply user’s own log4cxx configuration file in which format can be set. In C++, macros are used, e.g. {code:java}     LOG_LOGGER logger = LOG_GET(“a.named.logger“); {code} For logging one can use either one of the two macro families, the sprintf-based interface: {code:java}     LOGL_INFO(logger, "This is INFO");     LOGL_WARN(“a.named.logger“, "This is WARN");     LOGL_FATAL(logger, "This is FATAL %d %.4f %s", 65, 42.123, "logging"); {code} and the iostream-based interface: {code:java}     LOGLS_INFO(logger, "Format " << 3 << " " << 2.71828 << " foo c++");     LOGLS_WARN(“a.named.logger“, "This warning is” << “scary”); {code} The transition plan is to do this in stages: (1) individual logging not chained to CmdLineTask (e.g. daf_butlerUtils, or hard-coded and non-configurable logging), as they are not configurable through the command line task interface anyway. (2) Python-side logging for tasks/CmdLineTasks plus log in C ++. (3) Debug/Trace in C ++. They are typically run with unit tests and each package can be handled separately. ---- ---- (Proposed documentations) h3. Use of logger names and log levels Logger names should generally start with the (fully-qualified) name of the module/file containing the logger. If the logger is saved as a variable in a class, it is often appropriate to name the logger after the class. Logger names use "." as component separators, not "::", even in C\+\+. Example: {code:py} log = lsst.log.Log.getLogger("meas.algorithms.starSelector") log.info("This is information about the star selector algorithm execution. %f", 3.14) {code} In Python, comma-separated arguments are preferred over the string formatting so it is done only if the log record is to be printed. Log levels should be used as follows: * FATAL: for severe errors that may prevent further execution of the component * ERROR: for errors that may still allow the execution to continue * WARNING: for conditions that may indicate a problem but that allow continued execution * INFO: for information that is of interest during normal execution including production * DEBUG: for information that is of interest to developers but not of interest during production * TRACE: for detailed information when debugging For loggers used at DEBUG and TRACE levels, it is often desirable to add further components to the logger name; these would indicate which specific portion of the code or algorithm that the logged information pertains to. {code:py} debugLogger = lsst.log.Log.getLogger("meas.algorithms.starSelector.catalogReader") debugLogger.debug("Catalog reading took %f seconds", finish - start) {code} The idea here is that the author understands the intent of the log message and can simply name it, without worrying about its relative importance or priority compared with other log messages in the same component. A person debugging the code would typically be looking at it and so would be able to determine the appropriate name to enable. The hierarchy allows all components to be easily enabled or disabled together. As an alternative for TRACE loggers where there are different messages at increasing levels of verbosity but no specific component names that would be appropriate, or where increasing verbosity spans a number of levels of the component hierarchy, logger names can be _prefixed_ with "TRACE1", "TRACE2", "TRACE3", etc. to indicate increasing verbosity. {code:py} traceLogger = lsst.log.Log.getLogger("TRACE2.meas.algorithms.starSelector") traceLogger.debug("On %d-th iteration of star selection", iteration) innerTraceLogger = lsst.log.getLogger("TRACE2.meas.algorithms.starSelector.catalogReader") innerTraceLogger.debug("Reading catalog %s", catalogName) {code} In C++ {code:py} LOG_LOGGER traceLogger = LOG_GET(“TRACE2.meas.algorithms.starSelector“); LOGL_DEBUG(traceLogger, "On %d-th iteration of star selection", iteration); LOG_LOGGER innerTraceLogger = LOG_GET(“TRACE2.meas.algorithms.starSelector.catalogReader"); LOGL_DEBUG(innerTraceLogger, "Reading catalog %s", catalogName); {code} Notice that all loggers in the hierarchy under a given component at a given trace level can be enabled easily using, e.g., "TRACE2.lsst.meas.algorithms.starSelector". Getting a logger object and logging to that is preferred over logging using a string as the logger name. The latter can be used, for examples: {code:py} lsst.log.log("meas.algorithms.starSelector.psfCandidate", lsst.log.WARN, "Failed to make a psfCandidate") lsst.log.log("TRACE4.meas.algorithms.starSelector.psfCandidate", lsst.log.DEBUG, "Making a psfCandidate from star %d=", starId) {code} {code:py} LOGL_WARN("meas.algorithms.starSelector.psfCandidate", "Failed to make a psfCandidate") LOGL_DEBUG("TRACE4.meas.algorithms.starSelector.psfCandidate", "Making a psfCandidate from star %d=", starId) {code} Developers are encouraged to insert log messages whenever and wherever they might be useful, with appropriate component names and levels. -----
Hsin-Fang Chiang made changes -
 Link This issue is triggering DM-7530 [ DM-7530 ]
Hide
Hsin-Fang Chiang added a comment -

Robert Lupton commented on a Github PR

I'd missed

The command line option – logdest will be removed; the file destination is instead set through log4cxx configuration

Can you attach a pointer to the documentation (and examples) of how to so this to the RFC, or some other place that it's easily findable by users who uses to use --logdest.

My understanding is that some people (at least Merlin Fisher-Levine, Dominique Boutigny) used --logdest for its verbose format including date/time/dataId in each log message. The --longlog option has been added to do exactly that. In this case users may use command redirection to collect their logs. The shell script in validate_drp is one example.

If needed, a file destination can be specified through the log4cxx configuration. Right now the configuration is hard-coded inside pipe_base, but DM-7459 will add a command line task argument to let users supply their own log4cxx configuration file more easily. Instead of using the default ConsoleAppender, the following example will log to a file:

 log4j.appender.A=org.apache.log4j.FileAppender log4j.appender.A.File=exampleLogFileName.log 

Here is a link to the log4cxx documentation, in particular the "Configuration" session for this topic: http://logging.apache.org/log4cxx/latest_stable/usage.html

I'm planning to add an example configuration file in DM-7459. DM-7530 should help the documentation too.

Show
Hsin-Fang Chiang added a comment - Robert Lupton commented on a Github PR I'd missed The command line option – logdest will be removed; the file destination is instead set through log4cxx configuration Can you attach a pointer to the documentation (and examples) of how to so this to the RFC, or some other place that it's easily findable by users who uses to use --logdest. My understanding is that some people (at least Merlin Fisher-Levine , Dominique Boutigny ) used --logdest for its verbose format including date/time/dataId in each log message. The --longlog option has been added to do exactly that. In this case users may use command redirection to collect their logs. The shell script in validate_drp is one example. If needed, a file destination can be specified through the log4cxx configuration. Right now the configuration is hard-coded inside pipe_base, but DM-7459 will add a command line task argument to let users supply their own log4cxx configuration file more easily. Instead of using the default ConsoleAppender, the following example will log to a file: log4j.appender.A=org.apache.log4j.FileAppender log4j.appender.A.File=exampleLogFileName.log Here is a link to the log4cxx documentation, in particular the "Configuration" session for this topic: http://logging.apache.org/log4cxx/latest_stable/usage.html I'm planning to add an example configuration file in DM-7459 . DM-7530 should help the documentation too.
Hide
Paul Price added a comment -

I think dumping the log to a file is such an important and common operation that it needs to be supported as a first-class command-line option. Having to write two lines of obscure configuration in a separate file for every command-line operation is too burdensome.

Show
Paul Price added a comment - I think dumping the log to a file is such an important and common operation that it needs to be supported as a first-class command-line option. Having to write two lines of obscure configuration in a separate file for every command-line operation is too burdensome.
Hide
Merlin Fisher-Levine added a comment -

+1 to what Paul Price said - this is a very important piece of functionality!

And also just confirming that I have indeed used --logdest before, and would have been completely lost without it as it was the only way to unscramble the interleaved logfiles created by multicore jobs.

Show
Merlin Fisher-Levine added a comment - +1 to what Paul Price said - this is a very important piece of functionality! And also just confirming that I have indeed used --logdest before, and would have been completely lost without it as it was the only way to unscramble the interleaved logfiles created by multicore jobs.
Hsin-Fang Chiang made changes -
 Link This issue relates to DM-7997 [ DM-7997 ]
Hide
Hsin-Fang Chiang added a comment -

Created DM-7997 on adding back a command line option to specify the log file destination.

As of today, meas_astrom and upstream no longer uses pex_logging.

Show
Hsin-Fang Chiang added a comment - Created DM-7997 on adding back a command line option to specify the log file destination. As of today, meas_astrom and upstream no longer uses pex_logging.
Hide
Hsin-Fang Chiang added a comment -

I would like to hear what you think about log configuration in our stack, as some relevant conversations have come up a few times on HipChat/Slack recently. This is beyond the scope of this RFC.

log based on log4cxx can be configured at runtime using a configuration file. This file can be set through an environment variable, or through configuring methods in the code. One can also configure log with a string in the code. If no configuration is done, the default basic configuration is used (Improvement on this default is wanted in DM-7955)

Currently I configure log in 3 places explicitly and separately:

• Python utils unit test (utils)
• CmdLineTask interface (pipe_base)
• BatchCmdLineTask in its jobLog (ctrl_pool)

But I don't think the current way is the best way to move forward. I am curious about some general desired design/requests on log configuration. Should all Science Pipelines code in the stack share one configuration file, no matter how it's used? If yes, is it okay to use environment variables to achieve that? If no, is there any important stack entry beyond the three listed above? How should we set the configuration?

Show
Hsin-Fang Chiang added a comment - I would like to hear what you think about log configuration in our stack, as some relevant conversations have come up a few times on HipChat/Slack recently. This is beyond the scope of this RFC. log based on log4cxx can be configured at runtime using a configuration file. This file can be set through an environment variable, or through configuring methods in the code. One can also configure log with a string in the code. If no configuration is done, the default basic configuration is used (Improvement on this default is wanted in DM-7955 ) Currently I configure log in 3 places explicitly and separately: Python utils unit test (utils) CmdLineTask interface (pipe_base) BatchCmdLineTask in its jobLog (ctrl_pool) But I don't think the current way is the best way to move forward. I am curious about some general desired design/requests on log configuration. Should all Science Pipelines code in the stack share one configuration file, no matter how it's used? If yes, is it okay to use environment variables to achieve that? If no, is there any important stack entry beyond the three listed above? How should we set the configuration?
Hide
Paul Price added a comment -

I find the configuration file burdensome (it's something alien I would have to learn; it looks something like our python configuration files, but it isn't), though I think we should keep it around for low-level configuration, maybe as a dot-file backed by a envvar. I think the defaults should be set up so that logging behaves much as it used to (format, levels), a python configuration interface should be added, and ArgumentParser hooks added for the most important configuration options (file, levels, format).

Show
Paul Price added a comment - I find the configuration file burdensome (it's something alien I would have to learn; it looks something like our python configuration files, but it isn't), though I think we should keep it around for low-level configuration, maybe as a dot-file backed by a envvar. I think the defaults should be set up so that logging behaves much as it used to (format, levels), a python configuration interface should be added, and ArgumentParser hooks added for the most important configuration options (file, levels, format).
Hide
Robert Lupton added a comment -

Another place that we need logging configuration is in ad-hoc python (e.g. jupyter notebooks that use LSST objects or tasks).

I think it'd be OK to ask people to use a simple logging configuration call as boiler plate, but it'd be much better if the default configuration was almost always useful (e.g. import lsst.afw.image as afwImage; im = afwImage.ImageF("foo.fits")) didn't generate any logging output to the screen). A default of WARN might be enough to achieve this.

Show
Robert Lupton added a comment - Another place that we need logging configuration is in ad-hoc python (e.g. jupyter notebooks that use LSST objects or tasks). I think it'd be OK to ask people to use a simple logging configuration call as boiler plate, but it'd be much better if the default configuration was almost always useful (e.g. import lsst.afw.image as afwImage; im = afwImage.ImageF("foo.fits") ) didn't generate any logging output to the screen). A default of WARN might be enough to achieve this.
Hide
Russell Owen added a comment -

I agree with Robert Lupton that the default verbosity is too high. I vote for INFO as the default level, since I usually want to see that level of message.

Here is an example of reading a calexp:

 lsst.afw.image.ExposureF("calexp-849375-14.fits") 5482 [0x7fff79b6b000] DEBUG afw.image.Mask null - Number of mask planes: 16 5681 [0x7fff79b6b000] WARN afw.image.ExposureInfo null - Could not read PSF; setting to null: PersistableFactory with name 'PcaPsf' not found, and import of module 'lsst.meas.algorithms' failed (possibly because Python calls were not available from C++). {0}; loading object with id=6, name='PcaPsf' {1}  > *' at 0x10184b360> > 

the warning is definitely useful (I should setup meas_extensions_psfex); the debug message is not.

Recently when running some code in a python interpreter (without any attempt to log anything) I managed to trigger warnings about missing log configuration. I realize that is not a helpful bug report, and If I figure out how to reproduce it I'll report in detail, but as a general statement I hope users will not see normally see such warnings.

Show
Russell Owen added a comment - I agree with Robert Lupton that the default verbosity is too high. I vote for INFO as the default level, since I usually want to see that level of message. Here is an example of reading a calexp: lsst.afw.image.ExposureF("calexp-849375-14.fits") 5482 [0x7fff79b6b000] DEBUG afw.image.Mask null - Number of mask planes: 16 5681 [0x7fff79b6b000] WARN afw.image.ExposureInfo null - Could not read PSF; setting to null: PersistableFactory with name 'PcaPsf' not found, and import of module 'lsst.meas.algorithms' failed (possibly because Python calls were not available from C++). {0}; loading object with id=6, name='PcaPsf' {1} <lsst.afw.image.imageLib.ExposureF; proxy of <Swig Object of type 'std::shared_ptr< lsst::afw::image::Exposure< float,lsst::afw::image::MaskPixel,lsst::afw::image::VariancePixel > > *' at 0x10184b360> > the warning is definitely useful (I should setup meas_extensions_psfex ); the debug message is not. Recently when running some code in a python interpreter (without any attempt to log anything) I managed to trigger warnings about missing log configuration. I realize that is not a helpful bug report, and If I figure out how to reproduce it I'll report in detail, but as a general statement I hope users will not see normally see such warnings.
Hide
Robert Lupton added a comment -

If we make INFO the default it'll require us to change verbosity levels in the stack. E.g.

 reduceArc.py $PFS_DATA_DIR --rerun$whoami/talk --id visit=4 --clobber-versions root INFO: Config override file does not exist: '/Users/rhl/LSST/obs/pfs/config/reduceArcTask.py' root INFO: Config override file does not exist: '/Users/rhl/LSST/obs/pfs/config/pfs/reduceArcTask.py' root INFO: input=/Users/rhl/PFS/Data root INFO: calib=None root INFO: output=/Users/rhl/PFS/Data/rerun/rhl/talk CameraMapper INFO: Loading registry registry from /Users/rhl/PFS/Data/rerun/rhl/talk/_parent/registry.sqlite3 CameraMapper INFO: Loading calibRegistry registry from /Users/rhl/PFS/Data/CALIB/calibRegistry.sqlite3 

shouldn't come out at default level. It was one reason I wanted more levels...

Show
Robert Lupton added a comment - If we make INFO the default it'll require us to change verbosity levels in the stack. E.g. reduceArc.py $PFS_DATA_DIR --rerun$whoami/talk --id visit=4 --clobber-versions root INFO: Config override file does not exist: '/Users/rhl/LSST/obs/pfs/config/reduceArcTask.py' root INFO: Config override file does not exist: '/Users/rhl/LSST/obs/pfs/config/pfs/reduceArcTask.py' root INFO: input=/Users/rhl/PFS/Data root INFO: calib=None root INFO: output=/Users/rhl/PFS/Data/rerun/rhl/talk CameraMapper INFO: Loading registry registry from /Users/rhl/PFS/Data/rerun/rhl/talk/_parent/registry.sqlite3 CameraMapper INFO: Loading calibRegistry registry from /Users/rhl/PFS/Data/CALIB/calibRegistry.sqlite3 shouldn't come out at default level. It was one reason I wanted more levels...
Hide
Jim Bosch added a comment -

I think I prefer WARN as the default level, but I also think we really should change most of the lines in Robert Lupton's posted snippet to DEBUG.

Show
Jim Bosch added a comment - I think I prefer WARN as the default level, but I also think we really should change most of the lines in Robert Lupton 's posted snippet to DEBUG.
Hide
Hsin-Fang Chiang added a comment -

Russell Owen, the warnings you saw about missing log configuration may have been the bug fixed in DM-7754.

So far it sounds to me that the most important issue is to improve the basic default configuration (covered by DM-7955). Some more configuration options are wanted in the task interface (e.g. DM-7997 for file destination). Doing additional configuration in multiple places seems less of a concern?

Also adding Andy Salnikov and Fritz Mueller to join the discussions.

Show
Hsin-Fang Chiang added a comment - Russell Owen , the warnings you saw about missing log configuration may have been the bug fixed in DM-7754 . So far it sounds to me that the most important issue is to improve the basic default configuration (covered by DM-7955 ). Some more configuration options are wanted in the task interface (e.g. DM-7997 for file destination). Doing additional configuration in multiple places seems less of a concern? Also adding Andy Salnikov and Fritz Mueller to join the discussions.
Hsin-Fang Chiang made changes -
 Link This issue relates to DM-8009 [ DM-8009 ]
Hide
Paul Price added a comment -

Doing additional configuration in multiple places seems less of a concern?

I think that so long as there are good defaults and a consistent python-based configuration interface (e.g., log.setDestination, log.setLevel, log.setFormat) for the major options that's fine. It's impossible to avoid having configuration in multiple places because we have code with different entry points. It would be great to have a simple way to get the hooks into an argparse.ArgumentParser (not just supporting the lsst.pipe.base.ArgumentParser), which would help consolidate things.

Show
Paul Price added a comment - Doing additional configuration in multiple places seems less of a concern? I think that so long as there are good defaults and a consistent python-based configuration interface (e.g., log.setDestination, log.setLevel, log.setFormat ) for the major options that's fine. It's impossible to avoid having configuration in multiple places because we have code with different entry points. It would be great to have a simple way to get the hooks into an argparse.ArgumentParser (not just supporting the lsst.pipe.base.ArgumentParser ), which would help consolidate things.
Hide
Hsin-Fang Chiang added a comment -

DM-8009 is added to reduce verbosity in starting tasks and changing those lines in Robert Lupton's snippet to DEBUG. If you have opinions about that please leave comments in that ticket.

Show
Hsin-Fang Chiang added a comment - DM-8009 is added to reduce verbosity in starting tasks and changing those lines in Robert Lupton 's snippet to DEBUG. If you have opinions about that please leave comments in that ticket.
Andy Salnikov made changes -
 Link This issue is triggering DM-7955 [ DM-7955 ]
Hide
Tim Jenness added a comment -

All triggering tickets are complete. Hsin-Fang Chiang are we waiting on DM-7997 before marking this RFC implemented? If we are please, change the linkage, else mark as implemented.

Show
Tim Jenness added a comment - All triggering tickets are complete. Hsin-Fang Chiang are we waiting on DM-7997 before marking this RFC implemented? If we are please, change the linkage, else mark as implemented.
Hide
Hsin-Fang Chiang added a comment -

Adding DM-7000 as the last ticket triggered by this RFC.

I'm not sure what the recent replan/WBS changes mean to the logging framework transition work; if Fritz Mueller Joel Plutchak John Swinbank Simon Krughoff do not object, I will plan to do DM-7000 later this cycle. Please advise if that should be assigned differently.

Show
Hsin-Fang Chiang added a comment - Adding DM-7000 as the last ticket triggered by this RFC. I'm not sure what the recent replan/WBS changes mean to the logging framework transition work; if Fritz Mueller Joel Plutchak John Swinbank Simon Krughoff do not object, I will plan to do DM-7000 later this cycle. Please advise if that should be assigned differently.
Hsin-Fang Chiang made changes -
 Link This issue is triggering DM-7000 [ DM-7000 ]
Hide
John Swinbank added a comment -

Formally, none of the replan changes take effect until after a baseline change request has been approved, which won't be until next year.

Practically, the plan for the rest of this cycle (which is only the next three and a bit weeks) is already locked in.

The upshot is: I certainly don't object to Hsin-Fang Chiang's plan.

Show
John Swinbank added a comment - Formally, none of the replan changes take effect until after a baseline change request has been approved, which won't be until next year. Practically, the plan for the rest of this cycle (which is only the next three and a bit weeks) is already locked in. The upshot is: I certainly don't object to Hsin-Fang Chiang 's plan.
Hsin-Fang Chiang made changes -
 Remote Link This issue links to "Page (Confluence)" [ 14367 ]
Hsin-Fang Chiang made changes -
 Link This issue relates to DM-8365 [ DM-8365 ]
Hsin-Fang Chiang made changes -
 Status Adopted [ 10806 ] Implemented [ 11105 ]
Hide
Hsin-Fang Chiang added a comment -

With the merge of DM-7000, pipe_tasks no longer depends on pex_logging.

Although beyond the scope of this RFC, I filed a few tickets for some downstream packages that still use pex_logging: ci_hsc DM-8356, meas_modelfit DM-8357, meas_mosaic DM-8358, ctrl_pool DM-8359 for the TCAMs to decide where they should go.

This page has been used to track the migration progress: https://confluence.lsstcorp.org/display/DM/Logging+Migration

Show
Hsin-Fang Chiang added a comment - With the merge of DM-7000 , pipe_tasks no longer depends on pex_logging . Although beyond the scope of this RFC, I filed a few tickets for some downstream packages that still use pex_logging : ci_hsc DM-8356 , meas_modelfit DM-8357 , meas_mosaic DM-8358 , ctrl_pool DM-8359 for the TCAMs to decide where they should go. This page has been used to track the migration progress: https://confluence.lsstcorp.org/display/DM/Logging+Migration

#### People

Assignee:
Hsin-Fang Chiang
Reporter:
Hsin-Fang Chiang
Watchers:
Gregory Dubois-Felsmann, Hsin-Fang Chiang, Jim Bosch, John Swinbank, Kian-Tat Lim, Merlin Fisher-Levine, Paul Price, Robert Lupton, Russell Owen, Tim Jenness
Votes:
0 Vote for this issue
Watchers:
10 Start watching this issue

#### Dates

Created:
Updated:
Resolved:
Planned End:

#### Jenkins

No builds found.