DM-14141

Automated helper for doxygen->numpydoc conversion

    Details

    • Story Points:
      2
    • Team:
      SQuaRE

      Description

      It would greatly help the doxygen to numpydoc conversion if we could automate at least a little of the process. A day or so of effort by a regex or parsing expert would probably cut everyone else's manual labor by half.

      For example, automatically converting the parameters and returns sections from:

      @param[in] foo stuff
      @param[in] bar more stuff
      @return something else
      

      into something like:

      Parameters
      ----------
      foo : `Unknown`
          stuff
      bar : `Unknown`
          more stuff
       
      Returns
      -------
      Unknown : `Unknown`
          something else
      

            Activity

            ktl Kian-Tat Lim added a comment -

            Something like this could be a start, although it doesn't deal with some subtleties like @param[out] being used for return values.

            import re
            import sys
             
            param_re = re.compile(r'^(\s+)[@\\]param(\[(in|out|in,out)\])?\s+([*\w]+):?\s+(.*)$')
            return_re = re.compile(r'^(\s+)[@\\]return\s+(.*)$')
            blank_re = re.compile(r'^(\s+""")?$')
            in_params = False
            with open(sys.argv[1]) as f:
                 for line in f:
                     line = line.rstrip('\n')
                     p = param_re.match(line)
                     if p:
                         indent = p.group(1)
                         if not in_params:
                             in_params = True
                             print(indent + "Parameters")
                             print(indent + "----------")
                         print(indent + p.group(4) + " : `Unknown`")
                         print(indent + "    " + p.group(5))
                     else:
                         r = return_re.match(line)
                         if r:
                             in_params = False
                             indent = r.group(1)
                             print(indent + "Returns")
                             print(indent + "-------")
                             print(indent + "Unknown : `Unknown`")
                             print(indent + "    " + r.group(2))
                         elif in_params:
                             if blank_re.match(line):
                                 in_params = False
                                 print(line)
                             else:
                                 print(indent + "    " + line.lstrip())
                         else:
                             print(line)
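
The @param[out] subtlety mentioned above could be handled by routing parameters on their direction tag. This is a minimal sketch, not part of the script above: `split_params` is a hypothetical helper, and treating a pure [out] parameter as a return value is an assumption that will not hold for every docstring.

```python
import re

# Like param_re above, but only the direction (not the brackets) is captured.
param_re = re.compile(
    r'^(\s+)[@\\]param(?:\[(in|out|in,out)\])?\s+([*\w]+):?\s+(.*)$')


def split_params(lines):
    """Sort Doxygen param lines into (parameters, returns) entry lists.

    Each entry is a (name, description) tuple. Lines that are not param
    lines are ignored.
    """
    params, returns = [], []
    for line in lines:
        m = param_re.match(line)
        if not m:
            continue
        direction, name, text = m.group(2), m.group(3), m.group(4)
        # Assumption: a pure [out] parameter documents a return value, while
        # [in], [in,out], and untagged parameters stay as parameters.
        (returns if direction == "out" else params).append((name, text))
    return params, returns
```

The two lists could then be printed under "Parameters" and "Returns" headers in the same way as the loop above.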
            

             

            krzys Krzysztof Findeisen added a comment -

            Not volunteering for this, but a few possible gotchas:

            • Old documentation comments often list sections in the wrong order compared to the style guide recommendation. Since the exact order can be hard to remember, enforcing it in-script is probably safest.
            • Doxygen has aliases for many commands, to support different documentation styles. Naturally, we use them all.
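
The section-order point could be enforced along these lines. This is a rough sketch: `SECTION_ORDER` is assumed from the usual numpydoc ordering and should be checked against the style guide, and `split_sections` and `reorder_sections` are hypothetical helper names. It only recognizes headers whose dashed underline has exactly the same length as the section name.

```python
# Assumed ordering, following the numpydoc convention; verify against the
# style guide before relying on it.
SECTION_ORDER = ["Parameters", "Returns", "Yields", "Other Parameters",
                 "Raises", "See Also", "Notes", "References", "Examples"]


def split_sections(lines):
    """Split docstring lines into (name, header_lines, body_lines) tuples.

    Text before the first recognized header is stored under the name None.
    """
    sections = [(None, [], [])]
    i = 0
    while i < len(lines):
        name = lines[i].strip()
        underline = lines[i + 1].strip() if i + 1 < len(lines) else ""
        if name in SECTION_ORDER and underline == "-" * len(name):
            sections.append((name, lines[i:i + 2], []))
            i += 2
        else:
            sections[-1][2].append(lines[i])
            i += 1
    return sections


def reorder_sections(lines):
    """Return the docstring lines with known sections in SECTION_ORDER."""
    preamble, *rest = split_sections(lines)
    rest.sort(key=lambda s: SECTION_ORDER.index(s[0]))  # sort is stable
    out = list(preamble[2])
    for _, header, body in rest:
        # Drop trailing blank lines, then separate sections by one blank line.
        while body and not body[-1].strip():
            body.pop()
        if out and out[-1].strip():
            out.append("")
        out.extend(header + body)
    return out
```

Trailing blank lines at the end of the docstring are dropped by this sketch, so the caller would need to restore them if they matter.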
            Parejkoj John Parejko added a comment -

            Thanks Kian-Tat Lim! That's a great start.

            As to the aliases, we use @param about 30 times more often than \param (~3500 vs. ~120), so we can get the majority of cases with just that.

            ktl Kian-Tat Lim added a comment - - edited

            The above code deals with both @param and \param versions of the keywords. But I only handled @return; C++ code sometimes uses @returns. It's left as an exercise for the reader to support both (should be a two-character addition).
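
The two-character addition is presumably an optional "s" in the pattern, written `s?`. A minimal sketch, where `match_return` is a hypothetical helper added only to show the behavior:

```python
import re

# Same pattern as return_re in the earlier comment, plus "s?" so that both
# @return and @returns (or \return and \returns) match.
return_re = re.compile(r'^(\s+)[@\\]returns?\s+(.*)$')


def match_return(line):
    """Return (indent, description) for a return line, or None if no match."""
    m = return_re.match(line)
    return (m.group(1), m.group(2)) if m else None
```

Like the original pattern, this still requires leading whitespace before the command.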

            krzys Krzysztof Findeisen added a comment - - edited

            From a recent review: despite not having its own section, "@warning" can be translated to ".. warning::". I did not know that.
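
That translation could be automated for the simple one-line case. A sketch only: `convert_warning` is a hypothetical helper, and it does not handle warning text that continues onto following lines.

```python
import re

# Matches "@warning text" or "\warning text" on a single line.
warning_re = re.compile(r'^(\s*)[@\\]warning\s+(.*)$')


def convert_warning(line):
    """Rewrite a one-line Doxygen warning as an RST directive (three lines).

    Lines that are not warnings are returned unchanged, as a one-item list.
    """
    m = warning_re.match(line)
    if m is None:
        return [line]
    indent, text = m.group(1), m.group(2)
    # The directive body goes on an indented line after a blank line.
    return [indent + ".. warning::", "", indent + "    " + text]
```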

            krzys Krzysztof Findeisen added a comment -

            I just noticed DM-9015, which asks for the same thing (though it doesn't have all the details we've discussed here). Perhaps John Parejko or Jonathan Sick could close either this issue or that one?

            jsick Jonathan Sick added a comment -

            I'm personally hesitant to take on this work because I'm not convinced of the cost/benefit. But the T/CAMs can totally discuss this and assign it if they are convinced.

            Parejkoj John Parejko added a comment -

            I've attached a python script built on KT's regex magic above (which I had to touch up) that can be used on multiple files and write them in-place with -o. I'm using it to do DM-16855 as follows:

            python ~/lsst/temp/deDoxygen.py -o python/lsst/afw/cameraGeom/*.py
            

            and it seems to do a decent job.

            One could potentially add @throws->Raises to it.
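
A @throws->Raises conversion might look roughly like the sketch below. It is not taken from the attached script: `convert_throws` is a hypothetical helper, it assumes the exception type is the first word after the command, and it does not handle descriptions that continue onto later lines.

```python
import re

# \throw, \throws and \exception are Doxygen synonyms.
throws_re = re.compile(
    r'^(\s+)[@\\](?:throws?|exception)\s+([\w.:]+)\s+(.*)$')


def convert_throws(lines):
    """Convert runs of Doxygen throws lines to numpydoc Raises sections."""
    out = []
    in_raises = False
    for line in lines:
        m = throws_re.match(line)
        if m:
            indent, exc, text = m.groups()
            if not in_raises:
                # First throws line of a run: emit the section header once.
                in_raises = True
                out.extend([indent + "Raises", indent + "------"])
            out.extend([indent + exc, indent + "    " + text])
        else:
            in_raises = False
            out.append(line)
    return out
```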


              People

              Assignee:
              Unassigned
              Reporter:
              Parejkoj John Parejko
              Watchers:
              Chris Morrison [X] (Inactive), Eric Bellm, Ian Sullivan, Jim Bosch, John Parejko, John Swinbank, Jonathan Sick, Kian-Tat Lim, Krzysztof Findeisen, Meredith Rawls
