Data Management / DM-14141

Automated helper for doxygen->numpydoc conversion

Details

    • 2
    • SQuaRE

    Description

      It would greatly help the doxygen to numpydoc conversion if we could automate at least a little of the process. A day or so of effort by a regex or parsing expert would probably cut everyone else's manual labor by half.

      For example, automatically converting the parameters and returns sections from:

      @param[in] foo stuff
      @param[in] bar more stuff
      @return something else
      

      into something like:

      Parameters
      ----------
      foo : `Unknown`
          stuff
      bar : `Unknown`
          more stuff
       
      Returns
      -------
      Unknown : `Unknown`
          something else
      

          Activity

            ktl Kian-Tat Lim added a comment -

            Something like this could be a start, although it doesn't deal with some subtleties like @param[out] being used for return values.

            import re
            import sys

            # Groups: 1 = indent, 2 = bracketed direction such as "[in]", 3 = direction,
            # 4 = parameter name, 5 = description.
            param_re = re.compile(r'^(\s+)[@\\]param(\[(in|out|in,out)\])?\s+([*\w]+):?\s+(.*)$')
            # Groups: 1 = indent, 2 = description.
            return_re = re.compile(r'^(\s+)[@\\]return\s+(.*)$')
            # An empty line or a closing docstring delimiter ends the parameter list.
            blank_re = re.compile(r'^(\s+""")?$')
            in_params = False
            with open(sys.argv[1]) as f:
                for line in f:
                    line = line.rstrip('\n')
                    p = param_re.match(line)
                    if p:
                        indent = p.group(1)
                        if not in_params:
                            # First parameter seen: emit the section header once.
                            in_params = True
                            print(indent + "Parameters")
                            print(indent + "----------")
                        print(indent + p.group(4) + " : `Unknown`")
                        print(indent + "    " + p.group(5))
                    else:
                        r = return_re.match(line)
                        if r:
                            in_params = False
                            indent = r.group(1)
                            print(indent + "Returns")
                            print(indent + "-------")
                            print(indent + "Unknown : `Unknown`")
                            print(indent + "    " + r.group(2))
                        elif in_params:
                            if blank_re.match(line):
                                in_params = False
                                print(line)
                            else:
                                # Continuation line of the previous parameter's description.
                                print(indent + "    " + line.lstrip())
                        else:
                            print(line)
            

             


            krzys Krzysztof Findeisen added a comment -

            Not volunteering for this, but a few possible gotchas:

            • Old documentation comments often list sections in the wrong order compared to the style guide recommendation. Since the exact order can be hard to remember, enforcing it in-script is probably safest.
            • Doxygen has aliases for many commands, to support different documentation styles. Naturally, we use them all.
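
            A minimal sketch of how a script might cope with both gotchas. The alias table is partial, the section order is an assumption that should be checked against the style guide, and the names `ALIASES`, `SECTION_ORDER`, `canonical_command`, and `order_sections` are hypothetical; none of this is taken from the code posted on this ticket.

```python
import re

# Partial table of Doxygen command aliases mapped to one canonical name.
ALIASES = {"returns": "return", "result": "return",
           "throws": "throw", "exception": "throw", "see": "sa"}

# Assumed target order; verify against the style guide before relying on it.
SECTION_ORDER = ["Parameters", "Returns", "Raises", "See Also", "Notes", "Examples"]

_command_re = re.compile(r'[@\\](\w+)')


def canonical_command(line):
    """Return the canonical Doxygen command that starts `line`, or None."""
    m = _command_re.match(line.lstrip())
    if not m:
        return None
    return ALIASES.get(m.group(1), m.group(1))


def order_sections(sections):
    """Sort (name, body_lines) pairs into SECTION_ORDER.

    Sections with unrecognized names keep their relative order and go last.
    """
    def rank(item):
        name = item[0]
        return SECTION_ORDER.index(name) if name in SECTION_ORDER else len(SECTION_ORDER)
    return sorted(sections, key=rank)  # sorted() is stable
```

            Collecting the sections first and emitting them in a fixed order at the end means the script, rather than the person converting, has to remember the ordering.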
            Parejkoj John Parejko added a comment -

            Thanks ktl! That's a great start.

            As to the aliases, we use @param about 30 times more often than \param (~3500 vs. ~120), so we can get the majority of cases with just that.

            ktl Kian-Tat Lim added a comment - - edited

            The above code deals with both @param and \param versions of the keywords.  But I only did return; C++ code sometimes has returns.  It's left as an exercise for the reader to support both (should be a two-character addition).
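
            For reference, the two-character addition is presumably an optional `s` after `return`. A sketch, assuming the rest of the pattern stays as posted above:

```python
import re

# Same pattern as return_re above, with "s?" added so that both
# "return" and "returns" are accepted.
return_re = re.compile(r'^(\s+)[@\\]returns?\s+(.*)$')

for line in ("    @return something else", "    \\returns something else"):
    m = return_re.match(line)
    print(m.group(1) + "Returns: " + m.group(2))
```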

            krzys Krzysztof Findeisen added a comment - - edited

            From a recent review: despite not having its own section, "@warning" can be translated to ".. warning::". I did not know that.
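
            A rough sketch of that translation for the simplest case, a warning that fits on one line. `warning_re` and `convert_warning` are hypothetical names, and multi-line warnings are not handled:

```python
import re

warning_re = re.compile(r'^(\s*)[@\\]warning\s+(.*)$')


def convert_warning(line):
    """Rewrite a one-line Doxygen warning as a reST warning directive.

    Returns a list of output lines; lines that are not warnings are
    returned unchanged.
    """
    m = warning_re.match(line)
    if not m:
        return [line]
    indent, text = m.groups()
    # Directive line, a blank line, then the indented directive body.
    return [indent + ".. warning::", "", indent + "    " + text]
```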


            krzys Krzysztof Findeisen added a comment -

            I just noticed DM-9015, which asks for the same thing (though it doesn't have all the details we've discussed here). Perhaps Parejkoj or jsick could close either this issue or that one?

            jsick Jonathan Sick added a comment -

            I'm personally hesitant to take on this work because I'm not convinced of the cost/benefit. But the T/CAMs can totally discuss this and assign it if they are convinced.
            Parejkoj John Parejko added a comment -

            I've attached a Python script built on KT's regex magic above (which I had to touch up). It can be run on multiple files and, with -o, rewrites them in place. I'm using it to do DM-16855 as follows:

            python ~/lsst/temp/deDoxygen.py -o python/lsst/afw/cameraGeom/*.py
            

            and it seems to do a decent job.

            One could potentially add @throws->Raises to it.
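
            A sketch of what that addition might look like. This is not taken from the attached deDoxygen.py; `throws_re` and `convert_throws` are hypothetical names, and it assumes the exception type is the first word after the command:

```python
import re

# Accepts @throw, @throws and @exception (and the backslash forms).
throws_re = re.compile(r'^(\s+)[@\\](?:throws?|exception)\s+([\w.:]+)\s*(.*)$')


def convert_throws(lines):
    """Convert runs of Doxygen throws lines into a numpydoc Raises section."""
    out = []
    in_raises = False
    for line in lines:
        m = throws_re.match(line)
        if m:
            indent, exc, text = m.groups()
            if not in_raises:
                # First exception in a run: emit the section header once.
                in_raises = True
                out += [indent + "Raises", indent + "------"]
            out.append(indent + exc)
            if text:
                out.append(indent + "    " + text)
        else:
            in_raises = False
            out.append(line)
    return out
```

            Continuation lines of a multi-line description are passed through unchanged here, so they would still need the same handling as the parameter descriptions in KT's loop.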


            People

              Assignee: Unassigned
              Reporter: John Parejko
              Watchers: Chris Morrison [X] (Inactive), Eric Bellm, Ian Sullivan, Jim Bosch, John Parejko, John Swinbank, Jonathan Sick, Kian-Tat Lim, Krzysztof Findeisen, Meredith Rawls