# Automated helper for doxygen->numpydoc conversion


#### Details

• Type: Improvement
• Status: To Do
• Resolution: Unresolved
• Fix Version/s: None
• Component/s:
• Labels:
• Story Points: 2
• Team: SQuaRE

#### Description

It would greatly help the doxygen to numpydoc conversion if we could automate at least a little of the process. A day or so of effort by a regex or parsing expert would probably cut everyone else's manual labor by half.

For example, automatically converting the parameters and returns sections from:

    @param[in] foo stuff
    @param[in] bar more stuff
    @return something else

into something like:

    Parameters
    ----------
    foo : Unknown
        stuff
    bar : Unknown
        more stuff

    Returns
    -------
    Unknown : Unknown
        something else

#### Attachments

1. deDoxygen.py (3 kB)

#### Activity

Kian-Tat Lim added a comment -

Something like this could be a start, although it doesn't deal with some subtleties like @param[out] being used for return values.

    import re
    import sys

    # Groups: 1 = indent, 2 = "[in]"-style qualifier, 3 = direction,
    # 4 = parameter name, 5 = description.
    param_re = re.compile(r'^(\s+)[@\\]param(\[(in|out|in,out)\])?\s+([*\w]+):?\s+(.*)$')
    return_re = re.compile(r'^(\s+)[@\\]return\s+(.*)$')
    # An empty line or a closing triple quote ends the parameter list.
    blank_re = re.compile(r'^(\s+""")?$')

    in_params = False
    with open(sys.argv[1]) as f:
        for line in f:
            line = line.rstrip('\n')
            p = param_re.match(line)
            if p:
                indent = p.group(1)
                if not in_params:
                    # First parameter: emit the section header once.
                    in_params = True
                    print(indent + "Parameters")
                    print(indent + "----------")
                print(indent + p.group(4) + " : Unknown")
                print(indent + "    " + p.group(5))
            else:
                r = return_re.match(line)
                if r:
                    in_params = False
                    indent = r.group(1)
                    print(indent + "Returns")
                    print(indent + "-------")
                    print(indent + "Unknown: Unknown")
                    print(indent + "    " + r.group(2))
                elif in_params:
                    if blank_re.match(line):
                        in_params = False
                        print(line)
                    else:
                        # Continuation of the previous parameter's description.
                        print(indent + "    " + line.lstrip())
                else:
                    print(line)
Krzysztof Findeisen added a comment -

Not volunteering for this, but a few possible gotchas:

• Old documentation comments often list sections in the wrong order compared to the style guide recommendation. Since the exact order can be hard to remember, enforcing it in-script is probably safest.
• Doxygen has aliases for many commands, to support different documentation styles. Naturally, we use them all.
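Both gotchas could be handled with small lookup tables, roughly as sketched below. The alias table is deliberately incomplete, the section order is assumed to follow the numpydoc convention (check it against the style guide before relying on it), and `canonical_command` and `order_sections` are hypothetical helper names.

```python
# Assumed alias table: maps Doxygen command spellings to one canonical name.
ALIASES = {
    "return": "return", "returns": "return", "result": "return",
    "throw": "throws", "throws": "throws", "exception": "throws",
    "sa": "see", "see": "see",
}

# Assumed section order, following the numpydoc convention.
SECTION_ORDER = ["Parameters", "Returns", "Raises", "See Also", "Notes", "Examples"]


def canonical_command(command):
    """Return the canonical spelling of a Doxygen command (without @ or \\)."""
    return ALIASES.get(command, command)


def order_sections(sections):
    """Sort (title, body) pairs into SECTION_ORDER.

    Unknown titles are placed last, in their original relative order,
    because sorted() is stable.
    """
    rank = {title: i for i, title in enumerate(SECTION_ORDER)}
    return sorted(sections, key=lambda s: rank.get(s[0], len(SECTION_ORDER)))
```

Normalizing the command name first means the rest of the converter only has to recognize one spelling per section.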
John Parejko added a comment -

Thanks Kian-Tat Lim! That's a great start.

As to the aliases, we use @param about 30 times more often than \param (~3500 vs. ~120), so we can get the majority of cases with just that.

Kian-Tat Lim added a comment - - edited

The above code handles both the @param and \param forms of the keywords. However, I only handled @return; C++ code sometimes uses @returns. Supporting both is left as an exercise for the reader (it should be a two-character addition).
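For reference, one way to make that change is to add an optional "s" to the pattern. This sketch reuses the `return_re` name from the code above; everything else in the script would stay the same.

```python
import re

# "s?" makes the trailing "s" optional, so @return, @returns,
# \return, and \returns all match.
return_re = re.compile(r'^(\s+)[@\\]returns?\s+(.*)$')
```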

Krzysztof Findeisen added a comment - - edited

From a recent review: despite not having its own section, "@warning" can be translated to ".. warning::". I did not know that.
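A single-line version of that translation might look like the sketch below. `WARNING_RE` and `convert_warning` are hypothetical names, and warnings that continue over several lines are not handled here.

```python
import re

# Matches "@warning text" or "\warning text" with any leading indent.
WARNING_RE = re.compile(r'^(\s*)[@\\]warning\s+(.*)$')


def convert_warning(line):
    """Return the output lines for one input line.

    A Doxygen warning becomes a reST ``.. warning::`` directive with the
    text indented beneath it; any other line is passed through unchanged.
    """
    m = WARNING_RE.match(line)
    if not m:
        return [line]
    indent, text = m.groups()
    return [indent + ".. warning::", "", indent + "    " + text]
```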

Krzysztof Findeisen added a comment -

I just noticed DM-9015, which asks for the same thing (though it doesn't have all the details we've discussed here). Perhaps John Parejko or Jonathan Sick could close either this issue or that one?

Jonathan Sick added a comment -

I'm personally hesitant to take on this work because I'm not convinced of the cost/benefit. But the T/CAMs can totally discuss this and assign it if they are convinced.

John Parejko added a comment -

I've attached a Python script built on KT's regex magic above (which I had to touch up) that can be run on multiple files and rewrites them in place with -o. I'm using it to do DM-16855 as follows:

    python ~/lsst/temp/deDoxygen.py -o python/lsst/afw/cameraGeom/*.py

and it seems to do a decent job.

One could potentially add a @throws -> Raises conversion to it.
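A rough sketch of what that could look like follows. It is not part of the attached deDoxygen.py; `THROWS_RE` and `convert_throws` are hypothetical names, and the sketch assumes each @throws line names the exception type first, followed by an optional description.

```python
import re

# Matches @throw, @throws, and @exception (or the backslash forms),
# capturing indent, exception type, and optional description.
THROWS_RE = re.compile(r'^(\s*)[@\\](?:throws?|exception)\s+([\w:.]+)\s*(.*)$')


def convert_throws(lines):
    """Convert consecutive Doxygen throws lines into a numpydoc Raises section."""
    out = []
    in_raises = False
    for line in lines:
        m = THROWS_RE.match(line)
        if m:
            indent, exc, desc = m.groups()
            if not in_raises:
                # Emit the section header once per run of throws lines.
                out += [indent + "Raises", indent + "------"]
                in_raises = True
            out.append(indent + exc)
            if desc:
                out.append(indent + "    " + desc)
        else:
            in_raises = False
            out.append(line)
    return out
```

C++-style exception names such as lsst::pex::exceptions::RuntimeError are passed through as written; mapping them to Python exception classes would need a separate step.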


#### People

Assignee: Unassigned
Reporter: John Parejko
Watchers: Chris Morrison [X] (Inactive), Eric Bellm, Ian Sullivan, Jim Bosch, John Parejko, John Swinbank, Jonathan Sick, Kian-Tat Lim, Krzysztof Findeisen, Meredith Rawls