Message ID | cover.1622898327.git.mchehab+huawei@kernel.org |
---|---|
Series | docs: avoid using ReST :doc:`foo` tag |
On Sun, 6 Jun 2021 19:52:25 -0300
Nícolas F. R. A. Prado <n@nfraprado.net> wrote:

> On Sat, Jun 05, 2021 at 09:08:36PM +0200, Mauro Carvalho Chehab wrote:
> > On Sat, 5 Jun 2021 12:11:09 -0300
> > Nícolas F. R. A. Prado <n@nfraprado.net> wrote:
> >
> > > Hi Mauro,
> > >
> > > On Sat, Jun 05, 2021 at 03:17:59PM +0200, Mauro Carvalho Chehab wrote:
> > > > As discussed at:
> > > >     https://lore.kernel.org/linux-doc/871r9k6rmy.fsf@meer.lwn.net/
> > > >
> > > > It is better to avoid using :doc:`foo` to refer to Documentation/foo.rst, as the
> > > > automarkup.py extension should handle it automatically, in most cases.
> > > >
> > > > There are a couple of exceptions to this rule:
> > > >
> > > >   1. when :doc: tag is used to point to a kernel-doc DOC: markup;
> > > >   2. when it is used with a named tag, e.g. :doc:`some name <foo>`;
> > > >
> > > > It should also be noted that automarkup.py currently has an issue:
> > > > if one uses a markup like:
> > > >
> > > >     Documentation/dev-tools/kunit/api/test.rst
> > > >       - documents all of the standard testing API excluding mocking
> > > >         or mocking related features.
> > > >
> > > > or, even:
> > > >
> > > >     Documentation/dev-tools/kunit/api/test.rst
> > > >         documents all of the standard testing API excluding mocking
> > > >         or mocking related features.
> > > >
> > > > The automarkup.py will simply ignore it. Not sure why. This patch series
> > > > avoids the above patterns (which are present in only 4 files), but it would be
> > > > nice to have a followup patch fixing the issue in automarkup.py.
> > >
> > > What I think is happening here is that we're using rST's syntax for definition
> > > lists [1]. automarkup.py ignores literal nodes, and perhaps a definition is
> > > considered a literal by Sphinx. Adding a blank line after the Documentation/...
> > > or removing the additional indentation makes it work, like you did in your
> > > 2nd and 3rd patch, since then it's not a definition anymore, although then the
> > > visual output is different as well.
> >
> > A literal has a different output. I think that this is not the case, but I
> > didn't check the python code from docutils/Sphinx.
>
> Okay, I went in deeper to understand the issue and indeed it wasn't what I
> thought. The reason definitions are ignored by automarkup.py is because the main
> loop iterates only over nodes that are of type paragraph:
>
>     for para in doctree.traverse(nodes.paragraph):
>         for node in para.traverse(nodes.Text):
>             if not isinstance(node.parent, nodes.literal):
>                 node.parent.replace(node, markup_refs(name, app, node))
>
> And inspecting the HTML output from your example, the definition name is inside
> a <dt> tag, and it doesn't have a <p> inside. So in summary, automarkup.py will
> only work on elements which are inside a <p> in the output.

Yeah, that's what I was suspecting, based on the comments.

Maybe something similar to the above could also be done for some
non-paragraph data. By looking at:

    https://docutils.sourceforge.io/docs/ref/doctree.html

It says that the body elements are:

    admonition, attention, block_quote, bullet_list, caution, citation,
    comment, compound, container, danger, definition_list, doctest_block,
    enumerated_list, error, field_list, figure, footnote, hint, image,
    important, line_block, literal_block, note, option_list, paragraph,
    pending, raw, rubric, substitution_definition, system_message,
    table, target, tip, warning

So, perhaps a similar loop for definition_list would do the trick,
but maybe automarkup should also look at other types, like enum lists,
notes (and their variants, like error/warning) and footnotes.

No idea how this would affect the docs build time, though.

> Only applying the automarkup inside paragraphs seems like a good decision (which
> covers text in lists and tables as well), so unless there are other types of
> elements without paragraphs where automarkup should work, I think we should just
> avoid using definition lists pointing to documents like that.

Checking the code or doing some tests is needed for us to be sure about which
of the above types docutils doesn't consider a paragraph.

Thanks,
Mauro
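
To make the point about paragraph-only traversal concrete, here is a minimal sketch. It deliberately does not use docutils: the `Node` class and `linkable_text()` helper are hypothetical stand-ins, and the tree only mimics how a definition list is assumed to be structured (the term holds text directly, while the definition wraps its text in a paragraph). In real docutils the corresponding classes would be `nodes.paragraph`, `nodes.term` and `nodes.literal`.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    """Toy stand-in for a doctree node; NOT the docutils API."""
    kind: str                       # e.g. "paragraph", "term", "literal", "text"
    text: str = ""                  # only meaningful for "text" leaves
    children: List["Node"] = field(default_factory=list)

    def walk(self) -> Iterator["Node"]:
        """Yield this node and every descendant, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def _collect(node: Node, out: List[str]) -> None:
    """Gather text leaves below `node`, skipping text whose parent is a literal."""
    for child in node.children:
        if child.kind == "text":
            if node.kind != "literal":
                out.append(child.text)
        else:
            _collect(child, out)


def linkable_text(root: Node, containers: Tuple[str, ...]) -> List[str]:
    """Return the text a cross-referencing pass would see if it only
    visited nodes whose kind is listed in `containers`."""
    found: List[str] = []
    for node in root.walk():
        if node.kind in containers:
            _collect(node, found)
    return found


# Hypothetical tree for the definition list from the cover letter.
doc = Node("document", children=[
    Node("definition_list", children=[
        Node("definition_list_item", children=[
            Node("term", children=[
                Node("text", "Documentation/dev-tools/kunit/api/test.rst"),
            ]),
            Node("definition", children=[
                Node("paragraph", children=[
                    Node("text", "documents all of the standard testing API"),
                ]),
            ]),
        ]),
    ]),
])

# Paragraph-only traversal never reaches the term.
paragraph_only = linkable_text(doc, ("paragraph",))
# Also visiting terms picks up the Documentation/ path.
with_terms = linkable_text(doc, ("paragraph", "term"))
print(paragraph_only)
print(with_terms)
```

Whether visiting terms in addition to paragraphs is cheap enough for a full docs build is exactly the open question raised above; the sketch only shows why the term is skipped today.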
Hi Mauro,

On Mon, Jun 07, 2021 at 09:34:22AM +0200, Mauro Carvalho Chehab wrote:
> On Sun, 6 Jun 2021 19:52:25 -0300
> Nícolas F. R. A. Prado <n@nfraprado.net> wrote:
>
> > Okay, I went in deeper to understand the issue and indeed it wasn't what I
> > thought. The reason definitions are ignored by automarkup.py is because the main
> > loop iterates only over nodes that are of type paragraph:
> >
> >     for para in doctree.traverse(nodes.paragraph):
> >         for node in para.traverse(nodes.Text):
> >             if not isinstance(node.parent, nodes.literal):
> >                 node.parent.replace(node, markup_refs(name, app, node))
> >
> > And inspecting the HTML output from your example, the definition name is inside
> > a <dt> tag, and it doesn't have a <p> inside. So in summary, automarkup.py will
> > only work on elements which are inside a <p> in the output.
>
> Yeah, that's what I was suspecting, based on the comments.
>
> Maybe something similar to the above could also be done for some
> non-paragraph data. By looking at:
>
>     https://docutils.sourceforge.io/docs/ref/doctree.html
>
> It says that the body elements are:
>
>     admonition, attention, block_quote, bullet_list, caution, citation,
>     comment, compound, container, danger, definition_list, doctest_block,
>     enumerated_list, error, field_list, figure, footnote, hint, image,
>     important, line_block, literal_block, note, option_list, paragraph,
>     pending, raw, rubric, substitution_definition, system_message,
>     table, target, tip, warning

Ok, I went through each one by searching the term on [1] and inspecting the
element to see if it contained a <p> or not. The vast majority did. These are
the ones I didn't find there or that didn't make sense:

    comment
    container
    image
    pending
    raw
    substitution_definition
    system_message
    target

We can safely ignore them. And these are the ones that matter and don't have
paragraphs:

    1. literal_block
    2. doctest_block
    3. definition_list
    4. field_list
    5. option_list
    6. line_block

1 and 2 are literals, so we don't care about them.

3 is the one you noticed the issue with. It's worth mentioning that the
definition term doesn't have a paragraph, but its definition does (as can be
checked by inspecting [2]).

4 is basically the same as 3: the rST syntax is different but the output is the
same. That said, I believe we only use those to set options at the top of the
file, like in translations, and I can't see automarkup being useful in there.

5 is similar to 3 and 4, but the term is formatted using <kbd>, so it's like a
literal and therefore not relevant.

6 is useful just to preserve indentation, and I'm pretty sure we don't use it in
the docs.

So in the end, I think the only contenders to be added to automarkup are
definition lists, and even then I still think we should just substitute those
definition lists with alternatives like you did in your patches. Personally I
don't see much gain in using definitions instead of a simple paragraph. But if
you really think it's an improvement in some way, it could probably be added to
automarkup in the way you described.

Thanks,
Nícolas

[1] https://sphinx-rtd-theme.readthedocs.io/en/stable/index.html
[2] https://sphinx-rtd-theme.readthedocs.io/en/stable/demo/lists_tables.html?highlight=definition%20list#definition-lists

> So, perhaps a similar loop for definition_list would do the trick,
> but maybe automarkup should also look at other types, like enum lists,
> notes (and their variants, like error/warning) and footnotes.
>
> No idea how this would affect the docs build time, though.
>
> > Only applying the automarkup inside paragraphs seems like a good decision (which
> > covers text in lists and tables as well), so unless there are other types of
> > elements without paragraphs where automarkup should work, I think we should just
> > avoid using definition lists pointing to documents like that.
>
> Checking the code or doing some tests is needed for us to be sure about which
> of the above types docutils doesn't consider a paragraph.
>
> Thanks,
> Mauro
On Mon, 7 Jun 2021 21:34:58 -0300
Nícolas F. R. A. Prado <n@nfraprado.net> wrote:

> Hi Mauro,
>
> On Mon, Jun 07, 2021 at 09:34:22AM +0200, Mauro Carvalho Chehab wrote:
> > Maybe something similar to the above could also be done for some
> > non-paragraph data. By looking at:
> >
> >     https://docutils.sourceforge.io/docs/ref/doctree.html
> >
> > It says that the body elements are:
> >
> >     admonition, attention, block_quote, bullet_list, caution, citation,
> >     comment, compound, container, danger, definition_list, doctest_block,
> >     enumerated_list, error, field_list, figure, footnote, hint, image,
> >     important, line_block, literal_block, note, option_list, paragraph,
> >     pending, raw, rubric, substitution_definition, system_message,
> >     table, target, tip, warning
>
> Ok, I went through each one by searching the term on [1] and inspecting the
> element to see if it contained a <p> or not. The vast majority did. These are
> the ones I didn't find there or that didn't make sense:
>
>     comment
>     container
>     image
>     pending
>     raw
>     substitution_definition
>     system_message
>     target
>
> We can safely ignore them. And these are the ones that matter and don't have
> paragraphs:
>
>     1. literal_block
>     2. doctest_block
>     3. definition_list
>     4. field_list
>     5. option_list
>     6. line_block
>
> 1 and 2 are literals, so we don't care about them.
>
> 3 is the one you noticed the issue with. It's worth mentioning that the
> definition term doesn't have a paragraph, but its definition does (as can be
> checked by inspecting [2]).
>
> 4 is basically the same as 3: the rST syntax is different but the output is the
> same. That said, I believe we only use those to set options at the top of the
> file, like in translations, and I can't see automarkup being useful in there.
>
> 5 is similar to 3 and 4, but the term is formatted using <kbd>, so it's like a
> literal and therefore not relevant.
>
> 6 is useful just to preserve indentation, and I'm pretty sure we don't use it in
> the docs.
>
> So in the end, I think the only contenders to be added to automarkup are
> definition lists, and even then I still think we should just substitute those
> definition lists with alternatives like you did in your patches. Personally I
> don't see much gain in using definitions instead of a simple paragraph. But if
> you really think it's an improvement in some way, it could probably be added to
> automarkup in the way you described.

Thank you for checking this!

Kernel docs use definition lists a lot. In the initial versions, a definition
list was equivalent to:

    **Something to be written with emphasis**

    Some description

Sphinx later changed the look-and-feel for the term, on html output, but the
thing is that:

    Something to be written with emphasis
        Some description

looks a lot better when read as a text file. Also, in some cases, the first
notation doesn't work. The definition list was the only way I know of to
apply emphasis to a literal block.

We can avoid using Documentation/foo in definition lists: the 4 current cases
where :doc:`foo` was used are already addressed in this series, and the output
is acceptable.

Yet, I have a couple of concerns:

1. There might be some unknown places where a definition list is used for
   Documentation/foo;

2. It is not trivial to identify if someone adds Documentation/foo to a
   definition list in the future;

3. I suspect that there are several places where functions and structs
   appear in definition lists.

(1) can probably be checked with a multi-line grep. So, not a big problem;

(2) is something that would require someone to verify from time to time;

but (3) is harder to check and seems to be a valid use-case.

Due to (3), I think we should let automarkup parse non-literal terms in
definition lists. At the very least it should emit a warning when it won't be
doing auto-conversions for known patterns in definition lists (if doing that
would generate false-positives).

Thanks,
Mauro
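
The "multi-line grep" for concern (1), and the kind of report a warning for concern (3) might produce, could look roughly like the sketch below. It is a heuristic over raw rST text, not part of automarkup.py: `unlinked_terms()`, `PATTERNS` and `BULLET` are hypothetical names, the function pattern is a simplification, and a "term" is assumed to be any non-bullet line immediately followed by a more deeply indented line.

```python
import re
from typing import List, Tuple

# Simplified patterns; the real automarkup.py regexes may differ (assumption).
PATTERNS = {
    "document path": re.compile(r"Documentation/[\w./-]+\.rst"),
    "function": re.compile(r"\b\w+\(\)"),
}
# Lines starting a bullet or enumerated list item are not definition terms.
BULLET = re.compile(r"\s*(?:[-*+]|\d+\.)\s")


def _indent(line: str) -> int:
    """Number of leading whitespace characters."""
    return len(line) - len(line.lstrip(" \t"))


def unlinked_terms(source: str) -> List[Tuple[int, str, str]]:
    """Return (line number, kind, match) for known patterns found on lines
    that look like definition-list terms."""
    lines = source.splitlines()
    hits: List[Tuple[int, str, str]] = []
    for number, (term, following) in enumerate(zip(lines, lines[1:]), start=1):
        if not term.strip() or not following.strip():
            continue  # a term is directly followed by its definition
        if BULLET.match(term) or _indent(following) <= _indent(term):
            continue  # not shaped like "term" + indented definition
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(term):
                hits.append((number, kind, match.group(0)))
    return hits


sample = (
    "Documentation/dev-tools/kunit/api/test.rst\n"
    "  documents all of the standard testing API\n"
    "\n"
    "some_function()\n"
    "  hypothetical term naming a function\n"
    "\n"
    "See Documentation/process/howto.rst for details.\n"
)

hits = unlinked_terms(sample)
for number, kind, text in hits:
    print(f"line {number}: {kind} '{text}' in a definition term will not be linked")
```

The last line of the sample is an ordinary paragraph, so it is not reported. A check like this could be run periodically to cover concern (2), at the cost of false positives wherever the indentation heuristic misfires.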