Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix qiskit.circuit method header and broken cross-reference #12394

Merged
merged 4 commits into from May 17, 2024

Conversation

arnaucasau
Copy link
Contributor

Part of Qiskit/documentation#1275

The __array__ method has a wrong header in the API docs (as we can see in the following screenshot) because Sphinx is not able to set the correct id in the generated HTML for methods starting with double underscore. Instead, Sphinx truncates the name to array__.

Captura desde 2024-05-13 14-31-08

This PR changes the method to be instead builtins.__array__, and it also fixes a broken reference for Instruction.params. The cross-reference was broken due to the .. currentmodule:: None directive.

This is the result of Sphinx after this PR:

Captura desde 2024-05-13 14-27-06

The API docs will have the correct header because we only use the part after the last dot in the id.

The PR will need backport to stable/1.1

@arnaucasau arnaucasau requested a review from a team as a code owner May 13, 2024 12:42
@qiskit-bot qiskit-bot added the Community PR PRs from contributors that are not 'members' of the Qiskit repo label May 13, 2024
@qiskit-bot
Copy link
Collaborator

Thank you for opening a new pull request.

Before your PR can be merged it will first need to pass continuous integration tests and be reviewed. Sometimes the review process can be slow, so please be patient.

While you're waiting, please feel free to review other open PRs. While only a subset of people are authorized to approve pull requests for merging, everyone is encouraged to review open pull requests. Doing reviews helps reduce the burden on the core team and helps make the project's code better for everyone.

One or more of the the following people are requested to review this:

  • @Qiskit/terra-core

Copy link
Member

@jakelishman jakelishman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catching the Instruction.params thing.

The other problem here is not Sphinx, it's the scripts that are scraping the HTML output. I know it's not super popular, but these sorts of problems will keep cropping up and needing work for as long as the API generation scripts keep scraping the HTML rather than using Sphinx's data directly, such as by writing a Sphinx Builder that directly generates the target MDX.

.. currentmodule:: None
.. currentmodule:: builtins
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this changed? __array__ is not a Python builtin and isn't something that's part of the builtins module - it doesn't have a module, which is spelled currentmodule:: None in Sphinx.

Sphinx might be generating a modified id to avoid HTML problems, but if that's causing problems, it's on the scripts that are parsing the output of Sphinx to fix that, not to change the semantics of the documentation. The Sphinx-generated HTML entries do correctly name this method __array__, and it appears correctly in objects.inv.

If it's really impossible to fix the conversion scripts to use Sphinx data (such as using objects.inv as a source of truth) rather than trying to scrape it from escaped HTML, then potentially the cleaner workaround is to document this as being a potential method on object (which is in builtins) or somewhat lying and documenting it as part of Instruction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been testing the use of objects.inv to determine the headers instead of reading the HTML and it would be more tricky than it seems. For example, for some classes, we have duplicated IDs that reference a rust 'class' and a python class:

py:class
    qiskit._accelerate.qasm3.CustomGate      apidoc/qasm3.html#qiskit.qasm3.CustomGate
    qiskit.qasm3.CustomGate                  apidoc/qasm3.html#qiskit.qasm3.CustomGate

I'm not sure if we can always differentiate between them, so maybe it's better to go with the workaround as __array__ is the only instance of this problem that we have found. In 15f9842, I changed builtins to object as you suggested, let me know if you think it's okay 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fwiw, the objects.inv format also includes a "priority" field, which might be what distinguishes the canonical reference? I'm not super sure. For example (the printf is just appending the magic numbers for zlib compression, and the tail cuts out the Sphinx comment header):

{ printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" ; tail -n+5 docs/_build/html/objects.inv ; } | gzip -d | grep 'CustomGate'
qiskit._accelerate.qasm3.CustomGate py:class -1 apidoc/qasm3.html#qiskit.qasm3.CustomGate -
qiskit.qasm3.CustomGate py:class 1 apidoc/qasm3.html#$ -

it's the -1 / 1 that I'm talking about. But that might be something different too.

Anyway - thanks for the change. What I meant was ever so slightly different: leaving .. currentmodule:: None, but documenting the method as object.__array__. I've updated the PR to be like that, and I checked that it appears like that in the HTML's id tag.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generated HTML is this, for reference:

<dl class="py method">
<dt class="sig sig-object py" id="object.__array__">
<span class="sig-prename descclassname"><span class="pre">object.</span></span><span class="sig-name descname"><span class="pre">__array__</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">dtype</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">copy</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">None</span></span></em><span class="sig-paren">)</span><a class="headerlink" href="#object.__array__" title="Permalink to this definition"></a></dt>
<dd><p>Return a Numpy array representing the gate. This can use the gate’s
<a class="reference internal" href="../stubs/qiskit.circuit.Instruction.html#qiskit.circuit.Instruction.params" title="qiskit.circuit.Instruction.params"><code class="xref py py-attr docutils literal notranslate"><span class="pre">params</span></code></a> field, and may assume that these are numeric
values (assuming the subclass expects that) and not
<a class="reference internal" href="#circuit-compile-time-parameters"><span class="std std-ref">compile-time parameters</span></a>.</p>
<p>For greatest efficiency, the returned array should default to a dtype of <a class="reference external" href="https://docs.python.org/3/library/functions.html#complex" title="(in Python v3.12)"><code class="xref py py-class docutils literal notranslate"><span class="pre">complex</span></code></a>.</p>
</dd></dl>

@coveralls
Copy link

coveralls commented May 13, 2024

Pull Request Test Coverage Report for Build 9120720858

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 122 unchanged lines in 11 files lost coverage.
  • Overall coverage decreased (-0.02%) to 89.608%

Files with Coverage Reduction New Missed Lines %
qiskit/passmanager/passmanager.py 1 92.63%
qiskit/transpiler/passes/synthesis/unitary_synthesis.py 2 88.2%
crates/qasm2/src/expr.rs 2 94.26%
crates/qasm2/src/lex.rs 4 92.11%
crates/accelerate/src/euler_one_qubit_decomposer.rs 4 89.68%
crates/qasm2/src/parse.rs 6 97.61%
qiskit/dagcircuit/dagnode.py 8 88.12%
qiskit/utils/parallel.py 10 82.76%
crates/circuit/src/circuit_instruction.rs 13 87.59%
crates/accelerate/src/sabre/neighbor_table.rs 13 73.13%
Totals Coverage Status
Change from base Build 9037095581: -0.02%
Covered Lines: 62297
Relevant Lines: 69522

💛 - Coveralls

@Eric-Arellano
Copy link
Collaborator

then potentially the cleaner workaround is to document this as being a potential method on object (which is in builtins)

Ah, good suggestion. @arnaucasau, pardon me leading you astray with builtins rather than object.

The other problem here is not Sphinx, it's the scripts that are scraping the HTML output.

Imo the problem is Sphinx. Sphinx is generating the HTML id as =array__ rather than __array__. Sphinx is specifically doing this when the currentmodule is set to None. Compare it to https://docs.quantum.ibm.com/api/qiskit/0.19/qiskit.transpiler.Layout#methods, where the HTML ID is correctly set by Sphinx as qiskit.transpiler.Layout.__getitem__. If we set currentmodule here to another value like object, then the ID gets set to object.__array__ correctly.

I know it's not super popular, but these sorts of problems will keep cropping up and needing work for as long as the API generation scripts keep scraping the HTML rather than using Sphinx's data directly, such as by writing a Sphinx Builder that directly generates the target MDX.

A non-trivial portion of the rendering bugs have been due to RST issues, rather than the conversion script: Qiskit/documentation#1200, Qiskit/documentation#1199, Qiskit/documentation#1275, #11744, #12162. Those would still be a problem.

The script-caused issues have been due to not knowing about generic edge cases—like aliases (Sampler vs SamplerV1). Now that those edge cases are fixed and tested, we haven't been seeing them regress.

It's always worth keeping that rewrite to a Sphinx builder in our back-pocket, but I don't think that's a good investment at the time.

@mtreinish mtreinish added Intern PR PR submitted by IBM Quantum interns and removed Community PR PRs from contributors that are not 'members' of the Qiskit repo labels May 14, 2024
@jakelishman
Copy link
Member

Sphinx is generating the HTML id as =array__ rather than __array__

Sure, but the id HTML attribute is meant to be purely opaque and not carry semantics - it's not part of Sphinx's data output other than a programmatic anchor point that's a little easier for humans to read than a random string. It's typically recommended that an id tag start with an ASCII letter, which is probably why Sphinx is dropping the underscores here - it's a type of escaping.

A non-trivial portion of the rendering bugs have been due to RST issues

Oh 100% - I'm not questioning that at all. I didn't mean to suggest that every problem on the docs is because of conversion script. Actually, mostly I'm not as concerned about the final rendering in the long term; rendering is the "easy" bit, in the sense that you can do almost anything with CSS if your structure is good, which you guys have been showing by making big improvements to the looks of the pages. It's the semantics of the generated page I'm concerned about, and that's where I think scraping the HTML is a scary avenue for errors and fragile to updates in Sphinx. At some point, I think a HTML scraping program is already doing as much work as a Sphinx builder on output (since you've got to convert your in-memory representation to MDX), it's just got an awful lot harder job on input, because it doesn't get given easily structured data, and has to guess what the original inputs were from that.


Anyway, it's not a practical change right now, and I'm sorry for being overly grumpy about the proposed PR.

@jakelishman jakelishman added documentation Something is not clear or an error documentation stable backport potential The bug might be minimal and/or import enough to be port to stable Changelog: None Do not include in changelog labels May 17, 2024
@jakelishman jakelishman added this pull request to the merge queue May 17, 2024
Merged via the queue into Qiskit:main with commit 8b56995 May 17, 2024
15 checks passed
mergify bot pushed a commit that referenced this pull request May 17, 2024
* Fix qiskit.circuit method header

* use object

* fix lint

* Correct method to be defined on `object`

---------

Co-authored-by: Jake Lishman <jake.lishman@ibm.com>
(cherry picked from commit 8b56995)
github-merge-queue bot pushed a commit that referenced this pull request May 17, 2024
#12424)

* Fix qiskit.circuit method header

* use object

* fix lint

* Correct method to be defined on `object`

---------

Co-authored-by: Jake Lishman <jake.lishman@ibm.com>
(cherry picked from commit 8b56995)

Co-authored-by: Arnau Casau <47946624+arnaucasau@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: None Do not include in changelog documentation Something is not clear or an error documentation Intern PR PR submitted by IBM Quantum interns stable backport potential The bug might be minimal and/or import enough to be port to stable
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

6 participants