Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhanced the efficiency of power calculations #26606

Merged
merged 3 commits into from
May 20, 2024

Conversation

haru-44
Copy link
Member

@haru-44 haru-44 commented May 18, 2024

References to other Issues or PRs

Brief description of what is fixed or changed

Other comments

Release Notes

NO ENTRY

@sympy-bot
Copy link

Hi, I am the SymPy bot. I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

  • No release notes entry will be added for this pull request.
Click here to see the pull request description that was parsed.
<!-- Your title above should be a short description of what
was changed. Do not include the issue number in the title. -->

#### References to other Issues or PRs
<!-- If this pull request fixes an issue, write "Fixes #NNNN" in that exact
format, e.g. "Fixes #1234" (see
https://tinyurl.com/auto-closing for more information). Also, please
write a comment on that issue linking back to this pull request once it is
open. -->


#### Brief description of what is fixed or changed


#### Other comments


#### Release Notes

<!-- Write the release notes for this release below between the BEGIN and END
statements. The basic format is a bulleted list with the name of the subpackage
and the release note for this PR. For example:

* solvers
  * Added a new solver for logarithmic equations.

* functions
  * Fixed a bug with log of integers. Formerly, `log(-x)` incorrectly gave `-log(x)`.

* physics.units
  * Corrected a semantical error in the conversion between volt and statvolt which
    reported the volt as being larger than the statvolt.

or if no release note(s) should be included use:

NO ENTRY

See https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more
information on how to write release notes. The bot will check your release
notes automatically to see if they are formatted correctly. -->

<!-- BEGIN RELEASE NOTES -->
NO ENTRY
<!-- END RELEASE NOTES -->

Copy link

github-actions bot commented May 18, 2024

Benchmark results from GitHub Actions

Lower numbers are good, higher numbers are bad. A ratio less than 1
means a speed up and greater than 1 means a slowdown. Green lines
beginning with + are slowdowns (the PR is slower then master or
master is slower than the previous release). Red lines beginning
with - are speedups.

Significantly changed benchmark results (PR vs master)

Significantly changed benchmark results (master vs previous release)

| Change   | Before [2487dbb5]    | After [8519e0f7]    |   Ratio | Benchmark (Parameter)                                                |
|----------|----------------------|---------------------|---------|----------------------------------------------------------------------|
| -        | 70.8±0.6ms           | 44.2±0.2ms          |    0.62 | integrate.TimeIntegrationRisch02.time_doit(10)                       |
| -        | 68.1±0.7ms           | 43.4±0.4ms          |    0.64 | integrate.TimeIntegrationRisch02.time_doit_risch(10)                 |
| +        | 18.7±0.6μs           | 29.8±0.6μs          |    1.59 | integrate.TimeIntegrationRisch03.time_doit(1)                        |
| -        | 5.41±0.02ms          | 2.87±0.01ms         |    0.53 | logic.LogicSuite.time_load_file                                      |
| -        | 73.0±0.3ms           | 28.5±0.1ms          |    0.39 | polys.TimeGCD_GaussInt.time_op(1, 'dense')                           |
| -        | 73.0±0.3ms           | 28.9±0.2ms          |    0.4  | polys.TimeGCD_GaussInt.time_op(1, 'sparse')                          |
| -        | 252±1ms              | 124±0.5ms           |    0.49 | polys.TimeGCD_GaussInt.time_op(2, 'dense')                           |
| -        | 251±1ms              | 125±0.4ms           |    0.5  | polys.TimeGCD_GaussInt.time_op(2, 'sparse')                          |
| -        | 643±2ms              | 374±3ms             |    0.58 | polys.TimeGCD_GaussInt.time_op(3, 'dense')                           |
| -        | 653±2ms              | 372±1ms             |    0.57 | polys.TimeGCD_GaussInt.time_op(3, 'sparse')                          |
| -        | 496±4μs              | 283±3μs             |    0.57 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(1, 'dense')            |
| -        | 1.74±0.01ms          | 1.04±0.01ms         |    0.6  | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(2, 'dense')            |
| -        | 5.77±0.07ms          | 3.07±0.03ms         |    0.53 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(3, 'dense')            |
| -        | 450±10μs             | 228±2μs             |    0.51 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(1, 'dense')               |
| -        | 1.48±0.01ms          | 680±4μs             |    0.46 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(2, 'dense')               |
| -        | 4.85±0.02ms          | 1.65±0.01ms         |    0.34 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(3, 'dense')               |
| -        | 369±1μs              | 207±1μs             |    0.56 | polys.TimeGCD_SparseGCDHighDegree.time_op(1, 'dense')                |
| -        | 2.40±0.02ms          | 1.24±0ms            |    0.52 | polys.TimeGCD_SparseGCDHighDegree.time_op(3, 'dense')                |
| -        | 9.97±0.05ms          | 4.41±0.02ms         |    0.44 | polys.TimeGCD_SparseGCDHighDegree.time_op(5, 'dense')                |
| -        | 356±3μs              | 172±1μs             |    0.48 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(1, 'dense')            |
| -        | 2.49±0.01ms          | 892±3μs             |    0.36 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(3, 'dense')            |
| -        | 9.59±0.1ms           | 2.66±0.02ms         |    0.28 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(5, 'dense')            |
| -        | 1.01±0.01ms          | 429±4μs             |    0.42 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'dense')           |
| -        | 1.70±0.01ms          | 520±3μs             |    0.31 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'sparse')          |
| -        | 5.92±0.04ms          | 1.78±0.01ms         |    0.3  | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'dense')           |
| -        | 8.36±0.02ms          | 1.52±0ms            |    0.18 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'sparse')          |
| -        | 290±0.5μs            | 64.7±0.5μs          |    0.22 | polys.TimePREM_QuadraticNonMonicGCD.time_op(1, 'sparse')             |
| -        | 3.48±0.03ms          | 394±6μs             |    0.11 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'dense')              |
| -        | 3.95±0.01ms          | 283±1μs             |    0.07 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'sparse')             |
| -        | 7.06±0.07ms          | 1.25±0ms            |    0.18 | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'dense')              |
| -        | 8.63±0.06ms          | 850±9μs             |    0.1  | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'sparse')             |
| -        | 5.00±0.02ms          | 3.03±0.01ms         |    0.61 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(2, 'sparse') |
| -        | 12.0±0.09ms          | 6.63±0.02ms         |    0.55 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'dense')  |
| -        | 22.3±0.2ms           | 9.16±0.1ms          |    0.41 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'sparse') |
| -        | 5.22±0.05ms          | 885±9μs             |    0.17 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(1, 'sparse')    |
| -        | 12.5±0.04ms          | 7.17±0.06ms         |    0.57 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(2, 'sparse')    |
| -        | 100±0.8ms            | 26.2±0.06ms         |    0.26 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'dense')     |
| -        | 164±0.7ms            | 55.0±0.1ms          |    0.34 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'sparse')    |
| -        | 174±1μs              | 113±2μs             |    0.65 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'dense')      |
| -        | 360±1μs              | 214±0.7μs           |    0.6  | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'sparse')     |
| -        | 4.20±0.03ms          | 844±3μs             |    0.2  | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'dense')      |
| -        | 5.26±0.01ms          | 385±2μs             |    0.07 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'sparse')     |
| -        | 19.8±0.09ms          | 2.80±0.01ms         |    0.14 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'dense')      |
| -        | 22.7±0.2ms           | 633±2μs             |    0.03 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'sparse')     |
| -        | 483±4μs              | 137±0.8μs           |    0.28 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(1, 'sparse') |
| -        | 4.82±0.03ms          | 612±2μs             |    0.13 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'dense')  |
| -        | 5.24±0.03ms          | 141±0.7μs           |    0.03 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'sparse') |
| -        | 13.1±0.08ms          | 1.33±0ms            |    0.1  | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'dense')  |
| -        | 13.6±0.06ms          | 144±0.8μs           |    0.01 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'sparse') |
| -        | 132±0.7μs            | 75.3±1μs            |    0.57 | solve.TimeMatrixOperations.time_rref(3, 0)                           |
| -        | 251±1μs              | 89.0±0.4μs          |    0.36 | solve.TimeMatrixOperations.time_rref(4, 0)                           |
| -        | 24.3±0.2ms           | 10.3±0.08ms         |    0.42 | solve.TimeSolveLinSys189x49.time_solve_lin_sys                       |
| -        | 28.9±0.3ms           | 15.3±0.1ms          |    0.53 | solve.TimeSparseSystem.time_linsolve_Aaug(20)                        |
| -        | 56.0±0.2ms           | 24.9±0.1ms          |    0.44 | solve.TimeSparseSystem.time_linsolve_Aaug(30)                        |
| -        | 28.7±0.2ms           | 15.2±0.2ms          |    0.53 | solve.TimeSparseSystem.time_linsolve_Ab(20)                          |
| -        | 55.9±0.3ms           | 24.7±0.2ms          |    0.44 | solve.TimeSparseSystem.time_linsolve_Ab(30)                          |

Full benchmark results can be found as artifacts in GitHub Actions
(click on checks at the top of the PR).

result = result*self
# this method can be improved instead of just returning the
# multiplication of elements
x = self.inverse()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to just use pow(x, n)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will a RecursionError occur?

@asmeurer
Copy link
Member

This looks reasonable to me, but we should check that all these methods are actually tested with nontrivial powers.

Comment on lines 598 to 604
result = [0]*len(self)
if not n:
return self.rebuild(result)
x = self.exponents
while True:
if n % 2:
result = monomial_mul(result, x)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this just use monomial_pow?

@oscarbenjamin
Copy link
Contributor

In some cases iterated multiplication is more efficient than exponentiation by squaring if multiplication is not constant time (e.g. for polynomials).

@haru-44
Copy link
Member Author

haru-44 commented May 20, 2024

In some cases iterated multiplication is more efficient than exponentiation by squaring if multiplication is not constant time (e.g. for polynomials).

Is that the case for small n?
For sufficiently large n, multiplication can still be efficient even if it is not constant time. If the computational cost of multiplication is $m(n)$, the cost of iterated multiplication is $O(m(n)n)$, while the cost of exponentiation is $O(m(n) \log n)$.

@oscarbenjamin
Copy link
Contributor

The computational cost of multiplication depends on the size of both operands like M(n1,n2) and for e.g. polynomials with n1 and n2 terms is quadratic like M(n1,n2) ~ n1*n2 at least if the basic polynomial multiplication algorithm is used. In iterated multiplication one operand stays small but in exponentiation by squaring both operands grow.

Consider something like (x + y + z + t)^8. The number of terms is:

In [37]: [len((Poly(x+y+z+t)**n).coeffs()) for n in range(1, 8+1)]
Out[37]: [4, 10, 20, 35, 56, 84, 120, 165]

In exponentiation by squaring the number of multiplies in the ground domain is:

In [38]: 4*4 + 10*10 + 35*35
Out[38]: 1341

For iterated multiplication it is:

In [45]: 4*4 + 4*10 + 4*20 + 4*35 + 4*56 + 4*84 + 4*120
Out[45]: 1316

These numbers are close but raising to the power 16 the difference is greater:

In [51]: 4*sum([len((Poly(x+y+z+t)**n).coeffs()) for n in range(1, 16)])
Out[51]: 15500

In [52]: 4*4 + 10*10 + 35*35 +165*165
Out[52]: 28566

The more symbols you have and the higher the exponent then the more iterated multiplication is better.

If you have a polynomial $p$ of degree $d$ in $r$ symbols and you want to compute $p^n$ then the iterative method is $O(d^{2r}n^{r+1})$ and exponentiation by squaring is $O(d^{2r}n^{2r})$ assuming constant time coefficient multiplication. The iterative method is asymptotically faster by a factor of $n^{r-1}$.

Counting only the number of coefficient multiplies assumes that multiplication in the coefficient domain is constant which is only true for RR, CC and GF(p). In other domains like ZZ the coefficients grow in size as well as the number of terms compounding the effect that multiplying two large operands is slower than multiplying a small operand by a larger one.

A more complete analysis is in e.g.

[HEI72] L. E. HEINDEL, "Computation of Powers of Multivariate Polynomials over the
Integers," J. Comput. System Sci. 1 (1972), pp. 1-8.

https://www.sciencedirect.com/science/article/pii/S0022000072800370

There are also faster algorithms that are more complicated but if we want to stick to simple algorithms then we should remember that iterated multiplication is not necessarily slower than exponentiation by squaring.

@haru-44
Copy link
Member Author

haru-44 commented May 20, 2024

Thanks for the explanation. I lacked a thorough understanding of multivariate polynomials.

@oscarbenjamin
Copy link
Contributor

Looks good. Thanks

@oscarbenjamin oscarbenjamin merged commit 3cc42c4 into sympy:master May 20, 2024
48 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants