GSoC ideas

Cera, Tim <tim <at> cerazone.net>
2015-03-03 15:30:54 GMT

These ideas are things that I wanted to tackle, but they might instead get better play as part of the GSoC program.

1. In the scipy.ndimage package, many functions have a 'mode' option that defines the padding process used to minimize edge effects. I think it would be better to use numpy's pad function instead. With this approach, as improvements are made to numpy's pad function, the scipy.ndimage package benefits.
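To illustrate the idea, here is a minimal sketch of np.pad covering the same edge-handling strategies that ndimage's 'mode' option provides; the name mapping in the comments (ndimage 'nearest' vs. np.pad 'edge', etc.) is my understanding of the two APIs, not something stated in the post:

```python
import numpy as np

# np.pad offers the same edge-handling strategies as ndimage's `mode`
# option, under different names (roughly: ndimage 'nearest' ~ np.pad
# 'edge', ndimage 'reflect' ~ 'symmetric', ndimage 'mirror' ~ 'reflect',
# ndimage 'wrap' ~ 'wrap').
x = np.array([1, 2, 3, 4])
print(np.pad(x, 2, mode='edge'))       # [1 1 1 2 3 4 4 4]
print(np.pad(x, 2, mode='symmetric'))  # [2 1 1 2 3 4 4 3]
```

A filter could then pad its input with np.pad first and run on the padded array, instead of reimplementing each boundary mode internally.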

2. Something that could be useful to me is to update ODRPACK to ODRPACK95. ODRPACK95 can be found at http://www.netlib.org/toms/869.zip. At the same time, I suggest implementing the new odr in scipy.optimize. The license is not specified in the ODRPACK95 software, but I found this in the netlib FAQ:

Most netlib software packages have no restrictions on their use but we recommend you check with the authors to be sure. Checking with the authors is a nice courtesy anyway since many authors like to know how their codes are being used.

Facilitate navigating to latest version of the docs

Robert McGibbon <rmcgibbo <at> gmail.com>
2015-03-03 10:38:52 GMT

Hey,

I've noticed that when googling for the documentation of a scipy function, I often get docs from a mix of different scipy versions. Furthermore, on the mailing list it's somewhat common for people to ask questions that are based on the docstrings of an older version of scipy (this might be because they're using an older version of scipy, but I think in many cases it's simply what came up in their search).

In the web documentation for scikit-learn, the version that you're browsing is displayed somewhat prominently. Also for particularly old versions of the docs, a red bar at the top of the screen lets you know that you're browsing an outdated version, and offers a link to the latest stable version. See this page, for example.

results in taking a single step that (typically) goes beyond the output time requested in the solver. When running, for example, Monte Carlo algorithms, this leads to a big performance hit, because one must take a step back, reset the solver, and then use the normal mode to go to the requested stop time. Instead, these solvers support a mode (5) that will never step beyond the end time. The modified step function in that case is:

Currently, in order to implement this, one needs to create their own ODE integrator subclass of VODE or ZVODE, overload the step function, create an ode instance, and finally add the custom integrator using ode._integrator. I think supporting both options natively would be a nice thing to have in SciPy.

In addition, it is often not necessary to do a full reset of the ode solver using ode.reset(). Often one just needs to change the RHS vector (and possibly the time) and set the flag for the solver to start anew (ode._integrator.call_args[3] = 1). This too results in a large performance benefit for things like Monte Carlo solving. Right now I need to call

ode._y = new_vec
ode._integrator.call_args[3] = 1

when I want to accomplish this. Adding support for a “fast reset” might also be a good thing to have in SciPy.
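The "fast reset" described above could be wrapped in a small helper; this is a hypothetical sketch (fast_reset is not an existing scipy function), and it pokes at private scipy.integrate.ode internals, so it may break between scipy versions:

```python
def fast_reset(solver, new_y, t=None):
    # Hypothetical helper implementing the "fast reset" from the post:
    # swap in a new state vector (and optionally a new time) without a
    # full solver.reset(). Relies on private scipy.integrate.ode
    # attributes (_y, _integrator.call_args), as described above.
    solver._y = new_y
    if t is not None:
        solver.t = t
    # Flag the underlying VODE/ZVODE integrator to start anew; the
    # call_args[3] index comes from the post, not public API.
    solver._integrator.call_args[3] = 1
```

Native support in SciPy would make such private-attribute poking unnecessary.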

Split signal.lti class into subclasses

Felix Berkenkamp <befelix <at> ethz.ch>
2015-03-01 15:47:30 GMT

Hi everyone,
I started looking into improving the signal.lti class following the
issue discussed at
https://github.com/scipy/scipy/issues/2912
The pull request can be found here:
https://github.com/scipy/scipy/pull/4576
The main idea is to split the lti class into ss, tf, and zpk subclasses.
Calling the lti class itself returns instances of these three subclasses.
Advantages
* No redundant information (lti class currently holds the information of
all 3 classes)
* Reduce overhead (creating 3 system representations)
* Switching between the different subclasses is more explicit: obj.ss(),
obj.tf(), obj.zpk()
* Avoids one huge class for everything
* Is fully backwards compatible (as far as I can tell)
* Similar to what Octave / Matlab does (easier to convert code from
there to scipy)
Disadvantages:
* Accessing properties that are not part of the subclass is more
expensive (e.g.: sys = ss(1,1,1,1); sys.num --> this now returns
sys.tf().num).
Any suggestions / comments / things I've broken?
Best,
Felix
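The dispatch idea in the PR could look roughly like the following simplified sketch. The class names and the 2/3/4-argument convention mirror scipy.signal's tf/zpk/ss representations, but this is not the actual PR code:

```python
class lti(object):
    # Sketch: calling the lti base class dispatches to a subclass based
    # on the number of arguments (2 -> transfer function, 3 ->
    # zeros/poles/gain, 4 -> state space), as the PR proposes.
    def __new__(cls, *system):
        if cls is lti:
            target = {2: TransferFunction, 3: ZerosPolesGain, 4: StateSpace}
            try:
                cls = target[len(system)]
            except KeyError:
                raise ValueError("needs 2 (tf), 3 (zpk) or 4 (ss) arguments")
        return super(lti, cls).__new__(cls)

    def __init__(self, *system):
        self.system = system

class TransferFunction(lti): pass
class ZerosPolesGain(lti): pass
class StateSpace(lti): pass

sys1 = lti([1], [1, 2])  # two arguments -> transfer-function subclass
```

This keeps `lti(...)` calls backwards compatible while each subclass stores only its own representation.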

Again partially answered in PR: "It's stalled: the algorithmic part is OK, the new interfaces proposed controversial.", "However, this could perhaps be extended to Levenberg-Marquardt supporting sparse Jacobians"

3) Based on 2: how should a GSoC student proceed with the interface issue? I mean, there weren't any strong opinions, and it has been on the list for so long. I have no idea how to come up with a good solution all of a sudden.

4) Do you believe that code written during GSoC should be based on the PR mentioned?

Spherical Harmonics and Condon-Shortley phase

I'm working with auralization and Ambisonics, and the directivity patterns used with Ambisonics are spherical harmonics. Scipy has an implementation, scipy.special.sph_harm. However, several definitions of spherical harmonics exist, and the documentation does not specify which one is implemented.

A common definition used in quantum mechanics includes the Condon-Shortley phase, which is a (-1)**m factor.
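The effect of the phase can be seen by evaluating one harmonic explicitly; this sketch uses the closed-form expression for Y_1^1 (the function name y_1_1 and the condon_shortley flag are illustrative, not part of scipy):

```python
import cmath
import math

def y_1_1(theta, phi, condon_shortley=True):
    # Y_1^1 evaluated from its closed form (theta = polar angle,
    # phi = azimuth). With the Condon-Shortley phase,
    # Y_1^1 = -sqrt(3/(8*pi)) * sin(theta) * exp(i*phi); without it,
    # the sign flips, i.e. the conventions differ by (-1)**m with m = 1.
    val = math.sqrt(3.0 / (8.0 * math.pi)) * math.sin(theta) * cmath.exp(1j * phi)
    return -val if condon_shortley else val
```

For even m the factor (-1)**m is +1, so the two conventions only disagree for odd m, which is exactly why an unstated convention is easy to miss.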

Re: Slow moment function in scipy.stats

stefan <stefan.peterson <at> rubico.com>
2015-02-24 11:42:36 GMT

Hi Ralf and Julian
I have created a pull request (I hope; I have very limited experience with git and github). After doing so, I saw Julian's post. Indeed, pow is very accurate. I would argue that for all realistic uses of the moment function the inaccuracies will be insignificant, but this is certainly up for discussion.
BR,
Stefan
On 02/24/2015 08:17 AM, Ralf Gommers wrote:
> Hi Stefan,
>
>
> On Thu, Feb 19, 2015 at 12:55 PM, stefan <stefan.peterson <at> rubico.com
> <mailto:stefan.peterson <at> rubico.com>> wrote:
>
> Hello list,
>
> First time poster here. Anyway, some time ago I noticed that the
> scipy skewness function was a major bottleneck in an algorithm of
> mine. Back then, I typed up my own replacement and thought no more
> about it. Today, for some unknown reason, I decided to dig a little
> deeper in this and found the major culprit to be the way moments are
> computed, specifically the use of np.power.
>
>
> np.power is indeed slow, see for explanations:
> http://stackoverflow.com/questions/25254541/why-is-numpy-power-60x-slower-than-in-lining
> http://stackoverflow.com/questions/26770996/why-is-numpy-power-slower-for-integer-exponents
>
>
> I'd say that replacing one call to np.power with 6 lines of code to
> achieve a ~10x speedup is a good tradeoff. Pull request is welcome:)
>
It's faster but also less accurate; the reason pow is so slow is that it
has an accuracy of 0.5 ulp regardless of input, which a multiplication
does not.
But the argument can certainly be made that this is irrelevant,
especially as the moment computation performed here is numerically not
that stable in the first place (uncorrected loss of significance in the
subtraction; the mean has an error on the order of the array size ...).
Maybe numpy should have a special integer case in power for such situations?
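The kind of replacement under discussion, computing an integer power with multiplications only, can be sketched with binary exponentiation (int_power is a hypothetical name, not the actual code in the pull request):

```python
def int_power(x, n):
    # x**n for a positive integer n via binary exponentiation: the
    # exponent is consumed bit by bit, using only multiplications
    # instead of a general-purpose pow. This is the style of trick
    # discussed above for speeding up small integer moments.
    if n < 1:
        raise ValueError("n must be a positive integer")
    result = None
    base = x
    while n:
        if n & 1:
            result = base if result is None else result * base
        base = base * base
        n >>= 1
    return result
```

Since only `*` is used, the same function works elementwise on numpy arrays; the accuracy caveat raised above applies, as each multiplication rounds independently rather than guaranteeing 0.5 ulp overall.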

scipy.signal function naming

Ralf Gommers <ralf.gommers <at> gmail.com>
2015-02-24 05:47:19 GMT

Hi,

Historically, many names in scipy.signal have followed Matlab, which typically chooses short but very nondescriptive names. I would prefer not to keep doing that but instead to choose readable names that fit with the rest of Scipy and Python. Recent examples from PRs, with a proposed alternative in brackets: