I'll get back to this tomorrow evening. I promised to finish something
today. With respect to computing the sum of two random variables: this
turns out to be quite a challenge. To resolve this I decided to
formulate it as a possible assignment for students... Thus, this takes
somewhat longer than I expected. On the other hand, hopefully it leads
to some more scipy converts...
On 23 April 2012 22:04, <josef.pktd@gmail.com> wrote:
> On Mon, Apr 23, 2012 at 2:58 PM, nicky van foreest <vanforeest@gmail.com> wrote:
>>>> for xa, xb it doesn't matter whether they are larger or smaller than
>>>>> zero, so I don't think we need a special check
>>>> I think it does, for suppose that in the algo left = xa = 0.5 (because
>> the user has been fiddling with xa) and cdf(xa) > q. Then setting
>> left = 2*left will only worsen the problem. Or do I miss something?
>> True, however I don't think we have any predefined xa and xb that both
> are strictly positive or negative values.
> pareto is the only distribution bounded away from zero that I know and
> it has xa = -10
>>>>>>>> it looks good in a few more example cases.
>>>> I found another small bug, please see the included code.
>> later today
>>>>>>>>>>>>> The difficult cases will be where cdf also doesn't exist and we need
>>>>> to get it through integrate.quad, but I don't remember which
>>>>> distribution is a good case.
>>>> This case is harder indeed. (I assume you mean by 'not exist' that
>> there is no closed form expression for the cdf, like the normal
>> distribution). Computing the ppf would involve calling quad a lot of
>> times. This is wasteful especially since the computation of cdf(b)
>> includes the computation of cdf(a) for a < b, supposing that quad runs
>> from -np.inf to b. We could repair this by computing cdf(b) = cdf(a) +
>> quad(f, a, b), assuming that cdf(a) has been computed already.
>> (perhaps I am not clear enough here. If so, let me know.)
>> not exists = not defined as _cdf method could also be scipy.special
> if there are no closed form expressions
>> quad should run from dist.a to x, I guess, dist.a might be -inf
>>>>>>> I just think that we are not able to reach the q=0, q=1 boundaries,
>>>> since for some distributions we will run into other numerical
>>>> problems. And I'm curious how far we can get with this.
>>>> I completely missed to include a test on the obvious cases q >= 1. -
>> np.finfo(float).eps and q <= np.finfo(float).eps. It is now in the
>> attached file.
>>>>> findppf(stats.expon, 1e-30)
> -6.3593574850511882e-13
>> lower bound q can be small and won't run into problems with being 0,
> until 1e-300?
>> The right answer should be dist.b for q=numerically 1, lower support
> point is dist.a but I don't see when we would need it.
>>>>>>> Similarly, I don't know whether the default xa and xb are good. I
>>>> changed them for a few distributions, but only where I saw obvious
>>>> improvements.
>>>> I also have no clue what would be good values in general. The choices
>> seems reasonable from a practical point of view...
>>>>>>> Note: I removed the scale in your example, because internal _ppf works
>>>>> on the standard distribution, loc=0, scale=1. loc and scale are added
>>>>> generically in .ppf
>>>> Thanks. I included also **kwds so that I can pass scale = 10 or
>> something like this. Once all works as it should, I'll try to convert
>> the code such that it fits nicely in distributions.py.
>> with self instead of dist, it should already have the signature about
> right, no **kwds I assume
>>>>> The simultaneous updating of left and right in the previous algo is
>> wrong. Suppose for instance that cdf(left) < cdf(right) < q. Then both
>> left and right would `move to the left'. This is clearly wrong. The
>> included code should be better.
>> would move to the *right* ?
>> I thought the original was a nice trick, we can shift both left and
> right since we know it has to be in that direction, the cut of range
> cannot contain the answer.
>> Or do I miss the point?
>>>>> With regard to the values of xb and xa. Can a `ordinary' user change
>> these? If so, then the ppf finder should include some protection in my
>> opinion. If not, the user will get an error that brentq has not the
>> right limits, but this error might be somewhat unexpected. (What has
>> brentq to do with finding the ppf?) Of course, looking at the code
>> this is clear, but I expect most users will no do so.
>> I don't think `ordinary' users should touch xa, xb, but they could.
> Except for getting around the limitation in this ticket there is no
> reason to change xa, xb, so we could make them private _xa, _xb
> instead.
>>>>>> The code contains two choices about how to handle xa and xb. Do you
>> have any preference?
>> I don't really like choice 1, because it removes the use of the
> predefined xa, xb. On the other hand, with this extension, xa and xb
> wouldn't be really necessary anymore.
>> another possibility would be to try except brentq with xa, xb first
> and get most cases, and switch to your version if needed. I'm not sure
> xa, xb are defined well enough that it's worth to go this route,
> though.
>> Thanks for working on this,
>> Josef
>>>>> Thanks for your feedback. Very helpful.
>>>> _______________________________________________
>> SciPy-Dev mailing list
>>SciPy-Dev@scipy.org>>http://mail.scipy.org/mailman/listinfo/scipy-dev>>> _______________________________________________
> SciPy-Dev mailing list
>SciPy-Dev@scipy.org>http://mail.scipy.org/mailman/listinfo/scipy-dev