Josh Berkus wrote:
> Tom,
>
>
>>As for the dependency issue, one man's bug is another man's feature.
>>I think the fact that we don't track the internal dependencies of
>>functions is not all bad. We've certainly seen plenty of complaints
>>about how you can't easily change tables that a view is depending on
>>because the view dependencies block it...
>
>
> I'd agree with this. I write about 150,000 lines of function code a year,
> and if I had to rebuild all of the cascading functions every time I change
> a table they way I have to with views, it would probably add 20% to my
> overall application development time.
>
> BTW, the other thing that we're still TODOing on SRFs (as far as I know) is
> finding ways to change the row estimate for an SRF. It's still a flat
> 1000 in the code, which can cause a lot of bad query plans. I proposed a
> year ago that, as a first step, we allow the function owner to assign a
> static estimate variable to the function (i.e. "average rows returned =
> 5'). This doesn't solve the whole problem of SRF estimates but it would
> be a significant step forwards in being able to use them in queries.
>
This would only seem to work for trivial functions. Most functions that I write
are themselves dependent on underlying tables, and without any idea how many
rows are in the tables, and without any idea of the statistical distribution of
those rows, I can't really say anything like "average rows returned = 5".
What I have wanted for some time is a function pairing system. For each set
returning function F() I create, I would have the option of creating a
statistics function S() which returns a single integer which represents the
guess of how many rows will be returned. S() would be called by the planner,
and the return value of S() would be used to decide the plan. S() would need
access to the table statistics information. I imagine that the system would
want to prevent S() from running queries, and only allow it to call certain
defined table statistics functions and some internal math functions, thereby
avoiding any infinite recursion in the planner. (If S() ran any queries, those
queries would go yet again to the planner, and on down the infinite recursion
you might go.)
Of course, some (possibly most) people could chose not to write an S() for their
F(), and the default of 1000 rows would continue to be used. As such, this new
extension to the system would be backwards compatible to functions which don't
have an S() defined.
Thoughts?