From: Alexander Barkov
Date: December 29 2010 1:54pm
Subject: Re: bzr commit into mysql-5.1-bugteam branch (martin.hansson:3534)
Bug#58165
List-Archive: http://lists.mysql.com/commits/127668
Message-Id: <4D1B3D87.2050304@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Hi Martin,
Martin Hansson wrote:
> Hi Alexander and Sergey!
>
> Alexander Barkov skrev 2010-12-29 11.14:
>> Hi Sergey, Martin,
>>
>> Sergey Glukhov wrote:
>>> Hi Martin,
>>>
>>> On 12/28/2010 11:33 AM, Martin Hansson wrote:
>>>> Hi Sergey,
>>>>
>>>> Sergey Glukhov skrev 2010-12-23 12.44:
>>>>> Hi Martin, see comments below
>>>>>
>>>>> On 12/22/2010 03:30 PM, Martin Hansson wrote:
>>>>>> #At file:///data0/martin/bzrroot/bug58165/5.1bt-lazy_copy/ based
>>>>>> on revid:sergey.glukhov@stripped
>>>>>>
>>>>>>
>>>>>> modified:
>>>>>> mysql-test/r/func_str.result
>>>>>> mysql-test/t/func_str.test
>>>>>> sql/item_strfunc.cc
>>>>>> sql/sql_string.cc
>>>>> ...
>>>>>> === modified file 'sql/item_strfunc.cc'
>>>>>> --- a/sql/item_strfunc.cc 2010-11-11 10:25:23 +0000
>>>>>> +++ b/sql/item_strfunc.cc 2010-12-22 12:30:13 +0000
>>>>>> @@ -1305,8 +1305,15 @@ String *Item_func_substr_index::val_str(
>>>>>> null_value=0;
>>>>>> uint delimiter_length= delimiter->length();
>>>>>> if (!res->length() || !delimiter_length || !count)
>>>>>> - return&my_empty_string; // Wrong parameters
>>>>>> -
>>>>>> + {
>>>>>> + if (str->copy("", 0, default_charset_info)) // Wrong
>>>>>> parameters
>>>>>> + {
>>>>>> + /* Out of memory */
>>>>>> + null_value= true;
>>>>>> + return NULL;
>>>>>> + }
>>>>>> + return str;
>>>>>> + }
>>>>>> res->set_charset(collation.collation);
>>>>>>
>>>>>> #ifdef USE_MB
>>>>>>
>>>>>
>>>>> Using my_empty_string is not safe, the bug shows it.
>>>>> So we should remove my_empty_string from all places
>>>>> where it is used.
>>>> Removing my_empty_string altogether is way too large to be done as
>>>> an MRU bug fix (Believe me, I tried.) I suggest filing a separate
>>>> report about it. Let's make this customer happy first and then we'll
>>>> deal with the other problems we discovered while working on it.
>>> I was too optimistic with the request about removing my_empty_string
>>> from all places. As far as I can see problematic places are
>>> set_var.cc, sql_class.cc. However we could fix all places in
>>> item_strfunc.cc at least. it should be not a problem.
>>> It seems we should introduce new method, say
>>>
>>> Item::empty_string()
>>> {
>>> str_value.set("", 0, collation.collation);
>>> return &str_value;
>>> }
>>
>> Sergey, I like your idea about introducing this new method.
>> Please check if it can be moved to Item_str_func:
>>
>> String *Item_str_func::empty_string()
>> {
>> str_value.set("", 0, collation.collation);
>> return &str_value;
>> }
> Nice idea, will save us from cluttering up the code.
>>
>>>
>>> and replace &my_empty_string with empty_string().
>>>
>>
>> It should be safe to replace all &my_empty_string instances
>> to the new method calls in item_str_func.cc,
>> even under terms of a MRU bug. Looks like a fairly small change:
>> "grep my_empty_string item_strfunc.cc" reports about 22 lines.
> done. I've added a comment that it should be removed.
>>
>>
>> Note, my_empty_string is a reminder from 4.0 times,
>> when we had a single global character set for the entire server.
>> my_empty_string should be eventually removed everywhere from the
>> sources. Fixing item_strfunc.cc looks like a good start.
> Thank you for the background. It's really strange that the bug has
> stayed under the radar for that long.
I think this is because it shows up under very rare conditions,
e.g. wrong arguments, or end-of-memory.
>
> I also have a question for you: Why do we even return a *pointer* to a
> String object from the Item_*::val_str() methods? I see no good reason
> for this. The String class is really small, only 5 small members and we
> could almost always exploit the return value optimization anyways. We
> lose all of the safety benefits of the shared string buffers by passing
> pointers around. For instance this bug would not exists if we did that.
> To a C++ programmer this is completely insane.
>
> Some historical reason?
This is done for performance reasons.
If we returned objects instead of pointers, then for every row
we'd have to do "new String(...)", then allocate Ptr,
then send result to the client side in protocol.cc, and then
free the memory allocated for the String and its Ptr.
Now String is created only one time per Item, but what's more important,
Ptr() is allocated one time! So val_str() reuses the same buffer
for every row.
>
> Thank you for the review!
Will you commit the new version soon?
Thanks.
>
> Best Regards
>
> Martin