Øystein Grøvlen wrote:
> Jay,
>
> Thanks for taking time to share your ideas. I agree that we could gain
> a lot if we could pass value objects around instead of going through all
> the transformations you describe. However, this will be a pretty large
> task, but I think what I have done so far is a step in the right
> direction, and hopefully this work can evolve further.
Agreed :)
> Thanks,
>
> --
> Øystein
>
> Jay Pipes wrote:
>> Hi! I've looked through your code and have a few comments inline.
>> Please understand these are just suggestions; I'm actually very
>> supportive of the concept of Value objects in the runtime, and I'll
>> try to explain some alternate strategies to ponder...
>>
>> I will say ahead of time that I was not surprised to see this thread
>> immediately devolve into a discussion on the memory and performance
>> ramifications of the code change. It seems that the performance of
>> the bits most often takes precedence over everything else, especially
>> at the cost of maintainability and code readability. A Value Object
>> system is designed to increase code maintainability and ease of use.
>> It is *not* designed around performance. It may well be true that a
>> Value object system may add additional memcpy's. It may also be true
>> that a Value object system may reduce the *total* number of
>> conversions/copies that need to be done in the runtime.
>>
>> Hopefully the comments below will shed some light on this, but I will
>> state now that my fondness for the Value object system stems not from
>> a desire for increased performance, but from a need for greater system
>> maintainability.
>>
>> :)
>>
>> Øystein Grøvlen wrote:
>>> Hi,
>>>
>>> I am working on extracting much duplicated type conversion code from
>>> Item classes and Field classes into a Value class. (See
>>> http://forge.mysql.com/worklog/task.php?id=4760). Note that the
>>> architectural review for this worklog is still pending, but a
>>> preliminary patch for the Field classes part can be found at
>>> http://lists.mysql.com/commits/74564.
>>> Comments are very welcome.
>>>
>>> I am currently looking at the interaction between Field and Protocol,
>>> and would appreciate some input.
>>>
>>> Each Field subclass has a send_binary() method (a bit misleading name
>>> since it depends on the protocol whether what it sends is binary or
>>> textual.) Currently, these methods call a corresponding
>>> Protocol::store_xxx() method, and in my preliminary patch this looks
>>> something like this:
>>>
>>> bool Field_short::send_binary(Protocol *protocol)
>>> {
>>>   return protocol->store_short(value().get_int());
>>> }
>>>
>>> I think it would be a good idea to add a Protocol::store(Value)
>>> method that will be called from Field::send_binary(Protocol*). Then
>>> all the overriding send_binary methods in the Field subclasses could
>>> be dropped.
>>> Protocol_text::store(Value) could just call Value::to_string() to get
>>> the textual representation of the Value. The question is what to do
>>> for Protocol_binary and other protocols.
>>
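
A minimal sketch of the Protocol::store(Value) idea for the text protocol only. The classes below are toy stand-ins, not the server's real Value, Field or Protocol classes: only get_int() and to_string() are modelled, the buffer() accessor is hypothetical, and returning false on success is an assumption of this sketch. What Protocol_binary should do is left open, as in the question above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy stand-in for the Value class in the patch (integer values only).
class Value {
public:
  explicit Value(std::int64_t v) : int_value_(v) {}
  std::int64_t get_int() const { return int_value_; }
  std::string to_string() const { return std::to_string(int_value_); }
private:
  std::int64_t int_value_;
};

// Hypothetical protocol interface with one Value-based entry point.
class Protocol {
public:
  virtual ~Protocol() = default;
  // Assumption of this sketch: returns false on success.
  virtual bool store(const Value &value) = 0;
};

// Text protocol: asks the Value for its textual representation.
class Protocol_text : public Protocol {
public:
  bool store(const Value &value) override {
    buffer_ += value.to_string();
    return false;
  }
  const std::string &buffer() const { return buffer_; }  // hypothetical accessor
private:
  std::string buffer_;
};

// One send_binary() in the base class; subclasses no longer override it.
class Field {
public:
  explicit Field(Value v) : value_(v) {}
  const Value &value() const { return value_; }
  bool send_binary(Protocol *protocol) const {
    return protocol->store(value());
  }
private:
  Value value_;
};
```

With this shape, the type-specific decision moves out of the Field subclasses and into each Protocol implementation.
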
>> May I present an alternate strategy?
>>
>> Currently, you have one large mysql::Value class (kudos on using
>> namespacing BTW!) which "understands" how to convert its value into a
>> number of different native "types" (custom String class, integers,
>> my_decimal, etc... using the to_xxx() methods).
>>
>> Up until now, this is basically what the Item class and its derived
>> classes already do via the val_str(), val_int(), etc methods.
>>
>> So, you've actually just added an additional layer of abstraction on
>> top of the already complex Item tree. The benefit of a Value object
>> system has yet to be realized since you have, so far, only really
>> duplicated the current system of type conversion within the server.
>>
>> The true advantage of a Value object system is in two respects:
>>
>> * the immutability of Value objects, once constructed
>> * the ability of Value objects to transform themselves into another
>> Value object via construction
>>
>> By this last bullet point I mean something like the following:
>>
>> Number a= Number(20080911123059);
>> DateTime b= DateTime(a);
>>
>> This may look like a trivial piece of code...but in there is some
>> beauty which can remove or encapsulate a large chunk of type
>> conversion code in the server and allow code to be written in a more
>> natural (IMHO) manner.
>>
>> Assume that Number and DateTime are "Value objects". Perhaps they
>> inherit from a base mysql::Value class, perhaps they don't. By saying
>> they are "Value objects", I mean that:
>>
>> a) Number and DateTime objects, once constructed, are immutable
>> b) Number and DateTime can be constructed from instances of each other
>>
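
To make properties a) and b) concrete, here is a rough sketch of two such classes. Everything in it is an assumption for illustration: the accessor names, the YYYYMMDDHHMMSS interpretation of the number, and the deliberately simplified range check (it does not know month lengths).

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

class DateTime;  // forward declaration so Number can be built from it

class Number {
public:
  explicit Number(std::int64_t v) : value_(v) {}
  explicit Number(const DateTime &dt);  // defined after DateTime
  std::int64_t value() const { return value_; }
private:
  const std::int64_t value_;  // const member: no mutation after construction
};

class DateTime {
public:
  // Interprets the number as YYYYMMDDHHMMSS and validates it once, here.
  explicit DateTime(const Number &n)
      : year_(static_cast<int>(n.value() / 10000000000LL)),
        month_(static_cast<int>(n.value() / 100000000LL % 100)),
        day_(static_cast<int>(n.value() / 1000000LL % 100)),
        hour_(static_cast<int>(n.value() / 10000LL % 100)),
        minute_(static_cast<int>(n.value() / 100LL % 100)),
        second_(static_cast<int>(n.value() % 100)) {
    // Simplified check: does not account for per-month day counts.
    if (n.value() < 0 || month_ < 1 || month_ > 12 || day_ < 1 ||
        day_ > 31 || hour_ > 23 || minute_ > 59 || second_ > 59)
      throw std::invalid_argument("not a valid YYYYMMDDHHMMSS value");
  }
  int year() const { return year_; }
  int month() const { return month_; }
  int day() const { return day_; }
  int hour() const { return hour_; }
  int minute() const { return minute_; }
  int second() const { return second_; }
private:
  const int year_, month_, day_, hour_, minute_, second_;
};

// Construction in the other direction: DateTime -> Number.
inline Number::Number(const DateTime &dt)
    : value_(dt.year() * 10000000000LL + dt.month() * 100000000LL +
             dt.day() * 1000000LL + dt.hour() * 10000LL +
             dt.minute() * 100LL + dt.second()) {}
```

Because validation lives in the constructor, any DateTime that exists is already known to be well-formed, so callers never have to re-check it.
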
>> Ignore, for now, the implementation of the above Value objects Number
>> and DateTime (operator and constructor overloads). If we take the
>> above interface (of construction via another Value object instance),
>> then we can begin to refactor two distinct parts of the parser and
>> runtime which revolve around:
>>
>> * Construction of constant things
>> * Transformation and reduction of constant things into other constant
>> things
>>
>> An example of the former would be in the Item tree constructed via a
>> simple statement such as:
>>
>> CREATE TABLE t1 (
>>   b DATETIME NOT NULL
>> );
>>
>> SELECT b FROM t1 WHERE b BETWEEN '20080911123059' AND '20090911123059';
>>
>> In the current server, the SELECT statement above is parsed into a
>> Select_lex structure which contains a series of Item, Table and Field
>> objects. The Item objects represent the constant strings
>> '20080911123059' and '20090911123059' as Item_string objects. The
>> Field object represents the "b" field as a Field_datetime.
>>
>> Because Field_datetime is evaluated by the MyISAM and HEAP storage
>> engines and runtime as 64-bit integers, each Item_string's val_int()
>> method is called to return a signed 64-bit integer number that is
>> passed to the runtime during its evaluation of the COND representing
>> the between condition on the "b" field. (Transformation #1)
>>
>> These 64-bit signed integers are then passed to the
>> Field_datetime::store() method in the runtime and the runtime asks the
>> storage engine to give it an appropriate key to use in reading the
>> data from the t1 table.
>>
>> Before it can pass the data to the storage engine, however,
>> Field_datetime::store() must first verify that the passed integer is
>> indeed a valid datetime. So, it "parses" the signed 64-bit integer
>> into a timestamp-like number (see sql/time.cc:number_to_datetime())
>> represented by the MYSQL_TIME struct. This struct contains temporal
>> information like year, month, day, etc. (Transformation #2)
>>
>> Field_datetime::store() then places the pieces of this MYSQL_TIME
>> struct into a series of raw uchar* bytes, pointed to by the
>> Field::ptr member. (Transformation #3)
>>
>> The storage engine can then use these raw uchar* bytes to find and
>> retrieve the records that the runtime needs when it sends the
>> result back to the client over the Protocol.
>>
>> However, the Protocol itself will see (in Item::send()) that the
>> returned Field data type is of type MYSQL_TYPE_DATETIME, and will
>> construct a MYSQL_TIME instance by calling
>> Field_datetime::get_date(*MYSQL_TIME).
>>
>> The get_date() method then takes the retrieved uchar* record bytes
>> and converts them into a signed 64-bit integer (via macros in
>> sql/korr.h) (Transformation #4). These 64-bit integers are then
>> passed to the number_to_datetime() method to construct the MYSQL_TIME
>> structs (Transformation #5), and these are passed back to the
>> Item::send() method.
>>
>> Item::send() then calls Protocol::store(*MYSQL_TIME), passing in the
>> MYSQL_TIME structs. This method then transforms each MYSQL_TIME
>> struct into a series of char* via the datetime_to_str() method, which
>> calls sprintf() to change the data into a textual format of
>> "YYYY-MM-DD HH:MM:SS" (Transformation #6) which is then sent along the
>> wire in text format...
>>
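
The six transformations can be condensed into a toy round trip. This is only a sketch: TimeParts is a cut-down stand-in for MYSQL_TIME, every function name is hypothetical, and the byte packing uses a plain host-order memcpy rather than the server's byte-order macros.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Cut-down stand-in for MYSQL_TIME (the real struct has more members).
struct TimeParts {
  int year, month, day, hour, minute, second;
};

// #1: constant string -> 64-bit integer (the role of val_int()).
std::int64_t string_to_int(const std::string &s) { return std::stoll(s); }

// #2 and #5: integer -> broken-down time (the role of number_to_datetime()).
TimeParts int_to_parts(std::int64_t n) {
  TimeParts t;
  t.second = static_cast<int>(n % 100); n /= 100;
  t.minute = static_cast<int>(n % 100); n /= 100;
  t.hour   = static_cast<int>(n % 100); n /= 100;
  t.day    = static_cast<int>(n % 100); n /= 100;
  t.month  = static_cast<int>(n % 100); n /= 100;
  t.year   = static_cast<int>(n);
  return t;
}

// #3: broken-down time -> raw record bytes (host byte order, by assumption).
void parts_to_bytes(const TimeParts &t, unsigned char *ptr) {
  const std::int64_t packed =
      ((((t.year * 100LL + t.month) * 100 + t.day) * 100 + t.hour) * 100 +
       t.minute) * 100 + t.second;
  std::memcpy(ptr, &packed, sizeof packed);
}

// #4: raw record bytes -> 64-bit integer.
std::int64_t bytes_to_int(const unsigned char *ptr) {
  std::int64_t packed;
  std::memcpy(&packed, ptr, sizeof packed);
  return packed;
}

// #6: broken-down time -> "YYYY-MM-DD HH:MM:SS" text for the wire.
std::string parts_to_string(const TimeParts &t) {
  char buf[32];
  std::snprintf(buf, sizeof buf, "%04d-%02d-%02d %02d:%02d:%02d",
                t.year, t.month, t.day, t.hour, t.minute, t.second);
  return buf;
}
```

Chaining these by hand (string, integer, struct, bytes, integer, struct, string) shows how the same datetime is re-encoded at every layer boundary.
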
>> As you can see, there are quite a few transformations which occur for
>> both the constant strings as well as the datetime data sent/retrieved
>> from the storage engine.
>>
>> Once functions such as DATE_FORMAT(), CAST(), STR_TO_DATE() and others
>> get involved, the whole process can be downright overwhelming! :)
>>
>> Currently, when you follow the code in the server, you weave in and
>> out of the various val_xxx(), store(xxx), get_xxx() methods, and in
>> the case of decimal and datetime conversions, all their associated
>> routines. It can be very difficult to follow at times.
>>
>> A Value object implementation can simplify much of this code spaghetti
>> and even get rid of many of the Item classes entirely, namely all of
>> the Item_xxx_typecast classes and the Cached_item classes.
>>
>> So, how to go about implementing a Value object system in the runtime?
>>
>> 1) Create a hierarchy of Value classes subclassing from a Value class.
>> You'll need classes for temporal objects like Date and Timestamp as
>> well as classes for a Number (don't have to break it into Decimal and
>> Natural, I'd encapsulate all that in one class) and a String class of
>> course...but not like the current String class; you'll want an
>> immutable one.
>>
>> 2) Instead of the parser creating Item_string or Item_num objects, it
>> would create immutable Value objects of type String and Number.
>>
>> 3) For each column in a SELECT's result and each WHERE condition you
>> might create a vector<> of Value object pointers. Instead of calling
>> val_int(), val_str() and likewise, simply use the Session's vector<>
>> of Value objects as needed. Push and pop objects off the vector as
>> needed, and construct new Value objects from other ones.
>>
>> For example, let's take the example above. The parser would construct
>> a new String value object from the string "20080911123059".
>>
>> After parsing, some analysis of the parsed nodes is done, including a
>> name resolution step. During this step, the "b" column would be
>> determined to be of type DATETIME. A DateTime value object for each
>> side of the condition would then be pushed onto the Session's object
>> vector<> like so:
>>
>> String *left_side= <call to get the left String value of the condition>
>> session.values.push_back((Value *) new DateTime(left_side));
>>
>> Within the optimizer, instead of using Field_datetime::store() to both
>> validate and transform the datetime-string data, the optimizer would
>> simply access the condition's constants like so:
>>
>> DateTime *left_side= (DateTime *) session.values[0];
>> DateTime *right_side= (DateTime *) session.values[1];
>>
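
A compilable variation on the two snippets above, as a sketch only. It swaps the raw pointers and C-style casts for std::unique_ptr and dynamic_cast, which is a substitution on my part and not what the snippets show. Session, push_between_bounds() and datetime_at() are hypothetical names, and the DateTime here only checks for 14 digits.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

class Value {
public:
  virtual ~Value() = default;
};

// Immutable string value, as the parser would create it.
class String : public Value {
public:
  explicit String(std::string text) : text_(std::move(text)) {}
  const std::string &text() const { return text_; }
private:
  const std::string text_;
};

// Validation happens once, when the DateTime is built from the String.
class DateTime : public Value {
public:
  explicit DateTime(const String &s) : digits_(s.text()) {
    if (digits_.size() != 14 ||
        digits_.find_first_not_of("0123456789") != std::string::npos)
      throw std::invalid_argument("expected YYYYMMDDHHMMSS");
  }
  const std::string &digits() const { return digits_; }
private:
  const std::string digits_;
};

// Hypothetical session holding the statement's Value objects.
struct Session {
  std::vector<std::unique_ptr<Value>> values;
};

// After name resolution has found that "b" is DATETIME, both BETWEEN
// constants are pushed as DateTime values.
void push_between_bounds(Session &session, const String &left,
                         const String &right) {
  session.values.push_back(std::make_unique<DateTime>(left));
  session.values.push_back(std::make_unique<DateTime>(right));
}

// Optimizer-side access; yields nullptr if the slot holds another type.
const DateTime *datetime_at(const Session &session, std::size_t i) {
  return dynamic_cast<const DateTime *>(session.values.at(i).get());
}
```

The checked cast costs a little at run time but turns a wrong slot index into a null pointer instead of undefined behaviour.
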
>> If temporal functions were used in a statement (say, DATE_FORMAT()),
>> it could work with DateTimes as DateTimes, with calendrical
>> calculations native to a DateTime object, instead of constantly having
>> to convert args[0] to a MYSQL_TIME or an integer or a string (just see
>> the amount of code in the temporal built-in functions for doing
>> conversion...)
>>
>> For implementing DATE_ADD(), for instance, there's a ton of code which
>> could be encapsulated in a Value object system that understands how to
>> add and subtract dates and times properly...
>>
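
As a small illustration of date arithmetic living inside a Value object, here is a sketch of an immutable Date with a day-based addition. The class and its plus_days() method are hypothetical, it covers only whole-day intervals, and the conversions use the well-known days-from-civil algorithm for the proleptic Gregorian calendar rather than anything taken from the server.

```cpp
#include <cassert>

class Date {
public:
  Date(int year, int month, int day)
      : year_(year), month_(month), day_(day) {}
  int year() const { return year_; }
  int month() const { return month_; }
  int day() const { return day_; }

  // Returns a new Date; the receiver is left unchanged.
  Date plus_days(long long days) const {
    return from_days(to_days() + days);
  }

private:
  // Days since 1970-01-01; the year is shifted to start in March so
  // that the leap day falls at the end of the shifted year.
  long long to_days() const {
    const int y = year_ - (month_ <= 2 ? 1 : 0);
    const long long era = (y >= 0 ? y : y - 399) / 400;
    const long long yoe = y - era * 400;  // year of era, 0..399
    const long long doy =
        (153 * (month_ > 2 ? month_ - 3 : month_ + 9) + 2) / 5 + day_ - 1;
    const long long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
  }

  // Inverse of to_days().
  static Date from_days(long long z) {
    z += 719468;
    const long long era = (z >= 0 ? z : z - 146096) / 146097;
    const long long doe = z - era * 146097;  // day of era, 0..146096
    const long long yoe =
        (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    const long long doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    const long long mp = (5 * doy + 2) / 153;  // March-based month, 0..11
    const int day = static_cast<int>(doy - (153 * mp + 2) / 5 + 1);
    const int month = static_cast<int>(mp < 10 ? mp + 3 : mp - 9);
    const int year =
        static_cast<int>(yoe + era * 400 + (month <= 2 ? 1 : 0));
    return Date(year, month, day);
  }

  const int year_, month_, day_;
};
```

A function like DATE_ADD() could then call plus_days() on the Date it was handed instead of converting its argument to an integer or string first.
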
>> Anyway, these are all just thoughts to get the ideas flowing, and
>> nothing more. In Drizzle-land, we're constantly thinking about this
>> very problem and how to tackle it. It's not an easy one to address
>> with the current architecture, but hopefully we can share our
>> successes and failures and collaborate together on this :)
>>
>> Cheers!
>>
>> Jay
>>
