From: Rick James
Date: August 21 2012 12:19am
Subject: RE: Best design for a table using variant data
List-Archive: http://lists.mysql.com/mysql/228026
Message-Id: <2E7DD7ADE53B044C8C8BCD9C5829E1EB148A04797A@SP2-EX07VS01.ds.corp.yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Even if you get past those issues, you will get to other nasty problems... =
Ugly JOINs, terrible performance, huge disk footprint, etc.
Split the attributes into two groups:
* The ones you _really_ need to search on.
* The rest.
Build columns for the first group; use the appropriate datatypes.
Throw the rest into a JSON string, put that into another column. (I like t=
o compress the JSON and put it into a MEDIUMBLOB.)
For searching,...
Use SQL (WHERE ...) to filter on any attributes in the first group.
Then use your programming language (PHP, etc) to finish the filtering, afte=
r fetching (and uncompressing) the JSON.
> -----Original Message-----
> From: Gaston Gloesener [mailto:gaston.gloesener@stripped]
> Sent: Sunday, August 12, 2012 10:58 PM
> To: mysql@stripped
> Subject: RE: Best design for a table using variant data
>=20
> > You don't specify how many different types (including min/max values)
> > you
> expect to be using. If you expect to end up with a few hundred, then
> you should perhaps consider using an ENUM or SET column directly in the
> data table.
>=20
> I do not quite understand where the use of enums/sets could be help
> here.
>=20
> Regarding the number of basic types there are only expected to be a few
> similar to a programming language: At the beginning there will be 4
> types (integer, 64-bit integer, floating point (probably double) and
> strings with their associated constraints. Each single attribute (i.e.
> table row) would only use one of these types.
>=20
> Possibly my explanations where not clear enough, so let me take a
> virtual example which would match my needs: A database which represents
> a second hand shop. This shop would contain a number of things of
> different types.
> Let's take "cars" and "clothes" as example. Those two would have a
> number off attributes which depend on the type. I.e. the car would have
> Brand (String), Model (String), "manufacturing year" (Inetger, max
> 2012), horsepower (integer), ... while some of these are meaningless
> for the clothes which will have "size", ...
>=20
> These attributes are not known at development type and are to be
> defined by the users of the database and may be extended over time. As
> said above this is a virtual example which does not match the actual
> much more complex database use.
>=20
> / Gaston
>=20
> -----Original Message-----
> From: Carsten Pedersen [mailto:carsten@stripped]
> Sent: Friday, August 10, 2012 15:33
> To: Gaston Gloesener
> Cc: mysql@stripped
> Subject: Re: Best design for a table using variant data
>=20
> You don't specify how many different types (including min/max values)
> you expect to be using. If you expect to end up with a few hundred,
> then you should perhaps consider using an ENUM or SET column directly
> in the data table.
>=20
> / Carsten
>=20
> On 10.08.2012 10:51, Gaston Gloesener wrote:
> > Hello,
> >
> >
> >
> > I am currently facing a design where a table (virtually) needs to
> > store attributes of a topic (related table). The attributes can be
> > user defined, i.e. not known at development type and depend on other
> > factors. Each attributes value can be one of different types (int,
> > int64, double, string) and may have constraints like min, max or
> > length
> (string).
> >
> >
> >
> > Thus the data type would be modeled as variant in some programming
> > languages, but this is not an option in SQL (beside the MS SQLserver
> > sql_variant extension).
> >
> >
> >
> > So, how to simulate this in SQL.
> >
> >
> >
> > Basically there would be one table describing the attributes type
> > (Type identifier, min/max,.) and one table for the values itself.
> >
> >
> >
> > The design I am currently thinking of would be to make exactly these
> > two tables, with the attributes having a Dataype column and iMin,iMax
> > for integer, i64Min, i64Max, fMin,fMax for double , sMinLen, sMaxLen
> > for strings. The same applies to the value table which will have
> > iValue, i64Value, fValue, string columns to hold the actual data.
> >
> >
> >
> > Now the columns will be filled according to the data type, columns
> not
> > matching the type will be NULL.
> >
> >
> >
> > This means that each row in the table will have virtual space for any
> > data type which violates database normalization. However it seems to
> > me to be the best deal for performance and data space as NULL takes
> > virtually no room
> > (4/8 bytes in total for a number of fields in some circumstances) and
> > requires no complex queries.
> >
> >
> >
> > One could also imagine to have the constraints moved to a separate
> > table and interpreted according to the data type. Also a table for
> > each type could be imagined but this will make the queries very
> > complicated working against performance.
> >
> >
> >
> > Note: The model has to work for huge databases
> >
> >
> >
> > Anybody has a better alternative ?
> >
> >
> >
> >
> >
> >
>=20
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql
>=20
>=20
>=20
>=20
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql