From: Jan Steinman
Date: September 20 2012 3:59am
Subject: Re: Making Myself Crazy
List-Archive: http://lists.mysql.com/mysql/228206
Message-Id: <377BFE80-9550-4A50-9FC9-A12CFF7ACFA4@bytesmiths.com>
MIME-Version: 1.0 (Apple Message framework v1084)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Thanks for your help, Rick!
Interspersed are some questions and rationales for you to shoot down... =
:-)
> From: Rick James
>=20
> s_product_sales_log has no PRIMARY KEY. All InnoDB tables 'should' =
have an explicit PK.
This table really has no identifying information. There could be two =
identical, valid rows, if the same person sold the same amount of the =
same product to the same other person on the same day.
All the foreign keys were indexed. Is there something I don't understand =
about something a PK field does? If an individual record cannot be =
uniquely identified by its information, is there really any need for a =
primary key?
None the less, I added field "ID" as an unsigned autoincrement INT and =
made it PK.
> INT(5) is not what you think. INT is always a 32-bit, 4-byte =
quantity, regardless of the number.
>=20
> Use TINYINT UNSIGNED, SMALLINT UNSIGNED, etc. wherever reasonable.
Understood. I make all my keys UINT even when they could be smaller, =
because I've gotten into some gnarly consistency problems. with foreign =
keys.
>> KEY `is_value_added` (`is_value_added`),
> A single-column INDEX on a flag, ENUM, and other low-cardinality =
field, is almost always useless.
Why is that? Surely, even a flag separates the record space into two?
> Performance issues...
>> WHERE YEAR(sales.`Date`) =3D '{{{1}}}'
> won't use
>> KEY `Date` (`Date`),
> because the column (Date) is hidden in a function. A workaround:
> WHERE `Date` >=3D '{{{1}}}-01-01'
> AND `Date` < '{{{1}}}-01-01' + INTERVAL 1 YEAR
Thanks! Good catch.
> JOINing two subqueries -- There is no way to index either of them, so =
the JOIN will have to do a table scan of one temp table for every row of =
the other temp table.
> (The alternative is to CREATE TEMPORARY TABLE... with an index, for =
one of the subqueries.)
But I made sure the subqueries were the smallest possible sets -- =
essentially, the domain of s_profit_centre, which only has 12 records.
I had the entire thing coded up into one massive JOIN of everything, and =
it took 30 minutes to run! By LEFT JOINing down to a dozen or fewer =
records, it seems to run in reasonable time, even though it's two =
subqueries that are not indexed.
> It would probably be better to move the mt.tot!=3D0 test inside:
>=20
> GROUP BY `Profit Centre`
> ) mt ON mt.pcid =3D tt.pcid
> WHERE mt.tot !=3D 0
> ) xx
> -->
> GROUP BY `Profit Centre`
> HAVING tot !=3D 0 -- added
> ) mt ON mt.pcid =3D tt.pcid
> -- removed: WHERE mt.tot !=3D 0
> ) xx
>=20
> That would make mt have fewer rows, hence that unindexed JOIN could =
run faster.
My first attempt to do that produced an error. And again, both the =
subqueries will have 12 or fewer records, so I'm wondering if this =
really helps anything.
Thanks for your help!
[clip]
----------------
:::: People see what they have been conditioned to see; they refuse to =
see what they don't expect to see. -- Merle P. Martin
:::: Jan Steinman, EcoReality Co-op ::::