From: Zhangzhigang
Date: May 10 2012 4:59am
Subject: =?utf-8?B?5Zue5aSN77yaIOWbnuWkje+8miBXaHkgaXMgY3JlYXRpbmcgaW5kZXhlcyBm?=
=?utf-8?B?YXN0ZXIgYWZ0ZXIgaW5zZXJ0aW5nIG1hc3NpdmUgZGF0YSByb3dzPw==?=
List-Archive: http://lists.mysql.com/mysql/227376
Message-Id: <1336625980.71244.YahooMailNeo@web15205.mail.cnb.yahoo.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="-381987588-103176792-1336625980=:71244"
---381987588-103176792-1336625980=:71244
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
>The =E2=80=9Coutput=E2=80=9D from the sortmerge is fed into code that buil=
ds the BTree for =0Athe table.=C2=A0 This building of the BTree is sequenti=
al =E2=80=93 fill the first =0Ablock, move on to the next block, and never =
have to go back.=0A=0A=C2=A0James...=0A=0A=0AThanks for your answer, so cle=
arly.=0A=0AFirstly:=0A=0AI thought that the "block split" for building of t=
he BTree has to been done to do random I/O before accepting this answer.=0A=
=0ANow, i have known that the mysql do the optimization to keep from "block=
split" by "sort merge" for building BTree, so it does not do more "random"=
I/O.=0A=0A=0ASecondly:=0A=0AIt bypass BTree traversals, When the index are=
too big to be cached which involves disk hit(s)=C2=A0 fro each row inserte=
d.=0A=0A=0AThank you very much.=0A=0A=0ASincerely yours=0AZhigang Zhang=0A=
=0A=0A________________________________=0A =E5=8F=91=E4=BB=B6=E4=BA=BA=EF=BC=
=9A Rick James =0A=E6=94=B6=E4=BB=B6=E4=BA=BA=EF=BC=
=9A Zhangzhigang =0A=E6=8A=84=E9=80=81=EF=BC=9A =
"mysql@stripped" =0A=E5=8F=91=E9=80=81=E6=97=
=A5=E6=9C=9F=EF=BC=9A 2012=E5=B9=B45=E6=9C=889=E6=97=A5, =E6=98=9F=E6=9C=9F=
=E4=B8=89, =E4=B8=8B=E5=8D=88 11:21=0A=E4=B8=BB=E9=A2=98: RE: =E5=9B=9E=E5=
=A4=8D=EF=BC=9A Why is creating indexes faster after inserting massive data=
rows?=0A =0A=0AA BTree that is small enough to be cached in RAM can be qui=
ckly maintained.=C2=A0 Even the =E2=80=9Cblock splits=E2=80=9D are not too =
costly without the I/O.=0A=C2=A0=0AA big file that needs sorting =E2=80=93 =
bigger than can be cached in RAM =E2=80=93 is more efficiently done with a =
dedicated =E2=80=9Csort merge=E2=80=9D program.=C2=A0 A =E2=80=9Cbig=E2=80=
=9D INDEX on a table may be big enough to fall into this category.=0A=C2=A0=
=0AI/O is the most costly part of any of these operations.=C2=A0 My rule of=
thumb for MySQL SQL statements is:=C2=A0 If everything is cached, the quer=
y will run ten times as fast as it would if things have to be fetched from =
disk.=0A=C2=A0=0ASortmerge works this way:=0A1.=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0 Sort as much of the file as you can in RAM.=C2=A0 Write that sort=
ed piece to disk.=0A2.=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Repeat for the n=
ext chunk of the file.=C2=A0 Repeat until the input file is broken into sor=
ted chunks.=0A3.=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Now, =E2=80=9Cmerge=E2=
=80=9D those chunks together =E2=80=93 take the first row from=C2=A0 each, =
decide which is the =E2=80=9Csmallest=E2=80=9D, send it to the output=0A4.=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 Repeat until finished with all the pie=
ces.=0AFor a really big task, there may have to be more than on =E2=80=9Cme=
rge=E2=80=9D pass.=0ANote how sort merge reads the input sequentially once,=
writes the output sequentially once, and has sequential I/O for each merge=
chunk.=0A=E2=80=9CSequential=E2=80=9D I/O is faster than =E2=80=9Crandom=
=E2=80=9D I/O =E2=80=93 no arm motion on traditional disks.=C2=A0 (SSDs are=
a different matter; I won=E2=80=99t go into that.)=0A=C2=A0=0AThe =E2=80=
=9Coutput=E2=80=9D from the sortmerge is fed into code that builds the BTre=
e for the table.=C2=A0 This building of the BTree is sequential =E2=80=93 f=
ill the first block, move on to the next block, and never have to go back.=
=0A=C2=A0=0ABTrees (when built randomly), if they need to spill to disk, wi=
ll involve random I/O.=C2=A0 (And we are talking about an INDEX that is so =
big that it needs to spill to disk.)=0A=C2=A0=0AWhen a block =E2=80=9Csplit=
s=E2=80=9D, one full block becomes two half-full blocks.=C2=A0 Randomly fil=
ling a BTree leads to, on average, the index being 69% full.=C2=A0 This is =
not a big factor in the overall issue, but perhaps worth noting.=0A=C2=A0=
=0AHow bad can it get?=C2=A0 Here=E2=80=99s an example.=0A=C2=B7=C2=A0=C2=
=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 You have an INDEX on some random va=
lue, such as a GUID or MD5.=0A=C2=B7=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=
=A0=C2=A0 The INDEX will be 5 times as big as you can fit in RAM.=0A=C2=B7=
=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 MySQL is adding to the BTr=
ee one row at a time (the non-sortmerge way)=0AWhen it is nearly finished, =
only 1 of 5 updates to the BTree can be done immediately in RAM; 4 out of 5=
updates to the BTree will have to hit disk.=C2=A0 If you are using normal =
disks, that is on the order of 125 rows per second that you can insert =E2=
=80=93 Terrible!=C2=A0 Sortmerge is likely to average over 10,000.=0A=C2=A0=
=0A=C2=A0=0A=C2=A0=0AFrom:Zhangzhigang [mailto:zzgang_2008@stripped] =
=0ASent: Tuesday, May 08, 2012 9:13 PM=0ATo: Rick James=0ACc: mysql@stripped=
ysql.com=0ASubject: =E5=9B=9E=E5=A4=8D=EF=BC=9AWhy is creating indexes fast=
er after inserting massive data rows?=0A=C2=A0=0AJames...=0A>* By doing all=
the indexes after building the table (or at least all the non-UNIQUE index=
es), "sort merge" can be used.=C2=A0 This technique had been highly optimiz=
ed over the past half-century, and is more efficient.=0A=C2=A0=0AI have a q=
uestion about "sort merge":=0A=C2=A0=0AWhy does it do the all "sort merge"?=
=0A=C2=A0=0AIn my opinion, it just maintains the B tree and inserts one ke=
y into a B tree node which has fewer sorted keys, so it is good performance=
.=0A=C2=A0=0AIf it only does the "sort merge", the B tree data structure ha=
ve to been created separately. it wastes some performance.=0A=C2=A0=0ADoes =
it?=0A=C2=A0=0A=C2=A0=0A=0A________________________________=0A=0A=E5=8F=91=
=E4=BB=B6=E4=BA=BA=EF=BC=9ARick James =0A=E6=94=B6=E4=
=BB=B6=E4=BA=BA=EF=BC=9AJohan De Meersman ; Zhangzhigan=
g =0A=E6=8A=84=E9=80=81=EF=BC=9A"mysql@stripped=
ql.com" =0A=E5=8F=91=E9=80=81=E6=97=A5=E6=9C=9F=EF=
=BC=9A2012=E5=B9=B45=E6=9C=888=E6=97=A5, =E6=98=9F=E6=9C=9F=E4=BA=8C, =E4=
=B8=8A=E5=8D=8812:35=0A=E4=B8=BB=E9=A2=98:RE: Why is creating indexes faste=
r after inserting massive data rows?=0A=0A* Batch INSERTs run faster than o=
ne-row-at-a-time, but this is unrelated to INDEX updating speed.=0A* The ca=
che size is quite important to dealing with indexing during INSERT; see htt=
p://mysql.rjweb.org/doc.php/memory =0A* Note that mysqldump sets up for an =
efficient creation of indexes after loading the data.=C2=A0 This is not pra=
ctical (or necessarily efficient) when incremental INSERTing into a table.=
=0A=0AAs for the original question...=0A* Updating the index(es) for one ro=
w often involves random BTree traversals.=C2=A0 When the index(es) are too =
big to be cached, this can involve disk hit(s) for each row inserted.=0A* B=
y doing all the indexes after building the table (or at least all the non-U=
NIQUE indexes), "sort merge" can be used.=C2=A0 This technique had been hig=
hly optimized over the past half-century, and is more efficient.=0A=0A=0A> =
-----Original Message-----=0A> From: Johan De Meersman [mailto:vegivamp@tux=
era.be]=0A> Sent: Monday, May 07, 2012 1:29 AM=0A> To: Zhangzhigang=0A> Cc:=
mysql@stripped=0A> Subject: Re: Why is creating indexes faster afte=
r inserting massive=0A> data rows?=0A> =0A> ----- Original Message -----=0A=
> > From: "Zhangzhigang" =0A> >=0A> > Creating in=
dexes after inserting massive data rows is faster than=0A> > before inserti=
ng data rows.=0A> > Please tell me why.=0A> =0A> Plain and simple: the indi=
ces get updated after every insert statement,=0A> whereas if you only creat=
e the index *after* the inserts, the index=0A> gets created in a single ope=
ration, which is a lot more efficient.=0A> =0A> I seem to recall that insid=
e of a transaction (thus, InnoDB or so) the=0A> difference is markedly less=
; I might be wrong, though.=0A> =0A> =0A> --=0A> Bier met grenadyn=0A> Is a=
ls mosterd by den wyn=0A> Sy die't drinkt, is eene kwezel=0A> Hy die't drin=
kt, is ras een ezel=0A> =0A> --=0A> MySQL General Mailing List=0A> For list=
archives: http://lists.mysql.com/mysql =0A> To unsubscribe:=C2=A0 =C2=A0 h=
ttp://lists.mysql.com/mysql
---381987588-103176792-1336625980=:71244--