jdbm-developer

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0
Transitional//EN">
<HTML><HEAD>
<STYLE type=text/css> P, UL, OL, DL, DIR,
MENU, PRE { margin: 0 auto;}</STYLE>
<META content="MSHTML 6.00.2900.2802" name=GENERATOR></HEAD>
<BODY leftMargin=1 topMargin=1 rightMargin=1><FONT
face=Tahoma>
<DIV><FONT face=Arial size=2>Alex-</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2>Maybe you can
help - can you explain to me how MVCC could
have anything to do serializability of transactions?&nbsp;
Am I completely off-base in saying that adding
read locking to MVCC would defeat the purpose?</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2>I'm obviously
having a massive disconnect.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2>Thanks,</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2>- K</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2></FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2>&nbsp;</FONT>
<TABLE>
<TBODY>
<TR>
<TD width=1 bgColor=blue><FONT face=Arial
size=2></FONT></TD>
<TD><FONT face=Arial size=2><FONT color=blue>&gt;
Kevin,<BR><BR>MVCC does provide serializability.
&nbsp;The tradeoffs among correct<BR>concurrency
control mechanisms involve performance not
correctness.<BR>The problems with performance
for MVCC are that, by itself, it tends<BR>to
produce more transaction restarts and lower
overall transaction<BR>throughput when compared
to 2PL. &nbsp;Postgres is some sort of hybrid<BR>of
2PL and MVCC. &nbsp;I tried their chat group
yesterday to get some<BR>insight into exactly
what kind of a hybrid, but no one could say
"Oh,<BR>we implemented XYZ".<BR><BR>I do
not think that MVCC (or any hybrid of MVCC)
is a silver bullet<BR>for performance. &nbsp;Neither
postgres nor Oracle is a clear performance<BR>winner
-- in fact I have personal experience with
the one of these<BR>platforms which places
it an order of magnitude slower than jdbm
for<BR>our application (an semantic web store)
using a solution developer by<BR>the database
people themselves.<BR><BR>I recommend a closer
reading of this paper. &nbsp;I have found
later articles<BR>which use analytic techniques
to examine the performance of a variety of<BR>concurrency
control mechanisms, but none which benchmarks
real databases<BR>explicitly in terms of
this issue. &nbsp;Perhaps we could look at
some of the<BR>RDBMS benchmark literature
for some insight there?<BR><BR>Enjoy your
vacation,<BR><BR>-bryan<BR><BR>-----Original
Message-----<BR>From: <A href="mailto:jdbm-developer-admin@..."><FONT
color=#0000ff>jdbm-developer-admin@...</FONT></A><BR>To:
JDBM Developer listserv<BR>Sent: 2/16/2006
4:48 PM<BR>Subject: re[2]: [Jdbm-developer]
MC-DataSafe <BR><BR>Bryan-<BR><BR>Can you
please provide a description of how MVCC
(with non-serialized<BR>processing of transactions)
would cause either of the two lost data/bad<BR>data
examples described in the paper?<BR><BR>The
way I read your email, it sounds like you
are implying that MVCC<BR>can't provide transaction
isolation or atomic updates, when it<BR>absolutely
does. &nbsp;MVCC just operates optimistically
in these situations.<BR>Under MVCC in the
first example, an abort would occur. &nbsp;In
the second<BR>example under MVCC, the report
created in T2 would be correct, and T1<BR>would
commit without problems. &nbsp;In either
case, nothing gets left on the<BR>floor...<BR><BR>The
massive advantage of MVCC comes in the second
example: &nbsp;If the<BR>report required
a significant amount of time to complete
(it was reading<BR>a huge number of rows),
then serializing the transactions as described<BR>in
the paper will cause massive slow downs for
the poor sucker who is<BR>trying to execute
T1 (and kill concurrency). &nbsp;In MVCC,
the update would<BR>happen immediately and
return, even if T2 takes a long time to complete.<BR>This
absolutely maximizes concurrency, and is
perfectly safe.<BR><BR><BR>I'm going to need
some serious convincing with concrete examples
before<BR>I'm willing to favor serializable
transactions over MVCC's approach. &nbsp;My<BR>concern
at this stage is that much of the literature
on these topics<BR>either doesn't account
for the radical departure from the norm that
MVCC<BR>is, or it is incorrect in it's description
of how MVCC actually operates<BR>in practice.<BR><BR>I
actually find it humorous that the concurrency
control paper we are<BR>talking about here
says that "the problem for nondistributed
DMBSs is<BR>well understood... &nbsp;and
one approach called two-phase locking has
been<BR>accepted as a standard solution."
&nbsp;That may have been true in 1981 when<BR>the
paper was published, but the good folks at
PostgreSQL and others<BR>have pretty much
blown that out of the water.<BR><BR>This
means that any broad claims made in these
older papers have to be<BR>taken with a considerable
grain of salt, and evaluated in the context
of<BR>the new development of MVCC. &nbsp;Bernstein
and Goodman are certainly going<BR>to say
that 2 phase locking is the only way to go
- they weren't even<BR>aware that a better
approach was possible.<BR><BR>Anyway - I'm
going to be gone until the 28th - hopefully
you and Alex<BR>can continue the discussion
with vigour while I'm gone!<BR><BR>Cheers,<BR><BR>-
K<BR><BR><A href="http://www-static.cc.gatech.edu/classes/AY2003/cs8803i_fall/ConcurrencyC"><FONT
color=#0000ff>http://www-static.cc.gatech.edu/classes/AY2003/cs8803i_fall/ConcurrencyC</FONT></A><BR>ontrol.pdf<BR><A
href="http://www-static.cc.gatech.edu/classes/AY2003/cs8803i_fall/Concurrency"><FONT
color=#0000ff>&lt;http://www-static.cc.gatech.edu/classes/AY2003/cs8803i_fall/Concurrency</FONT></A><BR>Control.pdf&gt;
<BR>&nbsp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&gt;
I have to say that I favor serializable transactions
-- at<BR>least in the<BR>abstract. &nbsp;Non-serializable
transactions can leave things on the floor.<BR>If
you do not have a concrete application you
don't know what that might<BR>be. &nbsp;Two
classic examples of problems with not enforcing<BR>serializablility<BR>are
given at the start of the paper, e.g., a
lost deposit into a bank<BR>account.<BR><BR>-bryan<BR><BR>-----Original
Message-----<BR>From: Alex Boisvert &nbsp;<A
href="mailto:boisvert@..."><FONT
color=#0000ff>&lt;mailto:boisvert@...&gt;</FONT></A><BR><A
href="mailto:boisvert@..."><FONT
color=#0000ff>[mailto:boisvert@...]</FONT></A><BR>Sent:
Thu 2/16/2006 11:55 AM<BR>To: Thompson, Bryan
B.<BR>Cc: 'Kevin Day '; &nbsp;<A href="mailto:jdbm-developer-admin@..."><FONT
color=#0000ff>&lt;mailto:jdbm-developer-admin@...&gt;</FONT></A><BR><A
href="mailto:jdbm-developer-admin@..."><FONT
color=#0000ff>'jdbm-developer-admin@...</FONT></A>
'; '''JDBM<BR>Developer listserv ' ' '<BR>Subject:
Re: [Jdbm-developer] MC-DataSafe (in Persistent
Store Inte<BR>rface).<BR><BR><BR>I agree
with figure 11 if you want to guarantee a
completely<BR>serializable history. &nbsp;However,
you can relax the isolation with MVCC to<BR>allow
such ordering to happen, which won't guarantee
complete<BR>serializability but in most cases
will result in a consistent outcome.<BR><BR>The
need to balance concurrency and consistency
is satisfied by most<BR>databases by offering
different levels of isolation (read uncomitted,<BR>read
committed, repeatable read, serializable,
...)<BR><BR>alex<BR><BR><BR>Thompson, Bryan
B. wrote:<BR><BR>&gt;I think that we probably
need to be much more concrete. &nbsp;However
MVCC<BR>&gt;does NOT permit a write to create
a version which would pre-date a<BR>&gt;version
which had already been read by another tx.
&nbsp;Intuitively such a<BR>&gt;write would
mean that the read had accessed the incorrect
version.<BR>This<BR>&gt;is the subject of
figure 11 in [1]. &nbsp;Are we disagreeing
about the<BR>meaning<BR>&gt;of figure 11?<BR>&gt;<BR>&gt;-bryan<BR>&gt;
&nbsp;<BR>&gt;<BR><BR><BR><BR>-------------------------------------------------------<BR>This
SF.net email is sponsored by: Splunk Inc.
Do you grep through log<BR>files<BR>for problems?
&nbsp;Stop! &nbsp;Download the new AJAX search
engine that makes<BR>searching your log files
as easy as surfing the &nbsp;web. &nbsp;DOWNLOAD
SPLUNK!<BR><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=12164"><FONT
color=#0000ff>&lt;http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=12164</FONT></A><BR>2&gt;<BR><A
href="http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=121642"><FONT
color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=121642</FONT></A><BR>_______________________________________________<BR>Jdbm-developer
mailing list<BR><A href="mailto:Jdbm-developer@..."><FONT
color=#0000ff>&lt;mailto:Jdbm-developer@...&gt;</FONT></A><BR><A
href="mailto:Jdbm-developer@..."><FONT
color=#0000ff>Jdbm-developer@...</FONT></A><BR><A
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff>&lt;https://lists.sourceforge.net/lists/listinfo/jdbm-developer&gt;</FONT></A><BR><A
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR>&lt;<BR>&nbsp;&nbsp;&nbsp;&nbsp;<BR&gt;-------------------------------------------------------
This SF.net<BR>email is sponsored by: Splunk
Inc. Do you grep through log files for<BR>problems?
Stop! Download the new AJAX search engine
that makes searching<BR>your log files as
easy as surfing the web. DOWNLOAD SPLUNK!<BR><A
href="http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=121642"><FONT
color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=121642</FONT></A><BR>_______________________________________________
Jdbm-developer mailing<BR>list <A href="mailto:Jdbm-developer@..."><FONT
color=#0000ff>Jdbm-developer@...</FONT></A><BR><A
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR>&lt;<BR></FONT></FONT></TD></TR></TBODY></TABLE></DIV></FONT></BODY></HTML&gt;

I think it would look similar to what industrial databases are doing today:
-read uncommitted
-read committed
-repeatable read
-serializable
This is more about relaxing MVCC than locks. You can stack locking on
top of these isolation levels for custom levels of conflict allowance or
prevention.
alex
Thompson, Bryan B. wrote:
>Alex,
>
>If you are going to relax conflict analysis, what is that going
>to look like under the "proposed" scheme? Release of some locks
>before all locks have been acquired?
>
>-bryan
>
>-----Original Message-----
>From: jdbm-developer-admin@...
>To: 'Kevin Day '
>Cc: 'JDBM Developer listserv '
>Sent: 2/16/2006 6:45 PM
>Subject: Re: [Jdbm-developer] MC-DataSafe
>
>
>Serializability is more about how strict you are when doing conflict
>analysis.
>
>To have total ordering (aka serializability) you must enforce that the
>read and write set of a transaction are completely disjoint from the
>write set of any transaction committed after the transaction started.
>
>alex
>
>
>'Kevin Day ' wrote:
>
>
>
>>Alex-
>>
>>Maybe you can help - can you explain to me how MVCC could have
>>anything to do serializability of transactions? Am I completely
>>off-base in saying that adding read locking to MVCC would defeat the
>>purpose?
>>
>>I'm obviously having a massive disconnect.
>>
>>Thanks,
>>
>>- K
>>
>>
>>
>>
>> > Kevin,
>>
>>MVCC does provide serializability. The tradeoffs among correct
>>concurrency control mechanisms involve performance not correctness.
>>The problems with performance for MVCC are that, by itself, it tends
>>to produce more transaction restarts and lower overall transaction
>>throughput when compared to 2PL. Postgres is some sort of hybrid
>>of 2PL and MVCC. I tried their chat group yesterday to get some
>>insight into exactly what kind of a hybrid, but no one could say "Oh,
>>we implemented XYZ".
>>
>>I do not think that MVCC (or any hybrid of MVCC) is a silver bullet
>>for performance. Neither postgres nor Oracle is a clear performance
>>winner -- in fact I have personal experience with the one of these
>>platforms which places it an order of magnitude slower than jdbm for
>>our application (an semantic web store) using a solution developer by
>>the database people themselves.
>>
>>I recommend a closer reading of this paper. I have found later
>>
>>
>articles
>
>
>>which use analytic techniques to examine the performance of a variety
>>
>>
>of
>
>
>>concurrency control mechanisms, but none which benchmarks real
>>
>>
>databases
>
>
>>explicitly in terms of this issue. Perhaps we could look at some of
>>
>>
>the
>
>
>>RDBMS benchmark literature for some insight there?
>>
>>Enjoy your vacation,
>>
>>-bryan
>>
>>-----Original Message-----
>>From: jdbm-developer-admin@...
>><mailto:jdbm-developer-admin@...>
>>To: JDBM Developer listserv
>>Sent: 2/16/2006 4:48 PM
>>Subject: re[2]: [Jdbm-developer] MC-DataSafe
>>
>>Bryan-
>>
>>Can you please provide a description of how MVCC (with non-serialized
>>processing of transactions) would cause either of the two lost
>>
>>
>data/bad
>
>
>>data examples described in the paper?
>>
>>The way I read your email, it sounds like you are implying that MVCC
>>can't provide transaction isolation or atomic updates, when it
>>absolutely does. MVCC just operates optimistically in these
>>
>>
>situations.
>
>
>>Under MVCC in the first example, an abort would occur. In the second
>>example under MVCC, the report created in T2 would be correct, and T1
>>would commit without problems. In either case, nothing gets left on
>>
>>
>the
>
>
>>floor...
>>
>>The massive advantage of MVCC comes in the second example: If the
>>report required a significant amount of time to complete (it was
>>
>>
>reading
>
>
>>a huge number of rows), then serializing the transactions as described
>>in the paper will cause massive slow downs for the poor sucker who is
>>trying to execute T1 (and kill concurrency). In MVCC, the update
>>
>>
>would
>
>
>>happen immediately and return, even if T2 takes a long time to
>>
>>
>complete.
>
>
>>This absolutely maximizes concurrency, and is perfectly safe.
>>
>>
>>I'm going to need some serious convincing with concrete examples
>>
>>
>before
>
>
>>I'm willing to favor serializable transactions over MVCC's approach.
>>
>>
>My
>
>
>>concern at this stage is that much of the literature on these topics
>>either doesn't account for the radical departure from the norm that
>>
>>
>MVCC
>
>
>>is, or it is incorrect in it's description of how MVCC actually
>>
>>
>operates
>
>
>>in practice.
>>
>>I actually find it humorous that the concurrency control paper we are
>>talking about here says that "the problem for nondistributed DMBSs is
>>well understood... and one approach called two-phase locking has been
>>accepted as a standard solution." That may have been true in 1981
>>
>>
>when
>
>
>>the paper was published, but the good folks at PostgreSQL and others
>>have pretty much blown that out of the water.
>>
>>This means that any broad claims made in these older papers have to be
>>taken with a considerable grain of salt, and evaluated in the context
>>
>>
>of
>
>
>>the new development of MVCC. Bernstein and Goodman are certainly
>>
>>
>going
>
>
>>to say that 2 phase locking is the only way to go - they weren't even
>>aware that a better approach was possible.
>>
>>Anyway - I'm going to be gone until the 28th - hopefully you and Alex
>>can continue the discussion with vigour while I'm gone!
>>
>>Cheers,
>>
>>- K
>>
>>
>>
>
>
>
>-------------------------------------------------------
>This SF.net email is sponsored by: Splunk Inc. Do you grep through log
>files
>for problems? Stop! Download the new AJAX search engine that makes
>searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
>_______________________________________________
>Jdbm-developer mailing list
>Jdbm-developer@...
>https://lists.sourceforge.net/lists/listinfo/jdbm-developer
>
>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0
Transitional//EN">
<HTML><HEAD>
<STYLE type=text/css> P, UL, OL, DL, DIR,
MENU, PRE { margin: 0 auto;}</STYLE>
<META content="MSHTML 6.00.2900.2802" name=GENERATOR></HEAD>
<BODY leftMargin=1 topMargin=1 rightMargin=1><FONT
face=Tahoma>
<DIV><FONT face=Arial size=2>Bryan-</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2></FONT></DIV>
<DIV><FONT face=Arial size=2>OK - I'll get
my act together on this over the week.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2>Cheers!</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2>- K</FONT></DIV>
<DIV><FONT face=Arial size=2>&nbsp;</FONT>
<TABLE>
<TBODY>
<TR>
<TD width=1 bgColor=blue><FONT face=Arial
size=2></FONT></TD>
<TD><FONT face=Arial size=2><FONT color=blue>&gt;
That is why using 2PL with MVCC is so interesting.
&nbsp;It allows you<BR>to induce a total
ordering (by assigning timestamps) based
on the<BR>locks that are actually acquired.
&nbsp;If there is a totally ordering,<BR>then
all is good. &nbsp;If not, then one or more
transactions will have<BR>to be aborted and
retried (to have serializability).<BR><BR>-bryan<BR><BR>-----Original
Message-----<BR>From: <A href="mailto:jdbm-developer-admin@..."><FONT
color=#0000ff>jdbm-developer-admin@...</FONT></A><BR>To:
'JDBM Developer listserv '<BR>Sent: 2/16/2006
6:55 PM<BR>Subject: re[2]: [Jdbm-developer]
MC-DataSafe<BR><BR>Alex-<BR><BR>Does that
imply that enforcing serializability would
defeat the purpose<BR>of MVCC, or are these
totally separate issues? &nbsp;I guess if
you had<BR>sufficient information before
any of the transactions were started you<BR>could
come up with an optimal strategy - but that
seems like its way<BR>outside the realm of
practical implementation for a general purpose<BR>database...<BR><BR>-
K<BR>&nbsp;<BR>&nbsp;&nbsp;&nbsp;&nbsp;&gt;
<BR>Serializability is more about how strict
you are when doing conflict<BR>analysis.<BR><BR>To
have total ordering (aka serializability)
you must enforce that the<BR>read and write
set of a transaction are completely disjoint
from the<BR>write set of any transaction
committed after the transaction started.<BR><BR>alex<BR><BR><BR>'Kevin
Day ' wrote:<BR><BR>&gt; Alex-<BR>&gt; &nbsp;<BR>&gt;
Maybe you can help - can you explain to me
how MVCC could have<BR>&gt; anything to do
serializability of transactions? &nbsp;Am
I completely<BR>&gt; off-base in saying that
adding read locking to MVCC would defeat
the<BR>&gt; purpose?<BR>&gt; &nbsp;<BR>&gt;
I'm obviously having a massive disconnect.<BR>&gt;
&nbsp;<BR>&gt; Thanks,<BR>&gt; &nbsp;<BR>&gt;
- K<BR>&gt; &nbsp;<BR>&gt; &nbsp;<BR>&gt;
&nbsp;<BR>&gt;<BR>&gt; &nbsp;&nbsp;&nbsp;&nbsp;&gt;
Kevin,<BR>&gt;<BR>&gt; MVCC does provide
serializability. &nbsp;The tradeoffs among
correct<BR>&gt; concurrency control mechanisms
involve performance not correctness.<BR>&gt;
The problems with performance for MVCC are
that, by itself, it tends<BR>&gt; to produce
more transaction restarts and lower overall
transaction<BR>&gt; throughput when compared
to 2PL. &nbsp;Postgres is some sort of hybrid<BR>&gt;
of 2PL and MVCC. &nbsp;I tried their chat
group yesterday to get some<BR>&gt; insight
into exactly what kind of a hybrid, but no
one could say "Oh,<BR>&gt; we implemented
XYZ".<BR>&gt;<BR>&gt; I do not think that
MVCC (or any hybrid of MVCC) is a silver
bullet<BR>&gt; for performance. &nbsp;Neither
postgres nor Oracle is a clear performance<BR>&gt;
winner -- in fact I have personal experience
with the one of these<BR>&gt; platforms which
places it an order of magnitude slower than
jdbm for<BR>&gt; our application (an semantic
web store) using a solution developer by<BR>&gt;
the database people themselves.<BR>&gt;<BR>&gt;
I recommend a closer reading of this paper.
&nbsp;I have found later<BR>articles<BR>&gt;
which use analytic techniques to examine
the performance of a variety<BR>of<BR>&gt;
concurrency control mechanisms, but none
which benchmarks real<BR>databases<BR>&gt;
explicitly in terms of this issue. &nbsp;Perhaps
we could look at some of<BR>the<BR>&gt; RDBMS
benchmark literature for some insight there?<BR>&gt;<BR>&gt;
Enjoy your vacation,<BR>&gt;<BR>&gt; -bryan<BR>&gt;<BR>&gt;
-----Original Message-----<BR>&gt; From:
&nbsp;<A href="mailto:jdbm-developer-admin@..."><FONT
color=#0000ff>&lt;mailto:jdbm-developer-admin@...&gt;</FONT></A><BR><A
href="mailto:jdbm-developer-admin@..."><FONT
color=#0000ff>jdbm-developer-admin@...</FONT></A><BR>&gt;
&nbsp;<A href="mailto:jdbm-developer-admin@..."><FONT
color=#0000ff>&lt;mailto:jdbm-developer-admin@...&gt;</FONT></A><BR><A
href="mailto:jdbm-developer-admin@..."><FONT
color=#0000ff>&lt;mailto:jdbm-developer-admin@...&gt;</FONT></A><BR>&gt;
To: JDBM Developer listserv<BR>&gt; Sent:
2/16/2006 4:48 PM<BR>&gt; Subject: re[2]:
[Jdbm-developer] MC-DataSafe<BR>&gt;<BR>&gt;
Bryan-<BR>&gt;<BR>&gt; Can you please provide
a description of how MVCC (with non-serialized<BR>&gt;
processing of transactions) would cause either
of the two lost<BR>data/bad<BR>&gt; data
examples described in the paper?<BR>&gt;<BR>&gt;
The way I read your email, it sounds like
you are implying that MVCC<BR>&gt; can't
provide transaction isolation or atomic updates,
when it<BR>&gt; absolutely does. &nbsp;MVCC
just operates optimistically in these<BR>situations.<BR>&gt;
Under MVCC in the first example, an abort
would occur. &nbsp;In the second<BR>&gt;
example under MVCC, the report created in
T2 would be correct, and T1<BR>&gt; would
commit without problems. &nbsp;In either
case, nothing gets left on<BR>the<BR>&gt;
floor...<BR>&gt;<BR>&gt; The massive advantage
of MVCC comes in the second example: &nbsp;If
the<BR>&gt; report required a significant
amount of time to complete (it was<BR>reading<BR>&gt;
a huge number of rows), then serializing
the transactions as described<BR>&gt; in
the paper will cause massive slow downs for
the poor sucker who is<BR>&gt; trying to
execute T1 (and kill concurrency). &nbsp;In
MVCC, the update<BR>would<BR>&gt; happen
immediately and return, even if T2 takes
a long time to<BR>complete.<BR>&gt; This
absolutely maximizes concurrency, and is
perfectly safe.<BR>&gt;<BR>&gt;<BR>&gt; I'm
going to need some serious convincing with
concrete examples<BR>before<BR>&gt; I'm willing
to favor serializable transactions over MVCC's
approach.<BR>My<BR>&gt; concern at this stage
is that much of the literature on these topics<BR>&gt;
either doesn't account for the radical departure
from the norm that<BR>MVCC<BR>&gt; is, or
it is incorrect in it's description of how
MVCC actually<BR>operates<BR>&gt; in practice.<BR>&gt;<BR>&gt;
I actually find it humorous that the concurrency
control paper we are<BR>&gt; talking about
here says that "the problem for nondistributed
DMBSs is<BR>&gt; well understood... &nbsp;and
one approach called two-phase locking has
been<BR>&gt; accepted as a standard solution."
&nbsp;That may have been true in 1981<BR>when<BR>&gt;
the paper was published, but the good folks
at PostgreSQL and others<BR>&gt; have pretty
much blown that out of the water.<BR>&gt;<BR>&gt;
This means that any broad claims made in
these older papers have to be<BR>&gt; taken
with a considerable grain of salt, and evaluated
in the context<BR>of<BR>&gt; the new development
of MVCC. &nbsp;Bernstein and Goodman are
certainly<BR>going<BR>&gt; to say that 2
phase locking is the only way to go - they
weren't even<BR>&gt; aware that a better
approach was possible.<BR>&gt;<BR>&gt; Anyway
- I'm going to be gone until the 28th - hopefully
you and Alex<BR>&gt; can continue the discussion
with vigour while I'm gone!<BR>&gt;<BR>&gt;
Cheers,<BR>&gt;<BR>&gt; - K<BR>&gt;<BR><BR><BR><BR>-------------------------------------------------------<BR>This
SF.net email is sponsored by: Splunk Inc.
Do you grep through log<BR>files<BR>for problems?
&nbsp;Stop! &nbsp;Download the new AJAX search
engine that makes<BR>searching your log files
as easy as surfing the &nbsp;web. &nbsp;DOWNLOAD
SPLUNK!<BR><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=12164"><FONT
color=#0000ff>&lt;http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=12164</FONT></A><BR>2&gt;<BR><A
href="http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=121642"><FONT
color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=121642</FONT></A><BR>_______________________________________________<BR>Jdbm-developer
mailing list<BR><A href="mailto:Jdbm-developer@..."><FONT
color=#0000ff>&lt;mailto:Jdbm-developer@...&gt;</FONT></A><BR><A
href="mailto:Jdbm-developer@..."><FONT
color=#0000ff>Jdbm-developer@...</FONT></A><BR><A
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff>&lt;https://lists.sourceforge.net/lists/listinfo/jdbm-developer&gt;</FONT></A><BR><A
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR>&lt;<BR>&nbsp;&nbsp;&nbsp;&nbsp;<BR&gt;-------------------------------------------------------
This SF.net<BR>email is sponsored by: Splunk
Inc. Do you grep through log files for<BR>problems?
Stop! Download the new AJAX search engine
that makes searching<BR>your log files as
easy as surfing the web. DOWNLOAD SPLUNK!<BR><A
href="http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=121642"><FONT
color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=121642</FONT></A><BR>_______________________________________________
Jdbm-developer mailing<BR>list <A href="mailto:Jdbm-developer@..."><FONT
color=#0000ff>Jdbm-developer@...</FONT></A><BR><A
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR>&lt;<BR></FONT></FONT></TD></TR></TBODY></TABLE></DIV></FONT></BODY></HTML&gt;

Alex,
Ok. I am still getting my feet in this space and I need to think more
about how the different kinds of inconsistencies could be permitted in
a hybrid 2PL + MVCC strategy. 2PL seems straightforward -- and maybe
MVCC is as well.
I am thinking of the implementation in terms of 2PL hierarchical
locking [1] support for rw synchronization plus MVCC support for ww
synchronization (per [2], section 5.3). What I think I need to do
is work through the interaction of those synchronization mechanisms
in more depth, maybe do some modeling to support that, and generate
some ideas for how this might interface with jdbm and DBCache. If
you have a design in mind I am all ears.
Regardless it seems that we could incrementally improve jdbm right
now by:
1. integrating DBCache for VLR Tx support (but without concurrency).
2. integrating the b-link support into the b+tree to support index
traversal under concurrent modification.
-bryan
PS: Gray laid out those isolation levels in [1], together with
heirarchical locking.
[1] J.N. Gray, R.A. Lorie, G.R. Putzolu, I.L. Traiger.
Granularity of Locks and Degrees of Consistency in a Shared Data
Base.
http://www.seas.upenn.edu/~zives/cis650/papers/granularity-locks.pdf
[2] Bernstein, P. A. and Goodman, N. 1981. Concurrency Control in
Distributed Database Systems. ACM Comput. Surv. 13, 2 (Jun. 1981),
185-221. DOI= http://doi.acm.org/10.1145/356842.356846http://www-static.cc.gatech.edu/classes/AY2003/cs8803i_fall/ConcurrencyContr
ol.pdf
-----Original Message-----
From: Alex Boisvert
To: Thompson, Bryan B.
Cc: 'jdbm-developer-admin@... '; ''Kevin Day ' '; ''JDBM
Developer listserv ' '
Sent: 2/16/2006 7:18 PM
Subject: Re: [Jdbm-developer] MC-DataSafe
I think it would look similar to what industrial databases are doing
today:
-read uncommitted
-read committed
-repeatable read
-serializable
This is more about relaxing MVCC than locks. You can stack locking on
top of these isolation levels for custom levels of conflict allowance or
prevention.
alex
Thompson, Bryan B. wrote:
>Alex,
>
>If you are going to relax conflict analysis, what is that going
>to look like under the "proposed" scheme? Release of some locks
>before all locks have been acquired?
>
>-bryan
>
>-----Original Message-----
>From: jdbm-developer-admin@...
>To: 'Kevin Day '
>Cc: 'JDBM Developer listserv '
>Sent: 2/16/2006 6:45 PM
>Subject: Re: [Jdbm-developer] MC-DataSafe
>
>
>Serializability is more about how strict you are when doing conflict
>analysis.
>
>To have total ordering (aka serializability) you must enforce that the
>read and write set of a transaction are completely disjoint from the
>write set of any transaction committed after the transaction started.
>
>alex
>
>
>'Kevin Day ' wrote:
>
>
>
>>Alex-
>>
>>Maybe you can help - can you explain to me how MVCC could have
>>anything to do serializability of transactions? Am I completely
>>off-base in saying that adding read locking to MVCC would defeat the
>>purpose?
>>
>>I'm obviously having a massive disconnect.
>>
>>Thanks,
>>
>>- K
>>
>>
>>
>>
>> > Kevin,
>>
>>MVCC does provide serializability. The tradeoffs among correct
>>concurrency control mechanisms involve performance not correctness.
>>The problems with performance for MVCC are that, by itself, it tends
>>to produce more transaction restarts and lower overall transaction
>>throughput when compared to 2PL. Postgres is some sort of hybrid
>>of 2PL and MVCC. I tried their chat group yesterday to get some
>>insight into exactly what kind of a hybrid, but no one could say "Oh,
>>we implemented XYZ".
>>
>>I do not think that MVCC (or any hybrid of MVCC) is a silver bullet
>>for performance. Neither postgres nor Oracle is a clear performance
>>winner -- in fact I have personal experience with the one of these
>>platforms which places it an order of magnitude slower than jdbm for
>>our application (an semantic web store) using a solution developer by
>>the database people themselves.
>>
>>I recommend a closer reading of this paper. I have found later
>>
>>
>articles
>
>
>>which use analytic techniques to examine the performance of a variety
>>
>>
>of
>
>
>>concurrency control mechanisms, but none which benchmarks real
>>
>>
>databases
>
>
>>explicitly in terms of this issue. Perhaps we could look at some of
>>
>>
>the
>
>
>>RDBMS benchmark literature for some insight there?
>>
>>Enjoy your vacation,
>>
>>-bryan
>>
>>-----Original Message-----
>>From: jdbm-developer-admin@...
>><mailto:jdbm-developer-admin@...>
>>To: JDBM Developer listserv
>>Sent: 2/16/2006 4:48 PM
>>Subject: re[2]: [Jdbm-developer] MC-DataSafe
>>
>>Bryan-
>>
>>Can you please provide a description of how MVCC (with non-serialized
>>processing of transactions) would cause either of the two lost
>>
>>
>data/bad
>
>
>>data examples described in the paper?
>>
>>The way I read your email, it sounds like you are implying that MVCC
>>can't provide transaction isolation or atomic updates, when it
>>absolutely does. MVCC just operates optimistically in these
>>
>>
>situations.
>
>
>>Under MVCC in the first example, an abort would occur. In the second
>>example under MVCC, the report created in T2 would be correct, and T1
>>would commit without problems. In either case, nothing gets left on
>>
>>
>the
>
>
>>floor...
>>
>>The massive advantage of MVCC comes in the second example: If the
>>report required a significant amount of time to complete (it was
>>
>>
>reading
>
>
>>a huge number of rows), then serializing the transactions as described
>>in the paper will cause massive slow downs for the poor sucker who is
>>trying to execute T1 (and kill concurrency). In MVCC, the update
>>
>>
>would
>
>
>>happen immediately and return, even if T2 takes a long time to
>>
>>
>complete.
>
>
>>This absolutely maximizes concurrency, and is perfectly safe.
>>
>>
>>I'm going to need some serious convincing with concrete examples
>>
>>
>before
>
>
>>I'm willing to favor serializable transactions over MVCC's approach.
>>
>>
>My
>
>
>>concern at this stage is that much of the literature on these topics
>>either doesn't account for the radical departure from the norm that
>>
>>
>MVCC
>
>
>>is, or it is incorrect in it's description of how MVCC actually
>>
>>
>operates
>
>
>>in practice.
>>
>>I actually find it humorous that the concurrency control paper we are
>>talking about here says that "the problem for nondistributed DMBSs is
>>well understood... and one approach called two-phase locking has been
>>accepted as a standard solution." That may have been true in 1981
>>
>>
>when
>
>
>>the paper was published, but the good folks at PostgreSQL and others
>>have pretty much blown that out of the water.
>>
>>This means that any broad claims made in these older papers have to be
>>taken with a considerable grain of salt, and evaluated in the context
>>
>>
>of
>
>
>>the new development of MVCC. Bernstein and Goodman are certainly
>>
>>
>going
>
>
>>to say that 2 phase locking is the only way to go - they weren't even
>>aware that a better approach was possible.
>>
>>Anyway - I'm going to be gone until the 28th - hopefully you and Alex
>>can continue the discussion with vigour while I'm gone!
>>
>>Cheers,
>>
>>- K
>>
>>
>>
>
>
>
>-------------------------------------------------------
>This SF.net email is sponsored by: Splunk Inc. Do you grep through log
>files
>for problems? Stop! Download the new AJAX search engine that makes
>searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164
2
>_______________________________________________
>Jdbm-developer mailing list
>Jdbm-developer@...
>https://lists.sourceforge.net/lists/listinfo/jdbm-developer
>
>

I'm a little concerned about adding MVCC to DBCache after the fact.
But then again I agree that moving from the current JDBM architecture to
DBCache is more incremental. I could help with the implementation of
B-link in parallel to the DBCache work.
alex
Thompson, Bryan B. wrote:
>Regardless it seems that we could incrementally improve jdbm right
>now by:
>
>1. integrating DBCache for VLR Tx support (but without concurrency).
>
>2. integrating the b-link support into the b+tree to support index
>traversal under concurrent modification.
>
>-bryan
>
>

Alex,
I don't know how this will all play out either. I see the concurrency
control as functioning over logical records and some higher level data
structures (indices or application data structures). DBCache needs to
make sure that different transactions do not overwrite the same region
of a page since it must enforce the distinction between original page
images and working copies (of a page or a region on a page) being
modified by a tx and also distinguish between the original and the
updated page image (after the commit).
jdbm certainly delays putting serialized objects onto pages, which helps.
A log structured store would help by focusing on appending rather than
in place updated. Once a write is performed on a page image in DBCache
it seems that other writers must be blocked for the (sub)page region
which was updated. Also, once a tx commits, the original state for the
changed parts of the page image are no longer available.
I am beginning to believe that the DBCache semantics (transaction page
image semantics) are what they are and that MVCC is purely in terms of
the record level semantics. Perhaps the only way to make it work out
cleanly is to NOT update pages containing live records (i.e., using a
log structured store) and then to periodically purge old versions and
recycle the pages from which those versions were collected. I can see
how we could create more efficient record allocation strategies using
this approach as well.
-bryan
-----Original Message-----
From: Alex Boisvert
To: Thompson, Bryan B.
Cc: ''jdbm-developer-admin@... ' '; '''Kevin Day ' ' ';
'''JDBM Developer listserv ' ' '
Sent: 2/17/2006 4:36 PM
Subject: Re: [Jdbm-developer] MC-DataSafe
I'm a little concerned about adding MVCC to DBCache after the fact.
But then again I agree that moving from the current JDBM architecture to
DBCache is more incremental. I could help with the implementation of
B-link in parallel to the DBCache work.
alex
Thompson, Bryan B. wrote:
>Regardless it seems that we could incrementally improve jdbm right
>now by:
>
>1. integrating DBCache for VLR Tx support (but without concurrency).
>
>2. integrating the b-link support into the b+tree to support index
>traversal under concurrent modification.
>
>-bryan
>
>

Thompson, Bryan B. wrote:
>I am beginning to believe that the DBCache semantics (transaction page
>image semantics) are what they are and that MVCC is purely in terms of
>the record level semantics. Perhaps the only way to make it work out
>cleanly is to NOT update pages containing live records (i.e., using a
>log structured store) and then to periodically purge old versions and
>recycle the pages from which those versions were collected. I can see
>how we could create more efficient record allocation strategies using
>this approach as well.
>
>
Yes, I have a feeling that it's going to play out towards a
log-structured approach (appending and recycling) as well.
alex

Serializability is more about how strict you are when doing conflict
analysis.
To have total ordering (aka serializability) you must enforce that the
read and write set of a transaction are completely disjoint from the
write set of any transaction committed after the transaction started.
alex
'Kevin Day ' wrote:
> Alex-
>
> Maybe you can help - can you explain to me how MVCC could have
> anything to do serializability of transactions? Am I completely
> off-base in saying that adding read locking to MVCC would defeat the
> purpose?
>
> I'm obviously having a massive disconnect.
>
> Thanks,
>
> - K
>
>
>
>
> > Kevin,
>
> MVCC does provide serializability. The tradeoffs among correct
> concurrency control mechanisms involve performance not correctness.
> The problems with performance for MVCC are that, by itself, it tends
> to produce more transaction restarts and lower overall transaction
> throughput when compared to 2PL. Postgres is some sort of hybrid
> of 2PL and MVCC. I tried their chat group yesterday to get some
> insight into exactly what kind of a hybrid, but no one could say "Oh,
> we implemented XYZ".
>
> I do not think that MVCC (or any hybrid of MVCC) is a silver bullet
> for performance. Neither postgres nor Oracle is a clear performance
> winner -- in fact I have personal experience with the one of these
> platforms which places it an order of magnitude slower than jdbm for
> our application (an semantic web store) using a solution developer by
> the database people themselves.
>
> I recommend a closer reading of this paper. I have found later articles
> which use analytic techniques to examine the performance of a variety of
> concurrency control mechanisms, but none which benchmarks real databases
> explicitly in terms of this issue. Perhaps we could look at some of the
> RDBMS benchmark literature for some insight there?
>
> Enjoy your vacation,
>
> -bryan
>
> -----Original Message-----
> From: jdbm-developer-admin@...
> <mailto:jdbm-developer-admin@...>
> To: JDBM Developer listserv
> Sent: 2/16/2006 4:48 PM
> Subject: re[2]: [Jdbm-developer] MC-DataSafe
>
> Bryan-
>
> Can you please provide a description of how MVCC (with non-serialized
> processing of transactions) would cause either of the two lost data/bad
> data examples described in the paper?
>
> The way I read your email, it sounds like you are implying that MVCC
> can't provide transaction isolation or atomic updates, when it
> absolutely does. MVCC just operates optimistically in these situations.
> Under MVCC in the first example, an abort would occur. In the second
> example under MVCC, the report created in T2 would be correct, and T1
> would commit without problems. In either case, nothing gets left on the
> floor...
>
> The massive advantage of MVCC comes in the second example: If the
> report required a significant amount of time to complete (it was reading
> a huge number of rows), then serializing the transactions as described
> in the paper will cause massive slow downs for the poor sucker who is
> trying to execute T1 (and kill concurrency). In MVCC, the update would
> happen immediately and return, even if T2 takes a long time to complete.
> This absolutely maximizes concurrency, and is perfectly safe.
>
>
> I'm going to need some serious convincing with concrete examples before
> I'm willing to favor serializable transactions over MVCC's approach. My
> concern at this stage is that much of the literature on these topics
> either doesn't account for the radical departure from the norm that MVCC
> is, or it is incorrect in it's description of how MVCC actually operates
> in practice.
>
> I actually find it humorous that the concurrency control paper we are
> talking about here says that "the problem for nondistributed DMBSs is
> well understood... and one approach called two-phase locking has been
> accepted as a standard solution." That may have been true in 1981 when
> the paper was published, but the good folks at PostgreSQL and others
> have pretty much blown that out of the water.
>
> This means that any broad claims made in these older papers have to be
> taken with a considerable grain of salt, and evaluated in the context of
> the new development of MVCC. Bernstein and Goodman are certainly going
> to say that 2 phase locking is the only way to go - they weren't even
> aware that a better approach was possible.
>
> Anyway - I'm going to be gone until the 28th - hopefully you and Alex
> can continue the discussion with vigour while I'm gone!
>
> Cheers,
>
> - K
>

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0
Transitional//EN">
<HTML><HEAD>
<STYLE type=text/css> P, UL, OL, DL, DIR,
MENU, PRE { margin: 0 auto;}</STYLE>
<META content="MSHTML 6.00.2900.2802" name=GENERATOR></HEAD>
<BODY leftMargin=1 topMargin=1 rightMargin=1><FONT
face=Tahoma>
<DIV><FONT face=Arial size=2>Alex-</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2></FONT></DIV>
<DIV><FONT face=Arial size=2>Does that imply
that enforcing serializability would defeat
the purpose of MVCC, or are these totally
separate issues?&nbsp; I guess if you had
sufficient information before any of the
transactions were started you could come
up with an optimal strategy - but that seems
like its way outside the realm of practical
implementation for a general purpose database...</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2>- K</FONT></DIV>
<DIV><FONT face=Arial size=2>&nbsp;</FONT>
<TABLE>
<TBODY>
<TR>
<TD width=1 bgColor=blue><FONT face=Arial
size=2></FONT></TD>
<TD><FONT face=Arial size=2><FONT color=red>&gt;
<BR>Serializability is more about how strict
you are when doing conflict<BR>analysis.<BR><BR>To
have total ordering (aka serializability)
you must enforce that the<BR>read and write
set of a transaction are completely disjoint
from the<BR>write set of any transaction
committed after the transaction started.<BR><BR>alex<BR><BR><BR>'Kevin
Day ' wrote:<BR><BR>&gt; Alex-<BR>&gt; &nbsp;<BR>&gt;
Maybe you can help - can you explain to me
how MVCC could have<BR>&gt; anything to do
serializability of transactions? &nbsp;Am
I completely<BR>&gt; off-base in saying that
adding read locking to MVCC would defeat
the<BR>&gt; purpose?<BR>&gt; &nbsp;<BR>&gt;
I'm obviously having a massive disconnect.<BR>&gt;
&nbsp;<BR>&gt; Thanks,<BR>&gt; &nbsp;<BR>&gt;
- K<BR>&gt; &nbsp;<BR>&gt; &nbsp;<BR>&gt;
&nbsp;<BR>&gt;<BR>&gt; &nbsp;&nbsp;&nbsp;&nbsp;&gt;
Kevin,<BR>&gt;<BR>&gt; MVCC does provide
serializability. &nbsp;The tradeoffs among
correct<BR>&gt; concurrency control mechanisms
involve performance not correctness.<BR>&gt;
The problems with performance for MVCC are
that, by itself, it tends<BR>&gt; to produce
more transaction restarts and lower overall
transaction<BR>&gt; throughput when compared
to 2PL. &nbsp;Postgres is some sort of hybrid<BR>&gt;
of 2PL and MVCC. &nbsp;I tried their chat
group yesterday to get some<BR>&gt; insight
into exactly what kind of a hybrid, but no
one could say "Oh,<BR>&gt; we implemented
XYZ".<BR>&gt;<BR>&gt; I do not think that
MVCC (or any hybrid of MVCC) is a silver
bullet<BR>&gt; for performance. &nbsp;Neither
postgres nor Oracle is a clear performance<BR>&gt;
winner -- in fact I have personal experience
with the one of these<BR>&gt; platforms which
places it an order of magnitude slower than
jdbm for<BR>&gt; our application (an semantic
web store) using a solution developer by<BR>&gt;
the database people themselves.<BR>&gt;<BR>&gt;
I recommend a closer reading of this paper.
&nbsp;I have found later articles<BR>&gt;
which use analytic techniques to examine
the performance of a variety of<BR>&gt; concurrency
control mechanisms, but none which benchmarks
real databases<BR>&gt; explicitly in terms
of this issue. &nbsp;Perhaps we could look
at some of the<BR>&gt; RDBMS benchmark literature
for some insight there?<BR>&gt;<BR>&gt; Enjoy
your vacation,<BR>&gt;<BR>&gt; -bryan<BR>&gt;<BR>&gt;
-----Original Message-----<BR>&gt; From:
<A href="mailto:jdbm-developer-admin@..."><FONT
color=#0000ff>jdbm-developer-admin@...</FONT></A><BR>&gt;
<A href="mailto:jdbm-developer-admin@..."><FONT
color=#0000ff>&lt;mailto:jdbm-developer-admin@...&gt;</FONT></A><BR>&gt;
To: JDBM Developer listserv<BR>&gt; Sent:
2/16/2006 4:48 PM<BR>&gt; Subject: re[2]:
[Jdbm-developer] MC-DataSafe<BR>&gt;<BR>&gt;
Bryan-<BR>&gt;<BR>&gt; Can you please provide
a description of how MVCC (with non-serialized<BR>&gt;
processing of transactions) would cause either
of the two lost data/bad<BR>&gt; data examples
described in the paper?<BR>&gt;<BR>&gt; The
way I read your email, it sounds like you
are implying that MVCC<BR>&gt; can't provide
transaction isolation or atomic updates,
when it<BR>&gt; absolutely does. &nbsp;MVCC
just operates optimistically in these situations.<BR>&gt;
Under MVCC in the first example, an abort
would occur. &nbsp;In the second<BR>&gt;
example under MVCC, the report created in
T2 would be correct, and T1<BR>&gt; would
commit without problems. &nbsp;In either
case, nothing gets left on the<BR>&gt; floor...<BR>&gt;<BR>&gt;
The massive advantage of MVCC comes in the
second example: &nbsp;If the<BR>&gt; report
required a significant amount of time to
complete (it was reading<BR>&gt; a huge number
of rows), then serializing the transactions
as described<BR>&gt; in the paper will cause
massive slow downs for the poor sucker who
is<BR>&gt; trying to execute T1 (and kill
concurrency). &nbsp;In MVCC, the update would<BR>&gt;
happen immediately and return, even if T2
takes a long time to complete.<BR>&gt; This
absolutely maximizes concurrency, and is
perfectly safe.<BR>&gt;<BR>&gt;<BR>&gt; I'm
going to need some serious convincing with
concrete examples before<BR>&gt; I'm willing
to favor serializable transactions over MVCC's
approach. &nbsp;My<BR>&gt; concern at this
stage is that much of the literature on these
topics<BR>&gt; either doesn't account for
the radical departure from the norm that
MVCC<BR>&gt; is, or it is incorrect in it's
description of how MVCC actually operates<BR>&gt;
in practice.<BR>&gt;<BR>&gt; I actually find
it humorous that the concurrency control
paper we are<BR>&gt; talking about here says
that "the problem for nondistributed DMBSs
is<BR>&gt; well understood... &nbsp;and one
approach called two-phase locking has been<BR>&gt;
accepted as a standard solution." &nbsp;That
may have been true in 1981 when<BR>&gt; the
paper was published, but the good folks at
PostgreSQL and others<BR>&gt; have pretty
much blown that out of the water.<BR>&gt;<BR>&gt;
This means that any broad claims made in
these older papers have to be<BR>&gt; taken
with a considerable grain of salt, and evaluated
in the context of<BR>&gt; the new development
of MVCC. &nbsp;Bernstein and Goodman are
certainly going<BR>&gt; to say that 2 phase
locking is the only way to go - they weren't
even<BR>&gt; aware that a better approach
was possible.<BR>&gt;<BR>&gt; Anyway - I'm
going to be gone until the 28th - hopefully
you and Alex<BR>&gt; can continue the discussion
with vigour while I'm gone!<BR>&gt;<BR>&gt;
Cheers,<BR>&gt;<BR>&gt; - K<BR>&gt;<BR><BR><BR><BR>-------------------------------------------------------<BR>This
SF.net email is sponsored by: Splunk Inc.
Do you grep through log files<BR>for problems?
&nbsp;Stop! &nbsp;Download the new AJAX search
engine that makes<BR>searching your log files
as easy as surfing the &nbsp;web. &nbsp;DOWNLOAD
SPLUNK!<BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=121642"><FONT
color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&amp;kid=103432&amp;bid=230486&amp;dat=121642</FONT></A><BR>_______________________________________________<BR>Jdbm-developer
mailing list<BR><A href="mailto:Jdbm-developer@..."><FONT
color=#0000ff>Jdbm-developer@...</FONT></A><BR><A
href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT
color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR>&lt;<BR></FONT></FONT></TD></TR></TBODY></TABLE></DIV></FONT></BODY></HTML&gt;

Total ordering doesn't defeat the purpose of MVCC, you still get the
benefit that a lot of transaction patterns can run concurrently without
conflicting. In fact, optimistic locking provides the widest
permissible range of non-conflicting transactions to execute concurrently.
However, when transaction are potentially conflicting and there's
contention for resources, pessimistic locking has proven to yield higher
performance.
If you know the read and write set of transactions in advance, it
becomes trivial to order them for near-optimal performance. But in
reality, you don't know in advance. That's why pessimistic locking
comes into play. It's an easy way to obtain the read+write sets as you
go so you can reduce the number of unnecessary rollbacks and maximize
usage of non-contended resources.
alex
'Kevin Day ' wrote:
> Alex-
>
> Does that imply that enforcing serializability would defeat the
> purpose of MVCC, or are these totally separate issues? I guess if you
> had sufficient information before any of the transactions were started
> you could come up with an optimal strategy - but that seems like its
> way outside the realm of practical implementation for a general
> purpose database...
>
> - K
>
>
> >
> Serializability is more about how strict you are when doing conflict
> analysis.
>
> To have total ordering (aka serializability) you must enforce that the
> read and write set of a transaction are completely disjoint from the
> write set of any transaction committed after the transaction started.
>
> alex
>