August 27, 2014

In-memory Consistency

A comment on one of my early blogs about the 12c in-memory database option asked how Oracle would deal with read-consistency. I came up with a couple of comments outlining the sort of thing I would look for in a solution, and this note is an outline of how I started to tackle the question, with a couple of the subsequent observations. The data is (nearly) the same as the data I generated for my previous article on the in-memory database (and I’m running 12.1.0.2, of course).

In this case I’ve made the inmemory priority high, and I haven’t set any column to “no inmemory”, although I have made one column different from the rest (v$im_column_level doesn’t get populated unless there is some variation across columns). I have to say I couldn’t get very consistent behaviour in terms of when the data finally got into memory with this table creation – possibly something to do with using “alter table” rather than “create table” – but a second “alter table t1 inmemory;” seemed to do the trick if Oracle was playing hard to get.
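For anyone who wants to reproduce the setup, the commands involved look something like the following sketch – the table name t1 comes from the previous article, and the query against v$im_segments is just one convenient way of checking population status:

-- mark the table for population with high priority
-- (repeat the command if Oracle is playing hard to get)
alter table t1 inmemory priority high;

-- check population: populate_status should reach COMPLETED
-- and bytes_not_populated should drop to zero
select
        segment_name, populate_status, bytes_not_populated
from
        v$im_segments
where
        segment_name = 'T1'
;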

Once I’d checked that the table was in memory I collected performance figures from v$mystat and v$session_event for the following query:

select
        /* Test Run */
        last_ddl_time
from
        t1
where
        t1.created > trunc(sysdate)
and     t1.object_type = 'TABLE'
and     t1.subobject_name is not null
;
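For reference, this is roughly how the figures can be captured – a simple sketch that reports the current session’s statistics and wait events; taking before/after snapshots and diffing them is left as mechanical detail:

-- session statistics for the current session (non-zero values only)
select
        sn.name, ms.value
from
        v$mystat        ms,
        v$statname      sn
where
        sn.statistic# = ms.statistic#
and     ms.value != 0
;

-- wait events for the current session
select
        event, total_waits, time_waited_micro
from
        v$session_event
where
        sid = sys_context('userenv','sid')
;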

Once I was satisfied that the in-memory option was working correctly, I went through the following steps:

Session 1: set transaction read only;

Session 1: run the query and collect performance figures

Session 2: do several small, committed, updates, modifying a total of 30 or 40 random rows

Session 2: Flush the buffer cache – so that we can see future block acquisition

Session 1: run the query again and collect performance figures
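The sequence, in SQL, looks something like the following sketch. The update statement is purely schematic (the object_id column is an assumption, based on the all_objects-style data used in the previous article) – any small committed changes to scattered rows will do:

-- Session 1: start a read-only transaction, then run the test query
set transaction read only;

-- Session 2: several small updates, each committed separately
-- (schematic - pick a different small range of rows each time)
update t1 set last_ddl_time = sysdate
where object_id between 1000 and 1010;

commit;

-- Session 2: flush the cache so that future block acquisition is visible
alter system flush buffer_cache;

-- Session 1: re-run the test query and collect the performance figures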

The effect of the “set transaction read only;” was to force the session to do some extra work in the second execution of the query to make the data read-consistent back to the start of the “transaction”. The results were as follows (don’t forget that some of the numbers will relate to the action of collecting the performance figures):

There are quite a lot of stats that probably aren’t interesting – and there’s one detail that is important but doesn’t appear (at least not clearly): the table in question had about 6,800 blocks below its highwater mark.

So, what do the stats tell us? The most obvious change, of course, is that we had to do some physical reads to get a result set: 24 multiblock reads and 5 single block reads (the latter from the undo tablespace). This is echoed in the session stats as 853 “physical reads cache” from 29 “physical read IO requests”. We can then see the specific read-consistency work (in two ways – with a third close approximation):

We applied 48 undo change vectors to fix up 41 blocks to the correct point in time and used them to read 48 rows – the last pair of figures won’t necessarily match the first two pairs, but they do give us a measure of how much data we had to acquire from the cache when trying to do an in-memory scan.

The number 41 actually appears a couple more times: it’s “table scan blocks gotten” (which might seem a little odd since we got far more than 41 blocks by multiblock reads – but we only really wanted 41), and it’s also the change (downwards) in “session logical reads – IM”. Even when Oracle does a pure in-memory query it calculates the number of blocks it would have been reading and reports that number as “session logical reads” and “session logical reads – IM” – so there’s another way to get confused about buffer visits and another statistic to cross-check when you’re trying to work out how to calculate “the buffer hit ratio” ;)
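If you want to cross-check the figures yourself, the handful of statistics I’ve been quoting can be picked out by name – a sketch, with statistic names as they appear in 12.1.0.2:

select
        sn.name, ms.value
from
        v$mystat        ms,
        v$statname      sn
where
        sn.statistic# = ms.statistic#
and     sn.name in (
                'session logical reads',
                'session logical reads - IM',
                'table scan blocks gotten',
                'CR blocks created',
                'data blocks consistent reads - undo records applied',
                'IM scan blocks cache',
                'IM scan rows cache'
        )
;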

After the first read the scattered reads all seemed to be 32 blocks of “intermittent” tablescan – perhaps this is a measure of the number of blocks that are compressed into a single in-memory chunk (for query low), but perhaps it’s a side effect of the “physical reads prefetch warmup” that Oracle may do when the cache has a lot of empty space. I’ll leave it as an exercise to the reader to refine the test (or think of a different test) to determine which it is. It’s quite important to find out, because if Oracle is tracking change at the “in-memory chunk” level rather than at the block level, then a small amount of high-precision change to an in-memory table could result in a relatively large amount of “redundant” I/O as a long-running query tries to stay read-consistent.
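One possible starting point for that exercise: the warmup component is reported as a separate statistic, so watching the following two figures across a run where the cache is already full might separate the two hypotheses – again, just a sketch:

select
        sn.name, ms.value
from
        v$mystat        ms,
        v$statname      sn
where
        sn.statistic# = ms.statistic#
and     sn.name in (
                'physical reads cache prefetch',
                'physical reads prefetch warmup'
        )
;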

Comments

Yeah … noticed the same thing on a test 12.1.0.2 instance here … “alter table” commands seem a little “ambivalent” about bringing the data into the in-memory store. Independent of priority, I’ve sometimes had to query the table to get it populated.

I think it’s highly likely that the scattered reads are happening because Oracle is taking advantage of the required reads to do some cache warming – but I didn’t see anything in your blog note that supports the hypothesis. What I’d be looking for is an example where the object was flushed from the cache, and the changed blocks had to be re-read, but the only blocks read from the object were just the ones needed and no others.

Thanks for your quick response. In fact I haven’t put all the stats in the post because there was no big increase in consistent reads (caused by the read of undo blocks and affected blocks from the cache – I should have put them in to show the fact, and I will add them later). I only focused on ‘CR blocks created’ = ‘IM scan blocks cache’, and the three rows changed (‘IM scan rows excluded’) were located on two different blocks. So in this case Oracle is not tracking change at the in-memory chunk (IMCU) level, but there are thresholds that can cause the invalidation of an IMCU (_inmemory_pct_inv_blocks_invalidate_imcu, _inmemory_pct_inv_rows_invalidate_imcu, etc.), causing the re-reading of all the blocks compressed into that IMCU.

But that’s just repeating my experiment above. I pointed out how many blocks had to be made read-consistent and how many undo records had to be applied, and tallied this against the various buffer gets. BUT … that doesn’t prove that (all) the extra blocks read from disc were definitely basic pre-warming; it is still possible that (some of) the excess reads were directly related to the way the code handles stale rows. (Note that “physical reads cache prefetch” and “physical reads prefetch warmup” are not identical – leaving scope for some I/O which is prefetching for other reasons.)

Thanks for your help, I really appreciate it. I updated the test in my blog. I tried to fill the cache with random blocks to avoid prefetching, but the problem is that if I don’t see “physical reads cache prefetch” in the second query, sometimes (when I continue to fill the cache and test) it gives me ‘IM scan invalid all blocks’, ‘IM scan CUs invalid’, and ‘table scan disk IMC fallback’. So I stopped the test with a lower number of prefetched blocks (you can check in my blog).