27 September 2007

BULK COLLECT & FORALL vs. CURSOR & FOR-LOOP

After more and more reads about BULK COLLECT and FORALL and their performance improvements I decided to have a closer look on it by myself to see how powerful they really are. So I built a little test-case which inserts all entries from the all_object view into another table. The inserts happens on three different ways:
First way is a simple cursor over the view and a insert in a loop with FETCH into local variables. This way also shows how slow the opening of the cursor itself is.
The second way is a simple FOR – IN LOOP with the insert of the cursor variables.
And, of course, the third way is the way with bulking the rows and inserting them with FORALL so lets see.

So the other table looks like this (three columns are enough for this tests)

Look at this, for bulking all the lines into the collection took just 8 seconds (for 49 424 rows) and the inserts just 1 second! Unbelievable, together we did everything in 9 seconds where the other ways took over 20 seconds!

Well now lets try to first execute the bulk load then truncate the table again but not bouncing the database so that the buffers and caches a still filled

Ok so now we need 10 seconds for the run with the bulk but we sill need 21 seconds for the cursor! So not really a improvement with the cache and so on. Ok final test on a big system with over 268 thousand rows

COUNT(*)
———-
268056
And again, bulking took 8 seconds and the inserts just 1 second! But the run with the cursor took 33 seconds!

So this was just a short test but it definitely shows that BULK COLLECT and FORALL are much faster than cursors within the FOR loop! Only disadvantage of FORALL as you maybe already guess if you looked at the code: You can just perform one DML statement, there is no “FORALL END” clause! But anyway also bulking is a very high-performance functionality of Oracle! So if you have to run throw data collections then use BULK COLLECT!

In the cursor Oracle must refresh the binds every loop with the new data. Also the Insert is executed each time! With BULK COLLECT Oracle can put all the values into a collection at onces. No need to refresh something again and again. FORALL then executes the insert just one time in bulk. No soft parses, reading of the new values of the binds…

There is a subtle flaw in your tests that leads to incorrect conclusions.

There are two differences between CURSOR_FOR_QUERY and BULK_COLLECT_QUERY. The first difference is that the FOR loop is replaced by BULK COLLECT and the second is that the INSERT statements are replaced by FORALL. In fact, all the performance benefit is gained by the latter change and none by the former.

In 10gR2 (or 10gR1 with PLSQL_OPTIMIZE_LEVEL set to 2) a FOR loop does a BULK COLLECT under the covers.

To show that the BULK COLLECT offers no benefit you can either remove the INSERT and FORALL statements (replacing the former with NULL) or you can modify CURSOR_FOR_QUERY to build an in-memory table as follows:
CREATE OR REPLACE PROCEDURE cursor_for_query
IS
TYPE sowner IS TABLE OF VARCHAR2 (30);

Hi
I am using Bulk collect for parameterised cursors. I have 3cursors which is running one after other…If I am using Bulk collect then for first loop iteration its running fine but for second time it doesnt find any value and returns the error…ORA – 0652 PL/SQL numeric or Value… Where as if I use the same code with normal Cursor n for loop it works perfectly fine…n I get my output…

Plz can u suggest me can we use bulk collect in different loops or is there any constraints in using them

Well, one second isn’t really meaningful in a time frame of 8 but as I’m a advocate of “knowing what you’re doing” I don’t trust on optimizer parameters and stuff that may or may not happen under the hood. I get sure that they happen and so I’ll always write BULK COLLECT rather than trusting that it will happen under the hood when I want a bulk collect.

OK, the first and main point is that the FOR loop performs comparably to BULK COLLECT unless somebody lowers the optimizer level.

The FOR loop was provided – as with other languages – to simplify code and provide structure. In the vast majority of cases the BULK COLLECT clause is clumsier and, in my opinion, can generally no longer be justified now there is no longer a performance benefit.

There are some rare cases where I would still use BULK COLLECT, however. For example, if I want to process a bunch of rows multiple times I can read them into a collection first. This turns out to be simpler with BULK COLLECT (providing you do not need the limit clause) than a FOR loop but this is the exception that proves the rule.

Not sure if I get you right here, but I’ve never written something like “Use BULK COLLECT/FORALL because FOR LOOP is crap”.
Also I don’t tell developer which syntax they should use. It depends on the use cases anyway.
But what I wouldn’t do is to compare PL/SQL FOR loop with FOR loop in other languages like Java or C++.
And yes, FOR loops was provided for writting better structured code but I kwow C programs for example, where you can’t read the for loop anymore. So it depends also here on the developer how good the code is written!

Ok, so when I got you right then you say that you wouldn’t use the BULK COLLECT clause anymore.
That’s fine, I say: The developer has to/should know when to use BULK COLLECT and when to use FOR clause!

By the way, I repeated your “cold” tests on my computer and found no difference in performance. Did you bounce the database between your tests? Flushing the buffer cache would be insuffficient in this case:

Perhaps we have done this to death but I am a little confused. When would you recommend using BULK COLLECT and when would you recommend using a FOR loop?

This seems like a reasonable question given the title of this BLOG entry.

It seems like we largely, perhaps completely, agree with each other.

In case my position remains unclear, my answer to the question is “only use BULK COLLECT in the very rare cases where the code is simpler to read. In nearly all cases the FOR loop will be easier to both write and read”.

In my cold run I just executed both procedures so that buffer pool and shared pool are filled. It’s a little bit unclear here – I just wanted to warm up every thing and not to test how it behaves when it’s executed the first time.

Yeah, looks like that we agree at least to a certain degree.

I would say that it depends on the use case and the benefit for the developer – so the developer has to know when to use one of these clauses.

An example for BULK COLLECT would be when you’ve to call a sub procedure for example and this sub procedure gets collections as parameter. Then it would be better to fill the collections with bulk collect rather than write a for loop and and filling the collections by yourself.

A FOR loop I would use when I’ve to go over some data and do a lot with it. For example reading some records of a table, check some values, pass some values to a sub routine, having some if/else depending on the values of one or more field. Well, that will be one of the standard scenarios I guess!

I have read the complete discussion but still having one doubt since one querry is not cleared by anyone or not ask by anyone, i.e. in case of multiple DML statement has to perform then what should be use.Like I have a need of purging Record from around 10 table and If I go with BULK COLLECT FORALL then I wud have to write 10 forall loop whereas in case of for loop i can do the same with single loop instead. one more question is, if I use Bulk collect with for loop then wud it give any performance benefit? can someone sugges what wud be the best approach to perform this? what wud be the impact on memory since bulk collect use collection of data so it must keep all the data in cache, I guess? Please make me correct if am wrong.

I’m not sure if I get your question here: If you would have tables where you have to purge data out I would do a simple “DELETE FROM table WHERE” with the specific where clause. I don’t see any reason to select the data first and do a BULK COLLECT FORALL there! And yes, I would use 10 different delete statements or if is generic one dynamic sql delete statement.

I don’t think that a collection would be the right choice to do the insert part. “Insert into select * from” is very good optimized. But parallel queries might help here. Anyway – I would check the code first. Selecting from a table with a rownum where clause is a statement that does not have any business logic. You just get some “random” rows into the new table.

For the delete you can collect the primary key first with a BULK COLLECT and then execute a FORALL DELETE with the primary keys. But again: Check the code, rownum in a delete sounds just wrong!

Actually venzi u dint get my Question. am using the 10 delete staatment but before deletion I have to get all the eligible record into one table and from that table am reading the record on matching in all the table into Where clause and deleting them…and am reading only once all the record from the table and starting the loop. here my question arise that this I can do using either bulk collect or cursor but in case of bulk collect Ill have to write 10 forall loop since only once DML operation can perform with forall and in case of cursor Ill have to write single for loop and all the 10 delete statement inside that.

You are right! You have to write 10 FORALL statements for those 10 DELETES.
As I said in my originally post: “Only disadvantage of FORALL as you maybe already guess if you looked at the code: You can just perform one DML statement, there is no “FORALL END” clause! But anyway also bulking is a very high-performance functionality of Oracle! So if you have to run throw data collections then use BULK COLLECT!”

And also the Oracle documentation says so:
“The FORALL statement runs one DML statement multiple times, with different values in the VALUES and WHERE clauses. The different values come from existing, populated collections or host arrays. The FORALL statement is usually much faster than an equivalent FOR LOOP statement.”

But: You have to see FORALL as a DML statement and not as a LOOP!
It’s just like a DELETE statement or an INSERT statement. So rather than having 10 DELETE statements you have 10 FORALL statements there. The amount is the same but with the FORALL statements you save the whole FOR LOOP!

Actually I note that you have two unnecessary steps in there. You open a loop and exit it when c%notfound is true.
As you do bulk collect and forall you don’t need to do that; that’s the whole purpose of this functionality. So you can remove those lines and it still works fine:

Bulk collect doesn’t have a limit, it’s a PL/SQL feature. However some collections have limits like the VARRAY collection. Others, like that one used in my example, do not have virtual limits, however, they do have physical. The collection is stored within the PGA of the session process which means that the bigger the collection gets the more memory is used by the session process. So the size of the collection is depending on the amount of the physical memory available on the database machine.

hello,
but query takes the time for fetching records rather insert. so what is main difference in record by record insert record by FOR LOOP and BULK COLLECT/ FORALL?? I have still confusion. Please clear.

Yes, the query takes the time for fetching the results, that is correct. In the CURSOR/FOR example this time will be spent by the FOR loop iteration over the cursor. In the BULK COLLECT/FORALL example this time will be spent during the BULK COLLECT.

The major difference between the INSERT and the FORALL statements is that the INSERT statement gets execute for every single row from your cursor/select statement. So for every row you primary have a soft parse, a round trip to the database and a query execution.
While in the scenario of the FORALL statement the statement will only do a soft parse once, send/stream the collection to the database once (only 1 round trip) and execute the statement only once streaming the whole collection into the table. So at the end the FORALL statement does simply much less work.

About

Disclaimer

The views expressed on this blog are my own and do not necessarily reflect the views of Oracle and its affiliates! The views and opinions expressed by visitors on this blog are theirs solely and may not reflect mine!