Details

Description

Reporting for Mamta Satoor.

If we use insert into desttable, select
distinct from source, into a desttable which has autoincrement
column in it, we might see gaps in the autoincrement column if
there are duplicated rows in the source table. The reason for
this is Derby projects values into destination table columns before
building a distinct resultset from the source table. The piece
of code doing this is in org.apache.derby.impl.sql.execute.ProjectRestrictResultSet class's getNextRowCore() method where it calls doProjection.

I have been working on this trying to resolve this. Here are a few comments on this:

1) The main reason for this happeing is the auto increment value getting projected before the duplicate rows are discovered. But I was not able to get a row from the source ResultSet without actually getting the projected value.

2) From the activation I discovered that a hierarchy of ResultSets are involved here, where each ResultSet has a specific job and it least bothers where data is coming from or where it is going to send the data. So in this case the ProjectRestrictResultSet is where the projection happens. Any call to get the next row from the source is routed through this ResultSet where the actual projection happens.

3) One solution that I thought about was doing the elimination of duplicate rows before we do the Projection, since I was unable to figure out how to bypass the projection ( as stated in the above comment) , I thought of using the SYS.SYSCOLUMNS table where Derby stores the latest auto increment value and increment ident and all other details about the table and the column itself.

4) But one problem in using this table is that whenever updates need to happen to this table it gets locked and if we are in the middle of transaction and some errors take place the table might be in a in consistent state. The Derby docs also say that in the case of using an auto increment column, if a transaction that is using this is rolled back gaps might result as the SYS.SYSCOLUMNS table is not locked for rolling back the auto increment values.

Shreyas Kaushik
added a comment - 30/Nov/04 09:22 I have been working on this trying to resolve this. Here are a few comments on this:
1) The main reason for this happeing is the auto increment value getting projected before the duplicate rows are discovered. But I was not able to get a row from the source ResultSet without actually getting the projected value.
2) From the activation I discovered that a hierarchy of ResultSets are involved here, where each ResultSet has a specific job and it least bothers where data is coming from or where it is going to send the data. So in this case the ProjectRestrictResultSet is where the projection happens. Any call to get the next row from the source is routed through this ResultSet where the actual projection happens.
3) One solution that I thought about was doing the elimination of duplicate rows before we do the Projection, since I was unable to figure out how to bypass the projection ( as stated in the above comment) , I thought of using the SYS.SYSCOLUMNS table where Derby stores the latest auto increment value and increment ident and all other details about the table and the column itself.
4) But one problem in using this table is that whenever updates need to happen to this table it gets locked and if we are in the middle of transaction and some errors take place the table might be in a in consistent state. The Derby docs also say that in the case of using an auto increment column, if a transaction that is using this is rolled back gaps might result as the SYS.SYSCOLUMNS table is not locked for rolling back the auto increment values.