Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

If I make a unique constraint on a field, do I also need to make an index on that field in order to get a scalable insert time? Or is this done for me (even if the index it uses isn't publicly accessible?)

Specifically, I'm working with Apache Derby for prototyping, although I will probably be moving it to MySQL in the semi-near future. I'm also hoping there might be something in the SQL standard that says something about this.

I will never have a need to search by this field, so I would rather not make a useless index. But I'd rather have a useless index than have an O(n) insert time.

3 Answers
3

My original answer (below) is probably not useful to you at all because it does not address the question of unique constraints. As others have said, these constraints are usually implemented with an implied unique index. In special cases this might not be true (eg disable novalidate for Oracle).

The question could be: Is it possible to enforce uniqueness without an index? Generally speaking the answer is no though in some cases a Clustered Index will mean that the index and the table are the same object.

--END EDIT--

You said "I'd rather have a useless index than have an O(n) insert time.", but in general databases do not have O(n) insert time. There are two cases to consider:

A normal table with or without indexes:

New rows are dumped at the top of the heap. The RDBMS probably only looks at 1 block, so not just O(1) but very small O(1).

If the table has indexes, a pointer to the row will be added to each. This will usually be an O(log(n)) operation.

New rows are inserted into a particular block, which may cause the block to split or overflow, but whatever happens it is still O(log(n)) or better, caused by the b-tree or similar structure used to find the block.

But uniqueness without an index would be O(n) as you have to check the entire table. That's what I'm trying to avoid.
–
corsiKaMay 18 '11 at 14:36

This indeed is the best answer for this question !!! +1
–
RolandoMySQLDBAMay 18 '11 at 14:36

@Trick - yes, I misunderstood at first. The index is the price you pay for the uniqueness constraint I'm afraid. Can you use a Clustered Index in your case?
–
Jack Douglas♦May 18 '11 at 14:55

@JackPDougless I can use a standard "index" and get a O(lg n) insert time. That's not a problem. My question is would the system, knowing that you need that index to get a decent insert time, create an index for me.
–
corsiKaMay 18 '11 at 15:21

INSERT must add a "row" to each and every index (of any kind) that you have. This takes some time. (Usually not enough time to matter.) Indexes are all stored in BTree format. MyISAM BTree blocks are 1KB; InnoDB uses 16KB.

Inserting into InnoDB updates the PK and the data simultaneously.

Inserting into MyISAM usually "appends" the data to the .MYD. Separately, it adds a row to the PK (if any).

INSERT has to first verify that there is no duplicate key for any PRIMARY or UNIQUE key. This is done by using the index. And, hence, why the UNIQUE and FOREIGN KEY CONSTRAINTs really build indexes. This is O(logN), but usually CPU, not I/O, because if efficient caching.

And here's a way to test it... CREATE a table without any secondary indexes; do SHOW TABLE STATUS - Index_length will be 0. Then add a UNIQUE index; TABLE STATUS will now show something. (May have to put a non-trivial amount of data in the table.)
–
Rick JamesMay 21 '11 at 0:50

As for constraints, indexes are created for you so that constraint paradigm is setup. You should be able to remove duplicate indexes, even UNIQUE keys, as long as the constraint you made does not reference other UNIQUE keys you personally made apart from the constraint paradigm.

You may never have to search for this field, but MySQL will sure have to as its path to determine validity of keys and determine how to go about performing ON DELETE CASCADE and ON UPDATE CASCADE operations.

The UNIQUE index simply guarantees uniqueness of tuples (singletons, pairs, triplets, ..., n-tuples, etc) in every row in the table.

It is at your discretion to remove such duplicate indexes, provided you do not break the constraint paradigm you wan the table to have.

This doesn't answer my question. My question is related to the insert time. If you have a unique constraint, the system must ensure the uniqueness of the field before an insert - if there is no index on the field, it will have to search the entire table (O(n)). If there is an index, the lookup will be much faster (probably O(lg n)). That's my issue. I'm well aware of the referential integrity mechanics, I'm only concerned (for purposes of this question) about the performance.
–
corsiKaMay 18 '11 at 5:01