My analysis:
The row lock contention wait time is clearly huge, which directed me to find the SQL statements causing it.
All of the SQL statements were SELECT ... FOR UPDATE statements.
I was also able to identify the locked tables.

My tuning idea:
1. I'm thinking of reorganizing the hot tables as well as their indexes, but my instinct is that this won't do much to reduce the huge row lock wait time.
2. I'm also looking at reducing the number of rows per block by increasing PCTFREE and decreasing PCTUSED, so the contention spreads over many blocks instead of one heavily used block.

Question
Since the SQL statements against those locked tables are SELECT ... FOR UPDATE, how can I tune this kind of statement?

Does anyone have another idea for dealing with this row lock contention?

So, I first need to have the data locally, then check the INITRANS value before changing it.
Furthermore, as far as I know, a new INITRANS value won't affect existing blocks, only newly allocated ones, so I would have to rebuild the table after changing INITRANS.
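A sketch of how that check and rebuild might look (the index name POS_FOLIO_PK and the INITRANS value of 10 are placeholders, not taken from the actual system):

```sql
-- Current target INITRANS (applies to newly formatted blocks only)
SELECT table_name, ini_trans
FROM   dba_tables
WHERE  owner = 'DEV'
AND    table_name = 'POS_FOLIO';

-- Raise INITRANS, then rebuild so existing blocks are reformatted
ALTER TABLE dev.pos_folio INITRANS 10;
ALTER TABLE dev.pos_folio MOVE;

-- Indexes are left UNUSABLE by the MOVE and must be rebuilt
ALTER INDEX dev.pos_folio_pk REBUILD;
```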

Question.
What is the overhead of dumping the contents of a block from the production database as described above?

As the documentation mentions, you don't know exactly how many transactions took place, because DBA_TABLES shows only the target INITRANS, not the number of ITL slots actually used. Metalink Note 151473.1 explains the steps above in more detail; furthermore, the effective INITRANS value depends on the block size, as specified in that note.
That's why dumping the target block shows exactly how many transactions took place within it.
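For reference, a sketch of the dump itself; the DBMS_ROWID functions locate the block for a given row, and the file/block numbers below are only example values:

```sql
-- Find the datafile and block holding a row of interest
SELECT dbms_rowid.rowid_relative_fno(rowid) AS file_no,
       dbms_rowid.rowid_block_number(rowid) AS block_no
FROM   dev.pos_folio
WHERE  pos_folio_id = :some_id;

-- Dump that block to a trace file in user_dump_dest;
-- the ITL section (Itl / Xid / Uba lines) shows the slots actually used
ALTER SYSTEM DUMP DATAFILE 7 BLOCK 1234;
```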

ENQ: TX - ROW LOCK CONTENTION - this is a row level lock that can be caused by a possible primary key or unique index violation if the first session commits its current transaction. A second session attempting to insert a value that would result in the primary key violation will sit in this wait event until the first session commits or rolls back.

Problems with INITRANS would show up as ENQ: TX - ALLOCATE ITL ENTRY waits in 10g/11g, but such problems should be rare in tablespaces using ASSM.

I would likely start by trying to determine why the sessions were attempting to insert identical primary key values. Is the application a database-platform-independent application that, rather than using sequences, uses the following to determine the next primary key value:
SELECT
MAX(PRIMARY_KEY) + 1 NEW_PRIMARY_KEY
FROM
MY_TABLE;

Followed by:
INSERT INTO MY_TABLE (
PRIMARY_KEY)
VALUES (... );

Followed by:
SELECT
*
FROM
MY_TABLE
WHERE
PRIMARY_KEY = ...
FOR UPDATE;

Once I understood why or how the sessions were trying to insert duplicate primary key values, I would try to determine why the average number of seconds for the wait event is almost 3 seconds (maybe a timeout).

To do this, I might find a session that frequently waits on this event, and activate a 10046 trace at level 12, and find a way of capturing the contents of the V$LOCK table every 1/4 of a second or so (considering that the average wait is nearly 3 seconds, maybe checking once a second would be sufficient). The V$LOCK table will show the session that causes the first session to wait in this wait event, so I would also enable a 10046 trace at level 12 on the other session. From there, I would manually parse the 10046 trace files to determine what is happening. For instance, if I saw that one session inserted a row into a table, and then there was a 30 second wait on SQL*Net message from client and an equivalent 30 second wait by the second session followed by a rollback in the second session, that might tell me that something odd happened at the computer that inserted the row (maybe a message asking if the document printed OK on the first session's computer).
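A sketch of the monitoring setup described above (the SID and serial# values are placeholders, and DBMS_MONITOR is only one of several ways to enable event 10046):

```sql
-- Enable a 10046 trace at level 12 (waits + binds) for one session
EXEC DBMS_MONITOR.SESSION_TRACE_ENABLE(session_id => 208, serial_num => 1234, waits => TRUE, binds => TRUE);

-- Run repeatedly (e.g. once per second) to capture blocker/waiter pairs
SELECT b.sid blocker_sid, w.sid waiter_sid,
       b.type, b.id1, b.id2, b.lmode, w.request
FROM   v$lock b, v$lock w
WHERE  b.block = 1
AND    w.request > 0
AND    b.id1 = w.id1
AND    b.id2 = w.id2;

-- Disable the trace when done
EXEC DBMS_MONITOR.SESSION_TRACE_DISABLE(session_id => 208, serial_num => 1234);
```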

In short, I would try to find the cause, rather than trying to fight the symptoms (the ENQ: TX - ROW LOCK CONTENTION wait event). Find ways to decrease the time required for a transaction to complete.

Thank you so much. We're using a 3rd party tool that generates the SQL statements, with little control over it. But looking at the SQL generated by the tool:
SELECT ....
FROM DEV.POS_FOLIO t1
WHERE (t1.POS_FOLIO_ID = TO_NUMBER(:1))
FOR UPDATE OF t1.POS_FOLIO_ID

POS_FOLIO_ID is the table's primary key, but the optimizer won't
use the index on this primary key due to the TO_NUMBER function within the predicate.

In order to make this SELECT ... FOR UPDATE more selective I have to either

1. create a function-based index on POS_FOLIO_ID
or
2. remove the TO_NUMBER function from the predicate

In either case the CBO will efficiently use the index created on POS_FOLIO_ID and make the update more selective.
What do you think? Any ideas?

Before making any change, I would monitor the database to see what is causing the average 3 second wait in this wait event. Once you find a couple of sessions that are involved, as I mentioned previously, turn on a 10046 trace at level 12 for those sessions.

Manually review the trace files - you might just find that the session holding the lock (possibly for much longer than 3 seconds) is waiting for the user to click an OK button (you would see a long wait on SQL*Net message from client before the commit). In such a case, adding an extra index may just make the problem worse (additional index maintenance on insert, or possibly on update if the primary key is updated).

If you find that the session holding the lock is waiting on multiple db file scattered reads on the DEV.POS_FOLIO table for a rough total of 3 seconds, then adding the index that you suggest might be helpful. You might find that there is a missing foreign key index, resulting in a full table lock when its corresponding primary key value is changed (even changed to the same value). It could be that the session attempting to obtain the row lock on the DEV.POS_FOLIO table is having to wait for another full table lock to finish, as might be caused by a missing foreign key index (monitoring V$LOCK will show this). It might be a combination of problems like these, the details of which are lost in the summary of the top 5 wait events.
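As a starting point for the missing foreign key index possibility, a query along these lines can list foreign key columns with no supporting index (a simplified sketch that only checks single-column foreign keys against the leading index column):

```sql
-- Single-column foreign keys with no index on the FK column
SELECT c.table_name, c.constraint_name, cc.column_name
FROM   user_constraints  c,
       user_cons_columns cc
WHERE  c.constraint_type  = 'R'
AND    cc.constraint_name = c.constraint_name
AND    NOT EXISTS (
         SELECT 1
         FROM   user_ind_columns ic
         WHERE  ic.table_name      = cc.table_name
         AND    ic.column_name     = cc.column_name
         AND    ic.column_position = 1);
```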

Attempting to guess at a solution without monitoring the problem can make the problem worse.

He can be locking only one row in the table, no matter how long it takes him to find that row (using a full table scan or the index), because POS_FOLIO_ID is your primary key.
Therefore, if user 'B' starts waiting on an enqueue with the same SQL statement, wouldn't it be that he is also doing a SELECT ... TO_NUMBER('12345') FOR UPDATE?
Again, whether he uses an index or a full table scan, when he finds that row (it might take him 0.1 seconds or 10 seconds to get there) is when he will start waiting for the row. If user 'B' waits for 3 seconds, it would mean that user 'A' is doing something else for those 3 seconds.

You'd have to find out what that "something else" is that user 'A' is doing.

Taking another look at your suggested function based index, it depends on the data type of the DEV.POS_FOLIO.POS_FOLIO_ID column. If the column is defined as a NUMBER, and it is the primary key, there will already be a usable index on that column.

Yesterday, I wrote this: "Once I understood why or how the sessions were trying to insert duplicate primary key values, I would try to determine why the average number of seconds for the wait event is almost 3 seconds (maybe a timeout)."

After fixing the formatting of the top 5 wait events (total duration unknown):

I now have a test table with 1,000,000 rows. I start monitoring the changes in the wait events roughly every 60 seconds, and V$SESSION_WAIT and V$LOCK roughly 4 times per second.
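The setup for that test table was not reproduced above; a reconstruction that matches the statements that follow (column names C1 and C2, with C1 as the primary key) might look like:

```sql
CREATE TABLE t1 (
  c1 NUMBER PRIMARY KEY,
  c2 NUMBER);

-- Populate 1,000,000 rows with sequential key values
INSERT INTO t1
SELECT level, level
FROM   dual
CONNECT BY level <= 1000000;

COMMIT;
```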

Back in session 1:

UPDATE
T1
SET
C1=-C1
WHERE
C1<=100;

I have now modified the first 100 rows that were inserted into the table, time to make the pot of coffee.

In session 2, I try to insert a row with a primary key value of -10:

INSERT INTO T1 VALUES (
-10,
10);

Session 2 hangs.

Taking the third 60 second snap of the system-wide wait events as the zero point, and the 11th snap as the end point: there were 149 waits on ENQ: TX - ROW LOCK CONTENTION, 148 time-outs, and 446.62 seconds of total time in the wait event, with an average wait time of 2.997450 seconds.

Rolling down to the session level wait events, SID 208 (my session 2) had 149 waits on ENQ: TX - ROW LOCK CONTENTION, for a total time of 446.61 seconds, with an average wait time of 2.997383 seconds. All 149 waits and all of the wait time were in this one session, which was locked up for the full duration of this time period because session 1 was making a pot of coffee.

Rolling down to V$SESSION_WAIT (sampled roughly 4 times per second): At the start of the third time interval, SID 208 has been in the ENQ: TX - ROW LOCK CONTENTION wait event for 39 seconds and is actively waiting, trying to execute SQL with a hash value of 1001532423; the wait object is -1, wait file is 0, wait block is 0, wait row is 0, P1 is 1415053316, P2 is 196646, P3 is 4754.
At the end of the 11th time interval, SID 208 has been in the ENQ: TX - ROW LOCK CONTENTION wait event for 483 seconds and is actively waiting, trying to execute SQL with a hash value of 1001532423; the wait object is -1, wait file is 0, wait block is 0, wait row is 0, P1 is 1415053316, P2 is 196646, P3 is 4754.

Rolling down to V$LOCK (sampled roughly 4 times per second): I see that SID 214 (session 1) is blocking SID 208 (session 2). SID 214 has a TX lock in mode 6 with ID1 of 196646 and ID2 of 4754. SID 208 is requesting a TX lock in mode 4 with ID1 of 196646 and ID2 of 4754.
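The V$LOCK relationship described above can be pulled out with a query along these lines (a sketch):

```sql
-- TX enqueues only: the blocker shows BLOCK = 1 and LMODE = 6,
-- the waiter shows REQUEST = 4 on the same ID1/ID2 pair
SELECT sid, type, id1, id2, lmode, request, block
FROM   v$lock
WHERE  type = 'TX'
ORDER BY id1, id2, block DESC;
```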

So, it seems that I need a faster coffee pot rather than an additional index on my table. It could be that the above process would have found that the application associated with SID 214 was abandoned or crashed, and for some reason the lock was not released for a long period of time - a little less than 10.44 hours in your case.