
Analyzing the tables corrects the problem

I have seen a strange thing happening with the application. The application is designed with certain database jobs that include analyzing the tables and indexes on a regular basis. I have observed that certain tables and indexes do not get analyzed daily.
When this happens, the application hangs and some queries simply do not work, ending up logging off the application user. When I manually analyze the tables and indexes using "alter table/index ... compute statistics", some queries start working fine while others still do not.
The whole problem is corrected when we restart both the database and the application. The entire system then works fine for a few days before the problem starts recurring again.

I think there is more than one issue in the described scenario - recycling the database and application servers has nothing to do with performance statistics.

Regarding performance statistics:

I assume that when you say "alter table compute statistics" you actually mean "analyze table ... compute statistics". In any case, that syntax is deprecated; since you are on Oracle 10g, use the proper DBMS_STATS system package to gather optimizer statistics.
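A minimal sketch of the DBMS_STATS call being recommended (the schema name APPUSER and table name MY_TABLE are placeholders, not from the thread):

```sql
-- Gather statistics for one table and its indexes (Oracle 10g).
-- APPUSER and MY_TABLE are placeholder names.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APPUSER',
    tabname          => 'MY_TABLE',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    cascade          => TRUE   -- also gather stats on the table's indexes
  );
END;
/
```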

Having said that, contrary to popular belief you usually do not need to keep gathering statistics all the time. The only exception would be tables/indexes with very high volatility.

A good practice is to gather stats until you get optimal performance, then stop gathering and stick with the set that gives you consistently good performance.
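If you decide to freeze a known-good set of statistics this way, 10g lets you lock them so scheduled jobs cannot overwrite them. A sketch (APPUSER is a placeholder schema name):

```sql
-- Lock statistics for a whole schema once performance is acceptable.
EXEC DBMS_STATS.LOCK_SCHEMA_STATS('APPUSER');

-- Or lock a single volatile table's statistics:
EXEC DBMS_STATS.LOCK_TABLE_STATS('APPUSER', 'MY_TABLE');

-- Unlock later if you ever need to re-gather:
EXEC DBMS_STATS.UNLOCK_SCHEMA_STATS('APPUSER');
```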

Disclaimer: Advice is provided to the best of my knowledge, but no implicit or explicit warranties are given. Since the advisor explicitly encourages testing any and all suggestions in a non-production test environment, the advisor should not be held liable or responsible for any actions taken based on this advice.

I appreciate your suggestions, but can you comment on why restarting the application and database solves it for a few days? I am not very familiar with Solaris; I suspect I need to look into some OS-level problem caused by Oracle.


Did you check the top wait events? Is a latch wait among the top wait events? If so, restarting Oracle clears out all the fragmented memory, which is why queries seem to work faster afterwards.
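To check whether latch waits dominate, something like this against V$SYSTEM_EVENT gives a quick instance-wide view since startup (a sketch, assuming SELECT privilege on the V$ views and 10g's WAIT_CLASS column):

```sql
-- Top 10 non-idle wait events since instance startup.
SELECT *
FROM (
  SELECT event,
         total_waits,
         time_waited          -- centiseconds on 10g
  FROM   v$system_event
  WHERE  wait_class <> 'Idle'
  ORDER  BY time_waited DESC
)
WHERE ROWNUM <= 10;
```

If latch-related events sit near the top shortly before a hang but not after a restart, that supports the fragmentation theory.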

BTW, use only DBMS_STATS (e.g. GATHER_TABLE_STATS / GATHER_SCHEMA_STATS) for gathering statistics, and do not analyze tables and indexes daily. Instead, make it an incremental job: only when some percentage of a table's records has been modified or added should that table/index be re-analyzed.
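The "only analyze what changed" behaviour described here corresponds to the GATHER STALE option, which relies on table monitoring (enabled by default in 10g) and a roughly 10% change threshold. A sketch (APPUSER is a placeholder):

```sql
-- Re-gather statistics only for objects whose data changed enough
-- since the last gather to be flagged as stale (~10% of rows).
BEGIN
  DBMS_STATS.GATHER_SCHEMA_STATS(
    ownname => 'APPUSER',        -- placeholder schema name
    options => 'GATHER STALE',
    cascade => TRUE
  );
END;
/
```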

I found a particular query taking very long. It was identified by analyzing the application logs, which pointed to a query that takes so long to execute that it eventually logs out the application user without the query ever completing.

I executed the query in a standalone SQL*Plus client with autotrace on. Please see the attached file 'problematic_query.txt'.
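For reference, the SQL*Plus session for such a capture typically looks like this (the actual query is in the attachment, so only a placeholder is shown):

```sql
-- SET AUTOTRACE requires the PLUSTRACE role
-- (created by $ORACLE_HOME/sqlplus/admin/plustrce.sql).
SET AUTOTRACE ON       -- show execution plan and statistics after each query
SET TIMING ON          -- show elapsed time

-- Placeholder for the problematic query from the attachment:
SELECT ...;

SET AUTOTRACE OFF
```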

I have also taken an AWR report for the affected time period. The problematic query can be verified there, as its SQL Module column in "SQL ordered by Elapsed Time" is 'SQL*Plus'.

I bet that by shutting down and starting up the affected instance - without recycling the actual host - the problem "gets solved" for a few days. Is that correct?

If that's the case, I'd suggest running Statspack or AWR and looking at what changes over time... the shared pool is always a good initial suspect for this kind of behaviour.
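To watch the shared pool between restarts without waiting for the next hang, a query like this against V$SGASTAT can be sampled periodically (a sketch, assuming access to the V$ views):

```sql
-- Free memory currently in the shared pool. A steady downward trend
-- between restarts would support the shared-pool fragmentation theory.
SELECT pool,
       name,
       ROUND(bytes / 1024 / 1024, 1) AS mb
FROM   v$sgastat
WHERE  pool = 'shared pool'
AND    name = 'free memory';
```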

Yes, you are correct: restarting solves the problem. But the shared pool looks good in terms of hit ratio, and SGA_TARGET is on.
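With SGA_TARGET set, the sizes that automatic shared memory management has actually assigned to each component can be checked directly, which is more telling than the hit ratio alone. A sketch:

```sql
-- Current sizes of the automatically managed SGA components.
SELECT component,
       ROUND(current_size / 1024 / 1024) AS current_mb,
       ROUND(min_size     / 1024 / 1024) AS min_mb
FROM   v$sga_dynamic_components
ORDER  BY current_size DESC;
```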

Please see my previous posting and have a look at the AWR report. Maybe you can get me out of this situation. One more thing: I cannot tune the SQL statement itself, only rebuild the indexes, because application development stopped three years ago and we are only supporting the application now.

First, I would advise an AWR report of at most 30 minutes' duration, not 3 hours.
How many CPU cores do you have there?
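A shorter AWR window can be produced on demand by taking manual snapshots around the problem query (requires the Diagnostics Pack licence). A sketch:

```sql
-- Take a snapshot just before running the problem query...
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;

-- ...run the problem query here...

-- ...then take another snapshot immediately afterwards.
EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;

-- Finally generate the report between the two snapshot IDs:
-- @?/rdbms/admin/awrrpt.sql
```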

Second, the top SQL you picked is correct. However, if you look at "Segments by Physical Reads" and "Segments by Logical Reads", the top contender's name does not appear in the top SQL report at all. How did this happen? I believe the SQL did not finish within the 3-hour period, and that is why it was not captured.

Can you find the top SQL, maybe from V$SESSION_LONGOPS or somewhere else, where the table name is MAM_ASSET_ATTR_STRING_VALUES or MAM_ASSET_ATTR_DOMAIN_VALUES? You can see they are the top contenders.
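Long-running operations show up in V$SESSION_LONGOPS; a sketch of the kind of query that would reveal whether those two tables are being scanned:

```sql
-- Operations still in progress, with the object being worked on.
SELECT sid,
       opname,
       target,                                    -- object, e.g. OWNER.TABLE
       sofar,
       totalwork,
       ROUND(sofar / totalwork * 100, 1) AS pct_done
FROM   v$session_longops
WHERE  totalwork > 0
AND    sofar < totalwork
ORDER  BY pct_done;
```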

Last but not least, the "Buffer Pool Advisory" says that if the buffer cache is increased from 3,840 MB to 4,992 MB, physical reads could be reduced from 17,217,394 to 477,260. Since your top wait event is "db file sequential read", this should benefit you.
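The numbers quoted from the Buffer Pool Advisory can also be read directly from V$DB_CACHE_ADVICE. A sketch (the 8 KB block size is an assumption):

```sql
-- Predicted physical reads at each candidate buffer cache size.
SELECT size_for_estimate AS cache_mb,
       estd_physical_read_factor,
       estd_physical_reads
FROM   v$db_cache_advice
WHERE  name = 'DEFAULT'
AND    block_size = 8192          -- assumes an 8 KB block size
ORDER  BY size_for_estimate;
```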

Can you also attach CPU usage info for the period the AWR report covers? I suspect there was high I/O usage during that period.

1. This 3-hour AWR duration is a non-peak period. The query itself takes more than 30 minutes to execute. For a 45-minute window I will have to get remote connectivity to the client again; I will check whether that is possible.

2. No, I took the AWR report after the query finished executing in SQL*Plus.

3. The query I mentioned in the attached file 'problematic_query.txt' already refers to the tables MAM_ASSET_ATTR_STRING_VALUES and MAM_ASSET_ATTR_DOMAIN_VALUES.

4. Yes, I will see if it is possible to increase SGA_TARGET.

5. CPU core count, CPU usage and SAN details are unknown as of now. I will try to capture them, but it won't be of much help, as the same query was running fine a few days back and has worked fine for the last 5 years. I know that data growth patterns have to be taken into account, but it still works fine after a restart.

6. I don't know much about latches. Did you find anything about latches in the AWR report that might explain the problem?