Let's assume that you're invited to my organisation to look into my platform, which has thousands of databases if not less, and you are requested to check various aspects of these databases to tell me where I am going wrong w.r.t administration, health checks, backups, etc.

What would be the points that you'd cover during such an exercise?
E.g., how are the backups being taken? Which parameters should be monitored day by day? Script(s) to identify bad queries? Etc.

I would prepare a list of the databases/DB servers in which my examination found any of the top ten mistakes, and then suggest that the admin eliminate them. For the Top Ten Mistakes Found in Oracle Systems, see here:
http://docs.oracle.com/cd/B19306_01/server.102/b14211/technique.htm#i11221

I am not sure that thousands of DBs can be checked just like that for obvious issues, but a quick look at EM would reveal a lot, especially about performance, policy violations, etc.

Let's assume that you're invited to my organisation to look into my platform, which has thousands of databases if not less, and you are requested to check various aspects of these databases to tell me where I am going wrong w.r.t administration, health checks, backups, etc.

What would be the points that you'd cover during such an exercise?

I would ask for a better problem description and for the goals of the exercise. Who expects what output from the exercise, for which reason, and to achieve what?

E.g., how are the backups being taken? Which parameters should be monitored day by day? Script(s) to identify bad queries? Etc.

A bad query can be a single simplistic SQL statement that executes in 1s and is executed a million times per day, when that SQL can and should execute in less than 0.5s if the correct index is used.

A good query can be a single complex SQL statement that spans 3 pages of A4 printout, is executed twice a day, has an execution plan that makes your head hurt when looking at it, and takes 5 hours to execute. It is good because it does incredibly complex processing to produce invaluable, critical business data with an optimally designed SQL (using the best indexes, the most appropriate join algorithms, etc).
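To put numbers on the two examples above, here is a small back-of-the-envelope sketch. It uses only the figures quoted in this post (1s vs 0.5s at a million executions per day, and 5 hours twice a day); the function and variable names are made up for illustration.

```python
def daily_seconds(elapsed_per_exec_s: float, execs_per_day: int) -> float:
    """Total elapsed time a statement accumulates per day, in seconds."""
    return elapsed_per_exec_s * execs_per_day


# The "simple" statement: 1s per execution, a million executions per day.
simple_now = daily_seconds(1.0, 1_000_000)
# The same statement if the correct index brought it down to 0.5s.
simple_tuned = daily_seconds(0.5, 1_000_000)
# The "complex" statement: 5 hours per execution, twice a day.
complex_report = daily_seconds(5 * 3600, 2)

print(f"simple SQL, as is:  {simple_now:>12,.0f} s/day")
print(f"simple SQL, tuned:  {simple_tuned:>12,.0f} s/day")
print(f"complex SQL:        {complex_report:>12,.0f} s/day")
print(f"time wasted by the simple SQL: {simple_now - simple_tuned:,.0f} s/day")
```

The simple statement accumulates 1,000,000 seconds per day (more than the 86,400 seconds in a day, so it must be spread across concurrent sessions), against 36,000 seconds for the 5-hour statement. A script that only ranks statements by elapsed time per execution would flag the wrong one.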

Thinking that one can simply run a script and identify an issue like a "bad SQL" is dangerously naive. As is the concept that management typically has of "database health"...

The purpose of this exercise is to come up with a definitive list of actions/areas that a DBA would look at in an environment to decide whether everything is in place as it should be. Of course, the list would change depending on the applications running on top of the databases and how they are being run.

As I said in my original post, I would kindly request you all to imagine yourself in a situation wherein "you" have visited a client-site environment, and you are requested to "investigate" the platform as a DBA in every respect possible, and finally "_provide improvement suggestions_" to the client.

Thanks for your note. That's one of the points I would include in my list, i.e., to install OEMGC to monitor all databases from one OEM.

Do you foresee any performance issues because of OEM agents running on such a large number of databases?

There can be many cases in which the client is not ready to "pay" Oracle for OEMGC licensing. In that case, could we create our own tool, whose agents/sub-scripts would run in each of the databases at a fixed interval (let's say every 10 minutes) and report any issues to a central server from where we can monitor all of them at once? Note: The output of these scripts can be displayed in a web portal. [A simple Apache installation would do in this case.]
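A rough sketch of the agent/central-server idea, to make the question concrete. Everything here is hypothetical: the names `run_checks` and `Collector`, the database names, and the example checks are invented for illustration, no real Oracle calls are made, and the transport (e.g. HTTP to the central server) and the 10-minute scheduling (e.g. cron) are left out.

```python
import json
import time
from typing import Callable, Dict, List, Optional

# A check returns a description of the issue, or None if all is well.
Check = Callable[[], Optional[str]]


def run_checks(db_name: str, checks: Dict[str, Check]) -> str:
    """Agent side: run every check for one database, return a JSON report."""
    issues = {}
    for name, check in checks.items():
        try:
            result = check()
        except Exception as exc:  # a broken check is itself worth reporting
            result = f"check failed: {exc}"
        if result is not None:
            issues[name] = result
    return json.dumps({"db": db_name, "ts": time.time(), "issues": issues})


class Collector:
    """Central side: keep the latest report received from each database."""

    def __init__(self) -> None:
        self.latest: Dict[str, dict] = {}

    def receive(self, payload: str) -> None:
        report = json.loads(payload)
        self.latest[report["db"]] = report

    def databases_with_issues(self) -> List[str]:
        return sorted(db for db, r in self.latest.items() if r["issues"])


# Usage with two made-up databases and a made-up backup check.
collector = Collector()
collector.receive(run_checks("ORCL1", {"backup_age": lambda: None}))
collector.receive(
    run_checks("ORCL2", {"backup_age": lambda: "last backup older than 24h"})
)
print(collector.databases_with_issues())
```

The web portal would then only need to render what the collector holds. What each check should actually test, and against which threshold, is the hard part and is not addressed by this sketch.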

I think none, because this seems to me a complete list, and much more, provided by Oracle itself. I shall wait with you to see the responses from Aman and Billy, who are without doubt far more knowledgeable and experienced.

As I said in my original post, I would kindly request you all to imagine yourself in a situation wherein "you" have visited a client-site environment, and you are requested to "investigate" the platform as a DBA in every respect possible, and finally "_provide improvement suggestions_" to the client.

You do that by starting to ask management questions. What are the real and perceived problems? What is the architecture? What are the policies in terms of security? Auditing? Vendor support and maintenance? What are the critical business requirements that the platform and database need to address? Etc.

I would not start out by running "scripts" pretending to be an all-knowing Oracle database guru.

I would start by applying fundamental software engineering principles: defining the user requirements and expectations, and defining the real or perceived problems, before doing anything else.

Another comment as to why approaching this issue with "running scripts" is flawed.

Metrics alone are useless. I/O calls per second, CPU utilisation, memory used, network bandwidth consumed, transactions per second, number of users/sessions: all of these are MEANINGLESS on their own.

Unless you have a baseline to compare those metrics against. A metric of 100 IOPS alone does not tell you anything. It is not possible to say whether this is good/bad/wrong/a performance problem/whatever if you cannot compare it against some baseline.

Running scripts will give you primarily metrics. And those alone are meaningless when you want to determine whether there are problems with the performance or architecture or database or application.
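To illustrate the baseline point, here is a minimal sketch that scores the same reading of 100 IOPS against two different baselines. The baseline samples are invented for the example, and using a z-score (distance from the baseline mean in standard deviations) is just one simple way to make the comparison, not a recommendation for any particular tool.

```python
import math
from statistics import mean, stdev


def deviation_from_baseline(value: float, baseline: list) -> float:
    """How many standard deviations `value` lies from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation, needs >= 2 samples
    if sigma == 0:
        # A perfectly flat baseline: any difference is infinitely unusual.
        return 0.0 if value == mu else math.copysign(math.inf, value - mu)
    return (value - mu) / sigma


reading = 100  # the same "100 IOPS" measured on two different systems

# Hypothetical historical IOPS samples for each system.
quiet_system = [20, 25, 22, 18, 24]
busy_system = [95, 105, 100, 98, 102]

z_quiet = deviation_from_baseline(reading, quiet_system)
z_busy = deviation_from_baseline(reading, busy_system)

print(f"quiet system: {z_quiet:.1f} standard deviations from its baseline")
print(f"busy system:  {z_busy:.1f} standard deviations from its baseline")
```

On the system that normally runs at around 20 IOPS, a reading of 100 is far outside its history and worth investigating; on the system that normally runs at around 100 IOPS, it is exactly average. The number is identical in both cases, and only the baseline gives it meaning.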