The approach I have seen work with this much code is to
start with statistical tools that estimate the number
of defects in each subsystem. Test and fix a few
sections of the code to check whether the estimated
defects are really there, then refine your estimates
using the results of those fixes.
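One such statistical tool (an assumption on my part, since you may use a different one) is capture-recapture estimation: have two independent reviews or test passes examine the same code, and use the overlap between their findings to estimate the total defect population. The function name and numbers below are mine, purely for illustration:

```python
def estimate_total_defects(found_by_a, found_by_b, found_by_both):
    """Lincoln-Petersen capture-recapture estimator: two independent
    inspections of the same subsystem give an estimate of the total
    number of defects, found and unfound."""
    if found_by_both == 0:
        raise ValueError("need at least one defect found by both inspections")
    return (found_by_a * found_by_b) / found_by_both

# Example: inspection A finds 20 defects, inspection B finds 25,
# and 10 of those defects were found by both.
print(estimate_total_defects(20, 25, 10))  # 50.0 estimated total
```

Comparing the estimate against defects actually found during the fix pass is how you refine it over time.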

Set a reasonable quality goal expressed in the same
measures as your estimates. Then estimate the effort
needed to reach that goal. Share your estimates with
others to check that you are on the right track,
agree on a plan to improve the quality, and execute it.

Testing this much code requires a good build system
that automates compilation and the running of the
tests. Without this infrastructure, the other test
tactics that you have listed will be difficult to
implement in a reasonable way. A daily build and
smoke test is a great way to track the system's
quality over time.
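The daily build and smoke test can be driven by a very small script. This is only a sketch under assumed names: `run_pipeline` and the step commands are placeholders, and the stand-in steps should be replaced by your project's real build and smoke-test commands.

```python
import subprocess
import sys

def run_pipeline(steps):
    """Run each (name, command) step in order; stop at the first failure."""
    for name, cmd in steps:
        if subprocess.run(cmd).returncode != 0:
            print(f"FAILED at step: {name}")
            return False
    print("build and smoke test passed")
    return True

# Stand-in steps so the sketch runs anywhere; substitute your real
# commands, e.g. ["make", "all"] and ["make", "smoke-test"].
nightly_steps = [
    ("build", [sys.executable, "-c", "print('compiling...')"]),
    ("smoke", [sys.executable, "-c", "print('smoke tests ok')"]),
]

run_pipeline(nightly_steps)
```

Scheduling this from cron (or your CI system) and mailing the result to the team gives you the daily quality signal.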

Version control also needs to be working. If code
changes are not measured, it will be difficult to
estimate the code quality and to focus the effort.
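One cheap measurement the version history gives you is per-file change frequency ("churn"). The sketch below assumes a Git repository; if you use another system, the equivalent log command differs, but the idea is the same. The function names are mine:

```python
import subprocess
from collections import Counter

def parse_churn(log_text):
    """Count how often each file path appears in name-only log output."""
    return Counter(line.strip() for line in log_text.splitlines() if line.strip())

def churn_by_file(since="1 year ago"):
    """Tally per-file change counts for the Git repo in the current directory."""
    log = subprocess.run(
        ["git", "log", f"--since={since}", "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_churn(log)

# Usage (inside a repo): the most frequently changed files are
# candidates for focused testing and quality estimation.
# for path, count in churn_by_file().most_common(10):
#     print(count, path)
```

Files that change constantly and also show up in your defect estimates deserve attention first.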

In a large system like this, once the build is
automated, the versions are controlled, and the
quality estimates are done, there are usually one
or two subsystems that obviously have the most problems.
These are your hot spots. Much of the time the
hot spots are already known, but sometimes there are
surprises, such as when a problematic low-level
data store causes what appear to be UI problems.
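Once the estimates exist, finding the hot spots is just a ranking by defect density. All the numbers and subsystem names below are invented for the example:

```python
def hot_spots(subsystems, top=2):
    """subsystems: dict of name -> (estimated_defects, kloc).
    Returns the names with the highest estimated defects per KLOC."""
    density = {name: defects / kloc
               for name, (defects, kloc) in subsystems.items()}
    return sorted(density, key=density.get, reverse=True)[:top]

example = {
    "ui":         (40, 80),
    "data_store": (90, 60),  # a low-level hot spot that surfaces as UI bugs
    "reports":    (15, 50),
}
print(hot_spots(example))  # ['data_store', 'ui']
```

Note how the data store ranks above the UI here even though users report the symptoms as UI problems.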

I like the various books by Steve McConnell, which
touch on this topic. There are other worthwhile
books that I'm sure others will recommend.