Get a Kinect, record a depth video of yourself talking, and shrink-wrap the depth image to get a usable mesh. Generate a morph target per frame from the wrapped mesh to get the keyframe data you need. MoCap on the cheap, but good enough for lots of apps.
–
David Lively Jun 22 '12 at 16:43

@DavidLively Wow... another great use for the Kinect. I wouldn't have thought of this, nice.
–
Dalin Seivewright Jun 26 '12 at 6:41
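For the morph-target step of the Kinect workflow above, the idea is just to store each wrapped frame as a per-vertex offset from a neutral pose. A minimal sketch, assuming every depth frame has already been shrink-wrapped to a mesh sharing the neutral mesh's vertex order (all names here are illustrative):

```python
import numpy as np

def extract_morph_targets(neutral_verts, frame_verts_list):
    """Build per-frame morph targets as vertex deltas from a neutral pose.

    neutral_verts:    (N, 3) array, the wrapped mesh in a rest expression.
    frame_verts_list: list of (N, 3) arrays, one wrapped mesh per depth frame,
                      all in the neutral mesh's vertex order.
    Applying `neutral + delta` reproduces each captured frame.
    """
    neutral = np.asarray(neutral_verts, dtype=np.float64)
    return [np.asarray(f, dtype=np.float64) - neutral for f in frame_verts_list]
```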

I would say all AAA titles use motion-captured (mocap) actors for cutscenes. Mocap sessions are expensive, but cheaper than having an army of artists create facial animations by hand. Mocap also has the advantage of looking a lot more life-like, depending on the number of control points on the actor's face.

If, however, you do not work for an AAA studio and you want some simple facial animation for your characters, look into morph targets. Create a bunch of facial expressions (one for each consonant, for instance) and blend them dynamically based on the lines of dialog.
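As a concrete illustration, a morph-target blend is just a per-vertex linear combination of offsets on top of a base mesh. A minimal numpy sketch (the shape names in the usage comment are made up):

```python
import numpy as np

def blend_morph_targets(base_verts, targets, weights):
    """Classic morph-target blend: base + sum_i w_i * (target_i - base).

    base_verts: (N, 3) neutral face mesh.
    targets:    list of (N, 3) expression meshes, same vertex order as base.
    weights:    one blend weight per target, typically in [0, 1].
    """
    base = np.asarray(base_verts, dtype=np.float64)
    result = base.copy()
    for target, w in zip(targets, weights):
        result += w * (np.asarray(target, dtype=np.float64) - base)
    return result

# e.g. 70% of an "ah" mouth shape layered with 30% of a smile:
# verts = blend_morph_targets(neutral, [ah_shape, smile], [0.7, 0.3])
```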

However, studios use mocap to varying extents. Some use it for the entire dialogue (like L.A. Noire), whereas most will just capture a few syllables and have a text interpreter mimic the rest of the dialog. But I would say not all use mocap.
–
CobaltHex Jun 22 '12 at 8:44

In fact, L.A. Noire was notable particularly for using facial mocap everywhere; I doubt any other game has done it. It's crazy expensive. (Quantic Dream is apparently going to be the next to do it.)
–
user744 Jun 22 '12 at 14:08

Well, faces are generally scanned in and then mapped to heads for "realistic" faces; the facial expressions can then either be mocapped or done by hand (which is not as pretty, but cheaper). Both take a fair amount of time, however, and mocap costs a lot of money. Facial animation, I think, generally uses more precise vertex-based animation, but you could probably rig the lips to bones if you don't care about being super accurate (a skinning sketch follows below). There are some professional packages for face modeling; try googling around for your favorite 3D editor. Also, game engines sometimes ship their own facial tools, for example Face Poser in Source games.
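For the bone-rigged alternative mentioned above, lip vertices would be deformed with standard linear blend skinning rather than per-vertex morphs. A rough sketch of the math, assuming the skinning matrices already include the inverse bind pose:

```python
import numpy as np

def skin_vertices(verts, bone_matrices, weights):
    """Linear blend skinning: v' = sum_b w[v, b] * (M_b @ v).

    verts:         (N, 3) rest-pose vertex positions (e.g. around the lips).
    bone_matrices: (B, 4, 4) skinning transforms for the jaw/lip bones
                   (current pose times inverse bind pose).
    weights:       (N, B) skinning weights, each row summing to 1.
    """
    homo = np.concatenate([verts, np.ones((len(verts), 1))], axis=1)  # (N, 4)
    per_bone = np.einsum('bij,nj->nbi', bone_matrices, homo)  # each vertex through each bone
    blended = np.einsum('nb,nbi->ni', weights, per_bone)      # weighted mix per vertex
    return blended[:, :3]
```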

Since most of the answers so far have focused on animation, I'd like to add as a side note that the shading model and materials used to render the face also have a large impact on its realism.

In particular, realistic skin rendering is usually implemented with a subsurface scattering shader; otherwise it tends to look artificial. Reflectance matters too, as skin is slightly glossy and reflects a certain amount of light. Eye and hair rendering are also important, and both subjects can get extremely complex when pursuing realism.
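To give a flavor of the cheap end of this, the wrap-lighting trick from GPU Gems ("Real-Time Approximations to Subsurface Scattering") softens the diffuse falloff so light appears to bleed past the terminator. A minimal sketch of the per-pixel math in Python, though a real implementation would live in a pixel shader:

```python
import numpy as np

def wrap_diffuse(normal, light_dir, wrap=0.5):
    """Wrap lighting, a cheap subsurface-scattering approximation:
    diffuse = max(0, (N.L + wrap) / (1 + wrap)).

    wrap = 0 is plain Lambertian shading; larger values let light wrap
    around the terminator, softening the hard, concrete-like falloff.
    """
    n = np.asarray(normal, dtype=np.float64)
    l = np.asarray(light_dir, dtype=np.float64)
    n = n / np.linalg.norm(n)
    l = l / np.linalg.norm(l)
    return max(0.0, (float(np.dot(n, l)) + wrap) / (1.0 + wrap))
```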

If possible, use a game engine that already supports these rendering techniques!

I'll just leave a few resources on these subjects below for those who are interested, taken either from the CryEngine 3 documentation (amazing stuff) or from the free GPU Gems books.

Skin

Skin's main look is the result of subsurface scattering (light bouncing inside the surface, exiting in a different location than where it entered). Without subsurface scattering, skin appears hard like concrete. It's the most important visual cue for the look of skin.

Facial expressions and mouth movements usually go through facial animation middleware of some sort. FaceFX is one example (Mass Effect used it); incidentally, I used to work in the very same building as some of the guys from OC3 Entertainment, who put it out. I believe in many if not most cases this middleware processes the voice file to generate most of the mouth animation for lip sync, perhaps with cleanup by a human afterward. The expressions are most likely scripted, and perhaps blended with the mouth animations at render time (to avoid storing a million distinct animations on disk). I'm sure you can learn more on the FaceFX website.
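To illustrate just the audio-driven part in the crudest possible way: track the loudness envelope of the voice file and drive a jaw-open morph weight with it. This is emphatically not how FaceFX works internally, just a toy sketch of the general idea, using the standard library's wave module:

```python
import wave
import numpy as np

def mouth_open_curve(wav_path, fps=30):
    """Map a mono 16-bit WAV's loudness to per-frame jaw-open weights in [0, 1]."""
    with wave.open(wav_path, 'rb') as w:
        rate = w.getframerate()
        raw = w.readframes(w.getnframes())
    samples = np.frombuffer(raw, dtype=np.int16).astype(np.float64) / 32768.0
    window = rate // fps                       # audio samples per animation frame
    n_frames = len(samples) // window
    rms = np.array([np.sqrt(np.mean(samples[i * window:(i + 1) * window] ** 2))
                    for i in range(n_frames)])
    return rms / max(rms.max(), 1e-6)          # normalized, one weight per frame
```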

If you want to get similar results yourself, you're probably in for an uphill climb. It's typically not something you do by hand if you have a large amount of animating to do (as BioWare did with the Mass Effect games). However, if you're closer to Call of Duty or some such, where you have only a few minutes of dialogue that needs very realistic movement, hand animating might be cheaper or give a better result. Not sure.

There are tools these days for producing facial animations based on a few basic emotions and phonemes for spoken words. By blending from one to the next (e.g. using morph targets, as knight666 has said) you can create the appearance of fairly realistic conversation. (Sadly, I'm a programmer rather than an artist, so I forget the name of the plugin I saw that does this.)
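The blending itself can be as simple as interpolating weight sets between successive keyed shapes on a timeline. A sketch under that assumption (the viseme names and timings in the usage comment are invented):

```python
def weights_at(time, keys):
    """Linearly interpolate morph weights between timeline keys.

    keys: list of (time_seconds, {shape_name: weight}) pairs,
          sorted by strictly increasing time.
    Returns the blended weight dict at `time`.
    """
    if time <= keys[0][0]:
        return dict(keys[0][1])
    for (t0, w0), (t1, w1) in zip(keys, keys[1:]):
        if t0 <= time <= t1:
            a = (time - t0) / (t1 - t0)
            names = set(w0) | set(w1)
            return {n: (1 - a) * w0.get(n, 0.0) + a * w1.get(n, 0.0)
                    for n in names}
    return dict(keys[-1][1])

# e.g. easing from an "ee" viseme into an "oh" viseme over a tenth of a second:
# weights_at(0.05, [(0.0, {"ee": 1.0}), (0.1, {"oh": 1.0})])
```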

I doubt most games use motion-captured faces except perhaps for very important scenes and extreme close-ups. Mocap is expensive and time-consuming, and capturing every line of speech is likely to be impossible. Certainly you can see that most games just attempt to morph the face based on the audio. A studio may perform motion capture to collect the initial phonemic and emotion shapes for the face, but these are common across most cultures, so it makes more sense to just use pre-captured ones in a standard tool.