Face Detection Concepts Overview

Face detection is the process of automatically locating human faces in visual
media (digital images or video). A face that is detected is reported at a
position with an associated size and orientation. Once a face is detected, it
can be searched for landmarks such as the eyes and nose.

Here are some of the terms that we use in discussing the face detection feature
of ML Kit:

Face tracking extends face detection to video sequences. Any face
appearing in a video for any length of time can be tracked. That is, faces
that are detected in consecutive video frames can be identified as being the
same person. Note that this is not a form of face recognition; this mechanism
just makes inferences based on the position and motion of the faces in a
video sequence.

A landmark is a point of interest within a face. The left eye, right eye,
and base of the nose are all examples of landmarks. ML Kit provides the
ability to find landmarks on a detected face.

A contour is a set of points that follow the shape of a facial feature.
ML Kit provides the ability to find the contours of a face.

Classification is determining whether a certain facial characteristic is
present. For example, a face can be classified with regard to whether its
eyes are open or closed, or whether it is smiling.
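To make the face tracking idea above concrete, here is a minimal Python sketch that carries identifiers from one frame to the next using only box positions, with no recognition involved. It is not ML Kit's algorithm or API: the box format `(left, top, right, bottom)`, the function names `iou` and `assign_tracking_ids`, the greedy matching, and the `min_iou` threshold of 0.3 are all assumptions chosen for illustration.

```python
from itertools import count


def iou(a, b):
    """Intersection-over-union of two boxes given as (left, top, right, bottom)."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1])
             - intersection)
    return intersection / union if union > 0 else 0.0


def assign_tracking_ids(previous, boxes, next_id, min_iou=0.3):
    """Greedily match each box in the current frame to the best-overlapping
    box from the previous frame; unmatched boxes get a fresh identifier.

    previous: dict mapping tracking id -> box from the previous frame.
    boxes:    list of boxes detected in the current frame.
    next_id:  iterator that yields unused identifiers.
    """
    unused = dict(previous)
    current = {}
    for box in boxes:
        best_id = max(unused, key=lambda i: iou(unused[i], box), default=None)
        if best_id is not None and iou(unused[best_id], box) >= min_iou:
            current[best_id] = box      # same face, slightly moved
            del unused[best_id]
        else:
            current[next(next_id)] = box  # a face not seen in the previous frame
    return current


# Two faces that move a few pixels and are reported in a different order.
ids = count(1)
frame1 = assign_tracking_ids({}, [(10, 10, 50, 50), (100, 10, 140, 50)], ids)
frame2 = assign_tracking_ids(frame1, [(102, 12, 142, 52), (12, 11, 52, 51)], ids)
```

In this sketch each face keeps its identifier across the two frames purely because its box barely moved, which is the kind of position-based inference the definition describes.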

Face Orientation

The following terms describe the angle a face is oriented with respect to the
camera:

Euler X: A face with a positive Euler X angle is facing upward.

Euler Y: A face with a positive Euler Y angle is turned to the camera's
right, that is, to the face's own left.

Euler Z: A face with a positive Euler Z angle is rotated counter-clockwise
relative to the camera.

ML Kit always reports the Euler Z angle of a detected face. The Euler Y angle
is available only when using the "accurate" mode setting of the face detector
(as opposed to the "fast" mode setting, which takes some shortcuts to make
detection faster). The Euler X angle is not supported.
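The sign conventions above can be captured in a small helper. This is an illustrative Python sketch rather than ML Kit code: the function name `describe_orientation` and the use of `None` for an Euler Y angle that was not reported (as in "fast" mode) are assumptions, and treating negative angles as the mirror image of the positive case is an inference from the definitions, not something stated above.

```python
def describe_orientation(euler_z, euler_y=None):
    """Describe a face pose from its Euler angles, given in degrees.

    euler_y may be None, modelling a detector run that does not report it.
    """
    description = []

    if euler_z > 0:
        description.append("rotated counter-clockwise")
    elif euler_z < 0:
        description.append("rotated clockwise")  # assumed mirror of the positive case
    else:
        description.append("upright")

    if euler_y is None:
        description.append("Euler Y not reported")
    elif euler_y > 0:
        description.append("turned to the camera's right")
    elif euler_y < 0:
        description.append("turned to the camera's left")  # assumed mirror
    else:
        description.append("facing the camera")

    return description


accurate_mode_pose = describe_orientation(euler_z=10.0, euler_y=-20.0)
fast_mode_pose = describe_orientation(euler_z=-5.0)
```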

Landmarks

A landmark is a point of interest within a face. The left eye, right eye, and
nose base are all examples of landmarks.

Rather than first detecting landmarks and using them as a basis for
detecting the whole face, ML Kit detects the whole face independently of
detailed landmark information. For this reason, landmark detection is an
optional step that is not enabled by default.

The following table lists the landmarks that can be detected for a given
face Euler Y angle:

Euler Y angle                Detectable landmarks
---------------------------  ------------------------------------------------
< -36 degrees                left eye, left mouth, left ear, nose base,
                             left cheek
-36 degrees to -12 degrees   left mouth, nose base, bottom mouth, right eye,
                             left eye, left cheek, left ear tip
-12 degrees to 12 degrees    right eye, left eye, nose base, left cheek,
                             right cheek, left mouth, right mouth,
                             bottom mouth
12 degrees to 36 degrees     right mouth, nose base, bottom mouth, left eye,
                             right eye, right cheek, right ear tip
> 36 degrees                 right eye, right mouth, right ear, nose base,
                             right cheek

Each detected landmark includes its associated position in the image.
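The landmark table can be expressed as a lookup keyed on the Euler Y angle. The Python sketch below is only an illustration: the function name `detectable_landmarks` is hypothetical, and because the table does not say which row the boundary values of exactly -12 and 12 degrees belong to, the code assigns both to the frontal range as an assumption.

```python
# Landmark sets copied from the table, ordered from most negative Euler Y
# to most positive.
FAR_LEFT = ["left eye", "left mouth", "left ear", "nose base", "left cheek"]
LEFT = ["left mouth", "nose base", "bottom mouth", "right eye", "left eye",
        "left cheek", "left ear tip"]
FRONTAL = ["right eye", "left eye", "nose base", "left cheek", "right cheek",
           "left mouth", "right mouth", "bottom mouth"]
RIGHT = ["right mouth", "nose base", "bottom mouth", "left eye", "right eye",
         "right cheek", "right ear tip"]
FAR_RIGHT = ["right eye", "right mouth", "right ear", "nose base", "right cheek"]


def detectable_landmarks(euler_y):
    """Return the landmarks the table lists for a face Euler Y angle in degrees."""
    if euler_y < -36:
        return FAR_LEFT
    if euler_y < -12:
        return LEFT
    if euler_y <= 12:   # assumption: -12 and 12 count as frontal
        return FRONTAL
    if euler_y <= 36:
        return RIGHT
    return FAR_RIGHT


frontal = detectable_landmarks(0.0)
profile = detectable_landmarks(-40.0)
```

Note that both cheeks and both mouth corners appear only in the frontal set, so code that needs them should check the angle first.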

Contours

A contour is a set of points that represent the shape of a facial feature. The
following image illustrates how these points map to a face:

Each feature contour that ML Kit detects is represented by a fixed number of
points:

Feature contour          Points       Feature contour        Points
-----------------------  -----------  ---------------------  -----------
Face oval                36 points    Upper lip (top)        11 points
Left eyebrow (top)       5 points     Upper lip (bottom)     9 points
Left eyebrow (bottom)    5 points     Lower lip (top)        9 points
Right eyebrow (top)      5 points     Lower lip (bottom)     9 points
Right eyebrow (bottom)   5 points     Nose bridge            2 points
Left eye                 16 points    Nose bottom            3 points
Right eye                16 points    Left cheek (center)    1 point
Right cheek (center)     1 point

When you get all of a face's contours at once, you get an array of 133 points,
which map to feature contours as shown below:

Indexes of feature contours

Indexes    Feature contour
---------  -------------------------------------------------------
0-35       Face oval
36-40      Left eyebrow (top)
41-45      Left eyebrow (bottom)
46-50      Right eyebrow (top)
51-55      Right eyebrow (bottom)
56-71      Left eye
72-87      Right eye
88-96      Upper lip (bottom)
97-105     Lower lip (top)
106-116    Upper lip (top)
117-125    Lower lip (bottom)
126, 127   Nose bridge
128-130    Nose bottom (note that the center point is at index 128)
131        Left cheek (center)
132        Right cheek (center)
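The index mapping lends itself to a simple slicing helper. The following Python sketch assumes the 133 points arrive as a flat list of `(x, y)` tuples; that representation and the names `CONTOUR_RANGES` and `split_contours` are hypothetical and are not part of the ML Kit API.

```python
# Inclusive (first, last) index ranges, copied from the index table.
CONTOUR_RANGES = {
    "face oval": (0, 35),
    "left eyebrow (top)": (36, 40),
    "left eyebrow (bottom)": (41, 45),
    "right eyebrow (top)": (46, 50),
    "right eyebrow (bottom)": (51, 55),
    "left eye": (56, 71),
    "right eye": (72, 87),
    "upper lip (bottom)": (88, 96),
    "lower lip (top)": (97, 105),
    "upper lip (top)": (106, 116),
    "lower lip (bottom)": (117, 125),
    "nose bridge": (126, 127),
    "nose bottom": (128, 130),
    "left cheek (center)": (131, 131),
    "right cheek (center)": (132, 132),
}


def split_contours(points):
    """Split a flat list of 133 contour points into named feature contours."""
    if len(points) != 133:
        raise ValueError(f"expected 133 points, got {len(points)}")
    return {
        name: points[first:last + 1]  # +1 because the table's ranges are inclusive
        for name, (first, last) in CONTOUR_RANGES.items()
    }


# Dummy points whose coordinates equal their index, to make the slicing visible.
dummy_points = [(float(i), float(i)) for i in range(133)]
contours = split_contours(dummy_points)
nose_bottom_center = contours["nose bottom"][0]  # index 128 in the flat list
```

The slice lengths agree with the per-feature point counts listed earlier, and together they account for all 133 points.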

Classification

Classification determines whether a certain facial characteristic is present.
ML Kit currently supports two classifications: eyes open and smiling.

Classification is expressed as a certainty value, indicating the confidence that
the facial characteristic is present. For example, a value of 0.7 or more for
the smiling classification indicates that it is likely that a person is smiling.

Both of these classifications rely upon landmark detection.

Also note that the "eyes open" and "smiling" classifications work only for
frontal faces, that is, faces with a small Euler Y angle (at most about
±18 degrees).
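Putting the two constraints together, a caller might gate the smiling decision on both the certainty value and the Euler Y angle. This Python sketch is an assumption-laden illustration: the function name `is_smiling`, the use of `None` for "no reliable answer", and the parameter names are hypothetical, while the 0.7 threshold and the 18 degree limit are taken from the text above.

```python
def is_smiling(smiling_probability, euler_y, threshold=0.7, max_abs_euler_y=18.0):
    """Return True or False for a frontal face, or None when no reliable
    answer is available.

    smiling_probability: certainty value in [0, 1], or None if not computed.
    euler_y:             face Euler Y angle in degrees.
    """
    if smiling_probability is None:
        return None  # classification was not computed for this face
    if abs(euler_y) > max_abs_euler_y:
        return None  # face is not frontal enough for the value to be meaningful
    return smiling_probability >= threshold


frontal_smile = is_smiling(0.85, euler_y=5.0)
frontal_neutral = is_smiling(0.40, euler_y=-10.0)
turned_face = is_smiling(0.90, euler_y=30.0)
```

Returning `None` instead of `False` for a turned face keeps "not smiling" distinct from "cannot tell", which matters if the result drives user-visible behavior.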