Saturday, September 28, 2013

Recently, a video from the Legislative Yuan went viral. In it, Mr. Chao-Hao Liu (on the right), a legislator and former judge, clearly points out several serious legal issues concerning Mr. Shih-Ming Huang (on the left). The background of the story can be found here.

It's really hard to read Mr. Shih-Ming Huang's reaction from the video because his facial expression barely changes throughout. So I became interested in estimating his heart rate from the video using several computer vision and signal processing tools.

Here is the resulting video with the estimated heart rate. We can now see how Mr. Shih-Ming Huang's heart rate changes as Mr. Chao-Hao Liu raises various questions.

How?

Non-contact physiological signal monitoring using video processing has been a hot topic recently.
Two examples:

The Cardiocam from the Affective Computing Group at the MIT Media Lab [Link]

Inspired by these awesome projects, I implemented a simplified version of heart rate monitoring from video. Here we describe the method step by step.

Face detection and tracking

We use the off-the-shelf Viola-Jones face detection and KLT tracking to detect and track Mr. Shih-Ming Huang's face throughout the whole video. This provides reference frames for further analysis. Below we show part of the video sequence with face tracking on the left and the rectified face region on the right.

Temporal Processing

To extract the heart rate, we look for color variations in the face region. For each frame of the rectified face, we average each of the R, G, and B channels over the region. Here is an example of the averaged R, G, and B values over 300 frames (~10 seconds for a 30 frames-per-second video).
From the raw traces, it's difficult to perceive the color variation due to heartbeats. We therefore perform temporal filtering to focus on the signal in a certain frequency range.
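The per-frame averaging can be sketched in a few lines of NumPy. This is only a minimal illustration, not the code used for the video: the function name `channel_means`, the `(T, H, W, 3)` array layout, and the random stand-in frames are all assumptions.

```python
import numpy as np

def channel_means(frames):
    """Average each color channel over the face region of every frame.

    frames: array of shape (T, H, W, 3), i.e. T rectified face crops
            in R, G, B order (assumed layout).
    Returns an array of shape (T, 3): one mean R, G, B value per frame.
    """
    frames = np.asarray(frames, dtype=np.float64)
    # Collapse the two spatial axes, keep time and channel.
    return frames.mean(axis=(1, 2))

# Random stand-in for 300 face crops (hypothetical data, not the real video).
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(300, 8, 8, 3))
rgb = channel_means(frames)
print(rgb.shape)  # (300, 3)
```

Each column of `rgb` is one of the three traces plotted above.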

Here we show the Fourier transform of the green channel. From the magnitude spectrum of the signal (in log scale), we can see that the energy of the DC component dominates over other frequency components.
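To make the DC observation concrete, here is a small sketch on a synthetic green trace. The brightness offset of 120, the pulse amplitude of 0.5, and the 1.5 Hz pulse frequency are made-up values chosen for illustration, not measurements from the video.

```python
import numpy as np

fps = 30.0
t = np.arange(300) / fps  # 300 frames, about 10 seconds

# Synthetic green trace: large constant brightness plus a tiny 1.5 Hz pulse.
green = 120.0 + 0.5 * np.sin(2 * np.pi * 1.5 * t)

spectrum = np.fft.rfft(green)
freqs = np.fft.rfftfreq(green.size, d=1.0 / fps)
log_mag = np.log10(np.abs(spectrum) + 1e-12)  # log-scale magnitude

dc_bin = int(np.argmax(log_mag))                      # strongest bin overall
pulse_bin = 1 + int(np.argmax(np.abs(spectrum[1:])))  # strongest non-DC bin
print(freqs[dc_bin], freqs[pulse_bin])
```

The strongest bin is the one at 0 Hz, and the pulse only shows up once the DC bin is ignored, which is why the next step filters it out.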

Since a normal human heart rate is around 60-90 beats per minute (i.e., 1 Hz - 1.5 Hz), we apply a temporal bandpass filter to remove the unwanted frequencies. Here we show a bandpass filter with a passband of 1-2 Hz (60-120 BPM).

After temporal filtering, we convert the filtered signal back to the time domain. Here are the filtered R, G, B signals. Now we can see the color variations due to heartbeats.
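One simple way to realize such a filter is to zero every frequency bin outside the passband and invert the transform. The sketch below assumes an ideal (brick-wall) 1-2 Hz mask, which may differ from the filter shape actually used; `bandpass_fft` and the synthetic trace are illustrative names and data.

```python
import numpy as np

def bandpass_fft(signal, fps, low_hz=1.0, high_hz=2.0):
    """Ideal bandpass filter: zero all FFT bins outside [low_hz, high_hz].

    signal: array of shape (T,) or (T, C); filtering runs along axis 0.
    """
    signal = np.asarray(signal, dtype=np.float64)
    spectrum = np.fft.rfft(signal, axis=0)
    freqs = np.fft.rfftfreq(signal.shape[0], d=1.0 / fps)
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    spectrum[~keep] = 0.0  # also removes the DC component
    return np.fft.irfft(spectrum, n=signal.shape[0], axis=0)

fps = 30.0
t = np.arange(300) / fps
pulse = 0.5 * np.sin(2 * np.pi * 1.5 * t)      # inside the passband
drift = 3.0 * np.sin(2 * np.pi * 0.2 * t)      # slow lighting change, outside
filtered = bandpass_fft(120.0 + pulse + drift, fps)
print(np.allclose(filtered, pulse, atol=1e-8))  # True
```

Passing the whole `(T, 3)` array of channel means filters R, G, and B in one call. A smoother filter (for example a Butterworth design) would avoid the ringing an ideal mask can introduce on real data.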

From these three filtered signals, we perform principal component analysis (PCA) to compute the first principal component, which captures the largest variation in the temporally filtered R, G, B signals. We can then detect the peaks in this principal component to count how many times the heart beats in the observation window. For example, we show the peak detection result (marked in red) on the first principal component. In this case, there are 16 beats in 300 frames (~10 seconds). We then get a heart rate estimate: (16/10)*60 = 96 beats per minute (BPM) for this time window.
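The PCA and peak-counting step might look like the following sketch. It assumes PCA via an SVD of the mean-centered signals and counts plain local maxima as beats; the function names and the synthetic 1.6 Hz pulse are hypothetical, and a real signal would likely need a more robust peak detector.

```python
import numpy as np

def first_principal_component(signals):
    """Project (T, 3) filtered R, G, B signals onto their direction of largest variance."""
    centered = signals - signals.mean(axis=0)
    # Rows of vt are the principal directions, strongest first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]

def count_peaks(x):
    """Count strict local maxima: samples larger than both neighbors."""
    x = np.asarray(x)
    return int(np.sum((x[1:-1] > x[:-2]) & (x[1:-1] > x[2:])))

def beats_to_bpm(beats, n_frames, fps):
    """Convert a beat count over n_frames into beats per minute."""
    return beats / (n_frames / fps) * 60.0

fps = 30.0
t = np.arange(300) / fps
pulse = np.sin(2 * np.pi * 1.6 * t)  # synthetic 1.6 Hz heartbeat
signals = np.stack([0.3 * pulse, 0.6 * pulse, 0.2 * pulse], axis=1)

pc1 = first_principal_component(signals)
beats = count_peaks(pc1)
print(beats, round(beats_to_bpm(beats, len(pc1), fps)))  # 16 96
```

The sign of a principal component is arbitrary, so peaks and troughs may be swapped; over a window containing many beats this changes the count by at most one.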

Using a sliding window approach, we can estimate how the subject's heart rate varies over time. Specifically, we use a 30-second time window with a 1-second increment. Here is the plot of all the heart rate measurements in the video. It's interesting to see the variations in the heart rate. For example, Mr. Shih-Ming Huang appears nervous (HR ~110 BPM) at the very beginning. He gradually calms down over the first 30 seconds and seems to get nervous again when Mr. Chao-Hao Liu starts to raise questions.
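The sliding-window loop itself is short. In this sketch the per-window estimator is a stand-in that counts local maxima directly on a clean synthetic trace (no filtering or PCA), and all names and the 1.5 Hz test signal are assumptions made for illustration.

```python
import numpy as np

def sliding_windows(n_frames, fps, window_s=30, step_s=1):
    """Return (start, stop) frame indices of every full window."""
    win = int(round(window_s * fps))
    step = int(round(step_s * fps))
    return [(s, s + win) for s in range(0, n_frames - win + 1, step)]

def peak_count_bpm(window, fps):
    """Stand-in estimator: count strict local maxima and scale to BPM."""
    w = np.asarray(window)
    beats = np.sum((w[1:-1] > w[:-2]) & (w[1:-1] > w[2:]))
    return beats / (len(w) / fps) * 60.0

def heart_rate_over_time(trace, fps, window_s=30, step_s=1):
    """One heart rate estimate per window position."""
    spans = sliding_windows(len(trace), fps, window_s, step_s)
    return np.array([peak_count_bpm(trace[a:b], fps) for a, b in spans])

fps = 30.0
t = np.arange(1800) / fps             # 60 seconds of video
trace = np.sin(2 * np.pi * 1.5 * t)   # steady synthetic 90 BPM pulse
hr = heart_rate_over_time(trace, fps)
print(len(hr), hr.min(), hr.max())
```

A 60-second clip yields 31 estimates (window starts at 0 s through 30 s). A longer window gives a steadier estimate but responds more slowly to changes in heart rate.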