The HTML markup

Our HTML interface has two main operational sections: the stream and capture panel and the presentation panel. Each of these is presented side-by-side in its own <div> to facilitate styling and control.

The first panel on the left contains two components: a <video> element, which will receive the stream from WebRTC, and a <button> the user clicks to capture a video frame.

This is straightforward, and we'll see how it ties together when we get into the JavaScript code.

Next, we have a <canvas> element into which the captured frames are stored, potentially manipulated in some way, and then converted into an output image file. This canvas is kept hidden by styling the canvas with display:none, to avoid cluttering up the screen — the user does not need to see this intermediate stage.

We also have an <img> element into which we will draw the image — this is the final display shown to the user.

The output height of the image will be computed given the width and the aspect ratio of the stream.

streaming

Indicates whether or not there is currently an active stream of video running.

video

This will be a reference to the <video> element after the page is done loading.

canvas

This will be a reference to the <canvas> element after the page is done loading.

photo

This will be a reference to the <img> element after the page is done loading.

startbutton

This will be a reference to the <button> element that's used to trigger capture. We'll get that after the page is done loading.
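As a rough sketch, the state described above could be declared as follows. The value 320 for width is an assumption made for illustration; the text only says that the height is derived from the width and the stream's aspect ratio.

```javascript
// Assumed output width in pixels; the value 320 is illustrative only.
const width = 320;
// Computed later from the width and the aspect ratio of the stream.
let height = 0;
// Becomes true once the video stream has started playing.
let streaming = false;
// Element references, filled in after the page is done loading.
let video = null;
let canvas = null;
let photo = null;
let startbutton = null;
```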

The startup() function

The startup() function is run when the page has finished loading, courtesy of window.addEventListener(). This function's job is to request access to the user's webcam, initialize the output <img> to a default state, and establish the event listeners needed to receive each frame of video from the camera and to react when the button is clicked to capture an image.

Getting element references

First, we grab references to the major elements we need to be able to access.
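One way to grab these references is sketched below. The element ids ("video", "canvas", "photo", "startbutton") and the getElements() helper are assumptions made for this example; substitute whatever ids your markup actually uses.

```javascript
// Look up the four elements by id and return them together.
// The ids here are assumed, not taken from the page's real markup.
function getElements(doc) {
  return {
    video: doc.getElementById("video"),
    canvas: doc.getElementById("canvas"),
    photo: doc.getElementById("photo"),
    startbutton: doc.getElementById("startbutton"),
  };
}
```

In a page, this would be called as getElements(document) from inside startup(), once the load event has fired.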

The error callback is called if opening the stream doesn't work. This will happen for example if there's no compatible camera connected, or the user denied access.
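A minimal sketch of requesting the stream and handling failure might look like this, assuming the promise-based navigator.mediaDevices.getUserMedia() API. The startStream() wrapper and its onError parameter are hypothetical names introduced for the example.

```javascript
// Ask for a video-only stream and attach it to the <video> element.
// `mediaDevices` is normally navigator.mediaDevices; it is passed in here
// (an assumption of this sketch) so the function is easy to exercise.
function startStream(mediaDevices, video, onError) {
  return mediaDevices
    .getUserMedia({ video: true, audio: false })
    .then((stream) => {
      video.srcObject = stream;
      return video.play();
    })
    .catch((err) => {
      // Reached when there is no usable camera or the user denies access.
      onError(err);
    });
}
```

In a page, a call could look like startStream(navigator.mediaDevices, video, (err) => console.error(err)).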

Listen for the video to start playing

After calling HTMLMediaElement.play() on the <video>, there's a (hopefully brief) period of time that elapses before the stream of video begins to flow. To avoid blocking until that happens, we add a listener to video for the canplay event, which is delivered when the video playback actually begins. At that point, all the properties in the video object have been configured based on the stream's format.

This callback does nothing unless it's the first time it's been called; this is tested by looking at the value of our streaming variable, which is false the first time this method is run.

If this is indeed the first run, we set the video's height based on the ratio between the video's actual width, video.videoWidth, and the width at which we're going to render it, width. Scaling the height by the same ratio preserves the stream's aspect ratio.

Next, the width and height of both the video and the canvas are set to match each other by calling Element.setAttribute() for each of the two attributes on each element. Finally, we set the streaming variable to true to prevent us from inadvertently running this setup code again.
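The one-time setup described above can be sketched as follows. The scaledHeight() and makeCanPlayHandler() helpers, and the state object holding width, height, and streaming, are assumptions made so the example stands alone.

```javascript
// Scale the height so the rendered frame keeps the stream's aspect ratio.
function scaledHeight(videoWidth, videoHeight, width) {
  return videoHeight / (videoWidth / width);
}

// Build a "canplay" handler. `state` holds { width, height, streaming };
// this object shape is an assumption of the sketch.
function makeCanPlayHandler(state, video, canvas) {
  return function onCanPlay() {
    if (state.streaming) {
      return; // setup already ran once
    }
    state.height = scaledHeight(video.videoWidth, video.videoHeight, state.width);
    for (const el of [video, canvas]) {
      el.setAttribute("width", state.width);
      el.setAttribute("height", state.height);
    }
    state.streaming = true;
  };
}
```

In a page, the handler would be registered with video.addEventListener("canplay", makeCanPlayHandler(state, video, canvas), false). For example, a 640×480 stream rendered at a width of 320 gets a height of 240.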

Handle clicks on the button

To capture a still photo each time the user clicks the startbutton, we add an event listener to the button, to be called when the click event is issued.
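A sketch of that listener is shown here. The wireCaptureButton() helper is a hypothetical name, and the call to preventDefault() is an assumption added to keep the click from triggering any other default action.

```javascript
// Attach the click listener. `takepicture` is the capture function
// described later; it is passed in so this sketch stands alone.
function wireCaptureButton(startbutton, takepicture) {
  startbutton.addEventListener(
    "click",
    (ev) => {
      takepicture();
      // Stop the click from doing anything else, such as submitting a form.
      ev.preventDefault();
    },
    false
  );
}
```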

Clearing the photo box is the job of clearphoto(). We start by getting a reference to the hidden <canvas> element that we use for offscreen rendering. Next, we set the fillStyle to #AAA (a fairly light grey) and fill the entire canvas with that color by calling fillRect().

Last in this function, we convert the canvas into a PNG image and call photo.setAttribute() to make our captured still box display the image.
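The fill-and-convert steps just described might be written like this. Passing canvas and photo as parameters is an assumption of the sketch; the text treats them as shared variables.

```javascript
// Fill the hidden canvas with grey and show the result in the <img>.
function clearphoto(canvas, photo) {
  const context = canvas.getContext("2d");
  context.fillStyle = "#AAA";
  context.fillRect(0, 0, canvas.width, canvas.height);

  // toDataURL() returns the canvas contents as a data: URL holding a PNG.
  const data = canvas.toDataURL("image/png");
  photo.setAttribute("src", data);
}
```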

Capturing a frame from the stream

There's one last function to define, and it's the point of the entire exercise: the takepicture() function, whose job is to capture the currently displayed video frame, convert it into a PNG file, and display it in the captured frame box.

As is the case any time we need to work with the contents of a canvas, we start by getting the 2D drawing context for the hidden canvas.

Then, if the width and height are both non-zero (meaning that there's at least potentially valid image data), we set the width and height of the canvas to match that of the captured frame, then call drawImage() to draw the current frame of the video into the context, filling the entire canvas with the frame image.

Note: This takes advantage of the fact that the HTMLVideoElement interface looks like an HTMLImageElement to any API that accepts an HTMLImageElement as a parameter, with the video's current frame presented as the image's contents.

If there isn't a valid image available (that is, the width or the height is 0), we clear the contents of the captured frame box by calling clearphoto().
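Putting the capture steps together, a sketch of takepicture() could look like the following. The parameter list is an assumption made so the example is self-contained: the text treats these values as shared variables, and clearphoto is passed here as a zero-argument callback.

```javascript
// Draw the current video frame into the hidden canvas and show it as a PNG.
function takepicture(canvas, video, photo, width, height, clearphoto) {
  const context = canvas.getContext("2d");
  if (width && height) {
    // Size the canvas to the frame, then copy the frame into it.
    canvas.width = width;
    canvas.height = height;
    context.drawImage(video, 0, 0, width, height);

    const data = canvas.toDataURL("image/png");
    photo.setAttribute("src", data);
  } else {
    // No usable dimensions yet, so fall back to the default grey box.
    clearphoto();
  }
}
```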

Fun with filters

Since we're capturing images from the user's webcam by grabbing frames from a <video> element, we can very easily apply filters and fun effects to the video. As it turns out, any CSS filters you apply to the element using the filter property affect the captured photo. These filters can range from the simple (making the image black and white) to the extreme (gaussian blurs and hue rotation).

You can play with this effect using, for example, the Firefox developer tools' style editor; see Edit CSS filters for details on how to do so.