Vision Framework for Face Landmarks detection using Xamarin.iOS

Mobile devices are getting better and better at solving sophisticated tasks, not only because of better hardware, but also thanks to modern trends towards AI: tasks such as face detection, barcode recognition, rectangle detection, and text recognition are now supported at the operating-system level, making them really simple to solve in your app. Here I am going to show how to detect face landmarks in real time using the Vision framework. The demo app that we’re going to build here is also available on GitHub.

AVCaptureSession

The first thing to do is to configure an instance of AVCaptureSession to capture the video stream from the front camera. We’re going to direct the stream to:

- AVCaptureVideoPreviewLayer to preview it on the screen
- AVCaptureVideoDataOutput to perform the face landmarks detection

Let’s start with a small helper property to get the front camera AVCaptureDevice. We’re using the AVCaptureDeviceDiscoverySession specifying that we’re interested in the front camera.
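A sketch of such a helper could look like this (the property and field names are my own; the idea is simply to ask AVCaptureDeviceDiscoverySession for a front-facing wide-angle camera):

```csharp
using System.Linq;
using AVFoundation;

AVCaptureDevice frontCamera;

AVCaptureDevice FrontCamera
{
    get
    {
        if (frontCamera == null)
        {
            // Query for front-facing wide-angle cameras only
            var discoverySession = AVCaptureDeviceDiscoverySession.Create(
                new[] { AVCaptureDeviceType.BuiltInWideAngleCamera },
                AVMediaTypes.Video.GetConstant(),
                AVCaptureDevicePosition.Front);
            frontCamera = discoverySession?.Devices.FirstOrDefault();
        }
        return frontCamera;
    }
}
```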

Here we’re initiating the capture session by adding instances of the AVCaptureDeviceInput and AVCaptureVideoDataOutput classes. We’re setting AlwaysDiscardsLateVideoFrames to true to save some memory (it’s true by default, but let’s make it explicit). Also important here is the OutputRecorder – our implementation of IAVCaptureVideoDataOutputSampleBufferDelegate, which will do the face landmarks detection.
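The setup described above might be sketched as follows (the SetupCaptureSession method and the captureSession/previewLayer field names are assumptions, as is running this inside a UIViewController):

```csharp
using AVFoundation;
using CoreFoundation;
using UIKit;

AVCaptureSession captureSession;
AVCaptureVideoPreviewLayer previewLayer;

void SetupCaptureSession()
{
    captureSession = new AVCaptureSession();

    // Feed the front camera into the session
    var input = AVCaptureDeviceInput.FromDevice(FrontCamera);
    captureSession.AddInput(input);

    var output = new AVCaptureVideoDataOutput
    {
        // True by default, but let's make it explicit to save memory
        AlwaysDiscardsLateVideoFrames = true
    };
    // OutputRecorder will receive the captured frames on a background queue
    output.SetSampleBufferDelegateQueue(new OutputRecorder(), new DispatchQueue("videoQueue"));
    captureSession.AddOutput(output);

    // Preview the stream on the screen
    previewLayer = new AVCaptureVideoPreviewLayer(captureSession)
    {
        Frame = View.Bounds,
        VideoGravity = AVLayerVideoGravity.ResizeAspectFill
    };
    View.Layer.AddSublayer(previewLayer);

    captureSession.StartRunning();
}
```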

VNSequenceRequestHandler and VNDetectFaceLandmarksRequest

At this point, we have the configured AVCaptureSession and we’re ready to process the output to detect face landmarks. To do this let’s override the DidOutputSampleBuffer method.

The method is called every time a new frame is captured. We’re creating a CIImage and passing it to the DetectFaceLandmarks method, which will use the Vision framework to detect face landmarks and draw them on the overlay layer. Note that we need to dispose of all objects properly, otherwise the app becomes unresponsive very quickly.
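A minimal sketch of that override, assuming the DetectFaceLandmarks method described below exists on the OutputRecorder:

```csharp
using AVFoundation;
using CoreImage;
using CoreMedia;
using CoreVideo;

public class OutputRecorder : AVCaptureVideoDataOutputSampleBufferDelegate
{
    public override void DidOutputSampleBuffer(AVCaptureOutput captureOutput,
        CMSampleBuffer sampleBuffer, AVCaptureConnection connection)
    {
        try
        {
            // Wrap the raw pixel buffer in a CIImage for the Vision framework
            using (var pixelBuffer = sampleBuffer.GetImageBuffer() as CVPixelBuffer)
            using (var ciImage = new CIImage(pixelBuffer))
            {
                DetectFaceLandmarks(ciImage);
            }
        }
        finally
        {
            // Dispose the sample buffer, otherwise the capture
            // pipeline runs out of buffers very quickly
            sampleBuffer.Dispose();
        }
    }
}
```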

The method is quite simple. First, we initiate a new VNDetectFaceLandmarksRequest, specifying a handler which will iterate through all results and draw them (note that we’re doing the drawing on the UI thread). Second, we’re using the VNSequenceRequestHandler to perform the detection on the CIImage from the previous step.
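Sketched out, the method could look like this (the DrawLandmarks helper is an assumption standing in for the drawing code discussed next):

```csharp
using CoreFoundation;
using CoreImage;
using Foundation;
using Vision;

readonly VNSequenceRequestHandler sequenceRequestHandler = new VNSequenceRequestHandler();

void DetectFaceLandmarks(CIImage image)
{
    var request = new VNDetectFaceLandmarksRequest((req, error) =>
    {
        if (error != null)
            return;

        var observations = req.GetResults<VNFaceObservation>();

        // Drawing must happen on the UI thread
        DispatchQueue.MainQueue.DispatchAsync(() =>
        {
            foreach (var observation in observations)
                DrawLandmarks(observation);
        });
    });

    // Perform the detection on the current frame
    sequenceRequestHandler.Perform(new VNRequest[] { request }, image, out NSError performError);
    request.Dispose();
}
```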

Since the Vision framework returns normalized landmark points, we’re transforming them to screen coordinates before drawing. The rest of the code just adds a new CAShapeLayer with the drawn line.
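The transformation and drawing might be sketched as below. The landmark points are normalized relative to the face bounding box, which is itself normalized relative to the image, with the origin in the bottom-left corner, so we flip the Y axis for UIKit; the helper names, viewSize, and overlayLayer are assumptions:

```csharp
using System.Linq;
using CoreAnimation;
using CoreGraphics;
using UIKit;
using Vision;

CGPoint[] GetScreenPoints(VNFaceLandmarkRegion2D region, CGRect boundingBox, CGSize viewSize)
{
    // Landmark points are normalized to the face bounding box,
    // the bounding box is normalized to the image (bottom-left origin)
    return region.NormalizedPoints
        .Select(p => new CGPoint(
            (boundingBox.X + p.X * boundingBox.Width) * viewSize.Width,
            (1 - (boundingBox.Y + p.Y * boundingBox.Height)) * viewSize.Height))
        .ToArray();
}

void DrawLine(CGPoint[] points, CALayer overlayLayer)
{
    if (points.Length < 2)
        return;

    var path = new UIBezierPath();
    path.MoveTo(points[0]);
    foreach (var point in points.Skip(1))
        path.AddLineTo(point);

    // Add a new CAShapeLayer with the drawn line
    var shapeLayer = new CAShapeLayer
    {
        Path = path.CGPath,
        StrokeColor = UIColor.Yellow.CGColor,
        FillColor = UIColor.Clear.CGColor,
        LineWidth = 2
    };
    overlayLayer.AddSublayer(shapeLayer);
}
```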

Conclusion

Here I showed you how simple it is to perform such a complex task as the detection of facial landmarks. If you’re creating your own app that uses this feature, don’t forget to add an NSCameraUsageDescription to your Info.plist. Also, keep in mind that the Vision framework is available on iOS 11+. Happy coding!