Recognizing Images in an AR Experience

Overview

Many AR experiences can be enhanced by using known features of the user’s environment to trigger the appearance of virtual content. For example, a museum app might show a virtual curator when the user points their device at a painting, or a board game might place virtual pieces when the player points their device at a game board. In iOS 11.3 and later, you can add such features to your AR experience by enabling image recognition in ARKit: Your app provides known 2D images, and ARKit tells you when and where those images are detected during an AR session.

This example app looks for any of the several reference images included in the app’s asset catalog. When ARKit detects one of those images, the app shows a message identifying the detected image and a brief animation showing its position in the scene.

Important

The images included with this sample are designed to fit the screen sizes of various Apple devices. To try the app using these images, choose an image that fits any spare device you have, and display the image full screen on that device. Then run the sample code project on a different device, and point its camera at the device displaying the image. Alternatively, you can add your own images; see the steps in Provide Your Own Reference Images, below.

To use the detected image as a trigger for AR content, you’ll need to know its position and orientation, its size, and which reference image it is. The anchor’s inherited transform property provides position and orientation, and its referenceImage property tells you which ARReferenceImage object was detected. If your AR content depends on the extent of the image in the scene, you can then use the reference image’s physicalSize to set up your content, as shown in the code below.

Provide Your Own Reference Images

To use your own images for detection (in this sample or in your own project), you’ll need to add them to your asset catalog in Xcode.

Open your project’s asset catalog, then use the Add button (+) to add a new AR resource group.

Drag image files from the Finder into the newly created resource group.

For each image, use the inspector to describe the physical size of the image as you’d expect to find it in the user’s real-world environment, and optionally include a descriptive name for your own use.

Note

Put all the images you want to look for in the same session into a resource group. Use separate resource groups to hold sets of images for use in separate sessions. For example, an art museum app might use separate sessions (and thus separate resource groups) for recognizing paintings in different wings of the museum.

Enter the physical size of the image in Xcode as accurately as possible. ARKit relies on this information to determine the distance of the image from the camera. Entering an incorrect physical size will result in an ARImageAnchor that’s the wrong distance from the camera.

When you add reference images to your asset catalog in Xcode, pay attention to the quality estimation warnings Xcode provides. Images with high contrast work best for image detection.

Use only images on flat surfaces for detection. If an image to be detected is on a nonplanar surface, like a label on a wine bottle, ARKit might not recognize it at all, or might create an image anchor at the wrong location.

Consider how your image appears under different lighting conditions. If an image is printed on glossy paper or displayed on a device screen, reflections on those surfaces can interfere with detection.

Apply Best Practices

This example app simply visualizes where ARKit detects each reference image in the user’s environment, but your app can do much more. Follow the tips below to design AR experiences that use image detection well.

Use detected images to set a frame of reference for the AR scene. Instead of requiring the user to choose a place for virtual content, or arbitrarily placing content in the user’s environment, use detected images to anchor the virtual scene. You can even use multiple detected images. For example, an app for a retail store could make a virtual character appear to emerge from a store’s front door by recognizing posters placed on either side of the door and then calculating a position for the character directly between the posters.

Design your AR experience to use detected images as a starting point for virtual content. ARKit doesn’t track changes to the position or orientation of each detected image. If you try to place virtual content that stays attached to a detected image, that content may not appear to stay in place correctly. Instead, use detected images as a frame of reference for starting a dynamic scene. For example, your app might recognize theater posters for a sci-fi film and then have virtual spaceships appear to emerge from the posters and fly around the environment.

Consider when to allow detection of each image to trigger (or repeat) AR interactions. ARKit adds an image anchor to a session exactly once for each reference image in the session configuration’s detectionImages array. If your AR experience adds virtual content to the scene when an image is detected, that action will by default happen only once. To allow the user to experience that content again without restarting your app, call the session’s remove(anchor:) method to remove the corresponding ARImageAnchor. After the anchor is removed, ARKit will add a new anchor the next time it detects the image.

For example, in the case described above, where spaceships appear to fly out of a movie poster, you might not want an extra copy of that animation to appear while the first one is still playing. Wait until the animation ends to remove the anchor, so that the user can trigger it again by pointing their device at the image.