Tuesday, November 3, 2009

Many people may know about the post at GamesAlfresco (the original post is here), which says they made iPhone camera access possible on firmware 3.x. Last weekend, I received the source code from Ori Inbar and did some tests with the camera access module.

It works well by itself. I could retrieve the raw data coming from the camera. The module uses a trick similar to what I tried on iPhone OS 2.x: creating a CoreSurface buffer of the preview from the camera.

However, the problem comes when I overlay something.

When I overlay some content (like an image view or OpenGL drawings) on the camera's preview view, the overlaid content appears in the captured video frame. That is, the captured frame is like a screenshot, not raw video.

I experienced a similar situation with the 'takePhoto' method, which gives a screenshot of the iPhone's screen, and the module received from Ori has the same issue.

Here are two captured images from the incoming raw data of the camera module. They are not screenshots. As we can see, when I overlay an OpenGL view or a UIImageView on top of the 'previewView' of PLCameraController, the overlaid contents are also captured.

Tuesday, October 6, 2009

A few days ago, I downloaded the Wikitude application, one of the augmented reality applications that provide information based on the user's location.

Here's a screen capture of Wikitude. It looks like a well-organized application. The information is retrieved from the Internet and displayed on the video background.

However, the widgets floating on the video background are unstable, since their positions change depending on the iPhone's built-in compass and accelerometer. Even a small movement shakes the widgets considerably. More filtering may be required for a stable widget display.

In addition, the widgets are too small to read the characters on them and to select one of them. I'd like a control in the settings menu that changes the widget size. Moreover, the widgets overlap each other, which makes it hard to read their titles.

Actually, the video background does nothing in the application. The widgets are not precisely registered to the video background, which also makes them look like they are floating rather than attached to the scene.

So, I think what Wikitude shows is that AR does not guarantee a good user experience in every application. Sometimes it is more effective to display information in a table view instead of overlaying it on the video.

Thursday, September 17, 2009

Recently, iPhone SDK 3.1 came out. In this update, Apple now allows developers to overlay their own views on the preview view of UIImagePickerController. Before SDK 3.1, PLCameraController, a private API, was required, or we had to hack the view hierarchy of UIImagePickerController.

In iPhone SDK 3.1, UIImagePickerController has a property named cameraOverlayView, defined as:

The custom view to display on top of the default image picker interface.

@property(nonatomic,retain) UIView *cameraOverlayView

When I first met this property, I misunderstood what it means. I thought the property was the camera preview view itself, but I was wrong. cameraOverlayView doesn't give us the camera preview view. All we can do with cameraOverlayView is add our own view to the UIImagePickerController's view, and nothing more.

This is very limited functionality for augmented reality, because it is still not possible to get the raw video data from the camera. However, if you just need the background video, this is quite an easy solution.

The cameraOverlayView works as follows (I implemented this in an IBAction of a view controller):

It is a little weird that a modal dialog is required, but that is the way we must use UIImagePickerController, as Apple allows only this approach. If all the steps are done correctly, you will see something like this: a logo overlaid on the camera's preview view.

If you want to make the video fill the entire screen, change the UIImagePickerController's transform through the 'cameraViewTransform' property, which is also a new feature in 3.1.

camera.cameraViewTransform = CGAffineTransformScale(camera.cameraViewTransform, 1.0, 1.13);

Thus, it is now very easy to make augmented reality applications using the 'cameraOverlayView' feature. I made a simple application that displays annotations in the scene by using the iPhone 3GS's compass.

Wednesday, September 2, 2009

When I upgraded the Oolong Engine to the version supporting OpenGL ES 2.0, my trick for the video background did not work. The GL view's background was cleared to (0,0,0,0), but the video preview view was still occluded by the GL view. Clearly, clearing with zero alpha did not work at that stage.

I tried to find out why this happens, and found a solution after all. The reason the clear did not work was that the color format of the GL view was set to RGB565, which provides no alpha channel. So I changed the color format from RGB565 to RGBA8, and now I can see the background video and the rendered scene.

Actually, you can see that the initWithFrame method initializes the color format to either RGB565 or RGBA8. Please look at the implementation of the initWithFrame method in the file EAGLView2.m. You may see this code.

Friday, July 31, 2009

Many people want to use the iPhone's camera directly, instead of through UIImagePickerController. So people have used the PLCameraController class, one of the private framework classes. After the iPhone OS update to 3.0, however, the PLCameraController class was modified and the old method of getting the preview no longer works.

There is a nice thread that discusses how to use PLCameraController on iPhone OS 3.0. See it here.

I succeeded in displaying the camera preview on my iPhone by following the thread. The code works well on both the iPhone 3G and the iPhone 3GS. On the 3GS, the focus rectangle is automatically displayed, as shown in the captured image.

However, the raw data, which AR developers may be more interested in than the preview, is still not accessible.

I ran the Camera application to see how the iPhone's camera module was improved.

Well, I felt no change in still image capture mode except auto-focusing. The preview is still 15 fps, and there is no great improvement in its quality.

In video recording mode, the frame rate of the preview increases to 30 fps (at least it looks like 30 fps). The video quality is not bad, and I think the preview quality is better in video recording mode.

The video resolution is 640x480 and frame rate is 30 fps.

When I captured a video holding the iPhone vertically, the video's specification in the QuickTime player was 480x640 and it played vertically. But when I played the same file in MPlayer or VLC, the video played horizontally and the resolution was 640x480 (see the screen captures below; click them to enlarge).

The QuickTime player --->

MPlayer -->

Apple seems to embed orientation information, available from the built-in accelerometer, in the video file. So the recorded video is natively 640x480, and QuickTime may rotate it based on the orientation information. Apple already does this for still image capture.

In my opinion, auto-focusing is unwelcome in computer vision tasks, since it changes the focal length of the camera and breaks the fixed-focal-length assumption that is very common in computer vision papers. I hope there is a way to turn it off for augmented reality applications.

Hey, Apple. Why don't you open the interface to video camera control? It would allow developers to make much more interesting applications.

Tuesday, July 28, 2009

The SIO2 engine has been updated to v1.4. As I've been using it as a static library (see this post), I also tried to build v1.4 the same way. The library built without any problem, but when I built my application I hit link errors on the sio2Init and sio2Shutdown functions, which had never occurred with previous versions.

I asked on the forum about this problem, and the answer is that we need to link the pre-built static library 'libsio2_dev.a' or 'libsio2_sim.a', depending on the target platform. Those libraries contain the implementations of 'sio2Init' and 'sio2Shutdown'. Linking one of them solved my problem.

Monday, July 27, 2009

When we make a mobile augmented reality application, a video background is required. As we all know, a video background involves tedious work. On desktop PCs, we can do it by just calling glDrawPixels. But on mobile phones we have to go through an OpenGL ES texture, and uploading texture data is quite slow on most mobile phones, since their GPUs are not fast enough to update textures in real time.

In AR applications, the video background just displays what we see through the camera. While trying to make the video background faster, I found a simple way to do it without OpenGL ES texturing. The idea is to use two views: one for the video background and another for OpenGL ES rendering.

The Cocoa API allows us to add multiple child views to a window object. What I did was add two views to our window: the 'previewView' of the PLCameraController, and the OpenGL ES view. The 'previewView' of PLCameraController is a subclass of UIView, and it displays the video preview coming from the camera.

In the applicationDidFinishLaunching method of the application's app delegate, add code like this. All it does is add the two views to the window.

PLCameraController *cam = [PLCameraController sharedInstance];
UIView *cam_view = [cam previewView];
[cam startPreview];
cam_view.frame = CGRectMake(0, 0, 320, 480);

glView = [[EAGLView alloc] initWithFrame:CGRectMake(0, 0, 320, 480)];
glView.opaque = NO;
glView.alpha = 1.0;

[window addSubview:cam_view];
[window addSubview:glView];

Then we need to do one more thing. When rendering the OpenGL scene, we need to clear it with a color that has an alpha value of 0.0. By making the OpenGL ES view transparent at the start of the rendering process, we will see the live video preview in the background. So, just change the alpha value of the clear color like this:

glClearColor(0, 0, 0, 0);

Run your application, then you will see a live video behind your OpenGL scene like the video below.

The pros and cons of this method are:

Pros: You don't need to manage textures or other OpenGL ES machinery for the video background. You don't even need to handle background updates. Useful for developers who just need a video background with OpenGL ES.

Cons: The background video does not appear to be synchronized with the OpenGL ES rendering. I have not dug into this, but if you do image processing on the video (like object tracking), the background video may not be synchronized with your rendering result.

The important thing is the performance of this method as a video background for augmented reality applications. I tested it with the Oolong Engine on the iPhone 3G (OS 2.2.1). The simple skeleton example of the Oolong Engine runs at 60 fps as it is (before I added PLCameraController's previewView).

The rendering speed decreased to about 28 fps after I added the video background. The rendering became slower, and I expect it is due to rendering with full-screen transparency. Note that this fps is not the video stream's frame rate, which is just 15 fps; it is the OpenGL ES rendering speed. It is much slower than without the video background, but I think the performance is not bad for AR applications.

I tested this method on the iPhone 3G with OS 2.2.1, but it should also work on the iPhone 3GS with 3.0 or higher.

Update (2009. 08. 11)

I tested the same code on the iPhone 3GS (OS version 3.0.1). The rendering speed improved to 41 fps, +13 fps compared to the old iPhone 3G. I think this is because of the better CPU and graphics chip. The preview is still 15 fps, which is the rate for still image capture. If we could get the video-capture preview, we would be able to use a 30 fps preview on the 3GS, but that interface is not yet known.

Saturday, May 16, 2009

In augmented reality (AR) applications, incoming video frames must be rendered as the background. On desktop PCs this is quite easy, because we just need to call the glDrawPixels function. However, on mobile phones, glDrawPixels is no longer supported in the OpenGL ES specification, so we have to use a texture instead of sending pixels directly to the framebuffer.

So, what we have to do for video texturing is as follows:

1. Create a texture with power-of-two (2^n) width and height.

2. Copy pixel data from the current frame image.

3. Update the texture data partially (depending on the video resolution).

4. Render the texture on the screen.

In step 1, the texture must have power-of-two width and height, since OpenGL ES does not support other sizes. The texture can be rectangular, but each dimension must be a power of two.

If you use 320x240 video, the texture resolution becomes 512x256.

The function glTexSubImage2D is used for step 3.

The problem is that updating the texture takes a lot of time, and the application becomes too slow. This is critical in AR applications, where real-time video rendering is required, so the performance should be measured.

Rendering a texture can be done in two ways: rendering a full-screen quad with the texture applied, or using the glDrawTexiOES function, an OpenGL ES extension. After testing several times on a few mobile phones, both methods showed almost the same performance.

Friday, May 15, 2009

The PLCameraController class controls the iPhone's built-in camera and belongs to one of Apple's private frameworks. If you want to get a preview of the incoming video stream from the camera, it is quite easy. Here are the steps.

1. Get the header of PLCameraController. It can be obtained by dumping the iPhone OS frameworks using class-dump-X, or you can download it from somewhere.

2. Add the PhotoLibrary framework from the SDK directory. With SDK 2.2.1, the framework is in /Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS2.0.sdk/System/Library/PrivateFrameworks/PhotoLibrary.framework

3. Import the PLCameraController header to your application delegate.

4. Add the following code to 'applicationDidFinishLaunching'

- (void)applicationDidFinishLaunching:(UIApplication *)application {
    // Hide the status bar
    [[UIApplication sharedApplication] setStatusBarHidden:YES animated:NO];

    // Get the view for preview and start preview
    PLCameraController *cam = [PLCameraController sharedInstance];
    UIView *view = [cam previewView];
    [cam startPreview];

    // Add the preview view to window
    [window addSubview:view];

    // Override point for customization after app launch
    [window makeKeyAndVisible];
}

Now run the application, and you will see something like the screenshot below.

Basically, the resolution of the video coming from the camera is 304x400, which is a little weird. The preview video stream is resized in the preview view. If we do not set a new frame size, it is resized with a fixed aspect ratio, so there is a white area at the bottom of the screen.

For full screen preview, set the frame size of the preview view like this :

view.frame = CGRectMake(0, 0, 320, 480);

Update: I tried several tests, and it turns out the preview is actually 304x400.

Monday, March 16, 2009

When I open my project written in VS2005, one of my *.vcproj files cannot be opened, with this error message.

The message says there are duplicated attributes in the *.vcproj file. To solve the problem, open the *.vcproj file in a text editor, such as Notepad or WordPad, and find the specified lines. There may be something like this:

Remove one of the duplicates and save the *.vcproj file. I'm not sure why this happens, but it may be a bug in VS2005. Maybe it does not occur in VS2008.

Sunday, March 15, 2009

Sekai Camera showed one possible future of augmented reality services for retrieving information from the environment. However, there was no mention of how it works. Here is another Sekai Camera demo. I think the information is not attached to a specific object. Instead, the tags float around the user's position, and the user selects one of them to see what others left there. As such, no recognition method is required; the user's position is the important cue.

Tuesday, March 10, 2009

When rendering incoming video frames as a background texture, try turning off the linear interpolation property. On Direct3D, the relevant states are D3DMTSS_MAGFILTER and D3DMTSS_MINFILTER.

It will increase texture rendering speed considerably. In my case, using Direct3D, performance improved by a factor of almost 1.5 after disabling linear interpolation. What we sacrifice is the quality of the rendered video-frame texture: since there is no interpolation, the background image will show aliasing, but that may not be a big problem.

Well, full-screen video rendering is necessary for mobile phone-based AR. So I tried full-screen texture mapping with an image file. It is quite simple and works well on the emulator, as I expected.

However, it showed weird results on my device, as shown below. The texture does not occupy the entire screen. It seemed that the texture coordinates were wrong, but I used the same code without any changes.

The same code shows different results on the emulator and the device. What is the problem? I dug into this for a few hours and finally found that I had done something slightly wrong when creating the texture object. When I created the texture for the background, I wrote the code as below:

The parameters MY_IMG_WIDTH and MY_IMG_HEIGHT are 320 and 240, respectively. The D3DMXCreateTexture function automatically changes the texture size when the width and height parameters are not powers of two, so my texture's dimensions become 512x256 internally.

When I update the texture data only in the (0,0)-(320,240) rectangle, the remaining region of the 512x256 texture still has no data (black). So when I draw a quad using texture coordinates from 0 to 1, the entire 512x256 texture is mapped onto the quad, and I see a squeezed image on the device's screen. Adjusting the texture coordinates of the quad's vertices solved my problem.

I'm not sure why it works well on the emulator. Maybe the emulator supports non-power-of-two (rectangular) texture dimensions?

Monday, March 9, 2009

For augmented reality applications, real-time video sequences must be rendered in the background. Usually we use texture mapping for this, in both OpenGL (ES) and Direct3D (Mobile).

In Direct3D Mobile (D3DM), there are two choices commonly used.

1. Lock the texture, copy pixel data from the input video, and unlock the texture.

2. Create an image surface where the video is copied: lock the surface, copy pixel data from the input video, unlock the surface, and update the texture with the CopyRect method.

When I implemented both methods (to choose the faster one), both ran well on the emulator. However, I encountered several problems on my smartphone (a Samsung M480, Windows Mobile 6.0).

First, lockable textures are not supported by the M480. When I checked the capability using this code,

if(caps.SurfaceCaps & D3DMSURFCAPS_LOCKTEXTURE)

the result was false, and calling the D3DMXCreateTexture function with the D3DMUSAGE_LOCKABLE option failed, of course.

Then I tried the second method, using an image surface, but that code did not work either. This time, the error occurred when creating the surface. I tried to find documents about this problem on the web but got nothing. After trying many different parameters, I finally figured it out.

The CreateImageSurface function always fails when I create the image surface with the D3DMFMT_R8G8B8 format on my device, the M480. After I changed the image format to D3DMFMT_R5G6B5, which may be natively supported by the M480's hardware, the code works well.

We need to check the capabilities of the hardware we are working on whenever we want to use lockable resources (such as textures, vertex buffers, or image surfaces). Even though code runs well on the emulator, that does not guarantee it will work on the device. Another thing to consider: try another format when your device cannot create an image surface.

I'm not sure how textures and image surfaces behave on other devices. Does Direct3D Mobile provide the same functionality on all WM6 devices, or does it depend on the vendors' hardware driver implementations?