This blog attempts to be a collection of how-to examples in the Microsoft software stack - things that may take forever to find out, especially for the beginner. I see it as my way to return something to the Microsoft community in exchange for what I learned from it.

25 January 2017

Intro

This tutorial will show you how to manipulate (move, rotate and scale) Holograms by using gestures. It uses speech commands to change between moving, rotating and scaling. It will build upon my previous blog post, re-using the collision detection. The app we are going to create will work like this (turn on sound to hear the voice commands):

Setting the stage

Once again, we start off with a rather standard setup…
only now we have a cube and a sphere. Neither of their settings are particularly spectacular, and this will show the sphere initially a bit to the left, and a rectangular box a bit to the right

If we actually want this Holograms to do something we need to code some stuff:

A behaviour to select and activate Holograms – TapToSelect

A behaviour that acts on speech commands, so the next behaviour knows what to do – this is SpeechCommandExecuter

A behaviour that responds to tap-hold-and-move gesture and does the actual moving, rotating and scaling – this is SpatialManipulator

Something to wire the whole thing together, a kind of app state manager – AppStateManager and its base class, very originally called BaseAppStateManager.

Selecting a hologram

The fun thing with the new HoloToolkit is that we don’t need do much anymore to listen or track hand movements and speech. It’s all done by the InputManager. We only need to implement the right interfaces to be called with whatever we want to know. So when we want an object to receive a tap, we only need to add a behaviour that implements IInputClickHandler:

This basically just gives the object to the App State Manager – or more it’s base class, later more about it, and optionally plays a sound – if the omitted GetAudioSource method can find and AudioSource in either the object or its parent. In the app it does, it plays a by now very recognizable ‘ping’ confirmation sound.

Speech command the new way – listen to me, just hear me out*

Using speech commands has very much changed with the new HoloToolkit. There are actually two ways to go about it:

Using a SpeechInputSource and implementing an ISpeechHandler ‘somewhere’. This is rather straightforward and is very much and analogy of how the IInputClickHandler works. Disadvantage is that you have to define your keywords twice – both in the SpeechInputSource and in the ISpeechHandler implementation

Using a KeywordManager to define your keywords and map them to some object’s method(s).

This sample uses the latter method. It’s a bit of an odd workflow to get it working, but once that’s clear, it’s rather elegant. It’s also more testable, as the interpretation of keywords is separated from the execution. We are implementing that execution in the SpeechCommandExecuter. Its public methods are Move, Rotate, Scale, Done, Faster and Slower which pretty much maps to the available speech commands. And if you look in the code, you will see what it does internally is just call private methods, which in turn try to find the selected objects’ SpatialManipulator and call methods there.

So why this odd arrangement? That’s because KeywordManager needs an object with parameterless methods to call on keyword recognition. So, we add this SpeechCommandExecuter and a KeywordManager (from the HoloToolkit) to the Managers object, and then we are going to make this work. The easiest way to get going is

Expand “Keywords and Responses”,

Change “size” initially in one.

Type “move object” into keyword,

Click + under “Response”

Drag the “Managers” object in the now visible “None” field. This is best explained by an image:

And then you have to select to what method of which object you want to map this speech command. To do this, click the dropdown menu next to “Runtime Only”, that will initially say “no function”. From the drop down first select the object you want (SpeechCommandExecuter) and then the method you want (Move).
Unfortunately, all objects in the game object are displayed, as are all public methods and properties from every object you select – plus those of its parent classes. You sometimes really have to hunt them down. It’s a bit confusing at first, but once you have done it a couple of time you will get the hang of it. It might feel as an odd way of programming if you are used to the formal declarative approach of things in XAML, but that’s the way it is.
Then change the size to 6 and add all the other keyword/method combinations. By using this method you only have to drag the Managers object once, as the Unity editor will copy all values of the first entry to the new ones.

Spatial Manipulation

This is the behaviour that does most of the work. And it’s surprisingly simple. The most important (new) part is like this:

The manipulation mode can be either Move, Rotate, Scale – or None, in which case this behaviour does nothing at all. So, when ‘something’ supplies a Vector3 to the Manipulate method, it will either move, rotate or scale the object.

In move mode, when you move your hand, the object will follow the direction. So, if you pull towards you, it will come toward you. Move up, it will move up. Elementary.

Scale is even more simple. Pull toward you, the object will grow, push from you, it will shrink.

Rotate is a bit tricky. Push from you, the object will rotate around the horizontal axis. That is, an axis running through your view from left to right. Effectively, the top of the object will be moving from you and the bottom to you. Move your hand from left to right, or right to left, and the object will rotate around and axis that is running from top to bottom of your view. Last and most tricky – and least intuitive: move your hand from top to bottom and the object will rotate clockwise over the z axis – that is, the axis ‘coming out of your eyes’

There are two more methods – Faster and Slower, which are called via the SpeechManager as you have seen, and their function is not very spectacular: they either multiply the speed value of the currently active manipulation mode by two, or divide it by two. So by saying “go faster” you will make the actual speed at which your Hologram moves, rotates or scales go twice as fast, depending on what you are doing. “Go slower” does the exact opposite.

App state – the missing piece

The only thing missing now is how it’s all stitched together. That’s actually done using two classes – a BaseStateManager and a descendant, AppStateManager
The BaseStateManager doesn’t do much special. Its main feats are having a property for a selected object and notifying the rest of the world of getting one. And that’s not even used in this sample app but I consider it useful for other purposes, so I left it in. It also calls a virtual method if the selected object is changed.

There is also a class GameObjectEventArgs but that’s too trivial to show here. Note, by the way, I stick to C# 4.0 concepts as this is what Unity currently is limited to.
The actual AppStateManager glues the whole thing together:

It implements the IManipulationHandler, which means our almighty IManipulationHandler will call it’s OnManipulationUpdated whenever it detects hand with a tap-and-hold gesture (thumb and index finger pressed together) while moving. And it will give that data to the SpatialManipulator in the select object, that is – if the selected object has one. It also makes sure the currently active object gets deactivated once you select a new one. Note, IManipulationHandler requires you to implement three more methods, omitted here, as they are not used in this app.
There is an important line in the Start method, that will define this object as a global input handler. Its OnManipulationUpdated always gets called. Normally, this get only called when it is selected – that is, if your gaze strikes the object. That makes it very hard to move it, as your gaze most likely will most likely fall off the object as you move it. This approach has the advantage you can even manipulate objects even if you are not exactly looking at them.

Wiring it all together in Unity

This is actually really simple now we have all the components. Just go to the Cube in your hierarchy and add these three components:

And don’t forget to drag the InputManager and the Cube itself on the Stabilizer and the Collision Detector fields as I explained in my previous blog post. Repeat for the Sphere object. Build your app, and you should get the result I show in the video.

Concluding remarks

It’s fairly easy to wire together something to move stuff around using gestures. There are a few limitations as to the intuitiveness of the rotation gesture, and you might also notice that while moving the object around uses collision detection, rotating and scaling do not. I leave those as ‘exercise to the reader’ ;). But I do hope this takes you forward as a HoloLens developer.
Full code can be found here

Hi @Lee, can you be more specific on 'not working properly?' Does the app work, but it does not hear or understand you? Did you turn on speech capabilities? Does the app hear you and give a confirmation sound (pringggg) but doesn't it respond? Do you have trouble getting voice commands to work in my sample only, or do other apps have problems understanding you too? I have found out you pretty much need to speak English with what I perceive as an American accent to get the device to understand you.

Hi Joost, great stuff here. Love your website. I used your tutorial for guidance on an app I'm developing but I'm running into a small issue.

The hand gestures for rotation seem to get reversed when I walk to the opposite side of a hologram. Basically, when I look at the hologram initially, go into rotate mode, and move my hand right, it spins right. But then if I walk to the other side and move my hand right, it spins left. Any idea how I might go about fixing that? If it helps, I modified your code a little bit to make it only spin on one axis. Here's what I have:

I came up with a quick fix by taking the angle between the raycast and the z-axis of the game object, rounding to the nearest 180 degrees, and adding that to the manipulationData.x angle. Basically as soon as you cross the x-axis of the object, it will switch positive values to negative. It's not pretty, but it should work. You might have a better idea though.

Feedback, comments and tokens of appreciation

If you spot things that are incorrect, or if you don't understand what I mean, please drop a comment on the offending article and I will help you ASAP. You can e-mail me at joostvanschaik at outlook dot com or contact me via twitter.

If you find the information on this blog useful (and apparently some 600+ people per day do on average) please let me know as well, that encourages me to keep doing this. Or do tell others - that made me an MVP; who knows what more it might bring ;-P

Disclaimer and legal stuff

Although I take great care in providing quality samples, all postings, articles and/or files on this site are provided "AS IS" with no warranties, and confer no rights. The views expressed on this blog are strictly my own and do not necessarily reflect the views of my employer, or anyone else on the planet for that matter.

I usually make original content, sometimes building upon other people's work. Sometimes I explain things that can be found elsewhere because I felt what I read was not clear enough for my limited mind so I explain it the way it finally clicked with me. In all cases I take great pains to be sure to link to people or articles who deserve the credit. If you think I have shortchanged you on the credits please let me know.

Please note, I do not work for Microsoft and while I proudly wear the title of "Microsoft Most Valueable Professional", my opinions, files offered, etc. do not represent, are approved of, endorsed by or paid for by Microsoft. The only power behind it is me and my sometimes runaway passion for parts of Microsoft's technology.