Could SIRI be used on desktops?

Renowned colorist Alex Hurkman tweeted today that he would rather have SIRI on his desktop than on his iPhone 4S.

Makes me think it could be the start of something very interesting. How about Voice Controlled Editing? Just think of the infinite possibilities & the speed with which you would be able to work.

You say -5 frames and boom there it is. WOW!

Sohrab

FCS 3, AJA Kona Lhi & Adobe PPro

"The creative person wants to be a know-it-all. He wants to know about all kinds of things: ancient history, nineteenth-century mathematics, current manufacturing techniques, flower arranging, and hog futures. Because he never knows when these ideas might come together to form a new idea. It may happen six minutes later or six months, or six years down the road. But he has faith that it will happen." -- Carl Ally

[Jacob Kerns]"you how stupid it looks and sounds talking to a phone instead of just typing in things."

Eh? It might look stupid right now, but wait a year and then tell me, when most smartphones will have this voice-enabled technology.

[Jacob Kerns]"Um the people working around you!"

Ever heard of speech recognition systems? They might not give great results today, but they can definitely be improved as time goes by.

Sohrab

[Sohrab Sandhu]"Eh? It might look stupid right now, but wait a year and then tell me, when most smartphones will have this voice-enabled technology."

Most phones already had this! Apple is just now marketing it like it's a new technology! Granted, Apple probably made it better. I used it a lot on the BlackBerry and Android phones that had it, but after a while the fun wears off, except when I'm driving.

I even used Dragon NaturallySpeaking after an accident when I couldn't use my hands, and it worked well. I still think it's a pointless technology until it can work with noise in the background and can understand accents.

Well, I am no Apple agent. I thought it was something interesting and worth a discussion here. If you don't like the idea, you can move to the next thread!

Nothing personal :)

Sohrab

Apple has had voice commands on its mobiles, too, for quite a while. But this is of a different order. I think Siri will be huge and the other phone manufacturers are going to have to start playing catch-up. Apple describes Siri as beta, which is unusual for them, and there are clearly limitations and some glitches with it at present. But most people using it seem to be very impressed. Apple and Microsoft are both very interested in this type of AI and are putting serious money into it. You may not like Siri, but you won't be able to ignore it or similar developments, whichever platform you use.

A couple of people in my office bought the iPhone yesterday. you how stupid it looks and sounds talking to a phone instead of just typing in things. For people driving and disabled yes great Idea."

Not to mention how annoying this would be.
We already have plenty of rude idiots with USI who already don't know when it's proper to curb their cell phone usage. Let's give them something else they can use to annoy the rest of the world.

Working in silence is preferable, do you really want to be talking all the time? Think of how long a typical project takes to edit? Do you want to be talking that ENTIRE time? And what about precision work? Do you always know how long you want to shorten something? You'd end up saying a variation on the same command over and over again while trying things.

It takes me 4 seconds to say all that. I can do it with my mouse in 2 seconds max.

For complex operations where menus and popup windows are required I can see voice activation being a great improvement. But for simple editing tasks, the mouse (and pen tablet) reigns supreme."

Agreed, for some tasks it would still be quicker to do it physically.

But consider, for instance, tweaking values in the filters tab. That could well be done much quicker with a voice command.

And for the record, I am not suggesting only Apple is capable of doing this. In fact, companies like Adobe or AVID might take the lead and bring it sooner rather than later!

Sohrab

[John-Michael Seng-Wheeler]I would love to be able to say "render the entire sequence as a 20 Mbps H.264 file in the project folder"... that would be much faster than any interface could be. (Unless it was ridiculously simplified)

I firmly believe that for this to work, we would need more of a mental makeover than new technology.

Sohrab

At the risk of using another crappy metaphor that will start arguments over the value of the metaphor rather than the actual topic -- voice navigation in your car is just about the right level of voice control in your NLE. "Call Dave" and "Play Metallica" is one thing. "Turn right at the next light" is another. How far in the future do you think we'll be before it's a good idea for a car to execute that command based on your voice?

See, it's not just the voice part -- it's how much the MACHINE will understand, and actually perform, based on the voice command.

I mean, to make this work for editing, you'd need an NLE that was pretty stripped down, and that made a lot of assumptions about what you mean. It couldn't really rely on YOUR idea of things, or the whole thing would fall apart. It would require such a small subset of traditionally understood professional editing features that you'd wind up retraining yourself to try and think like the NLE, rather than try to edit the way you know works.

It is disconcerting the amount of free thinking on a bit of the unknown that gets shot down immediately around here. It started in earnest on June 21st.

I thought this was the creative communities of the world.

There are certain parallels here with ways of thinking that dovetail with much of the frustration that isn't covered by the mainstream news outlets today, but is instead covered by individuals who are now gathering all over this country on social networks. The world is more self-aware than ever.

Not all ideas are great ones, but great ones come from all ideas. I know all of you have built things from scratch; it is part of the editing process. You try things, some don't work out. You chip away at it, try at new failures and successes, then share your ideas with others who might bring new perspectives and therefore, through hard work, make something better.

Instead, people expect everything to be perfect right out of the box on Day 01, forgetting that all great stories started with a rough cut.

Your edits aren't perfect on the first pass. You need time to step away, and think about new ways to approach the same subject with the tools and footage at hand, tomorrow. You find the thread. Admittedly, disagreement is part of that process.

Perhaps voice activated editing isn't a perfect way to edit, but what's wrong with talking about it in a way that you don't get accused of being a member of a suicidal cult? It is these new ideas that prevent intellectual suicide. Why is everyone so terrified?

[Jeremy Garchow]"Perhaps voice activated editing isn't a perfect way to edit, but what's wrong with talking about it in a way that you don't get accused of being a member of a suicidal cult?"

Because if you spend more than a few seconds thinking about what your idea could be like, it's not hard to realize how it wouldn't be all that useful. There are plenty of ideas where it's almost immediately apparent that they don't work in their current form.

[Jeremy Garchow]"It is these new ideas that prevent intellectual suicide. Why is everyone so terrified?"

Why is it every time someone doesn't jump in with both feet on every little whim here that they are "terrified" of something? Why use that word? Can't the people who don't lap up every little new thing be given better respect than just "you're terrified!"

[Gary Huff]"Because if you spend more than a few seconds thinking about what your idea could be like, it's not hard to realize how it wouldn't be all that useful. There are plenty of ideas where it's almost immediately apparent that they don't work in their current form.

It is these new ideas that prevent intellectual suicide. Why is everyone so terrified?

Why is it every time someone doesn't jump in with both feet on every little whim here that they are "terrified" of something? Why use that word? Can't the people who don't lap up every little new thing be given better respect than just "you're terrified!""

Judging from your last few responses to me, it's quite obvious you have some sort of grudge against me for whatever reason, it's fine.

"Voice activated editing" might not be for you, but it doesn't mean it's not for the editing community at large, or even a select few. I'm not jumping in with any feet, I tend to wade then submerge. Forgive me for having thoughts that might not make sense to you. I'm not scared of what I don't understand, even mistakenly.

What if Siri could be used to make a transcript? What if I asked Siri to call me when a transcode is done so I can go for a walk? What if I asked Siri to make all my clips dual mono instead of stereo? What if Siri...?

And didn't Apple say Siri's in beta? Shit, let's give it at least a moment to sink in before calling it useless.

[Jeremy Garchow]"Judging from your last few responses to me, it's quite obvious you have some sort of grudge against me for whatever reason, it's fine."

Considering how you reference others who don't share your point-of-view on things, I feel like it's more likely you're just reflecting off me instead of there being actual hostility on my end.

[Jeremy Garchow]"Forgive me for having thoughts that might not make sense to you. I'm not scared of what I don't understand, even mistakenly."

Sorry, guess I'm just too "terrified" of the future...or something...

[Jeremy Garchow]"What if Siri could be used to make a transcript? What if I asked Siri to call me when a transcode is done so I can go for a walk? What if I asked Siri to make all my clips dual mono instead of stereo? What if Siri...?"

The topic was about editing using voice commands, not about doing all the side housekeeping stuff. If you want to bring that up, then please do so, and don't try to make a case against something I haven't actually pooh-poohed.

[Jeremy Garchow]"And didn't Apple say Siri's in beta? Shit, let's give it at least a moment to sink in before calling it useless."

Again, such as in this comment. I got a 4S and I use Siri all the time. Where did we take a turn from editing via voice commands to the simple timesavers that Siri brings?

[Gary Huff]"Because if you spend more than a few seconds thinking about what your idea could be like, it's not hard to realize how it wouldn't be all that useful. There are plenty of ideas where it's almost immediately apparent that they don't work in their current form."

Actually, after thinking about it for a while, I think voice activation could have a pivotal role in editing. Not for making cuts and trims and all that; I don't think it'd be good for that at all. But for searches and media organization it could be quite a time-saver and really powerful.

"Show me all my two-shots"

"Hide all my 2k footage"

"Show me clips labeled as 'exteriors'"

"Hide any clips that have been cut into the sequence"
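To make the search idea a bit more concrete, here is a rough Python sketch of how already-transcribed commands like the four above might be mapped onto clip filters. Everything in it is an assumption for illustration: the `Clip` fields, the `RULES` table, and `apply_command` are hypothetical names, not part of any NLE or of Siri, and the speech-to-text step is assumed to have happened elsewhere.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Clip:
    # Hypothetical clip metadata; a real NLE would expose its own fields.
    name: str
    shot_type: str = ""
    resolution: str = ""
    labels: set = field(default_factory=set)
    used_in_sequence: bool = False


# Each rule pairs a phrase pattern with a test that decides whether a clip
# stays visible. Patterns are matched against lowercased text, so metadata
# values are assumed to be stored in lowercase as well.
RULES = [
    (re.compile(r"^show me all my two-shots$"),
     lambda clip, m: clip.shot_type == "two-shot"),
    (re.compile(r"^hide all my (\w+) footage$"),
     lambda clip, m: clip.resolution != m.group(1)),
    (re.compile(r"^show me clips labeled as '([^']+)'$"),
     lambda clip, m: m.group(1) in clip.labels),
    (re.compile(r"^hide any clips that have been cut into the sequence$"),
     lambda clip, m: not clip.used_in_sequence),
]


def apply_command(clips, command):
    """Return the clips that remain visible after one transcribed command."""
    text = command.strip().lower()
    for pattern, keep in RULES:
        match = pattern.match(text)
        if match:
            return [clip for clip in clips if keep(clip, match)]
    raise ValueError(f"unrecognized command: {command!r}")
```

A fixed phrase table like this only works for the exact wordings it knows. That is roughly the limitation raised earlier in the thread: the hard part is how much the machine understands, not the voice capture itself.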

For editing, I believe the future is in two-handed multi-touch desktop pads. I envision a world where you can edit with gestures because editing, for me anyway, is a very visual endeavor. I don't think I could ever edit effectively with voice commands alone. It would be like playing chess without seeing the board. You can do it, but what the hell does "pawn to Queen's bishop 3" look like?

I'm sure that voice control will be implemented some time.
I see it being very useful for managing media, because that is a keyword-based system (logging, tagging, and retrieving media...), and for some other functions (alternate edits, play, repeat, go to..., export to...).
Voice will replace any typing. I think that's great (I've been complaining about how much typing it takes to get things organized in FCPX).

However, I don't see it being so useful for the actual editing, trimming, or adjusting of values. That would be very tiring for the editor and for anyone nearby.

[Craig Seeman]"Somewhere there is an editor now suffering from crippling arthritis or repetitive stress syndrome or was paralyzed in an accident who could continue his creativity with this feature."
Craig, you are a well-intentioned and very positive guy.
There are thousands of editors all around the world suffering from FCP EOL and Apple doesn't give a damn about them. Apple wouldn't lift a finger for the people in the painful circumstances you are describing, except for marketing purposes.
And no, I'm not being cynical here.

[Jeremy Garchow]"...but what's wrong with talking about it in a way that you don't get accused of being a member of a suicidal cult? It is these new ideas that prevent intellectual suicide. Why is everyone so terrified?"

Jeez, Jeremy, sounds like you're about one notch shy of Godwin's Law.
What I find funny is that in your desire to talk about this, if the talk doesn't go in a direction you like, or support some new whim of Apple, you resort to hyperbole and rhetoric. Is it that big of a deal that some don't think this is all that handy, or can see flaws in the concept? In your desire to talk about things, try to remember not everyone has to adopt/buy every new thing that comes along just to prove their worthiness, and that pragmatism has its merits.

Scott, I agree that that post was way out of line, but I'd like to expand it beyond "disagreement with me = terrified."

Add "disagreement = old man with his head up his ass, lack of vision/imagination, standing in the way of progress, etc.," to "like it = brainwashed fan boy" to the list of inappropriate directions for conversations, even in this full-contact forum.

Just because it doesn't have somebody's name in it doesn't mean it's NOT a personal attack. This is worse in my mind. Instead of attacking one guy, it's attacking an entire class of people, and it's not okay.

So please, back up off of that stuff in either direction. NFL football is full-contact, but there are still rules. Here, there's only one rule, which I'll remind everyone of again: talk about topics, not posters.

[Scott Sheriff]"Jeez, Jeremy, sounds like you're about one notch shy of Godwin's Law.
What I find funny is that in your desire to talk about this, if the talk doesn't go in a direction you like, or support some new whim of Apple, you resort to hyperbole and rhetoric. Is it that big of a deal that some don't think this is all that handy, or can see flaws in the concept? In your desire to talk about things, try to remember not everyone has to adopt/buy every new thing that comes along just to prove their worthiness, and that pragmatism has its merits."

If it's not handy to you, why does the OP need to be shunned to the corner?

This isn't just about Apple, it's about the bigger scope, just as Andrew Richards pointed out.

I don't care if you disagree, I welcome disagreement. I'm just really tired of trying to talk about things intelligently and then someone says I'm stupid or I follow a cult. I happen to like thinking about crazy ideas; it just so happens that Apple has a few on the table, and sometimes the logical implications of those ideas aren't immediately clear. If Google built a new NLE that I happened to like, I'd talk about that too.

There have been plenty of conversations I have been involved in on this forum where disagreement abounds. We are all still friends and respect each other.

I'm sorry if you think I am approaching Godwin and his law, I don't really know what it has to do with this.

Several companies walk into a bar and start talking about how they could make a phone you could use just about anywhere.

Sitting at a table near them are some members of an elite group called the Creative Chickens. They overhear the companies talking about their idea and decide to give their opinion.

I don't think you thought this idea through very well.
Do you know how stupid you'd look walking down the street talking on a phone?
Stop making a phone out to be more than it is.
Working in silence is preferable.
Bad idea!
I can currently walk down the street and talk to myself, but it's pointless!
Yeah, quite the time saver.
Using a phone at home is just about the right level of phone usage.
To make a phone work for talking just about anywhere, you'd need a pretty stripped down phone.
You'd look like a big green Kool-Aid guy.
Who the h*ll wants to talk all day?
If you spent more than a few seconds thinking about what your idea could be like, it's not hard to realize how it wouldn't be all that useful.
Sorry, guess I'm just too "terrified" of the future.

Well, companies will be companies. So, they go out and create the phone anyway. They didn't listen to any of the Creative Chickens.

Several years later, the Creative Chicken members are sitting in the same bar. Each one of them is talking to someone else on a phone they can use just about anywhere.

Another group of companies walks into the bar. They have an idea where you could use a phone just about anywhere, without having to talk into the phone. You can just put a little speaker and mic in your ear.

Once again, the Creative Chickens step in and decide to give their opinion.

I don't think you thought this through very well.
Do you know how stupid you'd look walking down the street talking without a phone?
…

[Sohrab Sandhu]"Makes me think it could be the start of something very interesting. How about Voice Controlled Editing? Just think of the infinite possibilities & the speed with which you would be able to work."

If we couple Siri with our iSight webcams, then we can edit like producers:

Apple reportedly had a lot of negotiating to do with Nuance in order to actually use Siri in iOS as the voice-controlled assistant it was as a third party app. Nuance's voice recognition is what turns speech into text, Siri's own tech is what interprets that text and "understands" what you mean when you use plain language to ask it questions.

Apple would need to license much more from Nuance to bring its speech recognition to more than Siri (and maybe it has), but the cool thing that Apple owns in Siri is semantic search. Semantic search holds powerful potential for any application that has a lot of metadata to search. What you are more likely to see than voice-controlled editing is semantic search in FCPX and/or other Apple products. Maybe it even becomes part of the CoreData API.

Maybe voice controls will take on a more significant role for accessibility for OS X in general and FCPX in particular, but I'm much more intrigued by the semantic search angle, where searching your content won't have to happen with syntactically correct search terms, but natural language.

This is an interesting discussion. These capabilities have been a reality since Windows Vista on the PC. I've done a couple of courses at our local library for people who have limited mobility as far as their hands and arms go.

I tested this "theory" out on both After Effects and Premiere, and, given the full featured capability of the voice commands, it works pretty well. There's a feature called Mousegrid, which allows you to use the mouse just the way you would with your hands, by drilling down on a sector grid which appears on the screen when you say "mousegrid". It can be as simple as saying "Mousegrid, 1, 10, 5, double-click", and you're running your software from the desktop, or say "Premiere Pro, Run", and it's there.

While I wouldn't switch to editing with it, I can see some very real, and very helpful capabilities to the editing public. I know several editors who've ended up with carpal tunnel; this would allow them to continue with their work, albeit a bit slower, in some cases.

The thing about AVFoundation is that it is just one part of handling media in OS X. QuickTime used to be the one-stop shop for media handling, and now Apple has divvied the work up among several frameworks. So a player like QuickTime Player will call upon AVFoundation to deal with playing and encoding assets and maybe QTKit to deal with MOV wrapper data.

[Jeremy Garchow]"So why is everything still wrapped in QT in FCPX?"

I'm not sure. I can't find any definitive statements in the dev docs about wrapper support, MOV or otherwise. Perhaps the private Camera I/O SDK that Apple has shared with the camera folks will answer that question.

[Jeremy Garchow]"If someone wanted to build what used to be a QT component to send non QT wrapped media to be available to the OS, how would one do that in AV Foundation?"

QT Components still seem to have a place in the new order. Again, I can't locate definitive statements in the dev docs, but Apple still distributes them as the ProApps codecs and Telestream's Flip4Mac components work for playback in QTP10.1. Those both use .component files stored in /Library/QuickTime. The common theme here is that the APIs for old QuickTime are deprecated, but the wrapper and the plugins live on.

[Andrew Richards]"I'm not sure. I can't find any definitive statements in the dev docs about wrapper support, MOV or otherwise. Perhaps the private Camera I/O SDK that Apple has shared with the camera folks will answer that question."

This is my number one question with FCPX/Lion right now. I hope this makes a bunch of sense one day. I hope we can get access without rewrap to all these digital media formats. :)

By the way, I think CMX proposed a voice activated linear editor somewhere between the 3600 and the Omni. I don't think it ever got to actual hardware at NAB, but I used to have some marketing material for it in my "collection".
I think before voice operation we will see a well developed ipad version of FCPX, and I hope something similar for Android from Adobe. I think they would be useful even for the "Professional".

My list of suggested voice commands may not be the best example. But, I think there are probably commands that could benefit from using your voice.

It may not be useful for some users, such as those (many, I would imagine) who have become so proficient at using a keyboard and all its modifiers to control the vast majority of what they need to get done. I can see where having a voice version of the command would probably even slow them down.

A change like this reminds me of Apple's Mighty Magically Mystical Trackpad. (I couldn't think of 4 T words) I tried to like it. I spent an entire 3 or 4 minutes of extensive user testing. It didn't work for me. I've been using a mouse for too long. I don't care how much better and faster a trackpad is over a mouse. I don't care how much I could benefit from all kinds of exciting new gestures. I'm too set in my User Interface ways. Sure I can drive on the other side of the road if I have to. But I run into a lot more stuff doing that. So I'm not moving to Singapore.

I've been using Siri on a 4S for messaging now for several days and I've found it very useful. It's not perfect, but it works well enough for the vast majority of my needs. (I haven't asked it to open the pod bay doors yet, but I want to) And because I never won any texting championships, I actually find it a better solution than Apple's touch screen keyboard. I can do it with one hand and I don't have to look at the phone while I'm doing it. It's the only way I'm texting now. (but then, I don't mind talking alone) Unless Siri gets it so wrong I have to correct it. Still, overall, faster for me.

While there are some changes that I may never be able to make, I can see how voice commands could be useful. Especially for repetitive tasks that don't have a keyboard shortcut (because I was too lazy to create one).

But, it would have to work really, really well and it would have to be really, really fast. And I've never used a computer that's fast enough. Hey, maybe the next Mac Pro! (oops)

Anyway, I think it's more than an interesting idea to debate. (and boy, is this a great place to debate stuff)

[Kevin Patrick]"My list of suggested voice commands may not be the best example. But, I think there are probably commands that could benefit from using your voice."

This is it. SIRI is a digital assistant, not an editor. There are bunches of commands that don't belong as keyboard shortcuts (my rule: if it's more than 2 modifiers, it's not a shortcut) that would be just fine as voice commands.

The same is true of the desktop in general. It's inconceivable to me that SIRI isn't in the next OS update. It HAS to be, and it's overdue.

THAT SAID, I think that there's a basic misunderstanding of what SIRI is. For the best possible insight, check out "Sh^t SIRI Says," which is far, far more inappropriate for family viewing than the name implies, but the funniest thing I've read in a long time. It's screencaptures of actual queries and responses that will blow your mind. I promise that you'll gain a whole new appreciation for the possibilities of putting THIS in your edit suite.

Plenty of drugs, sex, and bad language that works fine for MY family, but may not for yours. COWveat lector.

[Tim Wilson]"THAT SAID, I think that there's a basic misunderstanding of what SIRI is. For the best possible insight, check out "Sh^t SIRI Says," which is far, far more inappropriate for family viewing than the name implies, but the funniest thing I've read in a long time."

Is it me or is everyone missing the point here? Siri is voice recognition software. What editor wouldn't want that? NOT so you can talk to the software to tell it what to do BUT so you can have all your audio analysed and presented in text format with clickable links in that text to timeline position. Hell, you could even break down edits by text sentence for sub clips/tags. Would save a TON of time when it comes to interviews, etc. Yes, I know Adobe have played around with this, but from what I understand not that successfully.
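As a rough illustration of the transcript idea, the Python sketch below assumes some recognizer has already produced word-level timings as `(text, start_seconds, end_seconds)` tuples. That input format is an assumption, not something Siri or any particular product is known to provide. The sketch groups the words into sentences and gives each one in/out timecodes that a "clickable link" could jump to. The function names are made up for the example, and the timecode is non-drop-frame at an assumed integer frame rate.

```python
def to_timecode(seconds, fps=25):
    """Format seconds as non-drop-frame HH:MM:SS:FF at an integer frame rate."""
    total_frames = round(seconds * fps)
    frames = total_frames % fps
    total_seconds = total_frames // fps
    hours, remainder = divmod(total_seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}:{frames:02d}"


def sentences_to_subclips(words, fps=25):
    """Group timed words into sentence-level subclips.

    `words` is a list of (text, start_seconds, end_seconds) tuples, assumed
    to come from a speech recognizer that also emits punctuation.
    """
    subclips = []
    current = []

    def flush():
        # Close the sentence collected so far and record its in/out points.
        subclips.append({
            "text": " ".join(text for text, _, _ in current),
            "in": to_timecode(current[0][1], fps),
            "out": to_timecode(current[-1][2], fps),
        })
        current.clear()

    for word in words:
        current.append(word)
        if word[0].endswith((".", "?", "!")):
            flush()
    if current:  # trailing words without closing punctuation
        flush()
    return subclips
```

Each returned dictionary could back one line of the on-screen transcript: clicking the text would move the playhead to `in`, and the `in`/`out` pair could seed a sub clip or tag for that sentence.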