First time accepted submitter jisom writes with something that will probably not be working come morning. Quoting the source: "Today, we managed to crack open Siri's protocol. As a result, we are able to use Siri's recognition engine from any device. Yes, that means anyone could now write an Android app that uses the real Siri! Or use Siri on an iPad! And we're going to share this know-how with you."
Basically, Siri sends the data to the processing server using non-standard HTTP extensions. Of note is that the audio is encoded using Ogg Speex.

While you could write an Android app or anything else, the protocol sends a unique ID with the request. That ID is unique to every iPhone 4S. The end result is that you can probably use your own ID for personal use, but if you try to sell an app for Android and include your ID with it, Apple will just blacklist it. So you will still need your own iPhone 4S.

While that may be true, would having the keys of all existing iPhone devices be a large enough sample? Or maybe you could link to research that can successfully predict the keys OpenSSL generates. No, Debian OpenSSL doesn't count.

No doubt. Those users are the worst thing about having an Android phone.

I like my Android phone. It does what I need, it does it fairly smoothly. It's not as slick as my iOS devices, but I'm used to the downsides of Android and for the moment I'd rather deal with them than deal with the downsides of iOS. But the fanbois are just awful.

There is nothing available on Android that's anywhere near as functional as Siri (seems to be in the ads). Voice recognition is OK (but largely dependent on the quality of your device - if the manufacturer [HTC, cough] used cheap mics, no chance), but unless you want to call someone or search Google, you're going to need to do it the old fashioned way.

And yes, I'm one of the rabid Android fanboys you seem to be encountering so often;)

If it is correctly implemented, that's easier said than done. It is not necessarily a key-value pair that is cryptographically verified (i.e. there exists a purely arithmetic function f(x,y) that returns true iff (x, y) is a valid pair, and the client is allowed access if it supplies a correct (x, y)). This kind of system would be crackable: just find another arithmetic function f' that returns y for some x (one usually exists).

However, if Apple knew what they were doing (and they usually do), it's a GUID [wikipedia.org] database stored on Apple's server. Say they generate a 128-bit random access code for each manufactured iPhone, and the only way you can use Siri is to supply a valid GUID. Such a system is virtually uncrackable, because even for a 128-bit GUID and 200 million iPhone 4S units manufactured, it would take a staggering 1.7 million trillion trillion guesses (i.e. HTTP requests to Apple servers), on average, to hit ONE correct GUID. If one request took a mere 100 bytes with its TCP/IP headers, you would have to transfer 170 million yottabytes (170 million trillion terabytes) of data to find one valid access key.
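For anyone who wants to check the arithmetic, here is a minimal sketch. It only uses the assumptions stated in the comment above (a uniformly random 128-bit ID, 200 million valid IDs, 100 bytes per request); none of these are confirmed figures from Apple. With those inputs the numbers work out to roughly 1.7 × 10^30 guesses and about 170 million yottabytes.

```python
# Back-of-the-envelope check of the guessing estimate.
# All three inputs are the commenter's assumptions, not known Apple figures.
ID_BITS = 128               # size of the random access code
VALID_IDS = 200_000_000     # assumed number of whitelisted phones
BYTES_PER_GUESS = 100       # assumed size of one request on the wire

keyspace = 2 ** ID_BITS

# With VALID_IDS valid values scattered uniformly over the keyspace,
# a random guess succeeds with probability VALID_IDS / keyspace, so the
# expected number of guesses until the first hit is keyspace / VALID_IDS.
expected_guesses = keyspace / VALID_IDS

total_bytes = expected_guesses * BYTES_PER_GUESS
yottabytes = total_bytes / 1e24  # 1 YB = 10**24 bytes

print(f"expected guesses: {expected_guesses:.2e}")
print(f"data transferred: {yottabytes:.2e} YB")
```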

The reason you can't do this is that Siri communicates over HTTPS, so it is not vulnerable to man-in-the-middle attacks. Hence, you cannot eavesdrop on somebody else's iPhone.

The reason they could listen to the traffic in the article is that they had access to the root certificate on the iPhone itself. You can do this if you have physical access to the phone, but obviously you can't just do this over the air to other people's phones.

This presumes that the GUID assignments are drawn from the 128-bit GUID space using some guaranteed form of true randomness.

Given the number of phones in existence, and that new phones will have to be whitelisted as time passes (and that random assignments run the risk of collision), it is more likely that the GUID assignment is performed in some sophisticated pseudo-random fashion. As such, identifiable patterns could be detected given a sufficiently large number of known whitelisted GUIDs.

Once you have that information, and perhaps some other information that Apple might use in the GUID assignment algorithm (serial number, manufacturing site, date of manufacture, etc.), it should be possible to determine which GUIDs should be valid.

This sounds like an opportunity for a naughty iDevice app developer, who should already be able to get such a list by having their app phone home and request the device UUID as part of a purchase validation mechanism. (A popular app could quickly get several hundred active unique IDs to work with, perhaps more.)
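To illustrate why the source of randomness matters here, a minimal sketch follows. It does not describe Apple's scheme, which is unknown; `prng_ids` is a hypothetical weak generator (IDs drawn from a seeded, non-cryptographic PRNG) and `csprng_ids` is a hypothetical strong one (IDs drawn from the operating system's cryptographic source via Python's `secrets` module).

```python
import random
import secrets


def prng_ids(seed, count):
    """Hypothetical weak scheme: 128-bit IDs from a seeded, non-cryptographic PRNG."""
    rng = random.Random(seed)
    return [rng.getrandbits(128) for _ in range(count)]


def csprng_ids(count):
    """Hypothetical strong scheme: 128-bit IDs from the OS cryptographic source."""
    return [secrets.randbits(128) for _ in range(count)]


# Anyone who learns or guesses the seed can regenerate the whole sequence,
# so every "random-looking" ID issued from it becomes predictable.
issued = prng_ids(seed=2011, count=5)
attacker = prng_ids(seed=2011, count=5)
print(issued == attacker)  # True: the sequence is fully reproducible

# IDs from the cryptographic source have no seed to recover; two batches
# will, with overwhelming probability, have nothing in common.
print(set(csprng_ids(5)) & set(csprng_ids(5)))
```

The point is only that "looks random" and "is unpredictable" are different properties. Whether any pattern exists in the real IDs is exactly the open question raised above.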

Surprisingly, when we did, we wouldn't gather any traffic when using Siri. So we resorted to using tcpdump on a network gateway, and we realised Siri's traffic was TCP, on port 443, to a server at 17.174.4.4.

The app even validated that the cert used was signed by a trusted CA. Fortunately, the iPhone 4S allows you to add your own trusted CA to the trust chain.

That's what they wanted people to think. 99% of all phone apps have very little to do with the actual phone and instead they're just quick reference URLs to some external site that does most of the work. Of course they tie all the apps to the phone so that you can't bypass the store.

Apple's actually pretty quick to reject apps for not offering enough functionality over a website. Simply embedding a site in a webview and calling it an app (what was implied to be happening upthread) is pretty much a 100% guaranteed way to get your app rejected.

Why would they waste the processing horsepower? It would eat the battery if it was even at all possible. They can do higher quality recognition on their servers anyway. The customer does not need to know where the processing is done as long as "it just works". To the consumer, and even some more technically inclined, it's magic -- and that is the real genius in the way Apple presents its products. They make people feel like they're somehow in the future, that they're talking to an intelligent phone, that Saint Steve has somehow created artificial life and they get to own a piece of this future for the price of a modest chunk of change and a two year contract.

Doing the processing on the server seems very slow to me - I can find a contact much faster by pressing the first few letters than waiting for the round-trip latency to siri.

Yep. It's extremely annoying, actually, because Siri replaces the existing voice commands. So doing something like "call brother" - which used to take maybe a half second - takes a good three seconds or so of lag time. More annoying are things like "play playlist driving songs" - first you have to wait for the three-second round-trip processing, then you have to wait for the iPhone to decide which playlist matches ("Looking for playlist driving songs," Siri says), then you have to wait for her to narrate "playing playlist driving songs" before the music actually starts.

Of course, now you can say things like, "Boy, I'd love to hear some driving songs" or "Driving songs would sound good right about now." See? There's less of the "command" protocol and more like you're speaking to an actual person!

Of course, the person you're talking to is a little slow. But that's better than having to use some specific syntax, right?

So turn it off [apple.com] : "If you wish to use Voice Control while you are not connected to the Internet, turn Siri off from Settings > General > Siri. Make sure to turn Siri back on when you have Internet connectivity and you wish to use it again."

Given that Apple are touted as masters of seamless and intuitive user interface design, how come this process isn't automated? It would seem to me that it'd be pretty trivial to, at the very least, detect lack of network connectivity, and turn it off accordingly.
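The automatic fallback suggested above could look something like the following sketch. Everything in it is hypothetical: `choose_voice_backend`, the backend names, and the idea of passing in a connectivity probe are illustration only and say nothing about how iOS is actually structured.

```python
from typing import Callable


def choose_voice_backend(is_online: Callable[[], bool]) -> str:
    """Pick the server-backed assistant when online, else on-device voice control.

    `is_online` is a hypothetical connectivity probe supplied by the caller.
    A probe that fails with OSError is treated the same as being offline.
    """
    try:
        online = is_online()
    except OSError:
        online = False
    return "siri" if online else "voice_control"


print(choose_voice_backend(lambda: True))   # siri
print(choose_voice_backend(lambda: False))  # voice_control
```

Taking the probe as a parameter keeps the decision logic separate from however connectivity is really detected.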

It's terribly obnoxiously slow. It's also a lot broader than previous voice-command efforts. I set a baking timer by saying "Siri, set an alarm for twenty minutes from now." I had no idea that "twenty minutes from now" would be something that Siri understood. It just seemed like it would make sense. And it just worked. "Text my wife that I'll be about 10 minutes late" works too.

Well, it works when the network is responding. And it works terribly slowly. But it really is a step towards natural language understanding of voice. Or rather, unlike a lot of other efforts, I feel like the phone is trying to understand me rather than the other way around.

Nope, and that is the scam. Basically you are calling a service. Thus they could make Siri available on every iProduct with zero effort. That they decided to hold it as an exclusive feature for the 4S to try and create the 'gotta upgrade' stampede is truly lame. Keeping it to iProducts is ok, they ain't giving away a hefty compute farm after all, who do ya think they are after all, Google? But locking access to the service to one submodel of one product line is a terrible idea.

It's still a bit scammy, but I would guess they're using early adopters as a massive beta test before rolling it out to iLife in general, so rather than depriving anyone, they're being cautious and scaling up usage slowly. Think "Apple Newton," and it's reasonable to suspect the company may still be a little gun shy with this kind of tech. Even if it is running "in the cloud" instead of on the device, there's a whole lot that could go wrong with Siri [siriousfails.com]. (Page is for entertainment purposes only. Not to be construed as actual examples. I am a non-attorney spokesperson.)

More than that, availability matters here, and they want the initial adopters to have a premium experience before they roll it out to the hoi polloi and everything goes pear-shaped when they run into the usual scaling issues. You know, like the ones AT&T ran into with the first iPhones.

Crickey! Will you loo' at that. We're so very lucky! You almost never see a four digit this far from its native habitat of lurking an' she's being stalked by this five digit that's almost as rare.
It's times like this I'm gla' I don't work with lizards that might eat me!//Window seat please...

It's my understanding, from reading the articles by a guy who managed to hack it onto the 3GS, that the 4S actually has some pretty good noise-canceling hardware onboard. Whether or not that's true, I can't say, but from the article I read, apparently things needed to be VERY quiet or the speech-to-text would fail hard.

It also means that to have Siri work you have to pay for a data account (preferably an unlimited account - this will eat a lot of data if used frequently), as otherwise it will simply not work.

This may be a non-issue for markets like the US, where you can only get a phone in conjunction with a heavily overpriced contract that by default includes data, but it is an issue for other markets where plans and phones are separated.

I don't have a mobile data plan with my smart phone, don't see the need for it really,

Yes, that is what the ad campaign would lead you to believe. The reality is that all of the work is server side and ANY client would work equally well. You could use a basic no-frills cell phone, a landline or whatever to talk to Siri and get voice responses. Any phone capable of hosting an app could interface with it and receive URLs or other trigger events back with a fairly simple client-side application. And there are no technical limitations preventing the client from the iPhone 4S running unmodified on any of the iPhones with the same iOS revision installed. Simply put, there is nothing unique to the iPhone 4S that enables Siri. But had they rolled it out as a regular iOS update or an app in the Store, there wouldn't have been a 'killer feature' to hype for the new phone to drive the lemmings into the store for an upgrade. That is the scam I refer to.

The most alarming fact, for me, is that they are sending all my speech data over the Internet to some enormous Cloud database. Oh, and while they have it all, I must trust Apple now that they are not gonna mine this data and send it backdoor to advertisers and other interests.

Yet when I call a friend, only my friend receives my voice, and he receives it as audio. The phone company doesn't store this (unless they've been requested to wiretap your line - not very common outside of the US luckily - and even then it's normally stored as audio only). They're not even allowed to listen in to it when it happens; they just have to transmit the audio signal from my phone to my friend's phone.

In this case the audio goes to the vendor of your phone, which then attempts to actively listen in to it and make out what you're trying to say, and as such can store this in a machine processable format. That's the big difference.

Speech recognition isn't too CPU intensive, but it's *massively* memory intensive. It's not unreasonable for speech recognition engines to eat up a gig of ram, and the 4S only has 512mb. However, push it to a server with lots of ram and it can handle lots and lots of simultaneous speech recognition queries. It's tailor made to be a server-side task. At least until phones have gigs of free memory that aren't needed.

Actually there was an article at /. the other day that talked about this fact already. For most people though it seems like it's the phone doing it and really that's all that matters for 90 percent of the users.

it's a consortium. Dolby developed AC-3, and some tools they've developed are no doubt in the AAC spec, but AAC is essentially mp3 without the filterbank (which of course changed it a ton), and some nice features like long-term prediction, noise substitution etc etc.

Isn't AAC just the MPEG4 version of what we know as mp3 (which is really just MPEG1/Audio layer 3)? There are already many open source implementations of AAC, so I don't see it as the same thing.

The real problem with AAC is the MPEG patent swamp. Even if Apple were to release an open source codec, it would still be under the same shadow that hangs over anyone that isn't lining the pockets of the MPEG licensing body.

...in the opinion of a spin-off from SRI; it might've been easier for them to go with an open source codec than to license a non-open-source codec. Remember, Apple bought the company that developed Siri; they didn't develop it themselves from Day One.

I'm not saying that the availability of the codec as open source was one of the reasons for the choice and that, if the open-source availability weren't an advantage, it would have lost to some closed-source codec; I'm just saying that one shouldn't assume th

If Apple is learning anything from Google, it's that customer info is valuable. Siri could easily become an advertising platform that rivals Google. Targeted advertising, where companies pay Apple for premium listings (e.g. asking Siri about a pizza place returns Pizza Hut, who paid the most for that keyword).

If that's their angle, they might welcome more traffic to Siri.

<sarcasm>Yes, they are so thrilled by it. They wanted everyone to be able to connect to their servers, but they did not know how to make their protocols public. Being hacked has solved that problem!...</sarcasm>

What this crack means (unless Apple has additional security measures) is that Siri will need a lot more processing power and, what is worse, there is no way to predict how much power it will need now. And all that without getting to dip into the related profits (selling hardware, associated programs, etc.). I bet they are throwing a party right now just to celebrate!

Seriously, WTF? The crack does not give anything interesting or new away; it just puts a third party in a position where it can be abused. If the people behind Siri wanted everyone to connect, they could have said so themselves. Those are two very simple thoughts that everyone on /. could understand, yet people instead follow the most convoluted logic to justify it.

At least we are not discussing crimes here. If we were talking about murders, I bet some of you would post things like "Thanks to the serial killer that murdered his wife and children, now he can choose a new wife and have more kids!"

Where do you get that Apple relies on ads? Never mind "relying on ads a LOT"?

Results from Wolfram Alpha are ad-free. In-Siri search results are ad-free. Creating events, reminders, call/texting, are all ad-free. The only time the user might see an ad is when you click on the Web Search for something Siri couldn't get an answer for right away, but that sends you to the browser where ads are fair game. Apple's own website is devoid of ads (can't say the same for Microsoft--bottom of their homepage was a banne

Umm, fact check: Apple doesn't even slightly rely on ads. At all. Apple is not an advertising company, at all.

They have the iAd product, which is little more than a hobby; Apple's profit is very, very clearly from direct hardware sales to customers -- by a /vast/ margin. Not from ads, ITMS, Apps, any of it. It's hardware sales to customers.

It's nothing like Google's business model.

Now, it's possible Siri may be a future ad-related or information-related revenue stream, but only if it can be leveraged without harming the hardware sales -- because THAT is what Apple makes its dough on. It'll probably never be a huge deal, though it may be interesting.

Why is Siri cloud-powered? Perhaps because it has to be. Siri is a lot more than simply a speech recognition system -- even though the best speech recognition apps I've seen on iOS have also involved the cloud.

Just that alone seems to imply that doing it well may take more processing power (and battery hogging) than mobile devices have. But Siri does a lot more processing beyond that, juggling the possible recognition results based on context, thus changing its interpretation of the phrase and then re-evaluating again.

All three companies have VERY different business models.

Google relies on profits from its ad business. Apple relies on profits from its hardware sales. Microsoft relies on profits from published software.

Each has bits and pieces that go into others, but the /vast/ majority of their profits comes from their core business.

I admit to only being passingly familiar with Google and Microsoft's financials. But Apple's are very, very, very clearly oriented towards consumer hardware sales. Not ads, not music, not apps, not services. All of those things do nothing but maintain the ecosystem and thus make the devices more attractive. Apple's actual profit on them doesn't even compare to their actual driving businesses.

As you know, the “S” in HTTPS stands for “secure”: all traffic between a client and an HTTPS server is encrypted. So we couldn’t read it using a sniffer. In that case, the simplest solution is to fake an HTTPS server, use a fake DNS server, and see what the incoming requests are. Unfortunately, the people behind Siri did things right: they check that guzzoni’s certificate is valid, so you cannot fake it. Well, they did check that it was valid, but the thing is, you can add your own “root certificate”, which lets you mark any certificate you want as valid.

Some Apple software (parts of iTunes) goes further and checks that the certificate presented by the server is actually signed by Apple. If the Siri software did this then the server would be impossible to fake man-in-middle-wise without hacking the client itself. Just checking that the certificate is valid is pretty useless protection - any certificate could be valid, what you care about is whether the server is who it says it is.
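As a rough sketch of the stricter check described above, here is certificate pinning in its simplest form. This is a swapped-in variant, not what iTunes or Siri is known to do: instead of verifying that the issuer is Apple, it pins the SHA-256 fingerprint of one expected server certificate. `EXAMPLE_CERT_DER`, `PINNED_SHA256` and `is_pinned` are hypothetical names, and the certificate bytes are a placeholder.

```python
import hashlib
import hmac

# Placeholder standing in for the DER bytes of the one server certificate
# the client is willing to talk to (an assumption for illustration).
EXAMPLE_CERT_DER = b"placeholder bytes standing in for a real DER certificate"

# The fingerprint would be baked into the client at build time.
PINNED_SHA256 = hashlib.sha256(EXAMPLE_CERT_DER).hexdigest()


def is_pinned(cert_der: bytes, pinned_hex: str = PINNED_SHA256) -> bool:
    """Return True only if the presented certificate matches the pinned fingerprint."""
    presented = hashlib.sha256(cert_der).hexdigest()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(presented, pinned_hex)


print(is_pinned(EXAMPLE_CERT_DER))                          # True
print(is_pinned(b"certificate signed by a user-added CA"))  # False
```

In a real Python client the DER bytes could be obtained from `ssl.SSLSocket.getpeercert(binary_form=True)`. With a check like this, a certificate chained to a root the user installed on the phone would still be "valid" to the system trust store but would fail the pin.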

It's not a "pretty useless protection". It's not just checking that the certificate is valid, it's also checking that the certificate authority has a corresponding root certificate installed on the iPhone. It stops anyone who doesn't have access to the phone from eavesdropping or manipulating the data.

I think you have missed my point. If the certificate is signed by some random authority it is "valid", but that only says that the authority (whoever that is) trusts the server. If the client did as it should (and as other Apple apps do), it would check that the certificate is signed by an authority that it can verify directly using the authority's public key built into the client.

That way it would be impossible to spoof the server and perform man-in-the-middle attack without either a) knowing the priv

I knew they were doing some heavy lifting on the server side, cause obviously it doesn't work without a network connection.

However, I figured they would at least do an initial processing pass on the phone and pass up the data points to the server instead of the raw audio. That at least would make sense, and you'd be able to pass much smaller amounts of data. It would also explain the need to have better hardware on the phone. Sending the raw audio seems insane.

I don't understand these hackers; they only promote the lock-in policies of Apple, because having Siri for a while may lure more users to Apple. After a while, Apple will just close the hole by using the UIDs of the phones, like others mentioned, or some kind of unbreakable private-key cryptosystem.

Further, all those jailbreaking tools which are available just give Apple users a reason to say "hey, I'm not locked in, I can always jailbreak my device".

While you can root your device now, it does not mean you can root it forever. Apple devs are smart enough to make the system close to unbreakable, because cryptography is not that hard, and by the way, they are baking their own ICs now.

So I think Apple is just happy with this (relatively small) jailbreaking scene, just like Microsoft was happy with their software being illegally copied for a long while.