Oraclize claims to offer provably honest, secure retrieval of a web page by taking advantage of TLSnotary (a service that allows an auditor to verify whether a specific web page was accurately retrieved).

The purpose of Oraclize seems to be to make this information available to smart contracts.

But to my understanding, the only factor keeping the TLSnotary proof secure is that the person doing the auditing generates and withholds a secret piece of data until the person being audited provides them with the hash of the retrieved web page. A contract obviously cannot generate and withhold a secret. Doesn't this mean that the contract itself is unable to verify the TLSnotary proof?

It seems like some clarification is needed of how exactly Oraclize is handling the TLSnotary secret.

Oraclize also seems to offer a web tool allowing you to play the role of the auditor. Can multiple people audit the same TLSnotary proof in parallel? And suppose I do somehow manage to catch Oraclize cheating. How can I prove this to a third party?

Put more simply, how much can I trust Oraclize's service?

Don't take my questions the wrong way--I'm very excited by the idea of oracles actually proving their claims. I just like to ask difficult questions!

3 Answers

Using the techniques described here, they are able to provide some additional guarantees regarding the software running in the AWS instance and when/whether it has been modified since being initialised. The "proofs of honesty" they provide (and allow you to verify with their web tool) are the signed attestations of this AWS instance that a proper TLSnotary proof did occur (rather than the TLSnotary proof itself, which would be impossible for a third party to verify after the fact).

It's probably difficult for the average user to understand what this means in terms of security and trust, so let me elaborate a little regarding the security implications of this technique.

Major advantages of the Oraclize approach:

By involving an AWS oracle, Oraclize has made it harder for their data retrieval service to lie about the contents of a web page. So this approach is slightly better than a purely trusted oracle service.

Finally, verifying the signature of the AWS oracle is a comparatively "cheap" computational operation, so Oraclize's "honesty proofs" could potentially be verified on chain provided that the TLSnotarised content is short enough (like an API call).

Major disadvantages of the Oraclize approach:

Amazon themselves, or anyone able to hack or subpoena Amazon's AWS platform, can fake the "proofs of honesty" by stealing the AWS oracle's private key. To break applications based on Oraclize's service, however, this entity would probably also have to gain control of the Oraclize API server and/or smart contract.

If you do catch someone faking Oraclize proofs, there's no way to prove it to anyone else unless you yourself can obtain the AWS oracle's private key. For further clarity: if you retrieve some data from Oraclize and it's obviously wrong, there's no way for the public to know whether the server or the Oraclize service is the source of the problem. Either can just blame the other.

Don't forget that when retrieving a web page through an oracle, you're still allowing the web server to respond with anything it wants. So for third-party attackers who want to break an app that uses Oraclize, it would probably be much easier to just compromise the server (e.g. the exchange website, if an app uses it for market prices). With this point (and the previous one) in mind, it's a lot less hassle to just get the information provider to be the oracle. Oraclize should be thought of as a fallback for retrieving information from sources that aren't blockchain aware. Blockchain-aware sources should just sign their information directly.

In many information retrieval situations (where only two parties are involved), it would make a lot more sense for the parties to the transaction themselves to act as client and auditor in the TLSnotary proof. This gives you all the guarantees of Oraclize with none of the additional risks.

Things Oraclize can do to improve:

Ensure that their AWS oracle's code is hardened against VM side-channel attacks (e.g. signing operations are all constant time and constant memory).

Set up various simple static HTTPS pages and offer a large, on-chain bounty that pays out to any address retrieved from those pages via Oraclize's API. This would be a simultaneous bet on their part against Oraclize's proof mechanisms being compromised and against attackers being able to compromise an Oraclize-run server.

tl;dr:

Oraclize is better than nothing for retrieving content from an HTTPS web page. It's probably the best we can do right now for making public claims about the contents of secure web pages. But it shouldn't be considered a final or completely secure solution to the retrieval of web content. In many cases, having your apps use TLSnotary themselves is strictly superior to using Oraclize. And having an information provider sign their content directly is superior to both in all cases! Oraclize is a decent step forward, but it's not the final solution. Be careful that you use their service in a manner appropriate to the risk level of your application!

The way TLSnotary was explained to me (by its author, on IRC) was that the third party has to keep the data to prevent the source from later changing its data and making it seem as though your captured data is not authentic. In other words, it guards against a lying "I never said that!" from the web host.

It is secure in the same way that, when you browse to an HTTPS site, you can provably see that it belongs to X company (if you inspect the certificates). TLSnotary is essentially able to record those bytes in such a way that you can play them back later and authenticate them again. Very ingenious. (But the math of how it does this in the whitepaper goes over my head: https://tlsnotary.org/TLSNotary.pdf)

Yes, this is a good explanation indeed. The TLSNotary proof is saved on IPFS, and the proof's IPFS multihash is the data sent back to the contract. This means that by watching the Ethereum network you can collect all the proofs' hashes, then fetch the actual proofs via IPFS and verify them: this is exactly what our network monitor does. If you want to better understand how TLSNotary works, I suggest you watch this great explanatory video.
– Thomas Bertani, Jan 21 '16 at 9:23
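
The comment above says the contract receives only the proof's IPFS multihash, and that anyone can fetch the proof and check it. A minimal Python sketch of the first step of that check (confirming that fetched bytes match the referenced hash) might look like the following. It assumes a sha2-256 multihash in base58 computed over the raw proof bytes; real IPFS hashes its own encoding of the file rather than the raw bytes, so recomputing a real IPFS hash needs IPFS tooling. The function names are hypothetical and are not Oraclize's or IPFS's API.

```python
import hashlib

# Base58 alphabet used by Bitcoin and by IPFS base58 multihashes.
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode bytes as base58, keeping leading zero bytes as '1' characters."""
    n = int.from_bytes(data, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    leading_zeros = len(data) - len(data.lstrip(b"\0"))
    return "1" * leading_zeros + out


def multihash_b58(content: bytes) -> str:
    """sha2-256 multihash: code 0x12, digest length 0x20, then the digest."""
    digest = hashlib.sha256(content).digest()
    return b58encode(bytes([0x12, 0x20]) + digest)


def matches_multihash(content: bytes, expected: str) -> bool:
    """True if the fetched bytes hash to the multihash seen on chain.

    Assumption: the hash covers the raw bytes (a simplification of IPFS).
    """
    return multihash_b58(content) == expected


proof_bytes = b"example proof bytes"          # hypothetical fetched proof
on_chain_hash = multihash_b58(proof_bytes)    # hypothetical value from the contract
```

Note that this only confirms you fetched the same bytes the contract referenced. It says nothing about whether the proof inside those bytes is honest, which is the question debated in the following comments.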

I watched the video, but all it did was confirm that my understanding is correct. The TLSnotary proof is an interactive proof. There's no way to verify the TLSnotary proof unless you were performing the role of the auditor during the retrieval. Someone coming along later has no way of verifying that the TLSnotary secret was truly withheld until the hash was received. Since I notice you're connected with the project, any chance you could clarify in an answer how exactly Oraclize is handling the secret during and after generation of the TLSnotary proof?
– Jeff Coleman, Jan 22 '16 at 17:36

@JeffColeman: according to the docs and my understanding of how it works, it is possible to check the tlsnotary proof after retrieval (tlsnotary.org/wp/?p=27, search for "this file is self-validating"). It is part of the protocol that the tlsnotary server does not give the auditee the secret until the auditee has committed himself to what he received. The tlsnotary proof is signed by the tlsnotary server.
– gellej, Jan 24 '16 at 11:47

The "blind notary server" described in the document is a trusted party. Just finish the quote: "this file is self-validating, assuming you trust the public key used to make the signatures (which is the notary server’s public key of course)". All that file does is suggest a way to make Amazon the trusted party instead of the TLSnotary people. In other words, it sounds like the answer to my above question is "An Amazon AWS instance is in possession of the TLSnotary secret". Someone has to hold the secret.
– Jeff Coleman, Jan 24 '16 at 14:30

@ThomasBertani any chance we could get a direct-from-Oraclize answer to this? I'll leave it for a couple of days to give you a chance before I fill in what I'm inferring from the various documentation.
– Jeff Coleman, Jan 25 '16 at 1:55

As @JeffColeman mentioned in a comment, TLSnotary is an interactive protocol: the auditor has to keep part of the master key secret for the whole solution to work. If the auditee knew the full master key, the auditee could calculate the server HMAC key and could then modify anything in the response simply by generating a new HMAC for it.
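
The forgery risk described above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: HMAC-SHA256 stands in for the real TLS key derivation and record MAC, the two "shares" stand in for the split master key, and every name here is hypothetical rather than taken from TLSnotary's code.

```python
import hashlib
import hmac
import os


def derive_server_mac_key(auditee_share: bytes, auditor_share: bytes) -> bytes:
    """Stand-in for TLS key derivation: the key depends on BOTH shares."""
    return hmac.new(auditor_share, auditee_share, hashlib.sha256).digest()


def mac(key: bytes, response: bytes) -> bytes:
    """Stand-in for the server's record MAC over its response."""
    return hmac.new(key, response, hashlib.sha256).digest()


def verify(key: bytes, response: bytes, tag: bytes) -> bool:
    """Check a response against a tag using the true server MAC key."""
    return hmac.compare_digest(mac(key, response), tag)


auditee_share = os.urandom(32)
auditor_share = os.urandom(32)   # withheld from the auditee during the session
server_key = derive_server_mac_key(auditee_share, auditor_share)

genuine = b'{"price": 100}'
genuine_tag = mac(server_key, genuine)   # what the web server actually sent

forged = b'{"price": 999}'

# Auditee WITHOUT the auditor's share: it can only guess the missing half.
guessed_key = derive_server_mac_key(auditee_share, os.urandom(32))
blind_forgery_ok = verify(server_key, forged, mac(guessed_key, forged))

# Auditee WITH the auditor's share: it can recompute the key and re-MAC anything.
leaked_key = derive_server_mac_key(auditee_share, auditor_share)
informed_forgery_ok = verify(server_key, forged, mac(leaked_key, forged))
```

In this sketch the forged response verifies only when the auditee holds both shares, which is why the auditor's share must stay secret until the auditee has committed to the response it received.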

But Oraclize is basically both auditor and auditee. So the problem is: how can I be sure the auditor and the auditee are not colluding, i.e. that Oraclize is not cheating?

In general, it is extremely difficult to prove that the auditor and the auditee have colluded. To prove they cheated, we would have to prove that the auditee already knew the secret part of the master key during the transmission. And because the secret will eventually be published (we need it to perform verification), how can we know at what time the auditee learned the secret?
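
To see why the timing matters, here is a rough Python sketch of what a later verifier can check. It assumes a heavily simplified proof format (a commitment, the response, its MAC tag, and the eventually published secret, with the secret used directly as the MAC key); this is not TLSnotary's real proof format, and the names are hypothetical.

```python
import hashlib
import hmac
import os


def tag_for(secret: bytes, response: bytes) -> bytes:
    """Simplified MAC: the published secret is used directly as the MAC key."""
    return hmac.new(secret, response, hashlib.sha256).digest()


def published_proof(secret: bytes, response: bytes) -> dict:
    """Everything a later verifier gets to see (hypothetical format)."""
    tag = tag_for(secret, response)
    return {
        "commitment": hashlib.sha256(response + tag).hexdigest(),
        "response": response,
        "tag": tag,
        "secret": secret,
    }


def check(proof: dict) -> bool:
    """After-the-fact check: commitment matches and the tag is valid."""
    recomputed = hashlib.sha256(proof["response"] + proof["tag"]).hexdigest()
    commitment_ok = recomputed == proof["commitment"]
    expected_tag = tag_for(proof["secret"], proof["response"])
    tag_ok = hmac.compare_digest(expected_tag, proof["tag"])
    return commitment_ok and tag_ok


secret = os.urandom(32)

# Honest run: the server really sent this; the secret was released only
# after the commitment was made.
honest = published_proof(secret, b"balance: 10")

# Colluding run: the secret was known early, so the response and its tag
# were simply made up before "committing".
colluding = published_proof(secret, b"balance: 9999")

honest_ok = check(honest)
colluding_ok = check(colluding)
```

Both proofs pass the same check, and nothing in the published fields records when the secret was revealed. That ordering is exactly the property a later verifier has to take on trust.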

Oraclize addresses this problem by running the auditor on AWS and publishing the auditor's code and runtime information, so that you can verify the integrity of the auditor environment. While it is relatively easy to detect changes, unless you actually audit the source code and the configuration of the whole environment, it is still difficult to be sure the auditor does only what it is supposed to do and nothing more (no bugs, intended or not). That's why we only said it is probably honest.