Description

Running the tor daemon with static openssl 1.0.1d led to masses of

[warn] 45 connections have failed:
[warn] 32 connections died in state handshaking (Tor, v3 handshake) with SSL state SSL negotiation finished successfully in OPEN
[warn] 13 connections died in state renegotiating (TLS, v2 handshake) with SSL state SSLv3 read server hello A in RENEGOTIATE

while bootstrapping the node. Please see the attached excerpt of the debug log.

It looks like there's an extraneous byte coming out of the first SSL_read there -- the next bytes are a perfectly good VERSIONS cell, followed by what seems to be at least the start of a good CERTS cell.

The SSL errors seem to disappear when building with no-asm, and bootstrapping reaches 100%.

But now I got this after the first Bootstrap:

[notice] Bootstrapped 100%: Done.
[warn] Your Guard AccessNowKromyon03=6557396CF0EE5B72563A22BCAA0FF26E77FA3D08 is failing a very large amount of circuits. Most likely this means the Tor network is overloaded, but it could also mean an attack against you or potentially the guard itself. Success counts are 62/177. Use counts are 0/0. 62 circuits completed, 0 were unusable, 0 collapsed, and 0 timed out. For reference, your timeout cutoff is 131 seconds.
[warn] Your Guard jalopy=35BDC6486420EFD442C985D8D3C074988BFE544B is failing an extremely large amount of circuits. This could indicate a route manipulation attack, extreme network overload, or a bug. Success counts are 51/192. Use counts are 0/0. 51 circuits completed, 0 were unusable, 0 collapsed, and 0 timed out. For reference, your timeout cutoff is 131 seconds.
[warn] Your Guard lilith=6BE0C165B88EBE0371597F9E2230D3F253A299EF is failing an extremely large amount of circuits. This could indicate a route manipulation attack, extreme network overload, or a bug. Success counts are 48/195. Use counts are 0/0. 48 circuits completed, 0 were unusable, 0 collapsed, and 0 timed out. For reference, your timeout cutoff is 131 seconds.
[notice] Self-testing indicates your DirPort is reachable from the outside. Excellent.

I restarted the daemon two more times and did not get this warning again. Not sure if related.

I wouldn't worry about this one in this case: Tor is just complaining about all of the circuits it tried to launch to your guard node and failed to do so back when it was building with a busted openssl.
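To put numbers on how many circuits failed, the "Success counts are X/Y" figures in the guard warnings can be turned into rates. The following is only a rough sketch: the function name and the regular expression are illustrative assumptions based on the log format quoted in this ticket, not anything Tor provides, and the sample lines are abridged copies of the warnings above.

```python
import re

# Assumed pattern, derived only from the warning format quoted in this ticket.
GUARD_RE = re.compile(
    r"Your Guard (?P<name>[^=\s]+)=(?P<fp>[0-9A-F]+) .*?"
    r"Success counts are (?P<ok>\d+)/(?P<total>\d+)"
)

def guard_success_rates(log_text):
    """Return {guard_name: (succeeded, attempted, rate)} for each guard warning."""
    rates = {}
    for m in GUARD_RE.finditer(log_text):
        ok, total = int(m["ok"]), int(m["total"])
        rates[m["name"]] = (ok, total, ok / total if total else 0.0)
    return rates

# Abridged from the warnings in this ticket (middle sentences dropped).
SAMPLE = (
    "[warn] Your Guard AccessNowKromyon03=6557396CF0EE5B72563A22BCAA0FF26E77FA3D08 "
    "is failing a very large amount of circuits. Success counts are 62/177.\n"
    "[warn] Your Guard jalopy=35BDC6486420EFD442C985D8D3C074988BFE544B "
    "is failing an extremely large amount of circuits. Success counts are 51/192.\n"
    "[warn] Your Guard lilith=6BE0C165B88EBE0371597F9E2230D3F253A299EF "
    "is failing an extremely large amount of circuits. Success counts are 48/195.\n"
)

if __name__ == "__main__":
    for name, (ok, total, rate) in guard_success_rates(SAMPLE).items():
        print(f"{name}: {ok}/{total} = {rate:.1%}")
```

All three guards come out well below a 50% success rate, which fits the explanation that the failures were accumulated while the broken OpenSSL build was in use.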

Does your CPU have the aesni instructions? ("cat /proc/cpuinfo |grep aes" will tell you on Linuxy systems.)

Yes, but I don't use it in my static OpenSSL/TOR Setup.

(Had some issues with the dynamic loading of the AESNI Engine at tor startup with static linked OpenSSL - since CPU is not (yet) the bottleneck on the Nodes, I didn't bother so far. This may be specific to my Setup)
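For anyone wanting to script the /proc/cpuinfo check asked about above, here is a minimal sketch. The function name is made up for illustration, and it assumes the x86 Linux layout where CPU features appear on lines whose key is "flags". Unlike a plain grep, it matches the whole token "aes" rather than any substring.

```python
def cpu_has_aes_flag(cpuinfo_text):
    """True if any 'flags' line in /proc/cpuinfo-style text lists the 'aes' token."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Assumption: x86 Linux reports CPU features under the key "flags".
        if sep and key.strip() == "flags" and "aes" in value.split():
            return True
    return False

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("aes flag present:", cpu_has_aes_flag(f.read()))
    except OSError:
        print("no readable /proc/cpuinfo on this system")
```

Note that this only says whether the CPU advertises the instructions, not whether the linked OpenSSL build actually uses them.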

I don't think aesni is a separate engine in openssl 1.0.1 -- I think you get it by default unless it's specifically disabled.

I believe that this is a bug in the AES-NI stitched code. I'm looking at the diff now, but I didn't write this code and it's complex. I'll let the OpenSSL team know about the problem tonight, whether or not I'm able to pin down the issue.

Okay, there's a possible set of workarounds in my repository as branches "bug8179_022", "bug8179_023", and "bug8179_024". They don't fix the underlying problem -- they just tell OpenSSL it's not allowed to use those stitched ciphers.

It would be great to have a fix in openssl; some understanding of how this broke without getting caught; and some code Tor can call at runtime to see whether it has a broken openssl or not.

Adam Langley has investigated and encouraged the OpenSSL team to do so as well: It appears that the code for using AEAD CBC ciphers with TLS is broken in OpenSSL 1.0.1d. Right now, the stitched aesni-cbc-hmac-sha1 cipher is the only such cipher.

Since this is a pretty bad problem (and will break all commonly used AES ciphers when used with AESNI), I'd hope that a fix will come out soon. To detect this at runtime, we'll have to try doing a TLS connection with ourselves: testing the cipher implementation itself won't work.
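As noted above, a reliable runtime check needs a real TLS connection. A much coarser first pass is to look at the OpenSSL version number alone. The sketch below is not what the ticket proposes and not Tor's code: it assumes the pre-3.0 OPENSSL_VERSION_NUMBER layout (0xMNNFFPPS), the function names are illustrative, and a version match cannot tell whether a vendor has already patched the build.

```python
def decode_openssl_version(number):
    """Split a pre-3.0 OPENSSL_VERSION_NUMBER (0xMNNFFPPS) into its fields."""
    major = (number >> 28) & 0xF
    minor = (number >> 20) & 0xFF
    fix = (number >> 12) & 0xFF
    patch = (number >> 4) & 0xFF   # 1 = 'a', 2 = 'b', ... so 4 = 'd'
    status = number & 0xF          # 0xF marks a release build
    return major, minor, fix, patch, status

def is_openssl_101d(number):
    """Heuristic only: True when the version fields read 1.0.1d."""
    return decode_openssl_version(number)[:4] == (1, 0, 1, 4)

if __name__ == "__main__":
    import ssl
    # ssl.OPENSSL_VERSION_NUMBER is the number of the OpenSSL that Python links against.
    print(ssl.OPENSSL_VERSION, is_openssl_101d(ssl.OPENSSL_VERSION_NUMBER))
```

A match here would only be a hint to run the real test or apply the cipher workaround. A non-match says nothing about other builds carrying the same bug.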