Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#5406 closed enhancement (implemented)

Use EVP_aes_128_ctr() on OpenSSL >= 1.0.1

Reported by: nickm Owned by:
Priority: Medium Milestone: Tor: 0.2.3.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-relay
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Starting with OpenSSL 1.0.1, there is an EVP value to provide counter-mode AES. We definitely want to be using this one for our counter mode implementation in aes.c, since it appears to automatically use the best available AES implementation, including tricky ones that use AESNI/bitsliced/vectorized implementations.

Bitsliced and vectorized implementations of counter mode are not only faster than the straightforward assembly versions, but also run in constant-time.

Child Tickets

Change History (10)

comment:1 Changed 7 years ago by nickm

Status: newneeds_review

See branch openssl101_aes_ctr in my public repository. It gets a 3.3x aes-ctr speedup on my desktop, which doesn't even have AESNI support.

This branch also makes aesni get used on OpenSSL 1.0.1 (when it's supported), and refactors the crypto_cipher_t setup to be less error-prone.

comment:2 Changed 7 years ago by nickm

Oh! Additionally, the patch drops support for doing hybrid encryption with no RSA padding. That would be a foolish thing to do, and it's a good thing we never used it.

comment:3 Changed 7 years ago by nickm

Linus put up some pretty sweet performance numbers for openssl+this branch on a machine with aesni support at #5440 . We should close that when we close this.

comment:4 Changed 7 years ago by Sebastian

I wonder if maybe for 3e972066a23 we should completely drop the OK_NO_PADDING define?

In af387b612a7 in crypto_cipher_decrypt_with_iv() you dropped some asserts that ought to stay, I think.

There's a fixup commit in my openssl101_aes_ctr branch for a non-issue that confused me during codereview.

Otherwise, this looks good. The speedup on my macbook is impressive:
pre-branch:

===== aes =====
1 bytes: 20.27 nsec per byte
2 bytes: 16.00 nsec per byte
4 bytes: 12.98 nsec per byte
8 bytes: 11.75 nsec per byte
16 bytes: 11.14 nsec per byte
32 bytes: 10.61 nsec per byte
64 bytes: 10.49 nsec per byte
128 bytes: 10.59 nsec per byte
256 bytes: 10.36 nsec per byte
512 bytes: 10.31 nsec per byte
1024 bytes: 10.32 nsec per byte
2048 bytes: 10.25 nsec per byte
4096 bytes: 10.27 nsec per byte
8192 bytes: 10.36 nsec per byte
===== cell_aes =====
509 bytes, misaligned by 0: 10.19 nsec per byte
509 bytes, misaligned by 1: 10.16 nsec per byte
509 bytes, misaligned by 2: 10.31 nsec per byte
509 bytes, misaligned by 3: 10.22 nsec per byte
509 bytes, misaligned by 4: 10.23 nsec per byte
509 bytes, misaligned by 5: 10.25 nsec per byte
509 bytes, misaligned by 6: 10.22 nsec per byte
509 bytes, misaligned by 7: 10.24 nsec per byte
509 bytes, misaligned by 8: 10.32 nsec per byte
509 bytes, misaligned by 9: 10.18 nsec per byte
509 bytes, misaligned by 10: 10.17 nsec per byte
509 bytes, misaligned by 11: 10.49 nsec per byte
509 bytes, misaligned by 12: 10.48 nsec per byte
509 bytes, misaligned by 13: 10.42 nsec per byte
509 bytes, misaligned by 14: 10.25 nsec per byte
509 bytes, misaligned by 15: 10.28 nsec per byte
===== cell_ops =====
 Inbound cells: 5400.01 ns per cell. (10.61 ns per byte of payload)
Outbound cells: 5294.33 ns per cell. (10.40 ns per byte of payload)

with patch:

===== aes =====
1 bytes: 22.61 nsec per byte
2 bytes: 12.67 nsec per byte
4 bytes: 7.24 nsec per byte
8 bytes: 4.59 nsec per byte
16 bytes: 2.46 nsec per byte
32 bytes: 1.39 nsec per byte
64 bytes: 0.75 nsec per byte
128 bytes: 0.58 nsec per byte
256 bytes: 0.41 nsec per byte
512 bytes: 0.36 nsec per byte
1024 bytes: 0.31 nsec per byte
2048 bytes: 0.29 nsec per byte
4096 bytes: 0.30 nsec per byte
8192 bytes: 0.27 nsec per byte
===== cell_aes =====
509 bytes, misaligned by 0: 0.44 nsec per byte
509 bytes, misaligned by 1: 0.42 nsec per byte
509 bytes, misaligned by 2: 0.43 nsec per byte
509 bytes, misaligned by 3: 0.43 nsec per byte
509 bytes, misaligned by 4: 0.42 nsec per byte
509 bytes, misaligned by 5: 0.43 nsec per byte
509 bytes, misaligned by 6: 0.43 nsec per byte
509 bytes, misaligned by 7: 0.42 nsec per byte
509 bytes, misaligned by 8: 0.43 nsec per byte
509 bytes, misaligned by 9: 0.42 nsec per byte
509 bytes, misaligned by 10: 0.43 nsec per byte
509 bytes, misaligned by 11: 0.43 nsec per byte
509 bytes, misaligned by 12: 0.43 nsec per byte
509 bytes, misaligned by 13: 0.42 nsec per byte
509 bytes, misaligned by 14: 0.43 nsec per byte
509 bytes, misaligned by 15: 0.43 nsec per byte
===== cell_ops =====
 Inbound cells: 222.23 ns per cell. (0.44 ns per byte of payload)
Outbound cells: 230.09 ns per cell. (0.45 ns per byte of payload)

comment:5 Changed 7 years ago by nickm

cherry-picked your fixup since for some reason merge was giving me trouble.

I agree about dropping PK_NO_PADDING. Added a commit for that.

Added a fixup commit for the asserts.

comment:6 Changed 7 years ago by arma

I pushed a openssl101_aes_ctr branch with a few trivial fixes.

I looked carefully at all the commits but the big one, which I looked somewhat at. It all looks plausible.

comment:7 Changed 7 years ago by nickm

Okay. Just merged arma's branch onto mine. I'll read the commits again, and assuming I still find them sane, I'll squash and merge. Thanks, all!

comment:8 Changed 7 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

There. I found one spurious newline added to the top of a file and removed it. openssl101_aes_ctr has the final vesion, which I rebased into openssl101_aes_ctr_rebased, then merged into master. Thanks, everyone!

comment:9 Changed 7 years ago by nickm

Keywords: tor-relay added

comment:10 Changed 7 years ago by nickm

Component: Tor RelayTor
Note: See TracTickets for help on using tickets.