AESNI not in use with OpenSSL 1.0.1 on Tor 0.2.3.14-alpha
The changelog for 0.2.3.14-alpha (https://gitweb.torproject.org/tor.git/blob/tor-0.2.3.14-alpha:/ChangeLog) states that AESNI will be used. This does not seem to be the case:
uname -a
FreeBSD metaverse.dfri.se 9.0-RELEASE FreeBSD 9.0-RELEASE #0: Tue Jan 3 07:46:30 UTC 2012 root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
sysctl -a |egrep 'hw.machine|hw.model'
hw.machine: amd64
hw.model: Intel(R) Xeon(R) CPU E31230 @ 3.20GHz
hw.machine_arch: amd64
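The sysctl output above names the CPU model but does not show its feature flags. A minimal sketch of one way to confirm that the kernel sees the AESNI flag, assuming FreeBSD's boot log is at /var/run/dmesg.boot and lists CPU flags on a "Features2=0x...<...>" line (the function name is hypothetical). This only checks what the CPU advertises; it says nothing about whether OpenSSL or tor actually uses the instructions.

```python
def features2_has_aesni(dmesg_text):
    """Return True if a 'Features2=...<...>' line lists the AESNI flag."""
    for line in dmesg_text.splitlines():
        stripped = line.strip()
        # startswith() skips the separate "AMD Features2=" line.
        if stripped.startswith("Features2=") and "<" in stripped:
            flags = stripped[stripped.index("<") + 1:].rstrip(">").split(",")
            if "AESNI" in flags:
                return True
    return False


if __name__ == "__main__":
    # Assumed location of the boot-time kernel messages on FreeBSD.
    with open("/var/run/dmesg.boot", errors="replace") as f:
        print("AESNI advertised:", features2_has_aesni(f.read()))
```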
OpenSSL build: ./config -shared --prefix=/usr/local/testbuild/
libevent-2.0.18-stable: CFLAGS=-I/usr/local/testbuild/include LDFLAGS=-L/usr/local/testbuild/lib ./configure --prefix=/usr/local/testbuild && make && make install
tor-0.2.3.14-alpha: ./configure --with-openssl-dir=/usr/local/testbuild/lib --disable-asciidoc --enable-gcc-warnings-advisory --enable-gcc-hardening --enable-linker-hardening --with-libevent-dir=/usr/local/testbuild/lib --prefix=/usr/local/testbuild
Results from bench:
OpenSSL 0.9.8:
dmap
nbits=65536
digestmap_set: 40.31 ns per element
digestmap_get: 30.15 ns per element
digestset_add: 9.94 ns per element
digestset_isin: 5.56 ns per element.
Hits == 32866304
False positive rate on digestset: 0.23%
aes
1 bytes: 13.08 nsec per byte
2 bytes: 9.55 nsec per byte
4 bytes: 7.89 nsec per byte
8 bytes: 7.04 nsec per byte
16 bytes: 6.67 nsec per byte
32 bytes: 6.48 nsec per byte
64 bytes: 6.38 nsec per byte
128 bytes: 6.40 nsec per byte
256 bytes: 6.35 nsec per byte
512 bytes: 6.32 nsec per byte
1024 bytes: 6.31 nsec per byte
2048 bytes: 6.30 nsec per byte
4096 bytes: 6.30 nsec per byte
8192 bytes: 6.30 nsec per byte
cell_aes
509 bytes, misaligned by 0: 6.12 nsec per byte
509 bytes, misaligned by 1: 6.12 nsec per byte
509 bytes, misaligned by 2: 6.12 nsec per byte
509 bytes, misaligned by 3: 6.12 nsec per byte
509 bytes, misaligned by 4: 6.12 nsec per byte
509 bytes, misaligned by 5: 6.12 nsec per byte
509 bytes, misaligned by 6: 6.12 nsec per byte
509 bytes, misaligned by 7: 6.12 nsec per byte
509 bytes, misaligned by 8: 6.13 nsec per byte
509 bytes, misaligned by 9: 6.12 nsec per byte
509 bytes, misaligned by 10: 6.12 nsec per byte
509 bytes, misaligned by 11: 6.13 nsec per byte
509 bytes, misaligned by 12: 6.12 nsec per byte
509 bytes, misaligned by 13: 6.12 nsec per byte
509 bytes, misaligned by 14: 6.12 nsec per byte
509 bytes, misaligned by 15: 6.12 nsec per byte
cell_ops
Inbound cells: 3126.88 ns per cell. (6.14 ns per byte of payload)
Outbound cells: 3131.38 ns per cell. (6.15 ns per byte of payload)
OpenSSL 1.0.1:
dmap
nbits=65536
digestmap_set: 151.35 ns per element
digestmap_get: 123.08 ns per element
digestset_add: 40.74 ns per element
digestset_isin: 29.20 ns per element.
Hits == 32825344
False positive rate on digestset: 0.21%
aes
1 bytes: 36.85 nsec per byte
2 bytes: 24.55 nsec per byte
4 bytes: 17.58 nsec per byte
8 bytes: 14.48 nsec per byte
16 bytes: 11.47 nsec per byte
32 bytes: 10.53 nsec per byte
64 bytes: 10.05 nsec per byte
128 bytes: 3.21 nsec per byte
256 bytes: 2.65 nsec per byte
512 bytes: 2.36 nsec per byte
1024 bytes: 2.23 nsec per byte
2048 bytes: 2.16 nsec per byte
4096 bytes: 2.12 nsec per byte
8192 bytes: 2.10 nsec per byte
cell_aes
509 bytes, misaligned by 0: 2.74 nsec per byte
509 bytes, misaligned by 1: 2.74 nsec per byte
509 bytes, misaligned by 2: 2.74 nsec per byte
509 bytes, misaligned by 3: 2.74 nsec per byte
509 bytes, misaligned by 4: 2.74 nsec per byte
509 bytes, misaligned by 5: 2.74 nsec per byte
509 bytes, misaligned by 6: 2.74 nsec per byte
509 bytes, misaligned by 7: 2.74 nsec per byte
509 bytes, misaligned by 8: 2.74 nsec per byte
509 bytes, misaligned by 9: 2.74 nsec per byte
509 bytes, misaligned by 10: 2.74 nsec per byte
509 bytes, misaligned by 11: 2.74 nsec per byte
509 bytes, misaligned by 12: 2.74 nsec per byte
509 bytes, misaligned by 13: 2.74 nsec per byte
509 bytes, misaligned by 14: 2.74 nsec per byte
509 bytes, misaligned by 15: 2.74 nsec per byte
cell_ops
Inbound cells: 1414.43 ns per cell. (2.78 ns per byte of payload)
Outbound cells: 1518.10 ns per cell. (2.98 ns per byte of payload)
This is nowhere near the dramatic performance improvements seen in #5406.
For comparison, here are benchmarks from a machine that does not have AESNI, with tor benched against OpenSSL 0.9.8 and 1.0.1:
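To put a number on the gap, the speedup can be computed directly from the figures reported above. A small sketch; the row labels and helper names are only illustrative, and the values are copied from the two bench runs on the AESNI machine:

```python
# (OpenSSL 0.9.8, OpenSSL 1.0.1) timings copied from the bench output above.
# Units: nsec per byte, except the cell_ops rows, which are ns per cell.
results = {
    "aes, 1 byte": (13.08, 36.85),
    "aes, 64 bytes": (6.38, 10.05),
    "aes, 128 bytes": (6.40, 3.21),
    "aes, 8192 bytes": (6.30, 2.10),
    "cell_aes, 509 bytes": (6.12, 2.74),
    "cell_ops, inbound": (3126.88, 1414.43),
    "cell_ops, outbound": (3131.38, 1518.10),
}


def speedup(old, new):
    """How many times faster `new` is than `old`; below 1.0 means slower."""
    return old / new


def speedup_table(rows):
    """Map each row label to its speedup, rounded to two decimals."""
    return {name: round(speedup(old, new), 2) for name, (old, new) in rows.items()}


if __name__ == "__main__":
    for name, factor in speedup_table(results).items():
        print(f"{name:22s} {factor:5.2f}x")
```

On these numbers, large-block AES is about 3x faster with 1.0.1 and cell processing a little over 2x, while inputs of 64 bytes or less are slower than with 0.9.8.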
OpenSSL 0.9.8:
dmap
nbits=65536
digestmap_set: 40.31 ns per element
digestmap_get: 30.15 ns per element
digestset_add: 9.94 ns per element
digestset_isin: 5.56 ns per element.
Hits == 32866304
False positive rate on digestset: 0.23%
aes
1 bytes: 13.08 nsec per byte
2 bytes: 9.55 nsec per byte
4 bytes: 7.89 nsec per byte
8 bytes: 7.04 nsec per byte
16 bytes: 6.67 nsec per byte
32 bytes: 6.48 nsec per byte
64 bytes: 6.38 nsec per byte
128 bytes: 6.40 nsec per byte
256 bytes: 6.35 nsec per byte
512 bytes: 6.32 nsec per byte
1024 bytes: 6.31 nsec per byte
2048 bytes: 6.30 nsec per byte
4096 bytes: 6.30 nsec per byte
8192 bytes: 6.30 nsec per byte
cell_aes
509 bytes, misaligned by 0: 6.12 nsec per byte
509 bytes, misaligned by 1: 6.12 nsec per byte
509 bytes, misaligned by 2: 6.12 nsec per byte
509 bytes, misaligned by 3: 6.12 nsec per byte
509 bytes, misaligned by 4: 6.12 nsec per byte
509 bytes, misaligned by 5: 6.12 nsec per byte
509 bytes, misaligned by 6: 6.12 nsec per byte
509 bytes, misaligned by 7: 6.12 nsec per byte
509 bytes, misaligned by 8: 6.13 nsec per byte
509 bytes, misaligned by 9: 6.12 nsec per byte
509 bytes, misaligned by 10: 6.12 nsec per byte
509 bytes, misaligned by 11: 6.13 nsec per byte
509 bytes, misaligned by 12: 6.12 nsec per byte
509 bytes, misaligned by 13: 6.12 nsec per byte
509 bytes, misaligned by 14: 6.12 nsec per byte
509 bytes, misaligned by 15: 6.12 nsec per byte
cell_ops
Inbound cells: 3126.88 ns per cell. (6.14 ns per byte of payload)
Outbound cells: 3131.38 ns per cell. (6.15 ns per byte of payload)
OpenSSL 1.0.1:
dmap
nbits=65536
digestmap_set: 151.35 ns per element
digestmap_get: 123.08 ns per element
digestset_add: 40.74 ns per element
digestset_isin: 29.20 ns per element.
Hits == 32825344
False positive rate on digestset: 0.21%
aes
1 bytes: 36.85 nsec per byte
2 bytes: 24.55 nsec per byte
4 bytes: 17.58 nsec per byte
8 bytes: 14.48 nsec per byte
16 bytes: 11.47 nsec per byte
32 bytes: 10.53 nsec per byte
64 bytes: 10.05 nsec per byte
128 bytes: 3.21 nsec per byte
256 bytes: 2.65 nsec per byte
512 bytes: 2.36 nsec per byte
1024 bytes: 2.23 nsec per byte
2048 bytes: 2.16 nsec per byte
4096 bytes: 2.12 nsec per byte
8192 bytes: 2.10 nsec per byte
cell_aes
509 bytes, misaligned by 0: 2.74 nsec per byte
509 bytes, misaligned by 1: 2.74 nsec per byte
509 bytes, misaligned by 2: 2.74 nsec per byte
509 bytes, misaligned by 3: 2.74 nsec per byte
509 bytes, misaligned by 4: 2.74 nsec per byte
509 bytes, misaligned by 5: 2.74 nsec per byte
509 bytes, misaligned by 6: 2.74 nsec per byte
509 bytes, misaligned by 7: 2.74 nsec per byte
509 bytes, misaligned by 8: 2.74 nsec per byte
509 bytes, misaligned by 9: 2.74 nsec per byte
509 bytes, misaligned by 10: 2.74 nsec per byte
509 bytes, misaligned by 11: 2.74 nsec per byte
509 bytes, misaligned by 12: 2.74 nsec per byte
509 bytes, misaligned by 13: 2.74 nsec per byte
509 bytes, misaligned by 14: 2.74 nsec per byte
509 bytes, misaligned by 15: 2.74 nsec per byte
cell_ops
Inbound cells: 1414.43 ns per cell. (2.78 ns per byte of payload)
Outbound cells: 1518.10 ns per cell. (2.98 ns per byte of payload)