Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#16535 closed enhancement (implemented)

Investigate building ed25519-donna with SSE2 support.

Reported by: yawning Owned by: yawning
Priority: Low Milestone: Tor: 0.2.7.x-final
Component: Core Tor/Tor Version: Tor: 0.2.7
Severity: Keywords: tor-relay, performance, ntor
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:


Followup to #9663/#16467, now that we include ed25519-donna, we should look into building it with SSE2 support where appropriate.

Adding something like this to ed25519-donna-portable.h should do the trick:

/* Tor: Build with SSE2 where it makes sense to do so. */
#if defined(__SSE2__) && !(defined(__x86_64__) || defined(__amd64__)) 
#define ED25519_SSE2

Potential pitfalls:

  • SSE2 builds benchmark worse on x86_64, at least Haswell.
  • This opens us up to potentially really obnoxious compiler bugs/pathologically bad code generation.
  • Most distribution packages probably don't build for an architecture that will define _SSE2_, so this would only get picked up by people doing custom builds.

Open questions:

  • Is this actually faster on 32 bit Intel?
  • Is doing something like "always building SSE2 and non-SSE2 and using CPUID to pick at runtime" sensible? (I personally think "No").

Child Tickets

Change History (5)

comment:1 Changed 4 years ago by yawning

Owner: set to yawning
Status: newaccepted

comment:2 Changed 4 years ago by yawning

Status: acceptedneeds_review

Using ed25519-donna-identify.h's CPU identification macros instead of the compilers since the header does all the work needed here, and it picks it up with the x86_64 check commented out on my box (and the tests pass with the SSE2 code, though it's slower).

I think this is low-risk enough to take, and the gains are worth while, though I'm not sure how much of our relay/HS server userbase is 32 bit x86 these days.

comment:3 Changed 4 years ago by nickm

/me requests a ticket for "include sizeof(void *) in your extrainfo"

comment:4 Changed 4 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

lgtm; yolo; merging.

comment:5 Changed 4 years ago by yawning may come to haunt us later, leaving this here just so there's a link to it. The solution would be that people cross compiling should explicitly specify march if they need it to run on something horrifically ancient.

Note: See TracTickets for help on using tickets.