Opened 4 years ago

Closed 3 years ago

#13696 closed enhancement (fixed)

Use syscall-based entropy reading where possible.

Reported by: nickm Owned by: yawning
Priority: Medium Milestone: Tor: 0.2.8.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-relay
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

When we're setting up our CSPRNG, we should use syscalls where possible, and ensure that we're using the "block if inadequately seeded" variant whenever one exists.

See #10676 for an earlier version of this; but that isn't the sensible way to do it at all. I'll start working on this once I have a linux that supports getentropy().

Child Tickets

Change History (25)

comment:1 Changed 4 years ago by nickm

Type: defectenhancement

comment:2 Changed 4 years ago by nickm

Milestone: Tor: 0.2.6.x-finalTor: 0.2.???

Defer some items from 0.2.6. They are mostly worth doing, but not going to happen super-fast.

comment:3 Changed 3 years ago by yawning

Owner: set to yawning
Severity: Normal
Status: newassigned

I'll grab this.

comment:4 Changed 3 years ago by nickm

Okay. Feel free to yoink the cc0- concerned code from libottery-lite if you like.

comment:5 Changed 3 years ago by yawning

Status: assignedneeds_review

comment:6 Changed 3 years ago by nickm

Status: needs_reviewneeds_revision
  • Have any other bsds grabbed getentropy() ? Possibly we should check for it with autoconf rather than with #if directives.
  • A part of me says that it would be smart to 0-fill the buffer before we call these functions, and then verify that they actually filled the buffer with something other than 0s. (Assuming that the buffer is of sufficient size)
  • Maybe a comment should document what the 0 flags mean here.
  • I don't think that a warn should be necessary if getrandom() returns ENOSYS.
  • If we're doing tor_assert(ret == (int)out_len);, should we check for out_len rather than -1 in the loop?

comment:7 Changed 3 years ago by teor

OS X has CCRandomGenerateBytes in CommonCrypto/CommonRandom.h, I would be happy to submit a patch, either on top of Yawning's branch, or in another ticket.
(Which would you prefer, nickm?)

comment:8 Changed 3 years ago by nickm

On top of Yawning's branch would be fine.

Another thought: we really don't need this code to be fast. Maybe instead of trying syscall, and then device if syscall fails, we should try both and hash them together?

comment:9 in reply to:  6 Changed 3 years ago by yawning

Replying to nickm:

  • Have any other bsds grabbed getentropy() ? Possibly we should check for it with autoconf rather than with #if directives.

Done.

  • A part of me says that it would be smart to 0-fill the buffer before we call these functions, and then verify that they actually filled the buffer with something other than 0s. (Assuming that the buffer is of sufficient size)

Done. Opted to check if the buffer is still all zero for requests that are >= 128 bits, with 3 retries. If teor or somebody write a better light weight statistical test we could use that since this shouldn't be critical path.

  • Maybe a comment should document what the 0 flags mean here.

Done.

  • I don't think that a warn should be necessary if getrandom() returns ENOSYS.

It only warns once, and I think it's a useful warning to have. If people the binary with old kernel headers, the code won't get compiled in, and will never warn. If people build the binary against kernel headers that don't match the actual running kernel, that's a problem. I don't have extremely strong feelings about this though.

  • If we're doing tor_assert(ret == (int)out_len);, should we check for out_len rather than -1 in the loop?

Hm. Matter of taste? The only time the assert will trigger is if the kernel people break userland, so I think it's an invariant.

comment:10 in reply to:  8 Changed 3 years ago by yawning

Replying to nickm:

On top of Yawning's branch would be fine.

Another thought: we really don't need this code to be fast. Maybe instead of trying syscall, and then device if syscall fails, we should try both and hash them together?

I'd personally vote for "do that when we pull in the SHA-3 XOFs" so we can support larger requests, though we only currently call this requesting 32 bytes of output. I'll change it if you think it should happen now.

comment:11 in reply to:  8 ; Changed 3 years ago by teor

Replying to teor:

OS X has CCRandomGenerateBytes in CommonCrypto/CommonRandom.h, I would be happy to submit a patch...

Replying to nickm:

On top of Yawning's branch would be fine.

It's pointless, CCRandomGenerateBytes eventually uses ccrng_system.c, which starts with the comment:

/* A very simple RNG for osx/ios user mode that just get random bytes from /dev/random */

The non-file-based read_random function is only available in OS X / iOS kernel-space.

I can't see the point of calling a system call that wraps /dev/random. We'd just be reducing performance, and exposing ourselves to any security issues in the wrappers.

However, it looks like iOS might not be able to read /dev/random directly, I'll check with an iOS dev.
https://developer.apple.com/library/ios/documentation/Security/Reference/RandomizationReference/index.html#//apple_ref/c/func/SecRandomCopyBytes

comment:12 in reply to:  11 ; Changed 3 years ago by yawning

Replying to teor:

I can't see the point of calling a system call that wraps /dev/random. We'd just be reducing performance, and exposing ourselves to any security issues in the wrappers.

Disagree, at least for Linux's getrandom(), which does happen to pull it's entropy from /dev/urandom, in that behavior is a lot more clear and well defined (Eg: The syscall will block if the device isn't sufficiently seeded, requests under a certain size will not be interrupted by signals or return a short read).

OpenBSD's getentropy() is a better getrandom() in that there's less complexity and error handling required, though I'm sure the OpenBSD people will raise eyebrows at us for not using arc4random().

If the OSX call happens to provide similar advantages over reading from /dev/urandom or similar, it should be used for those reasons.

Ultimately this is just fairly arbitrary tinfoil hattery since OpenSSL will seed itself from the character device on all the platforms just mentioned, and our explicit strong random calls fall back on failure (#17687).

comment:13 in reply to:  12 ; Changed 3 years ago by teor

Replying to yawning:

Replying to teor:

I can't see the point of calling a system call that wraps /dev/random. We'd just be reducing performance, and exposing ourselves to any security issues in the wrappers.

Disagree, at least for Linux's getrandom(), which does happen to pull it's entropy from /dev/urandom, in that behavior is a lot more clear and well defined (Eg: The syscall will block if the device isn't sufficiently seeded, requests under a certain size will not be interrupted by signals or return a short read).

OpenBSD's getentropy() is a better getrandom() in that there's less complexity and error handling required, though I'm sure the OpenBSD people will raise eyebrows at us for not using arc4random().

If the OSX call happens to provide similar advantages over reading from /dev/urandom or similar, it should be used for those reasons.

Apple doesn't seem to document the semantics of these system calls in any detail, at least in any documentation I can find. But the code is open source (well, "open" for the purposes of the kind of review I'm doing right now), so I can do some spelunking.

Our options on OS X / iOS are:

  • /dev/random
  • SecRandomCopyBytes in Security/SecRandom.h
    • available MAC_10_7, IPHONE_2_0
    • documented to read from /dev/random (it does, but only to seed an internal CSPRNG)
    • actually calls CCRandomCopyBytes() in SecurityFramework 57031.40.6
  • CCRandomGenerateBytes in CommonCrypto/CommonRandom.h
    • available MAC_10_10, IPHONE_8_0
    • not documented anywhere I could find
    • code reads from /dev/random to get entropy, then produces random numbers using NIST SP 800-90 March 2007 using aes_ltc_ecb_encrypt_mode / ltc_rijndael_keysched.
      • maximum request size of 216 bytes
      • reseeds from /dev/random every 230 bytes (NIST says at least 248, Apple does 230)
    • alternately, if passed kCCRandomDevRandom it will read directly from /dev/random each time
    • in either case, there are a number of internal retries (for interrupted systems calls etc.) and some error checking that makes calling these functions worthwhile

There's no option for iOS 1, but anyone still running iOS 1 isn't likely to be able to run tor. (And probably has other security concerns.)

So I'm happy to write the code - I'll ask for random bytes from Apple's implementation of the NIST 2007 CSPRNG, unless anyone thinks that's a bad idea. (We might end up using the NIST version via SecRandomCopyBytes, if it's the only function available on that platform.)

Yawning, let me know when you're happy with your branch, and I'll add the OS X / iOS code to it.

comment:14 in reply to:  13 ; Changed 3 years ago by teor

Replying to teor:

Apparently this source is wrong, at least for modern iOS versions: iOS apps can read from /dev/random. (Which makes sense, as it's impossible to seed a CSRNG from /dev/random unless you can read from /dev/random. Unless there was some weird xpc / syscall trickery to a privileged process or the kernel.)

I'm still ok with implementing the call to SecRandomCopyBytes on OS X / iOS, because it has better semantics (lower failure rate).

comment:15 in reply to:  14 Changed 3 years ago by yawning

Status: needs_revisionneeds_review

Replying to teor:

I'm still ok with implementing the call to SecRandomCopyBytes on OS X / iOS, because it has better semantics (lower failure rate).

This sounds reasonable. The non-Dual EC DRBG constructs in SP 800-90 are secure as far as I know. You say it uses ECB, but that's just because they're implementing CTR mode right?

I'm comfortable with the branch the way it is, but it's probably best if nickm gives further feedback regarding the things from the review that I didn't want to change.

comment:16 Changed 3 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

I'm squashing+merging yawning's branch. Please feel free to open more tickets to better handle more platforms.

comment:17 Changed 3 years ago by nickm

Milestone: Tor: 0.2.???Tor: 0.2.8.x-final

comment:18 Changed 3 years ago by cypherpunks

Resolution: implemented
Status: closedreopened

Jenkins is failing because it supports none of the syscalls. This makes the out variable become unused.

comment:19 Changed 3 years ago by nickm

Resolution: implemented
Status: reopenedclosed

should be fixed; thanks!

comment:20 Changed 3 years ago by cypherpunks

Resolution: implemented
Status: closedreopened

Now Jenkins is failing with

src/common/crypto.c:2406:13: error: implicit conversion loses integer precision: 'long' to 'int' [-Werror,-Wshorten-64-to-32]
      ret = syscall(SYS_getrandom, out, out_len, 0);

comment:21 Changed 3 years ago by yawning

My gcc doesn't have "-Wshorten-64-to-32". This is completely harmless, I assume out_len just needs an explicit cast to int. If the compiler were smarter it would see that, it's impossible for out_len to be big enough to where this warning has an impact, but so it goes.

Edit: Oh, it's a clang-ism.

Last edited 3 years ago by yawning (previous) (diff)

comment:22 in reply to:  21 Changed 3 years ago by cypherpunks

Replying to yawning:

My gcc doesn't have "-Wshorten-64-to-32". This is completely harmless, I assume out_len just needs an explicit cast to int. If the compiler were smarter it would see that, it's impossible for out_len to be big enough to where this warning has an impact, but so it goes.

Edit: Oh, it's a clang-ism.

I'm thinking it's about the return value of syscall. According to the manual syscall returns a long but it's assigned to an int.

comment:23 Changed 3 years ago by nickm

Resolution: implemented
Status: reopenedclosed

attempted a fix; let's see if this one works.

comment:24 Changed 3 years ago by cypherpunks

Resolution: implemented
Status: closedreopened

Now the Windows build is failing with

src/common/crypto.c: In function 'crypto_strongest_rand_fallback':
src/common/crypto.c:2444:41: error: unused parameter 'out' [-Werror=unused-parameter]
 crypto_strongest_rand_fallback(uint8_t *out, size_t out_len)
                                         ^
src/common/crypto.c:2444:53: error: unused parameter 'out_len' [-Werror=unused-parameter]
 crypto_strongest_rand_fallback(uint8_t *out, size_t out_len)
                                                     ^
cc1: all warnings being treated as errors

comment:25 Changed 3 years ago by nickm

Resolution: fixed
Status: reopenedclosed

Should be better now.

Note: See TracTickets for help on using tickets.