Opened 17 months ago

Closed 17 months ago

Last modified 14 months ago

#18162 closed defect (fixed)

Potential heap corruption in smartlist_add(), smartlist_insert()

Reported by: asn Owned by: nickm
Priority: High Milestone: Tor: 0.2.8.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-bug-bounty, security, 025-backport, 026-backport, 027-backport, 024-backport, 2016-bug-retrospective
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description (last modified by asn)

Here follows vulnerability report by Guido Vranken reported through the Tor bug bounty program.

The attack requires minimum 16GB of available memory on the victim's host, so it's quite hard to be exploited.

Walkthrough of the vulnerability

smartlist_add and smartlist_insert both invoke smartlist_ensure_capacity prior adding an element to the list in order to ensure that sufficient memory is available, to exit() if not enough memory is available and to detect requests for an invalid size:

static INLINE void
smartlist_ensure_capacity(smartlist_t *sl, int size)
{
#if SIZEOF_SIZE_T > SIZEOF_INT
#define MAX_CAPACITY (INT_MAX)
#else
#define MAX_CAPACITY (int)((SIZE_MAX / (sizeof(void*))))
#define ASSERT_CAPACITY
#endif
  if (size > sl->capacity) {
    int higher = sl->capacity;
    if (PREDICT_UNLIKELY(size > MAX_CAPACITY/2)) {
#ifdef ASSERT_CAPACITY
      /* We don't include this assertion when MAX_CAPACITY == INT_MAX,
       * since int size; (size <= INT_MAX) makes analysis tools think we're
       * doing something stupid. */
      tor_assert(size <= MAX_CAPACITY);
#endif
      higher = MAX_CAPACITY;
    } else {
      while (size > higher)
        higher *= 2;
    }
    sl->capacity = higher;
    sl->list = tor_reallocarray(sl->list, sizeof(void*),
                                ((size_t)sl->capacity));
  }
#undef ASSERT_CAPACITY
#undef MAX_CAPACITY
}

On a typical 64-bit system, SIZEOF_INT is 4 and SIZEOF_SIZE_T is 8. Consequently, MAX_CAPACITY is INT_MAX, which is 0x7FFFFFFF as can be seen in torint.h:

#ifndef INT_MAX
#if (SIZEOF_INT == 4)
#define INT_MAX 0x7fffffffL
#elif (SIZEOF_INT == 8)
#define INT_MAX 0x7fffffffffffffffL
#else
#error "Can't define INT_MAX"
#endif
#endif

So MAX_CAPACITY is 0x7FFFFFFF. Now assume that that many (0x7FFFFFFF) items have already been added to a smartlist via smartlist_add(sl, value).

smartlist_add() is:

void
smartlist_add(smartlist_t *sl, void *element)
{
  smartlist_ensure_capacity(sl, sl->num_used+1);
  sl->list[sl->num_used++] = element;
}

If sl->num_used is 0x7FFFFFFF prior to invoking smartlist_add, then the next smartlist_add is effectively:

void
smartlist_add(smartlist_t *sl, void *element)
{
  smartlist_ensure_capacity(sl, -2147483648);
  sl->list[2147483647] = element;
  sl->num_used = -2147483648
}

This is the case since we are dealing with a signed 32 bit integer, and 2147483647 + 1 equals -2147483647.

All of the code in smartlist_ensure_capacity is wrapped inside the following if block:

  if (size > sl->capacity) {
  }

The expression -2147483648 > 2147483647 equals false, thus the code inside the block is not executed.

What actually causes the segmentation fault is that a negative 32 bit integer is used to compute a the location of array index on a 64 bit memory layout, ie., the next call to smartlist_add is effectively:

void
smartlist_add(smartlist_t *sl, void *element)
{
  smartlist_ensure_capacity(sl, -2147483647); // Note that this is effective do-nothing code, as explained above
  sl->list[-2147483648] = element;
  sl->num_used = -2147483647
}

Discussion

The requirement for 16 gigabytes of memory is considerable.

Triggering the vulnerability obviously also requires some code path which will invoke smartlist_add or smartlist_insert upon the same smartlist at the attacker's behest. Moreover, such a code path may have the side effect that it requires a separate allocation for each object that is added to the list; smartlist_add takes a pointer argument after all -- usually, but not always, this pointer refers to freshly allocated memory. Exceptions to this rule are static strings and pointers to a place in a large string or buffer that was already extant.
Once a vulnerable code path has been discovered, then it ultimately boils down to how much memory a user's machine is able to allocate in order to corrupt the heap.

Despite these constraints, smartlists form a considerable portion of the infrastructure of your code (I count some 380+ occurrences of smartlist_add/smartlist_insert in the .c files using grep, that is excluding the test/ directory) and as such it's probably wise to revise the checks in smartlist_ensure_capacity.

Child Tickets

Change History (22)

comment:1 Changed 17 months ago by asn

  • Description modified (diff)

comment:2 Changed 17 months ago by asn

  • Component changed from - Select a component to Tor
  • Keywords security added
  • Milestone set to Tor: 0.2.8.x-final
  • Priority changed from Medium to High

comment:3 Changed 17 months ago by nickm

  • Keywords 025-backport 026-backport 027-backport added

comment:4 Changed 17 months ago by nickm

  • Owner set to nickm
  • Status changed from new to accepted

comment:5 follow-ups: Changed 17 months ago by nickm

  • Keywords 024-backport added
  • Status changed from accepted to needs_review

The branch bug18162_024 in my public repository has a proposed fix for this bug, along with a small family of related bugs. You can see the patch here: https://gitweb.torproject.org/nickm/tor.git/commit/?h=bug18162_024&id=bca7083e8285e8e6a4377076a7e432417eafc6d2

Review would be much appreciated.

From an abundance of caution, I've written the fix to apply to 0.2.4 and later, though I doubt that's necessary.

comment:6 in reply to: ↑ description Changed 17 months ago by teor

Replying to asn:

If sl->num_used is 0x7FFFFFFF prior to invoking smartlist_add, then the next smartlist_add is effectively:

void
smartlist_add(smartlist_t *sl, void *element)
{
  smartlist_ensure_capacity(sl, -2147483648);
  sl->list[2147483647] = element;
  sl->num_used = -2147483648
}

This is the case since we are dealing with a signed 32 bit integer, and 2147483647 + 1 equals -2147483647.

All of the code in smartlist_ensure_capacity is wrapped inside the following if block:

  if (size > sl->capacity) {
  }

The expression -2147483648 > 2147483647 equals false, thus the code inside the block is not executed.

What actually causes the segmentation fault is that a negative 32 bit integer is used to compute a the location of array index on a 64 bit memory layout, ie., the next call to smartlist_add is effectively:

void
smartlist_add(smartlist_t *sl, void *element)
{
  smartlist_ensure_capacity(sl, -2147483647); // Note that this is effective do-nothing code, as explained above
  sl->list[-2147483648] = element;
  sl->num_used = -2147483647
}

It's worth noting that signed integer overflow is undefined behaviour in C, so an optimising compiler could strictly do anything with 2147483647 + 1, including assuming that it will never occur, and optimising out the check, code after the check, or even code before the check.

http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

Whether this undefined behaviour actually occurs in Tor depends on the setting of configure's --enable-expensive-hardening. It sets -fwrapv, which guarantees wrapping on signed-integer overflow. (Note that in #17983, we consider building with -ftrapv instead of -fwrapv. This issue would have caused a crash on integer overflow under -ftrapv. It would have been a denial of service risk, but not otherwise exploitable.)

If configure --enable-expensive-hardening is not set, the behaviour is undefined, and the outcome depends on the compiler's optimisation of undefined behaviour. I don't know what current compilers do with this code - has anyone checked?

comment:7 in reply to: ↑ 5 Changed 17 months ago by teor

Replying to nickm:

The branch bug18162_024 in my public repository has a proposed fix for this bug, along with a small family of related bugs. You can see the patch here: https://gitweb.torproject.org/nickm/tor.git/commit/?h=bug18162_024&id=bca7083e8285e8e6a4377076a7e432417eafc6d2

Review would be much appreciated.

From an abundance of caution, I've written the fix to apply to 0.2.4 and later, though I doubt that's necessary.

Code review:

These fixes look like they work and resolve this issue.

In multiple functions:

The calculation ((size_t) sl->num_used)+1 never overflows, because sl->num_used is at most MAX_CAPACITY, and MAX_CAPACITY is always less than SIZE_MAX.

In smartlist_add_all():

In fact, MAX_CAPACITY is strictly less than or equal to SIZE_MAX/2, as long as sizeof(void*) >= 2. So the addition will never overflow on any platform tor builds on, either (so the check is redundant):

  size_t new_size = (size_t)s1->num_used + (size_t)s2->num_used;
  tor_assert(new_size >= (size_t) s1->num_used); /* check for overflow. */

s1->list + s1->num_used also won't overflow, because tor_malloc() and tor_realloc() must ensure that sl->list is allocated low enough in memory to contain s1->capacity. This argument applies to s1 before and after the smartlist_ensure_capacity() call, and to s2.

comment:8 in reply to: ↑ 5 Changed 17 months ago by asn

Replying to nickm:

The branch bug18162_024 in my public repository has a proposed fix for this bug, along with a small family of related bugs. You can see the patch here: https://gitweb.torproject.org/nickm/tor.git/commit/?h=bug18162_024&id=bca7083e8285e8e6a4377076a7e432417eafc6d2

Review would be much appreciated.

Thanks for the patch!

Looks reasonable, but oh my all this casting makes my brain hurt. This casting approach was preferred over re-defining sl->capacity and sl->num_used as size_t. Is this because you would then need to change too many things on the rest of the codebase?

comment:9 Changed 17 months ago by nickm

This casting approach was preferred over re-defining sl->capacity and sl->num_used as size_t. Is this because you would then need to change too many things on the rest of the codebase?

That's right. In particular, if we cascade the change into smartlist_len(), then everything that calls smartlist_len or indexes a smartlist gets more complex. If we only cascade the change throughout container.c, it's still a bit ugly. See for example my branch bug18162_024_alternative , which I am not currently recommended.

comment:10 Changed 17 months ago by cypherpunks

The value range of size_t can be smaller than int. This is likely true for some compilers targeting segmented memory models. It would be good to have a compile time assert to check that.

Integer types are allowed to have padding bits. Thus, the check:
#if SIZEOF_SIZE_T > SIZEOF_INT
doesn't necessarily do the right thing. Even if there is no hardware that has padding, there might be compiler versions in the future where it matters, e.g. if they want to attach state to track usage of variables.

comment:11 follow-up: Changed 17 months ago by nickm

The value range of size_t can be smaller than int.
Integer types are allowed to have padding bits.

I'd be fine with compile-time fixes for both of these. How about something like

#if SIZE_MAX < INT_MAX
#error
#endif

for the first one, and something like

#if SIZE_MAX/SIZEOF_VOID_P > INT_MAX
 ...
#endif

for the second ?

(I do not expect to start supporting segmented memory models or weird padded ints.)

comment:12 follow-up: Changed 17 months ago by bugzilla

Haven't digged it deeply, but why signed int for size? Why int? Isn't size_t enough? uint64_t then?
All code with signed size must be rewritten even if it is very difficult (cause it's "bug by design").

comment:13 in reply to: ↑ 12 Changed 17 months ago by cypherpunks

+1 to bugzilla
May I ask an offtop question? Why do programmers use signed integers more often than unsigned as an array index?

comment:14 Changed 17 months ago by teor

I agree that it's much harder to program securely using int than size_t. For a new API, I'd use size_t.

But it is possible to check ints before you use them: we need to make sure the int is in range. And we have to check for underflow and overflow before they happen, as they are undefined behaviour.

The risks of converting all tor code over to size_t are that we introduce bugs while doing it, that it becomes a hell of merge conflicts, or that we don't find every instance.

It's worth noting that every smartlist() function that takes an int asserts that it's greater than 0, and less than the size of the smartlist.

And internally, smartlist_ensure_capacity() does similar overflow checks.
We could add an overflow check to smartlist_ensure_capacity(), but I don't think it can actually overflow.

Neither can num_used overflow, because it's bound by capacity. It can't underflow, because we check it's greater than 0 before decrementing it.

That said, we should build with -ftrapv (the conclusion we reached in #17983, despite the bug summary), and then if we miss any checks, it won't be a security issue, Tor will just crash.

comment:15 in reply to: ↑ 11 Changed 17 months ago by cypherpunks

Replying to nickm:

The value range of size_t can be smaller than int.
Integer types are allowed to have padding bits.

I'd be fine with compile-time fixes for both of these. How about something like

#if SIZE_MAX < INT_MAX
#error
#endif

for the first one, and something like

#if SIZE_MAX/SIZEOF_VOID_P > INT_MAX
 ...
#endif

for the second ?

That looks good and should be portable.

(I do not expect to start supporting segmented memory models or weird padded ints.)

That's reasonable.

There is a chance that the padding might happen with some debug or hardening configurations on a mainstream platform like clang/x86. The padding bits could be used for tracking uninitialized usage or concurrent access (data races) and on x86 with no penalty for unaligned access some arbitrary number of padding bits/bytes could be used (potentially different for signed/unsigned types). So the compile-time check should be there.

comment:16 Changed 17 months ago by asn

  • Keywords tor-bug-bounty added

comment:17 follow-up: Changed 17 months ago by asn

Looking at bug18162_024:

-  if (size > sl->capacity) {
-    int higher = sl->capacity;
+  tor_assert(size <= MAX_CAPACITY);

can't this assert be triggered by an attacker who can fill up smartlists? For example smartlist_add() does

-  smartlist_ensure_capacity(sl, sl->num_used+1);
+  smartlist_ensure_capacity(sl, ((size_t) sl->num_used)+1);

what if sl->num_used is MAX_CAPACITY at that point?

Or is that assert just there to stop the heap corruption, assuming we will never need so many elements in a smartlist?

comment:18 Changed 17 months ago by bugzilla

comment:19 in reply to: ↑ 17 Changed 17 months ago by teor

Replying to asn:

Looking at bug18162_024:

-  if (size > sl->capacity) {
-    int higher = sl->capacity;
+  tor_assert(size <= MAX_CAPACITY);

can't this assert be triggered by an attacker who can fill up smartlists? For example smartlist_add() does

-  smartlist_ensure_capacity(sl, sl->num_used+1);
+  smartlist_ensure_capacity(sl, ((size_t) sl->num_used)+1);

what if sl->num_used is MAX_CAPACITY at that point?

Or is that assert just there to stop the heap corruption, assuming we will never need so many elements in a smartlist?

Yes, if we're using over 2 billion elements in a smartlist, something has gone very wrong, and we should exit.

comment:20 Changed 17 months ago by nickm

  • Resolution set to fixed
  • Status changed from needs_review to closed

Okay. I've added the additional preprocessor hacks discussed above and merged to 0.2.4 and later.

Note 740421af194b890c24242a834ed03ffc5c4c16ab where I had a merge conflict merging into 0.2.6.

Note also 7788ee43e519a8f39e3917c4161e7c635cd2ecd9 where I had a merge conflict merging into master.

comment:21 Changed 14 months ago by nickm

  • Keywords 2016-bug-retrospective added

Mark more tickets for severe bug retrospective, based on Priority and date and hand-inspection.

comment:22 Changed 14 months ago by nickm

Mark more tickets for bug retrospective based on hand-review of changelogs from 0.2.5 onwards.

Note: See TracTickets for help on using tickets.