Potential heap corruption in smartlist_add(), smartlist_insert()

changed milestone to %Tor: 0.2.8.x-final

added 024-backport 025-backport 026-backport 027-backport 2016-bug-retrospective component::core tor/tor milestone::Tor: 0.2.8.x-final owner::nickm priority::high resolution::fixed security severity::normal status::closed tor-bug-bounty type::defect labels

Trac:
Description: Here follows vulnerability report by Guido Vranken. The attack requires minimum 16GB of available memory on the victim's host, so it's quite hard to be exploited.

Walkthrough of the vulnerability

smartlist_add and smartlist_insert both invoke smartlist_ensure_capacity prior adding an element to the list in order to ensure that sufficient memory is available, to exit() if not enough memory is available and to detect requests for an invalid size:

static INLINE void
smartlist_ensure_capacity(smartlist_t *sl, int size)
{
#if SIZEOF_SIZE_T > SIZEOF_INT
#define MAX_CAPACITY (INT_MAX)
#else
#define MAX_CAPACITY (int)((SIZE_MAX / (sizeof(void*))))
#define ASSERT_CAPACITY
#endif
  if (size > sl->capacity) {
    int higher = sl->capacity;
    if (PREDICT_UNLIKELY(size > MAX_CAPACITY/2)) {
#ifdef ASSERT_CAPACITY
      /* We don't include this assertion when MAX_CAPACITY == INT_MAX,
       * since int size; (size <= INT_MAX) makes analysis tools think we're
       * doing something stupid. */
      tor_assert(size <= MAX_CAPACITY);
#endif
      higher = MAX_CAPACITY;
    } else {
      while (size > higher)
        higher *= 2;
    }
    sl->capacity = higher;
    sl->list = tor_reallocarray(sl->list, sizeof(void*),
                                ((size_t)sl->capacity));
  }
#undef ASSERT_CAPACITY
#undef MAX_CAPACITY
}

On a typical 64-bit system, SIZEOF_INT is 4 and SIZEOF_SIZE_T is 8. Consequently, MAX_CAPACITY is INT_MAX, which is 0x7FFFFFFF as can be seen in torint.h:

#ifndef INT_MAX
#if (SIZEOF_INT == 4)
#define INT_MAX 0x7fffffffL
#elif (SIZEOF_INT == 8)
#define INT_MAX 0x7fffffffffffffffL
#else
#error "Can't define INT_MAX"
#endif
#endif

So MAX_CAPACITY is 0x7FFFFFFF. Now assume that that many (0x7FFFFFFF) items have already been added to a smartlist via smartlist_add(sl, value).

smartlist_add() is:

void
smartlist_add(smartlist_t *sl, void *element)
{
  smartlist_ensure_capacity(sl, sl->num_used+1);
  sl->list[sl->num_used++] = element;
}

If sl->num_used is 0x7FFFFFFF prior to invoking smartlist_add, then the next smartlist_add is effectively:

void
smartlist_add(smartlist_t *sl, void *element)
{
  smartlist_ensure_capacity(sl, -2147483648);
  sl->list[2147483647] = element;
  sl->num_used = -2147483648
}

This is the case since we are dealing with a signed 32 bit integer, and 2147483647 + 1 equals -2147483647.

All of the code in smartlist_ensure_capacity is wrapped inside the following if block:

  if (size > sl->capacity) {
  }

The expression -2147483648 > 2147483647 equals false, thus the code inside the block is not executed.

What actually causes the segmentation fault is that a negative 32 bit integer is used to compute a the location of array index on a 64 bit memory layout, ie., the next call to smartlist_add is effectively:

void
smartlist_add(smartlist_t *sl, void *element)
{
  smartlist_ensure_capacity(sl, -2147483647); // Note that this is effective do-nothing code, as explained above
  sl->list[-2147483648] = element;
  sl->num_used = -2147483647
}

Discussion

The requirement for 16 gigabytes of memory is considerable.

Triggering the vulnerability obviously also requires some code path which will invoke smartlist_add or smartlist_insert upon the same smartlist at the attacker's behest. Moreover, such a code path may have the side effect that it requires a separate allocation for each object that is added to the list; smartlist_add takes a pointer argument after all -- usually, but not always, this pointer refers to freshly allocated memory. Exceptions to this rule are static strings and pointers to a place in a large string or buffer that was already extant. Once a vulnerable code path has been discovered, then it ultimately boils down to how much memory a user's machine is able to allocate in order to corrupt the heap.

Despite these constraints, smartlists form a considerable portion of the infrastructure of your code (I count some 380+ occurrences of smartlist_add/smartlist_insert in the .c files using grep, that is excluding the test/ directory) and as such it's probably wise to revise the checks in smartlist_ensure_capacity.

to

Here follows vulnerability report by Guido Vranken reported through the Tor bug bounty program.

The attack requires minimum 16GB of available memory on the victim's host, so it's quite hard to be exploited.

Walkthrough of the vulnerability

smartlist_add and smartlist_insert both invoke smartlist_ensure_capacity prior adding an element to the list in order to ensure that sufficient memory is available, to exit() if not enough memory is available and to detect requests for an invalid size:

static INLINE void
smartlist_ensure_capacity(smartlist_t *sl, int size)
{
#if SIZEOF_SIZE_T > SIZEOF_INT
#define MAX_CAPACITY (INT_MAX)
#else
#define MAX_CAPACITY (int)((SIZE_MAX / (sizeof(void*))))
#define ASSERT_CAPACITY
#endif
  if (size > sl->capacity) {
    int higher = sl->capacity;
    if (PREDICT_UNLIKELY(size > MAX_CAPACITY/2)) {
#ifdef ASSERT_CAPACITY
      /* We don't include this assertion when MAX_CAPACITY == INT_MAX,
       * since int size; (size <= INT_MAX) makes analysis tools think we're
       * doing something stupid. */
      tor_assert(size <= MAX_CAPACITY);
#endif
      higher = MAX_CAPACITY;
    } else {
      while (size > higher)
        higher *= 2;
    }
    sl->capacity = higher;
    sl->list = tor_reallocarray(sl->list, sizeof(void*),
                                ((size_t)sl->capacity));
  }
#undef ASSERT_CAPACITY
#undef MAX_CAPACITY
}

On a typical 64-bit system, SIZEOF_INT is 4 and SIZEOF_SIZE_T is 8. Consequently, MAX_CAPACITY is INT_MAX, which is 0x7FFFFFFF as can be seen in torint.h:

#ifndef INT_MAX
#if (SIZEOF_INT == 4)
#define INT_MAX 0x7fffffffL
#elif (SIZEOF_INT == 8)
#define INT_MAX 0x7fffffffffffffffL
#else
#error "Can't define INT_MAX"
#endif
#endif

So MAX_CAPACITY is 0x7FFFFFFF. Now assume that that many (0x7FFFFFFF) items have already been added to a smartlist via smartlist_add(sl, value).

smartlist_add() is:

void
smartlist_add(smartlist_t *sl, void *element)
{
  smartlist_ensure_capacity(sl, sl->num_used+1);
  sl->list[sl->num_used++] = element;
}

If sl->num_used is 0x7FFFFFFF prior to invoking smartlist_add, then the next smartlist_add is effectively:

void
smartlist_add(smartlist_t *sl, void *element)
{
  smartlist_ensure_capacity(sl, -2147483648);
  sl->list[2147483647] = element;
  sl->num_used = -2147483648
}

This is the case since we are dealing with a signed 32 bit integer, and 2147483647 + 1 equals -2147483647.

All of the code in smartlist_ensure_capacity is wrapped inside the following if block:

  if (size > sl->capacity) {
  }

The expression -2147483648 > 2147483647 equals false, thus the code inside the block is not executed.

What actually causes the segmentation fault is that a negative 32 bit integer is used to compute a the location of array index on a 64 bit memory layout, ie., the next call to smartlist_add is effectively:

void
smartlist_add(smartlist_t *sl, void *element)
{
  smartlist_ensure_capacity(sl, -2147483647); // Note that this is effective do-nothing code, as explained above
  sl->list[-2147483648] = element;
  sl->num_used = -2147483647
}

Discussion

The requirement for 16 gigabytes of memory is considerable.

Triggering the vulnerability obviously also requires some code path which will invoke smartlist_add or smartlist_insert upon the same smartlist at the attacker's behest. Moreover, such a code path may have the side effect that it requires a separate allocation for each object that is added to the list; smartlist_add takes a pointer argument after all -- usually, but not always, this pointer refers to freshly allocated memory. Exceptions to this rule are static strings and pointers to a place in a large string or buffer that was already extant. Once a vulnerable code path has been discovered, then it ultimately boils down to how much memory a user's machine is able to allocate in order to corrupt the heap.

Despite these constraints, smartlists form a considerable portion of the infrastructure of your code (I count some 380+ occurrences of smartlist_add/smartlist_insert in the .c files using grep, that is excluding the test/ directory) and as such it's probably wise to revise the checks in smartlist_ensure_capacity.

Trac:
Priority: Medium to High
Milestone: N/A to Tor: 0.2.8.x-final
Keywords: N/A deleted, security added
Component: - Select a component to Tor

Trac:
Keywords: security deleted, security 025-backport 026-backport 027-backport added

Trac:
Status: new to accepted
Owner: N/A to nickm

The branch bug18162_024 in my public repository has a proposed fix for this bug, along with a small family of related bugs. You can see the patch here: https://gitweb.torproject.org/nickm/tor.git/commit/?h=bug18162_024&id=bca7083e8285e8e6a4377076a7e432417eafc6d2

Review would be much appreciated.

From an abundance of caution, I've written the fix to apply to 0.2.4 and later, though I doubt that's necessary.

Trac:
Keywords: security 025-backport 026-backport 027-backport deleted, security 025-backport 026-backport 027-backport 024-backport added
Status: accepted to needs_review

Replying to asn:

If sl->num_used is 0x7FFFFFFF prior to invoking smartlist_add, then the next smartlist_add is effectively:

{{{ void smartlist_add(smartlist_t *sl, void *element) { smartlist_ensure_capacity(sl, -2147483648); sl->list[2147483647] = element; sl->num_used = -2147483648 } }}}

This is the case since we are dealing with a signed 32 bit integer, and 2147483647 + 1 equals -2147483647.

All of the code in smartlist_ensure_capacity is wrapped inside the following if block:

{{{ if (size > sl->capacity) { } }}}

The expression -2147483648 > 2147483647 equals false, thus the code inside the block is not executed.

What actually causes the segmentation fault is that a negative 32 bit integer is used to compute a the location of array index on a 64 bit memory layout, ie., the next call to smartlist_add is effectively:

{{{ void smartlist_add(smartlist_t *sl, void *element) { smartlist_ensure_capacity(sl, -2147483647); // Note that this is effective do-nothing code, as explained above sl->list[-2147483648] = element; sl->num_used = -2147483647 } }}}

It's worth noting that signed integer overflow is undefined behaviour in C, so an optimising compiler could strictly do anything with 2147483647 + 1, including assuming that it will never occur, and optimising out the check, code after the check, or even code before the check.

http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

Whether this undefined behaviour actually occurs in Tor depends on the setting of configure's --enable-expensive-hardening. It sets -fwrapv, which guarantees wrapping on signed-integer overflow. (Note that in #17983 (moved), we consider building with -ftrapv instead of -fwrapv. This issue would have caused a crash on integer overflow under -ftrapv. It would have been a denial of service risk, but not otherwise exploitable.)

If configure --enable-expensive-hardening is not set, the behaviour is undefined, and the outcome depends on the compiler's optimisation of undefined behaviour. I don't know what current compilers do with this code - has anyone checked?

Replying to nickm:

The branch bug18162_024 in my public repository has a proposed fix for this bug, along with a small family of related bugs. You can see the patch here: https://gitweb.torproject.org/nickm/tor.git/commit/?h=bug18162_024&id=bca7083e8285e8e6a4377076a7e432417eafc6d2

Review would be much appreciated.

From an abundance of caution, I've written the fix to apply to 0.2.4 and later, though I doubt that's necessary.

Code review:

These fixes look like they work and resolve this issue.

In multiple functions:

The calculation ((size_t) sl->num_used)+1 never overflows, because sl->num_used is at most MAX_CAPACITY, and MAX_CAPACITY is always less than SIZE_MAX.

In smartlist_add_all():

In fact, MAX_CAPACITY is strictly less than or equal to SIZE_MAX/2, as long as sizeof(void*) >= 2. So the addition will never overflow on any platform tor builds on, either (so the check is redundant):

  size_t new_size = (size_t)s1->num_used + (size_t)s2->num_used;
  tor_assert(new_size >= (size_t) s1->num_used); /* check for overflow. */

s1->list + s1->num_used also won't overflow, because tor_malloc() and tor_realloc() must ensure that sl->list is allocated low enough in memory to contain s1->capacity. This argument applies to s1 before and after the smartlist_ensure_capacity() call, and to s2.

Replying to nickm:

The branch bug18162_024 in my public repository has a proposed fix for this bug, along with a small family of related bugs. You can see the patch here: https://gitweb.torproject.org/nickm/tor.git/commit/?h=bug18162_024&id=bca7083e8285e8e6a4377076a7e432417eafc6d2

Review would be much appreciated.

Thanks for the patch!

Looks reasonable, but oh my all this casting makes my brain hurt. This casting approach was preferred over re-defining sl->capacity and sl->num_used as size_t. Is this because you would then need to change too many things on the rest of the codebase?

This casting approach was preferred over re-defining sl->capacity and sl->num_used as size_t. Is this because you would then need to change too many things on the rest of the codebase?

That's right. In particular, if we cascade the change into smartlist_len(), then everything that calls smartlist_len or indexes a smartlist gets more complex. If we only cascade the change throughout container.c, it's still a bit ugly. See for example my branch bug18162_024_alternative , which I am not currently recommended.

The value range of size_t can be smaller than int. This is likely true for some compilers targeting segmented memory models. It would be good to have a compile time assert to check that.

Integer types are allowed to have padding bits. Thus, the check: #if SIZEOF_SIZE_T > SIZEOF_INT doesn't necessarily do the right thing. Even if there is no hardware that has padding, there might be compiler versions in the future where it matters, e.g. if they want to attach state to track usage of variables.

The value range of size_t can be smaller than int. Integer types are allowed to have padding bits.

I'd be fine with compile-time fixes for both of these. How about something like

#if SIZE_MAX < INT_MAX
#error
#endif

for the first one, and something like

#if SIZE_MAX/SIZEOF_VOID_P > INT_MAX
 ...
#endif

for the second ?

(I do not expect to start supporting segmented memory models or weird padded ints.)

Haven't digged it deeply, but why signed int for size? Why int? Isn't size_t enough? uint64_t then? All code with signed size must be rewritten even if it is very difficult (cause it's "bug by design").

+1 to [comment:12 bugzilla] May I ask an offtop question? Why do programmers use signed integers more often than unsigned as an array index?

I agree that it's much harder to program securely using int than size_t. For a new API, I'd use size_t.

But it is possible to check ints before you use them: we need to make sure the int is in range. And we have to check for underflow and overflow before they happen, as they are undefined behaviour.

The risks of converting all tor code over to size_t are that we introduce bugs while doing it, that it becomes a hell of merge conflicts, or that we don't find every instance.

It's worth noting that every smartlist() function that takes an int asserts that it's greater than 0, and less than the size of the smartlist.

And internally, smartlist_ensure_capacity() does similar overflow checks. We could add an overflow check to smartlist_ensure_capacity(), but I don't think it can actually overflow.

Neither can num_used overflow, because it's bound by capacity. It can't underflow, because we check it's greater than 0 before decrementing it.

That said, we should build with -ftrapv (the conclusion we reached in #17983 (moved), despite the bug summary), and then if we miss any checks, it won't be a security issue, Tor will just crash.

Replying to nickm:

The value range of size_t can be smaller than int. Integer types are allowed to have padding bits.

I'd be fine with compile-time fixes for both of these. How about something like {{{ #if SIZE_MAX < INT_MAX #error #endif }}} for the first one, and something like {{{ #if SIZE_MAX/SIZEOF_VOID_P > INT_MAX ... #endif }}} for the second ?

That looks good and should be portable.

(I do not expect to start supporting segmented memory models or weird padded ints.)

That's reasonable.

There is a chance that the padding might happen with some debug or hardening configurations on a mainstream platform like clang/x86. The padding bits could be used for tracking uninitialized usage or concurrent access (data races) and on x86 with no penalty for unaligned access some arbitrary number of padding bits/bytes could be used (potentially different for signed/unsigned types). So the compile-time check should be there.

Trac:
Keywords: security 025-backport 026-backport 027-backport 024-backport deleted, tor-bug-bounty security 025-backport 026-backport 027-backport 024-backport added

Looking at bug18162_024:

-  if (size > sl->capacity) {
-    int higher = sl->capacity;
+  tor_assert(size <= MAX_CAPACITY);

can't this assert be triggered by an attacker who can fill up smartlists? For example smartlist_add() does

-  smartlist_ensure_capacity(sl, sl->num_used+1);
+  smartlist_ensure_capacity(sl, ((size_t) sl->num_used)+1);

what if sl->num_used is MAX_CAPACITY at that point?

Or is that assert just there to stop the heap corruption, assuming we will never need so many elements in a smartlist?

Interesting reading on a topic: https://www.cs.utah.edu/~regehr/papers/overflow12.pdf

Potential heap corruption in smartlist_add(), smartlist_insert()

Walkthrough of the vulnerability

Discussion

Child items 0

Activity

Walkthrough of the vulnerability

Discussion

Walkthrough of the vulnerability

Discussion