One thing we can do to improve the security of TBB is to build it with an alternate semi-hardened malloc implementation that attempts to randomize the allocation pattern and performs some minimal checks to guard against heap overflows and reference count issues in Firefox (perhaps by also enabling some additional reference count debugging features already in Firefox).
Such allocator behavior may make exploitation of various use-after-free vulnerabilities more difficult, as it would be harder to predict the location of reallocated regions during exploitation in order to get a target object to overlay an incorrectly freed object.
The downside is this will likely come at the performance costs of loss of locality, increased fragmentation, and additional overhead of reference count checks, but this may be an acceptable cost for improved hardening against exploits.
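To make the randomization idea concrete, here is a toy sketch of a delayed-free quarantine: freed blocks are junk-filled and parked in a randomly chosen slot, and only the block previously in that slot goes back to the underlying allocator. All names (q_free, Q_SLOTS, Q_JUNK) are hypothetical, the caller is assumed to know the block size, and rand() merely stands in for the secure random source a real allocator would need. It is not taken from any existing allocator.

```c
#include <stdlib.h>
#include <string.h>

#define Q_SLOTS 8      /* assumed quarantine capacity */
#define Q_JUNK  0x5a   /* assumed junk byte written over freed memory */

typedef struct {
    void  *ptr;   /* quarantined block, or NULL if the slot is empty */
    size_t size;  /* size the caller passed in, kept for later checks */
} q_entry;

static q_entry quarantine[Q_SLOTS];

/*
 * Junk-fill `ptr`, park it in a random quarantine slot, and release
 * whatever block was parked there before. Returns 1 if an older block
 * was handed back to the underlying allocator, 0 if the slot was empty.
 */
static int q_free(void *ptr, size_t size)
{
    if (ptr == NULL)
        return 0;

    /* Overwrite stale contents so dangling reads see junk, not old data. */
    memset(ptr, Q_JUNK, size);

    /* rand() is a placeholder; it is predictable and not suitable here. */
    size_t i = (size_t)rand() % Q_SLOTS;

    void *evicted = quarantine[i].ptr;
    int released = (evicted != NULL);

    quarantine[i].ptr = ptr;
    quarantine[i].size = size;

    free(evicted);  /* free(NULL) is a no-op */
    return released;
}
```

Because the slot is chosen at random, the number of later frees that must happen before a given block becomes reusable is not fixed, which is what makes the reallocation harder to time. The cost is up to Q_SLOTS blocks of memory held back at any moment.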
The first question is: are there any existing drop-in replacement memory allocators we can use in place of Firefox's current jemalloc implementation?
The second question is: will any of the Firefox refcounting checks actually help, or will they just increase runtime for no real benefit?
Trac: Summary: Investigate usage of alternate memory allocators and meory hardening options to Investigate usage of alternate memory allocators and memory hardening options
The memory.free_dirty_pages pref may have some effect here in terms of encouraging the default jemalloc allocator to minimize the use of free lists, but according to https://bugzilla.mozilla.org/show_bug.cgi?id=805855, we may also have to emit the 'memory-pressure' observer event for it to have effect.
I spoke with cevans and he helped me pull the newest version of ctmalloc from http://src.chromium.org/blink/trunk/Source/wtf/PartitionAlloc.h, which has the usable_size function needed. I commented out a bunch of macros, successfully built, and this may get past the sqlite3 problem. It crashed in a new spot though. Progress.
Just a note to mention that the memory tags from about:memory may be useful for initial partitioning with PartitionAlloc, but we may need to special case some of that. For instance, ArrayBuffers are a popular target for setting up UAF exploitation, and probably need a special partition, as do certain DOM/iframe objects (see http://robert.ocallahan.org/2010/10/mitigating-dangling-pointer-bugs-using_15.html).
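As a rough illustration of why a separate partition helps, the toy sketch below gives each object class its own fixed pool, so a slot freed from one partition can only ever be handed back to an allocation of the same class. The partition names, slot sizes, and the part_alloc/part_free functions are all hypothetical and are not PartitionAlloc's API.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical partitions, loosely following the object classes above. */
enum partition { PART_BUFFER, PART_LAYOUT, PART_GENERAL, PART_COUNT };

#define SLOT_SIZE      64  /* assumed fixed slot size */
#define SLOTS_PER_PART 16  /* assumed slots per partition */

static _Alignas(16) unsigned char pool[PART_COUNT][SLOTS_PER_PART][SLOT_SIZE];
static unsigned char in_use[PART_COUNT][SLOTS_PER_PART];

/* First-fit allocation that never leaves the requested partition. */
static void *part_alloc(enum partition p, size_t size)
{
    if (size > SLOT_SIZE)
        return NULL;
    for (size_t i = 0; i < SLOTS_PER_PART; i++) {
        if (!in_use[p][i]) {
            in_use[p][i] = 1;
            return pool[p][i];
        }
    }
    return NULL;  /* partition exhausted */
}

/* Abort on pointers that are not live allocations of partition `p`. */
static void part_free(enum partition p, void *ptr)
{
    if (ptr == NULL)
        return;

    uintptr_t base = (uintptr_t)&pool[p][0][0];
    uintptr_t addr = (uintptr_t)ptr;

    if (addr < base || addr >= base + sizeof pool[p] ||
        (addr - base) % SLOT_SIZE != 0)
        abort();  /* wrong partition or not a slot boundary */

    size_t i = (addr - base) / SLOT_SIZE;
    if (!in_use[p][i])
        abort();  /* double free */
    in_use[p][i] = 0;
}
```

With this layout, an attacker who frees a PART_BUFFER object cannot get a PART_LAYOUT object placed on top of it; only another PART_BUFFER allocation can reuse that slot. Choosing which call sites map to which partition is the hard part and is not addressed here.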
Unfortunately, my implementation of memalign is wrong; I need to hack at PartitionAlloc to support that, which will be a bit risky and tricky. Also, once jemalloc3 comes along, the advantages of using PartitionAlloc are much fewer, although some of the work (e.g. random partitioning based on callsite) could likely be ported over as defense in depth.
This field is so complex that even M$ is still trying to write MM without bugs. So it's better to think twice before taking on the responsibility for it.
An update: I tried building tor-browser.git with --enable-jemalloc flag, and it was able to build and run with no problem. This activates jemalloc4. As Tom Ritter points out in his experiments, jemalloc potentially makes it possible to partition the heap into "arenas", but it's going to require a lot of changes that ideally will be carried out by Mozilla.
then it seems to me the test bustage is not related to enabling jemalloc4. Moreover, (intermittent) test failures related to jemalloc 4.2.1 were reported for x86-64 win which we don't support currently. Thus, I think for the alpha series it could make sense to backport 4.2.1 and start playing with hardening it.
Do we have a patch somewhere up for review, Arthur? I suppose it could take some days for the back-and-forth of the review process and 6.5a6 is scheduled for next week.
Here is a patch (please review) that uses jemalloc4 in Tor Browser ESR45 (Linux). It also activates jemalloc redzones, and aborts if the redzones are found to have been corrupted.
(I'm still working on two alternative patches that (1) use the DieHarder memory allocator and (2) use jemalloc4 with randomized arenas, but I think we may want to give the redzone patch a try for now.)
Two other things I am wondering: should we have this on the hardened series first, only? (sounds like a good idea to me) Are we okay with the jemalloc4 ESR45 ships or do we think we should need to backport the patches already landed on m-c (or are about to land)? There was work done in comment:29 for that already and I am wondering about the severity of all the bugfixes that landed in jemalloc4 meanwhile...
So the free(...) call resulted in jemalloc detecting that a redzone had been overwritten. Note that 16 bytes of overwriting were detected, because allocation is chunked to 16 bytes here.
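For readers unfamiliar with redzones, the general technique can be sketched as a thin wrapper around malloc. This is not jemalloc's implementation: the names (rz_malloc, rz_check, rz_free), the 0xa5 fill byte, and the 16-byte quantum are assumptions chosen to mirror the behavior described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not jemalloc's code. */
#define RZ_QUANTUM 16u    /* assumed allocation granularity and redzone size */
#define RZ_BYTE    0xa5u  /* assumed fill pattern for redzone bytes */

typedef struct {
    size_t usable;  /* request rounded up to a multiple of RZ_QUANTUM */
    size_t pad;     /* keeps the header at 16 bytes on LP64 targets */
} rz_header;

/* Allocate `size` bytes followed by an RZ_QUANTUM-byte redzone. */
static void *rz_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(rz_header) - 2 * RZ_QUANTUM)
        return NULL;

    size_t usable = (size + RZ_QUANTUM - 1) / RZ_QUANTUM * RZ_QUANTUM;
    if (usable == 0)
        usable = RZ_QUANTUM;

    rz_header *h = malloc(sizeof *h + usable + RZ_QUANTUM);
    if (h == NULL)
        return NULL;
    h->usable = usable;
    h->pad = 0;

    unsigned char *user = (unsigned char *)(h + 1);
    memset(user + usable, RZ_BYTE, RZ_QUANTUM);  /* paint the redzone */
    return user;
}

/* Return how many redzone bytes no longer hold the fill pattern. */
static size_t rz_check(const void *ptr)
{
    assert(ptr != NULL);
    const rz_header *h = (const rz_header *)ptr - 1;
    const unsigned char *rz = (const unsigned char *)ptr + h->usable;
    size_t bad = 0;
    for (size_t i = 0; i < RZ_QUANTUM; i++)
        if (rz[i] != RZ_BYTE)
            bad++;
    return bad;
}

/* Validate the redzone on free and abort if it was overwritten. */
static void rz_free(void *ptr)
{
    if (ptr == NULL)
        return;
    if (rz_check(ptr) != 0)
        abort();
    free((rz_header *)ptr - 1);
}
```

In this sketch, writing 32 bytes into a block obtained from rz_malloc(16) makes rz_check report 16 corrupted bytes, and rz_free would abort. An overflow that stays inside the rounded-up usable size is not detected, and nothing is noticed until the block is freed.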
Two other things I am wondering: should we have this on the hardened series first, only? (sounds like a good idea to me)
That's OK with me! I'm not sure how it will interact with ASan, though.
Are we okay with the jemalloc4 ESR45 ships or do we think we should need to backport the patches already landed on m-c (or are about to land)? There was work done in comment:29 for that already and I am wondering about the severity of all the bugfixes that landed in jemalloc4 meanwhile...
Unfortunately, when I include just the 4 patches from comment:29 and use MOZ_JEMALLOC4=1, I get build errors. I think jemalloc didn't actually build in comment:29, because the patches that implemented --enable-jemalloc=4 weren't included. I'm working on backporting more patches to try to get the latest jemalloc to build. It's a bit complex.
Unfortunately, when I include just the 4 patches from comment:29 and use MOZ_JEMALLOC4=1, I get build errors.
Apparently I was doing something wrong here. When I wiped everything again, the build worked fine with the 4 patches from comment:29. Sorry for my confusion!
Here's a new version, using the 4 patches from comment:29 to upgrade to jemalloc 4.3.1. Then my additional patch enables jemalloc4 as the memory allocator and activates aborts on redzones.
Something Yawning pointed out is that redzones will no longer be available in jemalloc 5:
https://github.com/jemalloc/jemalloc/issues/369
But given that Firefox 52 still uses jemalloc 4.x, we should be OK for all of 2017. I see redzones as a stopgap while we continue to look for better options.
I did not look at the backported patches; do I need to? They are large; I guess if they merged cleanly they are probably okay.
Enabling the redzones and abort options might make TB slower and less stable. But I guess finding lurking bugs is a good thing as long as performance is not reduced too much.
I did not look at the backported patches; do I need to? They are large; I guess if they merged cleanly they are probably okay.
I think those are OK -- they're just pulling in the update patches from jemalloc.
Enabling the redzones and abort options might make TB slower and less stable. But I guess finding lurking bugs is a good thing as long as performance is not reduced too much.
Both these comments are true. Seems like the alphas are the right place until we observe performance and how crashy the browser is. One thing I can try in the near future is to push this patch to Mozilla's try server and look at crashes and talos (performance) results.
I'm pasting this here from an email I wrote in Sept 2016.
The original problem with PartitionAlloc was the lack of memalign
functions and the need to implement them. They don't seem to have been
added, so that problem persists.
There seem to be some security
features that are in OpenBSD's allocator that aren't/can't be in
PartitionAlloc, but not all of them.
No inline metadata - PartitionAlloc has this feature as well.
"It is guaranteed to abort for pointers that are not active malloc allocations." - not sure about this, but http://struct.github.io/partition_alloc.html has a patch for PartitionAlloc that does check the freelist for double frees. I'm not sure if this is a comprehensive equality of features though.
"sets the allocator to abort on out-of-memory by default" - This is probably pretty easy to do. (Just a NULL check and an abort() no?)
"Fine-grained randomization is performed for small allocations by choosing a random pool to satisfy requests" - okay, but this is like 'choose a random partition' except not as powerful, because you're in the same heap
"and then choosing a random free slot within a page provided by that pool" - PartitionAlloc does not have this, but it could. Here's a patch: http://struct.github.io/partition_alloc.html
"Freed small allocations are quarantined before being put back into circulation via a randomized delayed allocation pool" - okay, PartitionAlloc doesn't have this
"CopperheadOS uses a ring buffer..." PartitionAlloc doesn't have this
"Small allocations are filled with junk data upon being released." - this is easily added
"Canaries can be placed at the end of small allocations to absorb small overflows and catch various forms of heap corruption upon free. This was a successfully upstreamed CopperheadOS extension. " - PartitionAlloc doesn't have this
I looked at OpenBSD's allocator, which AFAICT is in openbsd's
src/lib/libc/stdlib/malloc.c
It contains an implementation of the needed functions: malloc,
posix_memalign, calloc, realloc, free, and usable_size
It does not contain an implementation of the following, but they should
be simple enough to implement: memalign, aligned_alloc, valloc, good_size
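As a rough sketch of why these look simple, three of the four can be layered on posix_memalign. The shim_ names are hypothetical, and this is not code from OpenBSD or Firefox. good_size is left out because it depends on the allocator's internal size classes.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* memalign: any power-of-two alignment, no constraint on size. */
static void *shim_memalign(size_t alignment, size_t size)
{
    if (alignment == 0 || (alignment & (alignment - 1)) != 0) {
        errno = EINVAL;
        return NULL;
    }
    /* posix_memalign also needs a multiple of sizeof(void *). */
    if (alignment < sizeof(void *))
        alignment = sizeof(void *);

    void *p = NULL;
    int rc = posix_memalign(&p, alignment, size);
    if (rc != 0) {
        errno = rc;  /* posix_memalign returns the error instead of setting errno */
        return NULL;
    }
    return p;
}

/* aligned_alloc: this sketch enforces the C11 rule that size is a multiple of alignment. */
static void *shim_aligned_alloc(size_t alignment, size_t size)
{
    if (alignment == 0 || size % alignment != 0) {
        errno = EINVAL;
        return NULL;
    }
    return shim_memalign(alignment, size);
}

/* valloc: page-aligned allocation. */
static void *shim_valloc(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        errno = EINVAL;
        return NULL;
    }
    return shim_memalign((size_t)page, size);
}
```

Blocks returned by these shims are released with plain free(), since posix_memalign guarantees that. Whether the real integration can be this thin depends on how Firefox's allocator glue expects these entry points to behave, which this sketch does not cover.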
So I think it would be possible to get OpenBSD's allocator in without a
ton of pain... But the main thing OpenBSD's allocator seems to lack is
any sort of partitioning, and the real gains that Chrome saw came from
moving Layout Objects and Buffer Objects to their own partitions.
So that brings us to jemalloc. As far as integration goes: Mozilla has
merged in jemalloc 4, but it seems to have a lot of bugs filed against
it. Some try results seemed to pass on 4.1.1 but failed on 4.2. It
seems the best way to figure out its stability is to sit down
with Mike Hommey and ask: if we want to enable jemalloc 3 or 4, how
stable is it?