Apparently, to get the full benefit we may need to annotate the Mozilla allocator, but we should be able to make a test build without that annotation (it will just treat the entire malloc pool as one allocation).
SAFECode is apparently an extension to SoftBound, but it has only been rebased to LLVM 3.2 (whereas SoftBound has been kept up to date with LLVM 3.4): http://safecode.cs.illinois.edu/
Trac: Description: We should see if we can get TBB to build with SoftBound+CETS, a memory-safety extension to LLVM: http://acg.cis.upenn.edu/softbound/
Building Tor Browser now fails quite early with a bunch of "This case not handled, requesting memory from system Softboundcets: Memory safety violation detected" errors (see attachment 1).
GCC 4.8+ and Clang 3.1+ support this out of the box with -fsanitize=address. It may be a while before our cross-compilers pick this up, but we could build a special "TBB-Hardened" release for Linux only, perhaps as an alpha?
Trac: Summary: "Investigate building TBB with SoftBound" changed to "Investigate building TBB with SoftBound or AddressSanitizer"
I found that the Clang AddressSanitizer was a little more full-featured than the GCC 4.8 one: you can detect it at runtime, and you can redirect its output. I couldn't figure out how to make the GCC AddressSanitizer do that. But if the GCC 4.8 or GCC 4.9 AddressSanitizer works fine for you, then it ought to be fine.
We are pretty close, I guess. After resolving issues with the linker (I got
/usr/bin/ld.bfd.real: js: hidden symbol `__asan_default_options' in ../libjs_static.a(AsmJSSignalHandlers.o) is referenced by DSO
/usr/bin/ld.bfd.real: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
with the ld that Lucid ships and with a self-compiled one from binutils 2.22) by using a binutils newer than 2.22, everything seems to compile and link properly. However, the packaging step breaks with
=================================================================
==21490== ERROR: AddressSanitizer: stack-buffer-overflow on address 0xbfb0fe5c at pc 0x44edca29 bp 0xbfb0fdf4 sp 0xbfb0fde8
WRITE of size 4 at 0xbfb0fe5c thread T0
    #0 0x44edca28 (/home/ubuntu/build/tor-browser/obj-i686-pc-linux-gnu/toolkit/library/libxul.so+0x38b6a28)
    #1 0x489325b7 (/lib/tls/i686/cmov/libc-2.11.1.so+0x2f5b7)
ASAN:SIGSEGV
==21490== AddressSanitizer: while reporting a bug found another one.Ignoring.
Traceback (most recent call last):
  File "/home/ubuntu/build/tor-browser/toolkit/mozapps/installer/packager.py", line 375, in <module>
    main()
  File "/home/ubuntu/build/tor-browser/toolkit/mozapps/installer/packager.py", line 367, in main
    args.source, gre_path, base)
  File "/home/ubuntu/build/tor-browser/toolkit/mozapps/installer/packager.py", line 148, in precompile_cache
    errors.fatal('Error while running startup cache precompilation')
  File "/home/ubuntu/build/tor-browser/python/mozbuild/mozpack/errors.py", line 101, in fatal
    self._handle(self.FATAL, msg)
  File "/home/ubuntu/build/tor-browser/python/mozbuild/mozpack/errors.py", line 96, in _handle
    raise ErrorMessage(msg)
mozpack.errors.ErrorMessage: Error: Error while running startup cache precompilation
It also describes how to get a proper gdb stack trace out of UBSan. There may be a similar way to do this with ASan, to get a better stack trace for the packaging crash in comment 11?
gk: random idea: What if we told Firefox that the ASAN compiler was a cross compiler? Then the host gcc should build that libxul library, and use non-ASAN hardened tools in the packaging step, and this should avoid the crash during packaging?
That might be a smart idea. There is actually a section in the example .mozconfig that I omitted which might help us here:
# Avoid using ASan flags when building host tools like nsinstall
export HOST_CFLAGS=" "
export HOST_CXXFLAGS=" "
export HOST_LDFLAGS=" "
I'll test that on one of my machines. As I said, the problem no longer exists in Fx 29 with my current setup. So, meanwhile, I am bisecting on my other machine to find something useful (I guess this is less time-consuming than examining stack traces of an already solved problem, although it is quite tempting to take that road).
Okay. It turned out that my analysis was not correct. The crash in comment 11 happens only for i386 builds for reasons yet to be investigated. 64 bit builds are not affected. I uploaded a bundle to https://people.torproject.org/~gk/testbuilds/asan/20140521/
Doing a
export ASAN_OPTIONS=alloc_dealloc_mismatch=0
might help while testing. The build corresponds to the hardening_asan_linux_x86-64 branch in my public tor-browser-bundle repo, which I basically used to create the test bundle. Two things are needed to somewhat reproduce my work:
The standard Gitian VM is not big enough. One has to raise the value of the --rootsize flag in gitian-builder's make-base-vm script.
One needs the custom .mozconfig-asan file, which is attached. (It seems I can't easily upload files starting with a ".", so I renamed it to "mozconfig". But the build scripts in hardening_asan_linux_x86-64 expect a .mozconfig-asan.) Mike: could you add that one (as .mozconfig-asan) to the tor-browser repo?
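When testing from a script rather than an interactive shell, the ASAN_OPTIONS setting mentioned above can be merged into whatever options are already set instead of overwriting them. Here is a minimal Python sketch, assuming the options are colon-separated key=value pairs; the helper name with_asan_option is made up for illustration and is not part of any build script in the branch.

```python
import os


def with_asan_option(env, key, value):
    """Return a copy of env with key=value merged into ASAN_OPTIONS."""
    options = {}
    # Keep any options that are already present (colon-separated key=value).
    for item in env.get("ASAN_OPTIONS", "").split(":"):
        if "=" in item:
            name, _, val = item.partition("=")
            options[name] = val
    # Add or override the requested option.
    options[key] = str(value)
    new_env = dict(env)
    new_env["ASAN_OPTIONS"] = ":".join(f"{k}={v}" for k, v in options.items())
    return new_env


# Same effect as: export ASAN_OPTIONS=alloc_dealloc_mismatch=0
env = with_asan_option(os.environ, "alloc_dealloc_mismatch", 0)
print(env["ASAN_OPTIONS"])
```

The resulting env mapping could then be passed to subprocess.run(..., env=env) when launching the test bundle.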
Especially for testing I can highly recommend Clang. ASan isn't the only thing available there, you also get TSan, UBSan, LSan and some other checkers that GCC lacks. Not all of these are usable on Firefox, since our codebase has quite a few races and undefined behavior, but smaller programs can be tested quite well.
Happens here, too, thanks. Might be the first thing to look closer at.
FWIW: this happens with a vanilla ESR 24.5.0 as well but not with a recent ASan hardened Firefox nightly. Might be a real issue, might be a GCC + ASan issue on our side, might be...
A more general update: a) I might indeed have been right with comment:11, as I can compile an i386 ASan hardened Firefox 29 fine. Thus, I am back to bisecting.
b) Then I tried to get GCC 4.9.0 to compile in order to be able to make use of UBSan and the other tools that made Clang superior but it failed on Lucid with:
/bin/bash ./libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I../gcc-4.9.0/libbacktrace -I ../gcc-4.9.0/libbacktrace/../include -I ../gcc-4.9.0/libbacktrace/../libgcc -I ../libgcc -funwind-tables -frandom-seed=dwarf.lo -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -Wcast-qual -g -O2 -c -o dwarf.lo ../gcc-4.9.0/libbacktrace/dwarf.c
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I../gcc-4.9.0/libbacktrace -I ../gcc-4.9.0/libbacktrace/../include -I ../gcc-4.9.0/libbacktrace/../libgcc -I ../libgcc -funwind-tables -frandom-seed=dwarf.lo -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -Wcast-qual -g -O2 -c ../gcc-4.9.0/libbacktrace/dwarf.c -o dwarf.o
../gcc-4.9.0/libbacktrace/dwarf.c: In function 'dwarf_lookup_pc':
../gcc-4.9.0/libbacktrace/dwarf.c:2678: warning: implicit declaration of function '__atomic_load_n'
../gcc-4.9.0/libbacktrace/dwarf.c:2678: error: '__ATOMIC_ACQUIRE' undeclared (first use in this function)
../gcc-4.9.0/libbacktrace/dwarf.c:2678: error: (Each undeclared identifier is reported only once
../gcc-4.9.0/libbacktrace/dwarf.c:2678: error: for each function it appears in.)
../gcc-4.9.0/libbacktrace/dwarf.c:2738: warning: implicit declaration of function '__atomic_store_n'
../gcc-4.9.0/libbacktrace/dwarf.c:2738: error: '__ATOMIC_RELEASE' undeclared (first use in this function)
../gcc-4.9.0/libbacktrace/dwarf.c: In function 'dwarf_fileline':
../gcc-4.9.0/libbacktrace/dwarf.c:2873: error: '__ATOMIC_ACQUIRE' undeclared (first use in this function)
../gcc-4.9.0/libbacktrace/dwarf.c: In function 'backtrace_dwarf_add':
../gcc-4.9.0/libbacktrace/dwarf.c:3006: error: '__ATOMIC_ACQUIRE' undeclared (first use in this function)
make[2]: Leaving directory `/home/ubuntu/build/gcc/libbacktrace'
make[2]: *** [dwarf.lo] Error 1
It could be our Gitian setup that is the culprit here, though. Anyway, using Precise outside of Gitian compiles GCC 4.9.0 fine.
gk - I have three thoughts about getting this out the door quicker in the best shape possible:
1) Screw lucid. Let's only support x64 and Precise+ with these builds. Build 4.9.0 and the ASAN+Ubsan+VTV firefox in Precise, and don't worry about that 4.9.0 compile error. (Though I guess this means we can't use the gitian-utils descriptors as-is to build this compiler with the rest of the tools.)
2) Don't strip it, so stacktraces like the cypherpunks one in comment:16 make sense immediately without the need to make a second set of detached debug symbols for this build. This way we don't hit #12103 (closed) either, and hopefully all of the other hardening options will remain intact too.
3) Install all Firefox langpack locales in one build. This way we don't have to ship 12 versions of this huge build. We can provide instructions for users on how to switch their language for now, and perhaps later add a Tor Launcher or other UI option to select locale for these builds.
Thoughts? I suppose an alternate way to achieve #1 might be to build a 4.8 gcc in lucid and then use that gcc to build 4.9. Not sure which would mean more build time/hassle on average.
gk - I have three thoughts about getting this out the door quicker in the best shape possible:
Screw lucid. Let's only support x64 and Precise+ with these builds. Build 4.9.0 and the ASAN+Ubsan+VTV firefox in Precise, and don't worry about that 4.9.0 compile error. (Though I guess this means we can't use the gitian-utils descriptors as-is to build this compiler with the rest of the tools..).
We are not only throwing lucid but debian stable and presumably other distros as well out of the boat. So I'd rather avoid that at the moment if possible. Re: re-using descriptors: I wouldn't worry about that much currently as we need a separate hardening-branch anyway (e.g. we don't build 32bit bundles as this breaks etc.).
Don't strip it, so stacktraces like the cypherpunks one in comment:16 make sense immediately without the need to make a second set of detached debug symbols for this build. This way we don't hit #12103 (closed) either, and hopefully all of the other hardening options will remain intact too.
Install all Firefox langpack locales in one build. This way we don't have to ship 12 versions of this huge build. We can provide instructions for users on how to switch their language for now, and perhaps later add a Tor Launcher or other UI option to select locale for these builds.
Hrm... I am not a fan of this idea for a couple of reasons:
0) We need to fix #12103 (closed) anyway for non-hardened builds.
1) Users have to download a huge build (e.g. the debug symbols file alone is twice as big with ASan) which might deter them from testing/using it.
2) We need to provide additional instructions and/or a Tor Launcher patch that both need to be maintained.
3) (and this one is the most important to me) There might be cases where a stacktrace alone is not helpful for debugging, i.e. cases where we want the things --enable-debug and --disable-optimize (and maybe others) give us.
Thoughts? I suppose an alternate way to achieve #1 might be to build a 4.8 gcc in lucid and then use that gcc to build 4.9. Not sure which would mean more build time/hassle on average.
I slightly prefer that approach to #1 if we don't find a better solution. It costs the extra build time only once, as we save the built utils (but this build time overhead can be quite a lot, as we need to compile both gccs with -j1 due to autotools not liking libfaketime). Anyway, I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61314 and maybe the gcc folks will come up with an easy fix/workaround for us.
Hmm... looking at the log again, I just realized that we are failing in the "make install" step. The failure probably happens due to libfaketime issues, although I still don't see why re-compiling parts of libbacktrace in this step should lead to the error I encounter. Note that I compiled gcc 4.9.0 on precise just for testing purposes. We might actually run into the very same issue once we switch to it under the libfaketime rule. Maybe the gcc people have a good idea.
Another option would be to avoid using libfaketime for building GCC 4.9.0 (let's suppose that is the real problem here; I have to test that) as we are currently not checking whether the utils are built deterministically at all.
gk - I have three thoughts about getting this out the door quicker in the best shape possible:
Screw lucid. Let's only support x64 and Precise+ with these builds. Build 4.9.0 and the ASAN+Ubsan+VTV firefox in Precise, and don't worry about that 4.9.0 compile error. (Though I guess this means we can't use the gitian-utils descriptors as-is to build this compiler with the rest of the tools..).
We are not only throwing lucid but debian stable and presumably other distros as well out of the boat. So I'd rather avoid that at the moment if possible. Re: re-using descriptors: I wouldn't worry about that much currently as we need a separate hardening-branch anyway (e.g. we don't build 32bit bundles as this breaks etc.).
Ubuntu 12.04 was released before debian/stable, so that should be OK. We'd only be dropping debian/oldstable, 10.04 LTS, and CentOS 5 users, most likely. But if we can find a way to make it work on Lucid, sure.
Don't strip it, so stacktraces like the cypherpunks one in comment:16 make sense immediately without the need to make a second set of detached debug symbols for this build. This way we don't hit #12103 (closed) either, and hopefully all of the other hardening options will remain intact too.
Install all Firefox langpack locales in one build. This way we don't have to ship 12 versions of this huge build. We can provide instructions for users on how to switch their language for now, and perhaps later add a Tor Launcher or other UI option to select locale for these builds.
Hrm... I am not a fan of this idea for a couple of reasons:
0) We need to fix #12103 (closed) anyway for non-hardened builds.
Hrmm. Assuming it's as easy as using a newer binutils..
Users have to download a huge build (e.g. the debug symbols file alone is twice as big with ASan) which might deter from testing/using it.
Can we easily convert the stacktrace from http://paste.debian.net/hidden/b7b2f353/ using detached symbols? Can you post your symbols for that bug so I can take a look to see if it is possible?
We need to provide additional instructions and/or a Tor Launcher patch that both need to be maintained.
For the locale thing, I don't think this is too much of a problem compared to the cost to us otherwise. The alternative is an additional 15 40M files for each locale. It gets even more unwieldy if we decide to do ASAN builds for all other platforms, as our dist size would then be around 4GB. I think we definitely want to avoid shipping two sets of bundles for all platforms for all locales. The only way this would be feasible is if we decided to only provide ASAN builds.
In my experience, if the langpacks are installed, all you have to do is switch the general.useragent.locale pref.
(and this one is the most important to me) There might be cases where a stacktrace alone is not helpful for debugging, i.e. cases where we want things --enable-debug and --disable-optimize (and maybe others) give us.
I suspect that symbols will be enough, here. Memory issues become much easier to diagnose when you catch them at the first point of illegal access (which is what ASAN gives us).
Thoughts? I suppose an alternate way to achieve #1 might be to build a 4.8 gcc in lucid and then use that gcc to build 4.9. Not sure which would mean more build time/hassle on average.
I slightly prefer that approach to #1 if we don't find a better solution. It costs the extra build time only once, as we save the built utils (but this build time overhead can be quite a lot, as we need to compile both gccs with -j1 due to autotools not liking libfaketime). Anyway, I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61314 and maybe the gcc folks will come up with an easy fix/workaround for us.
Ok, I was able to get symbols for that stacktrace in comment:11 by removing the full path to all of the .so files, and then piping it to 'asan_symbolize.py -d' while inside the Debug/Browser directory of the detached debug symbols. asan_symbolize.py is here: https://llvm.org/svn/llvm-project/compiler-rt/trunk/lib/asan/scripts/asan_symbolize.py. In my case, it just used addr2line, since I do not have llvm-symbolizer.
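The path-stripping step could be scripted roughly as follows. This is only a sketch in Python, assuming the frames look like the ones in the traces in this ticket, i.e. "(/full/path/libxul.so+0xOFFSET)". strip_module_paths is a hypothetical helper and not part of asan_symbolize.py.

```python
import re

# Matches a module frame such as "(/path/to/libxul.so+0x38b6a28)" and
# captures the file name and the offset, dropping the directory part.
_FRAME = re.compile(r"\((?:[^()\s]*/)?([^/()\s]+)\+(0x[0-9a-fA-F]+)\)")


def strip_module_paths(trace):
    """Rewrite '(/full/path/lib.so+0xOFF)' as '(lib.so+0xOFF)' in a trace."""
    return _FRAME.sub(r"(\1+\2)", trace)


frame = ("#0 0x44edca28 (/home/ubuntu/build/tor-browser/"
         "obj-i686-pc-linux-gnu/toolkit/library/libxul.so+0x38b6a28)")
print(strip_module_paths(frame))  # prints "#0 0x44edca28 (libxul.so+0x38b6a28)"
```

The rewritten trace can then be fed to asan_symbolize.py -d from within the directory holding the detached debug symbols, as described above.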
It looks like an issue with a dangling image cache pointer. I think I was asking for trouble by claiming this would be easy to diagnose. The image cache is a nightmare. Who knows how that pointer got into that state. I wonder if the FF24.5.0ESR crash is the same stacktrace?
Install all Firefox langpack locales in one build. This way we don't have to ship 12 versions of this huge build. We can provide instructions for users on how to switch their language for now, and perhaps later add a Tor Launcher or other UI option to select locale for these builds.
Hrm... I am not a fan of this idea for a couple of reasons:
0) We need to fix #12103 (closed) anyway for non-hardened builds.
Hrmm. Assuming it's as easy as using a newer binutils..
Even if not we need to fix it somehow. :)
Users have to download a huge build (e.g. the debug symbols file alone is twice as big with ASan) which might deter from testing/using it.
Can we easily convert the stacktrace from http://paste.debian.net/hidden/b7b2f353/ using detached symbols? Can you post your symbols for that bug so I can take a look to see if it is possible?
We need to provide additional instructions and/or a Tor Launcher patch that both need to be maintained.
For the locale thing, I don't think this is too much of a problem compared to the cost to us otherwise. The alternative is an additional 15 40M files for each locale. It gets even more unwieldy if we decide to do ASAN builds for all other platforms, as our dist size would then be around 4GB. I think we definitely want to avoid shipping two sets of bundles for all platforms for all locales.
Okay, yes. That is a good point for shipping all locales in one build. But I am still not convinced that every user has to download a huge, unstripped bundle.
gk - I have three thoughts about getting this out the door quicker in the best shape possible:
Screw lucid. Let's only support x64 and Precise+ with these builds. Build 4.9.0 and the ASAN+Ubsan+VTV firefox in Precise, and don't worry about that 4.9.0 compile error. (Though I guess this means we can't use the gitian-utils descriptors as-is to build this compiler with the rest of the tools..).
We are not only throwing lucid but debian stable and presumably other distros as well out of the boat. So I'd rather avoid that at the moment if possible. Re: re-using descriptors: I wouldn't worry about that much currently as we need a separate hardening-branch anyway (e.g. we don't build 32bit bundles as this breaks etc.).
Ubuntu 12.04 was released before debian/stable, so that should be OK.
I can't run software compiled on precise on wheezy, the current debian stable. The libc is not new enough.
Thoughts? I suppose an alternate way to achieve #1 might be to build a 4.8 gcc in lucid and then use that gcc to build 4.9. Not sure which would mean more build time/hassle on average.
I slightly prefer that approach to #1 if we don't find a better solution. It costs the extra build time only once, as we save the built utils (but this build time overhead can be quite a lot, as we need to compile both gccs with -j1 due to autotools not liking libfaketime). Anyway, I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61314 and maybe the gcc folks will come up with an easy fix/workaround for us.
Turns out that neither idea helps: building GCC 4.9.0 is broken on precise as well due to #11459 (moved). We need some clever way to fix that one or work around it...
Here comes some material reflecting my failures to build bundles on Lucid with 4.9.0 so far (progress with Precise is a different comment). The short story is: Tor Browser (and FWIW plain Firefox as well) is segfaulting in the packaging step with something like:
Executing /home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/xpcshell -g /home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/ -a /home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/ -f /home/ubuntu/build/tor-browser/toolkit/mozapps/installer/precompile_cache.js -e precompile_startupcache("resource://gre/");
ASAN:SIGSEGV
=================================================================
==22869==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 sp 0x2b0f084bf678 bp 0x2b0f084bf780 T2)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 ??
Thread T2 created by T0 here:
    #0 0x2b0ec8ea572a in __interceptor_pthread_create ../../.././libsanitizer/asan/asan_interceptors.cc:183
    #1 0x2b0ef7b75269 in _PR_CreateThread /home/ubuntu/build/tor-browser/nsprpub/pr/src/pthreads/ptthread.c:444
    #2 0x2b0ef7b778ae in PR_CreateThread /home/ubuntu/build/tor-browser/nsprpub/pr/src/pthreads/ptthread.c:527
    #3 0x2b0ede4a9286 in nsThread::Init() /home/ubuntu/build/tor-browser/xpcom/threads/nsThread.cpp:332
    #4 0x2b0ee5d7d57c (/home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/libxul.so+0x1bdfe57c)
==22869==ABORTING
This is not an issue with our Gitian setup, as it happens with plain Lucid, too. Nor is it fixed by using GCC master, although that gives me a different crash:
Executing /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/xpcshell -g /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/ -a /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/ -f /home/gk/asan/mozilla-esr24/toolkit/mozapps/installer/precompile_cache.js -e precompile_startupcache("resource://gre/");
=================================================================
==22303==ERROR: AddressSanitizer: unknown-crash on address 0x2ad2d31bd3c0 at pc 0x2ad2d1803362 bp 0x7fff8f6149c0 sp 0x7fff8f6149b8
READ of size 16 at 0x2ad2d31bd3c0 thread T0
    #0 0x2ad2d1803361 in nsIDHashKey ../../dist/include/nsHashKeys.h:375
    #1 0x2ad2d1803361 in nsBaseHashtableET ../../dist/include/nsBaseHashtable.h:408
    #2 0x2ad2d1803361 in nsTHashtable<nsBaseHashtableET<nsIDHashKey, nsFactoryEntry*> >::s_InitEntry(PLDHashTable*, PLDHashEntryHdr*, void const*) ../../dist/include/nsTHashtable.h:472
    #3 0x2ad2d179ad39 in PL_DHashTableOperate /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/xpcom/build/pldhash.cpp:630
    #4 0x2ad2d1805d75 in nsTHashtable<nsBaseHashtableET<nsIDHashKey, nsFactoryEntry*> >::PutEntry(nsID const&, mozilla::fallible_t const&) ../../dist/include/nsTHashtable.h:184
    #5 0x2ad2d1805d75 in nsTHashtable<nsBaseHashtableET<nsIDHashKey, nsFactoryEntry*> >::PutEntry(nsID const&) ../../dist/include/nsTHashtable.h:170
    #6 0x2ad2d1805d75 in nsBaseHashtable<nsIDHashKey, nsFactoryEntry*, nsFactoryEntry*>::Put(nsID const&, nsFactoryEntry* const&, mozilla::fallible_t const&) ../../dist/include/nsBaseHashtable.h:147
    #7 0x2ad2d1805d75 in nsBaseHashtable<nsIDHashKey, nsFactoryEntry*, nsFactoryEntry*>::Put(nsID const&, nsFactoryEntry* const&) ../../dist/include/nsBaseHashtable.h:141
    #8 0x2ad2d1806065 in nsComponentManagerImpl::RegisterCIDEntryLocked(mozilla::Module::CIDEntry const*, nsComponentManagerImpl::KnownModule*) /home/gk/asan/mozilla-esr24/xpcom/components/nsComponentManager.cpp:502
    #9 0x2ad2d1809d35 in nsComponentManagerImpl::RegisterModule(mozilla::Module const*, mozilla::FileLocation*) /home/gk/asan/mozilla-esr24/xpcom/components/nsComponentManager.cpp:453
    #10 0x2ad2d180aba2 in nsComponentManagerImpl::Init() /home/gk/asan/mozilla-esr24/xpcom/components/nsComponentManager.cpp:389
    #11 0x2ad2d17a1fb0 in NS_InitXPCOM2 /home/gk/asan/mozilla-esr24/xpcom/build/nsXPComInit.cpp:467
    #12 0x406d4b in main /home/gk/asan/mozilla-esr24/js/xpconnect/shell/xpcshell.cpp:1566
    #13 0x2ad2d59b6c8c in __libc_start_main (/lib/libc.so.6+0x1ec8c)
    #14 0x407ea0 (/home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/xpcshell+0x407ea0)
0x2ad2d31bd3c0 is located 0 bytes inside of global variable 'kComponentManagerCID' from '/home/gk/asan/mozilla-esr24/xpcom/build/nsXPComInit.cpp' (0x2ad2d31bd3c0) of size 16
SUMMARY: AddressSanitizer: unknown-crash ../../dist/include/nsHashKeys.h:375 nsIDHashKey
Shadow bytes around the buggy address:
  0x055ada62fa20: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa30: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa40: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa50: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa60: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
=>0x055ada62fa70: 00 00 f9 f9 f9 f9 f9 f9[00]00 f9 f9 f9 f9 f9 f9
  0x055ada62fa80: 07 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 04 f9 f9 f9
  0x055ada62fa90: f9 f9 f9 f9 00 02 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x055ada62faa0: 05 f9 f9 f9 f9 f9 f9 f9 06 f9 f9 f9 f9 f9 f9 f9
  0x055ada62fab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x055ada62fac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable: 00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone: fa
  Heap right redzone: fb
  Freed heap region: fd
  Stack left redzone: f1
  Stack mid redzone: f2
  Stack right redzone: f3
  Stack partial redzone: f4
  Stack after return: f5
  Stack use after scope: f8
  Global redzone: f9
  Global init order: f6
  Poisoned by user: f7
  Container overflow: fc
  ASan internal: fe
==22303==ABORTING
Not sure if that is after or even before the crash described above. The packaging step issues happen neither with GCC 4.8.2 for x86-64 on Lucid nor with GCC 4.9.0 on Precise.
Looking closer, neither setting the HOST_* flags to '" "' nor using gold as the linker fixes the problem. Comparing the build logs of a Precise and a Lucid build does not give any clue either. After examining the code, it seemed that the last rev that probably works for us is 482026b63e8a488d6b7f0eab53fcbfe12c3309ae (although that one is broken due to https://gcc.gnu.org/bugzilla/show_bug.cgi?format=multiple&id=58868). That guess actually turned out to be wrong. After some more bisecting I found the culprit: 4fc7b5acfc1d42a0701c8fff726a3ebe7f563dd9. I've filed a GCC bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61408) and hope that the GCC folks can help us here one way or another. Meanwhile, I am writing a patch that fixes this problem for us.
Now the work with Precise: Tor Browser builds fine there with GCC 4.9.0 and ASan, UBSan and VTV. The other fancy options mentioned on https://developer.mozilla.org/en-US/docs/Building_SpiderMonkey_with_UBSan
are not supported in 4.9.0 yet, but some of them have already landed on GCC trunk. Anyway, the build crashes on start-up with the attached log. I have not had time to look into that yet. So my short-term plan is to get hardened bundles built on Lucid out with ASan and UBSan while debugging the VTV issue and the one in comment:16. If someone is interested in debugging the VTV issue, let me know and I'll upload the bundle + debug symbols (ca. 500MiB).
After patching GCC (see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61408 for the gory details) I was finally able to build TBBs with ASan and UBSan. They can be found on https://people.torproject.org/~gk/testbuilds/asan/20140617/ and are even fixing #12199 (closed). The branch I used for building is hardening_asan_ng_linux_x86-64 in my public repo. In order to be able to build it you need to have a tor-browser tag "asan-ng" which could be based on the branch the current nightly uses + the .mozconfig-asan attached (see 2) in comment:15 for instructions on how to handle it).