Opened 5 years ago

Closed 3 years ago

#10599 closed enhancement (fixed)

Investigate building TBB with SoftBound or AddressSanitizer

Reported by: mikeperry Owned by: gk
Priority: Very High Milestone:
Component: Applications/Tor Browser Version:
Severity: Normal Keywords: gitian, tbb-security, tbb-gitian, tbb-hardening, TorBrowserTeam201511R, GeorgKoppen201511
Cc: gk, intrigeri@…, mcs, brade, tom@…, arma, isis, nicoo, boklm Actual Points:
Parent ID: #17304 Points:
Reviewer: Sponsor: SponsorU

Description (last modified by mikeperry)

We should see if we can get TBB to build with SoftBound+CETS, a memory-safety extension to LLVM: http://acg.cis.upenn.edu/softbound/

Apparently to get full benefit we may need to annotate the Mozilla allocator, but we should be able to make a test build without that annotation (it will just treat the entire malloc pool as one allocation).

SAFECode is apparently an extension to SoftBound, but it has only been rebased to LLVM 3.2 (whereas SoftBound has been kept up to date to LLVM 3.4): http://safecode.cs.illinois.edu/

Other resources:

Child Tickets

Ticket  Type    Status  Owner  Summary
#12199  defect  closed  erinn  Hardened bundles (with ASan) crash on tweakers.net
#12419  defect  closed  erinn  TBBs with ASan create alloc_dealloc_mismatch warnings

Attachments (12)

bug1 (4.1 KB) - added by gk 5 years ago.
bug1 -- mfbt
mozconfig (929 bytes) - added by gk 5 years ago.
.mozconfig-asan
asan.log (4.2 KB) - added by mikeperry 5 years ago.
Reported crash with symbols.
tweakers24 (4.3 KB) - added by gk 5 years ago.
tweakers.net on vanilla ESR 24.5.0
mozconfig-ng (1.1 KB) - added by gk 5 years ago.
.mozconfig-asan for additional UBSan and VTV hardening
www.google.com.asan.log (1.2 KB) - added by FireballDWF 5 years ago.
asan log from simple start browser then goto https://www.google.com
precise_vtv_crash.log (14.8 KB) - added by gk 5 years ago.
VTV crash on Ubuntu Precise
mozconfig.2 (1.3 KB) - added by gk 5 years ago.
.mozconfig-asan for ASan- and UBSan-enabled TBBs
decimal_error_cpp.bz2 (408.7 KB) - added by gk 5 years ago.
bug2_1 -- Decimal.cpp
decimal_error_sh (1.0 KB) - added by gk 5 years ago.
bug2_2 -- Decimal.cpp
libfaketime-asan.patch (958 bytes) - added by gk 4 years ago.
0001-Basically-reverting-LLVM-r193602.patch (4.7 KB) - added by gk 4 years ago.
fix packaging crash by making GCC patch work with 5.2.0


Change History (79)

comment:1 Changed 5 years ago by mikeperry

Description: modified (diff)

comment:2 Changed 5 years ago by intrigeri

Cc: intrigeri@… added

comment:3 Changed 5 years ago by mcs

Cc: mcs brade added

comment:4 Changed 5 years ago by gk

To get things started: compiling SoftBound+CETS by following https://github.com/santoshn/softboundcets-34/ worked, with one caveat: I needed to add

-lrt

to get the test program compiled.

Then in order to compile Tor Browser I added some .mozconfig lines:

export CC="clang -fsoftboundcets"
export CXX="clang -fsoftboundcets"
export LDFLAGS="-L/home/gk/softbound/softboundcets-34/softboundcets-lib -lm -lrt"

Building Tor Browser now produces, quite early on, a bunch of "This case not handled, requesting memory from system Softboundcets: Memory safety violation detected" errors. (See attachment 1).

Changed 5 years ago by gk

Attachment: bug1 added

bug1 -- mfbt

comment:5 Changed 5 years ago by mikeperry

Summary: changed from "Investigate building TBB with SoftBound" to "Investigate building TBB with SoftBound or AddressSanitizer"

It seems like most of the SoftBound+CETS functionality has actually been folded into a GCC+Clang project called 'AddressSanitizer': https://code.google.com/p/address-sanitizer/wiki/AddressSanitizer

GCC 4.8+ and Clang 3.1+ support this out of the box with -fsanitize=address. It may be a while before our cross-compilers pick this up, but we could build a special "TBB-Hardened" release for Linux-only, as an alpha perhaps?
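
A build script could gate the hardened build on the compiler versions named above. This is only a minimal sketch: the minimum versions are the ones claimed in this comment, and the function name `supports_asan` is made up for illustration.

```python
# Minimum versions as stated in the comment above (GCC 4.8+, Clang 3.1+).
# They are that comment's claim, not something this sketch verifies.
MIN_ASAN_VERSION = {"gcc": (4, 8), "clang": (3, 1)}


def supports_asan(compiler, version):
    """Return True if `compiler` at `version` (e.g. "4.9.0") meets the minimum."""
    minimum = MIN_ASAN_VERSION.get(compiler.lower())
    if minimum is None:
        # Unknown compiler: be conservative and report no support.
        return False
    # Compare only major.minor, as integer tuples.
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor >= minimum
```

With this, `supports_asan("gcc", "4.9.0")` would let the hardened build proceed, while an older GCC such as 4.7 would be rejected before any compile time is spent.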

comment:6 Changed 5 years ago by nickm

As of Tor 0.2.5.4-alpha, we're turning on AddressSanitizer if you build Tor with --enable-expensive-hardening. The option also enables UBSan.

comment:7 Changed 5 years ago by mikeperry

Here are Mozilla's build instructions: https://developer.mozilla.org/en-US/docs/Mozilla/Testing/Firefox_and_Address_Sanitizer. They seem to recommend LLVM/Clang from SVN (possibly because that page is outdated), though I wonder if GCC 4.8 will be more straightforward.

comment:8 Changed 5 years ago by nickm

I found that the Clang AddressSanitizer was a little more full-featured than the GCC 4.8 one: you can detect it at runtime, and you can redirect its output. I couldn't figure out how to make the GCC AddressSanitizer do that. But if the GCC 4.8 or GCC 4.9 AddressSanitizer works fine for you, then it ought to be fine.

comment:9 Changed 5 years ago by gk

Step one and two (building a more recent GCC and a hardened tor) are done in the hardening_tor_asan branch in my public tor-browser-bundle repo.

comment:10 Changed 5 years ago by tom

Cc: tom@… added

comment:11 Changed 5 years ago by gk

We are pretty close I guess. After resolving issues with the linker (I got

/usr/bin/ld.bfd.real: js: hidden symbol `__asan_default_options' in ../libjs_static.a(AsmJSSignalHandlers.o) is referenced by DSO
/usr/bin/ld.bfd.real: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status

with the ld that Lucid ships and with a self-compiled one using binutils 2.22) by using a binutils > 2.22, everything compiles and links properly, it seems. However, the packaging step breaks with

=================================================================
==21490== ERROR: AddressSanitizer: stack-buffer-overflow on address 0xbfb0fe5c at pc 0x44edca29 bp 0xbfb0fdf4 sp 0xbfb0fde8
WRITE of size 4 at 0xbfb0fe5c thread T0
    #0 0x44edca28 (/home/ubuntu/build/tor-browser/obj-i686-pc-linux-gnu/toolkit/library/libxul.so+0x38b6a28)
    #1 0x489325b7 (/lib/tls/i686/cmov/libc-2.11.1.so+0x2f5b7)
ASAN:SIGSEGV
==21490== AddressSanitizer: while reporting a bug found another one.Ignoring.
Traceback (most recent call last):
  File "/home/ubuntu/build/tor-browser/toolkit/mozapps/installer/packager.py", line 375, in <module>
    main()
  File "/home/ubuntu/build/tor-browser/toolkit/mozapps/installer/packager.py", line 367, in main
    args.source, gre_path, base)
  File "/home/ubuntu/build/tor-browser/toolkit/mozapps/installer/packager.py", line 148, in precompile_cache
    errors.fatal('Error while running startup cache precompilation')
  File "/home/ubuntu/build/tor-browser/python/mozbuild/mozpack/errors.py", line 101, in fatal
    self._handle(self.FATAL, msg)
  File "/home/ubuntu/build/tor-browser/python/mozbuild/mozpack/errors.py", line 96, in _handle
    raise ErrorMessage(msg)
mozpack.errors.ErrorMessage: Error: Error while running startup cache precompilation
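
When triaging reports like the one above, it can help to pull the error type and faulting address out of the header line automatically. The following is a minimal sketch, assuming the header format seen in this ticket (`==PID== ERROR: AddressSanitizer: KIND on address ADDR ...`, with or without a space before `ERROR`). The name `parse_asan_header` is hypothetical and not part of any ASan tooling.

```python
import re

# Matches headers such as
#   ==21490== ERROR: AddressSanitizer: stack-buffer-overflow on address 0xbfb0fe5c at pc ...
#   ==22869==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc ...
_ASAN_HEADER = re.compile(
    r"^==(?P<pid>\d+)==\s*ERROR: AddressSanitizer: (?P<kind>\S+)"
    r" on (?:unknown )?address (?P<addr>0x[0-9a-fA-F]+)"
)


def parse_asan_header(line):
    """Return pid, error kind and address from an ASan header line, or None."""
    match = _ASAN_HEADER.match(line.strip())
    if match is None:
        return None
    return {
        "pid": int(match.group("pid")),
        "kind": match.group("kind"),
        "address": int(match.group("addr"), 16),
    }
```

Lines that are not headers (stack frames, `ASAN:SIGSEGV`, the Python traceback) simply yield `None`, so the function can be applied to every line of a build log.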

comment:12 Changed 5 years ago by mikeperry

Here's a link describing how to build the JS engine with UBSan (-fsanitize=undefined): https://developer.mozilla.org/en-US/docs/Building_SpiderMonkey_with_UBSan

It also describes how to get a proper gdb stack trace out of UBSan. There may be a similar way to do this with ASan, to get a better stack trace for the packaging crash in comment 11?

comment:13 Changed 5 years ago by mikeperry

gk: random idea: What if we told Firefox that the ASAN compiler was a cross compiler? Then the host gcc should build that libxul library, and use non-ASAN hardened tools in the packaging step, and this should avoid the crash during packaging?

comment:14 in reply to:  13 Changed 5 years ago by gk

Replying to mikeperry:

gk: random idea: What if we told Firefox that the ASAN compiler was a cross compiler? Then the host gcc should build that libxul library, and use non-ASAN hardened tools in the packaging step, and this should avoid the crash during packaging?

That might be a smart idea. There is actually a section in the example .mozconfig that I omitted which might help us here:

# Avoid using ASan flags when building host tools like nsinstall
export HOST_CFLAGS=" "
export HOST_CXXFLAGS=" "
export HOST_LDFLAGS=" "

I'll test that on one of my machines. As I said, the problem no longer exists in Fx 29 with my current setup. So, I am bisecting on my other machine in the meantime to find something useful (I guess this is less time-consuming than examining stack traces of an already solved problem, although it is quite tempting to take that road).

comment:15 Changed 5 years ago by gk

Okay. It turned out that my analysis was not correct. The crash in comment 11 happens only for i386 builds for reasons yet to be investigated. 64 bit builds are not affected. I uploaded a bundle to https://people.torproject.org/~gk/testbuilds/asan/20140521/
Doing a

export ASAN_OPTIONS=alloc_dealloc_mismatch=0

might help while testing. The branch hardening_asan_linux_x86-64 in my public tor-browser-bundle repo corresponds to the build; it is basically what I used to create the test bundle. Two things are needed to somewhat reproduce my work:

1) The standard Gitian VM is not big enough. One has to raise the value of the --rootsize flag in gitian-builder's make-base-vm script.
2) One needs the custom .mozconfig-asan file which is attached (it seems I can't easily upload files starting with a ".", so I renamed it to "mozconfig", but the build scripts in hardening_asan_linux_x86-64 expect a .mozconfig-asan). Mike: could you add that one (as .mozconfig-asan) to the tor-browser repo?
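
For testers who start the bundle from a script, the ASAN_OPTIONS setting mentioned above can be merged into an existing environment instead of overwriting it. This is a rough sketch only: it assumes the colon-separated `key=value` form of ASAN_OPTIONS and values that contain no colons themselves, and the helper name `with_asan_options` is invented for illustration.

```python
def with_asan_options(env, **options):
    """Return a copy of `env` with `options` merged into ASAN_OPTIONS."""
    merged = {}
    # Keep whatever options are already set (colon-separated key=value pairs).
    for item in env.get("ASAN_OPTIONS", "").split(":"):
        if "=" in item:
            key, value = item.split("=", 1)
            merged[key] = value
    # New options override existing ones with the same name.
    for key, value in options.items():
        merged[key] = str(value)
    new_env = dict(env)
    new_env["ASAN_OPTIONS"] = ":".join(
        "{}={}".format(key, value) for key, value in merged.items()
    )
    return new_env
```

For example, `with_asan_options(os.environ, alloc_dealloc_mismatch=0)` returns an environment that can be handed to `subprocess.run(..., env=...)` when launching the test bundle.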

Changed 5 years ago by gk

Attachment: mozconfig added

.mozconfig-asan

comment:16 Changed 5 years ago by cypherpunks

By starting https://people.torproject.org/~gk/testbuilds/asan/20140521/ on Debian wheezy x86_64 I get the following trace when browsing to tweakers.net: http://paste.debian.net/hidden/b7b2f353/

comment:17 Changed 5 years ago by gk

Quoting Christian Holler from https://bugzilla.mozilla.org/show_bug.cgi?id=1013341:

Especially for testing I can highly recommend Clang. ASan isn't the only thing
available there, you also get TSan, UBSan, LSan and some other checkers that
GCC lacks. Not all of these are usable on Firefox, since our codebase has quite
a few races and undefined behavior, but smaller programs can be tested quite
well.

comment:18 in reply to:  16 ; Changed 5 years ago by gk

Replying to cypherpunks:

By starting https://people.torproject.org/~gk/testbuilds/asan/20140521/ on Debian wheezy x86_64 I get the following trace when browsing to tweakers.net: http://paste.debian.net/hidden/b7b2f353/

Happens here, too, thanks. Might be the first thing to look closer at.

comment:19 in reply to:  18 Changed 5 years ago by gk

Replying to gk:

Replying to cypherpunks:

By starting https://people.torproject.org/~gk/testbuilds/asan/20140521/ on Debian wheezy x86_64 I get the following trace when browsing to tweakers.net: http://paste.debian.net/hidden/b7b2f353/

Happens here, too, thanks. Might be the first thing to look closer at.

FWIW: this happens with a vanilla ESR 24.5.0 as well but not with a recent ASan hardened Firefox nightly. Might be a real issue, might be a GCC + ASan issue on our side, might be...

comment:20 Changed 5 years ago by gk

A more general update: a) I might indeed have been right with comment:11 as I can compile an i386 ASan hardened Firefox 29 fine. Thus, I am back bisecting.

b) Then I tried to get GCC 4.9.0 to compile in order to be able to make use of UBSan and the other tools that made Clang superior but it failed on Lucid with:

/bin/bash ./libtool --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I../gcc-4.9.0/libbacktrace  -I ../gcc-4.9.0/libbacktrace/../include -I ../gcc-4.9.0/libbacktrace/../libgcc -I ../libgcc  -funwind-tables -frandom-seed=dwarf.lo -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -Wcast-qual  -g -O2 -c -o dwarf.lo ../gcc-4.9.0/libbacktrace/dwarf.c
libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I../gcc-4.9.0/libbacktrace -I ../gcc-4.9.0/libbacktrace/../include -I ../gcc-4.9.0/libbacktrace/../libgcc -I ../libgcc -funwind-tables -frandom-seed=dwarf.lo -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -Wmissing-format-attribute -Wcast-qual -g -O2 -c ../gcc-4.9.0/libbacktrace/dwarf.c -o dwarf.o
../gcc-4.9.0/libbacktrace/dwarf.c: In function 'dwarf_lookup_pc':
../gcc-4.9.0/libbacktrace/dwarf.c:2678: warning: implicit declaration of function '__atomic_load_n'
../gcc-4.9.0/libbacktrace/dwarf.c:2678: error: '__ATOMIC_ACQUIRE' undeclared (first use in this function)
../gcc-4.9.0/libbacktrace/dwarf.c:2678: error: (Each undeclared identifier is reported only once
../gcc-4.9.0/libbacktrace/dwarf.c:2678: error: for each function it appears in.)
../gcc-4.9.0/libbacktrace/dwarf.c:2738: warning: implicit declaration of function '__atomic_store_n'
../gcc-4.9.0/libbacktrace/dwarf.c:2738: error: '__ATOMIC_RELEASE' undeclared (first use in this function)
../gcc-4.9.0/libbacktrace/dwarf.c: In function 'dwarf_fileline':
../gcc-4.9.0/libbacktrace/dwarf.c:2873: error: '__ATOMIC_ACQUIRE' undeclared (first use in this function)
../gcc-4.9.0/libbacktrace/dwarf.c: In function 'backtrace_dwarf_add':
../gcc-4.9.0/libbacktrace/dwarf.c:3006: error: '__ATOMIC_ACQUIRE' undeclared (first use in this function)
make[2]: Leaving directory `/home/ubuntu/build/gcc/libbacktrace'
make[2]: *** [dwarf.lo] Error 1

It could be our Gitian setup that is the culprit here, though. Anyway, using Precise outside of Gitian compiles GCC 4.9.0 fine.

comment:21 Changed 5 years ago by mikeperry

gk - I have three thoughts about getting this out the door quicker in the best shape possible:

  1. Screw lucid. Let's only support x64 and Precise+ with these builds. Build 4.9.0 and the ASAN+Ubsan+VTV firefox in Precise, and don't worry about that 4.9.0 compile error. (Though I guess this means we can't use the gitian-utils descriptors as-is to build this compiler with the rest of the tools..).
  2. Don't strip it, so stacktraces like the cypherpunks one in comment:16 make sense immediately without the need to make a second set of detached debug symbols for this build. This way we don't hit #12103 either, and hopefully all of the other hardening options will remain intact too.
  3. Install all Firefox langpack locales in one build. This way we don't have to ship 12 versions of this huge build. We can provide instructions for users on how to switch their language for now, and perhaps later add a Tor Launcher or other UI option to select locale for these builds.

Thoughts? I suppose an alternate way to achieve #1 might be to build a 4.8 gcc in lucid and then use that gcc to build 4.9. Not sure which would mean more build time/hassle on average.

comment:22 in reply to:  21 ; Changed 5 years ago by gk

Replying to mikeperry:

gk - I have three thoughts about getting this out the door quicker in the best shape possible:

  1. Screw lucid. Let's only support x64 and Precise+ with these builds. Build 4.9.0 and the ASAN+Ubsan+VTV firefox in Precise, and don't worry about that 4.9.0 compile error. (Though I guess this means we can't use the gitian-utils descriptors as-is to build this compiler with the rest of the tools..).

We are not only throwing lucid but debian stable and presumably other distros as well out of the boat. So I'd rather avoid that at the moment if possible. Re: re-using descriptors: I wouldn't worry about that much currently as we need a separate hardening-branch anyway (e.g. we don't build 32bit bundles as this breaks etc.).

  2. Don't strip it, so stacktraces like the cypherpunks one in comment:16 make sense immediately without the need to make a second set of detached debug symbols for this build. This way we don't hit #12103 either, and hopefully all of the other hardening options will remain intact too.
  3. Install all Firefox langpack locales in one build. This way we don't have to ship 12 versions of this huge build. We can provide instructions for users on how to switch their language for now, and perhaps later add a Tor Launcher or other UI option to select locale for these builds.

Hrm... I am not a fan of this idea for a couple of reasons:
0) We need to fix #12103 anyway for non-hardened builds.
1) Users have to download a huge build (e.g. the debug symbols file alone is twice as big with ASan) which might deter from testing/using it.
2) We need to provide additional instructions and/or a Tor Launcher patch that both need to be maintained.
3) (and this one is the most important to me) There might be cases where a stacktrace alone is not helpful for debugging, i.e. cases where we want things --enable-debug and --disable-optimize (and maybe others) give us.

Thoughts? I suppose an alternate way to achieve #1 might be to build a 4.8 gcc in lucid and then use that gcc to build 4.9. Not sure which would mean more build time/hassle on average.

I slightly prefer that approach to #1 if we don't find a better solution. It needs *once* more build time as we save the built utils (but this build time overhead can be quite a lot as we need to compile both gccs with -j1 due to autotools not liking libfaketime). Anyway, I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61314 and maybe the gcc folks are coming up with an easy fix/workaround for us.

comment:23 in reply to:  22 Changed 5 years ago by gk

Replying to gk:

Replying to mikeperry:

Thoughts? I suppose an alternate way to achieve #1 might be to build a 4.8 gcc in lucid and then use that gcc to build 4.9. Not sure which would mean more build time/hassle on average.

I slightly prefer that approach to #1 if we don't find a better solution. It needs *once* more build time as we save the built utils (but this build time overhead can be quite a lot as we need to compile both gccs with -j1 due to autotools not liking libfaketime). Anyway, I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61314 and maybe the gcc folks are coming up with an easy fix/workaround for us.

Hmm... looking at the log again I just recognized that we are failing in the "make install" step. The failure probably happens due to libfaketime issues although I still don't see why re-compiling parts of libbacktrace in this step should lead to the error I encounter. Note that I compiled gcc 4.9.0 on precise just for testing purposes. We might actually run into the very same issue once we switch to it under the libfaketime rule. Maybe the gcc people have a good idea.

Another option would be to avoid using libfaketime for building GCC 4.9.0 (let's suppose that is the real problem here; I have to test that) as we are currently not checking whether the utils are built deterministically at all.

comment:24 in reply to:  22 ; Changed 5 years ago by mikeperry

Replying to gk:

Replying to mikeperry:

gk - I have three thoughts about getting this out the door quicker in the best shape possible:

  1. Screw lucid. Let's only support x64 and Precise+ with these builds. Build 4.9.0 and the ASAN+Ubsan+VTV firefox in Precise, and don't worry about that 4.9.0 compile error. (Though I guess this means we can't use the gitian-utils descriptors as-is to build this compiler with the rest of the tools..).

We are not only throwing lucid but debian stable and presumably other distros as well out of the boat. So I'd rather avoid that at the moment if possible. Re: re-using descriptors: I wouldn't worry about that much currently as we need a separate hardening-branch anyway (e.g. we don't build 32bit bundles as this breaks etc.).

Ubuntu 12.04 was released before debian/stable, so that should be OK. We'd only be dropping debian/oldstable, 10.04 LTS, and Centos 5 users, most likely. But if we can find a way to make it work on Lucid, sure.

  2. Don't strip it, so stacktraces like the cypherpunks one in comment:16 make sense immediately without the need to make a second set of detached debug symbols for this build. This way we don't hit #12103 either, and hopefully all of the other hardening options will remain intact too.
  3. Install all Firefox langpack locales in one build. This way we don't have to ship 12 versions of this huge build. We can provide instructions for users on how to switch their language for now, and perhaps later add a Tor Launcher or other UI option to select locale for these builds.

Hrm... I am not a fan of this idea for a couple of reasons:
0) We need to fix #12103 anyway for non-hardened builds.

Hrmm. Assuming it's as easy as using a newer binutils..

1) Users have to download a huge build (e.g. the debug symbols file alone is twice as big with ASan) which might deter from testing/using it.

Can we easily convert the stacktrace from http://paste.debian.net/hidden/b7b2f353/ using detached symbols? Can you post your symbols for that bug so I can take a look to see if it is possible?

2) We need to provide additional instructions and/or a Tor Launcher patch that both need to be maintained.

For the locale thing, I don't think this is too much of a problem compared to the cost to us otherwise. The alternative is an additional 15 40M files for each locale. It gets even more unwieldy if we decide to do ASAN builds for all other platforms, as our dist size would then be around 4GB. I think we definitely want to avoid shipping two sets of bundles for all platforms for all locales. The only way this would be feasible is if we decided to only provide ASAN builds.

In my experience, if the langpacks are installed, all you have to do is switch the general.useragent.locale pref.

3) (and this one is the most important to me) There might be cases where a stacktrace alone is not helpful for debugging, i.e. cases where we want things --enable-debug and --disable-optimize (and maybe others) give us.

I suspect that symbols will be enough, here. Memory issues become much easier to diagnose when you catch them at the first point of illegal access (which is what ASAN gives us).

Thoughts? I suppose an alternate way to achieve #1 might be to build a 4.8 gcc in lucid and then use that gcc to build 4.9. Not sure which would mean more build time/hassle on average.

I slightly prefer that approach to #1 if we don't find a better solution. It needs *once* more build time as we save the built utils (but this build time overhead can be quite a lot as we need to compile both gccs with -j1 due to autotools not liking libfaketime). Anyway, I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61314 and maybe the gcc folks are coming up with an easy fix/workaround for us.

Ok.

Changed 5 years ago by mikeperry

Attachment: asan.log added

Reported crash with symbols.

comment:25 Changed 5 years ago by mikeperry

Ok, I was able to get symbols for that stacktrace in comment:11 by removing the full path to all of the .so files, and then piping it to 'asan_symbolize.py -d' while inside the Debug/Browser directory of the detached debug symbols. asan_symbolize.py is here: https://llvm.org/svn/llvm-project/compiler-rt/trunk/lib/asan/scripts/asan_symbolize.py. In my case, it just used addr2line, since I do not have llvm-symbolize.
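
The path-stripping step described above can be scripted. This is a minimal sketch, assuming unsymbolized frames of the form `#N 0xADDR (/full/path/lib.so+0xOFFSET)` as seen in this ticket; `strip_module_paths` is a hypothetical helper, not part of asan_symbolize.py.

```python
import re

# An unsymbolized ASan frame ends in "(/path/to/module+0xOFFSET)".
# Capture the directory part lazily so that `name` is the last path component.
_FRAME_MODULE = re.compile(
    r"\((?P<dir>/[^()\s]*?)(?P<name>[^/()\s]+)\+(?P<offset>0x[0-9a-fA-F]+)\)"
)


def strip_module_paths(line):
    """Rewrite "(/a/b/libxul.so+0x1234)" to "(libxul.so+0x1234)" in `line`."""
    return _FRAME_MODULE.sub(
        lambda m: "({}+{})".format(m.group("name"), m.group("offset")), line
    )
```

Applying this to each line of the report and feeding the result to 'asan_symbolize.py -d' from inside the directory holding the detached debug symbols should correspond to the manual procedure in this comment; frames that are already symbolized pass through unchanged.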

It looks like an issue with a dangling image cache pointer. I think I was asking for trouble by claiming this would be easy to diagnose. The image cache is a nightmare. Who knows how that pointer got into that state. I wonder if the FF24.5.0ESR crash is the same stacktrace?

comment:26 in reply to:  25 Changed 5 years ago by gk

Replying to mikeperry:

I wonder if the FF24.5.0ESR crash is the same stacktrace?

Basically, yes. It seems a bit more verbose but it is the same issue, I think. See my attachment.

Changed 5 years ago by gk

Attachment: tweakers24 added

tweakers.net on vanilla ESR 24.5.0

comment:27 in reply to:  24 Changed 5 years ago by gk

Replying to mikeperry:

Replying to gk:

Replying to mikeperry:

  3. Install all Firefox langpack locales in one build. This way we don't have to ship 12 versions of this huge build. We can provide instructions for users on how to switch their language for now, and perhaps later add a Tor Launcher or other UI option to select locale for these builds.

Hrm... I am not a fan of this idea for a couple of reasons:
0) We need to fix #12103 anyway for non-hardened builds.

Hrmm. Assuming it's as easy as using a newer binutils..

Even if not we need to fix it somehow. :)

1) Users have to download a huge build (e.g. the debug symbols file alone is twice as big with ASan) which might deter from testing/using it.

Can we easily convert the stacktrace from http://paste.debian.net/hidden/b7b2f353/ using detached symbols? Can you post your symbols for that bug so I can take a look to see if it is possible?

The symbols are in https://people.torproject.org/~gk/testbuilds/asan/20140521/ as well, now.

2) We need to provide additional instructions and/or a Tor Launcher patch that both need to be maintained.

For the locale thing, I don't think this is too much of a problem compared to the cost to us otherwise. The alternative is an additional 15 40M files for each locale. It gets even more unwieldy if we decide to do ASAN builds for all other platforms, as our dist size would then be around 4GB. I think we definitely want to avoid shipping two sets of bundles for all platforms for all locales.

Okay, yes. That is a good point for shipping all locales in one build. But I am still not convinced that every user has to download a huge, unstripped bundle.

comment:28 in reply to:  24 Changed 5 years ago by gk

Replying to mikeperry:

Replying to gk:

Replying to mikeperry:

gk - I have three thoughts about getting this out the door quicker in the best shape possible:

  1. Screw lucid. Let's only support x64 and Precise+ with these builds. Build 4.9.0 and the ASAN+Ubsan+VTV firefox in Precise, and don't worry about that 4.9.0 compile error. (Though I guess this means we can't use the gitian-utils descriptors as-is to build this compiler with the rest of the tools..).

We are not only throwing lucid but debian stable and presumably other distros as well out of the boat. So I'd rather avoid that at the moment if possible. Re: re-using descriptors: I wouldn't worry about that much currently as we need a separate hardening-branch anyway (e.g. we don't build 32bit bundles as this breaks etc.).

Ubuntu 12.04 was released before debian/stable, so that should be OK.

I can't run software compiled on precise on wheezy, the current debian stable. The libc is not new enough.

comment:29 in reply to:  22 Changed 5 years ago by gk

Replying to gk:

Replying to mikeperry:

Thoughts? I suppose an alternate way to achieve #1 might be to build a 4.8 gcc in lucid and then use that gcc to build 4.9. Not sure which would mean more build time/hassle on average.

I slightly prefer that approach to #1 if we don't find a better solution. It needs *once* more build time as we save the built utils (but this build time overhead can be quite a lot as we need to compile both gccs with -j1 due to autotools not liking libfaketime). Anyway, I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61314 and maybe the gcc folks are coming up with an easy fix/workaround for us.

It turns out that neither of the two ideas helps: building GCC 4.9.0 is broken on Precise as well due to #11459. We need some clever way to fix that one or work around it...

Changed 5 years ago by gk

Attachment: mozconfig-ng added

.mozconfig-asan for additional UBSan and VTV hardening

Changed 5 years ago by FireballDWF

Attachment: www.google.com.asan.log added

asan log from simple start browser then goto https://www.google.com

comment:30 Changed 5 years ago by gk

FireballDWF: thanks for testing. See the HINT line in your ASan log. That should give you the opportunity to test the ASan builds a bit further.

comment:31 Changed 5 years ago by gk

Here comes some material reflecting my failures to build bundles on Lucid with 4.9.0 so far (progress with Precise is a different comment). The short story is: Tor Browser (and FWIW plain Firefox as well) is segfaulting in the packaging step with something like:

Executing /home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/xpcshell -g /home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/ -a /home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/ -f /home/ubuntu/build/tor-browser/toolkit/mozapps/installer/precompile_cache.js -e precompile_startupcache("resource://gre/");
ASAN:SIGSEGV
=================================================================
==22869==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 sp 0x2b0f084bf678 bp 0x2b0f084bf780 T2)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 ??
Thread T2 created by T0 here:
    #0 0x2b0ec8ea572a in __interceptor_pthread_create ../../.././libsanitizer/asan/asan_interceptors.cc:183
    #1 0x2b0ef7b75269 in _PR_CreateThread /home/ubuntu/build/tor-browser/nsprpub/pr/src/pthreads/ptthread.c:444
    #2 0x2b0ef7b778ae in PR_CreateThread /home/ubuntu/build/tor-browser/nsprpub/pr/src/pthreads/ptthread.c:527
    #3 0x2b0ede4a9286 in nsThread::Init() /home/ubuntu/build/tor-browser/xpcom/threads/nsThread.cpp:332
    #4 0x2b0ee5d7d57c (/home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/libxul.so+0x1bdfe57c)

==22869==ABORTING

This is not an issue with our Gitian setup as it happens with plain Lucid, too. Nor is it fixed by using GCC master, although that gives me a different crash:

Executing /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/xpcshell -g /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/ -a /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/ -f /home/gk/asan/mozilla-esr24/toolkit/mozapps/installer/precompile_cache.js -e precompile_startupcache("resource://gre/");
=================================================================
==22303==ERROR: AddressSanitizer: unknown-crash on address 0x2ad2d31bd3c0 at pc 0x2ad2d1803362 bp 0x7fff8f6149c0 sp 0x7fff8f6149b8
READ of size 16 at 0x2ad2d31bd3c0 thread T0
    #0 0x2ad2d1803361 in nsIDHashKey ../../dist/include/nsHashKeys.h:375
    #1 0x2ad2d1803361 in nsBaseHashtableET ../../dist/include/nsBaseHashtable.h:408
    #2 0x2ad2d1803361 in nsTHashtable<nsBaseHashtableET<nsIDHashKey, nsFactoryEntry*> >::s_InitEntry(PLDHashTable*, PLDHashEntryHdr*, void const*) ../../dist/include/nsTHashtable.h:472
    #3 0x2ad2d179ad39 in PL_DHashTableOperate /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/xpcom/build/pldhash.cpp:630
    #4 0x2ad2d1805d75 in nsTHashtable<nsBaseHashtableET<nsIDHashKey, nsFactoryEntry*> >::PutEntry(nsID const&, mozilla::fallible_t const&) ../../dist/include/nsTHashtable.h:184
    #5 0x2ad2d1805d75 in nsTHashtable<nsBaseHashtableET<nsIDHashKey, nsFactoryEntry*> >::PutEntry(nsID const&) ../../dist/include/nsTHashtable.h:170
    #6 0x2ad2d1805d75 in nsBaseHashtable<nsIDHashKey, nsFactoryEntry*, nsFactoryEntry*>::Put(nsID const&, nsFactoryEntry* const&, mozilla::fallible_t const&) ../../dist/include/nsBaseHashtable.h:147
    #7 0x2ad2d1805d75 in nsBaseHashtable<nsIDHashKey, nsFactoryEntry*, nsFactoryEntry*>::Put(nsID const&, nsFactoryEntry* const&) ../../dist/include/nsBaseHashtable.h:141
    #8 0x2ad2d1806065 in nsComponentManagerImpl::RegisterCIDEntryLocked(mozilla::Module::CIDEntry const*, nsComponentManagerImpl::KnownModule*) /home/gk/asan/mozilla-esr24/xpcom/components/nsComponentManager.cpp:502
    #9 0x2ad2d1809d35 in nsComponentManagerImpl::RegisterModule(mozilla::Module const*, mozilla::FileLocation*) /home/gk/asan/mozilla-esr24/xpcom/components/nsComponentManager.cpp:453
    #10 0x2ad2d180aba2 in nsComponentManagerImpl::Init() /home/gk/asan/mozilla-esr24/xpcom/components/nsComponentManager.cpp:389
    #11 0x2ad2d17a1fb0 in NS_InitXPCOM2 /home/gk/asan/mozilla-esr24/xpcom/build/nsXPComInit.cpp:467
    #12 0x406d4b in main /home/gk/asan/mozilla-esr24/js/xpconnect/shell/xpcshell.cpp:1566
    #13 0x2ad2d59b6c8c in __libc_start_main (/lib/libc.so.6+0x1ec8c)
    #14 0x407ea0 (/home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/xpcshell+0x407ea0)

0x2ad2d31bd3c0 is located 0 bytes inside of global variable 'kComponentManagerCID' from '/home/gk/asan/mozilla-esr24/xpcom/build/nsXPComInit.cpp' (0x2ad2d31bd3c0) of size 16
SUMMARY: AddressSanitizer: unknown-crash ../../dist/include/nsHashKeys.h:375 nsIDHashKey
Shadow bytes around the buggy address:
  0x055ada62fa20: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa30: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa40: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa50: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa60: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
=>0x055ada62fa70: 00 00 f9 f9 f9 f9 f9 f9[00]00 f9 f9 f9 f9 f9 f9
  0x055ada62fa80: 07 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 04 f9 f9 f9
  0x055ada62fa90: f9 f9 f9 f9 00 02 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x055ada62faa0: 05 f9 f9 f9 f9 f9 f9 f9 06 f9 f9 f9 f9 f9 f9 f9
  0x055ada62fab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x055ada62fac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  ASan internal:           fe
==22303==ABORTING

Not sure if that happens after or even before the crash described above. The packaging step problems occur neither with GCC 4.8.2 for x86-64 on Lucid nor with GCC 4.9.0 on Precise.
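For reading the shadow dump in the report above, here is a minimal sketch of how an application address maps to its shadow byte. It assumes the default ASan mapping on x86-64 Linux (scale 3, offset 0x7fff8000), which is consistent with the addresses in this log; the function names are made up for illustration.

```python
# Assumption: default ASan shadow mapping on x86-64 Linux,
# shadow = (addr >> 3) + 0x7fff8000. Other platforms/versions may differ.
SHADOW_SCALE = 3
SHADOW_OFFSET = 0x7FFF8000

# Subset of the "Shadow byte legend" printed in the report.
LEGEND = {
    0x00: "addressable",
    0xF9: "global redzone",
    0xFA: "heap left redzone",
    0xFB: "heap right redzone",
    0xFD: "freed heap region",
}


def shadow_address(app_addr: int) -> int:
    """Return the address of the shadow byte covering app_addr."""
    return (app_addr >> SHADOW_SCALE) + SHADOW_OFFSET


def describe_shadow_byte(value: int) -> str:
    """Translate a shadow byte value into the legend's wording."""
    if 1 <= value <= 7:
        return f"first {value} of 8 bytes addressable"
    return LEGEND.get(value, "other (see legend)")


if __name__ == "__main__":
    # Address of kComponentManagerCID from the report.
    print(hex(shadow_address(0x2AD2D31BD3C0)))  # 0x55ada62fa78
```

The result, 0x055ada62fa78, is the bracketed byte in the row starting at 0x055ada62fa70. That byte and its neighbour are both 00, so the 16 bytes being read are marked addressable, which may be why the report is classified as "unknown-crash" rather than as an overflow.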

Looking closer, neither setting the HOST_* flags to '" "' nor using gold as the linker fixes the problem. Comparing the build logs of a Precise and a Lucid build does not give any clue either. After examining the code, it seemed that the last revision that probably works for us is 482026b63e8a488d6b7f0eab53fcbfe12c3309ae (although that one is broken due to https://gcc.gnu.org/bugzilla/show_bug.cgi?format=multiple&id=58868). That guess turned out to be wrong. After some more bisecting I found the culprit: 4fc7b5acfc1d42a0701c8fff726a3ebe7f563dd9. I've filed a GCC bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61408) and hope that the GCC folks can help us here one way or another. Meanwhile, I am writing a patch that fixes this problem for us.
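The bisecting mentioned above boils down to a binary search over an ordered list of revisions. The following is only a generic sketch of that idea, not the procedure actually used here: the revision list and the is_bad callback (which would build a revision and check whether the packaging step crashes) are hypothetical.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad revision.

    Assumes revisions are ordered oldest-first, the first one is known
    good, the last one is known bad, and the history flips from good to
    bad exactly once.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # culprit is at mid or earlier
        else:
            lo = mid  # culprit is after mid
    return revisions[hi]


if __name__ == "__main__":
    # Hypothetical history r0..r7 where the breakage starts at r5.
    revs = [f"r{i}" for i in range(8)]
    print(first_bad(revs, lambda r: int(r[1:]) >= 5))  # r5
```

Each step halves the candidate range, so even a few thousand compiler revisions need only about a dozen test builds.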

comment:32 Changed 5 years ago by gk

Now the work with Precise: Tor Browser builds fine there with GCC 4.9.0 and ASan, UBSan and VTV. The other fancy options mentioned on https://developer.mozilla.org/en-US/docs/Building_SpiderMonkey_with_UBSan are not supported in 4.9.0 yet, but some of them have already landed on GCC trunk. Anyway, the build crashes on start-up with the attached log. I have not had time to look into that yet. So my short-term plan is to get hardened bundles built on Lucid out with ASan and UBSan while debugging the VTV issue and the one in comment:16. If someone is interested in debugging the VTV issue, let me know and I'll upload the bundle + debug symbols (ca. 500MiB).

Last edited 5 years ago by gk

Changed 5 years ago by gk

Attachment: precise_vtv_crash.log added

VTV crash on Ubuntu Precise

comment:33 in reply to:  16 Changed 5 years ago by gk

Replying to cypherpunks:

By starting https://people.torproject.org/~gk/testbuilds/asan/20140521/ on Debian wheezy x86_64 I get the following trace when browsing to tweakers.net: http://paste.debian.net/hidden/b7b2f353/

This is now a separate ticket: #12199.

comment:34 Changed 5 years ago by arma

Cc: arma added

comment:35 Changed 5 years ago by gk

After patching GCC (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61408 for the gory details) I was finally able to build TBBs with ASan and UBSan. They can be found on https://people.torproject.org/~gk/testbuilds/asan/20140617/ and even fix #12199. The branch I used for building is hardening_asan_ng_linux_x86-64 in my public repo. In order to build it you need a tor-browser tag "asan-ng", which could be based on the branch the current nightly uses + the attached .mozconfig-asan (see 2) in comment:15 for instructions on how to handle it).

Changed 5 years ago by gk

Attachment: mozconfig.2 added

.mozconfig-asan for ASan- and UBSan-enabled TBBs

comment:36 Changed 5 years ago by gk

Keywords: TorBrowserTeam201406 added

comment:37 Changed 5 years ago by gk

I uploaded a working build with ASan, UBSan and VTV to https://people.torproject.org/~gk/testbuilds/asan/20140620/.

They are currently compiled with "-fvtable-verify=std". "-fvtable-verify=preinit" does not work with ld but using gold seems to be fine. I'll add that piece in the next iteration of these builds. In order to avoid the browser exiting on VTV errors the compiler is built with -DVTV_NO_ABORT.

comment:38 Changed 5 years ago by gk

Excluding the host tools from SoftBound CETS seems to help a bit. But the next issue is not far away:

clang -o Decimal.o -c -fvisibility=hidden -DIMPL_MFBT -DMOZ_GLUE_IN_PROGRAM -DNO_NSPR_10_SUPPORT -I/home/firefox64/softboundcets-34/tor-browser/mfbt -I. -I../dist/include -I/home/firefox64/softboundcets-34/tor-browser/obj-x86_64-unknown-linux-gnu/dist/include/nspr -I/home/firefox64/softboundcets-34/tor-browser/obj-x86_64-unknown-linux-gnu/dist/include/nss -fPIC -Qunused-arguments -Qunused-arguments -Wall -Wpointer-arith -Woverloaded-virtual -Werror=return-type -Wtype-limits -Wempty-body -Wsign-compare -Wno-invalid-offsetof -Wno-c++0x-extensions -Wno-extended-offsetof -Wno-unknown-warning-option -Wno-return-type-c-linkage -Wno-mismatched-tags -fsoftboundcets -fno-exceptions -fno-strict-aliasing -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -std=gnu++0x -pthread -pipe -DNDEBUG -DTRIMMED -g -Os -freorder-blocks -fomit-frame-pointer -Qunused-arguments -DMOZILLA_CLIENT -include ../mozilla-config.h -MD -MP -MF .deps/Decimal.o.pp /home/firefox64/softboundcets-34/tor-browser/mfbt/decimal/Decimal.cpp
1 warning generated.
clang: SoftBoundCETS.cpp:3375: void SoftBoundCETSPass::addDereferenceChecks(llvm::Function*): Assertion `0 && "Atomic Instructions not handled"' failed.
0 clang 0x0000000002614f92 llvm::sys::PrintStackTrace(_IO_FILE*) + 34
1 clang 0x0000000002614379
2 libpthread.so.0 0x00002ab31e673cb0
3 libc.so.6 0x00002ab31f4f6425 gsignal + 53
4 libc.so.6 0x00002ab31f4f9b8b abort + 379
5 libc.so.6 0x00002ab31f4ef0ee
6 libc.so.6 0x00002ab31f4ef192
7 clang 0x00000000015bf62c SoftBoundCETSPass::addDereferenceChecks(llvm::Function*) + 540
8 clang 0x00000000015c03a8 SoftBoundCETSPass::runOnModule(llvm::Module&) + 600
9 clang 0x000000000259a81f llvm::legacy::PassManagerImpl::run(llvm::Module&) + 927
10 clang 0x0000000000857aa6 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::Module*, clang::BackendAction, llvm::raw_ostream*) + 3606
11 clang 0x0000000000854e7d
12 clang 0x0000000000a12534 clang::ParseAST(clang::Sema&, bool, bool) + 372
13 clang 0x00000000006bf6ca clang::FrontendAction::Execute() + 282
14 clang 0x000000000069ff50 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 352
15 clang 0x000000000068536d clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 1693
16 clang 0x000000000067c640 cc1_main(char const**, char const**, char const*, void*) + 1232
17 clang 0x00000000006837b9 main + 665
18 libc.so.6 0x00002ab31f4e176d __libc_start_main + 237
19 clang 0x000000000067bff9
Stack dump:

  1. Program arguments: /home/firefox64/softboundcets-34/softboundcets-llvm-clang34/Release+Asserts/bin/clang -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -disable-free -main-file-name Decimal.cpp -mrelocation-model pic -pic-level 2 -relaxed-aliasing -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -target-cpu x86-64 -target-linker-version 2.22 -momit-leaf-frame-pointer -g -ffunction-sections -fdata-sections -coverage-file /home/firefox64/softboundcets-34/tor-browser/obj-x86_64-unknown-linux-gnu/mfbt/Decimal.o -resource-dir /home/firefox64/softboundcets-34/softboundcets-llvm-clang34/Release+Asserts/bin/../lib/clang/3.4 -dependency-file .deps/Decimal.o.pp -MT Decimal.o -sys-header-deps -MP -include ../mozilla-config.h -D IMPL_MFBT -D MOZ_GLUE_IN_PROGRAM -D NO_NSPR_10_SUPPORT -D NDEBUG -D TRIMMED -D MOZILLA_CLIENT -I /home/firefox64/softboundcets-34/tor-browser/mfbt -I . -I ../dist/include -I /home/firefox64/softboundcets-34/tor-browser/obj-x86_64-unknown-linux-gnu/dist/include/nspr -I /home/firefox64/softboundcets-34/tor-browser/obj-x86_64-unknown-linux-gnu/dist/include/nss -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/c++/4.6 -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/c++/4.6/x86_64-linux-gnu -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/c++/4.6/backward -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/4.6/../../../../include/x86_64-linux-gnu/c++/4.6 -internal-isystem /usr/local/include -internal-isystem /home/firefox64/softboundcets-34/softboundcets-llvm-clang34/Release+Asserts/bin/../lib/clang/3.4/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -Os -Wall -Wpointer-arith -Woverloaded-virtual -Werror=return-type -Wtype-limits -Wempty-body -Wsign-compare -Wno-invalid-offsetof -Wno-c++0x-extensions -Wno-extended-offsetof -Wno-unknown-warning-option -Wno-return-type-c-linkage 
-Wno-mismatched-tags -std=gnu++0x -fdeprecated-macro -fdebug-compilation-dir /home/firefox64/softboundcets-34/tor-browser/obj-x86_64-unknown-linux-gnu/mfbt -ferror-limit 19 -fmessage-length 80 -fvisibility hidden -pthread -fsoftboundcets -mstackrealign -fno-rtti -fobjc-runtime=gcc -fdiagnostics-show-option -fcolor-diagnostics -vectorize-loops -vectorize-slp -o Decimal.o -x c++ /home/firefox64/softboundcets-34/tor-browser/mfbt/decimal/Decimal.cpp
  2. <eof> parser at end of file
  3. Per-module optimization passes
  4. Running pass ' SoftBoundCETSPass' on module '/home/firefox64/softboundcets-34/tor-browser/mfbt/decimal/Decimal.cpp'.

clang: error: unable to execute command: Aborted (core dumped)
clang: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 3.4 (branches/release_34)
Target: x86_64-unknown-linux-gnu
Thread model: posix
clang: note: diagnostic msg: PLEASE submit a bug report to http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and associated run script.
clang: note: diagnostic msg:

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/Decimal-8da576.cpp
clang: note: diagnostic msg: /tmp/Decimal-8da576.sh
clang: note: diagnostic msg:


make[5]: *** [Decimal.o] Error 254

Changed 5 years ago by gk

Attachment: decimal_error_cpp.bz2 added

bug2_1 -- Decimal.cpp

Changed 5 years ago by gk

Attachment: decimal_error_sh added

bug2_2 -- Decimal.cpp

comment:39 Changed 5 years ago by gk

Keywords: TorBrowserTeam201406 removed

comment:40 Changed 5 years ago by erinn

Component: Tor bundles/installation → Tor Browser
Keywords: tbb-gitian added
Owner: changed from erinn to tbb-team

comment:41 Changed 4 years ago by isis

Cc: isis added

comment:42 Changed 4 years ago by mikeperry

Keywords: TorBrowserTeam201509 added

comment:43 Changed 4 years ago by gk

Keywords: GeorgKoppen201509 added

I'll look at dusting off my former work and using it for our hardened bundles, which are part of SponsorU. If anyone is interested in working on the SoftBound parts, go ahead.

comment:44 Changed 4 years ago by gk

Suppose you start building a hardened tor in a gitian environment with GCC 5.1.0. Soon, you'll see the configure step is failing. Upon inspection of the config.log you'll see something like

==15310==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.

Your first thought is "Damn, libfaketime again!", right? If so, good, because that is indeed the issue. If not, you are probably trying to compile it locally, where this works. Then you try using LD_PRELOAD as the error message advises, but the build fails even earlier. So, searching a bit, you'll find https://www.mail-archive.com/address-sanitizer@googlegroups.com/msg00591.html and concerns of GCC devs about this feature (https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01919.html).

Still puzzled, you log into the Gitian VM directly and re-run the build. Now it does not fail. Could be a Gitian bug, right? Copying the build script manually into the VM and making sure it gets executed in exactly the same way rules this out. And now, upon rethinking the problem, "libfaketime!" pops up in your mind and, bingo, that's it.

I think we should just comment out the Die() call in asan_linux.cc as done in the attached patch.
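To illustrate the ordering rule behind the error message, here is a rough sketch of the check and of an LD_PRELOAD string that would satisfy it. This is an approximation for illustration, not the code in asan_linux.cc: the function names, the library paths and the list-of-paths input are assumptions.

```python
import os


def asan_runtime_is_first(loaded_libs):
    """Approximate the ASan start-up check.

    loaded_libs is a hypothetical list of library paths in load order.
    The kernel-provided vDSO is skipped; the first real library must be
    the ASan runtime.
    """
    for lib in loaded_libs:
        name = os.path.basename(lib)
        if name.startswith(("linux-vdso", "linux-gate")):
            continue
        return name.startswith("libasan.so")
    return False


def preload_with_asan_first(ld_preload, asan_lib="libasan.so.2"):
    """Rebuild an LD_PRELOAD value so that the ASan runtime comes first."""
    # LD_PRELOAD entries may be separated by colons or spaces.
    entries = [e for e in ld_preload.replace(" ", ":").split(":") if e]
    others = [e for e in entries
              if not os.path.basename(e).startswith("libasan.so")]
    return ":".join([asan_lib] + others)


if __name__ == "__main__":
    # Illustrative path; libfaketime gets preloaded ahead of everything else.
    faketime = "/usr/lib/faketime/libfaketime.so.1"
    print(asan_runtime_is_first([faketime, "/usr/lib/libasan.so.2"]))
    print(preload_with_asan_first(faketime))
```

As noted above, preloading the runtime made the build fail even earlier in this environment, so the sketch only shows why libfaketime trips the check; it is not a working fix.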

Changed 4 years ago by gk

Attachment: libfaketime-asan.patch added

comment:45 Changed 4 years ago by gk

Now the build gets much further but still breaks in the Firefox step:

In file included from ../../dist/system_wrappers/sys/cdefs.h:3:0,
                 from /usr/include/features.h:346,
                 from ../../dist/system_wrappers/features.h:3,
                 from /home/ubuntu/install/gcc/include/c++/5.1.0/x86_64-unknown-linux-gnu/bits/os_defines.h:39,
                 from /home/ubuntu/install/gcc/include/c++/5.1.0/x86_64-unknown-linux-gnu/bits/c++config.h:482,
                 from /home/ubuntu/install/gcc/include/c++/5.1.0/cstddef:44,
                 from ../../dist/system_wrappers/cstddef:3,
                 from ../../dist/include/mozilla/Compiler.h:46,
                 from ../../dist/include/mozilla/Attributes.h:12,
                 from ../../dist/include/mozilla/Assertions.h:16,
                 from ../../dist/include/mozilla/ArrayUtils.h:14,
                 from /home/ubuntu/build/tor-browser/xpcom/threads/BackgroundHangMonitor.cpp:7,
                 from /home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/xpcom/threads/Unified_cpp_xpcom_threads0.cpp:2:
/usr/include/bits/string3.h: In member function 'void mozilla::ThreadStackHelper::FillThreadContext(void*)':
/usr/include/bits/string3.h:49:1: error: inlining failed in call to always_inline 'void* memcpy(void*, const void*, size_t) throw ()': function attribute mismatch
 __NTH (memcpy (void *__restrict __dest, __const void *__restrict __src,
 ^
In file included from /home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/xpcom/threads/Unified_cpp_xpcom_threads0.cpp:29:0:
/home/ubuntu/build/tor-browser/xpcom/threads/ThreadStackHelper.cpp:730:66: error: called from here
          &context.uc_mcontext.gregs[REG_R8], 8 * sizeof(int64_t));
                                                                  ^
make[5]: Leaving directory `/home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/xpcom/threads'
make[5]: *** [Unified_cpp_xpcom_threads0.o] Error 1

comment:46 Changed 4 years ago by nicoo

Cc: nicoo added

comment:47 Changed 4 years ago by gk

Keywords: TorBrowserTeam201510 GeorgKoppen201510 added; TorBrowserTeam201509 GeorgKoppen201509 removed
Owner: changed from tbb-team to gk
Sponsor: SponsorU
Status: new → assigned

comment:48 Changed 4 years ago by gk

Parent ID: #17304

comment:49 Changed 4 years ago by gk

This is a fun bug underlying https://bugzilla.mozilla.org/show_bug.cgi?id=1147248 as well. We hit it as FORTIFY_SOURCE makes memcpy always inline. I am still trying to pinpoint what is causing this (now with the help of tbsaunde).

Anyway, besides these two issues there is more around the corner:

/home/ubuntu/build/tor-browser/intl/icu/source/common/putil.cpp:2188: error: undefined reference to 'dlsym'
collect2: error: ld returned 1 exit status

comment:50 in reply to:  49 Changed 4 years ago by gk

Some updates here. The quest continues.

Replying to gk:

This is a fun bug underlying https://bugzilla.mozilla.org/show_bug.cgi?id=1147248 as well. We hit it as FORTIFY_SOURCE makes memcpy always inline. I am still trying to pinpoint what is causing this (now with the help of tbsaunde).

I can work around these problems by backporting

https://hg.mozilla.org/mozilla-central/rev/33e89c9a4172 and
https://hg.mozilla.org/mozilla-central/rev/5e86358d4ec2

Anyway, besides these two issues there is more around the corner:

/home/ubuntu/build/tor-browser/intl/icu/source/common/putil.cpp:2188: error: undefined reference to 'dlsym'
collect2: error: ld returned 1 exit status

This only happens with GCC 5. It seems to me this is a Mozilla bug which is why I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1213698 (I intend to write a patch for that one in case this is still open after our October deadline).

But there is more:

/usr/bin/ld.gold.real: error: /path/to/tor-browser/tor-browser/obj-x86_64-unknown-linux-gnu/toolkit/library/../../gfx/skia/SkFontHost_FreeType.o: requires dynamic R_X86_64_PC32 reloc against 'FT_Get_X11_Font_Format' which may overflow at runtime; recompile with -fPIC
/usr/bin/ld.gold.real: error: read-only segment has dynamic relocations
/usr/bin/ld.gold.real: error: hidden symbol 'FT_Get_X11_Font_Format' is not defined locally
collect2: error: ld returned 1 exit status

Surprisingly, this has been happening since Firefox 30. It got fixed in Firefox 39 and backporting

https://hg.mozilla.org/mozilla-central/rev/afd840d66e6a

helps. Now, back to testing this in our Gitian environment. (On the bright side, I found an ICE while trying to compile ESR 38 with GCC master. So not everything has been in vain so far... :) )

comment:51 Changed 4 years ago by boklm

Cc: boklm added

comment:52 Changed 4 years ago by gk

Severity: Blocker

Oh, and we need to disable ICU for now as there is some funky stuff going on. E.g.:

=================================================================
==27938==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 216 byte(s) in 3 object(s) allocated from:
    #0 0x7f130aa4037a in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.2+0x9437a)
    #1 0x559932e19e6e in res_open /home/thomas/Arbeit/Tor/tor-browser/intl/icu/source/tools/genrb/reslist.c:845

SUMMARY: AddressSanitizer: 216 byte(s) leaked in 3 allocation(s).
Makefile:660: recipe for target 'out/build/icudt52l/coll/root.res' failed
make[7]: *** [out/build/icudt52l/coll/root.res] Error 23

comment:53 Changed 4 years ago by gk

Priority: High → Very High
Severity: Blocker → Normal

comment:54 Changed 4 years ago by gk

It seems we are hitting a UBSan-related internal compiler error with 5.1.0: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66190. Bumping the GCC version to 5.2.0 helps and the compilation succeeds \o/. The packaging step is still broken, though:

/home/ubuntu/build/tor-browser/tools/profiler/UnwinderThread2.cpp:693:66: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/bits/string3.h:52:71: runtime error: null pointer passed as argument 2, which is declared to never be null
/home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:341:5: runtime error: load of address 0x2b59c2fce270 with insufficient space for an object of type 'const struct Module *'
0x2b59c2fce270: note: pointer points here
 00 00 00 00  00 cb d7 a3 59 2b 00 00  60 e8 d7 a3 59 2b 00 00  20 1a d8 a3 59 2b 00 00  20 85 d9 a3
              ^ 
ASAN:SIGSEGV
=================================================================
==28557==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 bp 0x2b5a00fff6d0 sp 0x2b5a00fff5d8 T2)
==28557==Hint: pc points to the zero page.

AddressSanitizer can not provide additional info.
/home/ubuntu/build/tor-browser/nsprpub/pr/src/io/prlayer.c:655:13: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/bits/string3.h:52:10: runtime error: null pointer passed as argument 2, which is declared to never be null
SUMMARY: AddressSanitizer: SEGV ??:0 ??
Thread T2 created by T0 here:
    #0 0x2b597c685054 in __interceptor_pthread_create ../../.././libsanitizer/asan/asan_interceptors.cc:179
    #1 0x2b597db679c0 in _PR_CreateThread /home/ubuntu/build/tor-browser/nsprpub/pr/src/pthreads/ptthread.c:453
    #2 0x2b597db6895e in PR_CreateThread /home/ubuntu/build/tor-browser/nsprpub/pr/src/pthreads/ptthread.c:544
    #3 0x2b5996ffb60e in nsThread::Init() /home/ubuntu/build/tor-browser/xpcom/threads/nsThread.cpp:469
    #4 0x2b5996ffbed9 in nsThreadManager::NewThread(unsigned int, unsigned int, nsIThread**) /home/ubuntu/build/tor-browser/xpcom/threads/nsThreadManager.cpp:362
    #5 0x2b599706fad1 in NS_NewThread(nsIThread**, nsIRunnable*, unsigned int) /home/ubuntu/build/tor-browser/xpcom/glue/nsThreadUtils.cpp:69
    #6 0x2b5997791fb2 in nsresult NS_NewNamedThread<13ul>(char const (&) [13ul], nsIThread**, nsIRunnable*, unsigned int) ../../../dist/include/nsThreadUtils.h:74
    #7 0x2b5997791fb2 in nsNotifyAddrListener::Init() /home/ubuntu/build/tor-browser/netwerk/system/linux/nsNotifyAddrListener_Linux.cpp:270
    #8 0x2b59977b3941 in nsNotifyAddrListenerConstructor /home/ubuntu/build/tor-browser/netwerk/build/nsNetModule.cpp:381
    #9 0x2b5996fd7950 in nsComponentManagerImpl::CreateInstanceByContractID(char const*, nsISupports*, nsID const&, void**) /home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:1199
    #10 0x2b5996fdcc23 in nsComponentManagerImpl::GetServiceByContractID(char const*, nsID const&, void**) /home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:1561
    #11 0x2b599705e375 in nsGetServiceByContractIDWithError::operator()(nsID const&, void**) const /home/ubuntu/build/tor-browser/xpcom/glue/nsComponentManagerUtils.cpp:292
    #12 0x2b599705e52e in nsCOMPtr_base::assign_from_gs_contractid_with_error(nsGetServiceByContractIDWithError const&, nsID const&) /home/ubuntu/build/tor-browser/xpcom/glue/nsCOMPtr.cpp:114
    #13 0x2b59971a4c3f in nsCOMPtr<nsINetworkLinkService>::operator=(nsGetServiceByContractIDWithError const&) ../../dist/include/nsCOMPtr.h:613
    #14 0x2b59971a4c3f in nsIOService::InitializeNetworkLinkService() /home/ubuntu/build/tor-browser/netwerk/base/nsIOService.cpp:281
    #15 0x2b59971c8490 in nsIOService::Init() /home/ubuntu/build/tor-browser/netwerk/base/nsIOService.cpp:232
    #16 0x2b59971ca5f3 in nsIOService::GetInstance() /home/ubuntu/build/tor-browser/netwerk/base/nsIOService.cpp:309
    #17 0x2b59977bfa6b in nsIOServiceConstructor /home/ubuntu/build/tor-browser/netwerk/build/nsNetModule.cpp:57
    #18 0x2b5996fd7950 in nsComponentManagerImpl::CreateInstanceByContractID(char const*, nsISupports*, nsID const&, void**) /home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:1199
    #19 0x2b5996fdcc23 in nsComponentManagerImpl::GetServiceByContractID(char const*, nsID const&, void**) /home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:1561
    #20 0x2b599705c944 in nsGetServiceByContractID::operator()(nsID const&, void**) const /home/ubuntu/build/tor-browser/xpcom/glue/nsComponentManagerUtils.cpp:280
    #21 0x2b599705ca50 in nsCOMPtr_base::assign_from_gs_contractid(nsGetServiceByContractID, nsID const&) /home/ubuntu/build/tor-browser/xpcom/glue/nsCOMPtr.cpp:103
    #22 0x2b599707e6bc in nsCOMPtr<nsIIOService>::nsCOMPtr(nsGetServiceByContractID) /home/ubuntu/build/tor-browser/xpcom/build/../glue/nsCOMPtr.h:514
    #23 0x2b599707e6bc in mozilla::services::GetIOService() /home/ubuntu/build/tor-browser/xpcom/build/ServiceList.h:18
    #24 0x2b5997040ef4 in do_GetIOService(nsresult*) ../../../dist/include/nsNetUtil.h:97
    #25 0x2b599704110c in net_EnsureIOService(nsIIOService**, nsCOMPtr<nsIIOService>&) (/home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/libxul.so+0x193cc10c)
    #26 0x2b599704143b in NS_NewURI(nsIURI**, nsACString_internal const&, char const*, nsIURI*, nsIIOService*) ../../../../dist/include/nsNetUtil.h:152
    #27 0x2b59970327f2 in nsChromeRegistry::ManifestProcessingContext::GetManifestURI() /home/ubuntu/build/tor-browser/chrome/nsChromeRegistryChrome.cpp:721
    #28 0x2b5997032e70 in nsChromeRegistry::ManifestProcessingContext::ResolveURI(char const*) /home/ubuntu/build/tor-browser/chrome/nsChromeRegistryChrome.cpp:738
    #29 0x2b599703de58 in nsChromeRegistryChrome::ManifestLocale(nsChromeRegistry::ManifestProcessingContext&, int, char* const*, int) /home/ubuntu/build/tor-browser/chrome/nsChromeRegistryChrome.cpp:819
    #30 0x2b5996fe66b4 in ParseManifest(NSLocationType, mozilla::FileLocation&, char*, bool, bool) /home/ubuntu/build/tor-browser/xpcom/components/ManifestParser.cpp:786
    #31 0x2b5996fd2b2d in DoRegisterManifest /home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:613
    #32 0x2b5996fd300c in nsComponentManagerImpl::RegisterManifest(NSLocationType, mozilla::FileLocation&, bool) /home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:626
    #33 0x2b5996fd300c in nsComponentManagerImpl::ManifestManifest(nsComponentManagerImpl::ManifestProcessingContext&, int, char* const*) /home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:635
    #34 0x2b5996fe6af4 in ParseManifest(NSLocationType, mozilla::FileLocation&, char*, bool, bool) /home/ubuntu/build/tor-browser/xpcom/components/ManifestParser.cpp:795
    #35 0x2b5996fd2b2d in DoRegisterManifest /home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:613
    #36 0x2b5996fd2e03 in nsComponentManagerImpl::RegisterManifest(NSLocationType, mozilla::FileLocation&, bool) /home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:626
    #37 0x2b5996fd2e03 in nsComponentManagerImpl::RereadChromeManifests(bool) /home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:821
    #38 0x2b5996fda5b8 in nsComponentManagerImpl::Init() /home/ubuntu/build/tor-browser/xpcom/components/nsComponentManager.cpp:430
    #39 0x2b599708b2fd in NS_InitXPCOM2 /home/ubuntu/build/tor-browser/xpcom/build/XPCOMInit.cpp:766
    #40 0x2b59985570d1 in XRE_XPCShellMain /home/ubuntu/build/tor-browser/js/xpconnect/src/XPCShellImpl.cpp:1382
    #41 0x2b59c4602c8c in __libc_start_main (/lib/libc.so.6+0x1ec8c)

==28557==ABORTING

Might be related to comment:35.

comment:55 Changed 4 years ago by gk

Yes, I needed to backport the GCC patch to 5.2.0 (attached) and then the error goes away. However, we are not done yet. The packaging mysteriously freezes after the preparation of the startup cache. On the bright side, this seems to require just another round of bisecting, as ESR 24 compiles and packages properly (after disabling LSan for the packaging step).

Changed 4 years ago by gk

fix packaging crash by making GCC patch work with 5.2.0

comment:56 Changed 4 years ago by gk

Some updates. The good news is that I got it built (thanks to ted for the idea of using --disable-startupcache). Mozilla is not running LSan currently due to various issues, comment:52 being one of them (https://bugzilla.mozilla.org/show_bug.cgi?id=1214464) and LSan breakage of the packaging step a second one (https://bugzilla.mozilla.org/show_bug.cgi?id=1215443).

The bad news is it crashes during start-up with

/home/ubuntu/build/tor-browser/tools/profiler/UnwinderThread2.cpp:693:66: runtime error: null pointer passed as argument 2, which is declared to never be null
/usr/include/bits/string3.h:52:71: runtime error: null pointer passed as argument 2, which is declared to never be null

ASAN:SIGSEGV
=================================================================
==13006==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 bp 0x7fe43a94ab92 sp 0x7fff64f39528 T0)
==13006==Hint: pc points to the zero page.

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 ??
==13006==ABORTING

which we already saw in comment:54.
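One quick way to check whether two reports show the same crash is to compare their SUMMARY lines. The helper below is a small sketch for that; the function name is made up, and it only looks at the "SUMMARY: AddressSanitizer:" prefix seen in the logs in this ticket.

```python
import re

# Matches the one-line summary that ASan prints near the end of a report.
SUMMARY_RE = re.compile(r"^SUMMARY: AddressSanitizer: (.+)$", re.MULTILINE)


def asan_summaries(log_text):
    """Return the text of every ASan SUMMARY line found in log_text."""
    return [m.strip() for m in SUMMARY_RE.findall(log_text)]


if __name__ == "__main__":
    # Excerpts in the shape of the reports pasted in this ticket.
    packaging_log = (
        "AddressSanitizer can not provide additional info.\n"
        "SUMMARY: AddressSanitizer: SEGV ??:0 ??\n"
        "==28557==ABORTING\n"
    )
    startup_log = (
        "AddressSanitizer can not provide additional info.\n"
        "SUMMARY: AddressSanitizer: SEGV ??:0 ??\n"
        "==13006==ABORTING\n"
    )
    print(asan_summaries(packaging_log) == asan_summaries(startup_log))
```

A matching summary is only a hint: "SEGV ??:0 ??" carries no symbol information, so two unrelated jumps to a null address would produce the same line.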

comment:57 Changed 4 years ago by gk

Compiling with a non-custom GCC 5.2.0 on a Debian system with ASan only (no UBSan, no --disable-startupcache, and ASAN_OPTIONS="detect_leaks=0" to avoid the ICU blow-up), there is no freeze in the packaging step and the build works. The only thing we get on shutdown is

ASAN:SIGSEGV
=================================================================
==9717==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fa0013e900b bp 0x7f9fd32fce80 sp 0x7f9fd32fce70 T45)
    #0 0x7fa0013e900a in RunWatchdog /home/thomas/Arbeit/Tor/tor-browser/toolkit/components/terminator/nsTerminator.cpp:151
    #1 0x7fa006ad2ae8 in _pt_root /home/thomas/Arbeit/Tor/tor-browser/nsprpub/pr/src/pthreads/ptthread.c:212
    #2 0x7fa00a4430a3 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x80a3)
    #3 0x7fa0096e306c in clone (/lib/x86_64-linux-gnu/libc.so.6+0xe606c)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/thomas/Arbeit/Tor/tor-browser/toolkit/components/terminator/nsTerminator.cpp:151 RunWatchdog
Thread T45 (Shutdow~minator) created by T0 here:
    #0 0x7fa00a68e0c4 in pthread_create (/home/thomas/Arbeit/Tor/debugging/10599/tor-browser_en-US/Browser/TorBrowser/Tor/libasan.so.2+0x360c4)
    #1 0x7fa006ad21a9 in _PR_CreateThread /home/thomas/Arbeit/Tor/tor-browser/nsprpub/pr/src/pthreads/ptthread.c:453
    #2 0x7fa006ad380e in PR_CreateThread /home/thomas/Arbeit/Tor/tor-browser/nsprpub/pr/src/pthreads/ptthread.c:544
    #3 0x7fa0013e8f46 in CreateSystemThread /home/thomas/Arbeit/Tor/tor-browser/toolkit/components/terminator/nsTerminator.cpp:77
    #4 0x7fa0013e9342 in mozilla::nsTerminator::StartWatchdog() /home/thomas/Arbeit/Tor/tor-browser/toolkit/components/terminator/nsTerminator.cpp:383
    #5 0x7fa0013e96e9 in mozilla::nsTerminator::Start() /home/thomas/Arbeit/Tor/tor-browser/toolkit/components/terminator/nsTerminator.cpp:353
    #6 0x7fa0013e9f68 in mozilla::nsTerminator::Observe(nsISupports*, char const*, char16_t const*) /home/thomas/Arbeit/Tor/tor-browser/toolkit/components/terminator/nsTerminator.cpp:439
    #7 0x7f9ffe1c4a79 in nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) /home/thomas/Arbeit/Tor/tor-browser/xpcom/ds/nsObserverList.cpp:100
    #8 0x7f9ffe1c4bb1 in nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) /home/thomas/Arbeit/Tor/tor-browser/xpcom/ds/nsObserverService.cpp:329
    #9 0x7fa00138b7e6 in nsAppStartup::Quit(unsigned int) /home/thomas/Arbeit/Tor/tor-browser/toolkit/components/startup/nsAppStartup.cpp:468
    #10 0x7fa00138b9d9 in nsAppStartup::ExitLastWindowClosingSurvivalArea() /home/thomas/Arbeit/Tor/tor-browser/toolkit/components/startup/nsAppStartup.cpp:540
    #11 0x7fa00138baea in nsAppStartup::Observe(nsISupports*, char const*, char16_t const*) /home/thomas/Arbeit/Tor/tor-browser/toolkit/components/startup/nsAppStartup.cpp:712
    #12 0x7f9ffe1c4a79 in nsObserverList::NotifyObservers(nsISupports*, char const*, char16_t const*) /home/thomas/Arbeit/Tor/tor-browser/xpcom/ds/nsObserverList.cpp:100
    #13 0x7f9ffe1c4bb1 in nsObserverService::NotifyObservers(nsISupports*, char const*, char16_t const*) /home/thomas/Arbeit/Tor/tor-browser/xpcom/ds/nsObserverService.cpp:329
    #14 0x7fa0010d9431 in nsXULWindow::Destroy() /home/thomas/Arbeit/Tor/tor-browser/xpfe/appshell/nsXULWindow.cpp:517
    #15 0x7fa0010d972c in nsWebShellWindow::Destroy() /home/thomas/Arbeit/Tor/tor-browser/xpfe/appshell/nsWebShellWindow.cpp:758
    #16 0x7fa0010d9bd8 in nsWebShellWindow::RequestWindowClose(nsIWidget*) /home/thomas/Arbeit/Tor/tor-browser/xpfe/appshell/nsWebShellWindow.cpp:305
    #17 0x7fa0008bc709 in delete_event_cb /home/thomas/Arbeit/Tor/tor-browser/widget/gtk/nsWindow.cpp:5342
    #18 0x7f9ffb4bba7e  (/usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0+0x132a7e)

==9717==ABORTING

But that crash is due to Mozilla's

    // Shutdown is apparently dead. Crash the process.
    MOZ_CRASH("Shutdown too long, probably frozen, causing a crash.");

Nevertheless, there might be a real issue underneath...

Last edited 3 years ago by gk

comment:58 Changed 4 years ago by gk

UBSan is actually responsible for the freeze while compiling the startup cache. This got "solved" by https://hg.mozilla.org/mozilla-central/rev/f78c80504443, probably by accident: an exception is now thrown during that step, which might break the freeze:

*************************
A coding exception was thrown and uncaught in a Task.

Full message: TypeError: invalid path component
Full stack: join@resource://gre/modules/osfile/ospath_unix.jsm:90:1
task_DI_initializePublicDownloadList@resource://gre/modules/DownloadIntegration.jsm:218:46
TaskImpl_run@resource://gre/modules/Task.jsm:330:41
TaskImpl@resource://gre/modules/Task.jsm:275:3
createAsyncFunction/asyncFunction@resource://gre/modules/Task.jsm:249:14
Task_spawn@resource://gre/modules/Task.jsm:164:12
this.DownloadIntegration.initializePublicDownloadList@resource://gre/modules/DownloadIntegration.jsm:206:1
this.Downloads.getList/this._promiseListsInitialized<@resource://gre/modules/Downloads.jsm:177:17
TaskImpl_run@resource://gre/modules/Task.jsm:330:41
Handler.prototype.process@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:934:23
this.PromiseWalker.walkerLoop@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:813:7
Promise*this.PromiseWalker.scheduleWalkerLoop@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:744:11
this.PromiseWalker.schedulePromise@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:776:7
Promise.prototype.then@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:451:5
this.DownloadCombinedList@resource://gre/modules/DownloadList.jsm:278:3
this.Downloads.getList/this._promiseListsInitialized<@resource://gre/modules/Downloads.jsm:172:28
TaskImpl_run@resource://gre/modules/Task.jsm:330:41
TaskImpl@resource://gre/modules/Task.jsm:275:3
createAsyncFunction/asyncFunction@resource://gre/modules/Task.jsm:249:14
Task_spawn@resource://gre/modules/Task.jsm:164:12
this.Downloads.getList@resource://gre/modules/Downloads.jsm:169:39
this.DownloadView.init@resource://app/modules/DownloadView.jsm:16:5
@resource://app/modules/DownloadView.jsm:35:1
load_modules_under@/path/to/mozilla-central/mozilla-central/toolkit/mozapps/installer/precompile_cache.js:76:7
precompile_startupcache@/path/to/mozilla-central/mozilla-central/toolkit/mozapps/installer/precompile_cache.js:87:3
@-e:1:1

*************************

Might be interesting to find out which revision was the first to cause the freeze, given that ESR 24 is working fine...

Last edited 4 years ago by gk

comment:59 in reply to:  57 Changed 3 years ago by gk

Replying to gk:

But that crash is due to Mozilla's

    // Shutdown is apparently dead. Crash the process.
    MOZ_CRASH("Shutdown too long, probably frozen, causing a crash.");

Nevertheless, there might be a real issue underneath...

The leak sanitizer is active automatically nowadays and is causing the long delay during shutdown resulting in the Mozilla code kicking in. Setting ASAN_OPTIONS="detect_leaks=0" solves this.
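As a minimal sketch of that workaround (not the exact commands used for the hardened builds), the option can be exported in the shell that launches the ASan-enabled browser. The snippet appends `detect_leaks=0` to any options that are already set, using the colon separator that ASan accepts; the launcher name in the final comment is an assumption and depends on the bundle.

```shell
# Append detect_leaks=0 to ASAN_OPTIONS, keeping any options already set.
# ASan option strings are colon-separated.
ASAN_OPTIONS="${ASAN_OPTIONS:+${ASAN_OPTIONS}:}detect_leaks=0"
export ASAN_OPTIONS

# Show the effective value so it can be checked before starting the browser.
echo "ASAN_OPTIONS=${ASAN_OPTIONS}"

# Then start the ASan build from this same shell, for example
# (hypothetical launcher name, adjust to your bundle):
#   ./start-tor-browser
```

With leak detection off, the shutdown path should no longer be held up by the leak check, so the "Shutdown too long" watchdog has no reason to fire.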

comment:60 Changed 3 years ago by gk

Keywords: TorBrowserTeam201510R added; TorBrowserTeam201510 removed
Status: assigned → needs_review

bug_10599 (https://gitweb.torproject.org/user/gk/tor-browser.git/log/?h=bug_10599) in my public tor-browser repo has the Tor Browser changes for our hardened builds we start with (the related bundle changes will be posted to #17305).

We won't build the browser with UBSan for now as there are still issues to get sorted out (I am still investigating the freeze). I'll open a new bug devoted to this task.

The patches are basically backported fixes for various issues found in the previous comments + the respective changes in the .mozconfig-asan.

comment:61 Changed 3 years ago by gk

Keywords: tbb-hardening added

comment:62 Changed 3 years ago by gk

mcs, brade: could you take a look? I'd like to have these fixes in the alpha, too, so that we don't need a separate tor-browser hardened branch.

The freetype related backport is actually not ASan related but helped me to create a build on my Linux machine and is thus useful for users trying to build a "normal" Tor Browser on Linux machines.

comment:63 Changed 3 years ago by mcs

r=mcs, r=brade
There is a small typo near the end of .mozconfig-asan: "alredy" should be "already".
Otherwise, these changes look OK, although we are not familiar with the ThreadStackHelper code. It surprises me that --disable-crashreporter does not skip this code, but I do not understand exactly how it is used.

comment:64 in reply to:  63 Changed 3 years ago by gk

Replying to mcs:

r=mcs, r=brade
There is a small typo near the end of .mozconfig-asan: "alredy" should be "already".

Thanks, fixed.

Otherwise, these changes look OK, although we are not familiar with the ThreadStackHelper code. It surprises me that --disable-crashreporter does not skip this code, but I do not understand exactly how it is used.

Well, there is crash reporter related code that is compiled in, e.g. for the profile IIRC. This could be the same case.

I'll leave the ticket open until I've filed follow-up tickets for the remaining issues found during compiling.

comment:65 Changed 3 years ago by gk

Keywords: TorBrowserTeam201511R added; TorBrowserTeam201510R removed

comment:66 Changed 3 years ago by gk

Keywords: GeorgKoppen201511 added; GeorgKoppen201510 removed

comment:67 Changed 3 years ago by gk

Resolution: fixed
Status: needs_review → closed

Alright, time to close this unwieldy ticket. Thanks to everyone who followed along! I created the follow-up bugs #17505, #17506, #17507, #17508 and #17509 to track the remaining issues with our ASan builds.

We investigated SoftBound+CETS, too, but were unable to get it to compile Firefox even with the help of its designers. Alas, it is unusable at the moment.
