When tor is built with --enable-openbsd-malloc running 'make test' consumes 100% of the CPU. Attaching a debugger to the process shows it looping forever in calloc (src/ext/OpenBSD_malloc_Linux.c). Changing from -O2 to -O3 (or -O1) in the Makefile "fixes" (as in the tests run) the issues so it seems to be a compiler bug.
I've attached the output of objdump for the failing and working object files, but I'm too stupid to debug it any further, so hopefully someone more intelligent can.
Trac: Username: icanhasaccount
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items
0
Show closed items
No child items are currently assigned. Use child items to break down this issue into smaller parts.
Linked items
0
Link issues together to show that they're related.
Learn more.
This bug seems to depend on the name of the "malloc" function. I'm attaching a minimal example to reproduce this bug (down to 67 lines from 2049). The comment near the head of the file shows three ways to build this: two of which work, and one of which gives us the infinite loop that you ran into.
I think it's time to report this upstream to gcc, unless they already know about it.
FWIW the OpenBSD memory allocator was added in ticket:468#comment:22.
I agree with comment:8 with regards to removing the openbsd malloc build option. It's obvious the code isn't tested or maintained and causes people to shoot themselves in the foot.
Sorry - I didn't look too closely at why the openbsd allocator was added in the first place. That was ignorant and lazy of me.
Adding the following to configure.ac (as suggested by Nick in comment 7) allows the tests to run. I'm not familiar with autotools though so perhaps there would be a cleaner way to do this if we could narrow it down to a specific version of gcc?
Renaming ticket, as my understanding is that this is not test-specific, and it occurs with tor in general when the --enable-openbsd-malloc feature is enabled.
Trac: Summary: Make test fails when --enable-openbsd-malloc is used to Tor maxes CPU when --enable-openbsd-malloc is used
Trac: Summary: Tor maxes CPU when --enable-openbsd-malloc is used to Remove --enable-openbsd-malloc (Tor maxes CPU when --enable-openbsd-malloc is used) Milestone: Tor: 0.3.2.x-final to Tor: 0.3.4.x-final
gcc 7 simply deletes the functions, so --enable-openbsd-malloc now instead does nothing except slightly increase compilation resources and executable size.
slightly decreased memory usage (fragmentation is much less bad in 0.3.2 already, but jemalloc still does better by default)
pros compared to openbsd malloc:
remove dead code
code path will be actually tested (unlike with openbsd malloc)
jemalloc is actually maintained by distributions and -ljemalloc is a standard method of overriding glibc. this way will continue to work, unlike just defining functions
cons:
if always-on, adds an extra prerequisite. if used only when available, might cause more Linux-only bugs
glibc does slightly more runtime checks, but AFAIK these are almost entirely useless for security
might be fingerprintable? seems highly unlikely, and would also apply to other OSes
so I read what I'm pretty sure is all of the information online about replacing malloc, and I'm pretty sure that this is the best global option possible. another option that can be explored is to have tor_malloc_ call jemalloc, but AFAICT most distros build jemalloc with no prefix, so dlopen magic would be needed.
(I'm guessing that the second patch should be applied without the first, right?)
I like this patch, but I'm not so sure about having jemalloc be the default right away. Let's let people test it as an option before we decide it's the best choice for everyone. Also, if I'm reading the patch right, this code will break on Linux if jemalloc isn't installed, which probably isn't what you'd intended?
Other questions:
Why move the mlockall check?
Why does the patch move dmalloc? Have you tested this? Does it work?
Where and how have you tested this patch?
Other notes:
This will need a changes file, explaining to people how to use this feature.
We'll still want to remove the openbsd_malloc code.
I wanted to put all of the allocator-related configure.ac code together. I tested at least one of the versions of the patches both with and without jemalloc installed, but I only tested on Linux. The plan was to remove openbsd malloc first. I see no benefit to using it over jemalloc.
I'd like to point out that jemalloc has more issues with security than either ptmalloc or OpenBSD's malloc. The large, aligned heaps make ASLR less effective (and that's one reason why Firefox on *nix machines is so much easier to compromise than on Windows).
additionally, starting with 5, jemalloc now uses extent based allocation, so from what I understand, ASLR will be just as good as with other allocators.
They use jemalloc for Windows? I was under the impression Windows used the system malloc. And yeah I know newer versions of jemalloc are improved (redzones, etc), but I'm still a little weary, especially given that OpenBSD's malloc is actually really well-designed from a security standpoint. Though honestly the Copperhead malloc (a hybrid of bionic and OpenBSD's) is even better, but I feel it's unlikely that it would be included in Tor.
Let's try to get this unstuck, by disentangling the various options.
I've made a branch remove_openbsd_malloc that ports icanhasaccount's removal code to master if we decide to do that. PR at https://github.com/torproject/tor/pull/224 . Not my preferred approach.
And I've made a branch called with_malloc that ports the parts of Hello71's patch that I agree with to master: it adds --with-malloc=, doesn't change our default, doesn't break old configure options, and keeps openbsd malloc as an option. Works for me, but probably needs more review and testing. PR at https://github.com/torproject/tor/pull/225 .
The above two branches apply to master only and are mutually exclusive, though we could probably combine them into one.
For backport purposes, I have a branch fix_nonstandard_malloc_029 that only does the minimum needed to fix the originally supported issue. PR at https://github.com/torproject/tor/pull/226 . Works for me.
Trac: Status: needs_revision to needs_review Reviewer: nickm toN/A
Let's try to get this unstuck, by disentangling the various options.
I've made a branch remove_openbsd_malloc that ports icanhasaccount's removal code to master if we decide to do that. PR at https://github.com/torproject/tor/pull/224 . Not my preferred approach.
And I've made a branch called with_malloc that ports the parts of Hello71's patch that I agree with to master: it adds --with-malloc=, doesn't change our default, doesn't break old configure options, and keeps openbsd malloc as an option. Works for me, but probably needs more review and testing. PR at https://github.com/torproject/tor/pull/225 .
The above two branches apply to master only and are mutually exclusive, though we could probably combine them into one.
For backport purposes, I have a branch fix_nonstandard_malloc_029 that only does the minimum needed to fix the originally supported issue. PR at https://github.com/torproject/tor/pull/226 . Works for me.
Thanks for fixing my patch and giving credit, but I'd appreciate it if you called me Alex Xu in changelog, or at least not put my nick in quotation marks (looks like scare quotes to me). Also, I think you put too many twos in the change file.
On the patch itself, I did actually test what happens when Rust is linked in: it works fine. I think Rust probably uses the system allocator on Linux, so replacing the malloc this way in glibc should also replace the Rust allocator and everything works fine.
Also, I'm not sure it's a good idea to leave all of these options in if nobody is actually going to test them (as shown by openbsd malloc being broken by all this time and nobody really caring that much to fix it).
ok, added a squash commit to fix ticket number and credit you in the form you prefer.
(FYI, our house style has been to put pseudonyms that don't look like names into quotation marks, since not doing so has caused confusion in the past with strings like "patch from randomguessing" or "patch from Some Guy".)