Opened 11 years ago

Last modified 8 years ago

#1060 closed defect (Won't fix)

Stack smashing detected in getinfo_helper_events()

Reported by: anonym
Owned by:
Priority: Low
Milestone:
Component: Core Tor/Tor
Version:
Severity:
Keywords:
Cc: anonym, arma, phobos, nickm
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:


When I try to connect to the control port using TorK, tor is killed due to a stack smashing "attack" in
getinfo_helper_events(). This is on Hardened Gentoo, with gcc 3.4.6 (old, yes, but that's what's considered
stable in Hardened Gentoo). Recompiling without SSP fixes it. The old stable 0.2.0.x branch does not have
this problem.

[Automatically added by flyspray2trac: Operating System: Other Linux]

Change History (11)

comment:1 Changed 11 years ago by arma

Can you get us a core or a traceback or something? Or narrow it down to
which event is causing the problem?

comment:2 Changed 11 years ago by anonym

No core dump is created. I'm trying to get gdb working as we speak, but there seems to have been a lack
of interest in maintaining Hardened Gentoo for the past few years. No version of gdb available in the package
management system seems to have the patches that make it possible to debug position-independent code.

Instead I'll try recompiling tor without PIE but with SSP to see if the crash persists. Hopefully it does
and then I'll use an unpatched gdb to get a stack trace.

How should I proceed to isolate the last event before the crash?

comment:3 Changed 11 years ago by nickm

If it's in getinfo_helper_events(), it's just asked for circuit-status, stream-status, orconn-status,
address-mappings/*, or status/*. It would help to know which one Tor was asked for before it died.

Dumb question: what makes you sure it's the stack and not the heap, and in getinfo_event_helper() ?

comment:4 Changed 11 years ago by anonym

> what makes you sure it's the stack and not the heap, and in getinfo_event_helper() ?

When tor dies I get (approximately) "tor: stack smashing attack detected in function
getinfo_helper_events" in syslog.

Anyway, I just tried compiling tor with gcc 4.1.2 and its SSP, whose implementation is different
from the one in 3.4.6, and the problem is gone. I've encountered a few bugs with 3.4.6's SSP before,
so I suspect that's the cause. This is on a different system though (the Hardened system with gcc
3.4.6 is my current Incognito development snapshot) so I'm not completely sure.

I guess the case is closed, but I'll try to get a debugger into the Hardened environment (which may
sound easier than it is...) just to see what's going on.

comment:5 Changed 11 years ago by nickm

Ah, so it might be an inlined function.

Here's a crazy idea: add this to the top of getinfo_helper_events():

log_notice(LD_CONTROL, "Called with question '%s'", question);

This will log each appropriate GETINFO question before we try to answer it, and if we're in luck,
the last one logged will be the one that caused the crash.

[Thanks for persisting in this, by the way: stack corruption is serious business, and we need
to do all we can to keep it out of Tor.]
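
A minimal, self-contained sketch of that logging idea, outside Tor: the handler records each question before answering it, so the last recorded question points at the query that was in flight when the process died. The names note_question, handle_question and last_question are hypothetical stand-ins for illustration; in Tor itself the call would simply be the log_notice() line above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for Tor's log_notice(): it prints the question and
 * also remembers it so a caller can inspect the last one seen. */
static char last_question[64];

static void note_question(const char *question)
{
  snprintf(last_question, sizeof(last_question), "%s", question);
  fprintf(stderr, "Called with question '%s'\n", question);
}

/* Hypothetical handler: log first, then answer. Returns 0 if the question
 * is one of the known keys and -1 otherwise. */
static int handle_question(const char *question)
{
  static const char *known[] = {
    "circuit-status", "stream-status", "orconn-status"
  };
  size_t i;

  assert(question != NULL);
  note_question(question); /* logged before any work that might crash */
  for (i = 0; i < sizeof(known) / sizeof(known[0]); ++i) {
    if (strcmp(question, known[i]) == 0)
      return 0;
  }
  return -1;
}
```

Because the question is recorded before any real work is done, a crash inside the handler still leaves the offending question as the final log entry.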

comment:6 Changed 11 years ago by anonym

This is what I get with the printf-style debugging:

Aug 11 13:15:12 livecd Tor[22778]: Bootstrapped 100%: Done.
Aug 11 13:15:12 livecd Tor[22778]: Now checking whether ORPort and DirPort are reachable... (this may take up to 20 minutes -- look for log messages indicating success)
Aug 11 13:15:16 livecd Tor[22778]: Called with question 'circuit-status'
Aug 11 13:15:16 livecd Tor[22778]: Called with question 'stream-status'
Aug 11 13:15:16 livecd Tor[22778]: Called with question 'orconn-status'
Aug 11 13:15:16 livecd * stack smashing detected *: tor - terminated
Aug 11 13:15:16 livecd tor: stack smashing attack in function getinfo_helper_events - terminated

comment:7 Changed 11 years ago by anonym

Ok, so I finally managed to get gdb going. This is what I discovered: for some reason gcc puts the
canary value only 32 bytes after name, introduced on line 1734 in control.c. However, name is 128
bytes big, so there's an overlapping problem here. When orconn_target_get_name copies the nickname
and fingerprint of some router into name (in the end resorting to router_get_verbose_nickname) a
few lines below, the canary value gets overwritten and this is wrongly detected as a buffer
overflow when getinfo_helper_events returns.
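
To make the layout described above concrete, here is a small model. It is not Tor code and not what gcc actually emits: the stack frame is simulated as a single byte array with name at offset 0 and a guard word placed a given number of bytes later (32 as observed in gdb, 128 as one would expect). The function canary_survives and the guard value are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN 128              /* declared size of name */
#define FRAME_LEN (NAME_LEN + 8)  /* room for name plus a guard word */

static const unsigned char GUARD[4] = { 0xde, 0xad, 0xbe, 0xef };

/* Simulate a frame with name at offset 0 and the guard at guard_offset,
 * copy text into name (staying within its declared 128 bytes), and report
 * whether the guard is still intact: 1 if intact, 0 if clobbered. */
static int canary_survives(size_t guard_offset, const char *text)
{
  unsigned char frame[FRAME_LEN];
  size_t len = strlen(text) + 1; /* include the terminating NUL */

  assert(guard_offset + sizeof(GUARD) <= FRAME_LEN);
  assert(len <= NAME_LEN); /* the copy is legal for a 128-byte name */

  memset(frame, 0, sizeof(frame));
  memcpy(frame + guard_offset, GUARD, sizeof(GUARD));
  memcpy(frame, text, len);

  return memcmp(frame + guard_offset, GUARD, sizeof(GUARD)) == 0;
}
```

In this model a write that is perfectly legal for a 128-byte buffer still destroys a guard placed at offset 32, which is the false positive described above.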

But here's the kicker: the above only happens if GETINFO requests for circuit-status, stream-status and
orconn-status are sent to the control port in rapid succession (ok, I haven't checked other
combinations). Me typing the commands is not fast enough, but pasting all of the commands at once
is. TorK also sends them fast enough.

I'm perplexed. Of course, my understanding of compilers and assembly isn't that great, so please
correct me if I'm wrong, but AFAIK local variables (like name, and now we're talking about the
actual array, not the "pointer") are located by a fixed offset (i.e. hardcoded in assembly) from
the base pointer. Since that's also the case for the canary value it seems impossible to me that
their location relative to each other on the stack ever could be different.

Anyway, can we agree that this definitely is not a bug in tor, but most likely a bug in the SSP
patches for gcc 3.4.6 or evil gremlins in my computer? It's sad, though, that that would mean
that I'll have to disable SSP for tor in incognito.

comment:8 Changed 11 years ago by anonym

The reason why the canary value isn't overwritten for every 'GETINFO orconn-status' query is
that control_con->use_long_names is 0 for some of them, so only the nickname, not the
fingerprint, is copied into name by orconn_target_get_name (the strlcpy case). For nicknames
of length less than 32 bytes the canary value won't be overwritten and all is well. When
use_long_names is 1, however, the fingerprint (which itself is longer than 32 bytes) is copied
into name overwriting the canary and we're screwed.
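
The two cases can be sketched as follows. This is an illustration rather than Tor's orconn_target_get_name: bytes_needed is a hypothetical helper, and the "$<40 hex digits>~<nickname>" form and the 19-character nickname limit are assumptions based on Tor's verbose-nickname convention as I understand it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NAME_LEN 128 /* size of the name buffer discussed above */

/* Hypothetical helper: format either the bare nickname or a verbose
 * "$<fingerprint>~<nickname>" form into a NAME_LEN-byte buffer and return
 * the number of bytes used, including the terminating NUL. */
static size_t bytes_needed(int use_long_names,
                           const char *hex_fingerprint, const char *nickname)
{
  char name[NAME_LEN];
  int n;

  if (use_long_names)
    n = snprintf(name, sizeof(name), "$%s~%s", hex_fingerprint, nickname);
  else
    n = snprintf(name, sizeof(name), "%s", nickname);

  assert(n >= 0 && (size_t)n < sizeof(name)); /* fit without truncation */
  return (size_t)n + 1;
}
```

Under these assumptions a bare nickname needs at most 20 bytes, while the long form needs at least 43 and at most 62: always more than 32, but well within the 128 bytes that name actually has.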

Why control_con->use_long_names is 1 sometimes and 0 at other times when 'GETINFO orconn-status'
is handled by getinfo_helper_events I do not know. Only TorK manages to consistently
get it to 1 and trigger the stack smash. I tried setting a breakpoint in
orconn_target_get_name and manually changing long_names to 0 for each call made to it when TorK
connects to the control port, and that actually prevented the stack smash from happening. TorK
works just fine after that.

Right now I'm completely convinced this is a compiler bug. For whatever reason name and the
canary value are put too close to each other on the stack.

Case closed?

comment:9 Changed 11 years ago by nickm

Probably, I think. If SSP is treating a 32-word value as being 32 bytes instead, then we can't really work around that.

comment:10 Changed 11 years ago by nickm

flyspray2trac: bug closed.
Closing as "won't fix" -- this looks like a SSP bug.

comment:11 Changed 8 years ago by nickm

Component changed from Tor Client to Tor