Opened 10 years ago

Closed 9 years ago

Last modified 7 years ago

#1125 closed defect (fixed)

(*chp)->next assertion fail?

Reported by: arma Owned by: nickm
Priority: High Milestone: Tor: 0.2.2.x-final
Component: Core Tor/Tor Version: 0.2.2.5-alpha
Severity: Keywords:
Cc: arma, nickm, Sebastian, Funkstar Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description (last modified by nickm)

I found this on irc. So I have no idea if it's for real. The person indicated
it was on 0.2.2.5-alpha.

[Info] buf_shrink_freelists():
Cleaning freelist for 4096-byte chunks: keeping 9, dropping 1.
[Error] Bug: buffers.c:269: buf_shrink_freelists: Assertion
(*chp)->next fail

He also said "polipo still not starting with vidalai 2.5", so I'm thinking
probably Windows.

[Automatically added by flyspray2trac: Operating System: All]

Child Tickets

Attachments (1)

tor-bug-1125.txt (52.1 KB) - added by janak 10 years ago.
Log debug level until crash occurs

Download all attachments as: .zip

Change History (27)

comment:1 Changed 10 years ago by nickm

The person showed up again to confirm that this happens on XP Pro SP3, and occurs in 0.2.2.5-alpha but has not
been observed in 0.2.2.4-alpha.

comment:2 Changed 10 years ago by nickm

Of course, this could just mean that 0.2.2.4-alpha crashes before this bug can happen.

Changed 10 years ago by janak

Attachment: tor-bug-1125.txt added

Log debug level until crash occurs

comment:3 Changed 10 years ago by janak

It seems to me like a server not a client bug. The log file is attached.

comment:4 Changed 9 years ago by nickm

Milestone: Tor: 0.2.2.x-final

Related to at least one case of Bug #977. For now, the best we can do is probably add more assertions to the buffer freelist code to try to detect whatever is going wrong earlier.

[Has anybody seen this happen in a recent version?]

comment:5 Changed 9 years ago by nickm

Description: modified (diff)
Status: newneeds_review

See branch warn1125 in my public repo. It doesn't fix the root problem, but it downgrades it from an assertion failure to a LD_BUG warning.

comment:6 Changed 9 years ago by nickm

Milestone: Tor: 0.2.2.x-final
Owner: set to nickm
Status: needs_reviewaccepted

06:22 < armadev> warn1125 looks fine. feel free to merge

So merged. Taking this bug out from "0.2.2.x-final," since it should no longer cause crashes, and its etiology is still uncertain. If we get more insight on it from the new debugging messages, that'd probably help.

comment:7 Changed 9 years ago by nickm

(Closed #1833 as a duplicate of this bug.)

comment:8 Changed 9 years ago by nickm

Funkstar reports on #1878 :

Aug 26 03:26:49.050 [Info] buf_shrink_freelists(): Cleaning freelist for 4096-byte chunks: length 9, keeping 8, dropping 1.
Aug 26 03:26:49.050 [Error] buf_shrink_freelists(): Bug: buffers.c:281: buf_shrink_freelists: Assertion n_to_skip + n_to_free == freelists[i].cur_length failed; aborting.
Win7x64 Limited Account, vidalia x15 alpha (tor error)

That's really bizarre, since (if the log can be believed), n_to_skip is 8 and n_to_free is 1 and freelists[i].cur_length is 9, and 8+1 should certainly equal 9 in my world.

Something very strange is going on here.

comment:9 Changed 9 years ago by nickm

Err, that should have been #1868 .

comment:10 Changed 9 years ago by Sebastian

Milestone: Tor: 0.2.2.x-final

Do we have any idea how to reproduce the problem? Is there something in common between different reports? I'd like to try and reproduce it myself.

Looks like the patch to downgrade the assert didn't really stop windows machines from asserting :( Adding the milestone back.

comment:11 Changed 9 years ago by asn

[cc]

comment:12 Changed 9 years ago by Sebastian

Cc: Funkstar added

Can anyone who has experienced this bug provide more details? Did it happen during shutdown, shortly after starting, when running a bridge/relay (if so, exit or non-exit?) etc? Anything at all that might provide a hint?

comment:13 in reply to:  12 Changed 9 years ago by funkstar

Replying to Sebastian:

Can anyone who has experienced this bug provide more details? Did it happen during shutdown, shortly after starting, when running a bridge/relay (if so, exit or non-exit?) etc? Anything at all that might provide a hint?

Sebastian, it was about 2 hours after the _relay_ was started, so it was just starting to receive connections, then it killed tor.exe with this assert message.

I'm not sure what you mean by exit/non-exit, I allow all exit types that's given to me in Vidalia (web, irc, etc..)

comment:14 Changed 9 years ago by Sebastian

So, I foudn a way to reproduce this. The next question is where to find someone with windows debugging experience...

comment:15 Changed 9 years ago by nickm

Sebastian reports that he can repro this in windows7 on a 64-bit host running the Tor binary in WoW64 mode.

comment:16 Changed 9 years ago by Sebastian

I'm not sure whether what "running in WoW64 mode" means. The Task Manager reports Tor to be running as a 32bit executable.

comment:17 Changed 9 years ago by arma

Doing triage: it looks like this might be a new bug to 0.2.2.x, so it's likely we'll want to try to resolve it during the lifetime of 0.2.2. It shouldn't block the release candidate though.

comment:18 Changed 9 years ago by arma

Today on IRC:

<PTVuser> Latest alpha, win32: [Error] buf_shrink_freelists(): Bug: buffers.c:281: buf_shrink_freelists: Assertion n_to_skip + n_to_free == freelists[i].cur_length failed; aborting.

comment:19 Changed 9 years ago by Sebastian

Priority: minormajor

More reports of this bug randomly happening. Bumping to major, especially because the failure mode is quite bad: Vidalia tells people to report the bug to the developers, but without an actual bug message (see #2142).

comment:20 Changed 9 years ago by nickm

Robert Ransom solved this. The problem is that buf_shrink_freelists() can call log_info, which can log to a controller, which can make the controller allocate buffer chunks, which can affect the freelists.

08:33 < rransom> See git://repo.or.cz/tor/rransom.git bug1125-master .
08:34 < rransom> ( http://repo.or.cz/w/tor/rransom.git/ )
08:40 < nickm> ok; my current inclination is to go with your moving of the 
               log_info stuff, add the enable/disable_control_logging functions 
               to wrap the whole function, and reinstate the log_warns.   Maybe 
               also, in master, rename the enable/disable_control_logging 
               functions to something that implies fragile logging, or 
               potentially reentrant logging, or callback-driven logging
08:40 < nickm> then to backport to 0.2.1, and 0.2.2

It's not a huge amount of work, but let's do it so we can close this dumb thing and feel smart.

comment:21 Changed 9 years ago by rransom

See bug1125-022 ( git://repo.or.cz/tor/rransom.git bug1125-022 ) and bug1125-021 ( git://repo.or.cz/tor/rransom.git bug1125-021 ) for backports.

bug1125-master will produce incorrect log messages, in addition to not being what nickm wants; apply bug1125-022 and merge instead.

comment:22 Changed 9 years ago by nickm

Merged. It is good to have this one finally fixed. I am going to leave this bug open for just a little longer, to get feedback from anybody who was previously able to reproduce it reliably. The fix is merged into maint-0.2.1, maint-0.2.2, and master on git, and it should appear in 0.2.1.27, 0.2.2.18-rc, and 0.2.3.1-alpha.

comment:23 Changed 9 years ago by Sebastian

So, here's a question about the merged fix. When a tor_assert() triggers inside a disable_control_logging() section, the controller (virtually all Vidalia users) will never have a chance to provide us with a log message of what went wrong. Is it safe to send messages of severity LOG_ERR to the controller even if control logging is otherwise disabled? If it is safe, should we do it?

comment:24 Changed 9 years ago by nickm

Sebastian: I think rransom opened #2191 to address the "send log messages anyway!" issue. So far, seems hard to me.

comment:25 Changed 9 years ago by nickm

Resolution: Nonefixed
Status: acceptedclosed

Closing as fixed. Let's reopen if it appears again.

comment:26 Changed 7 years ago by nickm

Component: Tor ClientTor
Note: See TracTickets for help on using tickets.