Opened 7 years ago

Closed 2 years ago

#8795 closed defect (fixed)

Make #8822 survivable

Reported by: cypherpunks
Owned by:
Priority: High
Milestone: Tor: unspecified
Component: Core Tor/Tor
Version: Tor: 0.2.4.12-alpha
Severity: Normal
Keywords: tor-client, regression
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

I had run this instance of Tor about six times before this error appeared. After that, Tor crashed on every attempt to run the Tor Browser Bundle. I had to delete the folder and re-extract the bundle to get it running again. Windows reported that tor.exe stopped running, so I am reporting this as a problem with Tor.

[Notice] Tor v0.2.4.12-alpha (git-91b8bc26f160f172) running on Windows 7 with Libevent 2.0.21-stable and OpenSSL 1.0.0k.
[Notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
[Notice] This version is not a stable Tor release. Expect more bugs than usual.
[Notice] Read configuration file "K:\Router\Tor Browser\Data\Tor\torrc".
[Warning] You have asked to exclude certain relays from all positions in your circuits. Expect hidden services and other Tor features to be broken in unpredictable ways.
[Notice] Opening Socks listener on 127.0.0.1:9150
[Notice] Opening Control listener on 127.0.0.1:9151
[Notice] Parsing GEOIP IPv4 file .\Data\Tor\geoip.
[Notice] Parsing GEOIP IPv6 file .\Data\Tor\geoip6.
[Warning] Error replacing "K:/Router/Tor Browser/Data/Tor\cached-microdescs": Permission denied
[Warning] Error rebuilding microdescriptor cache: Permission denied
[Notice] We now have enough directory information to build circuits.
[Notice] Bootstrapped 80%: Connecting to the Tor network.
[Notice] New control connection opened.
[Error] getinfo_helper_dir(): Bug: control.c:1715: getinfo_helper_dir: Assertion md->body failed; aborting.

Change History (27)

comment:1 Changed 7 years ago by cypherpunks

Keywords: getinfo_helper_dir added
Summary: Error rebuilding microdescriptor cache: Permission denied → [Error] getinfo_helper_dir()

comment:3 Changed 7 years ago by arma

Summary: [Error] getinfo_helper_dir() → control.c:1715: getinfo_helper_dir: Assertion md->body failed

comment:4 Changed 7 years ago by nickm

Status: new → needs_review

[Warning] Error replacing "K:/Router/Tor Browser/Data/Tor\cached-microdescs": Permission denied
[Warning] Error rebuilding microdescriptor cache: Permission denied

This part looks like a possible duplicate of #8822, a.k.a. #2077.

[Error] getinfo_helper_dir(): Bug: control.c:1715: getinfo_helper_dir: Assertion md->body failed; aborting.

Well, that's bizarre. Since when do we allow a microdescriptor without a body? I suppose that must be provoked by a failure to write to the microdesc cache.

I've done a possible fix as branch "bug8795". I think that 6905c1f60d is responsible.

comment:5 Changed 7 years ago by nickm

Status: needs_review → needs_revision
Summary: control.c:1715: getinfo_helper_dir: Assertion md->body failed → Make #2077 survivable

There's more work to do here so that the other mds don't get reindexed if the replace operation fails.
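A minimal sketch of the "don't reindex on failure" idea, in Python rather than Tor's C. This is a toy model and not the code in branch "bug8795": the class `MicrodescCache`, its fields, and the injectable `replace` parameter are all hypothetical. The point is only that the new index is committed after the replace succeeds, so a failed replace leaves every entry pointing at data that still exists.

```python
import os


class MicrodescCache:
    """Toy cache: bodies live in memory, offsets index the on-disk file."""

    def __init__(self, path):
        self.path = path
        self.bodies = {}   # digest -> body bytes
        self.offsets = {}  # digest -> offset in the current on-disk cache

    def add(self, digest, body):
        self.bodies[digest] = body

    def rebuild(self, replace=os.replace):
        """Rewrite the cache file; return True only if the replace succeeded."""
        tmp = self.path + ".tmp"
        new_offsets = {}
        with open(tmp, "wb") as f:
            for digest, body in self.bodies.items():
                new_offsets[digest] = f.tell()
                f.write(body)
        try:
            replace(tmp, self.path)
        except OSError:
            # Replace failed (e.g. "Permission denied"): drop the temp file
            # and keep the old index untouched.
            os.unlink(tmp)
            return False
        # Commit the new index only after the file is really in place.
        self.offsets = new_offsets
        return True
```

Passing a `replace` function that always raises `PermissionError` forces the failure path without needing a Windows machine or special filesystem setup.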

comment:6 Changed 6 years ago by nickm

Status: needs_revision → needs_review

Okay, I have a much more thorough workaround in branch "bug8795_v2" in my public repository.

comment:7 Changed 6 years ago by andrea

This looks okay to me; for testing, maybe we can force a failure by exclusive-locking cached-microdescs? E.g., something like "flock -x /var/tor/cached-microdescs sleep <n>" to grab and hold for n seconds.

comment:8 Changed 6 years ago by andrea

I'll amend that; flock is just advisory locking by default. The fcntl(2) man page makes it sound like you can get the Windows-like mandatory locking behavior on Linux by mounting the filesystem with -o mand, though. This could probably be tested with its datadir on a loopback-mounted filesystem with that option and flock.
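To illustrate why an advisory lock would not force the failure, here is a small Python probe (Unix only). The function name `probe_advisory_flock` is made up for this sketch. It takes an exclusive `flock` on one descriptor, then opens the same file a second time and checks two things: whether a second lock attempt is refused, and whether a plain write goes through anyway.

```python
import fcntl
import os
import tempfile


def probe_advisory_flock():
    """Return (second_lock_refused, write_succeeded) for a flock-held file."""
    fd, path = tempfile.mkstemp()
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)      # holder takes the exclusive lock
        other = os.open(path, os.O_WRONLY)  # independent open of the same file
        try:
            try:
                fcntl.flock(other, fcntl.LOCK_EX | fcntl.LOCK_NB)
                second_lock_refused = False
            except BlockingIOError:
                second_lock_refused = True
            # write() does not consult flock locks, so this is not blocked.
            write_succeeded = os.write(other, b"data") == 4
        finally:
            os.close(other)
    finally:
        os.close(fd)
        os.unlink(path)
    return second_lock_refused, write_succeeded
```

On a local Linux filesystem without mandatory locking, the second lock is refused but the write still succeeds, which is why holding `flock -x` alone would not make Tor's write fail.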

comment:9 Changed 6 years ago by andrea

More wackiness with mandatory locking:

"To make use of mandatory locks, mandatory locking must be enabled both on the file system that contains the file to be locked, and on the file itself. Mandatory locking is enabled on a file system using the "-o mand" option to mount(8), or the MS_MANDLOCK flag for mount(2). Mandatory locking is enabled on a file by disabling group execute permission on the file and enabling the set-group-ID permission bit (see chmod(1) and chmod(2))."

Sheesh...

comment:10 Changed 6 years ago by nickm

gevalt! maybe I'd be better off just making the mmap or write fail by building a Tor where it always fails.

comment:11 in reply to:  10 ; Changed 6 years ago by andrea

Replying to nickm:

gevalt! maybe I'd be better off just making the mmap or write fail by building a Tor where it always fails.

Yeah, that could work too. Alternately, maybe chown/chmod the file out from under it so it fails for permission reasons.

comment:12 in reply to:  11 Changed 6 years ago by rransom

Replying to andrea:

Replying to nickm:

gevalt! maybe I'd be better off just making the mmap or write fail by building a Tor where it always fails.

Yeah, that could work too. Alternately, maybe chown/chmod the file out from under it so it fails for permission reasons.

That won't work -- file-access permission checks are normally done when a file is opened, not when a file descriptor is used.
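The open-time permission check can be seen with a short Python probe (Unix only; `probe_permission_check_timing` is a name invented for this sketch). It opens a file, strips all permission bits, and then tries both a write through the already-open descriptor and a fresh open.

```python
import os
import tempfile


def probe_permission_check_timing():
    """Return (write_via_open_fd_ok, fresh_open_denied)."""
    fd, path = tempfile.mkstemp()
    try:
        os.chmod(path, 0)  # revoke every permission bit after the open
        # The existing descriptor was authorized at open time, so it still works.
        write_ok = os.write(fd, b"still writable") > 0
        try:
            os.close(os.open(path, os.O_WRONLY))
            fresh_open_denied = False  # e.g. when running as root
        except PermissionError:
            fresh_open_denied = True
    finally:
        os.close(fd)
        os.unlink(path)
    return write_ok, fresh_open_denied
```

For an unprivileged user the write succeeds while the fresh open is denied. Root bypasses the permission check, so the second value is not meaningful there.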

You could fill the filesystem so write returns ENOSPC, or unplug the disk so the OS starts reporting real errors immediately. Or just find a way to test on Windows.

comment:13 Changed 6 years ago by nickm

Except, start_writing_to_file opens an fd.

comment:14 in reply to:  13 Changed 6 years ago by andrea

Replying to nickm:

Except, start_writing_to_file opens an fd.

Yeah; I checked with lsof before I said anything about flock. If it holds the fd open continuously you wouldn't be able to mandatory-lock it either.

comment:15 in reply to:  13 Changed 6 years ago by rransom

Replying to nickm:

Except, start_writing_to_file opens an fd.

start_writing_to_file as called by microdesc_cache_rebuild opens an fd for a non-existent file, so calling chown or chmod on the destination file wouldn't work either.

To cause finish_writing_to_file to fail on Unix, one would have to change the containing directory's permissions at exactly the right time. On Windows NT, it's much easier to make DeleteFile fail: just keep the file ‘open for normal I/O or as a memory-mapped file’ -- as microdesc_cache_rebuild does.
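The platform difference can be probed with a small Python sketch (the function name `replace_while_mapped` and the file contents are made up). It maps a file, replaces it on disk while the mapping is still open, and reports what happened. On POSIX systems the replace is expected to succeed and the old mapping stays readable; on Windows, going by the comment above, the replace is expected to fail with an error.

```python
import mmap
import os
import tempfile


def replace_while_mapped():
    """Return (replaced, bytes_seen_through_old_mapping, bytes_on_disk)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "cached-microdescs")
        tmp = path + ".tmp"
        with open(path, "wb") as f:
            f.write(b"old body")
        with open(tmp, "wb") as f:
            f.write(b"new body")
        with open(path, "rb") as f:
            mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            os.replace(tmp, path)  # replace the file while it is still mapped
            replaced = True
        except OSError:
            replaced = False       # the outcome the ticket reports on Windows
        seen = mapping[:]          # the old mapping remains valid either way
        mapping.close()            # unmap before the directory is cleaned up
        with open(path, "rb") as f:
            on_disk = f.read()
        return replaced, seen, on_disk
```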

comment:16 Changed 6 years ago by nickm

Keywords: tor-client 024-backport added; Tor Permisson denied getinfo_helper_dir removed
Milestone: Tor: 0.2.4.x-final → Tor: 0.2.5.x-final
Summary: Make #2077 survivable → Make #8822 survivable

Thanks for the tip. It looks like the underlying bug is not #2077. Bug #8822 (and the original report of this) were both caused by the change to the order of unmap and finish_writing_to_file in 6905c1f6. I'm using #8822 to track the bug reintroduced there, and saving this ticket to track the idea of making a #8822 failure survivable.
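A sketch of the ordering issue, in Python and under an assumption: the ticket says only that 6905c1f6 changed the order of the two steps, so "unmap first, then replace, then map again" is inferred here from the DeleteFile discussion above, not taken from the actual fix. The function `rebuild_mapped_cache` is hypothetical.

```python
import mmap
import os


def rebuild_mapped_cache(path, new_contents, old_map=None):
    """Write new_contents (non-empty) to path and return a fresh read-only map."""
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(new_contents)
    # Step 1: release the old mapping so nothing holds the target file open.
    if old_map is not None:
        old_map.close()
    # Step 2: only now move the new file into place.
    os.replace(tmp, path)
    # Step 3: map the new file; the mapping outlives the file object.
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

Callers must not keep pointers into the old mapping across the call, since it is closed in step 1.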

comment:17 Changed 6 years ago by andrea

Keywords: 025-triaged added

comment:18 Changed 6 years ago by nickm

Milestone: Tor: 0.2.5.x-final → Tor: 0.2.6.x-final
Status: needs_review → needs_revision

comment:19 Changed 5 years ago by nickm

Resolution: wontfix
Status: needs_revision → closed

I believe we squashed the underlying bug here. If we didn't, we should reopen this ticket.

comment:20 Changed 4 years ago by gk

Milestone: Tor: 0.2.6.x-final → Tor: 0.2.8.x-final
Resolution: wontfix
Severity: Normal
Status: closed → reopened

comment:21 in reply to:  20 Changed 4 years ago by nickm

Keywords: 027-backport 026-backport regression added; 024-backport 025-triaged removed
Priority: Medium → High

Replying to gk:

Seems the issue is still happening: https://blog.torproject.org/blog/tor-browser-503-released#comment-113899

Note: that's Tor 0.2.6.10.

comment:22 Changed 4 years ago by nickm

Milestone: Tor: 0.2.8.x-final → Tor: 0.2.???

It is impossible that we will fix all 277 currently open 028 tickets before 028 releases. Time to move some out. This is my first pass through the "new" and "reopened" tickets, looking for things to move to ???.

comment:23 Changed 3 years ago by teor

Milestone: Tor: 0.2.??? → Tor: 0.3.???

Milestone renamed

comment:24 Changed 3 years ago by nickm

Keywords: tor-03-unspecified-201612 added
Milestone: Tor: 0.3.??? → Tor: unspecified

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

comment:25 Changed 2 years ago by nickm

Keywords: tor-03-unspecified-201612 removed

Remove an old triaging keyword.

comment:26 Changed 2 years ago by nickm

Keywords: 027-backport 026-backport removed

comment:27 Changed 2 years ago by nickm

Resolution: fixed
Status: reopened → closed