Opened 9 years ago
Closed 5 years ago
#3390 closed defect (worksforme)
Tor exit node crashed
Reported by: | whabib | Owned by: | |
---|---|---|---|
Priority: | High | Milestone: | Tor: unspecified |
Component: | Core Tor/Tor | Version: | |
Severity: | Keywords: | tor-relay | |
Cc: | whabib@… | Actual Points: | |
Parent ID: | Points: | ||
Reviewer: | Sponsor: |
Description
My tor exit node crashed out of the blue yesterday. I have since restarted. The router fingerprint is:
Krugman DFB1A14FD43679D57818AB706389892C4AB64A65
The line from the log file when it crashed was:
Jun 11 19:10:27.283 [err] Bug: routerlist.c:4783: routerlist_assert_ok: Assertion !memcmp(sd->extra_info_digest, d, DIGEST_LEN) failed; aborting.
I am running Tor 0.2.1.30 on CentOS linux.
Child Tickets
Change History (16)
comment:1 Changed 9 years ago by
Cc: | whabib@… added |
---|---|
Component: | - Select a component → Tor Relay |
comment:2 Changed 9 years ago by
comment:3 Changed 9 years ago by
Milestone: | → Tor: 0.2.2.x-final |
---|---|
Priority: | normal → major |
It's not an instance of that, I believe. The bug in the "hoo boy" comment is that nothing keeps two descriptors from having the same extrainfo_digest, and so there's no guarantee that a descriptor sd will appear in desc_by_eid_map[sd->extrainfo_digest], since some other descriptor might appear there.
But this bug is different: there's an entry sd in desc_by_eid_map[x] such sd->extrainfo_digest != x, which implies that desc_by_eid_map is inconsistent. This could happen because of three clauses:
- Adding someting to desc_by_eid_map incorrectly
- Changing an extrainfo_digest entry after adding it to desc_by_eid_map
- Memory corruption.
Investigating further...
comment:4 Changed 9 years ago by
.... as far as I can tell, none of the first 2 possible causes (not "clauses," apparently I can't type today) is happening. Revised possibilities seem to be:
- Adding someting to desc_by_eid_map incorrectly -- except that everywhere we add to desc_by_eid_map, I believe we are doing it right.
- Changing an extrainfo_digest entry after adding it to desc_by_eid_map -- except that nothing modifies an extrainfo_digest once it's set.
- Freeing a descriptor without removing it from desc_by_eid_map, which poisons the whole routerinfo_t. This is likeliest to be happening with a routerinfo_t, since there's only two places that call signed_descriptor_free(), and they both look safe to me.
- Generic corruption -- always conceivable, though if this is the cause here, it'll be hard to trace it down.
So the next step is to audit every routerinfo_free and make sure it's safe.
comment:5 Changed 9 years ago by
suspicious: routerlist_insert_old and signed_descriptor_from_routerinfo both free a routerinfo without guaranteeing that it's been removed from desc_by_eid_map. If either one can be called on a routerinfo_t that's already in the desc_by_eid_map without first removing it, that would account for this bug.
comment:6 Changed 9 years ago by
Hm, no. Every instance that calls signed_descriptor_from_routerinfo() seems to maintain desc_by_eid_map correctly, and everything that calls routerlist_insert_old seems to do so with a not-already-added descriptor. I'm hitting a wall here; will come back to look and think more later.
whabib -- if this ever happens again and you can get a stack trace, that would be great.
comment:7 Changed 9 years ago by
wanoskarnet reports that routerlist_replace()
overwrites map
's pointer to a new cache_info
if ri_new->cache_info.extra_info_digest
and ri_old->cache_info.extra_info_digest
are the same.
comment:8 Changed 8 years ago by
Hm. We're talking about overwriting the desc_by_eid_map entry? I don't see how that can cause this bug. If both ri_new and ri_old have the same nonzero extra_info_digest, then:
- First, we
sdmap_set(rl->desc_by_eid_map, ri_new->cache_info.extra_info_digest, &ri_new->cache_info);
- Next, if we decide to keep the old routerinfo, we make ri_old into an sd, and call
sdmap_set(rl->desc_by_eid_map, sd->extra_info_digest, sd);
. That makes the map entry point to sd, which shouldn't cause this particular assertion. - ...or if we decide NOT to keep the old routerinfo, and the descriptors are different, we call
sdmap_remove(rl->desc_by_eid_map, ri_old->cache_info.extra_info_digest);
. This would make there be no entry for this particular extrainfo_digest, which also wouldn't cause this assertion AFAICT. - OTOH, if we decide not to keep the old routerinfo, and the descriptors are the same, they necessarily have the same extrainfo_digest, so there can't be a dangling pointer in the map.
comment:9 Changed 8 years ago by
Ah, and looking at backlog, it seems that wanoskarnet may have changed his mind shortly after you posted this. :/
comment:10 Changed 8 years ago by
Status: | new → needs_information |
---|
Did this ever happen again? Maybe we can get this reproduced somehow
comment:11 Changed 7 years ago by
Milestone: | Tor: 0.2.2.x-final → Tor: 0.2.4.x-final |
---|
comment:12 Changed 7 years ago by
Keywords: | tor-relay added |
---|
comment:13 Changed 7 years ago by
Component: | Tor Relay → Tor |
---|
comment:14 Changed 7 years ago by
Milestone: | Tor: 0.2.4.x-final → Tor: unspecified |
---|
comment:15 Changed 7 years ago by
Just for the record, I have not seen a recurrence of this bug in a real long time. We can close it, unless you guys are using it to track something else.
comment:16 Changed 5 years ago by
Resolution: | → worksforme |
---|---|
Status: | needs_information → closed |
2 years old and whatbib did mention that the bug has not been seen in a long time.
I wonder if this is another instance of the issue that Nickm commented starting with "Hoo boy". (I see the comment has survived into master too.)