It's not an instance of that, I believe. The bug in the "hoo boy" comment is that nothing keeps two descriptors from having the same extrainfo_digest, and so there's no guarantee that a descriptor sd will appear in desc_by_eid_map[sd->extrainfo_digest], since some other descriptor might appear there.
But this bug is different: there's an entry sd in desc_by_eid_map[x] such sd->extrainfo_digest != x, which implies that desc_by_eid_map is inconsistent. This could happen because of three clauses:
Adding someting to desc_by_eid_map incorrectly
Changing an extrainfo_digest entry after adding it to desc_by_eid_map
Memory corruption.
Investigating further...
Trac: Milestone: N/Ato Tor: 0.2.2.x-final Priority: normal to major
.... as far as I can tell, none of the first 2 possible causes (not "clauses," apparently I can't type today) is happening. Revised possibilities seem to be:
Adding someting to desc_by_eid_map incorrectly -- except that everywhere we add to desc_by_eid_map, I believe we are doing it right.
Changing an extrainfo_digest entry after adding it to desc_by_eid_map -- except that nothing modifies an extrainfo_digest once it's set.
Freeing a descriptor without removing it from desc_by_eid_map, which poisons the whole routerinfo_t. This is likeliest to be happening with a routerinfo_t, since there's only two places that call signed_descriptor_free(), and they both look safe to me.
Generic corruption -- always conceivable, though if this is the cause here, it'll be hard to trace it down.
So the next step is to audit every routerinfo_free and make sure it's safe.
suspicious: routerlist_insert_old and signed_descriptor_from_routerinfo both free a routerinfo without guaranteeing that it's been removed from desc_by_eid_map. If either one can be called on a routerinfo_t that's already in the desc_by_eid_map without first removing it, that would account for this bug.
Hm, no. Every instance that calls signed_descriptor_from_routerinfo() seems to maintain desc_by_eid_map correctly, and everything that calls routerlist_insert_old seems to do so with a not-already-added descriptor. I'm hitting a wall here; will come back to look and think more later.
whabib -- if this ever happens again and you can get a stack trace, that would be great.
wanoskarnet reports that routerlist_replace() overwrites map's pointer to a new cache_info if ri_new->cache_info.extra_info_digest and ri_old->cache_info.extra_info_digest are the same.
Hm. We're talking about overwriting the desc_by_eid_map entry? I don't see how that can cause this bug. If both ri_new and ri_old have the same nonzero extra_info_digest, then:
First, we sdmap_set(rl->desc_by_eid_map, ri_new->cache_info.extra_info_digest, &ri_new->cache_info);
Next, if we decide to keep the old routerinfo, we make ri_old into an sd, and call sdmap_set(rl->desc_by_eid_map, sd->extra_info_digest, sd);. That makes the map entry point to sd, which shouldn't cause this particular assertion.
...or if we decide NOT to keep the old routerinfo, and the descriptors are different, we call sdmap_remove(rl->desc_by_eid_map, ri_old->cache_info.extra_info_digest);. This would make there be no entry for this particular extrainfo_digest, which also wouldn't cause this assertion AFAICT.
OTOH, if we decide not to keep the old routerinfo, and the descriptors are the same, they necessarily have the same extrainfo_digest, so there can't be a dangling pointer in the map.
Just for the record, I have not seen a recurrence of this bug in a real long time. We can close it, unless you guys are using it to track something else.