Opened 6 years ago

Closed 6 years ago

#11737 closed defect (implemented)

Damage of cached hte_hash values for HTs leads to undefined behavior

Reported by: cypherpunks Owned by:
Priority: Medium Milestone: Tor: 0.2.6.x-final
Component: Core Tor/Tor Version:
Severity: Keywords: tor-relay, 026-triaged-1, 026-deferrable, nickm-patch
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Low budget memory modules affected by soft errors
Broken hte_hash value indirectly but drastically changes logic of code. Tor should to fix damaged hte_hash values or assert on failure. Or to be ready for such damages so it wasn't reason of another looks alike random bugs.
For example usage of HT_NEXT(_RMV) for iterating over hash table with broken hash values leads to shortage or infinite loops.

Child Tickets

Attachments (1)

detect_broken_ht.diff (2.9 KB) - added by cypherpunks 6 years ago.
simple but brutal

Download all attachments as: .zip

Change History (16)

comment:1 Changed 6 years ago by cypherpunks

Summary: Damage of cached hte_hash values for HTs leads to undefined befaviorDamage of cached hte_hash values for HTs leads to undefined behavior

comment:2 Changed 6 years ago by nickm

Keywords: tor-relay added
Milestone: Tor: 0.2.5.x-final

comment:3 Changed 6 years ago by nickm

What kind of fix/detection do you suggest? Re-calculating the hte_hash values and verifying that they are as expected? Verifying that they match the expected bucket on FIND/REPLACE/NEXT? Something else?

Changed 6 years ago by cypherpunks

Attachment: detect_broken_ht.diff added

simple but brutal

comment:4 Changed 6 years ago by cypherpunks

Attached simple solution for case of nodelist_map. It should be fast enough, if that working solution actually (can it leads to false positive?). Probably need to warn only without crash by assert, but broken could be danger. And no general solution for plan B for every hashtables, this one could be fixed by using the_nodelist->nodes to iterate over actual list to recreate HT. On the other hand it's not task for Tor to fix memory failures or do some workarounds for that case.

comment:5 Changed 6 years ago by cypherpunks

if that working solution actually

Or maybe that can skip some cases?

comment:6 Changed 6 years ago by cypherpunks

Attached simple solution

HT_REP_IS_BAD_ should be enough too, instead of new fragile construction. Or probably better to modify HT funcs to check index on duty.
It's good to check every element on every HT func but it could to cost some cpu time.

comment:7 Changed 6 years ago by nickm

What would you think about adding an HT_REP_IS_BAD_ check to the logging code for #7164 in 0.2.5, and then add more internal checks to the HT code in 0.2.6?

comment:8 Changed 6 years ago by cypherpunks

Sounds good.

comment:9 Changed 6 years ago by nickm

Status: newneeds_review

Proposed 0.2.5 logging enhancement in branch "bug11737_diagnostic" in my public repo; needs review.

comment:10 Changed 6 years ago by nickm

Milestone: Tor: 0.2.5.x-finalTor: 0.2.6.x-final
Status: needs_reviewnew

Merged that branch; "more internal checks" is now for 0.2.6

comment:11 Changed 6 years ago by nickm

Keywords: 026-triaged-1 026-deferrable added

comment:12 Changed 6 years ago by nickm

Status: newneeds_review

There are some new assertions in ticket11737 ; please review?

comment:13 Changed 6 years ago by nickm

Keywords: nickm-patch added

Add the nickm-patch keyword to some needs_review tickets where I wrote or substantially revised the patch. This helps me find which tickets I should review and which I should find reviewers for.

comment:14 Changed 6 years ago by sysrqb

these sanity checks seem sane. Also, since you created HT_BUCKET_NUM_, if I'm not mistaken, you can use it again in _HT_NEXT() and _HT_REP_IS_BAD_() - it'll make the functions slightly more readable, I think.

otherwise, lgtm. (leaving in needs_review so others will still look at it).

diff --git a/src/ext/ht.h b/src/ext/ht.h
index ee64e55..481e8f1 100644
--- a/src/ext/ht.h
+++ b/src/ext/ht.h
@@ -287,8 +287,7 @@ ht_string_hash(const char *s)
              HT_BUCKET_NUM_(head,field,(*elm)->field.hte_next,hashfn)); \
       return &(*elm)->field.hte_next;                                   \
     } else {                                                            \
-      unsigned b = (HT_ELT_HASH_(*elm, field, hashfn)                   \
-      % head->hth_table_length)+1;                                      \
+      unsigned b = HT_BUCKET_NUM_(head,field,*elm,hashfn)+1;            \
       while (b < head->hth_table_length) {                              \
         if (head->hth_table[b]) {                                       \
           HT_ASSERT_(b ==                                               \
@@ -434,7 +433,7 @@ ht_string_hash(const char *s)
       for (elm = head->hth_table[i]; elm; elm = elm->field.hte_next) {  \
         if (HT_ELT_HASH_(elm, field, hashfn) != hashfn(elm))            \
           return 1000 + i;                                              \
-        if ((HT_ELT_HASH_(elm, field, hashfn) % head->hth_table_length) != i) \
+        if (HT_BUCKET_NUM_(head,field,elm,hashfn) != i)                 \
           return 10000 + i;                                             \
         ++n;                                                            \
       }                                                                 \

comment:15 Changed 6 years ago by nickm

Resolution: implemented
Status: needs_reviewclosed

In that case, merged. Thanks!

Note: See TracTickets for help on using tickets.