Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#15816 closed defect (fixed)

HS Descriptor Fetch retry behavior is broken.

Reported by: yawning Owned by:
Priority: High Milestone: Tor: 0.2.7.x-final
Component: Core Tor/Tor Version: Tor: 0.2.7
Severity: Keywords: tor-hs
Cc: dgoulet Actual Points:
Parent ID: Points: medium
Reviewer: Sponsor: SponsorR

Description

#14847 broke the HS descriptor retry logic.

rendclient.c:lookup_last_hid_serv_request()

  base32_encode(hsdir_id_base32, sizeof(hsdir_id_base32),
                hs_dir->identity_digest, DIGEST_LEN);
  tor_snprintf(hsdir_desc_comb_id, sizeof(hsdir_desc_comb_id), "%s%s",
               hsdir_id_base32,
               desc_id_base32);

This used to be onion address based, but was changed to be the descriptor ID, which is ok.

rendclient.c:purge_hid_serv_from_last_hid_serv_requests()

    if (tor_memeq(key + LAST_HID_SERV_REQUEST_KEY_LEN -
                  REND_SERVICE_ID_LEN_BASE32,
                  onion_address,
                  REND_SERVICE_ID_LEN_BASE32)) {
      iter = strmap_iter_next_rmv(last_hid_serv_requests, iter);
      tor_free(val);
    } else {

This needs a corresponding change. The routine is called from rend_client_desc_trynow() if the fetch 404s to ensure that applications aren't stuck waiting for 15 mins if a fetch happens to fail.

Fixing this will also fix "HSFETCH" not always actually sending all the events back (particularly if the descriptor is missing, and a subsequent HSFETCH is issued).

Child Tickets

Change History (8)

comment:1 Changed 4 years ago by dgoulet

Keywords: SponsorR added
Points: medium
Status: newneeds_review

You can find a fix in bug15816_027_02.

It's medium-ish size split in 4 commits. Some refactoring was needed to come up with a reasonable fix. All this work made me wonder if last_hid_serv_requests table is at all useful but I'll leave that for another ticket.

comment:2 Changed 4 years ago by special

Status: needs_reviewneeds_revision

I don't have time to look closely now; but a quick reaction:

18f61f97e73c801ab942bf92d0aa95182744d65e

  • descriptor_id and desc_id_fetch are used as NUL-terminated strings, but rend_compute_v2_desc_id actually provides a binary digest.

(looks like that was a preexisting problem too)

  • rend_data_create is not very nice API. Redundant (pk_digest and onion_address), mutually exclusive (onion_address and desc_id), and confusing (descriptor_cookie and cookie) parameters. Weird to take descriptor_cookie but not auth_type.

comment:3 in reply to:  2 Changed 4 years ago by dgoulet

Replying to special:

I don't have time to look closely now; but a quick reaction:

18f61f97e73c801ab942bf92d0aa95182744d65e

  • descriptor_id and desc_id_fetch are used as NUL-terminated strings, but rend_compute_v2_desc_id actually provides a binary digest.

(looks like that was a preexisting problem too)

Good catch! Fixed that, now using memcpy.

  • rend_data_create is not very nice API. Redundant (pk_digest and onion_address), mutually exclusive (onion_address and desc_id), and confusing (descriptor_cookie and cookie) parameters. Weird to take descriptor_cookie but not auth_type.

True, new branch seperates them by service and client using the right arguments that each needs.

Also FYI, if we merge this, #15822 is fixed with this work. :)

See bug15816_027_03.

comment:4 Changed 4 years ago by dgoulet

Status: needs_revisionneeds_review

comment:5 Changed 4 years ago by special

Status: needs_reviewneeds_revision

025e135155dd70febbc98eef0fa6bbc7ef32f7f6

rendclient.c:

+  } else if (query->desc_id_fetch[0] != '\0') {
+    ret = fetch_v2_desc_by_descid(query->desc_id_fetch, query, hsdirs);

Another case where those are used as strings. This one will make 1/256 descriptors fail.

rendcommon.c:

+ * needed. Both can be given but the onion address will always be used first
+ * to make a descriptor fetch.

Does that mean the descriptor id will be ignored, or that it will be used after the onion id, or do we not know? If there isn't any logical behavior when both are non-NULL, maybe it shouldn't accept both.

98fd8a65471ada74e49710f5f112394298e869d2

purge_hid_serv_from_last_hid_serv_requests docs need to be updated, especially to mention that it takes an unencoded desc_id of DIGEST_LEN bytes.

a35440dca4e93ec4caf9ff88d9122a96314fbc32

rendclient.c:

+      /* Not equal from what we currently have so purge the last hid serv
+       * request cache and update the descriptor ID with the new value. */
+      purge_hid_serv_from_last_hid_serv_requests(descriptor_id);

This purges the new descriptor id, not the old one.

Does it matter that when we updating descriptor IDs in fetch_v2_desc_by_addr, we don't touch IDs for all replicas?

---

After that, I think this will be LGTM. Sorry I didn't read on and catch this stuff yesterday.

comment:6 Changed 4 years ago by dgoulet

Status: needs_revisionneeds_review

Big thanks. Here are the fixes: bug15816_027_04.

Does it matter that when we updating descriptor IDs in fetch_v2_desc_by_addr, we don't touch IDs for all replicas?

I'm not sure to understand but the loop here makes sure all replicas are updated if needed.

Does that mean the descriptor id will be ignored, or that it will be used after the onion id, or do we not know? If there isn't any logical behavior when both are non-NULL, maybe it shouldn't accept both.

Yes, I made a slight change to make it more obvious. There are no use cases right now that uses both but I kept the door open for the future if we wanted to have both in it for whatever reasons.

comment:7 Changed 4 years ago by nickm

Resolution: fixed
Status: needs_reviewclosed

Great; merged! (with +1 from special)

comment:8 Changed 4 years ago by dgoulet

Keywords: SponsorR removed
Sponsor: SponsorR
Note: See TracTickets for help on using tickets.