Opened 9 months ago

Last modified 4 months ago

#31632 needs_revision defect

hs-v3: Service doesn't re-upload descriptor on circuit failure

Reported by: dgoulet Owned by: dgoulet
Priority: Medium Milestone: Tor: 0.4.4.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-hs, 042-deferred-20190918, 043-deferred
Cc: asn Actual Points:
Parent ID: Points: 0.5
Reviewer: asn Sponsor: Sponsor27-can

Description

I'm observing, quite often actually, a service posting its descriptor to an HSDir but the circuit collapses due to remote reason CHANNEL_CLOSED.

This is possible for many reasons where a link between two relays failed/disconnected/closed/...

However, we do not retry the upload after that which means that we can end up with HSDir(s) without our descriptor even though we think they are there.

Solution is unclear but it appears that we probably want to hook this case into hs_circ_cleanup() which is called from the mark for close function.

Child Tickets

Change History (15)

comment:1 Changed 9 months ago by nickm

Keywords: 042-deferred-20190918 added
Milestone: Tor: 0.4.2.x-finalTor: unspecified

Deferring various tickets from 0.4.2 to Unspecified.

comment:2 Changed 9 months ago by neel

Owner: set to neel
Status: newassigned

comment:3 Changed 9 months ago by neel

Owner: neel deleted

comment:4 Changed 9 months ago by neel

Status: assignednew

comment:5 Changed 9 months ago by arma

For the v2 case, check out these elements in the descriptor struct:

  /** Has descriptor been uploaded to all hidden service directories? */
  int all_uploads_performed;
  /** List of hidden service directories to which an upload request for
   * this descriptor could be sent. Smartlist exists only when at least one
   * of the previous upload requests failed (otherwise it's not important
   * to know which uploads succeeded and which not). */
  smartlist_t *successful_uploads;

So v2 has the ability to notice that it e.g. doesn't have enough directory information about a given hsdir, and it will try republishing to just those hsdirs later in that case. But v2 is still missing the feature where the hsdir info is all set, and we make the circuit and attempt the upload, and something goes wrong with the circuit or stream.

I looked at hs_circ_cleanup(), and maybe it is useful here but it's not obvious how.

My thought instead would be to put something like the above v2 state into the v3 descriptor struct, and then hook connection_dir_client_request_failed() to note that a given hsdir upload needs to be retried on the next iteration. See how connection_dir_about_to_close(), which calls that request_failed function, has a call to connection_dir_client_refetch_hsdesc_if_needed(). We could expand that refetch function to be about refetching-or-reposting, or we could make a second parallel function that is about reposting if needed.)

comment:6 Changed 8 months ago by dgoulet

Milestone: Tor: unspecifiedTor: 0.4.3.x-final
Points: 0.5

comment:7 Changed 8 months ago by dgoulet

Owner: set to dgoulet
Status: newassigned

comment:8 Changed 6 months ago by dgoulet

Need for #32020 to be merged upstream before I can fix this.

comment:9 Changed 6 months ago by dgoulet

Status: assignedneeds_review

Branch: ticket31632_043_01
PR: https://github.com/torproject/tor/pull/1593

comment:10 Changed 6 months ago by asn

Status: needs_reviewneeds_revision

Good stuff! A review has been done!

comment:11 Changed 6 months ago by dgoulet

Status: needs_revisionneeds_review

Some discussions on the PR that I need asn to go through.

comment:12 in reply to:  11 Changed 6 months ago by asn

Status: needs_reviewneeds_revision

Replying to dgoulet:

Some discussions on the PR that I need asn to go through.

Replied but probably failed to provide useful answers :/

comment:13 Changed 4 months ago by nickm

Keywords: 043-deferred added

All 0.4.3.x tickets without 043-must, 043-should, or 043-can are about to be deferred.

comment:14 Changed 4 months ago by nickm

Milestone: Tor: 0.4.3.x-finalTor: 0.4.4.x-final

comment:15 Changed 4 months ago by dgoulet

Parent ID: #30200
Sponsor: Sponsor27-mustSponsor27-can

Unparenting so we can close the parents. This will get merged into 044 at some point.

Note: See TracTickets for help on using tickets.