Opened 10 months ago

Closed 4 months ago

#28269 closed defect (implemented)

Repeated HSFETCH queries fail with QUERY_NO_HSDIR

Reported by: atagar Owned by: neel
Priority: Medium Milestone: Tor: 0.4.1.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: tor-hs, tor-control tor-spec
Cc: irl, neel Actual Points:
Parent ID: Points:
Reviewer: asn Sponsor:

Description

Hi lovely network team. Iain is working on a script to inform him of hidden service descriptor age. He's using stem to fetch them but unfortunately after a few HSFETCH calls it begins to fail with QUERY_NO_HSDIR. Maybe some attempt at rate limiting? Once the tor process gets into a borked state it doesn't seem to recover until I bounce the tor process.

Here's an example of my first query (which succeeds)...

>>> SETEVENTS HS_DESC HS_DESC_CONTENT
250 OK

>>> HSFETCH facebookcorewwwi
250 OK

>>> /events
HS_DESC_CONTENT facebookcorewwwi 3jjxhgi72xguhnihk4nwzrcpyvwf3ml5 $DA722ECCB9C0DE462C4FA585B93C99CCCA1C7547~SIRDRAKE2018
rendezvous-service-descriptor 3jjxhgi72xguhnihk4nwzrcpyvwf3ml5
version 2
permanent-key
-----BEGIN RSA PUBLIC KEY-----
MIGJAoGBALfng/krEfrBcvblDiM3PAkowkiAKxLoTsXt3nPEzyTP6Cw+Gdr0ODje
hmxTngN1pKiH7szk4Q1p2RabOrUHWwXmGXeDDNs00fcyU6HupgqsCoKOqCsmPac6
/58apC64A7xHeS02wtfWJp6qiZ8i6GGu6xWXRWux+ShPgcHvkajRAgMahU8=
-----END RSA PUBLIC KEY-----
secret-id-part wuyfue4erz2hlh4d5o4kduweosco6rw4
publication-time 2018-10-31 18:00:00
protocol-versions 2,3
introduction-points
-----BEGIN MESSAGE-----
aW50cm9kdWN0aW9uLXBvaW50IHR6c3U3ZXh5amVlZWU2cm92cDd1Z25lM3pxemlo
ZHk1CmlwLWFkZHJlc3MgMTQ0Ljc2Ljk2LjYKb25pb24tcG9ydCA5MDAxCm9uaW9u
... lot more base64...
L1UycythTGNtYmhITGJPWlhZRjRMcERTV1R3THkwb0ZzTEFnTUJBQUU9Ci0tLS0t
RU5EIFJTQSBQVUJMSUMgS0VZLS0tLS0KCg==
-----END MESSAGE-----
signature
-----BEGIN SIGNATURE-----
cMLWu42NG+I5hH9QAHZUQ8eDGCUzcny/uN/FwAYiLsUSc3QLg7MKbRTZ3v2ARonB
wcUgEAGpO4wDjuEj2ivNmpt6U0smJ7nM5KTWy4l0732QnTeSEX+P53qIJ7KwxDro
+JyBARfK4orCMieHuxYtJop6YgVRQ8XN6NtP6NiDrWY=
-----END SIGNATURE-----
OK
HS_DESC RECEIVED facebookcorewwwi NO_AUTH $DA722ECCB9C0DE462C4FA585B93C99CCCA1C7547~SIRDRAKE2018 3jjxhgi72xguhnihk4nwzrcpyvwf3ml5
HS_DESC REQUESTED facebookcorewwwi NO_AUTH $DA722ECCB9C0DE462C4FA585B93C99CCCA1C7547~SIRDRAKE2018 3jjxhgi72xguhnihk4nwzrcpyvwf3ml5

... and here's an example of my fourth onward, which always fail.

>>> /events clear
cleared event backlog

>>> HSFETCH facebookcorewwwi
250 OK

>>> /events
HS_DESC_CONTENT facebookcorewwwi 3jjxhgi72xguhnihk4nwzrcpyvwf3ml5 UNKNOWN

OK
HS_DESC FAILED facebookcorewwwi NO_AUTH UNKNOWN REASON=QUERY_NO_HSDIR
HS_DESC_CONTENT facebookcorewwwi csq76xfzmtyepzmkhzz2lb7ja3vmylwu UNKNOWN

OK
HS_DESC FAILED facebookcorewwwi NO_AUTH UNKNOWN REASON=QUERY_NO_HSDIR

I'm filing this on irl's behalf. If anything's needed from me on the stem front just let me know.

Child Tickets

Change History (22)

comment:1 Changed 10 months ago by dgoulet

There is. Upon a successful fetch of a descriptor, you can't query again that HSDir until:

/** The period for which a hidden service directory cannot be queried for
 * the same descriptor ID again. */
#define REND_HID_SERV_DIR_REQUERY_PERIOD (15 * 60)

However, the "HSDir queried time" cache can be purged with a NEWNYM signal.

But please, lets be careful here, there is a reason why that limit is there, to avoid network load and bombarding HSDir with requests like for example a client script that could be to insistent. ;)

comment:2 Changed 10 months ago by atagar

Great, thanks David! Maybe we can provide a more helpful controller response? QUERY_NO_HSDIR indicates that we don't have a hidden service directory available. Not that we're being throttled, or what to do.

comment:3 in reply to:  2 Changed 10 months ago by dgoulet

Keywords: tor-hs tor-control added
Milestone: Tor: unspecified

Replying to atagar:

Great, thanks David! Maybe we can provide a more helpful controller response? QUERY_NO_HSDIR indicates that we don't have a hidden service directory available. Not that we're being throttled, or what to do.

There are two cases where you get QUERY_NO_HSDIR:

  1. Rate limited and no more are available from the set you can query.
  1. All of the HSDir do not have the .onion descriptor so you have no more to query.

In both cases, they could probably benefit to have their own distinctive error code.

comment:4 Changed 10 months ago by irl

Cc: irl added

Ok. It would definitely help me to be able to handle the rate limiting by informing the user that they have incorrectly configured the check and should have it only run every 30 minutes.

For now I'll just leave it unhandled and leave a comment in the script.

comment:5 Changed 5 months ago by neel

Cc: neel added
Owner: set to neel
Status: newassigned

comment:6 Changed 5 months ago by neel

Status: assignedneeds_review

comment:7 Changed 5 months ago by irl

The changelog for the tor PR looks good, I did not look at the code.

The torspec PR does not mention which version of tor implements the new response, and seems to imply it appeared at the same time as the NO_HSDIR response, which is incorrect.

I'm not a network team person, but I'd suggest that the spec PR needs revision.

comment:8 Changed 5 months ago by neel

I totally forgot about the version numbers.

I have pushed the changes to the torspec PR.

comment:9 Changed 4 months ago by irl

All the details seem to be in the spec PR now.

comment:10 Changed 4 months ago by nickm

Keywords: tor-spec added
Milestone: Tor: unspecifiedTor: 0.4.1.x-final

comment:11 Changed 4 months ago by asn

Reviewer: asn

comment:12 Changed 4 months ago by asn

Status: needs_reviewmerge_ready

LGTM! Just a minor nitpick, please turn is_rate_limited and rate_limited into bools to make their purpose more clear. In particular, it's not clear what rate_limited is from its name, so turning it into a bool will be easier to read the code.

After that, we can put this in merge_ready. Thanks!

comment:13 Changed 4 months ago by asn

Status: merge_readyneeds_revision

comment:14 Changed 4 months ago by neel

Status: needs_revisionmerge_ready

I made rate_limited and is_rate_limited bools.

comment:15 Changed 4 months ago by asn

Status: merge_readyneeds_revision

That looks good but also perhaps let's initialize rate_limited to false just to make it clear and future-proof.

Also, I think there might be a small bug here in the case where we go in hs_pick_hsdir() with an empty responsible_dirs (as indicated by the comment in directory_get_from_hs_dir()). In that case rate_limited will get toggled to true because of rate_limited = rate_limited_count == responsible_dirs_count;, even tho it should be falsebecause the issue has nothing to do with rate limiting.

Putting this back in needs_rev.

comment:16 Changed 4 months ago by neel

Status: needs_revisionneeds_review

I made the corrections necessary.

I have also fixed the bug regarding rate_limited = rate_limited_count == responsible_dirs_count; by only setting rate_limited if rate_limited_count or responsible_dirs_count is greater than 0.

comment:17 in reply to:  16 Changed 4 months ago by asn

Status: needs_reviewneeds_revision

Replying to neel:

I made the corrections necessary.

Almost there! I meant to initialize the rate_limited of hs_pick_hsdir() not the other one, but again I did not clarify.

Now, if responsible_dirs_count is zero, then rate_limited will be uninitialized and since it's on stack it can have garbage value, and give bad results to is_rate_limited.

comment:18 Changed 4 months ago by neel

Status: needs_revisionneeds_review

Pushed it!

comment:19 Changed 4 months ago by asn

Status: needs_reviewmerge_ready

Yes, looks great! Thanks!

comment:20 Changed 4 months ago by nickm

Seems reasonable; merging.

comment:21 Changed 4 months ago by nickm

Merged, and tweaked new argument name to end with _out since that's our convention.

comment:22 Changed 4 months ago by nickm

Resolution: implemented
Status: merge_readyclosed
Note: See TracTickets for help on using tickets.