Opened 2 years ago

Closed 13 months ago

#24346 closed defect (fixed)

prop224: Service stops uploading one of its two descriptors

Reported by: asn Owned by: dgoulet
Priority: Medium Milestone: Tor: unspecified
Component: Core Tor/Tor Version: Tor: 0.3.2.4-alpha
Severity: Normal Keywords: prop224, tor-hs, 034-triage-20180328, 034-removed-20180328
Cc: Actual Points:
Parent ID: Points: 2
Reviewer: Sponsor:

Description

My prop224 service stopped working. I did some digging and I noticed that it's only uploading one of its two descriptors, so it's only uploading next or current but not both. This has reachability consequences.

Here are some logs (grepping for run_upload_descriptor_event()):

Nov 16 15:06:51.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 16 15:51:04.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 16 17:00:25.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 16 17:06:02.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 16 18:22:23.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 16 19:05:22.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 16 19:05:22.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 16 20:13:08.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 16 20:20:11.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 16 21:18:50.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 16 21:30:21.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 16 22:49:34.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 17 00:23:30.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 17 01:08:16.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 5/3 introduction points.
Nov 17 01:50:14.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 03:37:20.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 04:40:29.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 05:30:14.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 06:41:39.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 08:11:41.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 09:18:45.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 10:40:24.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 12:22:55.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 13:30:14.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 14:39:14.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 14:40:15.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 16:13:06.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 18:01:49.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 19:46:57.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 17 20:01:08.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 20:16:59.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 20:52:18.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 21:28:40.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 17 21:37:22.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 17 22:31:59.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 17 23:00:11.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 18 00:17:02.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 18 00:22:14.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 18 01:12:15.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 18 01:12:15.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 18 02:44:15.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 18 02:44:16.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 3/3 introduction points.
Nov 18 04:09:30.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 18 06:01:37.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 18 07:03:35.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 18 08:53:53.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 18 10:50:47.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 18 11:50:15.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 18 13:33:04.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 18 14:54:15.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.

Seems to be a pretty rare edge case, but putting it on 0.3.2 in any case. We can defer if needed.

We probably need some more diagnostic logs in should_service_upload_descriptor() to understand more about this issue.

Child Tickets

Attachments (1)

bug24346.log.gz (1.3 MB) - added by asn 2 years ago.

Download all attachments as: .zip

Change History (14)

comment:1 Changed 2 years ago by asn

Uploading info logs for this issue. Still haven't add debug logs on should_service_upload_descriptor() yet.

Changed 2 years ago by asn

Attachment: bug24346.log.gz added

comment:2 Changed 2 years ago by dgoulet

Cc: dgoulet removed
Keywords: tor-hs added
Owner: set to dgoulet
Status: newaccepted

Agree, I've looked at the logs and notice here, the next descriptor stops being uploaded:

Nov 16 21:30:21.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 16 22:49:34.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 17 00:23:30.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service current descriptor for service onion with 3/3 introduction points.
Nov 17 01:08:16.000 [info] run_upload_descriptor_event(): Initiating upload for hidden service next descriptor for service onion with 5/3 introduction points.

And then before 01:08:16, the current is discarded, the next is the new current (current = next) and we create a new next descriptor. So at that point, the current descriptor becomes the one never uploading and the next is fine because it is a brand new one. Interestingly enough, this pattern comes back hours later...

I can't tell from the logs why this is happening so as suggested, we need to log every failing condition in should_service_upload_descriptor() so we can have more information on why it is deciding to not upload the descriptor...

comment:3 Changed 2 years ago by dgoulet

Status: acceptedneeds_review

See branch for the added logging statements: ticket24346_032_01

Once merged, lets move this ticket to 033 in needs_information because without these log statements, I fear understanding whats going on will be near impossible.

comment:4 in reply to:  3 Changed 2 years ago by asn

Status: needs_reviewneeds_revision

Replying to dgoulet:

See branch for the added logging statements: ticket24346_032_01

Once merged, lets move this ticket to 033 in needs_information because without these log statements, I fear understanding whats going on will be near impossible.

Hm. Here is how this branch looks like on my HS (I bumped the msgs to warn):

Nov 30 08:25:03.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512035656, it is now 1512030303. [1198 similar message(s) suppressed in last 600 seconds]
Nov 30 08:35:02.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512035656, it is now 1512030902. [1197 similar message(s) suppressed in last 600 seconds]
Nov 30 08:45:02.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512035656, it is now 1512031502. [1199 similar message(s) suppressed in last 600 seconds]
Nov 30 08:55:02.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512035656, it is now 1512032102. [1197 similar message(s) suppressed in last 600 seconds]
Nov 30 09:05:02.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512035656, it is now 1512032702. [1199 similar message(s) suppressed in last 600 seconds]
Nov 30 09:15:03.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512035656, it is now 1512033303. [1199 similar message(s) suppressed in last 600 seconds]
Nov 30 09:25:02.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512035656, it is now 1512033902. [1197 similar message(s) suppressed in last 600 seconds]
Nov 30 09:35:03.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512035656, it is now 1512034503. [1199 similar message(s) suppressed in last 600 seconds]
Nov 30 09:45:02.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512035656, it is now 1512035102. [1197 similar message(s) suppressed in last 600 seconds]
Nov 30 09:55:03.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512042695, it is now 1512035703. [1198 similar message(s) suppressed in last 600 seconds]
Nov 30 10:05:02.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512042695, it is now 1512036302. [1197 similar message(s) suppressed in last 600 seconds]
Nov 30 10:15:02.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512042695, it is now 1512036902. [1197 similar message(s) suppressed in last 600 seconds]
Nov 30 10:20:59.000 [warn] Service onion can't upload its current descriptor. Intro circuits aren't yet all established (2/3). [712 similar message(s) suppressed in last 60 seconds]
Nov 30 10:30:58.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512042695, it is now 1512037858. [1197 similar message(s) suppressed in last 600 seconds]
Nov 30 10:40:59.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512042695, it is now 1512038459. [1199 similar message(s) suppressed in last 600 seconds]
Nov 30 10:50:58.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512042695, it is now 1512039058. [1197 similar message(s) suppressed in last 600 seconds]
Nov 30 11:00:58.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512042695, it is now 1512039658. [1199 similar message(s) suppressed in last 600 seconds]
Nov 30 11:10:58.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512042695, it is now 1512040258. [1197 similar message(s) suppressed in last 600 seconds]
Nov 30 11:20:58.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512042695, it is now 1512040858. [1199 similar message(s) suppressed in last 600 seconds]
Nov 30 11:30:59.000 [warn] Service onion can't upload its current descriptor. Next upload time is 1512042695, it is now 1512041459. [1199 similar message(s) suppressed in last 600 seconds]

As you can see it only cares about current desc. I think that's because the rate limiting subsystem does not care about the log msg, but only about the rate limiting block. So it will suppress different msgs as well (in this case suppress the next descriptor msgs), if they are within the same rate limiting block.

If the above is correct, I think the branch needs revision because it will hide useful information from us.

comment:5 Changed 2 years ago by dgoulet

Arf indeed... So I think we either want to have a rlimit_t object per condition so each of them control their limit.

Second thing, we probably want to decouple current and next then so they don't get caught in the same rlimit.

comment:6 Changed 2 years ago by dgoulet

Two things:

1) See https://trac.torproject.org/projects/tor/ticket/23603#comment:19 for the likely cause of this.

2) We maybe want to hold off on this patch because this is getting weirdly complicated and we kind of found the issue. I agree for "later" it would be good to have but 032 is near stable so we can possibly postpone?

comment:7 Changed 2 years ago by asn

Milestone: Tor: 0.3.2.x-finalTor: 0.3.3.x-final

comment:8 Changed 23 months ago by dgoulet

Status: needs_revisionaccepted

comment:9 Changed 22 months ago by asn

Milestone: Tor: 0.3.3.x-finalTor: 0.3.4.x-final

There is a theory that this was caused by the now-fixed #23603. I'll push this to 034, and let's see if it will come back again.

comment:10 Changed 20 months ago by nickm

Keywords: 034-triage-20180328 added

comment:11 Changed 20 months ago by nickm

Keywords: 034-removed-20180328 added

Per our triage process, these tickets are pending removal from 0.3.4.

comment:12 Changed 20 months ago by nickm

Milestone: Tor: 0.3.4.x-finalTor: unspecified

These tickets, tagged with 034-removed-*, are no longer in-scope for 0.3.4. We can reconsider any of them, if time permits.

comment:13 Changed 13 months ago by dgoulet

Resolution: fixed
Status: acceptedclosed

With OPE code as the revision counter and the fixes in #23603, I think we are done here. We've seen other upload issues but there are specific tickets for it now.

Note: See TracTickets for help on using tickets.